Mystery solved: Anthropic reveals the changes to Claude’s harness and operating instructions that may have caused the apparent degradation

For weeks, a growing chorus of developers and AI power users has insisted that Anthropic’s flagship models are losing their edge. Users across GitHub, X, and Reddit reported what they described as "AI inflation": a perceived degradation in which Claude seemed incapable of critical thinking, prone to hallucinations, and wasteful with tokens.
Critics pointed to a measurable change in behavior, saying the model had shifted from a "research-first" approach to a lazy, "edit-first" style that could no longer be trusted with complex engineering work.
While the company initially pushed back against claims that it was deliberately throttling the model to manage demand, mounting evidence from high-profile users and third-party benchmarks created a serious trust gap.
Today, Anthropic addressed these concerns head-on, publishing a post-mortem that identified three distinct product-layer changes responsible for the reported quality issues.
"We take reports of degradation seriously," read Anthropic’s blog post on the matter. "We never intentionally degrade our models, and we were quickly able to confirm that our API and inference layer were not affected."
Anthropic says it resolved the issues by reverting the thinking-effort and verbosity-prompt changes, and by fixing a caching bug in version v2.1.116.
Mounting evidence of degradation
The controversy gained momentum in early April 2026, fueled by detailed technical analysis from the engineering community. Stella Laurenzo, a Senior Director on AMD’s AI team, published on GitHub a comprehensive audit of 6,852 Claude Code session files and more than 234,000 tool calls, showing performance drops compared with her earlier sessions.
Her findings suggested that Claude’s depth of reasoning had been significantly reduced, leading to thought loops and a tendency to reach for an "easy fix" rather than the correct one.
This anecdotal frustration appeared to be confirmed by third-party benchmarks. BridgeMind reported that Claude Opus 4.6’s accuracy dropped from 83.3% to 68.3% in its tests, causing the model’s ranking to fall from No. 2 to No. 10.
Although some researchers argued that this particular benchmark comparison was flawed because the test conditions were not consistent, the narrative of Claude’s decline became a viral talking point. Users also reported that usage limits were being exhausted faster than expected, fueling suspicions that Anthropic was deliberately suppressing performance to manage growing demand.
Causes
In its post-mortem blog post, Anthropic clarified that while the underlying model weights had not changed, three specific changes to the "harness" around the models had inadvertently interfered with their performance:

- Automatic thinking effort: On March 4, Anthropic changed the default thinking effort in Claude Code from high to medium to address UI lag. The change was intended to keep the interface from appearing frozen while the model was thinking, but it caused a significant drop in intelligence on complex tasks.
- Caching logic bug: A caching revision shipped on March 26, aimed at pruning old "thinking" blocks from idle sessions, contained a critical error. Instead of clearing the thinking history once after an hour of inactivity, it cleared it on every subsequent turn, causing the model to lose its "short-term memory" and repeat or forget work (a minimal sketch of how such a bug can arise appears after this list).
- Verbosity system-prompt instructions: On April 16, Anthropic added system-prompt instructions telling the model to keep text between tool calls under 25 words and to end responses in under 100 words. This effort to rein in Opus 4.7’s verbosity produced a roughly 3% drop on coding quality evaluations and has since been reverted.
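Anthropic has not published the offending code, but the behavior it describes maps onto a classic sticky-state mistake in session handling. Below is a minimal Python sketch of how such a bug can arise and how the fix differs; the Session class, field names, and one-hour constant are illustrative assumptions, not Anthropic’s actual implementation.

```python
import time
from dataclasses import dataclass, field

IDLE_THRESHOLD_SECS = 3600  # hypothetical one-hour idle window

@dataclass
class Session:
    thinking_history: list[str] = field(default_factory=list)
    last_active: float = field(default_factory=time.time)
    expired: bool = False

def on_turn_buggy(session: Session) -> None:
    # Bug pattern: the "expired" flag is set after an hour of inactivity
    # but never reset, so every subsequent turn wipes the thinking
    # history and the model loses its short-term memory mid-conversation.
    if time.time() - session.last_active > IDLE_THRESHOLD_SECS:
        session.expired = True
    if session.expired:
        session.thinking_history.clear()
    session.last_active = time.time()

def on_turn_fixed(session: Session) -> None:
    # Fix: prune exactly once, at the moment the idle threshold is
    # crossed, then let new thinking blocks accumulate normally.
    if time.time() - session.last_active > IDLE_THRESHOLD_SECS:
        session.thinking_history.clear()
    session.last_active = time.time()
```

In the buggy version, a single hour of inactivity permanently poisons the session; in the fixed version, the history is cleared once and the conversation recovers on the next turn.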
Impact and future safeguards
The quality issues extended beyond the Claude Code CLI, also affecting the Claude Agent SDK and Claude Cowork, although the Claude API itself was not affected.
Anthropic admitted that these changes made the model appear less intelligent, and agreed that this was not the experience users should expect.
To regain user trust and prevent future regressions, Anthropic is implementing several process changes:

- Internal dogfooding: A large share of internal staff will be required to use Claude Code in the same configurations the community uses, so that they experience the product the way users do.
- Expanded testing suites: The company will now run a comprehensive evaluation suite for each model release, plus an ablation of every system-prompt change, to isolate the impact of specific instructions (a sketch of such an ablation loop follows this list).
- Stricter controls: New tooling is designed to make prompt changes easier to review, and model-specific changes will be strictly gated to their intended targets.
- Subscriber compensation: To make up for the token waste and performance regressions these bugs caused, Anthropic reset usage limits for all subscribers as of April 23.
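Prompt ablation of this kind is straightforward to automate. The Python sketch below shows one way to measure each system-prompt section’s contribution by removing it and re-running an evaluation, using the official anthropic SDK; the section contents, task format, model name, and pass/fail check are hypothetical stand-ins for a real eval harness.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical system-prompt sections; the verbosity rules mirror the
# instructions described in the post-mortem.
PROMPT_SECTIONS = {
    "role": "You are a coding assistant working inside a terminal.",
    "verbosity": ("Keep text between tool calls under 25 words. "
                  "End responses in under 100 words."),
    "safety": "Never run destructive shell commands without confirmation.",
}

def run_eval(system_prompt: str, tasks: list[dict]) -> float:
    """Score the model on a task set under a given system prompt."""
    passed = 0
    for task in tasks:
        response = client.messages.create(
            model="claude-sonnet-4-5",  # illustrative model choice
            max_tokens=1024,
            system=system_prompt,
            messages=[{"role": "user", "content": task["prompt"]}],
        )
        text = "".join(b.text for b in response.content if b.type == "text")
        passed += task["check"](text)  # check() returns True or False
    return passed / len(tasks)

def ablate(tasks: list[dict]) -> dict[str, float]:
    """Re-run the eval with each section removed to isolate its impact."""
    baseline = run_eval("\n\n".join(PROMPT_SECTIONS.values()), tasks)
    deltas = {}
    for name in PROMPT_SECTIONS:
        prompt = "\n\n".join(v for k, v in PROMPT_SECTIONS.items() if k != name)
        deltas[name] = run_eval(prompt, tasks) - baseline
    return deltas  # a positive delta flags a section that hurts quality
```

A positive delta for the "verbosity" section, for example, would indicate that removing those brevity rules improves eval scores, the same kind of effect as the roughly 3% drop Anthropic describes.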
The company intends to use its new @ClaudeDevs account on X, alongside GitHub, to offer more transparency into future product decisions and maintain a more open dialogue with its developer base.



