Why Is Claude Code Slow? Official Causes and Developer Fixes
If Claude Code has felt slow or degraded recently, you are not imagining it — and the cause is not a single bug. Claude Code did not simply "get worse" in one clean way. Recent developer complaints mixed together several different problems: lower coding quality, slower-feeling sessions, forgetfulness, repeated tool choices, faster quota drain, actual service errors, and local context bloat.
The most important official explanation is Anthropic's April 23, 2026 engineering postmortem. Anthropic said recent Claude Code quality issues came from three separate product-layer changes affecting Claude Code, Claude Agent SDK, and Claude Cowork, while the API and inference layer were not impacted. Anthropic said all three issues were resolved by April 20 in version 2.1.116.
This article turns that postmortem and the official Claude Code docs into a developer troubleshooting guide.
Short answer: why Claude Code recently felt slower or worse
Claude Code recently felt worse for some developers because three changes compounded: Anthropic lowered default reasoning effort for some models, introduced a caching-related bug that repeatedly dropped older reasoning in some stale sessions, and shipped a system prompt change intended to reduce verbosity that hurt coding quality. Anthropic says those issues were fixed by April 20, 2026. (Postmortem)
But not every "Claude Code is slow" report has the same cause.
A slow or bad Claude Code session can come from:
- Anthropic-side incidents or model capacity.
- A lower reasoning effort setting.
- A stale or bloated session context.
- Large files or tool output filling the context window — Claude Code's working memory for the current session, which grows with every tool call and conversation turn.
- Local environment problems, especially search or filesystem issues.
- An outdated Claude Code version.
- Account quota or API rate limits.
Jump to: Short answer · What Anthropic said · Classify the failure · 8 fixes · Diagnosis matrix · Common mistakes · FAQ · Takeaways
What did Anthropic officially say happened to Claude Code?
Anthropic's postmortem identified three separate changes. The important point for developers is that these were not all "latency bugs." Some affected answer quality, some affected memory-like continuity inside a session, and some affected token/cache behavior.
Reasoning effort changed from high to medium
Anthropic said it changed Claude Code's default reasoning effort from high to medium on March 4, 2026 for Sonnet 4.6 and Opus 4.6. The reason was latency: some users on high effort were seeing very long thinking times, enough that the UI could appear frozen. Anthropic later called this the wrong tradeoff and reverted it on April 7.
In Claude Code, reasoning effort controls the tradeoff between capability, latency, and token usage. Anthropic's model configuration docs say low is for short, scoped, latency-sensitive tasks; medium can reduce token usage while trading off some intelligence; high is a minimum for intelligence-sensitive work; and xhigh is recommended for most coding and agentic tasks on Opus 4.7.
For developers, this means a "slow" experience can have two opposite causes:
- The model is thinking more, producing better output but taking longer.
- The model is thinking less, responding faster but producing weaker code.
A caching optimization dropped older reasoning repeatedly
Anthropic said it shipped a March 26 optimization intended to reduce latency when users resumed sessions that had been idle for more than an hour. The intended behavior was to clear older thinking once after a session became stale. A bug caused older thinking to be cleared on every turn for the rest of the session.
That matters because Claude Code relies on conversation history, prior tool calls, prior edits, and previous reasoning to continue a multi-step coding task. Anthropic said the bug made Claude appear forgetful, repetitive, and prone to odd tool choices. It also caused cache misses, which Anthropic believes drove reports of usage limits draining faster than expected. (Postmortem)
For developers, this is the key lesson:
If a Claude Code session has become stale, repetitive, confused, or expensive, do not keep arguing with it inside the same thread. Compact, clear, rewind, or start a new focused session.
Claude Code's own docs make the same operational point from another angle: as context fills up, Claude Code clears older tool outputs first and summarizes conversation history if needed, but detailed instructions from early in the conversation can be lost. Persistent rules should go in CLAUDE.md, not only in the chat history.
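To make that concrete, durable rules belong in a file that compaction cannot drop. A minimal, hypothetical CLAUDE.md sketch (every rule and path here is illustrative, not from Anthropic's docs):

```markdown
# Project rules for Claude Code

- Run the test suite with `pytest -q` before calling any fix complete.
- Never edit files under `generated/` or `dist/`.
- Database schema changes go through `migrations/`, never raw SQL in code.
```

Rules written here survive /compact and /clear, unlike instructions given early in a long chat.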
A system prompt change reduced coding quality
Anthropic said it added a system prompt instruction on April 16 to reduce verbosity. One instruction limited text between tool calls and final responses. After broader ablation testing, Anthropic found a measurable drop in an evaluation and reverted the prompt as part of the April 20 release.
This is the most important product-design lesson from the incident: shorter answers are not automatically better answers for coding agents.
Coding work often needs enough reasoning, enough plan context, and enough explanation between tool calls to avoid shallow edits. The better instruction is:
"Be concise, but do not omit reasoning needed to make safe code changes."
That phrasing preserves the quality constraint.
Is Claude Code slow, overloaded, or just in a bad session?
Before changing settings, classify the failure.
| Symptom | Likely cause | What to check first |
|---|---|---|
| Claude responds slowly but eventually gives strong answers | High effort, large context, long tool run, or large output | /effort, /context, task size |
| Claude responds quickly but makes shallow mistakes | Effort too low or weak prompt | /effort, model selection |
| Claude forgets earlier choices or repeats itself | Stale/bloated session or context compaction issue | /context, /compact, /clear, /rewind |
| Claude burns usage faster than expected | Large context, cache misses, too many tool calls, agent teams | /usage, /context, session structure |
| Claude shows 529 overload errors | Anthropic capacity issue | Claude status, retry, switch model |
| Claude shows 429 errors | Rate limit or credential/provider limit | /status, provider console, concurrency |
| Search misses files or is slow on WSL | Filesystem/search issue | Project location, ripgrep, WSL filesystem |
| Claude Code crashes or resume fails | Version-specific bug | claude --version, changelog, update |
How to fix Claude Code slowness as a developer
1. Update Claude Code and check your version
Start with version sanity.
```shell
claude --version
claude update
```

This matters because some issues are version-specific. The Claude status page has reported crashes when resuming prior sessions with --resume or --continue in specific versions.
Do not debug prompts before you eliminate a bad client version.
2. Check the active model
Inside Claude Code, run:
```
/model
```

Confirm you are using the model you intended. A previous /model choice or environment variable may have selected a smaller model. Also check whether your plan supports the model you selected.
3. Raise or verify reasoning effort
Inside Claude Code, run:
```
/effort
```

For difficult debugging, multi-file refactors, architecture decisions, concurrency bugs, security-sensitive changes, or test-repair loops, do not run on low or medium effort unless you are deliberately optimizing for speed.
You can set effort using /effort, the /model picker, --effort, CLAUDE_CODE_EFFORT_LEVEL, settings, or skill/subagent frontmatter. The environment variable has the highest precedence. (Effort docs)
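As a sketch of those entry points (the flag and variable are the ones the effort docs name; treat exact syntax as version-dependent):

```shell
# Per-invocation flag
claude --effort high

# Environment variable; per the docs this takes the highest precedence
export CLAUDE_CODE_EFFORT_LEVEL=high
```

Inside a running session, /effort opens the same control interactively.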
Practical defaults:
| Task type | Suggested effort |
|---|---|
| Rename variable, small copy edit, simple command | low or medium |
| Normal coding task | high |
| Multi-file debugging or agentic work | xhigh where available |
| Hard architecture/debugging session | xhigh or one-off ultrathink |
| Cost-sensitive bulk scripting | medium, but verify output carefully |
A max effort setting can help on demanding tasks but may show diminishing returns and is prone to overthinking.

4. Use ultrathink only for hard one-off tasks
For a single hard turn, include:
```
ultrathink
```

Claude Code recognizes ultrathink as a one-off request for deeper reasoning without changing the session effort setting. Other phrases like "think hard" are passed through as normal prompt text and are not recognized as special keywords.
Use it for:
- "Find the real root cause across these logs."
- "Design the migration plan before editing code."
- "Review this authentication flow for security bugs."
- "Explain why the test passes locally but fails in CI."
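A one-off deep-reasoning prompt of that kind might look like this (the file name is hypothetical):

```
ultrathink: find the real root cause of the intermittent failure in
tests/test_auth.py, using the CI log pasted below, before proposing any edit.
```

The keyword raises reasoning for this turn only; the next prompt runs at the session's configured effort.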
5. Clear or compact stale context
Check context:
```
/context
```

Compact at a natural breakpoint:
```
/compact focus on the current bug, files changed, test output, and next steps
```

Clear when switching tasks:
```
/clear
```

Claude Code docs recommend /clear when switching to unrelated work because stale context wastes tokens on every later message. For a deeper look at how context accumulates mechanically over a session, see Why Claude Code gets slower the longer you use it. For long coding sessions, follow this rule:
One task, one session. New task, new context.
If you are debugging authentication middleware, do not keep the same Claude Code session alive when you move to frontend CSS, then database migrations, then CI config. That creates context pollution.
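As a sketch, the handoff between tasks inside a session might look like this (the /compact instruction text is illustrative):

```
/compact keep only the middleware fix, the files changed, and the test results
...finish and commit the auth task...
/clear
...start the CSS task in a clean context...
```

Compacting at the boundary preserves a summary of the finished work; clearing before the new task keeps its context free of the old one.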
6. Reduce oversized files and tool output
If Claude Code reads giant files, dependency directories, generated build output, test logs, minified bundles, or massive JSON files, your context fills quickly.
Better prompt:
```
Read only src/auth/jwt_validator.py lines 80-180.
Do not scan the whole repository yet.
Identify why expired tokens are passing validation.
```

Worse prompt:

```
Search the whole repo and fix auth.
```

For auto-compaction thrashing, ask Claude to read oversized files in smaller chunks, use /compact with a focused instruction, move large-file work to a subagent, or run /clear if earlier conversation is no longer needed.
7. Diagnose local performance problems
Run:
```
/doctor
```

The /doctor command checks installation health, settings validity, MCP configuration, and context usage. If Claude Code will not start, run claude doctor from the shell. (Troubleshooting docs)
If search is slow or incomplete on WSL, filesystem read penalties across Windows/Linux boundaries may reduce search results. The recommended fixes are more specific searches, moving the project to the Linux filesystem under /home/, or running Claude Code natively on Windows.
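A hedged example of the project move (the paths are illustrative; adjust to your layout):

```shell
# Move the repo off the Windows mount to the Linux filesystem
cp -r /mnt/c/Users/me/projects/myapp ~/projects/
cd ~/projects/myapp
```

Working under /home/ avoids the cross-boundary read penalty that slows search on /mnt/c/.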
If file discovery is broken, install ripgrep and configure Claude Code to use the system version:
```shell
brew install ripgrep                      # macOS
sudo apt install ripgrep                  # Ubuntu/Debian
winget install BurntSushi.ripgrep.MSVC    # Windows
```

Then set:

```shell
export USE_BUILTIN_RIPGREP=0
```

8. Check Anthropic status before debugging your setup
Check the Claude Status page before rewriting prompts or reinstalling tools. If the status page shows model errors, do not waste time changing prompts. Switch models, wait, or do local non-AI work until the service stabilizes.
| Error | Meaning | Developer action |
|---|---|---|
| 529 overload | Anthropic capacity issue | Check status, retry later, switch model |
| 429 rate limit | Your API/provider/account limit | Check /status, reduce concurrency, request higher limits |
| Session/weekly limit | Subscription quota exhausted | Check /usage, wait for reset, buy/request extra usage |
| Timeout | Large response, high load, network/proxy issue | Retry, split prompt, raise timeout only if network/proxy is the issue |
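For scripted runs, the 529 row above usually translates into retry with backoff. A minimal POSIX shell sketch (the wrapped command is a placeholder for whatever scripted invocation you run; nothing here is Claude Code-specific):

```shell
# Retry a command with exponential backoff, for transient overload errors.
retry_with_backoff() {
  rb_attempt=1
  rb_max=3
  rb_delay=1
  until "$@"; do
    if [ "$rb_attempt" -ge "$rb_max" ]; then
      echo "giving up after $rb_max attempts" >&2
      return 1
    fi
    sleep "$rb_delay"
    rb_delay=$((rb_delay * 2))      # double the wait each round
    rb_attempt=$((rb_attempt + 1))
  done
  return 0
}
```

Use it as `retry_with_backoff your-command args...`; it succeeds as soon as the command does and gives up after three attempts.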
Claude Code slow-response diagnosis matrix
Use this as a practical runbook.
| Step | Command/check | What it tells you | Action |
|---|---|---|---|
| 1 | Claude Status page | Whether Anthropic has active incidents | If degraded, switch model or wait |
| 2 | claude --version | Whether your CLI may be outdated | Run claude update |
| 3 | /model | Whether you are on the intended model | Switch to expected model |
| 4 | /effort | Whether reasoning effort is too low/high | Raise for hard work, lower for simple work |
| 5 | /context | Whether session context is bloated | Compact, clear, or start fresh |
| 6 | /usage | Whether quota is near exhaustion | Reduce context, wait, or buy/request extra usage |
| 7 | /doctor | Whether install/config/MCP/context has issues | Fix reported local issues |
| 8 | /feedback | Sends reproducible issue to Anthropic | Use when issue persists after checks |
Common mistakes developers make when Claude Code feels worse
Mistake 1: Treating quality degradation as a prompting problem only
Sometimes the problem is not your prompt. Anthropic's April 23 postmortem confirms that product-layer changes caused real quality issues for some users.
Still, do not jump from "Claude made a bad edit" to "the model is broken." First check model, effort, context, version, and status.
Mistake 2: Correcting bad output repeatedly in the same thread
If Claude makes a bad turn, replying with corrections can keep the bad attempt in context and anchor later answers.
Anthropic's error reference recommends rewinding when a response goes wrong. Press Esc twice or run /rewind, then rephrase the prompt with more specifics.
Better:
```
/rewind
```

Then:
```
Focus only on the failing test in tests/test_auth.py.
Do not edit production code yet.
First explain the failure path.
```

Mistake 3: Letting huge context accumulate across unrelated tasks
Long-running sessions feel convenient, but stale context costs tokens and can degrade relevance. Claude Code docs explicitly recommend /clear when switching to unrelated work.
A clean session is often faster than a heroic compacted session.
Mistake 4: Using Opus or high effort for every task
High effort and stronger models are not always the right default for every turn. For small edits, they can be slower and more expensive than needed.
Use effort as a control knob:
- Simple task: lower effort.
- Hard reasoning task: higher effort.
- One hard turn: ultrathink.
- Long-running task: monitor context and usage.
Mistake 5: Ignoring WSL and filesystem penalties
If Claude Code search is weak or slow on WSL, the issue may be filesystem placement, not model quality. Anthropic specifically recommends moving the project to the Linux filesystem under /home/ rather than working across /mnt/c/ when WSL filesystem penalties affect search.
Frequently Asked Questions About Claude Code Slowness
Why is Claude Code slow?
Claude Code can be slow because the model is using deeper reasoning, the session context is large, tool calls are expensive, the service is under load, or your local environment is slowing search and file access. Recent quality complaints were also tied to three Anthropic product-layer changes that were resolved by April 20, 2026.
Did Anthropic intentionally degrade Claude Code?
Anthropic said it does not intentionally degrade its models and said the API and inference layer were unaffected in the April 23 postmortem. The official explanation was three product-layer issues: effort default changes, a caching bug, and a system prompt change.
What should I check first when Claude Code feels worse?
Check /model, /effort, /context, claude --version, and the Claude Status page. If those are normal, run /doctor and consider whether your session has stale instructions, oversized files, or too much previous tool output.
Should I use high effort or xhigh effort in Claude Code?
Use high or xhigh for intelligence-sensitive coding tasks, debugging, refactoring, and agentic work. Anthropic's docs say xhigh is best for most coding and agentic tasks on Opus 4.7, while medium trades off some intelligence for lower token usage.
What is the difference between /compact and /clear?
/compact summarizes and preserves the useful parts of the current session, while /clear starts fresh. Use /compact when you are continuing the same task; use /clear when switching to unrelated work.
Why is Claude Code using my quota faster than expected?
Quota can drain faster when context is large, tool calls are repeated, cache misses happen, or agent teams/subagents are used heavily. Anthropic's postmortem said the March 26 caching bug likely drove some reports of usage limits draining faster than expected.
What does a 529 error mean in Claude Code?
A 529 overload error means the API is temporarily at capacity across users. It is not your usage limit and does not count against your quota; recommended actions are checking status, retrying later, or switching models.
What does a 429 error mean in Claude Code?
A 429 means you hit the configured rate limit for your API key, Amazon Bedrock project, or Google Vertex AI project. Recommended actions are checking /status, reviewing provider limits, reducing concurrency, or switching to a smaller model for high-volume scripted runs.
Key Takeaways
- Claude Code's recent degradation was not one bug. Anthropic identified three product-layer issues: effort defaults, caching/context handling, and a system prompt change.
- "Slow" is ambiguous. Separate latency, lower answer quality, stale context, quota drain, rate limits, and service overload.
- For hard coding work, verify /model and /effort before blaming prompts.
- For stale or confused sessions, use /context, /compact, /clear, or /rewind.
- For local issues, run /doctor, check WSL filesystem placement, and install/use system ripgrep if search is broken.
- Do not keep unrelated development tasks in one long Claude Code session.
Related reading
- Why Claude Code gets slower the longer you use it — the session mechanics behind tool call latency and context accumulation
- How to cut Claude Code cost — 9 tactics for reducing token burn mid-session
Use this checklist before opening a bug report or rewriting your workflow:
1. Check Claude Status.
2. Run claude --version.
3. Run claude update.
4. Check /model.
5. Check /effort.
6. Check /context.
7. Compact or clear stale sessions.
8. Run /doctor.
9. Use /feedback only after reproducing the issue with details.
