Claude Extended Thinking Not Working Fix
Extended thinking gives Claude deeper reasoning capabilities, but misconfigured parameters produce 400 errors or empty thinking blocks. This guide covers the common failure modes and the fix for each.
The Error
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "thinking.budget_tokens: must be >= 1024 and < max_tokens"
  }
}
Quick Fix
- Set budget_tokens to at least 1024 and strictly less than max_tokens.
- When using tools with thinking, set tool_choice to auto or none only.
- Pass thinking blocks back unmodified in multi-turn conversations.
What Causes This
Extended thinking fails when:
- budget_tokens is less than 1024 or greater than or equal to max_tokens
- tool_choice is set to any or a specific tool name (only auto and none work)
- Thinking blocks are modified or stripped when passing them back in multi-turn conversations
- Thinking parameters change between turns, invalidating cached messages
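The first two causes can be caught locally before a request is sent. The following is a minimal sketch, not part of the Anthropic SDK: the helper name check_thinking_params and its return format are assumptions made for illustration, and it encodes only the rules listed above.

```python
def check_thinking_params(max_tokens, budget_tokens, tool_choice=None):
    """Return a list of problems found in an extended-thinking request.

    Hypothetical helper: it only encodes the rules described in this guide.
    """
    problems = []
    if budget_tokens < 1024:
        problems.append("budget_tokens must be >= 1024")
    if budget_tokens >= max_tokens:
        problems.append("budget_tokens must be < max_tokens")
    # Only "auto" and "none" are accepted alongside thinking.
    if tool_choice is not None and tool_choice.get("type") not in ("auto", "none"):
        problems.append("tool_choice must be auto or none when thinking is enabled")
    return problems


# A budget equal to max_tokens is rejected; a valid configuration yields no problems.
print(check_thinking_params(max_tokens=8000, budget_tokens=8000))
print(check_thinking_params(max_tokens=16000, budget_tokens=10000,
                            tool_choice={"type": "auto"}))
```

An empty list means none of these checks failed; it does not guarantee the API will accept the request.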
Full Solution
Basic Extended Thinking Setup
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Solve this step by step: What is 127 * 389?"}],
)

# The response can contain both thinking blocks and text blocks
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Answer: {block.text}")
Fix budget_tokens Validation
# WRONG: budget_tokens >= max_tokens
max_tokens = 8000
thinking = {"type": "enabled", "budget_tokens": 8000}

# WRONG: budget_tokens < 1024
thinking = {"type": "enabled", "budget_tokens": 500}

# CORRECT: 1024 <= budget_tokens < max_tokens
max_tokens = 16000
thinking = {"type": "enabled", "budget_tokens": 10000}
Fix Tool Choice Conflicts
# WRONG: tool_choice "any" with thinking
tool_choice={"type": "any"} # Error!
# CORRECT: tool_choice "auto" with thinking
tool_choice={"type": "auto"} # OK
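One way to avoid this conflict in shared request-building code is to normalize tool_choice whenever thinking is enabled. This is a sketch under stated assumptions: thinking_safe_tool_choice is a hypothetical helper, not an SDK function, and silently falling back to auto means the model is no longer forced to call a tool, which may not suit every application.

```python
THINKING_COMPATIBLE = {"auto", "none"}


def thinking_safe_tool_choice(tool_choice=None):
    """Return a tool_choice that is compatible with extended thinking.

    Hypothetical helper: compatible values pass through unchanged;
    anything else (for example "any" or a named tool) becomes "auto".
    """
    if tool_choice is None:
        return {"type": "auto"}
    if tool_choice.get("type") in THINKING_COMPATIBLE:
        return tool_choice
    return {"type": "auto"}


print(thinking_safe_tool_choice({"type": "any"}))
print(thinking_safe_tool_choice({"type": "none"}))
```

If a forced tool call is a hard requirement, raising an error instead of falling back may be the safer design.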
Multi-Turn Thinking Continuity
Pass thinking blocks back unmodified to maintain reasoning continuity:
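response1 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "What is 127 * 389?"}],
)

# Second turn -- pass ALL content blocks back unmodified
messages = [
    {"role": "user", "content": "What is 127 * 389?"},
    {"role": "assistant", "content": response1.content},
    {"role": "user", "content": "Now multiply that result by 2"},
]

# Send the follow-up with the same thinking parameters as the first turn
response2 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=messages,
)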
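If conversation history passes through your own storage or serialization layer, a quick check can confirm that thinking blocks survive the round trip. The sketch below assumes content blocks are held as plain dicts rather than SDK objects; thinking_preserved is a hypothetical helper, and the sample thinking text and signature are made-up placeholders, not real API output.

```python
import copy

THINKING_TYPES = ("thinking", "redacted_thinking")


def thinking_blocks(content):
    """Collect the thinking-related blocks from a list of content dicts."""
    return [block for block in content if block.get("type") in THINKING_TYPES]


def thinking_preserved(received, sent_back):
    """True if every thinking block is sent back unchanged and in order."""
    return thinking_blocks(received) == thinking_blocks(sent_back)


# Placeholder content standing in for an assistant turn (values are invented).
received = [
    {"type": "thinking", "thinking": "placeholder reasoning", "signature": "placeholder-sig"},
    {"type": "text", "text": "placeholder answer"},
]

# Stripping the thinking block is detected.
stripped = [block for block in received if block["type"] == "text"]

# Editing the thinking text is detected too.
edited = copy.deepcopy(received)
edited[0]["thinking"] = "edited reasoning"

print(thinking_preserved(received, received))
print(thinking_preserved(received, stripped))
print(thinking_preserved(received, edited))
```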
TypeScript Example
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 16000,
  thinking: { type: "enabled", budget_tokens: 10000 },
  messages: [{ role: "user", content: "Solve step by step: What is 127 * 389?" }],
});
Prevention
- Always set max_tokens > budget_tokens + expected output: A good rule is max_tokens = budget_tokens + 4096.
- Default to tool_choice auto: When combining tools with thinking.
- Never modify thinking blocks: Return them exactly as received.
- Keep thinking params stable: Changing parameters between turns invalidates cached messages.
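The sizing rule in the first item can be written as a small function. This is a sketch only: max_tokens_for is a hypothetical name, and the 4096-token output reserve is the rule of thumb from this guide, not a value required by the API.

```python
def max_tokens_for(budget_tokens, output_reserve=4096):
    """Derive max_tokens from a thinking budget plus room for the answer.

    Hypothetical helper implementing max_tokens = budget_tokens + 4096.
    """
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if output_reserve <= 0:
        raise ValueError("output_reserve must be positive")
    return budget_tokens + output_reserve


budget = 10000
print(max_tokens_for(budget))  # 14096
```

Because the reserve is always positive, the result is strictly greater than budget_tokens, which satisfies the validation rule shown in the error message.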
Written by the ClaudHQ team · Expert Claude Code guides and tools