Claude Extended Thinking Not Working Fix

May 2026 · ClaudHQ

Extended thinking gives Claude deeper reasoning capabilities, but misconfigured parameters produce 400 errors or empty thinking blocks. This guide covers the common failure modes and the fix for each.

The Error

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "thinking.budget_tokens: must be >= 1024 and < max_tokens"
  }
}
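
If this response reaches you as raw JSON (for example from a direct HTTP call), a small helper can pull out the fields worth logging. This is a minimal sketch that assumes the body has exactly the shape shown above; the function name summarize_error is hypothetical and not part of any SDK.

```python
import json


def summarize_error(raw_body: str) -> str:
    """Return 'error_type: message' for a body shaped like the error above."""
    payload = json.loads(raw_body)
    if payload.get("type") != "error":
        return "not an error response"
    error = payload.get("error", {})
    return f"{error.get('type', 'unknown')}: {error.get('message', '')}"


raw = """{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "thinking.budget_tokens: must be >= 1024 and < max_tokens"
  }
}"""
print(summarize_error(raw))
```

Logging the error type next to the message makes it easier to tell a parameter problem (invalid_request_error) from other failures.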

Quick Fix

Set budget_tokens to at least 1024 and strictly less than max_tokens.
When using tools with thinking, set tool_choice to auto or none only.
Pass thinking blocks back unmodified in multi-turn conversations.

What Causes This

Extended thinking fails when:

budget_tokens is less than 1024 or greater than or equal to max_tokens
tool_choice is set to any or a specific tool name (only auto and none work)
Thinking blocks are modified or stripped when passing them back in multi-turn conversations
Thinking parameters change between turns, which invalidates cached messages (this shows up as cache misses rather than a 400 error)

Full Solution

Basic Extended Thinking Setup

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Solve this step by step: What is 127 * 389?"}]
)

for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Answer: {block.text}")

Fix budget_tokens Validation

# WRONG: budget_tokens >= max_tokens
thinking={"type": "enabled", "budget_tokens": 8000}  # with max_tokens=8000

# WRONG: budget_tokens < 1024
thinking={"type": "enabled", "budget_tokens": 500}

# CORRECT: 1024 <= budget_tokens < max_tokens
max_tokens=16000,
thinking={"type": "enabled", "budget_tokens": 10000}
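
The same rule can be checked locally before a request is sent. The sketch below only encodes the two bounds stated in this guide (at least 1024, strictly less than max_tokens); the helper name check_thinking_budget is hypothetical, and the API may enforce further limits that are not covered here.

```python
MIN_BUDGET_TOKENS = 1024  # lower bound stated in this guide


def check_thinking_budget(budget_tokens: int, max_tokens: int) -> list:
    """Return a list of problems; an empty list means the pair passes both checks."""
    problems = []
    if budget_tokens < MIN_BUDGET_TOKENS:
        problems.append(
            f"budget_tokens={budget_tokens} is below {MIN_BUDGET_TOKENS}"
        )
    if budget_tokens >= max_tokens:
        problems.append(
            f"budget_tokens={budget_tokens} must be less than max_tokens={max_tokens}"
        )
    return problems


# The three cases from the snippet above
print(check_thinking_budget(8000, 8000))
print(check_thinking_budget(500, 16000))
print(check_thinking_budget(10000, 16000))
```

Returning a list instead of raising lets a caller report both problems at once when a value violates both bounds.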

Fix Tool Choice Conflicts

# WRONG: tool_choice "any" with thinking
tool_choice={"type": "any"}  # Error!

# CORRECT: tool_choice "auto" with thinking
tool_choice={"type": "auto"}  # OK
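
A guard like the following can catch the conflict before the request leaves your code. It is a sketch based only on the rule stated in this guide (auto and none are accepted with thinking); the helper name tool_choice_ok is hypothetical, and treating an omitted tool_choice as acceptable is an assumption of this example.

```python
from typing import Optional

# tool_choice types this guide lists as compatible with thinking
ALLOWED_WITH_THINKING = {"auto", "none"}


def tool_choice_ok(tool_choice: Optional[dict], thinking_enabled: bool) -> bool:
    """Return True if tool_choice is compatible with the thinking setting."""
    if not thinking_enabled or tool_choice is None:
        # Assumption: no restriction without thinking, and an omitted
        # tool_choice is treated as acceptable.
        return True
    return tool_choice.get("type") in ALLOWED_WITH_THINKING


print(tool_choice_ok({"type": "any"}, thinking_enabled=True))
print(tool_choice_ok({"type": "auto"}, thinking_enabled=True))
# "get_weather" is a made-up tool name used only for illustration
print(tool_choice_ok({"type": "tool", "name": "get_weather"}, thinking_enabled=True))
```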

Multi-Turn Thinking Continuity

Pass thinking blocks back unmodified to maintain reasoning continuity:

response1 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "What is 127 * 389?"}]
)

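# Second turn -- pass ALL content blocks back unmodified
messages = [
    {"role": "user", "content": "What is 127 * 389?"},
    {"role": "assistant", "content": response1.content},
    {"role": "user", "content": "Now multiply that result by 2"}
]

# Send the follow-up with the same thinking parameters as the first turn
response2 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=messages
)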
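
One way to avoid accidentally editing or dropping thinking blocks is to build the next turn through a single helper that never inspects the assistant content. The sketch below uses stand-in objects in place of a real response so it runs without an API key; next_turn_messages and fake_content are hypothetical names, and real blocks would come from response1.content.

```python
from types import SimpleNamespace


def next_turn_messages(history, assistant_content, user_text):
    """Append the assistant turn as received, then the new user message."""
    return history + [
        {"role": "assistant", "content": assistant_content},  # passed through untouched
        {"role": "user", "content": user_text},
    ]


# Stand-ins for response1.content; real blocks come from the API response.
fake_content = [
    SimpleNamespace(type="thinking", thinking="127 * 389 = ..."),
    SimpleNamespace(type="text", text="49403"),
]
history = [{"role": "user", "content": "What is 127 * 389?"}]
messages = next_turn_messages(history, fake_content, "Now multiply that result by 2")
print([m["role"] for m in messages])
```

Because the helper returns a new list and passes the content list through by reference, neither the history nor the blocks are filtered, reordered, or rewritten.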

TypeScript Example

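import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();

const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 16000,
    thinking: { type: "enabled", budget_tokens: 10000 },
    messages: [{ role: "user", content: "Solve step by step: What is 127 * 389?" }]
});

// Print the reasoning and the final answer separately
for (const block of response.content) {
    if (block.type === "thinking") {
        console.log(`Thinking: ${block.thinking.slice(0, 200)}...`);
    } else if (block.type === "text") {
        console.log(`Answer: ${block.text}`);
    }
}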

Prevention

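Always set max_tokens > budget_tokens + expected output: max_tokens covers both the thinking and the final answer, so a good rule is max_tokens = budget_tokens + 4096.
Default to tool_choice auto: When combining tools with thinking, use auto (or none); any or a specific tool name is rejected.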
Never modify thinking blocks: Return them exactly as received.
Keep thinking params stable: Changing parameters between turns invalidates cached messages.
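
The headroom rule above can be applied in one place so every request uses the same parameters. This is a sketch rather than a prescribed pattern: thinking_request_params is a hypothetical helper, and the 4096-token default is simply the rule of thumb from this guide.

```python
def thinking_request_params(budget_tokens: int, output_headroom: int = 4096) -> dict:
    """Build max_tokens and thinking settings with room left for the answer."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if output_headroom < 1:
        raise ValueError("output_headroom must be positive so budget_tokens < max_tokens")
    return {
        "max_tokens": budget_tokens + output_headroom,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


params = thinking_request_params(10000)
print(params)
```

The result can be spread into a call such as client.messages.create(model=..., messages=..., **params). Reusing the same params object on every turn also keeps the thinking settings stable between turns.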

Written by the ClaudHQ team · Expert Claude Code guides and tools