Fix “Context Window Exceeded” Mid-Conversation in Claude Code

June 2026 · ClaudHQ

The Error

Warning: Context window usage at 95% (190,000 / 200,000 tokens).
  Claude Code will begin compacting conversation history.

Error: Context window exceeded. Cannot process this request.
  Total tokens (system + messages + tools): 201,347
  Maximum context window: 200,000 tokens
  Please start a new conversation or reduce your prompt.

TL;DR — Quick Fix

Type /compact inside Claude Code. This replaces the conversation history with a summary and typically brings context usage from 90%+ down to 30-50%.

The Fix

Step 1: Use /compact to summarize and free context space

# Inside Claude Code, type:
/compact

# Or with a focus hint:
/compact Focus on the current task: fixing the auth module

Step 2: Start a sub-session for the next task

# Exit and start fresh with a one-shot prompt (-p prints the response and exits),
# referencing only what you need
claude -p "Read src/auth/login.ts and fix the null pointer on line 42"

# Or use --continue to resume the most recent conversation
claude --continue

Step 3: Verify context usage dropped

# Check context usage after compacting
# The status bar shows token usage
# Expected: Context usage drops to 30-50% after /compact
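
The status bar reports raw token counts, so it can help to convert them into a percentage before deciding whether to compact. The sketch below is a hypothetical pair of shell helpers, not something Claude Code ships: the names `context_pct` and `needs_compact` are made up for illustration, the 200,000 default mirrors the window size in the error above, and the 60% threshold is simply the one this article recommends.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; Claude Code does not provide these.

# Print the integer percentage of the context window in use.
# $1 = tokens used, $2 = window size (defaults to 200000)
context_pct() {
  local used=$1 max=${2:-200000}
  echo $(( used * 100 / max ))
}

# Succeed (exit status 0) when usage is at or above the threshold.
# $1 = percent used, $2 = threshold in percent (defaults to 60)
needs_compact() {
  local pct=$1 threshold=${2:-60}
  [ "$pct" -ge "$threshold" ]
}

# Example with the numbers from the warning above: 190,000 of 200,000 tokens.
pct=$(context_pct 190000)
if needs_compact "$pct"; then
  echo "Context at ${pct}%: run /compact"
fi
```

Integer division rounds down, which is fine for a threshold check; plug in whatever counts your status bar shows.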

Why This Happens

Claude Code accumulates context with every message exchange: your prompts, Claude's responses, tool calls, tool results (file contents, command output), and internal system prompts. Reading large files, running commands with verbose output, and iterating on code changes all consume tokens rapidly. A single cat of a 5,000-line file can use 15,000+ tokens. After extended sessions with many file reads and edits, the 200K token window fills up.
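
To get a feel for these numbers before reading a file, a back-of-the-envelope estimate is often enough. The sketch below assumes roughly 3 tokens per line, which is only the ratio implied by the figures above (5,000 lines for about 15,000 tokens); real tokenization depends on line length and content, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Rough estimate only: assumes ~3 tokens per line (ratio implied by the article).

# $1 = number of lines, $2 = assumed tokens per line (defaults to 3)
estimate_tokens_for_lines() {
  local lines=$1 per_line=${2:-3}
  echo $(( lines * per_line ))
}

# $1 = path to a file; counts its lines and applies the same ratio.
estimate_file_tokens() {
  local file=$1 lines
  # tr strips the padding some wc implementations add
  lines=$(wc -l < "$file" | tr -d '[:space:]')
  estimate_tokens_for_lines "$lines"
}

estimate_tokens_for_lines 5000   # whole 5,000-line file
estimate_tokens_for_lines 50     # targeted 50-line read
```

Running `estimate_file_tokens path/to/file` before asking Claude to read a file gives a quick sense of whether a line range would be the cheaper option.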

If That Does Not Work

Start a fresh session with a precise prompt that includes only the necessary context: files, error messages, and the exact task.
Split large tasks into smaller, independent sessions — each with a narrow focus.
Use /cost or check the status bar to see current token usage before starting expensive operations.
Avoid reading entire large files: use line ranges like "Read lines 100-150 of src/app.ts" instead.
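
The line-range idea also applies when you paste file excerpts into a prompt yourself. The sketch below uses the standard `sed -n 'START,ENDp'` idiom to print an inclusive range; it is a generic shell technique rather than a Claude Code feature, and the helper name `read_range` is hypothetical.

```shell
#!/usr/bin/env bash
# Print only lines START..END (inclusive) of a file.
# $1 = file, $2 = first line, $3 = last line
read_range() {
  local file=$1 start=$2 end=$3
  sed -n "${start},${end}p" "$file"
}

# Example, using the path from the list above (adjust to your project):
# read_range src/app.ts 100 150
```

Lines 100-150 inclusive are 51 lines, a small fraction of what a full-file dump would add to the context.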

Prevention

Add this to your CLAUDE.md:

Use /compact proactively when context exceeds 60%.
Avoid reading entire large files - use line ranges instead.
Prefer focused sub-tasks over long sprawling sessions.
Keep CLAUDE.md under 2K tokens to preserve context for actual work.
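
To check the 2K-token guideline, a rough size estimate is usually sufficient. The sketch below assumes about 4 bytes per token, a common rule of thumb for English text and not an exact tokenizer count; the function names and the default path are hypothetical.

```shell
#!/usr/bin/env bash
# Rough check only: assumes ~4 bytes per token, which is a heuristic.

# Print the approximate token count of a file.
approx_tokens() {
  local file=$1 bytes
  bytes=$(wc -c < "$file" | tr -d '[:space:]')
  echo $(( bytes / 4 ))
}

# Report whether a file fits the budget; returns 1 when it does not.
# $1 = file (defaults to CLAUDE.md), $2 = token budget (defaults to 2000)
check_claude_md() {
  local file=${1:-CLAUDE.md} budget=${2:-2000} tokens
  tokens=$(approx_tokens "$file")
  if [ "$tokens" -gt "$budget" ]; then
    echo "over budget: ~${tokens} tokens (limit ${budget})"
    return 1
  fi
  echo "ok: ~${tokens} tokens"
}

# Example: run from the project root.
# check_claude_md CLAUDE.md
```

Because the function returns a non-zero status when the file is over budget, it could also serve as a pre-commit check.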

FAQ

What does /compact actually do?

The /compact command asks Claude to summarize the entire conversation into a condensed form, then replaces the full conversation history with that summary. This typically reduces context usage from 90%+ down to 30-50%, giving you room to continue working without losing the important context of what was accomplished.

Does compacting lose important context?

The summary preserves key decisions, file paths, and task progress. However, exact code snippets and verbose command outputs are condensed. If you need precise details from earlier in the conversation, you may need to re-read specific files after compacting. Adding a focus hint such as "/compact Focus on the auth module changes" helps preserve the most relevant context.

How can I prevent hitting the context limit?

Three strategies: (1) Use /compact proactively at 60% usage before you hit the wall. (2) Start focused sub-sessions for each distinct task instead of one long session. (3) Avoid reading entire large files — specify line ranges or ask Claude to read only the relevant function. A 5,000-line file dump costs 15,000+ tokens; a targeted 50-line read costs under 200.

Built by Michael Lip — solo dev, Da Nang.