Claude Temperature Settings Guide (2026)

Yes, you can change Claude's temperature — but only through the Anthropic API. Claude.ai (the web interface) and Claude Code (the CLI) do not expose a temperature slider or flag. This guide covers what temperature does, how to set it in the API, recommended values for different tasks, and alternative sampling parameters.

What Temperature Does

Temperature controls the randomness of the model's output. It modifies the probability distribution that Claude uses to select each token (word or word-piece) in its response.

Technical Explanation

When Claude generates text, it produces a probability distribution over all possible next tokens. Temperature scales these probabilities before sampling:

Mathematically, temperature divides the logits (raw model scores) before the softmax function converts them to probabilities. Lower temperature makes the distribution sharper (concentrated on top choices). Higher temperature makes it flatter (more spread across options).
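The scaling can be made concrete in a few lines of Python. This is an illustrative sketch of the math, not Anthropic's actual sampling code: it applies temperature to a toy set of logits and shows the distribution sharpening or flattening.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, scaled by temperature."""
    # Temperature 0 is a special case: greedy (argmax) selection.
    if temperature == 0:
        probs = [0.0] * len(logits)
        probs[max(range(len(logits)), key=lambda i: logits[i])] = 1.0
        return probs
    scaled = [x / temperature for x in logits]
    # Subtract the max for numerical stability before exponentiating.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy scores for three candidate tokens
for t in (0.0, 0.5, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
# 0.0 → [1.0, 0.0, 0.0]      all mass on the top token
# 0.5 → [0.844, 0.114, 0.042] sharper than the raw distribution
# 1.0 → [0.629, 0.231, 0.140] the unscaled distribution
```

Lowering the temperature concentrates probability on the top-scoring token; raising it spreads probability toward the alternatives.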

Practical Impact

| Temperature | Behavior | Best For |
|---|---|---|
| 0.0 | Deterministic, consistent | Code generation, factual answers, data extraction |
| 0.1-0.3 | Mostly consistent, slight variation | Technical writing, code review, analysis |
| 0.4-0.6 | Balanced creativity and coherence | General conversation, explanations |
| 0.7-0.9 | Creative, varied responses | Brainstorming, fiction, marketing copy |
| 1.0 | Maximum variety | Creative exploration, poetry, ideation |

Setting Temperature in the API

Python SDK

import anthropic

client = anthropic.Anthropic()

# Low temperature for code generation
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    temperature=0.0,
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to merge two sorted arrays."
        }
    ],
)

print(response.content[0].text)

# High temperature for brainstorming
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    temperature=0.9,
    messages=[
        {
            "role": "user",
            "content": "Give me 10 creative names for a productivity app."
        }
    ],
)

print(response.content[0].text)

TypeScript / JavaScript SDK

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// Deterministic output for data extraction
const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  temperature: 0.0,
  messages: [
    {
      role: 'user',
      content: 'Extract all email addresses from this text: ...',
    },
  ],
});

console.log(response.content[0].text);

// Creative output for content generation
const creative = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 2048,
  temperature: 0.8,
  messages: [
    {
      role: 'user',
      content: 'Write a short story about a robot learning to paint.',
    },
  ],
});

console.log(creative.content[0].text);

cURL (Direct API Call)

curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "temperature": 0.0,
    "messages": [
      {
        "role": "user",
        "content": "Explain the CAP theorem in one paragraph."
      }
    ]
  }'

Valid Range

The temperature parameter accepts values from 0.0 to 1.0, inclusive. Values outside this range will return an API error; the API does not support temperatures above 1.0.
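If the temperature comes from user input or configuration, it can be worth validating it client-side rather than waiting for the API to reject the request. A minimal sketch — the helper name `validate_temperature` is ours, not part of the SDK:

```python
def validate_temperature(value: float) -> float:
    """Raise early on out-of-range temperatures instead of burning an API call."""
    if not isinstance(value, (int, float)) or isinstance(value, bool):
        raise TypeError(f"temperature must be a number, got {type(value).__name__}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"temperature must be between 0.0 and 1.0, got {value}")
    return float(value)
```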

Claude Code and Temperature

No Direct Temperature Flag

Claude Code does not have a --temperature CLI flag. When you run claude in the terminal, the temperature is managed internally by the Claude Code client.

Influencing Output Style via CLAUDE.md

While you cannot set a numeric temperature in Claude Code, you can influence the output style through your CLAUDE.md instructions:

For more deterministic behavior, add to your project's CLAUDE.md:

# Code Style
- Always produce the most standard, conventional solution
- Avoid creative or unconventional approaches
- Match existing code patterns exactly
- Do not improvise when a standard pattern exists

For more creative behavior:

# Approach
- Explore multiple solution approaches before picking one
- Consider unconventional or novel solutions
- Propose creative alternatives when the standard approach has drawbacks

This is not temperature control — it is prompt engineering. But it achieves a similar practical effect.

API Provider Mode

If you are using Claude Code with your own Anthropic API key (set via the ANTHROPIC_API_KEY environment variable), the temperature is still managed internally by the Claude Code client. To get true temperature control, build a custom integration using the API directly.

Recommended Temperatures by Task

Complete Temperature Reference Table

| Task Type | Temperature | Rationale |
|---|---|---|
| Code generation | 0.0 | Deterministic, conventional solutions |
| Code review | 0.0 | Consistent issue detection across runs |
| Unit test generation | 0.0 | Reliable test structure and assertions |
| Bug fix suggestions | 0.0-0.1 | Accuracy over creativity |
| Data extraction (JSON, CSV) | 0.0 | Exact schema adherence |
| SQL query generation | 0.0 | Correctness-critical |
| Translation | 0.0-0.1 | Fidelity to source material |
| Summarization | 0.0-0.2 | Stays close to source, no embellishment |
| API documentation | 0.1-0.2 | Natural phrasing with technical accuracy |
| Technical writing | 0.1-0.2 | Slight variation for readability |
| Commit message generation | 0.0-0.1 | Concise and factual |
| Error message writing | 0.1-0.2 | Clear, varied phrasing |
| Email drafting (professional) | 0.3-0.5 | Natural tone without randomness |
| General conversation | 0.4-0.6 | Balanced and engaging |
| Explanation / teaching | 0.3-0.5 | Clear with some variety in examples |
| Product descriptions | 0.5-0.7 | Engaging copy, varied vocabulary |
| Blog post writing | 0.5-0.7 | Natural flow, avoids repetition |
| Marketing copy | 0.6-0.8 | Creative phrasing and hooks |
| Brainstorming / ideation | 0.8-1.0 | Maximum idea diversity |
| Creative writing (fiction) | 0.7-1.0 | Surprising word choices, varied structure |
| Poetry | 0.8-1.0 | Inventive language, unexpected metaphors |
| Name generation (brands, products) | 0.9-1.0 | Maximally diverse suggestions |
| Roleplaying / dialogue | 0.6-0.8 | Character-appropriate variation |
| Alternative solution exploration | 0.5-0.7 | Different approaches per run |
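If you route many task types through one API wrapper, recommendations like these can live in code as a simple lookup. The mapping below is our own condensation (one value picked from each recommended range), not an official default:

```python
# One conservative pick from each recommended range.
TASK_TEMPERATURES = {
    "code_generation": 0.0,
    "code_review": 0.0,
    "data_extraction": 0.0,
    "translation": 0.0,
    "summarization": 0.1,
    "technical_writing": 0.2,
    "conversation": 0.5,
    "marketing_copy": 0.7,
    "brainstorming": 0.9,
    "creative_writing": 0.9,
}

def temperature_for(task: str, default: float = 1.0) -> float:
    """Look up a recommended temperature; fall back to the API default (1.0)."""
    return TASK_TEMPERATURES.get(task, default)
```

The wrapper then passes `temperature=temperature_for(task)` into `client.messages.create`, keeping the policy in one place.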

Code Generation: 0.0

Deterministic output ensures the same correct solution every time. At temperature 0.0, Claude produces the most likely (usually most conventional and correct) code.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    temperature=0.0,
    messages=[{"role": "user", "content": "Implement a binary search tree in Python with insert, delete, and search methods."}],
)

Code Review: 0.0

Consistency matters for code review. You want the same issues flagged every time.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    temperature=0.0,
    messages=[{"role": "user", "content": f"Review this code for bugs and performance issues:\n\n{code}"}],
)

Technical Documentation: 0.1-0.2

Slight variation produces more natural writing while maintaining accuracy.

Creative Writing: 0.7-1.0

Higher temperatures produce more surprising word choices, varied sentence structures, and creative ideas.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    temperature=0.9,
    messages=[{"role": "user", "content": "Write a poem about debugging at 3 AM."}],
)

Brainstorming and Ideation: 0.8-1.0

Maximum creativity for generating diverse ideas.

Data Analysis and Extraction: 0.0

Precision is critical when extracting structured data from unstructured text.

Summarization: 0.0-0.2

Low temperature ensures the summary sticks to the source material without embellishment.

Translation: 0.0-0.1

Accuracy over creativity for translation tasks.

Temperature Experiments: Same Prompt, Different Results

To illustrate how temperature affects output, here is the same prompt sent at three different temperatures. The prompt is: "Suggest a variable name for a function that calculates the total price including tax."

Temperature 0.0 (three runs produce identical output):

calculate_total_with_tax

Temperature 0.5 (three runs produce similar but slightly varied output):

Run 1: calculate_total_with_tax
Run 2: compute_price_with_tax
Run 3: calculate_total_with_tax

Temperature 1.0 (three runs produce noticeably different output):

Run 1: get_taxed_total
Run 2: compute_final_price_after_tax
Run 3: calculate_gross_amount

At temperature 0.0, every run returns the same answer. At 0.5, the model occasionally picks a different but equally valid phrasing. At 1.0, every run explores a different part of the vocabulary, which is exactly what you want for brainstorming but not for deterministic code generation.

Temperature Experiment: Same Prompt at 5 Settings

Abstract descriptions of temperature are not enough. Here is a concrete experiment: the same coding prompt sent to Claude Sonnet 4 at five temperature settings. The prompt is: "Write a Python function to validate email addresses. Return True if valid, False otherwise."

Temperature 0.0 (Deterministic)

import re

def validate_email(email: str) -> bool:
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

Three consecutive runs produced byte-identical output. The function uses a standard regex pattern found in most Python tutorials. Straightforward, conventional, correct for common cases.

Temperature 0.25 (Slightly Variable)

import re

def validate_email(email: str) -> bool:
    if not email or '@' not in email:
        return False
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

Run 2 produced the same output. Run 3 added a len(email) > 254 check. At 0.25, the core approach stays identical but minor defensive additions appear occasionally. The added guard clause is a genuine improvement over 0.0.

Temperature 0.5 (Balanced)

import re

def validate_email(email: str) -> bool:
    if not isinstance(email, str) or not email.strip():
        return False
    pattern = r'^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]{2,63}$'
    if not re.match(pattern, email):
        return False
    local, domain = email.rsplit('@', 1)
    if len(local) > 64 or len(domain) > 253:
        return False
    return True

Run 2 used a different approach: splitting on '@' first, then validating each part separately. Run 3 added TLD length checking. At 0.5, the structural approach varies between runs. Each output is correct but emphasizes different validation aspects.

Temperature 0.75 (Creative)

def validate_email(email: str) -> bool:
    try:
        local, domain = email.rsplit('@', 1)
    except (ValueError, AttributeError):
        return False

    if not local or len(local) > 64:
        return False
    if not domain or '.' not in domain:
        return False

    forbidden = set(' "(),:;<>@[\\]')
    if any(c in forbidden for c in local):
        return False

    parts = domain.split('.')
    return all(
        part and len(part) <= 63 and part[0] != '-' and part[-1] != '-'
        for part in parts
    ) and len(parts[-1]) >= 2

No regex at all. Run 2 used the email.utils module from the standard library. Run 3 built a state machine parser. At 0.75, Claude explores fundamentally different implementation strategies. The regex-free approach shown above is more RFC-compliant than the regex version at 0.0.

Temperature 1.0 (Maximum Variety)

import idna  # third-party package: pip install idna

def validate_email(email: str, allow_unicode_local: bool = False) -> bool:
    if not isinstance(email, str):
        return False

    at_count = email.count('@')
    if at_count != 1:
        return False

    local, domain = email.split('@')

    if not local or len(local.encode('utf-8')) > 64:
        return False

    try:
        domain = idna.encode(domain, uts46=True).decode('ascii')
    except idna.IDNAError:
        return False

    if not all(
        c.isalnum() or c in '.-'
        for c in domain
    ):
        return False

    labels = domain.split('.')
    return (
        len(labels) >= 2
        and all(0 < len(l) <= 63 for l in labels)
        and len(labels[-1]) >= 2
    )

Run 2 created an async version with DNS MX record checking. Run 3 built a class-based validator with configurable strictness levels. At 1.0, every run produces a substantially different implementation. The IDNA-aware version above handles internationalized domain names, which no other temperature produced.

Scored Results

| Setting | Consistency (3 runs) | RFC Compliance | Lines of Code | Defensive Checks | Practical Use Case |
|---|---|---|---|---|---|
| 0.0 | 3/3 identical | Basic | 4 | 0 | Quick validation, form fields |
| 0.25 | 2/3 identical | Basic+ | 6 | 1-2 | Production forms with edge cases |
| 0.5 | 0/3 identical | Good | 10 | 3-4 | Backend API validation |
| 0.75 | 0/3 identical | Strong | 14 | 5+ | Email service infrastructure |
| 1.0 | 0/3 identical | Comprehensive | 18+ | 6+ | Standards-compliant email systems |

The takeaway: temperature 0.0 gives you the answer you expect. Temperature 0.5-0.75 gives you the answer you need. For production code where you want the "right" solution selected from multiple valid approaches, run the prompt at 0.5 and review the output. For exploration, run at 0.75-1.0 three times and pick the best result.

Alternative Sampling Parameters

Temperature is not the only way to control output randomness. The Anthropic API supports additional parameters.

top_p (Nucleus Sampling)

top_p limits the model to choosing from the smallest set of tokens whose cumulative probability exceeds the specified value.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    top_p=0.9,
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)

top_k

top_k limits the model to choosing from only the top K most likely tokens.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    top_k=40,
    messages=[{"role": "user", "content": "List common design patterns."}],
)

Combining Parameters

You can use temperature together with top_p and top_k:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    temperature=0.7,
    top_p=0.95,
    top_k=50,
    messages=[{"role": "user", "content": "Suggest innovative features for a task management app."}],
)

Anthropic recommends either adjusting temperature or top_p, not both simultaneously in most cases. Using both can produce unexpected interactions.

How Temperature, top_p, and top_k Interact Technically

These three parameters apply in sequence during token sampling:

  1. top_k filter runs first. If top_k=40, the model discards all tokens except the 40 highest-probability ones. The remaining tokens' probabilities are renormalized to sum to 1.0.
  2. top_p filter runs second. From the surviving tokens, it keeps the smallest set whose cumulative probability exceeds top_p. If top_p=0.9, it walks down the ranked list until probabilities sum to 0.9, then discards the rest.
  3. Temperature scaling runs last. The logits of the remaining tokens are divided by the temperature value before the final softmax. Lower temperature sharpens the distribution (the top token gets even more probability), higher temperature flattens it.
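The three-step sequence above can be sketched in Python. This is a toy reimplementation for building intuition — it mirrors the order described here (top_k, then top_p, then temperature), not Anthropic's actual server-side sampler:

```python
import math
import random

def sample_token(logits, top_k=None, top_p=None, temperature=1.0):
    """Toy sampler: top_k filter, then top_p filter, then temperature scaling."""
    # Start from probabilities over all candidate token indices, ranked high to low.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    candidates = sorted(
        ((i, e / total) for i, e in enumerate(exps)),
        key=lambda pair: pair[1], reverse=True,
    )

    # 1. top_k: keep only the K highest-probability tokens.
    if top_k is not None:
        candidates = candidates[:top_k]

    # 2. top_p: keep the smallest prefix whose cumulative probability
    #    reaches top_p (always at least one token).
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i, p in candidates:
            kept.append((i, p))
            cumulative += p
            if cumulative >= top_p:
                break
        candidates = kept

    # 3. temperature: rescale the survivors and sample.
    #    Raising p to the power 1/T is equivalent to dividing logits by T.
    if temperature == 0:
        return candidates[0][0]  # greedy: highest-probability survivor
    weights = [p ** (1.0 / temperature) for _, p in candidates]
    return random.choices([i for i, _ in candidates], weights=weights, k=1)[0]
```

Playing with the arguments makes the interaction effects visible: `top_k=2` with `temperature=1.0` randomly picks between only two tokens, which is exactly the "small flattened pool" problem described below.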

This means combining low top_k with high temperature can produce strange results: you restrict the candidate pool to a few tokens, then flatten their probabilities, making the model randomly choose among a very small set. Conversely, combining high top_k with low temperature effectively ignores the top_k because temperature drives the selection toward the top tokens anyway.

Practical recommendation: For most applications, adjust temperature alone and leave top_p and top_k at their defaults. If you need finer control, use top_p as a safety ceiling (set it to 0.95 to prevent extremely unlikely tokens from being selected) while using temperature for the primary creativity control. Reserve top_k for specialized applications where you need an absolute cap on the candidate pool size.

Temperature and Model Selection

Different Claude models may behave differently at the same temperature: a value tuned for one model will not necessarily produce the same degree of variation on another, so re-test your settings when switching models.

The temperature parameter itself works identically across all models — the valid range is always 0.0 to 1.0.

Frequently Asked Questions

Can I set temperature in Claude.ai?

No. The Claude.ai web interface does not expose a temperature control. Claude.ai uses Anthropic's default settings. To control temperature, use the API directly.

What is the default temperature for Claude?

The default temperature is 1.0. If you do not specify a temperature in your API call, Claude uses the full probability distribution for sampling.

Does lower temperature mean better code?

Not necessarily better, but more consistent and conventional. Temperature 0.0 produces the most standard solution, which is usually what you want for production code. But for exploring alternative approaches, a slightly higher temperature (0.2-0.4) can reveal solutions you would not have considered.

Can I set temperature per-message in a conversation?

Temperature is set per API call, not per message. In a multi-turn conversation, you can change the temperature between calls, but you must include the full conversation history each time.
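One way to structure this is to keep the history in one list and build each call's parameters with its own temperature. A sketch of the bookkeeping — the `build_request` helper is ours, and the actual `client.messages.create(**req)` call is elided:

```python
def build_request(history, prompt, temperature, model="claude-sonnet-4-20250514"):
    """Assemble kwargs for one messages.create call.

    Temperature applies to the whole call; the full history is resent each time.
    """
    messages = history + [{"role": "user", "content": prompt}]
    return {
        "model": model,
        "max_tokens": 1024,
        "temperature": temperature,
        "messages": messages,
    }

history = []
# Turn 1: factual question at temperature 0.0
req1 = build_request(history, "Summarize the CAP theorem.", 0.0)
# ... send via client.messages.create(**req1), then record both turns ...
history += [req1["messages"][-1], {"role": "assistant", "content": "..."}]
# Turn 2: creative follow-up at 0.9, carrying the earlier turns along
req2 = build_request(history, "Now brainstorm ways to explain it to kids.", 0.9)
```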

Does temperature affect Claude's reasoning quality?

At very high temperatures (0.9-1.0), Claude may occasionally produce less coherent reasoning because the sampling is more random. For tasks requiring careful logical reasoning, lower temperatures (0.0-0.3) generally produce more reliable results.

Is temperature 0.0 truly deterministic?

Nearly. At temperature 0.0, the API uses greedy decoding (always selecting the highest-probability token). In practice, results are very consistent, though minor variations can occur due to floating-point arithmetic in distributed systems.

How does temperature interact with system prompts?

Temperature and system prompts are independent controls. A well-crafted system prompt can constrain output style regardless of temperature. Using both together gives you fine-grained control — the system prompt sets the frame, and temperature controls variation within it.

Should I use temperature or top_p?

For most use cases, temperature is simpler and more intuitive. Use top_p when you want to cap the randomness without affecting the relative probabilities of the top tokens. Anthropic's general recommendation is to adjust one or the other, not both.


Built by Michael Lip — solo dev, Da Nang.