D1.3 · Domain 1 · Agentic Architectures · 27% of CCA-F

Stop Reason.

7 min read·10 sections·Tier A

stop_reason is the authoritative struct field that says why Claude stopped, either end_turn (finished) or tool_use (wants to call a tool). Parsing natural-language phrases for termination is the most-tested distractor; it's unreliable and fails in production. Checking stop_reason is the only deterministic loop control. messages API enum

Canonical control signalDomain 1Heavily tested
Stop Reason, hero illustration featuring Loop mascot in a warm gallery scene.
Domain D1Agentic Architectures · 27%
On this page
01 · Summary

TLDR

stop_reason is the authoritative struct field that says why Claude stopped, either end_turn (finished) or tool_use (wants to call a tool). Parsing natural-language phrases for termination is the most-tested distractor; it's unreliable and fails in production. Checking stop_reason is the only deterministic loop control. messages API enum

2
Core values
1
Canonical pattern
3
Anti-patterns
D1
Exam domain
100%
Reliability vs NL parsing
02 · Definition

What it is

stop_reason is a four-value enum field on every messages.create() response that signals why Claude stopped generating. The values you act on: end_turn (model is done), tool_use (model wants tools executed), max_tokens (output budget exhausted), and stop_sequence (custom stop string matched). It is the authoritative termination signal in any agentic loop.

The contract is structural, not linguistic. A response can contain the words "I'm done" while stop_reason is still tool_use, the text is preamble ("let me verify the customer first"), the tool_use is the real action. Conversely, a response can be silent on completion while stop_reason: "end_turn" signals it. Always read the field; never read the text.

The two production branches you write code for every day are end_turn (exit the loop) and tool_use (execute tools, append results, continue). Everything else is either a graceful partial (max_tokens) or a rare edge case (stop_sequence). The exam drills these four branches relentlessly because text-shape parsing is the #1 bug: developers check content[0].type === "text" to decide "the agent is done," but a typical assistant message is [text, tool_use, tool_use].

`max_tokens` is not an error, it's a normal partial-result signal. When the output budget is exhausted mid-task, Claude emits stop_reason: "max_tokens" with whatever was generated so far. Save the partial work, then either raise max_tokens for a retry or chunk the input. Treating it as a crash loses real progress. The exam tests whether you recognize partial completion as a design decision, not a failure mode.

03 · Mechanics

How it works

Every messages.create() call returns a Message object with a stop_reason field. The value is set by Claude during generation: if the model finishes its response, stop_reason is end_turn; if it requests a tool, the first tool_use block is appended to content and stop_reason becomes tool_use; if output reaches max_tokens, generation stops there. The field is always present and always one of these four values.

The canonical branching pattern: if stop_reason == "end_turn" then exit else if stop_reason == "tool_use" then execute and continue else if stop_reason == "max_tokens" then save partial else if stop_reason == "stop_sequence" then inspect match. Missing a branch is a logic bug, it crashes silently or retries indefinitely. TypeScript's discriminated unions make this safer: the compiler catches missed cases.

Tool execution happens only when stop_reason == "tool_use". In that case, content is an array containing one or more tool_use blocks (with name, id, input JSON). Your harness iterates, executes the tool, appends a tool_result block to the message list. The list grows: [user_msg, assistant_resp_1, tool_results_1, assistant_resp_2, tool_results_2, ...]. Growth is bounded by context size and by the stop_reason signal.

The max_tokens parameter sets the output budget per turn, not globally. If you ask for 2048 tokens and the model runs out at 1800, stop_reason: "max_tokens" arrives. The next call with max_tokens=4096 gets a fresh budget. This distinction matters: `max_tokens` is per-turn, not cumulative. Many developers confuse this and assume a single max_tokens value caps the entire loop. It does not.

Stop Reason mechanics, painterly diagram featuring Loop mascot.
04 · In production

Where you'll see it

Chatbot turn termination

User asks for an account balance. Agent calls get_balance. Response: stop_reason='tool_use'. Agent executes, appends result. Next turn: stop_reason='end_turn' with the answer. If code instead checks content[0].type === 'text', it exits at any text presence, even mid-tool-use.

Long-document extraction with token limits

Extracting entities from a 200-page contract. Mid-extraction, stop_reason='max_tokens' arrives. Code must save state, return partial result, and either chunk or raise the limit. Treating max_tokens as success silently truncates output.

Headless CI run

Claude in -p mode emits a single response. CI script reads stop_reason from the JSON output. end_turn → success path; max_tokens → truncated; stop_sequence → matched a custom guard. The script must branch all four outcomes.

05 · Implementation

Code examples

Explicit handler for all four stop_reason values
from anthropic import Anthropic

client = Anthropic()

def handle_response(resp) -> dict:
    """Branch on stop_reason. Never inspect content shape for termination."""

    if resp.stop_reason == "end_turn":
        # Normal completion. Extract text and exit loop.
        text = "".join(b.text for b in resp.content if b.type == "text")
        return {"status": "ok", "text": text}

    if resp.stop_reason == "tool_use":
        # Continue loop: execute tools, append results, resend.
        return {"status": "continue", "tool_calls": [
            {"id": b.id, "name": b.name, "input": b.input}
            for b in resp.content if b.type == "tool_use"
        ]}

    if resp.stop_reason == "max_tokens":
        # Token budget exhausted. Return partial; chunk on retry.
        partial = "".join(b.text for b in resp.content if b.type == "text")
        return {"status": "partial", "text": partial,
                "next_action": "chunk_input_or_raise_max_tokens"}

    if resp.stop_reason == "stop_sequence":
        # A custom stop_sequence matched. Treat as intentional termination.
        return {"status": "stopped", "matched_sequence": resp.stop_sequence}

    # Unknown stop_reason, log and fail safely.
    return {"status": "error", "stop_reason": resp.stop_reason}
Each stop_reason value gets its own branch. max_tokens is NOT an error, it's a normal partial result. Don't treat it like a crash.
06 · Distractor patterns

Looks right, isn't

Each row pairs a plausible-looking pattern with the failure it actually creates. These are the shapes exam distractors are built from.

Looks right

If response has no tool_use blocks, the agent is done.

Actually wrong

A response can have both text and tool_use blocks. stop_reason is the authoritative field. If stop_reason='tool_use', continue the loop even if text is present.

Looks right

Treat max_tokens as a failure and abort.

Actually wrong

max_tokens is a normal partial-result signal. Save what was returned, then either raise max_tokens or chunk the input. Aborting loses the partial work.

Looks right

Stop the agent preemptively when token usage approaches the cap.

Actually wrong

Preemptive stops lose real work and behave unpredictably. Let the agent finish its turn and read stop_reason, Claude already manages this gracefully.

Looks right

Streaming responses don't have a stop_reason, so handle them differently.

Actually wrong

Streaming responses do carry stop_reason, it arrives in the final `message_delta` event of the SSE stream. The shape is {type: "message_delta", delta: {stop_reason: "end_turn"}}. Code that only inspects content deltas and ignores the closing message_delta will miss the termination signal entirely.

Looks right

When stop_reason: "refusal" arrives, log it and abort the loop.

Actually wrong

`refusal` is a fifth value that lands when Claude declines for safety reasons; it is not interchangeable with `end_turn`. Aborting silently hides a policy event from your audit log. Treat it like its own branch: surface to the user, capture in telemetry, and never retry blindly with the same prompt, you'll burn budget on guaranteed refusals.

07 · Compare

Side-by-side

stop_reasonMeaningNext actionProduction risk
end_turnAgent finished naturallyExit loop, present textLow, success path
tool_useAgent wants to call a toolExecute, append result, continueHigh if loop checks text instead
max_tokensOutput token budget hitReturn partial, plan retry/chunkMedium, surfaces undersized max_tokens
stop_sequenceCustom stop string matchedExit, inspect for intentLow, only if you set it intentionally
refusalSafety policy declined the responseSurface to user; do not retry blindlyMedium, signals prompt or policy issue
pause_turnLong-running tool (e.g. server tools) paused mid-turnResume by re-sending the message listLow, only with extended/server tools
08 · When to use

Decision tree

01

Are you building an agentic loop with tools?

YesBranch on stop_reason: tool_use → execute + continue; end_turn → exit. Never check content shape.
NoSingle-turn call. Read stop_reason for diagnostics, but it's not control-flow critical.
02

Could max_tokens fire on long inputs?

YesPlan partial-result handling: save state, chunk input, or raise the limit on retry.
NoFocus on end_turn / tool_use only. max_tokens is a fallback.
03

Are you setting a custom stop_sequence parameter?

YesAdd an explicit branch for stop_sequence. Treat as intentional termination, not error.
NoYou'll only see end_turn / tool_use / max_tokens in production.
04

Are you streaming with stream: true or the SDK's stream helper?

YesRead stop_reason from the final message_delta event, not from content_block_delta events. Buffer the full stream and inspect final_message.stop_reason.
NoRead response.stop_reason directly off the message object.
05

Are you in a regulated domain (medical, legal, finance) where refusals matter?

YesAdd an explicit refusal branch. Log to your audit pipeline; never retry the same prompt automatically; surface a sanitized message to the end user.
NoRefusals are rare in benign domains. Catch them in the default branch with a generic message.
09 · On the exam

Question patterns

Stop Reason exam trap, painterly cautionary scene featuring Loop mascot.

27 V2 questions wired to this concept. Tap an answer to check it instantly — you'll see whether it's right and why — then expand the full breakdown for the mental model and all four rationales.

Your agentic loop keeps running after Claude has clearly finished its task. Which control was most likely missed?

Tap your answer to check it.

An agent loop hits a budget ceiling and you keep raising max_iterations to make it pass. What is the real problem?

Tap your answer to check it.

Your refund agent emits the words I am processing your refund now but then makes another tool call. The harness exits early. Why?

Tap your answer to check it.

Long-document extraction stalls at chunk 18 every single time, regardless of which model you use. What is happening?

Tap your answer to check it.

Your code-review CI bot returns valid JSON with empty findings: [] even on PRs that clearly have issues. Why?

Tap your answer to check it.

A subagent returns stop_reason: max_tokens with a partial summary. Production code aborts. What should it do instead?

Tap your answer to check it.

21 additional questions for this concept live in the practice pillar. Take a mock exam ↗

10 · FAQ

Frequently asked

Can `stop_reason` change between streaming chunks and the final response?
No. `stop_reason` is set once at generation end and arrives in the closing message_delta event. Intermediate content_block_delta events have no stop_reason field. If you read it before the stream closes, you'll get null, that's the bug.
How do `stop_reason` and `stop_sequence` (the parameter) relate?
stop_sequences is a request parameter (an array of strings). stop_reason is a response field. When one of your sequences matches, stop_reason === "stop_sequence" and response.stop_sequence holds the matched string. Without setting the parameter, you'll never see this value.
If `stop_reason: "max_tokens"` arrives mid-tool_use, is the tool call still valid?
It's valid only if the tool_use block is complete (closing } of the input JSON). Truncated tool calls have malformed input, your harness must validate before executing. A partially-streamed tool_use should be discarded; raise max_tokens and re-send.
What's the cost difference between `end_turn` and `max_tokens` exits?
Same input cost; same output cost up to the limit. `max_tokens` doesn't add a penalty, you pay for what was generated. The hidden cost is the retry: if you re-run with a higher limit, you pay for the prefix again unless you cache it via cache_control.
Should I set `max_tokens` as high as possible to avoid `max_tokens` stop_reason?
No. max_tokens reserves capacity but doesn't charge for unused tokens, however, higher values increase latency because the model plans a longer generation. Set it to a realistic ceiling for the task (e.g. 2048 for chat, 8192 for extraction), not the absolute max.
How do I detect that a loop is stuck because I forgot to append `tool_result`?
Symptom: stop_reason: "tool_use" repeats with the same `tool_use.id` across turns. The model is waiting for that specific result. Log the id per iteration; if the same id appears twice, your harness skipped the append. Without tool_result, Claude has no way to advance.
Does `stop_reason` exist on the Batch API responses?
Yes, identically. Each batched message has a stop_reason in its result object. Batch jobs that include tool-using messages are unusual (you can't continue a loop async), but the field is still populated for diagnostics.
What happens to `stop_reason` when extended thinking is enabled?
Same four-plus values. Extended thinking emits a thinking block before the response, but stop_reason reflects the final generation state. Thinking blocks don't change termination semantics, they're additional content, not a separate signal.
Can a single response have `stop_reason: "end_turn"` AND a `tool_use` block?
No, never. If a tool_use block is present, stop_reason is always tool_use. The mutual exclusion is structural: Claude commits to one or the other per turn. If you see end_turn with a tool block in your code, it's a parsing bug, you're inspecting the wrong response object.
How do I test stop_reason branches in unit tests without burning API budget?
Mock the SDK response shape. The Message object is plain JSON: {stop_reason, content, usage, ...}. Construct fixtures for each value (end_turn, tool_use, max_tokens, stop_sequence, refusal) and feed them through your handler. Never use real API calls in tests for control-flow logic, the test isn't validating Claude's behavior, it's validating your branching.
11 · Practice with AI

Work this with your AI

Work this concept hands-on with Claude Code, Codex, or claude.ai. Copy a prompt, paste it into your assistant, and practise in tandem. Each one keeps you active (explain it back, get drilled, or build) rather than just reading.

  • Drill it like the exam (scenario MCQs)
    Practice in the exam's scenario-MCQ format with trap awareness.
  • Explain it back (Feynman)
    Build durable, transferable understanding of a concept you can half-state.
  • Test me, adapting the difficulty
    Active recall practice on a concept you think you know.
  • Check my prerequisites first
    Before studying a concept that keeps not sticking.
  • Find the high-leverage 20%
    When a domain feels too big and you are short on time.
Self-check

Test yourself

Three diagnostic questions on this primitive. Reveal each answer when you have a guess. Want a full 60-question mock? Open the mock hub →

Q1Your loop checks `response.text.includes('done')` to decide termination. What can go wrong?
Claude may say "I'm done now" while emitting a tool_use block in the same response. The text is preamble; the tool_use is the real next step. Branch on stop_reason, not text.
Q2A response has both a text block and a tool_use block. Which should you handle first?
Branch on stop_reason. If it's tool_use, execute the tool call regardless of text presence. If it's end_turn, the text is the final response. Block-level inspection is unreliable; the field is authoritative.
Q3You set `max_iterations = 5` to prevent infinite loops. The agent fails on legitimate 7-iteration tasks. What's the real fix?
Find why the loop is unbounded: missing tool_result append, ambiguous tool descriptions, or two tools that look interchangeable. Caps mask bugs; stop_reason is the primary signal. Raise the cap to a safety buffer, not the primary control.
Last reviewed: 2026-05-04·Refresh cadence: monthly
D1.3 · D1 · Agentic Architectures

Stop Reason, complete.

You've covered the full ten-section breakdown for this primitive, definition, mechanics, code, false positives, comparison, decision tree, exam patterns, and FAQ. One technical primitive down on the path to CCA-F.

More platforms →