# Stop Reason

> stop_reason is the authoritative struct field that says why Claude stopped; in an agentic loop the two branches you act on are end_turn (finished) and tool_use (wants to call a tool). Parsing natural-language phrases for termination is the most-tested distractor: it's unreliable and fails in production. Checking stop_reason is the only deterministic loop control.

**Domain:** D1 · Agentic Architectures (27% of CCA-F exam)
**Canonical:** https://claudearchitectcertification.com/concepts/stop-reason
**Last reviewed:** 2026-05-04

## Quick stats

- **Core values:** 2
- **Canonical pattern:** 1
- **Anti-patterns:** 3
- **Exam domain:** D1
- **Reliability vs NL parsing:** 100%

## What it is

stop_reason is an enum field on every messages.create() response that signals why Claude stopped generating. The four values you act on most often: end_turn (model is done), tool_use (model wants tools executed), max_tokens (output budget exhausted), and stop_sequence (custom stop string matched); two rarer values, refusal and pause_turn, round out the set. It is the authoritative termination signal in any agentic loop.

The contract is structural, not linguistic. A response can contain the words "I'm done" while stop_reason is still tool_use; the text is preamble ("let me verify the customer first") and the tool_use is the real action. Conversely, a response can be silent on completion while stop_reason: "end_turn" signals it. Always read the field; never read the text.

The two production branches you write code for every day are end_turn (exit the loop) and tool_use (execute tools, append results, continue). Everything else is either a graceful partial (max_tokens) or a rare edge case (stop_sequence). The exam drills these branches relentlessly because text-shape parsing is the #1 bug: developers check content[0].type == "text" to decide "the agent is done," but a typical assistant message is [text, tool_use, tool_use].
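A minimal illustration of why the text-shape check misleads, using a mocked response object (no API call; the block shapes mimic the SDK's Message structure):

```python
from types import SimpleNamespace

# Mocked response shaped like a typical tool-calling turn:
# a text preamble followed by a tool_use block.
resp = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="text",
                        text="Let me verify the customer first."),
        SimpleNamespace(type="tool_use", id="toolu_01",
                        name="get_customer", input={"id": 42}),
    ],
)

# Wrong: the first block is text, so this check declares the agent done.
wrongly_done = resp.content[0].type == "text"   # True

# Right: the authoritative field says the loop must continue.
actually_done = resp.stop_reason == "end_turn"  # False
```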

max_tokens is not an error; it's a normal partial-result signal. When the output budget is exhausted mid-task, Claude emits stop_reason: "max_tokens" with whatever was generated so far. Save the partial work, then either raise max_tokens for a retry or chunk the input. Treating it as a crash loses real progress. The exam tests whether you recognize partial completion as a design decision, not a failure mode.

## How it works

Every messages.create() call returns a Message object with a stop_reason field. The value is set by Claude during generation: if the model finishes its response, stop_reason is end_turn; if it requests a tool, the tool_use blocks are appended to content and stop_reason becomes tool_use; if output reaches max_tokens, generation stops there. The field is always present and always one of the documented enum values.

The canonical branching pattern: if stop_reason == "end_turn", exit; else if stop_reason == "tool_use", execute and continue; else if stop_reason == "max_tokens", save the partial; else if stop_reason == "stop_sequence", inspect the match. Missing a branch is a logic bug: it either crashes silently or retries indefinitely. TypeScript's discriminated unions make this safer: the compiler catches missed cases.

Tool execution happens only when stop_reason == "tool_use". In that case, content is an array containing one or more tool_use blocks (each with name, id, and input JSON). Your harness iterates, executes each tool, and appends a tool_result block to the message list. The list grows: [user_msg, assistant_resp_1, tool_results_1, assistant_resp_2, tool_results_2, ...]. Growth is bounded by context size and by the stop_reason signal.
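The loop just described can be sketched as a client-agnostic function. This is a hedged sketch: the client, tool definitions, `execute_tool` dispatcher, and model id are all placeholders you supply (pass a real `Anthropic()` client in production):

```python
def run_agent(client, messages, tools, execute_tool,
              model="claude-sonnet-4-20250514", max_tokens=1024):
    """Minimal agentic loop: branch on stop_reason, never on content shape."""
    while True:
        resp = client.messages.create(model=model, max_tokens=max_tokens,
                                      tools=tools, messages=messages)
        if resp.stop_reason == "end_turn":
            # Done: collect the final text and exit the loop.
            return "".join(b.text for b in resp.content if b.type == "text")
        if resp.stop_reason == "tool_use":
            # Append the assistant turn, then one tool_result per tool_use block.
            messages.append({"role": "assistant", "content": resp.content})
            results = [{"type": "tool_result", "tool_use_id": b.id,
                        "content": execute_tool(b.name, b.input)}
                       for b in resp.content if b.type == "tool_use"]
            messages.append({"role": "user", "content": results})
            continue
        # max_tokens / stop_sequence / anything else: surface to the caller.
        raise RuntimeError(f"unhandled stop_reason: {resp.stop_reason}")
```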

The max_tokens parameter sets the output budget per turn, not globally. If you ask for 2048 tokens and the model runs out at 1800, stop_reason: "max_tokens" arrives. The next call with max_tokens=4096 gets a fresh budget. This distinction matters: max_tokens is per-turn, not cumulative. Many developers confuse this and assume a single max_tokens value caps the entire loop. It does not.
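The per-turn budget can be exploited with a simple escalating retry. A sketch under stated assumptions: the client and model id are placeholders, and the budget ladder is illustrative:

```python
def call_with_growing_budget(client, messages, model, budgets=(1024, 4096)):
    """Retry with a larger per-turn output budget when max_tokens fires.
    Each call gets a fresh budget; max_tokens is never cumulative."""
    for max_tokens in budgets:
        resp = client.messages.create(model=model, max_tokens=max_tokens,
                                      messages=messages)
        if resp.stop_reason != "max_tokens":
            return resp
    # Still partial after the largest budget: the caller saves the
    # partial result and chunks the input instead.
    return resp
```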

## Where you'll see it in production

### Chatbot turn termination

User asks for an account balance. Agent calls get_balance. Response: stop_reason='tool_use'. Agent executes, appends result. Next turn: stop_reason='end_turn' with the answer. If code instead checks content[0].type == 'text', it exits whenever text is present, even mid-tool-use.

### Long-document extraction with token limits

Extracting entities from a 200-page contract. Mid-extraction, stop_reason='max_tokens' arrives. Code must save state, return partial result, and either chunk or raise the limit. Treating max_tokens as success silently truncates output.

### Headless CI run

Claude in -p mode emits a single response. CI script reads stop_reason from the JSON output. end_turn → success path; max_tokens → truncated; stop_sequence → matched a custom guard. The script must branch on all four outcomes.

## Code examples

### Explicit handler for all four stop_reason values

```python
from anthropic import Anthropic

client = Anthropic()

def handle_response(resp) -> dict:
    """Branch on stop_reason. Never inspect content shape for termination."""

    if resp.stop_reason == "end_turn":
        # Normal completion. Extract text and exit loop.
        text = "".join(b.text for b in resp.content if b.type == "text")
        return {"status": "ok", "text": text}

    if resp.stop_reason == "tool_use":
        # Continue loop: execute tools, append results, resend.
        return {"status": "continue", "tool_calls": [
            {"id": b.id, "name": b.name, "input": b.input}
            for b in resp.content if b.type == "tool_use"
        ]}

    if resp.stop_reason == "max_tokens":
        # Token budget exhausted. Return partial; chunk on retry.
        partial = "".join(b.text for b in resp.content if b.type == "text")
        return {"status": "partial", "text": partial,
                "next_action": "chunk_input_or_raise_max_tokens"}

    if resp.stop_reason == "stop_sequence":
        # A custom stop_sequence matched. Treat as intentional termination.
        return {"status": "stopped", "matched_sequence": resp.stop_sequence}

    # Unknown stop_reason, log and fail safely.
    return {"status": "error", "stop_reason": resp.stop_reason}

```

> Each stop_reason value gets its own branch. max_tokens is NOT an error; it's a normal partial result. Don't treat it like a crash.

### TypeScript handler with discriminated union

```typescript
import Anthropic from "@anthropic-ai/sdk";

type LoopOutcome =
  | { status: "ok"; text: string }
  | { status: "continue"; toolCalls: Anthropic.ToolUseBlock[] }
  | { status: "partial"; text: string; nextAction: string }
  | { status: "stopped"; matched?: string }
  | { status: "error"; stopReason: string | null };

function handleResponse(resp: Anthropic.Message): LoopOutcome {
  switch (resp.stop_reason) {
    case "end_turn":
      return {
        status: "ok",
        text: resp.content
          .filter((b): b is Anthropic.TextBlock => b.type === "text")
          .map((b) => b.text)
          .join(""),
      };
    case "tool_use":
      return {
        status: "continue",
        toolCalls: resp.content.filter(
          (b): b is Anthropic.ToolUseBlock => b.type === "tool_use",
        ),
      };
    case "max_tokens":
      return {
        status: "partial",
        text: resp.content
          .filter((b): b is Anthropic.TextBlock => b.type === "text")
          .map((b) => b.text)
          .join(""),
        nextAction: "chunk_or_raise_max_tokens",
      };
    case "stop_sequence":
      return { status: "stopped", matched: resp.stop_sequence ?? undefined };
    default:
      return { status: "error", stopReason: resp.stop_reason };
  }
}

```

> Discriminated union forces every caller to handle all four outcomes. The compiler catches missed branches.

## Looks-right vs actually-wrong

| Looks right | Actually wrong |
|---|---|
| If response has no tool_use blocks, the agent is done. | A response can have both text and tool_use blocks. stop_reason is the authoritative field. If stop_reason='tool_use', continue the loop even if text is present. |
| Treat max_tokens as a failure and abort. | max_tokens is a normal partial-result signal. Save what was returned, then either raise max_tokens or chunk the input. Aborting loses the partial work. |
| Stop the agent preemptively when token usage approaches the cap. | Preemptive stops lose real work and behave unpredictably. Let the agent finish its turn and read stop_reason; Claude already manages this gracefully. |
| Streaming responses don't have a stop_reason, so handle them differently. | Streaming responses do carry stop_reason; it arrives in the final message_delta event of the SSE stream. The shape is {type: "message_delta", delta: {stop_reason: "end_turn"}}. Code that only inspects content deltas and ignores the closing message_delta will miss the termination signal entirely. |
| When stop_reason: "refusal" arrives, log it and abort the loop. | refusal is a separate value that lands when Claude declines for safety reasons; it is not interchangeable with end_turn. Aborting silently hides a policy event from your audit log. Treat it as its own branch: surface it to the user, capture it in telemetry, and never retry blindly with the same prompt; you'll burn budget on guaranteed refusals. |

## Comparison

| stop_reason | Meaning | Next action | Production risk |
| --- | --- | --- | --- |
| end_turn | Agent finished naturally | Exit loop, present text | Low, success path |
| tool_use | Agent wants to call a tool | Execute, append result, continue | High if loop checks text instead |
| max_tokens | Output token budget hit | Return partial, plan retry/chunk | Medium, surfaces undersized max_tokens |
| stop_sequence | Custom stop string matched | Exit, inspect for intent | Low, only if you set it intentionally |
| refusal | Safety policy declined the response | Surface to user; do not retry blindly | Medium, signals prompt or policy issue |
| pause_turn | Long-running tool (e.g. server tools) paused mid-turn | Resume by re-sending the message list | Low, only with extended/server tools |

## Decision tree

1. **Are you building an agentic loop with tools?**
   - **Yes:** Branch on stop_reason: tool_use → execute + continue; end_turn → exit. Never check content shape.
   - **No:** Single-turn call. Read stop_reason for diagnostics, but it's not control-flow critical.

2. **Could max_tokens fire on long inputs?**
   - **Yes:** Plan partial-result handling: save state, chunk input, or raise the limit on retry.
   - **No:** Focus on end_turn / tool_use only. max_tokens is a fallback.

3. **Are you setting a custom stop_sequence parameter?**
   - **Yes:** Add an explicit branch for stop_sequence. Treat as intentional termination, not error.
   - **No:** You'll only see end_turn / tool_use / max_tokens in production.

4. **Are you streaming with stream: true or the SDK's stream helper?**
   - **Yes:** Read stop_reason from the final message_delta event, not from content_block_delta events. Buffer the full stream and inspect final_message.stop_reason.
   - **No:** Read response.stop_reason directly off the message object.

5. **Are you in a regulated domain (medical, legal, finance) where refusals matter?**
   - **Yes:** Add an explicit refusal branch. Log to your audit pipeline; never retry the same prompt automatically; surface a sanitized message to the end user.
   - **No:** Refusals are rare in benign domains. Catch them in the default branch with a generic message.

## Exam-pattern questions

### Q1. Your loop checks response.text.includes('done') to decide termination. What can go wrong?

Claude may say "I'm done now" while emitting a tool_use block in the same response. The text is preamble; the tool_use is the real next step. Branch on stop_reason, not text.

### Q2. A response has both a text block and a tool_use block. Which should you handle first?

Branch on stop_reason. If it's tool_use, execute the tool call regardless of text presence. If it's end_turn, the text is the final response. Block-level inspection is unreliable; the field is authoritative.

### Q3. You set max_iterations = 5 to prevent infinite loops. The agent fails on legitimate 7-iteration tasks. What's the real fix?

Find out why the loop is unbounded: a missing tool_result append, ambiguous tool descriptions, or two tools that look interchangeable. Caps mask bugs; stop_reason is the primary signal. Keep the cap as a safety buffer, not the primary control, and raise it accordingly.

### Q4. An agent loop hits stop_reason: "max_tokens". Your code throws an error and exits. What should production code do?

Treat max_tokens as a partial-result signal, not a failure. Save what was generated, then either raise max_tokens for a retry or chunk the input. The agent did real work; don't discard it.

### Q5. You set a custom stop_sequences parameter. The agent stops mid-task. Why?

Your stop sequence matched an unintended substring. Inspect response.stop_sequence to see which one triggered. Either tighten the sequence (more specific) or remove it. Custom stop sequences are a sharp tool.

### Q6. Your TypeScript handler has if/else branches for end_turn, tool_use, and max_tokens. What's missing?

A stop_sequence branch for completeness, and a default branch for unknown values (defensive). Use a discriminated union type so the compiler enforces exhaustive checks.

### Q7. An agent calls the same tool 5 times in a row with identical input. Why?

You forgot to append the tool_result block to the message list, so Claude doesn't know the tool ran. Without the result, the model re-requests indefinitely until max_iterations saves you.

### Q8. Why is stop_reason better than counting tool_use blocks for loop control?

Block counting requires you to enumerate response.content and check types. stop_reason is a single authoritative field set during generation, designed for control flow. It's faster, cleaner, and immune to multi-block responses.

## FAQ

### Q1. Can stop_reason change between streaming chunks and the final response?

No. stop_reason is set once at generation end and arrives in the closing message_delta event. Intermediate content_block_delta events have no stop_reason field. If you read it before the stream closes, you'll get null; that's the bug.
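A sketch of that scan over already-parsed stream events (the event objects here are mocked; with the official Python SDK's stream helper you can instead buffer the stream and read the final message's stop_reason):

```python
def final_stop_reason(events):
    """Scan a message stream; stop_reason arrives only in the closing
    message_delta event, never in content_block_delta events."""
    stop_reason = None
    for event in events:
        if event.type == "message_delta":
            stop_reason = event.delta.stop_reason
    return stop_reason  # None if the stream was cut before message_delta
```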

### Q2. How do stop_reason and stop_sequence (the parameter) relate?

stop_sequences is a request parameter (an array of strings). stop_reason is a response field. When one of your sequences matches, stop_reason = "stop_sequence" and response.stop_sequence holds the matched string. Without setting the parameter, you'll never see this value.

### Q3. If stop_reason: "max_tokens" arrives mid-tool_use, is the tool call still valid?

It's valid only if the tool_use block is complete (closing } of the input JSON). Truncated tool calls have malformed input; your harness must validate before executing. A partially-streamed tool_use should be discarded; raise max_tokens and re-send.
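A sketch of that validation step (the required-keys map is hypothetical; in practice you would derive it from your own tool schemas):

```python
def tool_call_is_complete(block, required_keys):
    """Reject tool_use blocks whose input is missing required keys,
    e.g. because generation stopped at max_tokens mid-call."""
    if getattr(block, "type", None) != "tool_use":
        return False
    if not isinstance(block.input, dict):
        return False
    return all(k in block.input for k in required_keys.get(block.name, ()))
```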

### Q4. What's the cost difference between end_turn and max_tokens exits?

Same input cost; same output cost up to the limit. max_tokens doesn't add a penalty; you pay for what was generated. The hidden cost is the retry: if you re-run with a higher limit, you pay for the prefix again unless you cache it via cache_control.

### Q5. Should I set max_tokens as high as possible to avoid max_tokens stop_reason?

No. max_tokens reserves capacity but doesn't charge for unused tokens; however, higher values can increase latency. Set it to a realistic ceiling for the task (e.g. 2048 for chat, 8192 for extraction), not the absolute max.

### Q6. How do I detect that a loop is stuck because I forgot to append tool_result?

Symptom: stop_reason: "tool_use" repeats with the same tool_use.id across turns. The model is waiting for that specific result. Log the id per iteration; if the same id appears twice, your harness skipped the append. Without tool_result, Claude has no way to advance.
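The id-per-iteration check can be sketched as a small helper (a hedged sketch of the logging described above; it assumes you record the tool_use ids seen in each turn):

```python
def repeated_tool_use_id(turns):
    """Return the first tool_use id seen in more than one turn, else None.
    A repeat is the telltale sign that a tool_result append was skipped."""
    seen = set()
    for turn_ids in turns:
        for tool_use_id in turn_ids:
            if tool_use_id in seen:
                return tool_use_id
            seen.add(tool_use_id)
    return None
```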

### Q7. Does stop_reason exist on the Batch API responses?

Yes, identically. Each batched message has a stop_reason in its result object. Batch jobs that include tool-using messages are unusual (you can't continue a loop async), but the field is still populated for diagnostics.

### Q8. What happens to stop_reason when extended thinking is enabled?

The same set of values applies. Extended thinking emits thinking blocks before the response, but stop_reason reflects the final generation state. Thinking blocks don't change termination semantics; they're additional content, not a separate signal.

### Q9. Can a single response have stop_reason: "end_turn" AND a tool_use block?

No, never. If a tool_use block is present, stop_reason is always tool_use. The mutual exclusion is structural: Claude commits to one or the other per turn. If you see end_turn with a tool block in your code, it's a parsing bug; you're inspecting the wrong response object.

### Q10. How do I test stop_reason branches in unit tests without burning API budget?

Mock the SDK response shape. The Message object is plain JSON: {stop_reason, content, usage, ...}. Construct fixtures for each value (end_turn, tool_use, max_tokens, stop_sequence, refusal) and feed them through your handler. Never use real API calls in tests for control-flow logic; the test isn't validating Claude's behavior, it's validating your branching.
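A sketch of such fixtures (plain objects mimicking the Message shape; the field names follow the response fields used throughout this page):

```python
from types import SimpleNamespace

def make_response(stop_reason, content=(), stop_sequence=None):
    """Fixture factory: a plain object shaped like the SDK's Message."""
    return SimpleNamespace(stop_reason=stop_reason, content=list(content),
                           stop_sequence=stop_sequence)

FIXTURES = {
    "end_turn": make_response("end_turn",
        [SimpleNamespace(type="text", text="done")]),
    "tool_use": make_response("tool_use",
        [SimpleNamespace(type="tool_use", id="toolu_01",
                         name="get_balance", input={})]),
    "max_tokens": make_response("max_tokens",
        [SimpleNamespace(type="text", text="partial")]),
    "stop_sequence": make_response("stop_sequence", stop_sequence="END"),
    "refusal": make_response("refusal"),
}

# Feed each fixture through your handler, e.g.:
#   assert handle_response(FIXTURES["max_tokens"])["status"] == "partial"
```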

---

**Source:** https://claudearchitectcertification.com/concepts/stop-reason
**Vault sources:** ACP-T03 §4.1; GAI-K04 §1
**Last reviewed:** 2026-05-04

**Evidence tiers** — 🟢 official Anthropic doc / API contract · 🟡 partial doc / inferred · 🟠 community-derived · 🔴 disputed.
