# Session State & Persistence

> Session state is the running message list (and any external persistence) that gives an agentic loop continuity. The exam tests progressive summarization tradeoffs (token savings vs. precision loss) and when checkpointing or external store reads are required. Full content lands in SCRUM-21 follow-up.

**Domain:** D1 · Agentic Architectures (27% of CCA-F exam)
**Canonical:** https://claudearchitectcertification.com/concepts/session-state
**Last reviewed:** 2026-05-04

## Quick stats

- **Persistence layers:** 3
- **Exam domain:** D1
- **Trap pattern:** summary loss
- **Coverage tier:** B
- **Linked scenarios:** 2

## What it is

A session is the working context window in which an agentic task runs. It holds the message list, accumulated file reads, tool calls, and responses. The mental model: a session is a mutable, bounded space (typically 100K to 200K tokens) where the model builds and refines its understanding. Unlike chat conversations, which survive indefinitely, a session is scoped to a single task.

Session state management is the architectural constraint that makes or breaks long-running agentic systems. The model reads the entire session history on every turn, so every tool call, every file read, every intermediate result consumes tokens. As the session grows, the model eventually hits stop_reason: "max_tokens" mid-task. The fix is progressive state management: track what to keep (case facts), what to compress (verbose reasoning), and what to page (file contents).

Three layers govern state lifecycle in production. In-session: as work progresses, intermediate outputs can be summarized; critical facts (customer ID, refund amount) stay in a case-facts block. Inter-session: when a task ends, context can persist via the project Instructions panel or working files. Subagent handoff: a subagent runs its own session, then returns only a structured summary, not its full reasoning.

The deepest anti-pattern is progressive summarization without immutable facts. Repeatedly summarizing erases specific transactional details. A refund of $247.83 becomes "around $250". A customer ID becomes "the account mentioned earlier". The fix is structural: extract immutable case facts once at the start, preserve them verbatim through every summarization pass. Summaries compress reasoning, never facts.
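The split between compressible reasoning and immutable facts can be sketched as a compaction pass. This is illustrative only: `compress_history` is a hypothetical helper, and the truncation stand-in would be a model-generated summary in practice.

```python
def compress_history(messages: list[dict], case_facts: dict) -> list[dict]:
    """Summarize older turns but carry immutable facts forward verbatim.

    `messages` is a plain list of {"role", "content"} dicts. The summarizer
    here is a naive truncation stand-in for a real model-generated summary.
    """
    # Facts are re-emitted verbatim on every pass -- never paraphrased.
    facts_block = "CASE FACTS (verbatim):\n" + "\n".join(
        f"- {k}: {v!r}" for k, v in case_facts.items()
    )
    # Stand-in summary: keep only the first 200 chars of each older turn.
    summary = " | ".join(str(m["content"])[:200] for m in messages[:-2])
    return [
        {"role": "user", "content": facts_block},
        {"role": "user", "content": f"Summary of earlier turns: {summary}"},
        *messages[-2:],  # the most recent exchange stays intact
    ]
```

No matter how many compaction passes run, $247.83 stays $247.83, because the facts block is rebuilt from the original dict, not from a prior summary.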

## How it works

Session state begins with initialization. In Claude Code, a session starts when you create a task; the harness reads project Instructions, loads global preferences, and appends the user's task description. For subagents, initialization is different: the parent agent writes a focused prompt, and the subagent boots with a custom system prompt. Every piece of context the subagent needs must be explicitly passed in the delegating prompt, never assumed.

As the session runs, state accumulates and competes. Each tool call appends a request and a result to the message list. By turn 8, the list contains 7 turns of requests, 7 turns of responses, and all intermediate outputs. Each turn the model reads this entire history. The session's effective capacity shrinks. The model's reasoning quality also degrades, a phenomenon called lost in the middle: key facts buried in a 30-turn conversation get overlooked.
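One lost-in-the-middle mitigation is to trim verbose tool results in older turns while leaving recent turns intact. A minimal sketch, assuming a per-result character budget (`MAX_RESULT_CHARS` is an arbitrary illustrative threshold):

```python
MAX_RESULT_CHARS = 1_000  # assumed budget per older tool result

def trim_old_tool_results(messages: list[dict], keep_last: int = 2) -> list[dict]:
    """Truncate verbose content in older turns; recent turns stay intact."""
    trimmed = []
    cutoff = len(messages) - keep_last
    for i, msg in enumerate(messages):
        content = str(msg["content"])
        if i < cutoff and len(content) > MAX_RESULT_CHARS:
            # Keep a marker so the model knows material was elided.
            msg = {**msg, "content": content[:MAX_RESULT_CHARS] + " …[trimmed]"}
        trimmed.append(msg)
    return trimmed
```

Run this before each API call and the history the model re-reads stays short where it can afford to be, without touching the freshest context.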

State checkpoints occur at natural boundaries. When a research subagent finishes, it returns a structured summary (findings, sources, obstacles_encountered) instead of its 30-turn conversation. The parent appends the summary, discards the subagent's full context, and continues. For long document-processing tasks that hit max_tokens, the harness should summarize processed chunks into a checkpoint file, reset the session, reload the checkpoint, and continue.

Persistence across sessions happens via project Instructions and working files. If a subagent discovers a workaround (an API needs a special flag), it reports this in obstacles_encountered; the parent decides whether to add it to project Instructions for future tasks. Running notes, glossaries, and context that should survive across tasks live in the project folder, not in the conversation. This is how context carries across sessions without re-explaining the same things each time.

## Where you'll see it in production

### Customer support state through 12-turn refund flow

Turn 1 calls get_customer, turn 2 calls lookup_order, turn 3 calls get_refund_policy. By turn 8 the context is half full. Critical facts (customer_id, order_id, refund_amount) are scattered across turns. The fix is to pin a case-facts block at the context start. Every turn, the model reads this block first; the facts never degrade.

### Multi-agent research with structured handoffs

A coordinator delegates research_market_size, analyze_competitors, and assess_regulatory_risk to three subagents. Each returns a summary with a structured sources array. The coordinator can compare sources and resolve contradictions. Without structured provenance, the coordinator either picks a majority vote (discarding outliers) or escalates without information.
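The handoff shape can be sketched as a typed report plus a merge step. The field names match the ones the document uses (findings, sources, obstacles_encountered); `merge_reports` is a hypothetical coordinator-side helper.

```python
from typing import TypedDict

class SubagentReport(TypedDict):
    task: str
    findings: str
    sources: list[str]
    obstacles_encountered: list[str]

def merge_reports(reports: list[SubagentReport]) -> dict:
    """Fold subagent summaries into the parent's state, keeping provenance."""
    merged = {"findings": [], "sources": set(), "obstacles": []}
    for r in reports:
        merged["findings"].append((r["task"], r["findings"]))   # tagged by task
        merged["sources"].update(r["sources"])                  # union, for cross-checking
        merged["obstacles"].extend(r["obstacles_encountered"])  # surfaced to parent
    return merged
```

The parent appends only this merged structure to its own context; the three subagent conversations are discarded.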

### Long document processing hits max_tokens at chapter 18

A 150-page contract needs entity extraction. By turn 15 the session has read 150 pages. Turn 16 fails with stop_reason: "max_tokens". The harness writes accumulated entities to a checkpoint, resets the session with a fresh instruction "Resume entity extraction. Previously extracted: ...", and continues. Checkpointing prevents restart-from-scratch on long batch tasks.
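The checkpoint-and-resume step above can be sketched as follows. This is a minimal illustration: the checkpoint filename and prompt wording are assumptions, not a fixed harness API.

```python
import json
from pathlib import Path

CHECKPOINT = Path("entities_checkpoint.json")  # assumed checkpoint location

def save_checkpoint(entities: list[dict], last_chapter: int) -> None:
    """Persist processed results in a clean format, not the message list."""
    CHECKPOINT.write_text(json.dumps(
        {"entities": entities, "last_chapter": last_chapter}))

def resume_prompt() -> str:
    """Build the fresh-session instruction from the checkpoint, if any."""
    if not CHECKPOINT.exists():
        return "Begin entity extraction at chapter 1."
    state = json.loads(CHECKPOINT.read_text())
    return (
        f"Resume entity extraction at chapter {state['last_chapter'] + 1}. "
        f"Previously extracted: {json.dumps(state['entities'])}"
    )
```

After the reset, the new session carries only the extracted entities and the resume point, not 17 turns of tool_use and tool_result.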

### Cowork projects carry context across tasks

Day 1: synthesize three vendor proposals into an evaluation matrix. Day 2: update the Q2 headcount plan. Tasks 1 and 2 are separate sessions. But the project's Instructions say the vendor matrix lives in ./proposals/matrix.xlsx. When the analyst says "compare new budget against approved vendor costs," the model reads the Instructions and knows where to look. State persists in the Instructions and folder files, not in the conversation.

## Code examples

### Pinning case facts through a long agentic session

**Python:**

```python
from anthropic import Anthropic
import json

client = Anthropic()

def run_support_task(customer_msg: str, case_facts: dict):
    """Pin immutable facts in system prompt, never summarize them."""
    system_prompt = f"""You are a customer support agent.

CASE FACTS (immutable, never update):
{json.dumps(case_facts, indent=2)}

Always refer to these facts. Do not summarize them. If they change,
explicitly request an update. Do not infer changes from conversation."""

    messages = [{"role": "user", "content": customer_msg}]

    for turn in range(10):
        resp = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=2048,
            system=system_prompt,
            messages=messages,
        )
        if resp.stop_reason == "end_turn":
            return {"status": "ok", "turns": turn + 1}

        messages.append({"role": "assistant", "content": resp.content})
        # ... handle tool_use, append tool_result ...

    return {"status": "max_iterations"}

# Initialize with immutable facts
facts = {
    "customer_id": "cust_12345",
    "order_id": "ord_98765",
    "refund_amount_requested": 247.83,
    "policy_max_refund": 500.00,
}
run_support_task("Refund order #98765", facts)
```

> Case facts pinned in system context. Every turn the model reads them first. Summarization cannot erase them.

**TypeScript:**

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

interface CaseFacts {
  customer_id: string;
  order_id: string;
  refund_amount_requested: number;
  policy_max_refund: number;
}

async function runSupportTask(msg: string, facts: CaseFacts) {
  const systemPrompt = `You are a customer support agent.

CASE FACTS (immutable, never update):
${JSON.stringify(facts, null, 2)}

Always refer to these facts. Do not summarize them.`;

  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: msg },
  ];

  for (let turn = 0; turn < 10; turn++) {
    const resp = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 2048,
      system: systemPrompt,
      messages,
    });
    if (resp.stop_reason === "end_turn") {
      return { status: "ok", turns: turn + 1 };
    }
    messages.push({ role: "assistant", content: resp.content });
    // ... handle tool_use, append tool_result ...
  }

  return { status: "max_iterations" };
}
```

> Same pattern in TypeScript. System prompt embeds facts at the top of every API call.

## Looks-right vs actually-wrong

| Looks right | Actually wrong |
|---|---|
| Summarize the conversation every 5 turns to save tokens. | Summarization erases specific facts (amounts, IDs, dates). Immutable case facts must be extracted once and pinned, never summarized. Summarize the reasoning, not the facts. |
| Use a larger model or increase max_tokens when a session runs out of context. | Larger models have larger windows but the problem recurs at higher scale. The fix is active state management: checkpoint, trim, page. Token limits enforce discipline; raising them hides the root cause. |
| Store subagent context in the parent's message list so the parent can see the reasoning. | Subagents return summaries, not full context. The parent's context should stay clean. If you need reasoning, ask the subagent to include obstacles_encountered in its output format. |
| When a task completes, save the entire conversation as a backup. | The conversation is a tool-littered record. Save structured results (extracted entities, decisions, recommendations) in clean format. The conversation has no lasting value; the decisions do. |
| Pass full session history to a subagent so it has all context. | Full history bloats subagent context and dilutes focus. Pass a focused prompt with extracted facts and the specific subtask. Subagents work better with less, not more. |

## Comparison

| Pattern | Session Duration | Primary Risk | Fix |
| --- | --- | --- | --- |
| Pinned case-facts block | Any length | Facts eroded by summarization | Extract once, pin in system context |
| Subagent summary | Multi-agent | Parent rediscovers same problems | Structured obstacles_encountered field |
| Checkpoint and resume | Long batch tasks | max_tokens mid-task | Save state, reset session, reload |
| Lost-in-the-middle mitigation | 10+ turns | Key facts buried | Anchor at top, trim verbose outputs |
| Project Instructions persistence | Multi-task | Shorthand fails across tasks | Persist in Instructions and folder files |
| Structured escalation block | Any length | Human spends time decoding context | Pre-structured handoff with all facts |

## Decision tree

1. **Will the task run more than 8 turns?**
   - **Yes:** Pin a case-facts block in system context. Plan for checkpointing.
   - **No:** In-session state is sufficient. Facts stay in messages naturally.

2. **Is this single-agent or multi-agent?**
   - **Multi-agent:** Design subagent output formats with obstacles_encountered. Parent stays clean via summaries.
   - **Single-agent:** Focus on trimming verbose outputs and anchoring critical facts.

3. **Does the work span multiple tasks (Task 1 ends, Task 2 begins)?**
   - **Yes:** Use Cowork project structure. Persist via Instructions (one-time) and folder files (evolving).
   - **No:** Single task; case facts and checkpoints suffice.

4. **Is there an irreversible action (payment, deletion)?**
   - **Yes:** Mandatory structured escalation block before action. Extract customer_id, amounts, reason, partial_status.
   - **No:** Escalation is optional but use the block pattern anyway.

5. **Is the session at risk of hitting max_tokens?**
   - **Yes:** Checkpoint NOW: save processed state, summarize reasoning, reset. Do not wait for max_tokens.
   - **No:** Continue, but monitor: if approaching turn 12 on a long task, checkpoint preemptively.

## Exam-pattern questions

### Q1. Your support agent forgets the customer ID by turn 30. What's the architectural fix?

Pin a CASE_FACTS block in the system prompt with customer_id, order_id, refund_amount. Re-read every turn. The most-tested distractor is "increase max_tokens"; the right answer is structural state management.

### Q2. A subagent returns wrong findings; coordinator rediscovers the same problems on the next subagent spawn. Why?

Coordinator didn't capture obstacles_encountered from the first subagent. Add a structured field so the coordinator stores and acts on the discovery, then includes the resolution in the next spawn's task string.

### Q3. Long-document extraction dies at chapter 18 every run. What's happening?

Message list contains 17 turns of tool_use + tool_result; chapter 18 hits stop_reason: "max_tokens". Fix: checkpoint state, reset the session, reload the checkpoint, continue. Don't reach for a bigger model; a larger window only defers the failure.

### Q4. Which is wrong: summarize verbose reasoning, or summarize critical facts?

Summarizing facts is wrong. Pin them in a CASE_FACTS block. Reasoning chains are summarizable; transactional values (amounts, IDs) are not.

### Q5. Subagent spawns receive the parent's full conversation history. Why is this an anti-pattern?

Bloats the subagent context and dilutes focus. Pass a focused prompt with extracted facts and the specific subtask. Subagents work better with less, not more.

### Q6. Cross-task context (vendor matrix path) lives in: project Instructions or session messages?

Project Instructions. Sessions are task-scoped (discarded at end). Project Instructions persist across tasks; folder files persist evolving state.

### Q7. When you hit max_tokens, does increasing the model's window solve the problem?

Temporarily, yes. Architecturally, no. Larger windows defer the problem; the fix is active state management (case-facts, checkpointing, summarization) regardless of window size.

### Q8. An escalation hands off the entire conversation transcript to a human. What's the better pattern?

Structured escalation block: customer_id, order_id, amount, reason, partial_status, recommended_action. Compact (200-500 chars). Human triages in 10 seconds vs 5 minutes for transcript.
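As a sketch, the block can be assembled from the pinned case facts plus the handoff fields named above. `build_escalation` is a hypothetical helper, not a fixed API.

```python
def build_escalation(case_facts: dict, reason: str, partial_status: str,
                     recommended_action: str) -> str:
    """Compact, pre-structured handoff a human can triage at a glance."""
    lines = [f"{k}: {v}" for k, v in case_facts.items()]  # verbatim facts first
    lines += [
        f"reason: {reason}",
        f"partial_status: {partial_status}",
        f"recommended_action: {recommended_action}",
    ]
    return "\n".join(lines)
```

Because the facts come straight from the pinned case-facts dict, the amounts and IDs in the escalation are exact, and the whole block stays well under the 500-character target.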

## FAQ

### Q1. What's the difference between summarization and pinning case facts?

Summarization compresses entire context for token savings. Case facts extract specific values (customer_id, refund_amount) in original precision. Both can coexist: summarize reasoning, pin the facts.

### Q2. When should subagent context be discarded?

Always. Subagents return summaries, not full conversation. The parent reads the summary, discards the subagent session, and continues. This keeps parent context clean.

### Q3. Is max_tokens a session-state problem or model selection?

Session-state. Raising max_tokens hides the issue. The real problem is active state management: checkpoint early, trim outputs, page data. A well-managed session stays in budget.

### Q4. If a subagent discovers a workaround, where does it go?

In the subagent's output format under obstacles_encountered. The parent reads it and decides whether to add it to project Instructions for future tasks.

### Q5. How does Cowork's Instructions panel differ from system prompt?

System prompt is per-message. Instructions panel is per-project (persists across tasks). Use Instructions for one-time setup; system prompt for task-specific behavior.

### Q6. What happens to context when a Cowork task completes?

The conversation is discarded. Files saved to the project folder survive. Use folder persistence for evolving state; the conversation is ephemeral.

### Q7. Is progressive summarization always bad?

No. Summarizing verbose reasoning chains is correct. Summarizing facts is wrong. Extract facts first, then summarize everything else.

### Q8. When should I send an escalation block instead of full transcript?

Always send the structured block. The full transcript is supplementary if the human needs it. The block is the primary handoff: customer_id, amount, reason, recommendation.

### Q9. How does checkpointing differ from saving the message list?

Saving the message list saves every request and tool output. Checkpointing extracts processed results into a clean format, resets the session, and continues. Checkpointing is smaller and prevents context bloat.

### Q10. What is lost-in-the-middle and how do I prevent it?

Key facts buried in the middle of a 30-turn history get overlooked. Prevent it by anchoring critical facts at the top of context (case-facts block), using section headers, and trimming verbose tool results.

---

**Source:** https://claudearchitectcertification.com/concepts/session-state
**Vault sources:** ACP-T03 §5 progressive summarization; GAI-K04 §8 multi-agent memory; ASC-A01 Course 3
**Last reviewed:** 2026-05-04

**Evidence tiers** — 🟢 official Anthropic doc / API contract · 🟡 partial doc / inferred · 🟠 community-derived · 🔴 disputed.
