# Checkpoints & Session Management

> Checkpoints save and restore conversation state. Vault coverage is thin; full authoring requires Phase 6 research.

**Domain:** D3 · Agent Operations (20% of CCA-F exam)
**Canonical:** https://claudearchitectcertification.com/concepts/checkpoints
**Last reviewed:** 2026-05-04

## Quick stats

- **Coverage tier:** C
- **Exam domain:** D3
- **Status:** stub
- **Vault depth:** thin
- **Action:** research

## What it is

A checkpoint is a saved state snapshot of a multi-turn Claude session, designed for resuming long-running work across days or after infrastructure failures. After turn 12 of a 30-turn project, save state (history, file context, progress) to disk. Tomorrow, load that checkpoint, continue from turn 13, agent picks up with full context. Checkpoints are state persistence, not summarization.

The critical distinction: checkpoints preserve the full conversation thread, not a summary. Different from progressive summarization, which compresses history. Summarization loses transactional details (order IDs, amounts, deadlines); checkpoints keep them verbatim in the original messages. When you resume, Claude sees the full prior conversation. Preserves precision for financial, legal, medical use cases.

Checkpoints are scoped to single-session resumption, not multi-session fusion. Session A crashes at turn 12, load checkpoint, resume in same session at turn 13. You do not fork session A into session B. Linear resumption, not branching (that would be fork_session, a different pattern).

The use-case cluster is long-running work: multi-day research, 20+ file refactor, hour-spanning data pipelines. Triggers: daily boundaries, infrastructure limits, failure recovery. After 8 hours of work, save a checkpoint. If Claude times out at 2 AM, load yesterday's checkpoint. Checkpoints are the human-friendly recovery mechanism for work too large for one continuous session.

## How it works

Checkpointing has three stages. Capture: at a milestone, export conversation history (messages, tool results, file state) to JSON. Not a summary, the exact message sequence. Store: in version control, cloud storage, or local disk. Restore: next session, load the file, inject messages back into Claude's context, continue. Claude resumes with full memory of prior turns.

The message list in a checkpoint is immutable. When you restore, you inject all prior messages exactly as they were. No rewriting, no compression, no summarization. This is why precision is preserved: a customer ID from turn 8 is still verbatim in turn 13. Contrast with summarization, where turn 8-10 collapse to two sentences.

Checkpoints face one challenge: context window limits. A 30-turn project generates a large message list; by turn 25, prior messages may get lost in the middle (lost-in-the-middle). Mitigation: include a persistent case-facts block at the start of restored context. The block does not expire even as the conversation grows.

Restoration is not automatic. You explicitly load the checkpoint and confirm before continuing. Prevents accidental context corruption (merging unrelated sessions). Flow: end session → save checkpoint → next day → load → confirm → continue. The explicit gate is where you spot issues before proceeding.

## Where you'll see it in production

### Multi-day medical records extraction

1000 clinical notes, 12 hours/day. Save checkpoint after batch 333; next morning, load and continue from batch 334 with same schema and learned patterns. Without checkpointing, day 2 would re-extract or start fresh.

### Cross-service API refactor over 3 days

Day 1 = dependency mapping, day 2 = schema refactoring, day 3 = migration tests. Each day saves a checkpoint. Day 2 finds an issue, revert to day 1's checkpoint and adjust. Without checkpoints, lose day 1's analysis.

### Failure recovery in overnight batch

Pipeline processes 10M records via Batch API. Cluster restarts at 7.2M. Checkpoint saved every 1M records: load 7M, resume at 7M+1. Without checkpoints, restarting wastes compute on 7M records.

### Long-form content with iterative refinement

50-page guide. Day 1: chapters 1-10 + checkpoint. Day 2: chapters 11-20 with full prior context. Day 3: chapters 21-50 with all consistency. Without checkpoints, day 2 lacks chapter 1-10 details.

## Code examples

### Save and restore a session checkpoint

**TypeScript:**

```typescript
import Anthropic from "@anthropic-ai/sdk";
import fs from "fs";

const client = new Anthropic();

interface Checkpoint {
  messages: Anthropic.MessageParam[];
  metadata: { savedAt: string; phase: string; objective: string };
}

function saveCheckpoint(messages: Anthropic.MessageParam[], phase: string, objective: string) {
  const cp: Checkpoint = {
    messages,
    metadata: { savedAt: new Date().toISOString(), phase, objective },
  };
  fs.writeFileSync("checkpoint.json", JSON.stringify(cp, null, 2));
  console.log(`Saved at phase ${phase}`);
}

function loadCheckpoint(): Checkpoint {
  const data = fs.readFileSync("checkpoint.json", "utf-8");
  return JSON.parse(data);
}

async function resume() {
  const cp = loadCheckpoint();
  const messages = cp.messages;
  console.log(`Resuming at ${cp.metadata.phase}, ${messages.length} prior messages`);

  messages.push({ role: "user", content: "Continue. What patterns emerge?" });
  const resp = await client.messages.create({
    model: "claude-opus-4-5",
    max_tokens: 4096,
    system: `You are continuing a multi-day research project. Phase: ${cp.metadata.phase}. Objective: ${cp.metadata.objective}.`,
    messages,
  });

  messages.push({ role: "assistant", content: resp.content });
  saveCheckpoint(messages, `phase ${parseInt(cp.metadata.phase) + 1}`, cp.metadata.objective);
}
```

> Save full message history (not a summary); load and continue. Precision preserved across sessions.

**CLI workflow:**

```bash
# Day 1: start a multi-day project
claude > log-day1.txt
# ... 8 hours of work ...
claude-export-session checkpoint-day1.json

# Day 2: resume
claude-import-session checkpoint-day1.json
# ... continue working ...
claude-export-session checkpoint-day2.json

# Infrastructure failure mid-day-3? Load last good checkpoint
claude-import-session checkpoint-day2.json
# Resume from exact end of day 2.

# Key benefit: full conversation history preserved.
# No summarization, no precision loss.
```

> CLI flow: explicit save/load gates between days. Full thread preserved.

## Looks-right vs actually-wrong

| Looks right | Actually wrong |
|---|---|
| Summarize the conversation at each checkpoint to save tokens. | Summarization erases precision. Checkpoints preserve the full thread; summarization is the opposite approach (and is tested as an anti-pattern). |
| Checkpoints and fork_session do the same thing. | Different. Checkpoints resume linearly (A → A+1 → A+2). fork_session branches (A → A1 AND A2). Don't confuse them. |
| If a session crashes, just start over; files are saved. | Files survive but agent context does not. A new session lacks the prior conversation: Claude repeats analysis, misses nuances, makes inconsistent decisions. Checkpoints preserve agent reasoning, not just output files. |
| Use checkpoints only for infrastructure failures. | Checkpoints are for any resumption: daily breaks, phase boundaries, human review gates. Waiting for failure is reactive; checkpointing by design is proactive. |
| Merge two checkpoints from different sessions into one. | Checkpoints are single-session snapshots. Merging is not supported; corrupts context. To combine work, use manual synthesis or explicit hand-offs (output of A as input to B). |

## Comparison

| Aspect | Checkpoints | Summarization | fork_session | Subagent |
| --- | --- | --- | --- | --- |
| Preserves full conversation? | Yes, verbatim | No (summary) | Yes per fork | No, isolated |
| Precision loss? | None | High | None | Yes |
| Resumption pattern | Linear | Lossy | Branching | Sequential |
| Token budget impact | Grows with history | Reduced (lossy) | Per branch | Current turn only |
| Use case | Multi-day work, recovery | Long convos, cost | Comparing approaches | Parallel tasks |
| Reversibility | Yes (reload prior) | No (data lost) | Yes (merge decision) | No (work discarded) |

## Decision tree

1. **Will the task span multiple days or sessions?**
   - **Yes:** Use checkpoints. Save at daily boundaries.
   - **No:** Optional; single-session work doesn't need them.

2. **Is losing fine-grained context details unacceptable?**
   - **Yes:** Checkpoints. Never summarize. Preserve verbatim messages.
   - **No:** Summarization acceptable if token budget tight.

3. **Need to compare 2+ approaches from same starting point?**
   - **Yes:** Use fork_session, not checkpoints. Checkpoints are linear.
   - **No:** Checkpoints for linear multi-day work.

4. **Resilient to infrastructure failures?**
   - **Yes:** Recommended.
   - **No:** Checkpoints save state so failures don't lose progress.

5. **Have explicit approval gates or human-in-the-loop decisions?**
   - **Yes:** Save before and after each decision point.
   - **No:** Optional for continuous work.

## Exam-pattern questions

### Q1. Session crashes at turn 12. You restart fresh. What did you lose?

All agent context: prior reasoning, accumulated facts, learned patterns. Files survive but agent context does not. Checkpoints preserve the full message thread, so resuming is precision-perfect.

### Q2. Long task: should you summarize at each checkpoint to save tokens?

No. Summarization erases precision. Checkpoints preserve the full message list (verbatim). The exam tests this distinction repeatedly: summarization is the opposite of checkpointing.

### Q3. Checkpoints and fork_session: same pattern?

Different. Checkpoints resume linearly (A → A+1 → A+2). fork_session branches from a common baseline (A → A1 AND A2). Linear resumption vs branching exploration.

### Q4. Two checkpoints from different sessions, can you merge them?

No. Checkpoints are single-session snapshots. Merging is unsupported and corrupts context. To combine work, use manual synthesis or explicit hand-offs (output of A as input to B).

### Q5. When should you save a checkpoint?

At meaningful milestones: daily boundaries, after a major phase completes, before a risky decision, after human-in-the-loop approval. For a 30-turn project, 3-5 checkpoints (one per phase or week).

### Q6. Checkpoint loaded; can you edit it before continuing?

Technically yes, but don't. Editing corrupts message sequences and breaks Claude's understanding. If you need to correct something, load the checkpoint, acknowledge the issue in a new message, and continue. Claude adjusts.

### Q7. Restoring a checkpoint resets the conversation cost?

No. Restoring is one logical session (no reset). Tokens are charged for the restored messages as if part of one continuous session. Plan accordingly: a 30-turn checkpoint costs 30 turns to resume.

### Q8. Can checkpoints be used with the Batch API?

No. The Batch API processes asynchronously and returns results; it's not a continuous agent session. Checkpoints are for interactive multi-turn agent work that spans days or sessions.

## FAQ

### Q1. How much context is lost when resuming from a checkpoint?

Zero. Checkpoints preserve the full message list. When restored, Claude sees the complete prior conversation.

### Q2. Can I edit a checkpoint before restoring?

Technically yes, but don't. Editing corrupts message sequences. If you need to correct, load and acknowledge in a new message; Claude adjusts.

### Q3. What if the checkpoint is too large for the context window?

Save incremental checkpoints (one per day/phase). When resuming, load the most recent, not all 5 days. Sacrifice older context to fit.

### Q4. Should I version-control checkpoints?

Yes, for recovery and audit. Commit at major milestones. Lets you revert to a prior phase if needed.

### Q5. How often should I save checkpoints?

At meaningful milestones: daily, after a major phase, before risky decisions, after human approval. For a 30-turn project, ~3-5 checkpoints.

### Q6. Can I share a checkpoint with a colleague?

Yes. Export, share via git/email, they import. They resume with full context. This is how teams hand off multi-day projects.

### Q7. What if I load a checkpoint and make a conflicting request?

Claude respects restored context but may note the conflict. Clarify. Why explicit confirmation before continuing matters.

### Q8. Is a checkpoint the same as a git commit?

No. Git commits save file state; checkpoints save agent conversation state. Together, they form a full recovery point.

### Q9. Can I use checkpoints with the Batch API?

No. Batch is async, not a continuous agent session. Checkpoints are for interactive multi-turn work.

### Q10. Does restoring reset the conversation counter or cost?

No. Restoring is one logical session (no reset). Tokens are charged for restored messages as if part of one continuous session.

---

**Source:** https://claudearchitectcertification.com/concepts/checkpoints
**Vault sources:** ASC-A01 Course 3 checkpoints
**Last reviewed:** 2026-05-04

**Evidence tiers** — 🟢 official Anthropic doc / API contract · 🟡 partial doc / inferred · 🟠 community-derived · 🔴 disputed.
