TLDR
Subagents are specialized, isolated agents spawned by a coordinator to handle domain-specific tasks without polluting the main conversation. They do NOT inherit memory; the coordinator passes context explicitly in the task prompt. Hub-and-spoke orchestration prevents context creep from parallel work.
What it is
A subagent is a specialized assistant the coordinator spawns to handle a focused task in isolation. It receives an explicit task string, runs in its own fresh context window, and returns only a structured summary. The intermediate work (file reads, searches, tool calls) is discarded. From the coordinator's perspective the subagent is a black box: task in, summary out.
The mental model is hub-and-spoke. The coordinator (hub) owns task decomposition and result aggregation. Subagents (spokes) are stateless executors that never inherit the coordinator's history, never share memory between invocations, and never communicate directly with each other. Every fact a subagent needs must be embedded in its task string. This isolation is what lets subagents run in parallel without the context explosion that would kill a single mega-loop.
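A concrete illustration of the self-containment rule, as a sketch (the repo, file paths, and ticket ID are hypothetical):

```python
# Under-specified: leans on facts only the coordinator knows.
bad_task = "Review the files we discussed for the auth bug."

# Self-contained: every fact the subagent needs is in the string.
good_task = """Review src/auth/session.py and src/auth/oauth.py in the
billing-api repo for the session-fixation bug from ticket AUTH-142.
Return JSON: {"verdict": "...", "critical_issues": [...]}"""
```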
Subagents are scoped by three mechanisms: system prompt (the role description), allowed-tools (the SDK-enforced tool list), and output format (the shape of the summary). A code-review subagent gets [Read, Grep, Bash] but not Edit. A retriever gets [Read, Glob, WebSearch] but no destructive capability. The contract is tool-based, not language-based: the SDK enforces the allowed-tools list, you don't have to hope the subagent respects a prompt suggestion.
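In the Claude Code SDK, this contract typically lives in a subagent definition file under .claude/agents/. A sketch, with hypothetical prompt wording; the tools line is the SDK-enforced allow list, and Edit is deliberately absent:

```
---
name: code-reviewer
description: Reviews diffs for security, performance, and maintainability.
tools: Read, Grep, Bash
---
You are a code reviewer. Examine only the files named in the task.
Return a JSON report: {"verdict": "...", "critical_issues": [...]}.
```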
Production subagent failures cluster around three gaps: missing task context (the coordinator assumes the subagent knows facts it doesn't), vague output formats (the subagent doesn't know when to stop), and overscoped tools (Edit access on a reviewer that should only read). Each is a coordination bug, not a model limitation. The exam tests whether you recognize the right delegation signals: verbose work, parallel tasks, and read-only analysis are the canonical triggers for subagents over inline loops.
How it works
The coordinator invokes a subagent by passing three inputs: name (the subagent's identifier), task (a self-contained text description), and optional context metadata. The SDK spawns a fresh agent context, loads the subagent's system prompt and tool definitions from configuration, and starts an independent agentic loop inside the subagent. The subagent reads the task, plans, executes tools, and loops on stop_reason exactly as a standalone loop would.
The message list inside the subagent is isolated from the coordinator. Every file read, every tool call, every intermediate result stays in that nested context window. This is the efficiency win: if a code-review subagent reads 20 files to find one bug, the coordinator pays no token cost for those 20 reads. Only the final structured report ("vulnerability in auth.ts line 47, severity high") returns.
When the subagent reaches stop_reason: "end_turn", it emits a final message. The SDK extracts the text and returns it to the coordinator as a tool_result block. The subagent's entire history is then discarded; the agent is stateless by design. If the coordinator needs more work later, it spawns a new invocation with a fresh context. There is no resumption, no second turn, no continuation.
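A minimal sketch of that lifecycle against the raw Messages API, with the subagent's internal tool loop elided; the run_subagent helper and SUBAGENT_SYSTEM registry are assumptions, not SDK names:

```python
from anthropic import Anthropic

client = Anthropic()

# Hypothetical registry: one system prompt per subagent role.
SUBAGENT_SYSTEM = {
    "code-reviewer": "You are a code reviewer. Return only a JSON report.",
}

def run_subagent(name: str, task: str) -> str:
    """Fresh context in, text summary out; nothing else survives."""
    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system=SUBAGENT_SYSTEM[name],
        messages=[{"role": "user", "content": task}],  # no inherited history
    )
    # On stop_reason "end_turn" only the final text is extracted; the
    # subagent's message list is discarded (stateless by design).
    return "".join(b.text for b in resp.content if b.type == "text")
```

The coordinator then wraps this string in a tool_result block before appending it to its own message list.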
Parallel execution is architecturally free. If a coordinator needs four subagents to analyze four repos, the SDK spawns all four simultaneously, awaits all four results, and aggregates them. Each subagent runs in its own context window with no contention; you pay for four separate completions, but the speedup and context cleanliness justify it for any decomposable task. The pattern collapses cleanly to one subagent when work isn't parallelizable.

Where you'll see it
Parallel multi-repo code analysis
The coordinator gets a task: "find all references to the deprecated client across our 12 repos." Instead of visiting each repo in a massive inline loop (token explosion), it spawns four subagents in parallel, each scoped to three repos. Each subagent reads files, greps imports, returns a structured report. The coordinator merges the four reports and deduplicates. Without parallelization the same task consumes roughly 5x more tokens because every read accumulates in the main context.
Code-review subagent in CI/CD
On every PR, a CI pipeline invokes a code-review subagent with the PR diff. The subagent is restricted to [Read, Grep, Bash], with no Edit. It examines the diff for security, performance, and maintainability issues and returns a JSON report. The pipeline decides whether to approve, request changes, or escalate. Without isolation the 50+ file reads pollute the main context; with the subagent, the main thread sees only the structured verdict.
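A sketch of the gating step, assuming the subagent returns JSON shaped like the report in the code example further down (the decision labels are hypothetical):

```python
import json

def gate(review: str) -> str:
    """Map the subagent's structured verdict to a CI decision."""
    report = json.loads(review)  # assumes the task demanded pure JSON
    if report.get("critical_issues"):
        return "request_changes"  # block the PR
    if report.get("verdict") != "approved":
        return "escalate"  # route to a human reviewer
    return "approve"
```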
Knowledge-base retriever subagent
A user asks a general question: "What's our refund policy for SaaS subscriptions?" A retriever subagent is spawned with [Read, Glob, WebSearch]. It searches internal docs and the web, summarizes its findings, and returns a single answer. The coordinator never sees the eight intermediate searches or the three docs that were skipped. The output is clean, directly usable, and audit-ready.
Structured extraction with a dedicated extractor
An invoice-processing pipeline receives a scanned PDF. Rather than have the main agent loop through 40 pages of OCR, a dedicated extraction subagent is spawned with a precise output format {vendor, amount, date, items[]}. It runs, returns the structured data, and exits. The main pipeline continues without ever seeing the intermediate OCR noise or abandoned extraction attempts.
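A sketch of that output contract and a validation step on the returned summary; the field names follow the shape above, and the validation itself is an assumption, not SDK behavior:

```python
import json

EXTRACTION_TASK = """Extract invoice data from the attached document.
Return ONLY this JSON, nothing else:
{"vendor": "...", "amount": 0.0, "date": "YYYY-MM-DD",
 "items": [{"description": "...", "quantity": 0, "unit_price": 0.0}]}"""

def parse_extraction(summary: str) -> dict:
    """Reject summaries that drift from the declared shape."""
    data = json.loads(summary)
    missing = {"vendor", "amount", "date", "items"} - data.keys()
    if missing:
        raise ValueError(f"extractor omitted fields: {missing}")
    return data
```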
Code examples
```python
# Coordinator delegates to subagents in parallel.
# Each subagent runs in isolation; only the summary returns.
import asyncio
import json

from anthropic import Anthropic

client = Anthropic()


async def review_repo(name: str, files: list[str]) -> dict:
    """Spawn a code-review subagent for one repo."""
    task = f"""Review the codebase in {name}.
Files to check: {", ".join(files)}
Return ONLY JSON:
{{
  "verdict": "approved" | "needs_rework" | "hold",
  "critical_issues": [{{"file": "...", "line": N, "issue": "..."}}],
  "minor_issues": [...]
}}"""
    # In the Claude Code SDK this is invoke_subagent("code-reviewer", task)
    resp = await asyncio.to_thread(
        client.messages.create,
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    )
    # Parse the summary so the coordinator can reason over it;
    # assumes the model honored the JSON-only contract in the task.
    return {"repo": name, "review": json.loads(resp.content[0].text)}


async def coordinate(repos: dict[str, list[str]]) -> dict:
    """Spawn one subagent per repo in parallel and aggregate."""
    results = await asyncio.gather(*[
        review_repo(name, files) for name, files in repos.items()
    ])
    return {
        "total": len(results),
        "reviews": results,
        "verdict": "approved" if all(
            r["review"].get("verdict") == "approved" for r in results
        ) else "needs_review",
    }


asyncio.run(coordinate({
    "auth-service": ["auth.py", "oauth.py"],
    "payment-service": ["payments.py", "webhooks.py"],
}))
```

Looks right, isn't
Each pair below matches a plausible-looking pattern with the failure it actually creates. These are the shapes exam distractors are built from.
Pass the entire coordinator conversation history to the subagent so it has full context.
Subagents do not inherit history. Passing prior messages confuses the subagent (it wasn't part of that conversation). Every fact must be embedded in the task string; make it self-contained.
Two subagents can coordinate with each other to share results.
Subagents communicate only through the coordinator. Direct subagent-to-subagent channels do not exist. If A's output feeds B's input, the coordinator must orchestrate: run A, collect its summary, pass that into B's task string.
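A sketch of that orchestration, reusing the hypothetical run_subagent helper from earlier; the subagent names are made up:

```python
def chain(run_subagent) -> str:
    # Subagent A: read-only audit, returns a text summary.
    findings = run_subagent(
        "security-auditor",
        "Audit src/auth/ for injection risks. Return a bullet list.",
    )
    # The coordinator embeds A's summary verbatim in B's task string;
    # A and B never talk to each other directly.
    return run_subagent(
        "remediation-planner",
        f"Propose fixes for these audit findings:\n{findings}",
    )
```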
Give every subagent full tool access (Edit, Bash, all of them) so it can handle any task.
Tool scope must match the role. A reviewer should not have Edit. A retriever should not have Bash. Overscoping causes accidental side effects (file edits, deployments) and obscures intent. Restrict to the minimum needed.
A vague output format is fine; the subagent will figure out when it's done.
Without a defined output format, subagents wander and run too long. Structured formats (JSON, fixed sections) create natural stopping points. A good subagent config includes an explicit output shape exactly because it doubles as a termination cue.
Subagent failure should trigger a retry with the same task.
Retries succeed only if the failure was transient. If the task is ambiguous or the tool set is wrong, retrying burns tokens without fixing the root cause. Debug the task string, the tool restrictions, and the output format first; retry second.
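A sketch of that triage order; run_subagent is the hypothetical helper from earlier, and is_transient stands in for your own transient-error check (timeouts, rate limits):

```python
def run_with_triage(run_subagent, name, task, is_transient, retries=1):
    """Retry transient failures once; surface structural ones."""
    for attempt in range(retries + 1):
        try:
            return run_subagent(name, task)
        except Exception as exc:
            if not is_transient(exc) or attempt == retries:
                # Ambiguous task or wrong tool scope: redesign instead
                # of burning tokens on blind retries.
                raise
```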
Side-by-side
| Aspect | Subagent (isolated) | Inline agentic loop | Parallel subagents | Sequential coordinator |
|---|---|---|---|---|
| Context window | Isolated; discarded after | Grows monotonically | N fresh windows | Main stays clean |
| Parallel execution | Yes, N concurrent | Single-threaded | Natural; N tasks async | Never parallel |
| Tool access | Scoped per role | Full set | Scoped per subagent | Full set |
| Output shape | Structured summary only | Full message history | N summaries merged | Single result |
| Visibility | Black box, no intermediate work shown | Every tool call visible | Summary only per subagent | Every step logged |
| Best for | Verbose or read-only tasks | 3 to 15 turn loops | 4+ parallel tasks | Fixed sequential pipelines |
Decision tree
Will the work produce verbose intermediate output (file reads, searches, false starts)? Yes: delegate to a subagent; the intermediate work stays in its context and is discarded.
Are the tasks independent and parallelizable (4 repos, each analyzed separately)? Yes: spawn one subagent per task in parallel and aggregate the summaries.
Should this agent be unable to modify files (read-only analysis)? Yes: restrict allowed-tools to [Read, Grep, Glob, WebSearch]; omit Edit, Bash, Write.
Can the task be expressed as a self-contained string without prior context? No: rewrite the task to embed every needed fact before delegating.
Does the output need a strict format (JSON, structured sections, checklist)? Yes: state the exact shape in the task; it doubles as the termination cue.
Question patterns

Your code-review subagent missed three critical issues that you can clearly see in the file. Why?
Most likely missing task context: a subagent only sees what the task string names. If the file, the diff, or the review criteria weren't embedded in the task, the subagent never looked at them.
Two subagents are returning conflicting reports about the same bug. How do you resolve it?
In the coordinator. Subagents never talk to each other, so the hub reasons over both summaries, or spawns a third subagent with both reports embedded in its task string.
A research subagent ran for 40 turns and returned a perfect summary, but the bill is huge. What's the architectural fix?
Specify a strict output schema up front; a shape like {findings: [], confidence: number} doubles as a stopping cue and caps token cost.
Your retriever subagent has `[Read, Grep, Glob, WebSearch, Bash, Edit]` and accidentally modified a config file. What was the design error?
Overscoped tools: a retriever has no business holding Bash or Edit. Restrict allowed-tools to [Read, Grep, Glob, WebSearch], the minimum needed for the role.
You spawned 4 subagents in parallel; 3 finished, the 4th hangs forever. How do you debug?
Inspect the hanging subagent's task for ambiguity or unbounded scope, then add max_iterations and a deterministic output format to bound it. Don't retry blindly.
Your coordinator passes the entire chat history to each subagent for "context." Subagents respond with confused outputs. Why?
Subagents never inherit history; a transcript they weren't part of reads as noise. Embed the needed facts in a self-contained task string instead.
A subagent returns `stop_reason: "max_tokens"` with a partial summary. Production code aborts. What should it do instead?
Handle the truncation: accept the partial result when it's usable, or spawn a fresh subagent with a refined, narrower task.
Your coordinator spawns Subagent A, waits for its result, then spawns Subagent B based on A's output. Why is this slower than ideal?
Sequencing is only justified when B genuinely needs A's output. If the two tasks are independent, spawn both concurrently (asyncio.gather, Promise.all) and let the coordinator merge both outputs.
Frequently asked
What's the difference between a subagent and an agentic loop?
An agentic loop is a while block in the coordinator that runs tool_use to tool_result cycles. A subagent is a separate execution: the coordinator spawns it, it loops internally, and returns a summary. Subagents are best for isolated tasks; loops are best for single threads with complex tool interactions.
Can a subagent call another subagent?
Not directly. Subagents communicate only through the coordinator; if nested decomposition is needed, the coordinator spawns the second subagent itself and chains the summaries.
How do I know when to use a subagent vs. staying inline?
Look for the delegation signals: verbose intermediate work, independent parallelizable tasks, and read-only analysis. A short single-threaded loop with tight tool interplay stays inline.
What happens if a subagent's task is ambiguous?
It wanders: it runs long, burns tokens, and may return the wrong thing. Fix the task string (embed every fact, state the output shape) before retrying.
How do I aggregate results from N parallel subagents?
Collect the N summaries as tool_result blocks, then append them to the coordinator's message list in a single user-role block. The coordinator can now reason over all N results together.
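A sketch of that merge; the tool_use IDs would come from the tool_use blocks that spawned each subagent and are hypothetical here:

```python
def aggregate(summaries: list[tuple[str, str]]) -> dict:
    """Fold N (tool_use_id, summary) pairs into one user-role message."""
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": uid, "content": text}
            for uid, text in summaries
        ],
    }

# The coordinator appends this single block and reasons over all N results.
merged = aggregate([("toolu_01", "repo A: approved"),
                    ("toolu_02", "repo B: hold")])
```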
Can a subagent's output feed into another subagent's task?
Yes, via the coordinator: run A, collect its summary, then embed that summary verbatim in B's task string. There is no direct channel.
What if a subagent exceeds its token budget?
It returns stop_reason: "max_tokens" with a partial summary. Handle it in the coordinator: either accept the partial result or spawn a new subagent with a refined task ("focus on files X to Y only").
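A sketch of that handling; resp is an SDK Message object, and respawn is a hypothetical callable that runs a fresh subagent with the refined task:

```python
import json

def handle_summary(resp, respawn) -> str:
    """Accept a usable partial summary or respawn with a narrower task."""
    text = "".join(b.text for b in resp.content if b.type == "text")
    if resp.stop_reason != "max_tokens":
        return text
    try:
        json.loads(text)  # partial but still parses: accept it
        return text
    except ValueError:
        return respawn("Summarize findings for files A through M only.")
```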
Are subagents cheaper than inline loops?
For verbose work, usually: the subagent's intermediate reads never enter the coordinator's context, so the main thread stays small. You still pay for N separate completions, so short simple tasks are cheaper inline.
How do I prevent a subagent from entering an infinite loop?
Set max_iterations, define a clear output format (so the subagent knows when it's done), and ensure the task is well-scoped. A wandering subagent is usually a sign of a vague task or a vague output format.
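A minimal sketch of that bound; the cap is a plain counter around the loop, and step stands in for one tool_use/tool_result cycle:

```python
MAX_ITERATIONS = 10  # hypothetical budget; tune per role

def bounded_loop(step) -> str:
    """step() runs one cycle and returns (done, text)."""
    for _ in range(MAX_ITERATIONS):
        done, text = step()
        if done:
            return text
    return "ERROR: iteration budget exhausted; refine the task or format."
```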
Can subagents use MCP servers or custom tools?
Yes. Custom tools are granted via the allowed-tools list and MCP servers via .mcp.json definitions. Tool access is scoped per subagent; the SDK enforces the restrictions.