What it is
The coordinator is the single hub that receives the user's research query and turns it into a fan-out of self-contained subagent tasks. Its first and most load-bearing job is semantic decomposition: looking at a topic like creative industries and enumerating every sub-domain that matters (visual arts, music, writing, film, performing arts), not just the first one that surfaces from a keyword scan. Lexical decomposition stops at the words you can see in the query. Semantic decomposition asks what the user actually needs covered.
Once decomposition is complete, the coordinator dispatches subtasks via a fork-then-join pattern. All N research subagents fire at once through asyncio.gather (Python) or Promise.all (TypeScript). The coordinator awaits the full set, then proceeds to verification. Latency becomes max(subagents) instead of sum(subagents). That parallelism is the architectural payoff and the reason hub-and-spoke beats a single mega-loop on any decomposable task.
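The fork-then-join shape can be sketched in a few lines. This is a minimal illustration, not the system's actual code: `run_subagent` is a hypothetical stand-in for a real subagent call, and the sleep merely simulates latency.

```python
import asyncio

# Hypothetical stand-in for one subagent invocation; the name and the
# simulated latency are assumptions for illustration only.
async def run_subagent(task: str) -> dict:
    await asyncio.sleep(0.01)  # simulates network/model latency
    return {"task": task, "finding": f"findings on {task}"}

async def fan_out(tasks: list[str]) -> list[dict]:
    # All N subagents fire at once; the coordinator awaits the full set,
    # so wall-clock latency is max(subagents), not sum(subagents).
    return await asyncio.gather(*(run_subagent(t) for t in tasks))

results = asyncio.run(fan_out(
    ["visual arts", "music", "writing", "film", "performing arts"]
))
```

Because every task is in flight before the first await completes, adding a sixth sub-domain adds no wall-clock time until the concurrency cap is hit.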
Routing is strict hub-and-spoke: subagents never call each other. If subagent B depends on a finding from subagent A, the answer is not for B to import A. The answer is A returns to the coordinator, the coordinator constructs B's task prompt with A's finding embedded, and B starts in a fresh context. Every cross-subagent edge passes through the hub. This is what keeps isolation, parallelism, and visibility intact at the same time.
How it works
Step 1 is decomposition. The coordinator reads the query, identifies the topic class, and produces an explicit list of sub-domain tasks. For impact of AI on creative industries the list must include visual arts, music, writing, film, and performing arts. The decomposition function is unit-testable and should be reviewed before any spawn happens; a coverage bug here cannot be recovered downstream by tuning subagents or upgrading models.
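A unit-testable semantic enumerator might look like the sketch below. The topic-class table and the `decompose` function are assumptions made for illustration; the point is that the full sub-domain list lives in reviewable code, not in a keyword scan.

```python
# Hypothetical topic-class table: each class maps to the FULL set of
# sub-domains that matter, enumerated up front.
SUBDOMAINS = {
    "creative industries": [
        "visual arts", "music", "writing", "film", "performing arts",
    ],
}

def decompose(query: str) -> list[str]:
    # Semantic decomposition: match the query to a topic class and return
    # the whole enumerated list, not a lexical split of the query's words.
    for topic, subdomains in SUBDOMAINS.items():
        if topic in query.lower():
            return subdomains
    return [query]  # unknown topic class: fall back to a single task

tasks = decompose("impact of AI on creative industries")
```

A coverage test for each topic class (does `decompose` return all five sub-domains?) catches the upstream bug that no amount of subagent tuning can repair.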
Step 2 is fan-out. Each task becomes a messages.create call with its own system prompt, scoped tools whitelist, and a self-contained messages body. No history is inherited. The coordinator wraps the fan-out with a Semaphore(MAX_PARALLEL=5) to bound concurrency and a per-subagent retry budget (default 2) for timeout handling. stop_reason from each subagent tells the coordinator whether the response is complete, max-tokens partial, or tool-use mid-flight.
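The bounded fan-out with a retry budget can be sketched as follows, assuming a hypothetical `call_subagent` wrapper around the real messages.create call; timeout values and envelope fields follow the text but are otherwise illustrative.

```python
import asyncio

MAX_PARALLEL = 5   # Semaphore bound from the text
RETRY_BUDGET = 2   # per-subagent retries before accepting a gap

# Hypothetical stand-in for a messages.create call with its own system
# prompt, scoped tools, and self-contained messages body.
async def call_subagent(task: str) -> dict:
    await asyncio.sleep(0.01)
    return {"status": "ok", "task": task, "stop_reason": "end_turn"}

async def run_one(task: str, sem: asyncio.Semaphore) -> dict:
    async with sem:  # at most MAX_PARALLEL subagents in flight
        for _ in range(1 + RETRY_BUDGET):
            try:
                return await asyncio.wait_for(call_subagent(task), timeout=1.0)
            except asyncio.TimeoutError:
                continue  # spend one unit of the retry budget
        # Budget exhausted: return a structured envelope, never silence.
        return {"status": "timeout", "query": task, "partial_results": []}

async def fan_out(tasks: list[str]) -> list[dict]:
    sem = asyncio.Semaphore(MAX_PARALLEL)
    return await asyncio.gather(*(run_one(t, sem) for t in tasks))

results = asyncio.run(fan_out(["visual arts", "music", "writing"]))
```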
Step 3 is join. The coordinator awaits the full gather, inspects each subagent's status_code, and decides per-result: accept, retry with a narrower query, or transparently mark a gap. Successful findings get pooled and handed to the verification subagent. Timeouts return structured error context, not silence: {status: 'timeout', query, partial_results, alternatives}. The coordinator uses that envelope to make a real decision; an empty [] masquerading as success would force it to guess.
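The per-result triage at the join might reduce to a small decision function. Field names mirror the envelopes described above; the function itself is an illustrative assumption.

```python
def join_decision(result: dict) -> str:
    # Accept a clean completion; retry narrower when a timeout still carried
    # partial results; otherwise mark the gap transparently for synthesis.
    if result.get("status") == "ok" and result.get("stop_reason") == "end_turn":
        return "accept"
    if result.get("status") == "timeout" and result.get("partial_results"):
        return "retry_narrower"
    return "mark_gap"

decisions = [
    join_decision({"status": "ok", "stop_reason": "end_turn"}),
    join_decision({"status": "timeout", "query": "music",
                   "partial_results": ["one source"], "alternatives": []}),
    join_decision({"status": "timeout", "query": "film",
                   "partial_results": [], "alternatives": []}),
]
```

The structured envelope is what makes this decision possible; an empty list masquerading as success would collapse all three branches into a blind "accept".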
The 4 decisions
Each row pairs the right answer with the most-tested distractor. The Why column explains the failure mode behind the wrong choice.
| Decision | Right answer | Wrong answer | Why |
|---|---|---|---|
| User asks about creative industries. How do you decompose? | Semantic enumeration: visual arts, music, writing, film, performing arts | Lexical split on the words creative and industries | Lexical splits miss every sub-domain that wasn't named. Semantic decomposition asks what the user actually needs covered and enumerates the full set before spawning. |
| Subagent B needs a finding from Subagent A. How does B get it? | A returns to coordinator. Coordinator embeds A's finding in B's task prompt | A calls B directly with the finding | Direct calls break isolation, kill parallelism (B blocks on A even when independent), and hide the dependency from the coordinator's orchestration graph. |
| Final report is missing 4 of 5 sub-domains. Where do you debug first? | The coordinator's decomposition function. The bug is upstream | Tune subagent prompts or upgrade their model | If the coordinator never enumerated music or writing, no subagent could research them. Decomposition is the load-bearing step. Fix it first. |
| How many subagents to fan out? | Bounded by Semaphore(MAX_PARALLEL=5). Diminishing returns past 5-7 | Unbounded. Spawn one per sub-domain regardless of count | Unbounded fan-out hits API concurrency limits, rate-limit backpressure, and coordinator-side context contention. Cap, measure, tune to your workload. |
Where it breaks
Five failure/fix pairs, each mapping to one exam pattern. The fix is always architectural, never a prose plea to the model.
**Failure:** Coordinator splits creative industries on the word creative. Spawns one subagent for creative writing. Misses music, film, visual arts, performing arts entirely.
**Fix:** Replace lexical split with a semantic enumerator. For each topic class, list ALL relevant sub-domains in code, not the first one the regex catches.
**Failure:** Coordinator awaits subagent 1 before spawning subagent 2. Latency = sum(subagents) ~ 25s for 5 tasks. Parallel architecture wasted.
**Fix:** Use asyncio.gather(*tasks) (Python) or Promise.all(tasks) (TS). All N subagents fire at once. Latency drops to max(subagents) ~ 5s.
**Failure:** Researcher A imports Researcher B and passes a finding directly. B inherits A's call-site context. Coordinator loses visibility on the dependency.
**Fix:** Route through the coordinator. A returns its finding; coordinator builds B's task prompt with the finding embedded. Hub-and-spoke is non-negotiable.
**Failure:** Coordinator spawns 50 subagents at once for a long taxonomy. Anthropic API rate-limits half of them; retry storms compound the problem.
**Fix:** Wrap fan-out in Semaphore(MAX_PARALLEL=5). Set a retry budget per subagent (default 2). Beyond that, accept partial data and mark the gap.
**Failure:** Web-search subagent times out, returns []. Coordinator interprets empty list as no information exists. Final report has a silent gap with no acknowledgement.
**Fix:** Return {status: 'timeout', query, partial_results, alternatives}. Coordinator inspects status_code, retries narrower or marks the gap transparently in synthesis.
Continue the parent
2 more sub-patterns under Multi-Agent Research System. Each one drills into a different load-bearing decision.
Structured Claim-Source Mapping
How verified claims get pinned to their source documents in a structured-output schema so the final report cannot fabricate.
Every verified claim emits as a JSON object with claim_id, claim_text, source_url, source_passage, confidence, and notes. The schema is enforced at the verification step. Synthesis renders citations from the schema and cannot invent attributions because the source text is pinned in-record.
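The record shape described above can be sketched as a dataclass. The field names follow the text; the dataclass itself and the sample values are illustrative assumptions, not the system's actual schema definition.

```python
from dataclasses import dataclass, asdict

# Sketch of one verified-claim record. The verification step enforces this
# shape; synthesis renders citations from it and cannot invent attributions
# because the source passage is pinned in-record.
@dataclass
class VerifiedClaim:
    claim_id: str
    claim_text: str
    source_url: str
    source_passage: str
    confidence: float
    notes: str = ""

claim = VerifiedClaim(
    claim_id="c1",
    claim_text="Example claim text",
    source_url="https://example.com/doc",
    source_passage="Exact quoted passage the claim is pinned to.",
    confidence=0.9,
)
record = asdict(claim)  # the JSON-ready object synthesis receives
```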
Subagent allowedTools and Isolation
How the parent agent enforces tool whitelists per subagent and how each subagent runs in a fresh context with no chat-history inheritance.
Every subagent declares an allowedTools list ([Read, WebSearch, Bash] for research, [Read] only for synthesis). The SDK enforces it. Each subagent runs in a fresh isolated context with no inherited messages. Every fact it needs is embedded in the task prompt. Tool overscoping and history inheritance are the canonical failure modes.
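A minimal sketch of per-role whitelist enforcement, assuming a hypothetical dispatch-time check (the real enforcement lives in the SDK):

```python
# Per-subagent tool whitelists from the text; the check function is an
# illustrative assumption about what dispatch-time enforcement looks like.
ALLOWED_TOOLS = {
    "research": ["Read", "WebSearch", "Bash"],
    "synthesis": ["Read"],
}

def tool_permitted(role: str, tool: str) -> bool:
    # A tool call outside the role's whitelist is rejected at dispatch;
    # an unknown role gets no tools at all.
    return tool in ALLOWED_TOOLS.get(role, [])

research_ok = tool_permitted("research", "WebSearch")
synthesis_bash = tool_permitted("synthesis", "Bash")
```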
Frequently asked
Should decomposition happen at runtime or be hardcoded?
The deep dive's position is that decomposition lives in code: a unit-testable function that maps each topic class to its full sub-domain list, reviewed before any spawn happens. A coverage bug at this step cannot be recovered downstream by tuning subagents or upgrading models, so the enumeration needs to be inspectable rather than improvised at runtime.
How does the coordinator know when all subagents are done?
asyncio.gather with return_exceptions=True (or Promise.allSettled in TypeScript) resolves once every task has either returned or raised; note that bare gather and Promise.all fail fast on the first exception. The coordinator inspects each result's stop_reason and status_code to decide accept, retry, or mark a gap. There is no polling. The runtime owns the join.