The problem
What the customer needs
- Complete coverage of the research topic. Every relevant sub-domain enumerated, none silently dropped.
- Contradictions preserved with attribution and reconciled in context, not flattened into one 'most likely' number.
- Cited final report that traces every claim to a verifiable source and acknowledges data gaps.
Why naive approaches fail
- Coordinator decomposes 'creative industries' into only visual arts. Misses music, writing, film, performing arts.
- Web-search subagent times out and returns empty results as success. Coordinator treats as 'no info' instead of 'needs retry'.
- Synthesis picks the 'more likely' statistic between 45% (Pew) and 12% (McKinsey). Drops the conflict, ships misinformation.
What success looks like
- Topic-coverage gap rate = 0 (decomposition reviewed before spawn)
- Timeout-as-empty-results rate = 0 (structured error context required)
- Contradiction-preservation rate = 100% (both stats + sources retained)
- Subagent-to-subagent direct call rate = 0 (all routes through coordinator)
The system
What each part does
Five components, each owning one concept.
Coordinator Agent
the hub of hub-and-spoke
Receives the user query, performs SEMANTIC decomposition (not lexical) into all relevant sub-domains, spawns research subagents in parallel with explicit task prompts, awaits all results, routes findings to verification, hands verified claims to synthesis. Owns every cross-subagent communication path.
Configuration
Decomposition is the load-bearing step. For 'impact of AI on creative industries' the coordinator must enumerate visual + music + writing + film + performing arts, not stop at the first sub-domain. Spawn pattern: asyncio.gather (Python) / Promise.all (TS).
Research Subagent (parallel)
scoped tools, isolated context
One subagent per sub-domain. Receives an explicit task prompt. No inherited history, no parent context. Runs research with a narrow tool whitelist (Read, WebSearch, Bash). Returns structured findings JSON: {claim, sources: [{url, date, confidence}]}. Never editorializes; reports facts as stated.
Configuration
system: "You are a research specialist. Find authoritative sources. Return JSON {findings: [{claim, sources: [...]}]}." tools: [Read, WebSearch, Bash]. messages: [{role: "user", content: task_from_coordinator}].
Verification Subagent
fact-check + reconcile contradictions
Cross-checks all claims from research subagents. When two sources conflict (45% Pew vs 12% McKinsey), preserves both with their context and attribution rather than picking the 'more likely' one. Returns verified claims with confidence scores; the verification step is what protects the report from misinformation.
Configuration
Input: pooled claims from all research subagents. Output: {verifications: [{claim, verified, confidence, sources_reconciled, notes}]}. Notes field captures the context that explains apparent contradictions (different timeframes, definitions, populations).
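Concretely, for the 45% / 12% running example, a preserved contradiction could look like this (a sketch; the stats and contexts come from the example above, the confidence score is illustrative):

```python
reconciled = {
    "claim": "Share of creative workers using AI",
    "verified": True,
    "confidence": 0.8,  # illustrative
    "sources_reconciled": [
        {"stat": "45%", "source": "Pew", "context": "any use of AI"},
        {"stat": "12%", "source": "McKinsey", "context": "daily use of AI"},
    ],
    # The notes field is what turns an apparent contradiction into information
    "notes": "Pew measures any use; McKinsey measures daily use. "
             "Different definitions, so both numbers stand.",
}
```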
Synthesis Subagent
read-only narrative generator
Receives verified claims + the coordinator's narrative prompt. Writes a cohesive Markdown report with inline citations [1], [2]. CRITICAL: tools restricted to Read only. No WebSearch, no Bash. This prevents re-research and keeps synthesis focused on stitching the verified facts into a story.
Configuration
system: "You are a synthesis specialist. Read verified findings and write a cited narrative. Do NOT research." tools: [Read]. Input: {verified_claims, narrative_prompt}. Output: Markdown with [n] citations.
Error Propagation Layer
structured timeout context
When a subagent times out or hits a dead end, it returns structured error context the coordinator can act on: {status: 'timeout', query, partial_results, alternatives}. The coordinator inspects the status field and either retries with a narrower scope, accepts partial data, or transparently marks the gap in the final report.
Configuration
On timeout: {status: 'timeout', query, partial, alternatives: ['narrower query', 'different keywords', ...]}. On no_results: {status: 'no_results', query, alternatives}. Never return [] as success. Silence loses the failure context.
Data flow
Eight steps to production
Build the coordinator's semantic decomposition
The decomposition step is where most coverage failures actually originate. Analyse the topic semantically and enumerate ALL relevant sub-domains before spawning anything. For 'creative industries', that means visual arts AND music AND writing AND film AND performing arts. Not the first one that comes to mind. The decomposition is the coordinator's load-bearing responsibility.
```python
from typing import List


def decompose_query(query: str) -> List[str]:
    """Semantic decomposition. Enumerate all relevant sub-domains.

    The exam-question distractor is to blame subagents for narrow
    coverage when the coordinator's decomposition was the bug.
    """
    q = query.lower()
    if "creative industries" in q:
        domains = [
            "visual arts (digital art, graphic design, photography)",
            "music production and composition",
            "writing (novels, journalism, screenwriting)",
            "film and video production",
            "performing arts (theater, dance)",
        ]
    elif "healthcare" in q:
        domains = [
            "clinical decision support",
            "medical imaging and diagnostics",
            "drug discovery and trials",
            "patient-facing communication",
            "administrative + revenue cycle",
        ]
    else:
        # Generic fallback. STILL decompose, never single-shot
        domains = [
            f"{query}. Recent academic literature",
            f"{query}. Industry case studies",
            f"{query}. Empirical adoption data",
        ]
    return [f"Find AI impact on {d}" for d in domains]
```

Define subagent system prompts and tool whitelists
Every subagent gets its own system prompt + scoped tool list. Research subagents get [Read, WebSearch, Bash]; verification gets [Read, WebSearch, Bash] + a fact-check rubric; synthesis gets [Read] only. That read-only restriction is the architectural detail that prevents synthesis from re-researching mid-narrative.
```python
RESEARCH_SUBAGENT_SYSTEM = """You are a research specialist.
1. Find authoritative sources on the assigned sub-domain.
2. Extract key claims with evidence.
3. Return structured JSON ONLY:
   {"findings": [
     {"claim": "...", "sources": [{"url": "...", "date": "YYYY-MM-DD", "confidence": 0.0-1.0}]}
   ]}
Do NOT synthesize, editorialize, or pick winners between conflicting sources.
Report facts as stated. Preserve contradictions for verification."""

VERIFICATION_SUBAGENT_SYSTEM = """You are a fact-checker.
1. Read pooled claims from research subagents.
2. Cross-check each claim against its sources.
3. When sources conflict, PRESERVE BOTH with attribution + context.
4. Return JSON:
   {"verifications": [
     {"claim": "...", "verified": true|false, "confidence": 0.0-1.0,
      "sources_reconciled": [...], "notes": "context about conflicts"}
   ]}"""

SYNTHESIS_SUBAGENT_SYSTEM = """You are a synthesis specialist.
1. Read verified findings (JSON).
2. Write a cohesive Markdown narrative with inline [n] citations.
3. Acknowledge data gaps transparently when verifications flag them.
4. Do NOT conduct new research. You have ONLY the Read tool."""
```

Spawn research subagents in parallel
All research subagents fire at once via async fan-out. Latency is max(subagents), not sum. The whole point of the architecture. Each subagent receives an explicit task prompt with the context it needs; nothing is inherited from the coordinator's history. Cost: N separate API calls. Worth it.
```python
import asyncio

from anthropic import AsyncAnthropic

client = AsyncAnthropic()


async def spawn_research(task: str) -> dict:
    """One isolated research subagent. No inherited history."""
    resp = await client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=2048,
        system=RESEARCH_SUBAGENT_SYSTEM,
        tools=RESEARCH_TOOLS,  # [Read, WebSearch, Bash]
        messages=[{"role": "user", "content": task}],
    )
    return {
        "task": task,
        "stop_reason": resp.stop_reason,
        "result": extract_json(resp),
    }


async def research_in_parallel(tasks: list[str]) -> list[dict]:
    """Fan out N subagents at once. Latency = max, not sum."""
    return await asyncio.gather(*(spawn_research(t) for t in tasks))


# Coordinator usage
tasks = decompose_query("impact of AI on creative industries")
results = asyncio.run(research_in_parallel(tasks))
```

Return structured error context, never silence
When a subagent times out or hits no results, the WORST thing it can do is return []. The coordinator then can't tell whether 'no info exists' or 'we never got the data', a critical distinction for the final report. Always return a structured error: status + query + partial_results + alternatives. The coordinator inspects the status field and decides: retry, narrow, or transparently mark the gap.
```python
def handle_subagent_error(error_type: str, query: str, partial: list | None = None) -> dict:
    """Structured error context. Coordinator consumes the status field to decide."""
    if error_type == "timeout":
        return {
            "status": "timeout",
            "query": query,
            "partial_results": partial or [],
            "alternatives": [
                "Narrow the query to a single sub-aspect",
                "Try synonyms or domain-specific terms",
                "Reduce the time horizon (e.g., last 12 months)",
            ],
        }
    if error_type == "no_results":
        return {
            "status": "no_results",
            "query": query,
            "alternatives": ["Broaden keywords", "Check spelling", "Try sister terms"],
        }
    if error_type == "rate_limited":
        return {"status": "rate_limited", "query": query, "retry_after_s": 60}
    return {"status": "unknown_error", "query": query}


# Coordinator inspects the status field
def coordinator_handle(result: dict, original_task: str):
    status = result.get("status")
    if status == "timeout":
        # Either retry with the first alternative or include a transparent gap
        return retry_with_narrower_query(result["alternatives"][0])
    if status == "no_results":
        return mark_data_unavailable(original_task)
    return result  # success
```

Run the verification subagent and preserve contradictions
Pool all claims from research subagents and pass them to a single verification subagent. When two sources conflict (45% Pew vs 12% McKinsey), the verification subagent's job is NOT to pick a winner. It is to preserve both with their context (different definitions, different timeframes, different populations) and attribute each to its source. Picking one is misinformation; preserving both is journalism.
```python
def build_verification_task(pooled_claims: list[dict]) -> str:
    """Pool all research-subagent claims for fact-checking."""
    body = "\n".join(
        f"- Claim: {c['claim']}\n  Sources: {c['sources']}"
        for c in pooled_claims
    )
    return f"""Verify the following claims pooled from research subagents.
For each claim:
1. Check that sources are credible and dated.
2. If two or more sources CONFLICT, do NOT pick a winner.
   Preserve both with their context (timeframe, definition, population)
   and emit both in sources_reconciled with notes explaining the apparent
   contradiction. The reader gets BOTH numbers + WHY they differ.
3. Return JSON:
   {{"verifications": [
     {{"claim": "...", "verified": true|false, "confidence": 0.0-1.0,
       "sources_reconciled": [...], "notes": "..."}}
   ]}}

CLAIMS TO VERIFY:
{body}"""


# Coordinator pools claims and dispatches (inside the coordinator's async context)
all_claims = [c for r in results for c in r["result"].get("findings", [])]
verification_task = build_verification_task(all_claims)
verified = await client.messages.create(
    model="claude-sonnet-4.5",
    max_tokens=4096,
    system=VERIFICATION_SUBAGENT_SYSTEM,
    tools=VERIFICATION_TOOLS,  # [Read, WebSearch, Bash] + fact-check rubric
    messages=[{"role": "user", "content": verification_task}],
)
```

Run the synthesis subagent with READ-ONLY tools
Synthesis is the final step. It receives verified claims + the coordinator's narrative prompt, and emits Markdown with inline citations. The crucial detail: the synthesis subagent's tool list is [Read] only. No WebSearch, no Bash. That restriction prevents it from re-researching mid-narrative (a common failure mode where synthesis fact-checks itself again and inflates latency 3-5×).
```python
import json


async def synthesize(verified_claims: list[dict], narrative_prompt: str) -> str:
    """Read-only narrative generation. No re-research."""
    task = f"""Narrative prompt from coordinator:
{narrative_prompt}

Verified findings (JSON):
{json.dumps(verified_claims, indent=2)}

Write a Markdown report that:
1. Flows logically from finding to finding
2. Uses inline citations [1], [2], etc. matching sources_reconciled order
3. ACKNOWLEDGES data gaps transparently where verifications.notes flagged them
4. Does NOT re-research or speculate beyond the verified findings
You have ONLY the Read tool. No WebSearch, no Bash."""
    resp = await client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=3000,
        system=SYNTHESIS_SUBAGENT_SYSTEM,
        tools=[READ_TOOL_ONLY],  # Critical: read-only
        messages=[{"role": "user", "content": task}],
    )
    return extract_text(resp)
```

Route ALL communication through the coordinator
If subagent B needs a finding from subagent A, the answer is NOT to call A from B. The answer is: A finishes, returns to coordinator, coordinator passes the finding into B's task prompt. This single rule preserves isolation (each subagent has clean context), parallelism (when dependencies allow), and visibility (the coordinator owns the whole orchestration graph).
```python
# WRONG. Direct subagent-to-subagent communication
# class ResearcherA:
#     def __init__(self, researcher_b):
#         self.b = researcher_b              # ❌ creates hidden coupling
#
#     async def find_papers(self, topic):
#         finding = await self._search(topic)
#         await self.b.search_web(finding)   # ❌ breaks isolation


# RIGHT. Coordinator owns all routing
async def coordinated_dependent_research(topic: str):
    # Phase 1: A runs alone
    a_finding = await spawn_research(f"Find academic papers on {topic}")
    # Phase 2: coordinator passes A's finding into B's TASK PROMPT
    b_task = (
        f"Topic: {topic}. "
        f"Key research direction surfaced by paper search: "
        f"{a_finding['result']['top_direction']}. "
        f"Now search the web for industry coverage of that direction."
    )
    b_result = await spawn_research(b_task)
    # Phase 3: coordinator continues with both
    return {"a": a_finding, "b": b_result}
```

Cap parallelism and add retry budgets
Parallel fan-out has diminishing returns past 5-7 subagents. API concurrency limits, context-window contention on the coordinator side, and rate-limit backpressure all kick in. Cap concurrency, set a retry budget per subagent (typically 2 retries with narrowed queries), and emit telemetry: spawn count, parallel max, retry rate, partial-data rate. These metrics are how you tune the system in production.
```python
import asyncio

MAX_PARALLEL = 5
RETRY_BUDGET = 2
semaphore = asyncio.Semaphore(MAX_PARALLEL)


async def spawn_with_retry(task: str, attempt: int = 0) -> dict:
    """Bounded-concurrency research with structured-error retry."""
    async with semaphore:
        result = await spawn_research(task)
    # Decide on retries OUTSIDE the semaphore, so a retrying task
    # doesn't hold one slot while waiting to acquire another.
    status = result.get("result", {}).get("status")
    if status == "timeout" and attempt < RETRY_BUDGET:
        # narrow_query + telemetry: helpers provided elsewhere
        narrower = narrow_query(result["result"].get("alternatives", [task])[0])
        telemetry.increment("subagent.retry", labels={"reason": "timeout"})
        return await spawn_with_retry(narrower, attempt + 1)
    return result


# Coordinator with bounded fan-out + retries
results = await asyncio.gather(*(spawn_with_retry(t) for t in tasks))
```

The four decisions
| Decision | Right answer | Wrong answer | Why |
|---|---|---|---|
| Coverage gap appears in the final report | Audit the coordinator's decomposition first. It almost always lives there | Tune subagent prompts or upgrade their model | Decomposition is the coordinator's job. If 4 of 5 sub-domains were never enumerated, no amount of subagent quality recovers them. Fix decomposition; the rest follows. |
| Subagent timed out. What does it return? | Structured error: {status: 'timeout', query, partial_results, alternatives} | Empty list [], marked as success | Silence loses the failure context. The coordinator can't distinguish 'no info exists' from 'we never got the data', so the final report can't acknowledge the gap honestly. |
| Two sources disagree (45% Pew vs 12% McKinsey) | Verification preserves both with attribution + context (different definitions, different timeframes) | Synthesis picks the 'more likely' one and drops the other | Both numbers are correct under their respective definitions. Dropping one is misinformation. Preserving both with context is the journalistic move and the architectural one. |
| Subagent A's output is needed by Subagent B | A returns to coordinator; coordinator passes A's finding into B's task prompt | A calls B directly with the finding | Direct subagent-to-subagent calls break isolation, kill parallelism (B waits for A), and hide the dependency from the coordinator's view. Hub-and-spoke is the architecture for a reason. |
Where it breaks
Five failure pairs. Each one is one exam question. The fix is always architectural: deterministic gates, structured fields, pinned state.
Coordinator decomposes 'creative industries' into visual arts only; report misses music, writing, film, performing arts. Subagents finished successfully. The bug is upstream.
AP-10: Fix the coordinator's semantic decomposition. Enumerate ALL relevant sub-domains before spawning. The decomposition step is the coordinator's load-bearing responsibility.
Web-search subagent times out, returns []. Coordinator treats as 'no information exists' instead of 'timed out'. Final report has a silent gap.
AP-11: Return structured error: {status: 'timeout', query, partial_results, alternatives}. Coordinator inspects the status field, retries with a narrower scope, or marks the gap transparently in the report.
Synthesis subagent calls verify_fact for 100 claims sequentially. 80 are simple (Wikipedia), 20 complex. Total 60+ seconds for what should be 10.
AP-12: Scope verify_fact narrowly: simple-claim batch verification (parallel, ~3s) + a dedicated complex-verification subagent (parallel, pre-synthesis); see the sketch after these pairs. Synthesis assumes facts are pre-verified.
Two sources conflict (45% any-use Pew vs 12% daily-use McKinsey). Synthesis picks the 'more likely' one; the other is dropped. Report is misinformation.
AP-13: Preserve both at the verification step with sources_reconciled + notes explaining the apparent conflict. Synthesis presents both with attribution. Reader sees both numbers and why they differ.
Researcher A (papers) directly hands a finding to Researcher B (web). Isolation breaks; parallelism degrades to sequential; coordinator loses visibility.
AP-14: Route everything through the coordinator. A returns to coordinator; coordinator constructs B's task prompt with A's finding embedded. Hub-and-spoke is non-negotiable.
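A sketch of the AP-12 fix referenced above. It assumes verify_simple and verify_complex helpers and a complexity tag on each claim; all three are illustrative, not part of the system above:

```python
import asyncio


async def verify_all(claims: list[dict]) -> list[dict]:
    """Split claims by difficulty, then verify each batch in parallel."""
    simple = [c for c in claims if c.get("complexity") == "simple"]
    complex_ = [c for c in claims if c.get("complexity") != "simple"]
    simple_done, complex_done = await asyncio.gather(
        asyncio.gather(*(verify_simple(c) for c in simple)),     # ~80 fast lookups
        asyncio.gather(*(verify_complex(c) for c in complex_)),  # ~20 multi-source checks
    )
    return list(simple_done) + list(complex_done)
```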
Cost & latency
- Research: 3 subagents × ~20K input + ~2K output ≈ $0.06; 5 subagents ≈ $0.10. Parallel: latency = max(subagents) ≈ 3-5s, cost = sum.
- Verification: reads pooled findings (~15K) + cross-checks (~10K) + emits verifications (~1K) ≈ $0.04. Single pass, serial.
- Synthesis: reads verified findings (~5K) + generates the Markdown narrative (~3K). Read-only tools keep cost low; no re-research.
- Retries: narrowed retries (~10K input + 1K output). Cap at 2 retries per subagent to bound cost; beyond that, accept partial data and mark the gap.
- Latency: decompose ~0.5s + parallel research ~5s + verification ~3s + synthesis ~3s + coordinator overhead. Subagents in parallel save ~10s vs sequential.
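A back-of-envelope check of the latency claim, using the stage estimates above (assumed, not measured; three research subagents):

```python
# Rough latency model from the per-stage estimates above
DECOMPOSE_S, RESEARCH_S, VERIFY_S, SYNTH_S = 0.5, 5.0, 3.0, 3.0
N_SUBAGENTS = 3

parallel = DECOMPOSE_S + RESEARCH_S + VERIFY_S + SYNTH_S                  # ~11.5s
sequential = DECOMPOSE_S + N_SUBAGENTS * RESEARCH_S + VERIFY_S + SYNTH_S  # ~21.5s
print(f"parallel fan-out saves ~{sequential - parallel:.0f}s")            # ~10s
```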
Ship checklist
Two passes. Build-time gates verify the code; run-time gates verify the system in production.
Build-time
- Coordinator's decomposition function reviewed for completeness BEFORE first spawn (see the gate sketch after this list) ↗ subagents
- Each subagent has its own system prompt + scoped tool whitelist ↗ tool-calling
- Synthesis subagent has Read-only tool list (no WebSearch, no Bash) ↗ context-window
- All subagent fan-out via async gather / Promise.all ↗ agentic-loops
- Structured error context on every timeout / no-results path ↗ structured-outputs
- Verification subagent preserves contradictions with attribution ↗ evaluation
- No direct subagent-to-subagent calls anywhere in the codebase ↗ subagents
- Coordinator inspects the status field on every subagent return
- Bounded concurrency (semaphore) to cap parallel API calls
- Per-subagent retry budget with narrowed-query alternatives
- Telemetry: spawn count, parallel max, retry rate, partial-data rate
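A sketch of the decomposition gate referenced in the first item (min_subdomains is an assumed threshold; tune it per topic class):

```python
def gate_decomposition(tasks: list[str], min_subdomains: int = 3) -> list[str]:
    """Deterministic pre-spawn gate: refuse to fan out on a narrow decomposition."""
    if len(tasks) < min_subdomains:
        raise ValueError(
            f"Decomposition produced only {len(tasks)} sub-domain(s): {tasks}. "
            "Review the coordinator's decomposition before spawning."
        )
    return tasks
```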
Run-time
- Decomposition function unit-tested with 5+ representative topic types
- Each subagent's tool whitelist verified. No Bash + Edit + Write together unless intended
- Structured error contract enforced via TypeScript / Pydantic schema (see the sketch after this list)
- Bounded concurrency (semaphore) tested under burst load (20+ queued tasks)
- Retry budget per subagent capped (default 2); telemetry on retry rate
- Verification subagent has fact-check rubric + contradiction-preservation rule documented
- Synthesis subagent's tool list is [Read] only; lint check prevents regression
- End-to-end test: contradictions survive from research → verification → synthesis with attribution
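A sketch of the error contract and the read-only regression guard (SubagentError and SYNTHESIS_TOOLS are illustrative names; READ_TOOL_ONLY is the constant used in the synthesis step above):

```python
from typing import Literal, Optional

from pydantic import BaseModel


class SubagentError(BaseModel):
    status: Literal["timeout", "no_results", "rate_limited", "unknown_error"]
    query: str
    partial_results: list = []
    alternatives: list[str] = []
    retry_after_s: Optional[int] = None


def test_synthesis_is_read_only():
    # Fails CI if anyone widens the synthesis tool list beyond Read
    assert SYNTHESIS_TOOLS == [READ_TOOL_ONLY]
```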
Five exam-pattern questions
A research system decomposes 'impact of AI on creative industries' into three subtopics: visual arts, music, writing. The web-search subagent finds results for all three. The synthesis subagent produces a report covering only visual arts. Why?
Audit the coordinator, not the subagents. The decomposition was already incomplete (three of five sub-domains; film and performing arts were never enumerated), and a report covering only one subtopic points at the coordinator's routing: check what actually landed in the synthesis task prompt. Subagent success does not guarantee the coordinator passed everything downstream; coverage bugs almost always live upstream. Tagged to AP-10.
A web-search subagent times out and returns an empty result list. The coordinator treats this as 'no information available' and moves forward. The final report is incomplete. What's the architectural fix?
Return structured error context instead of silence: {status: 'timeout', query, partial_results, alternatives: ['narrower query', 'different keywords', ...]}. The coordinator inspects the status field and decides: retry with the first alternative, accept partial data, or transparently mark the gap in the synthesis. Returning [] as success conflates 'no info exists' with 'we never got the data', and the report can't acknowledge a gap it can't see. Tagged to AP-11.
A research report cites two conflicting statistics: '45% of creative workers use AI' (Pew) and '12% use AI daily' (McKinsey). Should synthesis pick the more likely one?
No. Verification emits sources_reconciled: [{stat: '45%', source: 'Pew', context: 'any use'}, {stat: '12%', source: 'McKinsey', context: 'daily use'}] plus a notes field explaining the apparent contradiction. Synthesis then presents both with attribution. Picking one is misinformation; preserving both is journalism. Tagged to AP-13.
Subagent A (academic papers) finds a key research direction. Subagent B (web search) needs that finding to guide its queries. Should A pass it directly to B?
No. A returns to the coordinator, and the coordinator embeds the finding in B's task prompt: 'Topic: ... Key direction from paper search: [A's finding]. Now search the web for that direction.' Direct calls break isolation (B inherits A's context noise), kill parallelism (B waits for A even when it could run in parallel), and hide the dependency from the coordinator's orchestration graph. Hub-and-spoke is non-negotiable. Tagged to AP-14.
A synthesis subagent needs to verify ~100 facts in a final report. Calling `verify_fact` sequentially takes 60+ seconds. What's the architectural fix?
Scope verify_fact narrowly. First, simple-claim verification (Wikipedia lookups, ~5ms each) batched in parallel completes ~80 claims in ~2s. Second, dedicate a separate verification subagent that runs before synthesis, processing complex multi-source reconciliation in parallel; synthesis then assumes facts are pre-verified and only reads. Total latency drops from 60s+ to ~3-5s. The general lesson: don't let the synthesis subagent re-research mid-narrative. Tagged to AP-12.
Frequently asked
Should subagents run in parallel or in sequence?
Parallel by default, via asyncio.gather / Promise.all. Cost: N separate API calls. Latency: max(N) ≈ 5-8s, not sum. Sequence only when there's a true data dependency (B needs A's output), and even then the coordinator handles the chaining; subagents never call each other.
Can subagents inherit the coordinator's conversation history?
No. The coordinator distills what each subagent needs into an explicit task prompt: 'User asked: [query]. Key context so far: [pinned facts]. Your task: [focused research goal].' The subagent starts fresh with only what's in the task. Inheriting history defeats parallelism and bloats per-subagent context cost.
What happens if multiple subagents return partial / timeout results?
The coordinator accepts the partial data and tells synthesis about it: 'Research is incomplete due to timeouts on [X, Y]. The report should acknowledge gaps in those areas explicitly.' Transparency beats silence. The reader sees a report that says 'we got these 3 sub-domains; the other 2 timed out' rather than a confidently misleading report missing 2 whole sub-domains.
Should the synthesis subagent have web-search access?
No. Synthesis gets [Read] only. Web access invites re-research mid-narrative, which inflates latency and bypasses the verification step; synthesis stitches verified facts into a story, nothing more.
How do we handle contradictions surfaced by research subagents?
The verification subagent preserves every conflicting claim with attribution in sources_reconciled, plus a notes field explaining the conflict (different timeframes, definitions, populations, methodologies). Synthesis then presents both with context. The reader gets transparency; the system avoids fabricating false certainty.
Can a subagent spawn another subagent (nested)?
Keep the hierarchy flat: the coordinator is the only spawner. Nested spawning hides part of the orchestration graph from the coordinator, breaking the single-routing-point rule and making retry budgets and error propagation hard to reason about.
What's the maximum number of subagents to run in parallel?
Diminishing returns set in past 5-7. Cap concurrency with a semaphore (this system uses MAX_PARALLEL = 5), measure latency at different fan-outs, and tune to your workload. For 10+ tasks, consider the Batch API or sequential task chains.