What it is
Structured claim-source mapping is the contract that says every claim in the final report is traceable to a specific passage in a specific source. The verification subagent emits JSON that pairs each claim with its provenance: claim_id, claim_text, source_url, source_passage (the actual quoted text, not just a URL), publication_date, and confidence. Synthesis reads only this schema and renders inline citations from it. No free-form attribution, no model-generated URLs.
The architectural reason is anti-fabrication. A synthesis subagent that writes prose plus citations from memory will eventually produce a citation that looks plausible but doesn't exist. By forcing synthesis to render [1] from a record where source_passage is already pinned, you remove the model's ability to hallucinate the link. If the passage isn't in the schema, there is no citation to render. Either the record is complete or it gets surfaced as a data gap.
The schema also handles the conflicting-source case explicitly. When two sources disagree (45% Pew vs 12% McKinsey), the verification subagent emits a sources_reconciled array with both records pinned, plus a notes field that explains the apparent conflict (different timeframes, different definitions, different populations). Synthesis presents both with attribution; it does not pick a winner. Picking one is misinformation. Preserving both with context is journalism.
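A hedged sketch of what one reconciled record might look like for the Pew/McKinsey example, in the same JSON notation the schema uses. Every value in angle brackets is a placeholder, not real data, and the confidence number is purely illustrative:

```json
{
  "claim_id": "c-01",
  "claim_text": "Reported adoption varies widely depending on how use is defined",
  "verified": true,
  "confidence": 0.8,
  "sources_reconciled": [
    {
      "stat": "45%",
      "source_url": "<Pew report URL>",
      "source_passage": "<verbatim sentence from the Pew report containing the 45% figure>",
      "publication_date": "<Pew publication date>",
      "context": "any-use definition"
    },
    {
      "stat": "12%",
      "source_url": "<McKinsey report URL>",
      "source_passage": "<verbatim sentence from the McKinsey report containing the 12% figure>",
      "publication_date": "<McKinsey publication date>",
      "context": "daily-use definition"
    }
  ],
  "notes": "Figures differ because the sources use different definitions (any-use vs daily-use)."
}
```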
How it works
The verification subagent receives pooled findings from research subagents and a fact-check rubric. For each candidate claim, it confirms the source is credible and dated, extracts the verbatim source_passage that backs the claim, and assigns a confidence score. The output is JSON only: {verifications: [{claim_id, claim_text, verified, confidence, sources_reconciled: [{stat, source_url, source_passage, publication_date, context}], notes}]}. The schema is enforced via Pydantic in Python or Zod in TypeScript; malformed output fails the verification step and triggers a retry.
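A minimal Pydantic sketch of that schema, assuming Python. The field names mirror the JSON above; the types, optionality, and the 0-to-1 confidence bound are assumptions rather than a fixed spec:

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class SourceRecord(BaseModel):
    stat: str                       # figure as stated by this source, e.g. "45%"
    source_url: str
    source_passage: str             # verbatim quoted text that backs the claim
    publication_date: str           # kept as a string; parse to a date if preferred
    context: Optional[str] = None   # timeframe / definition / population notes


class Verification(BaseModel):
    claim_id: str
    claim_text: str
    verified: bool
    confidence: float = Field(ge=0.0, le=1.0)
    sources_reconciled: List[SourceRecord] = Field(default_factory=list)
    notes: Optional[str] = None


class VerificationOutput(BaseModel):
    verifications: List[Verification]
```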
Synthesis is read-only. It receives the verified-claims JSON, the coordinator's narrative prompt, and a tool list of [Read] only. It walks the verifications array in order, writes prose that flows logically, and emits inline [1], [2] citations that index into sources_reconciled. The render is mechanical. Synthesis is never asked to invent an attribution; if it tries, the verification record is the only thing the citation can point to.
Data gaps are first-class. When verification finds a claim it cannot confirm (no credible source, conflicting data without enough context, or a research-subagent timeout), it emits {verified: false, notes: 'no credible source within window'}. Synthesis is instructed to acknowledge that gap in prose: "Adoption rates among independent musicians remain unverified across our sources." Transparent gaps beat confident fabrication every time. This is the architectural detail that protects the report from looking complete when it isn't.
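A minimal sketch of the mechanical part of that render, assuming the verified-claims JSON has already been parsed into plain dicts. render_verified_claims is a hypothetical helper, not an SDK call; the real synthesis subagent wraps this indexing logic in flowing prose, but the citation indexes and gap acknowledgements come from the same walk:

```python
def render_verified_claims(verifications: list[dict]) -> tuple[list[str], list[str]]:
    """Walk the verifications array in order; cite only pinned records."""
    body, references = [], []
    for v in verifications:
        if not v.get("verified"):
            # Data gap: acknowledge it in prose instead of dropping the claim.
            body.append(f"{v['claim_text']} (unverified: {v.get('notes', 'no notes')}).")
            continue
        markers = []
        for src in v["sources_reconciled"]:
            # Each reference entry is built from the pinned record, never from memory.
            references.append(f"{src['source_url']} ({src['publication_date']}): {src['source_passage']}")
            markers.append(f"[{len(references)}]")
        body.append(f"{v['claim_text']} {''.join(markers)}")
    return body, references
```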
The 4 decisions
Each row pairs the right answer with the most-tested distractor. The Why column explains the failure mode behind the wrong choice.
| Decision | Right answer | Wrong answer | Why |
|---|---|---|---|
| Two sources disagree (45% Pew vs 12% McKinsey) | Preserve both in sources_reconciled with attribution + notes explaining the difference | Pick the higher-confidence source and drop the other | Both numbers are correct under their own definitions (any-use vs daily-use). Dropping one is misinformation. Preserving both with context is the journalistic and architectural move. |
| Synthesis needs a citation for a claim. Where does the URL come from? | From the source_url field in the verification record | Synthesis generates the citation from memory of training data | Model-generated citations are the canonical fabrication failure. Schema-pinned citations cannot be invented. The model is rendering, not authoring. |
| A claim has no credible source. What happens? | Emit {verified: false, notes: 'no credible source'}. Synthesis acknowledges the gap in prose | Drop the claim silently from the final report | Silent drops produce reports that look complete when they aren't. Acknowledged gaps are honest and let the reader judge confidence. |
| Should the schema include the verbatim source_passage? | Yes. The exact quoted text that backs the claim | Just the URL. The passage can be re-fetched at render time | Re-fetching introduces a new failure mode (URL went 404, page changed). Pinning the passage at verification time freezes provenance. The record is self-contained. |
Where it breaks
5 failure pairs. Each one maps to one exam pattern. The fix is always architectural, never a prose plea to the model.
Failure: Synthesis writes prose with (Pew, 2024) style citations from memory. Half the year tags are wrong; one URL doesn't resolve.
Fix: Force synthesis to render [1], [2] indexes that resolve through the verified-claims JSON. The model is a renderer, not a citation author.
Failure: Verification sees 45% Pew and 12% McKinsey, writes ~30% (averaged). The averaged number doesn't exist anywhere; the report is misinformation.
Fix: Emit both in sources_reconciled with attribution + notes explaining timeframe/definition/population differences. Synthesis presents both with context.
Failure: Verification can't confirm a claim, drops it. Final report reads as if no such question was ever asked.
Fix: Emit {verified: false, notes: 'unverified'} and instruct synthesis to acknowledge the gap. Transparency beats false completeness.
Failure: Schema stores source_url but not source_passage. A week later the URL returns 404 or the page is rewritten. Report claims become unverifiable.
Fix: Pin the verbatim source_passage at verification time. The record is self-contained even if the source URL drifts.
Failure: Verification emits malformed JSON; synthesis parses what it can and improvises the rest. Fabrication slips back in via the missing fields.
Fix: Validate verification output with Pydantic / Zod. Malformed output fails the step and triggers a retry. Synthesis never sees an incomplete record (a minimal retry sketch follows below).
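A hedged sketch of that parse-or-retry gate, assuming Python and Pydantic v2. run_verification_subagent is a placeholder for however your stack invokes the subagent; the schema model is passed in so the gate stays generic:

```python
from pydantic import BaseModel, ValidationError

MAX_ATTEMPTS = 3


def validated_output(run_verification_subagent, output_model: type[BaseModel]) -> BaseModel:
    """Parse-or-retry gate: synthesis never sees an incomplete record."""
    for _ in range(MAX_ATTEMPTS):
        raw_json = run_verification_subagent()  # placeholder: returns raw JSON text
        try:
            return output_model.model_validate_json(raw_json)  # pydantic v2 API
        except ValidationError:
            continue  # malformed output fails this attempt; retry the step
    raise RuntimeError("verification output never validated; refusing to synthesize")
```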
Exam patterns
5 V2 questions wired to this deep dive. Each shows all 4 options with rationale, the mental model under test, and the priority order across distractors.
Concepts wired
4 primitives compose this sub-pattern. Each card links to the concept page where the primitive is taught in isolation.
Continue the parent
2 more sub-patterns under Multi-Agent Research System. Each one drills into a different load-bearing decision.
Coordinator Routing
How the coordinator decomposes a research query and dispatches subtasks across the hub-and-spoke topology.
The coordinator owns semantic decomposition (not lexical), enumerates every relevant sub-domain before spawning anything, and dispatches research subagents in parallel via a synchronous-fork-then-join pattern. Coverage gaps live in the decomposition step, not in the subagents.
Subagent allowedTools and Isolation
How the parent agent enforces tool whitelists per subagent and how each subagent runs in a fresh context with no chat-history inheritance.
Every subagent declares an allowedTools list ([Read, WebSearch, Bash] for research, [Read] only for synthesis). The SDK enforces it. Each subagent runs in a fresh isolated context with no inherited messages. Every fact it needs is embedded in the task prompt. Tool overscoping and history inheritance are the canonical failure modes.
Frequently asked
Why pin the passage instead of just the URL?
Pinning the verbatim source_passage at verification time freezes provenance in the record itself. The record is self-contained even if the source URL later breaks.
How does the schema prevent fabrication if the model is still generating prose?
Each inline [1] resolves through the verified-claims JSON to a record with source_url and source_passage. If a record doesn't exist, there's no [N] to render. The model can't invent attributions because it's rendering an array, not authoring citations.
What confidence score threshold should trigger a gap acknowledgement?
Below 0.6 confidence, the verification record carries a notes field that synthesis surfaces; below 0.3, synthesis adds an explicit gap acknowledgement; at 0.6 and above, it presents the claim normally. Tune the thresholds by calibrating against human-graded reports.
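A small sketch of that policy, assuming Python; the 0.3 and 0.6 cut points are the illustrative values above, not fixed constants:

```python
def gap_policy(confidence: float) -> str:
    """Map a verification confidence score to how synthesis presents the claim."""
    if confidence < 0.3:
        return "acknowledge_gap"     # explicit gap sentence in the prose
    if confidence < 0.6:
        return "surface_notes"       # present the claim, surface the notes field
    return "present_normally"        # cite the claim as usual
```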