TLDR
Delegation, Description, Discernment, Diligence: Anthropic's 4D framework for agent prompt design. A full deep-dive guide is coming soon.
What it is
The 4D Framework is Anthropic's pedagogical model for AI fluency, taught in Claude Certifications and operationalized in prompt design. The four dimensions are Delegation, Description, Discernment, and Diligence, each addressing a phase of the human-AI collaboration lifecycle. It is not a technical architecture; it is a mental model for interacting with AI responsibly and effectively. Exam questions on prompt design typically test whether you recognize which D is missing.
Delegation is the decision about which work the human should do and which the AI should do. Example: "Should I research ten competitors myself or ask Claude?" The decision is informed by platform awareness (Claude's strengths, such as synthesis, and weaknesses, such as real-time data) and task awareness (which work benefits from AI). Poor delegation: asking Claude to verify legal text, since humans must validate it. Good delegation: asking Claude to draft for human review.
Description is how you communicate the delegated task. Three sub-dimensions: Product (desired format), Process (step-by-step instructions), Performance (the tone/style/role). "Summarize" (vague) vs "3-sentence summary in bullets, non-technical audience, friendly advisor voice" (Description done right). Techniques: few-shot examples, explicit constraints, chain-of-thought prompting.
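The three sub-dimensions can be kept apart in code so that none is silently dropped. The sketch below is illustrative only: the `Description` class and its `to_prompt` method are hypothetical names, not part of any Anthropic library, and the prompt layout is one possible choice.

```python
from dataclasses import dataclass


@dataclass
class Description:
    """Hypothetical container for the three Description sub-dimensions."""
    product: str      # desired output format
    process: str      # steps the model should follow
    performance: str  # tone, style, or role

    def to_prompt(self, task: str) -> str:
        # One line per sub-dimension, so a missing one is easy to spot.
        return "\n".join([
            f"Task: {task}",
            f"Output format: {self.product}",
            f"Process: {self.process}",
            f"Voice: {self.performance}",
        ])


desc = Description(
    product="3-sentence summary in bullets",
    process="Read the whole text first, then pick the three most important points",
    performance="Friendly advisor voice for a non-technical audience",
)
prompt = desc.to_prompt("Summarize the attached report")
print(prompt)
```

Compared with a bare "Summarize", the generated prompt states the format, the steps, and the voice explicitly.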
Discernment is the human's quality-control lens, needed because LLMs are fallible. Three sub-dimensions: Product Discernment (is the output accurate?), Process Discernment (is the reasoning logical?), Performance Discernment (is the tone effective?). The loop: evaluate, refine Description, re-prompt.
Diligence is responsible use: choosing the right model, being transparent about AI's role, and verifying before shipping.
How it works
The four Ds run in sequence, and the sequence forms a cycle. Start with Delegation (what task to hand off), move to Description (write the prompt), execute, then apply Discernment (evaluate the output) and Diligence (is it safe to deploy?). If Discernment detects a flaw, loop back to Description, re-execute, and discern again. The cycle repeats until the output meets the standard or you conclude the task is unsuitable for AI, which sends you back to Delegation.
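The cycle can be sketched as a bounded loop. This is a minimal illustration under assumptions: `run_4d_cycle`, its callable parameters, and the `max_rounds` cap are hypothetical, and the stub functions stand in for a real model call and a real human review.

```python
from typing import Callable, Optional


def run_4d_cycle(
    prompt: str,
    execute: Callable[[str], str],            # sends the prompt, returns output
    discern: Callable[[str], Optional[str]],  # returns a flaw, or None if acceptable
    refine: Callable[[str, str], str],        # rewrites the prompt given a flaw
    max_rounds: int = 3,                      # assumed cap before rethinking Delegation
) -> dict:
    for round_no in range(1, max_rounds + 1):
        output = execute(prompt)
        flaw = discern(output)
        if flaw is None:
            # Discernment passed; Diligence still decides whether to deploy.
            return {"status": "ready_for_diligence", "output": output, "rounds": round_no}
        prompt = refine(prompt, flaw)  # loop back to Description
    # Repeated failures suggest the task may be unsuitable for AI.
    return {"status": "reconsider_delegation", "rounds": max_rounds}


# Stubs standing in for a model call and a reviewer.
def fake_execute(p: str) -> str:
    return "BULLETS" if "bullets" in p else "paragraph"


def fake_discern(o: str) -> Optional[str]:
    return None if o == "BULLETS" else "not in bullets"


def fake_refine(p: str, flaw: str) -> str:
    return p + " Use bullets."


result = run_4d_cycle("Summarize.", fake_execute, fake_discern, fake_refine)
```

With these stubs the first round fails Discernment, the prompt is refined, and the second round passes.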
Practical manifestation: (1) Delegation: "I need Claude to extract data from legal documents." (2) Description: prompt with input schema, few-shot examples, explicit rules. (3) Execute. (4) Discernment: validate JSON against schema, spot-check 5 documents. (5) Diligence: if 95%+ accurate, deploy with human-in-the-loop for edge cases; if <95%, loop back to Description.
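Steps (4) and (5) reduce to a spot-check score and a threshold gate. The sketch below assumes exact-match comparison between extracted records and hand-labeled records; the function names and the sample data are hypothetical.

```python
def spot_check_accuracy(predictions: list[dict], labels: list[dict]) -> float:
    """Fraction of spot-checked documents whose extraction matches the label exactly."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("need one prediction per labeled document")
    correct = sum(1 for pred, label in zip(predictions, labels) if pred == label)
    return correct / len(labels)


def deployment_decision(accuracy: float, threshold: float = 0.95) -> str:
    # At or above the threshold: deploy, keeping humans on edge cases.
    # Below it: loop back to Description and revise the prompt.
    if accuracy >= threshold:
        return "deploy_with_human_in_the_loop"
    return "revise_description"


# Five spot-checked documents (made-up data); the last extraction is wrong.
labels = [{"party": f"Company {i}"} for i in range(5)]
predictions = [dict(label) for label in labels]
predictions[4] = {"party": "Unknown"}

accuracy = spot_check_accuracy(predictions, labels)
decision = deployment_decision(accuracy)
```

A five-document sample is coarse: a single miss already drops accuracy to 80%, so a larger sample gives a more meaningful reading against a 95% bar.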
The Framework is agnostic to model, task, and scale. Whether you use Claude for a one-off question or inside a multi-agent system, the four Ds apply. In agentic loops, each turn re-runs the cycle: Delegation (which tool next?), Description (how to describe the call?), execute, Discernment (is the output valid?), Diligence (is it safe to append?).
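One agent turn might look like the sketch below. It is not tied to any agent SDK: `run_agent_turn`, the allowlist dictionary, the validator callable, and the history record shape are all hypothetical choices made for illustration.

```python
from typing import Any, Callable


def run_agent_turn(
    history: list[dict],
    tool_name: str,
    tool_args: dict[str, Any],
    allowed_tools: dict[str, Callable[..., Any]],
    is_valid: Callable[[Any], bool],
) -> bool:
    """Run one turn; return True only if the tool result was appended."""
    # Delegation: only tools on the allowlist may be called.
    tool = allowed_tools.get(tool_name)
    if tool is None:
        history.append({"kind": "rejected", "tool": tool_name, "reason": "tool_not_allowed"})
        return False
    # Description + execute: the arguments describe the call.
    result = tool(**tool_args)
    # Discernment: check the output before trusting it.
    if not is_valid(result):
        history.append({"kind": "rejected", "tool": tool_name, "reason": "invalid_output"})
        return False
    # Diligence: only validated results enter the history.
    history.append({"kind": "tool_result", "tool": tool_name, "content": result})
    return True


history: list[dict] = []
tools = {"lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"}}
ok = run_agent_turn(history, "lookup_order", {"order_id": "A1"}, tools, lambda r: "status" in r)
blocked = run_agent_turn(history, "delete_order", {"order_id": "A1"}, tools, lambda r: True)
```

Recording rejections in the history, rather than dropping them, keeps an audit trail of what the agent attempted.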
The exam heavily tests Diligence + Delegation trade-offs. "Should we auto-approve refunds <$100?" Wrong: "Yes, automate everything." Right: "Diligence demands human review even for small amounts." The 4D lens makes this clear: Delegation decides what to hand off; Diligence validates that the outcome is safe to deploy without human oversight.

Where you'll see it
Customer support refund workflow with 4D
Delegation: humans handle policy interpretation; Claude handles fact-gathering. Description: prompt extracts customer ID, order ID, refund reason, summarizes policy-violation status. Discernment: validate fields, check policy interpretation. Diligence: refunds >$500 bypass Claude; <$100 auto-approved by Claude but logged.
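The Diligence rule in this workflow can be sketched as a routing function. The text only specifies the bands above $500 and below $100; sending the middle band (including exactly $100 and $500) to human review after Claude gathers the facts is an assumption, as are the function and route names.

```python
def route_refund(amount: float) -> dict:
    """Route a refund request by amount (thresholds in USD)."""
    if amount <= 0:
        raise ValueError("refund amount must be positive")
    if amount > 500:
        # High-value refunds bypass Claude entirely.
        return {"route": "human_only", "claude_involved": False, "log": True}
    if amount < 100:
        # Small refunds are auto-approved, but every decision is logged.
        return {"route": "auto_approve", "claude_involved": True, "log": True}
    # Assumed middle band: Claude gathers facts, a human decides.
    return {"route": "human_review", "claude_involved": True, "log": True}


small = route_refund(50)
middle = route_refund(250)
large = route_refund(750)
```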
Code review automation for PRs
Delegation: Claude reviews; humans merge. Description: provide PR diff, ask for JSON {verdict, critical_issues, recommendations}. Discernment: read review, verify verdict matches findings. Diligence: never auto-merge; require human click-through. Transparency: "AI-assisted code review, human approval required."
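The Discernment step here, checking that the verdict matches the findings, can be partly automated before a human reads the review. The sketch assumes the verdict is a string with `"approve"` as the passing value; that value and the `check_review` helper are hypothetical.

```python
import json

REQUIRED_FIELDS = {"verdict", "critical_issues", "recommendations"}


def check_review(raw: str) -> dict:
    """Validate a review returned as JSON text; never authorizes a merge."""
    try:
        review = json.loads(raw)
    except json.JSONDecodeError:
        return {"ok": False, "reason": "invalid_json"}
    if not isinstance(review, dict):
        return {"ok": False, "reason": "not_an_object"}
    missing = REQUIRED_FIELDS - review.keys()
    if missing:
        return {"ok": False, "reason": "missing_fields", "fields": sorted(missing)}
    # An approving verdict alongside critical issues contradicts itself.
    if review["verdict"] == "approve" and review["critical_issues"]:
        return {"ok": False, "reason": "verdict_contradicts_findings"}
    return {
        "ok": True,
        "auto_merge": False,  # Diligence: a human always clicks merge
        "label": "AI-assisted code review, human approval required.",
    }


consistent = check_review('{"verdict": "approve", "critical_issues": [], "recommendations": ["add tests"]}')
contradictory = check_review('{"verdict": "approve", "critical_issues": ["SQL injection"], "recommendations": []}')
```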
Expense report validation
Delegation: Claude extracts merchant, amount, category; humans authorize. Description: receipt image, structured output. Discernment: spot-check 10 manually. Diligence: expenses >$5000 auto-escalate; <$100 auto-approved + logged.
Email triage with 4D decomposition
Delegation: Claude filters spam, routes; humans respond. Description: classify (spam/feedback/billing/legal), extract intent, suggest routing. Discernment: false-positive rate <2% for legal mail. Diligence: escalate legal/compliance, never auto-delete.
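The routing rule can be sketched as a lookup with a safe default. The queue names are hypothetical, and two choices are assumptions: spam is quarantined rather than deleted, and any unrecognized category escalates to a human.

```python
# Hypothetical queue names; no route deletes mail.
ROUTES = {
    "spam": "quarantine",
    "feedback": "product_team",
    "billing": "billing_team",
    "legal": "escalate_to_human",
}


def route_email(category: str) -> str:
    """Map a classifier label to a queue; unknown labels escalate."""
    return ROUTES.get(category.strip().lower(), "escalate_to_human")


legal_route = route_email("legal")
spam_route = route_email("Spam")
unknown_route = route_email("partnership")
```

Defaulting to escalation means a misclassified or novel message lands in front of a person instead of being silently dropped.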
Code examples
```python
import json

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# DELEGATION: Claude extracts; humans validate and authorize
# DESCRIPTION: structured prompt with schema and constraints
# DISCERNMENT: validate output, measure accuracy
# DILIGENCE: high-risk amounts escalate, low amounts auto-approve


def validate_expense_4d(receipt_b64: str, threshold: float = 5000):
    # DESCRIPTION: explicit schema + constraints
    prompt = """Extract expense from receipt. Return JSON:
{
  "merchant": "string",
  "amount": number,
  "currency": "USD" | "other",
  "category": "travel" | "meals" | "supplies" | "other",
  "tax": number,
  "confidence_score": 0.0-1.0
}
- Confidence < 0.7? Set amount to null.
- Missing? Use null, don't invent.
"""
    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": receipt_b64}},
                {"type": "text", "text": prompt},
            ],
        }],
    )

    # DISCERNMENT: validate
    try:
        output = json.loads(resp.content[0].text)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}
    if not isinstance(output, dict):
        return {"status": "error", "reason": "not_an_object"}
    # `or 0` treats a null confidence_score as low confidence
    if (output.get("confidence_score") or 0) < 0.7:
        return {"status": "hold", "reason": "low_confidence", "data": output}
    # The prompt allows a null amount; never approve without one
    amount = output.get("amount")
    if amount is None:
        return {"status": "hold", "reason": "missing_amount", "data": output}

    # DILIGENCE: route by amount
    if amount > threshold:
        return {"status": "escalate", "reason": f"amount > ${threshold}", "data": output, "action": "SEND_TO_MANAGER"}
    return {"status": "approved", "data": output, "audit_log": {"method": "ai_extracted_auto_approved", "reviewer": "claude-opus-4-5"}}
```
Looks right, isn't
Each row pairs a plausible-looking pattern with the failure it actually creates. These are the shapes exam distractors are built from.
| Looks right | Why it isn't |
|---|---|
| The 4D Framework is a technical architecture (like MVC). | It's a mental model for human-AI collaboration, not a code pattern. Guides how you think about delegating, communicating, validating, and deploying responsibly. |
| If Discernment detects a flaw, increase max_tokens and retry. | Discernment failures usually mean Description was vague or Delegation was wrong. Refine the prompt or reconsider task suitability. More tokens rarely fix flawed delegation. |
| Diligence means trusting Claude's output completely. | Diligence means verifying and being honest about limitations. Trusting blindly is the opposite. Diligence is "I validated this before shipping." |
| All four Ds must be applied to every task. | Low-stakes tasks (brainstorming): Discernment and Diligence can be minimal. High-stakes (compliance, financial): all four are critical. |
| Description is the same as 'write a better prompt.' | Description includes Product, Process, Performance. "Better prompt" is vague. Description is specific: schema, examples, tone. |
Side-by-side
| Dimension | Delegation | Description | Discernment | Diligence |
|---|---|---|---|---|
| What it answers | What work should AI do? | How do I describe the task? | Is the output correct? | Is it safe to deploy? |
| Phase | Before prompting | Prompt design | After execution | Final validation |
| Failure mode | AI does non-AI work (legal validation) | Vague prompt | Accepting output without review | Deploying without auditing |
| Fix | Reassign: human validates, Claude drafts | Add schema, examples, constraints | Spot-check; validate schema | Log decisions; escalate high-risk |
| Exam signal | "Should AI or human do this?" | "Write a prompt that..." | "Why did the output fail?" | "When should this be escalated?" |
Decision tree
Should a human or Claude do this work?
Is your prompt clear on Product, Process, Performance?
Did the output meet expectations?
Is it safe to deploy without human review?
Are you using all 4 Ds for high-stakes tasks?
Frequently asked
Is the 4D Framework specific to Claude?
Can I skip one of the Ds?
Difference between Description and 'good prompt'?
How do I know if Delegation is right?
Example of bad Diligence?
Does 4D apply to agentic loops?
Use 4D for personal projects?
Can the 4D Framework prevent all AI mistakes?
Where does 'tool design' fit in 4D?
Diligence vs validation?
Work this with your AI
Work this concept hands-on with Claude Code, Codex, or claude.ai. Copy a prompt, paste it into your assistant, and practise in tandem. Each one keeps you active (explain it back, get drilled, or build) rather than just reading.
- Drill it like the exam (scenario MCQs): Practice in the exam's scenario-MCQ format with trap awareness.
- Explain it back (Feynman): Build durable, transferable understanding of a concept you can half-state.
- Test me, adapting the difficulty: Active recall practice on a concept you think you know.
- Check my prerequisites first: Before studying a concept that keeps not sticking.
- Find the high-leverage 20%: When a domain feels too big and you are short on time.
