# 4D Framework

> Delegation, Description, Discernment, Diligence: Anthropic's 4D framework for agent prompt design. Vault coverage is thin; needs Phase 6 research.

**Domain:** D4 · Prompt Engineering (20% of CCA-F exam)
**Canonical:** https://claudearchitectcertification.com/concepts/4d-framework
**Last reviewed:** 2026-05-04

## Quick stats

- **Pillars:** 4
- **Exam domain:** D4
- **Coverage tier:** C
- **Status:** stub
- **Action:** research

## What it is

The 4D Framework is Anthropic's pedagogical model for AI fluency, taught in Claude Certifications and operationalized in prompt design. The four dimensions are Delegation, Description, Discernment, and Diligence, each addressing a phase of the human-AI collaboration lifecycle. It is not a technical architecture but a mental model for interacting with AI responsibly and effectively. Every exam question on prompt design tests whether you recognize which D is missing.

Delegation is the decision of what work the human should do vs the AI. Example: "Should I research ten competitors myself or ask Claude?" Informed by platform awareness (Claude's strengths like synthesis, weaknesses like real-time data) and task awareness (what work benefits from AI). Poor delegation: asking Claude to verify legal text (wrong; humans must validate). Good delegation: asking Claude to draft for human review.

Description is how you communicate the delegated task. Three sub-dimensions: Product (desired format), Process (step-by-step instructions), Performance (the tone/style/role). "Summarize" (vague) vs "3-sentence summary in bullets, non-technical audience, friendly advisor voice" (Description done right). Techniques: few-shot examples, explicit constraints, chain-of-thought prompting.

Discernment is the human's quality-control lens. LLMs are not infallible. Three sub-dimensions: Product Discernment (accurate?), Process Discernment (logical reasoning?), Performance Discernment (effective tone?). Loop: evaluate, refine Description, re-prompt. Diligence is responsible use: choosing the right model, transparency about AI's role, verifying before shipping.

## How it works

The four Ds operate in sequence but as a cycle. Start with Delegation (what task to hand off), move to Description (write the prompt), execute, then Discernment (evaluate output), then Diligence (safe to deploy?). If Discernment detects a flaw, loop back to Description, re-execute, Discern again. Repeats until output meets standard, or you conclude the task is unsuitable for AI (back to Delegation).
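The cycle above can be sketched as a control loop; `call_model`, `evaluate`, and `refine_prompt` are placeholders you would supply, and `max_rounds` is an assumed cutoff, not part of the framework:

```python
# Sketch of the 4D cycle as a control loop (all callables are placeholders).
def run_4d_cycle(prompt, call_model, evaluate, refine_prompt, max_rounds=3):
    for _ in range(max_rounds):
        output = call_model(prompt)               # Description -> execute
        ok, feedback = evaluate(output)           # Discernment
        if ok:
            return {"status": "ready_for_diligence", "output": output}
        prompt = refine_prompt(prompt, feedback)  # loop back to Description
    # Repeated failures: the task may be unsuitable for AI (back to Delegation)
    return {"status": "reassign_to_human", "output": None}
```

The loop makes the framework's key asymmetry visible: Discernment failures refine the prompt, while exhausting the retry budget reopens the Delegation question itself.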

Practical manifestation: (1) Delegation: "I need Claude to extract data from legal documents." (2) Description: prompt with input schema, few-shot examples, explicit rules. (3) Execute. (4) Discernment: validate JSON against schema, spot-check 5 documents. (5) Diligence: if 95%+ accurate, deploy with human-in-the-loop for edge cases; if <95%, loop back to Description.

The Framework is agnostic to model, task, scale. Whether Claude for a one-off question or a multi-agent system, the four Ds apply. In agentic loops, each turn re-runs the cycle: Delegation (which tool next?), Description (how to describe?), execute, Discernment (valid output?), Diligence (safe to append?).
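The per-turn cycle in an agentic loop can be sketched as follows; every callable and the history-entry shape are schematic placeholders, not part of any SDK:

```python
# Schematic per-turn 4D check inside an agent loop (all callables and the
# history-entry shape are placeholders, not an SDK contract).
def agent_turn(history, select_tool, describe_call, execute, is_valid, is_safe):
    tool = select_tool(history)           # Delegation: which tool next?
    call = describe_call(tool, history)   # Description: how to invoke it
    result = execute(tool, call)
    if not is_valid(result):              # Discernment: valid output?
        history.append({"role": "error", "detail": "invalid_tool_result"})
        return history
    if not is_safe(result):               # Diligence: safe to append?
        history.append({"role": "escalation", "detail": "needs_human"})
        return history
    history.append({"role": "tool_result", "content": result})
    return history
```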

The exam heavily tests Diligence + Delegation trade-offs. "Should we auto-approve refunds <$100?" Wrong: "Yes, automate everything." Right: "Diligence demands human review even for small amounts." The 4D lens makes this clear: Delegation decides what to hand off; Diligence validates that the outcome is safe to deploy without human oversight.

## Where you'll see it in production

### Customer support refund workflow with 4D

Delegation: humans handle policy interpretation; Claude handles fact-gathering. Description: prompt extracts customer ID, order ID, refund reason, summarizes policy-violation status. Discernment: validate fields, check policy interpretation. Diligence: refunds >$500 bypass Claude; <$100 auto-approved by Claude but logged.

### Code review automation for PRs

Delegation: Claude reviews; humans merge. Description: provide PR diff, ask for JSON {verdict, critical_issues, recommendations}. Discernment: read review, verify verdict matches findings. Diligence: never auto-merge; require human click-through. Transparency: "AI-assisted code review, human approval required."
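The Discernment and Diligence steps of this workflow can be sketched as a gate that validates the review verdict before any human sees it. The JSON schema follows the one named above; the function and its status strings are assumptions:

```python
import json

REQUIRED_KEYS = {"verdict", "critical_issues", "recommendations"}

def discern_review(raw: str) -> dict:
    """Discernment gate for an AI code review (schema per the workflow above;
    function name and status strings are illustrative)."""
    try:
        review = json.loads(raw)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}
    if not isinstance(review, dict) or not REQUIRED_KEYS <= review.keys():
        return {"status": "error", "reason": "missing_fields"}
    # Discernment: the verdict must be consistent with the findings
    if review["verdict"] == "approve" and review["critical_issues"]:
        return {"status": "error", "reason": "verdict_contradicts_findings"}
    # Diligence: never auto-merge; always hand off to a human
    return {"status": "awaiting_human_approval", "review": review}
```

Note that even a fully valid review only ever reaches `awaiting_human_approval`: the Diligence rule "never auto-merge" is enforced in code, not left to policy.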

### Expense report validation

Delegation: Claude extracts merchant, amount, category; humans authorize. Description: receipt image, structured output. Discernment: spot-check 10 manually. Diligence: expenses >$5000 auto-escalate; <$100 auto-approved + logged.

### Email triage with 4D decomposition

Delegation: Claude filters spam, routes; humans respond. Description: classify (spam/feedback/billing/legal), extract intent, suggest routing. Discernment: false-positive rate <2% for legal mail. Diligence: escalate legal/compliance, never auto-delete.
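The Diligence rules of this triage can be sketched as a routing function; the confidence thresholds and destination strings are illustrative assumptions, not from the source workflow:

```python
# Diligence-aware routing for email triage (thresholds and destination
# names are illustrative).
def route_email(category: str, confidence: float) -> str:
    if category in ("legal", "compliance"):
        return "escalate_to_human"        # always escalate, never automate
    if category == "spam":
        # Never auto-delete: quarantine keeps false positives recoverable
        return "quarantine" if confidence >= 0.98 else "human_review"
    if confidence >= 0.9:
        return f"route_to_{category}"     # billing, feedback, etc.
    return "human_review"
```

The asymmetric thresholds encode the workflow's priorities: spam handling demands near-certainty because a false positive destroys mail, while legal mail bypasses confidence entirely.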

## Code examples

### 4D-aligned expense validation

**Python:**

```python
from anthropic import Anthropic
import json

client = Anthropic()

# DELEGATION: Claude extracts; humans validate and authorize
# DESCRIPTION: structured prompt with schema and constraints
# DISCERNMENT: validate output, measure accuracy
# DILIGENCE: high-risk amounts escalate, low amounts auto-approve

def validate_expense_4d(receipt_b64: str, threshold: float = 5000):
    # DESCRIPTION: explicit schema + constraints
    prompt = """Extract expense from receipt. Return JSON:
{
  "merchant": "string",
  "amount": number,
  "currency": "USD" | "other",
  "category": "travel" | "meals" | "supplies" | "other",
  "tax": number,
  "confidence_score": 0.0-1.0
}
- Confidence < 0.7? Set amount to null.
- Missing? Use null, don't invent.
"""

    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": receipt_b64}},
                {"type": "text", "text": prompt},
            ],
        }],
    )

    # DISCERNMENT: validate
    try:
        output = json.loads(resp.content[0].text)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}

    if output.get("confidence_score", 0) < 0.7:
        return {"status": "hold", "reason": "low_confidence", "data": output}

    # DILIGENCE: route by amount
    amount = output.get("amount")
    if amount is not None and amount > threshold:
        return {"status": "escalate", "reason": f"amount > ${threshold}", "data": output, "action": "SEND_TO_MANAGER"}
    return {"status": "approved", "data": output, "audit_log": {"method": "ai_extracted_auto_approved", "reviewer": "claude-opus-4-5"}}
```

> 4D cycle visible: Delegation (Claude extracts), Description (schema + constraints), Discernment (validate confidence), Diligence (route by amount risk).

**TypeScript:**

```typescript
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();

interface ExpenseData {
  merchant: string;
  amount?: number;
  currency: "USD" | "other";
  category: "travel" | "meals" | "supplies" | "other";
  tax?: number;
  confidence_score: number;
}

async function validateExpense4D(receiptB64: string, threshold: number = 5000) {
  const prompt = `Extract expense. Return JSON: {merchant, amount, currency, category, tax, confidence_score}. Confidence < 0.7? amount = null. Missing? null.`;

  const resp = await client.messages.create({
    model: "claude-opus-4-5", max_tokens: 512,
    messages: [{ role: "user", content: [
      { type: "image", source: { type: "base64", media_type: "image/jpeg", data: receiptB64 } },
      { type: "text", text: prompt },
    ]}],
  });

  let output: ExpenseData;
  try {
    const text = resp.content[0].type === "text" ? resp.content[0].text : "{}";
    output = JSON.parse(text);
  } catch { return { status: "error", reason: "invalid_json" }; }

  if (output.confidence_score === undefined || output.confidence_score < 0.7) {
    return { status: "hold", reason: "low_confidence", data: output };
  }
  if (output.amount && output.amount > threshold) {
    return { status: "escalate", reason: `amount > \$${threshold}`, data: output, action: "SEND_TO_MANAGER" };
  }
  return { status: "approved", data: output, auditLog: { method: "ai_extracted_auto_approved", reviewer: "claude-opus-4-5" } };
}
```

> Same 4D cycle in TypeScript: explicit prompt design, validation, risk-based routing.

## Looks-right vs actually-wrong

| Looks right | Actually wrong |
|---|---|
| The 4D Framework is a technical architecture (like MVC). | It's a mental model for human-AI collaboration, not a code pattern. Guides how you think about delegating, communicating, validating, and deploying responsibly. |
| If Discernment detects a flaw, increase max_tokens and retry. | Discernment failures usually mean Description was vague or Delegation was wrong. Refine the prompt or reconsider task suitability. More tokens rarely fix flawed delegation. |
| Diligence means trusting Claude's output completely. | Diligence means verifying and being honest about limitations. Trusting blindly is the opposite. Diligence is "I validated this before shipping." |
| All four Ds must be applied to every task. | Low-stakes tasks (brainstorming): Discernment and Diligence can be minimal. High-stakes (compliance, financial): all four are critical. |
| Description is the same as 'write a better prompt.' | Description includes Product, Process, Performance. "Better prompt" is vague. Description is specific: schema, examples, tone. |

## Comparison

| Dimension | Delegation | Description | Discernment | Diligence |
| --- | --- | --- | --- | --- |
| What it answers | What work should AI do? | How do I describe the task? | Is the output correct? | Is it safe to deploy? |
| Phase | Before prompting | Prompt design | After execution | Final validation |
| Failure mode | AI does non-AI work (legal validation) | Vague prompt | Accepting output without review | Deploying without auditing |
| Fix | Reassign: human validates, Claude drafts | Add schema, examples, constraints | Spot-check; validate schema | Log decisions; escalate high-risk |
| Exam signal | "Should AI or human do this?" | "Write a prompt that..." | "Why did the output fail?" | "When should this be escalated?" |

## Decision tree

1. **Are you deciding whether a human or Claude should do this work?**
   - **Yes:** That's Delegation. Assign based on capability: humans for judgment/authority, Claude for synthesis/analysis.
   - **No:** Continue.

2. **Is your prompt clear on Product, Process, Performance?**
   - **Yes:** Good Description. Execute.
   - **No:** Refine: schema (Product), examples (Process), tone guidance (Performance).

3. **Did the output meet expectations?**
   - **Yes:** Good Discernment. Ready for Diligence.
   - **No:** Discernment detected a flaw. Loop back to Description.

4. **Is it safe to deploy without human review?**
   - **Yes:** High-confidence, low-stakes, fully logged. Deploy.
   - **No:** Diligence says: escalate or refine.

5. **Are you using all 4 Ds for high-stakes tasks?**
   - **Yes:** Good. Compliance, financial, legal demand all 4.
   - **No:** Add the missing D before shipping.

## Exam-pattern questions

### Q1. Should we use Claude to verify legal text before sending to a client?

No, that's wrong Delegation. Legal text requires human authority and accountability. Use Claude to draft for human review. Delegation assigns work to leverage each agent's strengths; lawyers review, Claude drafts.

### Q2. "Be helpful" as the prompt: what's missing?

Description (Product, Process, Performance). "Be helpful" is vague. Specify the format (JSON, bullets), the reasoning steps (chain-of-thought), and the tone (friendly advisor, expert reviewer). Vague Description produces vague output.

### Q3. Discernment fails: agent's output is wrong but you ship anyway. What's the meta-failure?

Skipping Diligence. Discernment detected the flaw; Diligence demands you don't deploy without addressing it. Either fix (loop back to Description) or escalate to human. Shipping known-flawed output is the opposite of Diligence.

### Q4. Auto-approve refunds <$100 without human review: smart automation or bad practice?

Bad practice without Diligence. Even small amounts demand audit logging and periodic human review. Diligence is "I validated this before shipping"; auto-approval without validation skips that step.

### Q5. Description includes Product, Process, Performance. What's the difference?

Product: desired output format (JSON schema, markdown, prose). Process: step-by-step instructions for reasoning (chain-of-thought, few-shot examples). Performance: tone, style, role (friendly, expert, advisor). All three together = strong Description.

### Q6. Skip Diligence on low-stakes tasks: ever acceptable?

Sometimes. For brainstorming or summarizing a blog post, minimal Diligence is fine. For compliance, financial, legal, healthcare, all four Ds are mandatory. Match Diligence rigor to stakes.

### Q7. The 4D Framework is a technical architecture (like MVC)?

No. It's a mental model for human-AI collaboration. Guides how you think about delegating, communicating, validating, deploying. Not a code pattern; a discipline.

### Q8. Discernment detects a flaw: where do you loop back?

To Description. Refine the prompt: tighten Product (output format), clarify Process (reasoning steps), adjust Performance (tone). If refinement doesn't help, reconsider Delegation (is this task suited for AI at all?).

## FAQ

### Q1. Is the 4D Framework specific to Claude?

No. Applies to any LLM. Delegation, Description, Discernment, Diligence are human-AI collaboration principles, not product-specific.

### Q2. Can I skip one of the Ds?

Depends on stakes. Casual tasks: Discernment and Diligence can be minimal. Financial/compliance: all four are mandatory.

### Q3. Difference between Description and 'good prompt'?

Description is systematic: Product (format), Process (reasoning), Performance (tone). A good prompt often contains all three by instinct; the Description lens names them so you can diagnose which one is missing.

### Q4. How do I know if Delegation is right?

Ask: Does Claude excel at this? Is a human required? Can I validate the output? If yes to all, Delegation is sound.

### Q5. Example of bad Diligence?

Auto-approving refunds >$500 without human review. Deploying AI-generated legal text without lawyer review. Not disclosing AI involvement.

### Q6. Does 4D apply to agentic loops?

Yes. Each turn: Delegation (which tool?), Description (how to describe?), execute, Discernment (valid output?), Diligence (safe to append?).

### Q7. Use 4D for personal projects?

Yes. Lower stakes mean less Diligence, but the cycle still applies. Especially Description: vague prompts produce vague outputs.

### Q8. Can the 4D Framework prevent all AI mistakes?

No. It's a mental model. Reduces error rates significantly when applied; doesn't eliminate them. Still needs testing, monitoring, iteration.

### Q9. Where does 'tool design' fit in 4D?

Description. Tool descriptions, schemas, allowed-tools lists are all part of Description (how you communicate the task to Claude).

### Q10. Diligence vs validation?

Validation is a check. Diligence is the broader habit: choosing the right model, transparency about AI's role, verifying before shipping, audit logging.

---

**Source:** https://claudearchitectcertification.com/concepts/4d-framework
**Vault sources:** ClaudeCertifications course materials
**Last reviewed:** 2026-05-04

**Evidence tiers** — 🟢 official Anthropic doc / API contract · 🟡 partial doc / inferred · 🟠 community-derived · 🔴 disputed.
