D4.8 · Domain 4 · Prompt Engineering · 20% of CCA-F

4D Framework.

8 min read·10 sections·Tier A

Delegation, Description, Discernment, Diligence: Anthropic's 4D framework for agent prompt design. A full deep-dive guide is coming soon.

01 · Summary

TLDR

Delegation, Description, Discernment, Diligence: Anthropic's 4D framework for agent prompt design. A full deep-dive guide is coming soon.

Pillars: 4
Exam domain: D4
Coverage tier: C
Status: stub
Action: research
02 · Definition

What it is

The 4D Framework is Anthropic's pedagogical model for AI fluency, taught in Claude Certifications and operationalized in prompt design. The four dimensions are Delegation, Description, Discernment, and Diligence, each addressing a phase of the human-AI collaboration lifecycle. It is not a technical architecture; it is a mental model for interacting with AI responsibly and effectively. Exam questions on prompt design often test whether you can recognize which D is missing.

Delegation is the decision about which work the human should do and which the AI should do. Example: "Should I research ten competitors myself or ask Claude?" The decision is informed by platform awareness (Claude's strengths, such as synthesis, and weaknesses, such as real-time data) and task awareness (which work benefits from AI). Poor delegation: asking Claude to verify legal text, because humans must validate it. Sound delegation: asking Claude to draft the text for human review.

Description is how you communicate the delegated task. It has three sub-dimensions: Product (the desired output and format), Process (step-by-step instructions), and Performance (tone, style, and role). Compare "Summarize" (vague) with "3-sentence summary in bullets, non-technical audience, friendly advisor voice" (Description done right). Techniques include few-shot examples, explicit constraints, and chain-of-thought prompting.

Discernment is the human's quality-control lens, because LLMs are not infallible. It has three sub-dimensions: Product Discernment (is the output accurate?), Process Discernment (is the reasoning logical?), and Performance Discernment (is the tone effective?). The loop is: evaluate, refine the Description, re-prompt.

Diligence is responsible use: choosing the right model, being transparent about AI's role, and verifying before shipping.
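One way to keep the three sub-dimensions visible is to hold them as separate fields and assemble the prompt from them. The sketch below is illustrative only: the `Description` class, its `to_prompt` method, and the wording of the Process step are assumptions made for this example, not part of the framework or of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Description:
    """Hypothetical container for the three Description sub-dimensions."""
    product: str      # desired output and format
    process: str      # steps Claude should follow
    performance: str  # tone, style, or role

    def to_prompt(self, task: str) -> str:
        # Label each part so a reviewer can see which one is thin or missing.
        return "\n".join([
            f"Task: {task}",
            f"Output (Product): {self.product}",
            f"Steps (Process): {self.process}",
            f"Voice (Performance): {self.performance}",
        ])


desc = Description(
    product="3-sentence summary in bullets",
    process="Read the full text, then pick the three most important points",
    performance="Friendly advisor voice for a non-technical audience",
)
print(desc.to_prompt("Summarize the attached article"))
```

A bare "Summarize" would leave all three fields empty, which is exactly the gap the vague example illustrates.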

03 · Mechanics

How it works

The four Ds run in sequence, but the sequence forms a cycle. Start with Delegation (what task to hand off), move to Description (write the prompt), execute, then apply Discernment (evaluate the output) and Diligence (is it safe to deploy?). If Discernment detects a flaw, loop back to Description, re-execute, and evaluate again. The cycle repeats until the output meets the standard or you conclude the task is unsuitable for AI, which sends you back to Delegation.
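The cycle can be sketched as a bounded loop. This is a minimal illustration, not a prescribed implementation: `run_4d_cycle`, its callback parameters, and the three-round limit are all assumptions, and the model call is replaced by a stub so the example runs offline.

```python
from typing import Callable, Optional


def run_4d_cycle(
    prompt: str,
    execute: Callable[[str], str],
    discern: Callable[[str], Optional[str]],
    refine: Callable[[str, str], str],
    max_rounds: int = 3,
) -> dict:
    """Execute, discern, and loop back to Description while a flaw remains."""
    for round_no in range(1, max_rounds + 1):
        output = execute(prompt)
        flaw = discern(output)         # None means the output passed review
        if flaw is None:
            return {"status": "ready_for_diligence", "output": output, "rounds": round_no}
        prompt = refine(prompt, flaw)  # loop back to Description
    # Still failing after every round: the task may not suit AI at all.
    return {"status": "revisit_delegation", "output": None, "rounds": max_rounds}


def fake_model(prompt: str) -> str:
    # Stub standing in for a model call: bullets only when asked for them.
    return "- a\n- b" if "bullets" in prompt else "a b"


def check_bullets(output: str) -> Optional[str]:
    return None if output.startswith("-") else "not in bullets"


def add_fix(prompt: str, flaw: str) -> str:
    return prompt + " Use bullets."


result = run_4d_cycle("Summarize.", fake_model, check_bullets, add_fix)
print(result["status"], result["rounds"])
```

The round limit is what turns "keep refining" into an explicit return to Delegation instead of an endless retry.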

Practical manifestation: (1) Delegation: "I need Claude to extract data from legal documents." (2) Description: prompt with input schema, few-shot examples, explicit rules. (3) Execute. (4) Discernment: validate JSON against schema, spot-check 5 documents. (5) Diligence: if 95%+ accurate, deploy with human-in-the-loop for edge cases; if <95%, loop back to Description.
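The accuracy gate in step (5) might look like the sketch below. The function name `diligence_gate` and the exact-match comparison against hand-labelled records are assumptions for illustration; a real pipeline would likely compare field by field.

```python
def diligence_gate(predictions: list[dict], gold: list[dict], min_accuracy: float = 0.95) -> dict:
    """Compare extracted records with hand-labelled ones and pick the next step."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("need one hand-labelled record per prediction")
    # Exact-match scoring: a record counts only if every field agrees.
    correct = sum(1 for pred, truth in zip(predictions, gold) if pred == truth)
    accuracy = correct / len(gold)
    if accuracy >= min_accuracy:
        next_step = "deploy_with_human_in_the_loop"
    else:
        next_step = "refine_description"
    return {"accuracy": accuracy, "next_step": next_step}
```

Note that sample size matters: with only five spot-checked documents, accuracy moves in 20-point steps, so a 95% bar can only be met by getting all five right.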

The Framework is agnostic to model, task, and scale. Whether you use Claude for a one-off question or run a multi-agent system, the four Ds apply. In agentic loops, each turn re-runs the cycle: Delegation (which tool next?), Description (how to describe the call?), execute, Discernment (is the output valid?), Diligence (is it safe to append to the context?).

The exam heavily tests Diligence and Delegation trade-offs. Take "Should we auto-approve refunds <$100?" Wrong: "Yes, automate everything." Right: "Only with an audit log and an escalation path; Diligence demands oversight in proportion to the risk." The 4D lens makes this clear: Delegation decides what to hand off; Diligence decides whether the outcome is safe to deploy without human oversight.

04 · In production

Where you'll see it

Customer support refund workflow with 4D

Delegation: humans handle policy interpretation; Claude handles fact-gathering. Description: prompt extracts customer ID, order ID, refund reason, summarizes policy-violation status. Discernment: validate fields, check policy interpretation. Diligence: refunds >$500 bypass Claude; <$100 auto-approved by Claude but logged.
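The Diligence rule for this workflow can be sketched as a small router. The text only specifies the two outer bands (above $500 and below $100); sending the middle band to human review is an assumption made here, as are the names `route_refund` and `AUDIT_LOG`.

```python
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for a real audit store


def route_refund(order_id: str, amount: float) -> str:
    """Route a refund request by amount and record every decision."""
    if amount > 500:
        decision = "human_only"     # bypasses Claude entirely
    elif amount < 100:
        decision = "auto_approved"  # low stakes, but still logged
    else:
        decision = "human_review"   # assumption: the middle band is reviewed
    AUDIT_LOG.append({
        "order_id": order_id,
        "amount": amount,
        "decision": decision,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return decision
```

Logging sits outside the branches on purpose, so no path, including auto-approval, can skip the audit trail.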

Code review automation for PRs

Delegation: Claude reviews; humans merge. Description: provide PR diff, ask for JSON {verdict, critical_issues, recommendations}. Discernment: read review, verify verdict matches findings. Diligence: never auto-merge; require human click-through. Transparency: "AI-assisted code review, human approval required."
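The Discernment step ("verify verdict matches findings") can be partly automated before a human reads the review. In this sketch the function name `check_review` and the verdict values `approve` and `request_changes` are hypothetical; only the three field names come from the workflow above.

```python
import json

REQUIRED_FIELDS = {"verdict", "critical_issues", "recommendations"}
ALLOWED_VERDICTS = {"approve", "request_changes"}  # hypothetical verdict values


def check_review(raw: str) -> dict:
    """Parse a review and flag verdicts that contradict the listed findings."""
    try:
        review = json.loads(raw)
    except json.JSONDecodeError:
        return {"ok": False, "reason": "invalid JSON"}
    if not isinstance(review, dict):
        return {"ok": False, "reason": "expected a JSON object"}
    missing = REQUIRED_FIELDS - review.keys()
    if missing:
        return {"ok": False, "reason": f"missing fields: {sorted(missing)}"}
    verdict = review["verdict"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        return {"ok": False, "reason": "unknown verdict"}
    if verdict == "approve" and review["critical_issues"]:
        return {"ok": False, "reason": "approve verdict despite critical issues"}
    # Diligence: a consistent review earns a human look, never a merge.
    return {"ok": True, "reason": "consistent", "auto_merge": False}
```

The check catches structural contradictions only; whether the findings themselves are correct still needs a human reviewer.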

Expense report validation

Delegation: Claude extracts merchant, amount, category; humans authorize. Description: receipt image, structured output. Discernment: spot-check 10 manually. Diligence: expenses >$5000 auto-escalate; <$100 auto-approved + logged.

Email triage with 4D decomposition

Delegation: Claude filters spam, routes; humans respond. Description: classify (spam/feedback/billing/legal), extract intent, suggest routing. Discernment: false-positive rate <2% for legal mail. Diligence: escalate legal/compliance, never auto-delete.
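A rough sketch of the routing and the Discernment metric follows. The queue names, the quarantine-instead-of-delete choice for spam, and both function names are assumptions. The false-positive rate is computed here as the share of non-legal messages that were labelled legal, which is one reasonable reading of the target above; other definitions are possible.

```python
ROUTES = {  # hypothetical queue names
    "spam": "quarantine",
    "feedback": "product_team",
    "billing": "billing_team",
    "legal": "legal_team",
}


def route_email(category: str) -> dict:
    """Map a predicted category to a queue; nothing is ever deleted."""
    if category not in ROUTES:
        # Unknown label: hand the message to a person rather than guess.
        return {"queue": "human_triage", "escalate": True, "delete": False}
    return {"queue": ROUTES[category], "escalate": category == "legal", "delete": False}


def legal_false_positive_rate(predicted: list[str], actual: list[str]) -> float:
    """Share of non-legal messages that the classifier labelled as legal."""
    non_legal = [p for p, a in zip(predicted, actual) if a != "legal"]
    if not non_legal:
        return 0.0
    return sum(p == "legal" for p in non_legal) / len(non_legal)
```

Measuring the rate needs a hand-labelled sample; the `actual` list stands in for that sample.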

05 · Implementation

Code examples

4D-aligned expense validation
from anthropic import Anthropic
import json

client = Anthropic()

# DELEGATION: Claude extracts; humans validate and authorize
# DESCRIPTION: structured prompt with schema and constraints
# DISCERNMENT: validate output, measure accuracy
# DILIGENCE: high-risk amounts escalate, low amounts auto-approve

def validate_expense_4d(receipt_b64: str, threshold: float = 5000):
    # DESCRIPTION: explicit schema + constraints
    prompt = """Extract expense from receipt. Return JSON:
{
  "merchant": "string",
  "amount": number,
  "currency": "USD" | "other",
  "category": "travel" | "meals" | "supplies" | "other",
  "tax": number,
  "confidence_score": 0.0-1.0
}
- Confidence < 0.7? Set amount to null.
- Missing? Use null, don't invent.
"""

    resp = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": receipt_b64}},
                {"type": "text", "text": prompt},
            ],
        }],
    )

    # DISCERNMENT: validate
    try:
        output = json.loads(resp.content[0].text)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}

    # Hold low-confidence results and anything without a readable amount.
    confidence = output.get("confidence_score") or 0
    amount = output.get("amount")
    if confidence < 0.7 or amount is None:
        return {"status": "hold", "reason": "low_confidence", "data": output}

    # DILIGENCE: route by amount
    if amount > threshold:
        return {"status": "escalate", "reason": f"amount > ${threshold}", "data": output, "action": "SEND_TO_MANAGER"}
    else:
        return {"status": "approved", "data": output, "audit_log": {"method": "ai_extracted_auto_approved", "reviewer": "claude-opus-4-5"}}
4D cycle visible: Delegation (Claude extracts), Description (schema + constraints), Discernment (validate confidence), Diligence (route by amount risk).
06 · Distractor patterns

Looks right, isn't

Each row pairs a plausible-looking pattern with the failure it actually creates. These are the shapes exam distractors are built from.

Looks right

The 4D Framework is a technical architecture (like MVC).

Actually wrong

It's a mental model for human-AI collaboration, not a code pattern. Guides how you think about delegating, communicating, validating, and deploying responsibly.

Looks right

If Discernment detects a flaw, increase max_tokens and retry.

Actually wrong

Discernment failures usually mean Description was vague or Delegation was wrong. Refine the prompt or reconsider task suitability. More tokens rarely fix flawed delegation.

Looks right

Diligence means trusting Claude's output completely.

Actually wrong

Diligence means verifying and being honest about limitations. Trusting blindly is the opposite. Diligence is "I validated this before shipping."

Looks right

All four Ds must be applied to every task.

Actually wrong

Low-stakes tasks (brainstorming): Discernment and Diligence can be minimal. High-stakes (compliance, financial): all four are critical.

Looks right

Description is the same as 'write a better prompt.'

Actually wrong

Description includes Product, Process, Performance. "Better prompt" is vague. Description is specific: schema, examples, tone.

07 · Compare

Side-by-side

| Dimension | Delegation | Description | Discernment | Diligence |
| --- | --- | --- | --- | --- |
| What it answers | What work should AI do? | How do I describe the task? | Is the output correct? | Is it safe to deploy? |
| Phase | Before prompting | Prompt design | After execution | Final validation |
| Failure mode | AI does non-AI work (legal validation) | Vague prompt | Accepting output without review | Deploying without auditing |
| Fix | Reassign: human validates, Claude drafts | Add schema, examples, constraints | Spot-check; validate schema | Log decisions; escalate high-risk |
| Exam signal | "Should AI or human do this?" | "Write a prompt that..." | "Why did the output fail?" | "When should this be escalated?" |
08 · When to use

Decision tree

01

Is it still undecided whether a human or Claude should do this work?

Yes: That's Delegation. Assign based on capability: humans for judgment/authority, Claude for synthesis/analysis.
No: Continue.

02

Is your prompt clear on Product, Process, Performance?

Yes: Good Description. Execute.
No: Refine: schema (Product), examples (Process), tone guidance (Performance).

03

Did the output meet expectations?

Yes: Good Discernment. Ready for Diligence.
No: Discernment detected a flaw. Loop back to Description.

04

Is it safe to deploy without human review?

Yes: High-confidence, low-stakes, fully logged. Deploy.
No: Diligence says: escalate or refine.

05

Are you using all 4 Ds for high-stakes tasks?

Yes: Good. Compliance, financial, legal demand all 4.
No: Add the missing D before shipping.
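The tree above amounts to checking the four Ds in order and stopping at the first gap. A minimal sketch, with a hypothetical function name and boolean flags standing in for the answers to each question:

```python
from typing import Optional


def first_missing_d(
    delegation_decided: bool,
    description_complete: bool,
    output_reviewed: bool,
    deployment_checked: bool,
) -> Optional[str]:
    """Return the first D that still needs attention, or None if all are covered."""
    checks = [
        ("Delegation", delegation_decided),
        ("Description", description_complete),
        ("Discernment", output_reviewed),
        ("Diligence", deployment_checked),
    ]
    for name, done in checks:
        if not done:
            return name
    return None


# A prompt that was written and run, but whose output nobody reviewed.
print(first_missing_d(True, True, False, False))  # prints "Discernment"
```

Returning the earliest gap matters because later Ds depend on earlier ones: Diligence cannot vouch for output that was never reviewed.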
09 · On the exam

Question patterns

11 V2 questions are wired to this concept. Six appear below.

Should you use Claude to verify legal text before it is sent to a client?

If the prompt is just "be helpful", which D in the 4D Framework is missing?

Discernment flags the agent's output as flawed, but the team ships it anyway. What is the meta-failure in the 4D Framework?

Auto-approving refunds under $100 with no human review and no audit log: smart automation or 4D anti-pattern?

Description in the 4D Framework has three sub-parts: Product, Process, and Performance. What does each one cover?

Is it ever acceptable to skip Diligence on low-stakes tasks?

5 additional questions for this concept live in the practice pillar.

10 · FAQ

Frequently asked

Is the 4D Framework specific to Claude?
No. Applies to any LLM. Delegation, Description, Discernment, Diligence are human-AI collaboration principles, not product-specific.
Can I skip one of the Ds?
Depends on stakes. Casual tasks: Discernment and Diligence can be minimal. Financial/compliance: all four are mandatory.
Difference between Description and 'good prompt'?
Description is systematic: Product (format), Process (reasoning), Performance (tone). A good prompt often includes all three but can be vague about why.
How do I know if Delegation is right?
Ask: Does Claude excel at this? Can this step proceed without human judgment or authority? Can I validate the output? If yes to all, Delegation is sound.
Example of bad Diligence?
Auto-approving refunds >$500 without human review. Deploying AI-generated legal text without lawyer review. Not disclosing AI involvement.
Does 4D apply to agentic loops?
Yes. Each turn: Delegation (which tool?), Description (how to describe?), execute, Discernment (valid output?), Diligence (safe to append?).
Use 4D for personal projects?
Yes. Lower stakes mean less Diligence, but the cycle still applies. Especially Description: vague prompts produce vague outputs.
Can the 4D Framework prevent all AI mistakes?
No. It's a mental model. Reduces error rates significantly when applied; doesn't eliminate them. Still needs testing, monitoring, iteration.
Where does 'tool design' fit in 4D?
Description. Tool descriptions, schemas, allowed-tools lists are all part of Description (how you communicate the task to Claude).
Diligence vs validation?
Validation is a check. Diligence is the broader habit: choosing the right model, transparency about AI's role, verifying before shipping, audit logging.
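To make the tool-design answer above concrete, here is a sketch of a tool definition treated as Description. It assumes the `name` / `description` / `input_schema` layout used for tool definitions in the Anthropic Messages API; the `lookup_order` tool, its wording, and the order ID format are hypothetical.

```python
lookup_order_tool = {
    "name": "lookup_order",
    # The description is a prompt in miniature: what the tool does,
    # when to use it, and what it does not do.
    "description": (
        "Fetch a single order by ID. Use this before answering any refund "
        "question. Returns the order total and status; it does not issue refunds."
    ),
    # JSON Schema for the arguments, i.e. the Product side of Description.
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Order ID, e.g. ORD-1042"},
        },
        "required": ["order_id"],
    },
}

# Every required argument should be declared in the schema.
declared = lookup_order_tool["input_schema"]["properties"]
undeclared = [k for k in lookup_order_tool["input_schema"]["required"] if k not in declared]
print(undeclared)
```

A vague tool description fails in the same way as a vague prompt: the model has to guess when the tool applies.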
11 · Practice with AI

Work this with your AI

Work this concept hands-on with Claude Code, Codex, or claude.ai. Copy a prompt, paste it into your assistant, and practise in tandem. Each one keeps you active (explain it back, get drilled, or build) rather than just reading.

  • Drill it like the exam (scenario MCQs)
    Practice in the exam's scenario-MCQ format with trap awareness.
  • Explain it back (Feynman)
    Build durable, transferable understanding of a concept you can half-state.
  • Test me, adapting the difficulty
    Active recall practice on a concept you think you know.
  • Check my prerequisites first
    Before studying a concept that keeps not sticking.
  • Find the high-leverage 20%
    When a domain feels too big and you are short on time.
Self-check

Test yourself

Three diagnostic questions on this primitive. Reveal each answer when you have a guess.

Q1. Should we use Claude to verify legal text before sending to a client?
No, that's wrong Delegation. Legal text requires human authority and accountability. Use Claude to draft for human review. Delegation assigns work to play to each party's strengths: lawyers review, Claude drafts.
Q2. "Be helpful" as the prompt: what's missing?
Description (Product, Process, Performance). "Be helpful" is vague. Specify the format (JSON, bullets), the reasoning steps (chain-of-thought), and the tone (friendly advisor, expert reviewer). Vague Description produces vague output.
Q3. Discernment flags the agent's output as wrong, but you ship anyway. What's the meta-failure?
Skipping Diligence. Discernment detected the flaw; Diligence demands you don't deploy without addressing it. Either fix it (loop back to Description) or escalate to a human. Shipping known-flawed output is the opposite of Diligence.
Last reviewed: 2026-05-04·Refresh cadence: monthly