The short version
A long, complex business prompt does not have to be split to be reliable. Five techniques compound: front-load the contract (role, constraints, output format) so the model anchors on it; lead with an executive summary; tag supporting sections with XML-style delimiters like <policy> and <data>; add a scoped ignore instruction per task so the model knows what is out of scope; and make the compression-versus-retention trade-off an explicit decision against the reliability requirement, not a default preference. On the CCA-F this is a D4 plus D5 topic; the distractor to reject is "use a more capable model."
The lost-in-the-middle failure mode
A compliance review agent reads a 40-page policy document plus the candidate refund case. The relevant policy clause is on page 17 - well inside the middle of the prompt. The agent confidently approves the refund using a clause from page 3 that looks similar but materially differs. The refund violates policy. Nobody catches it until an audit a quarter later. The model did not malfunction; the prompt structure put the load-bearing rule in the lowest-attention zone of the prompt.
Per the /concepts/attention-engineering page in the vault: "LLMs exhibit a U-shaped attention curve across their context window. The beginning (system prompt, first 10%) and end (last 5%) receive disproportionately high attention. The middle 40-80% is effectively lost." The failure is structural, not capability-related. A bigger model attends to the middle slightly better but with the same U-shape. The fix is to move the high-stakes content out of the middle and to mark it with structural cues so the model can find it on demand.
Five techniques and the prompt skeleton that uses all of them
1. Front-loaded contract.Roles, constraints, and output format go at the very top of the prompt. The model anchors on them before processing background material. This is the highest-attention zone; spend it on the contract. Per /concepts/system-prompts: "The anatomy of a production system prompt has five load-bearing sections: role definition, task boundaries, output format, tool guidance, and example patterns." All five belong above the background.
2. Hierarchical structure. Lead with an executive summary of the task in 2-4 sentences. Follow with the detail sections in order of priority. The summary keeps the model oriented across long context; the priority order ensures that even if the middle is under-weighted, the top-priority sections are still in the high-attention front zone.
3. XML-style sectioning.Wrap each section in a tag that describes its content: <policy>, <data>, <examples>, <case>. Per the Skilljar Lesson 28 material: "XML tags act as containers that separate distinct portions of your prompt. You can create custom tag names that describe the content they contain. The tag name itself provides context about the data type." Claude is unusually responsive to XML tags because the training data included them as structural cues; Markdown headers help, but tags help more.
<role>
You are a compliance reviewer for refund requests at a SaaS company.
</role>
<constraints>
- Refunds over $500 require manager approval.
- Cite the policy clause for every decision.
- Output JSON: { decision, clause_id, reason }.
</constraints>
<executive-summary>
Review the case in <case> against the rules in <policy>; apply the constraints; emit JSON.
</executive-summary>
<policy>
[refund-limits] No refund without proof of duplicate charge or service failure.
[refund-amounts] Amount must match the charged amount; partial refunds for partial services.
[escalation] Amounts over $500 escalate to manager.
... (37 more clauses)
</policy>
<case>
Customer: cust_8814
Charge: $612 on 2026-05-12 for plan upgrade.
Claim: "Did not authorize this upgrade."
</case>
<scope>
For this decision, treat only the <policy> block as authoritative.
Ignore <case-studies> if present.
</scope>
<question>
Should this refund be approved? Cite the policy clause.
</question>4. Scoped ignore instruction.Place an explicit per-task instruction immediately before the question naming the sections in scope and out of scope. The model still reads the ignored sections (the tokens are in context), but the explicit scope guides attention. This is particularly important for multi-department documents where one task touches one department's rules and the others are noise.
5. Explicit compression-vs-retention trade-off.When the prompt still risks blowing past the budget, decide whether to compress (lose nuance, save cost and latency) or retain (preserve reliability, pay cost and latency). Per the Skilljar RAG lesson: "Hard token limits mean very long documents simply will not fit. Claude becomes less effective with extremely long prompts. Larger prompts cost more money and take longer to process. Performance degrades when there is too much information to sift through." The decision is contextual: a legal review must retain; a routine triage can compress. Name the constraint, then pick the technique.
The skeleton above puts all five techniques in one prompt. The contract is at the top. The executive summary orients the model. The XML tags structure the middle. The scope instruction guides attention. The question is at the bottom, in the other high-attention zone. The middle holds the bulk of the policy and the case, but with tags the model can attend to the right block when it processes the question.
Seven checks for every long business prompt
- Contract at the top. Role, constraints, output format. No background context above them.
- Executive summary in the second block. Two to four sentences describing the task and the expected output.
- XML tags on every distinct content section.Tag names should describe the content (<policy>, <data>, <case>), not its role in the prompt.
- Scoped ignore instruction immediately before the question.Name what is in scope; name what is out of scope.
- Question at the bottom. Recency zone holds the question, not more context.
- Pin stable rules to the system prompt. Anything that does not change between requests goes in the system; the human turn holds the task-specific content.
- Decide compression vs retention against the reliability requirement. Name the constraint first; pick the technique second. Default-preferences are how silent failures ship.
Five mistakes that bury the contract
- Background first, contract last. The 40 pages of policy come before the role and the output format. Cause: thinking the model needs context before instructions. Fix: contract at the top; context follows.
- Markdown-only structure. Headers and bullets only; no XML tags. Cause: web-doc instincts. Fix: tag every distinct content section; Markdown is fine inside tags but not as the primary structural cue.
- Question in the middle. The model is asked to answer in the middle of the prompt and then given more context after. Cause: stream-of-thought authoring. Fix: question always last.
- No scoped ignore. The whole prompt is in scope by default; the model over-weights peripheral content. Cause: relying on the model to infer relevance. Fix: explicit per-task scope before the question.
- Implicit compression-vs-retention. The prompt is silently truncated upstream or silently retained when it should be summarized. Cause: no explicit decision. Fix: name the reliability requirement; pick the technique that satisfies it.
How this shows up on the exam
Vault and external references
- Vault:
data/aeo/reports/2026-05-17-recommendations.md§Signal 1 - source of the five-technique framing and the compression-vs-retention trade-off. - Vault:
data/aeo/reports/2026-05-16-recommendations.md§Signal 1 - earliest formulation of the same recommendation across competitor signal. - Vault:
public/concepts/attention-engineering.md§How it works - the U-shaped attention curve, Lost-in-the-Middle empirical basis, and why position beats styling. - Vault:
public/concepts/system-prompts.md§How it works - five-section system prompt anatomy and the stable-vs-variable split that informs the pinning rule. - Vault:
public/concepts/context-window.md- 200K total budget, the count_tokens measurement loop, and windowing as a fallback when the prompt exceeds the window. - Vault:
99-attachements/asc-a01-skilljar-course-content/course-12-claude-with-google-vertex/lesson-28-structure-with-xml-tags.md- canonical Skilljar coverage of XML tags as structural cues, with the debugging-vs-docs worked example. - Vault:
99-attachements/asc-a01-skilljar-course-content/course-11-claude-in-amazon-bedrock/lesson-43-introducing-retrieval-augmented-generation.md- the trade-offs that motivate RAG vs. long single prompts, including the performance-degrades-with-too-much-information point. - External: Liu et al. (2023) "Lost in the Middle: How Language Models Use Long Contexts" - empirical basis for the U-shaped attention curve.