Architecture-Aware, Refactor-Authorized Workflows: Getting Useful Results from AI Coding Agents

Every team I have seen burn weeks on agent-generated code skipped one specific step: telling the agent what the architecture actually is. Pass the agent an explicit architecture contract, scope its refactor authorization, force a plan-confirm-execute loop, inject retrieval over codebase embeddings, and validate output against an architecture linter in CI.

D1 Agentic Architectures · D2 Tool Design · HowTo · 5 steps

01 · TLDR

The short version

AI coding agents do not respect what you do not tell them. Five controls turn an unreliable agent into a useful one: an architecture contract injected as a system-prompt prefix, scoped refactor authorization, a plan-confirm-execute handshake, retrieval over codebase embeddings, and CI validation against an architecture linter. Plan Mode is Claude Code's implementation of the handshake. The pattern maps to D1 (orchestration topology), D2 (tool authorization), and D5 (memory and reliability). The exam distractor to reject is always "use a more capable model" - the defect is workflow scope, not capability.

02 · Why this matters in production

The over-refactor failure mode

A common production scene: the user asks the agent to fix a single failing test. Twenty minutes later the agent returns a PR that rewrites the service the test lives in, introduces an abstraction the fix did not need, and renames a public export three callers depend on. The original test passes. Everything else is on fire. The bug is not the model. The bug is that the agent was authorized to touch every file and was never told what the architecture is, so it picked the shortest path to green tests - which crossed every boundary in the codebase.

Per the /concepts/plan-mode page in the vault: "Direct execution starts modifying files immediately on a clear request. Plan mode is for ambiguous or multi-faceted requests where there are 3+ viable architectures with tradeoffs. Without upfront analysis, rework is inevitable." The over-refactor is the rework. The fix is structural - you cannot prompt your way out of a workflow that lets the agent ship a 30-file diff before a human sees the plan.

03 · The mechanics

Five controls that compound, not five tactics to pick from

1. Architecture contract. A short, machine-readable declaration of layers, naming, and the authorized refactor surface. It lives as a system-prompt prefix so it travels to subagents (which do not inherit CLAUDE.md - see /knowledge/subagent-claude-md-inheritance). A useful contract fits in 30 lines.

# Architecture contract

Layers (allowed imports flow top-to-bottom):
  app/        -> may import components/, lib/, convex/_generated
  components/ -> may import lib/, convex/_generated
  convex/     -> may import lib/
  lib/        -> pure, no Next.js or Convex imports

Naming:
  - Route files: app/.../page.tsx
  - Convex schema: convex/schema.ts
  - Tests: *.test.ts colocated with source

Authorized refactor surface:
  - components/templates/*
  - lib/markup.tsx

Unauthorized (read-only):
  - convex/_generated/
  - .env.local, *.env
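
One way to make the "system-prompt prefix" concrete is to hold the contract as structured data and render it in front of every prompt the orchestrator sends, including subagent prompts. The following is a minimal sketch under that assumption; `ArchitectureContract`, `render_prefix`, and `build_system_prompt` are hypothetical names chosen for illustration, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchitectureContract:
    """Hypothetical in-memory form of the contract shown above."""
    layers: dict[str, list[str]]  # layer -> layers it may import
    authorized: list[str]         # paths the agent may modify
    read_only: list[str]          # paths the agent must never modify


def render_prefix(contract: ArchitectureContract) -> str:
    """Render the contract as plain text suitable for a prompt prefix."""
    lines = ["# Architecture contract", "", "Layers (allowed imports):"]
    for layer, allowed in contract.layers.items():
        targets = ", ".join(allowed) if allowed else "nothing (pure)"
        lines.append(f"  {layer} -> may import {targets}")
    lines += ["", "Authorized refactor surface:"]
    lines += [f"  - {path}" for path in contract.authorized]
    lines += ["", "Unauthorized (read-only):"]
    lines += [f"  - {path}" for path in contract.read_only]
    return "\n".join(lines)


def build_system_prompt(contract: ArchitectureContract, role_prompt: str) -> str:
    """Put the contract first so every agent and subagent prompt carries it."""
    return render_prefix(contract) + "\n\n" + role_prompt


CONTRACT = ArchitectureContract(
    layers={
        "app/": ["components/", "lib/", "convex/_generated"],
        "components/": ["lib/", "convex/_generated"],
        "convex/": ["lib/"],
        "lib/": [],
    },
    authorized=["components/templates/*", "lib/markup.tsx"],
    read_only=["convex/_generated/", ".env.local", "*.env"],
)
```

Calling `build_system_prompt(CONTRACT, "You fix failing tests.")` for the main agent and again for each subagent keeps the rules identical everywhere, which is the property the prefix is meant to provide.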

2. Scoped refactor authorization. The authorized surface is not just documentation; it is enforced at the tool layer. A PreToolUse hook reads the Write/Edit tool call, checks the path against the authorized list, exits 0 to allow or 2 to deny with a message routed back to the model. Per the /scenarios/agentic-tool-design pattern in the vault: "The single-most-effective lever for converting probabilistic prompt-only policies into 100%-deterministic gates."
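
The gate described here can be sketched as a small script. This is an illustration, not a drop-in hook: it assumes the hook receives a JSON event on stdin with `tool_name`, `tool_input.file_path`, and `cwd` fields (verify the field names against the hook documentation for your version), and the pattern lists are copied from the example contract. `run_hook`, `is_authorized`, and `relative_to_repo` are hypothetical helper names.

```python
import fnmatch
import json
import posixpath
import sys

# Patterns copied from the example contract (assumption: repo-relative paths).
AUTHORIZED = ["components/templates/*", "lib/markup.tsx"]
READ_ONLY = ["convex/_generated/*", ".env.local", "*.env"]
GATED_TOOLS = {"Write", "Edit"}


def relative_to_repo(path: str, cwd: str) -> str:
    """Collapse '..' segments and make absolute paths relative to the repo root."""
    path = posixpath.normpath(path)
    if posixpath.isabs(path) and cwd:
        path = posixpath.relpath(path, posixpath.normpath(cwd))
    return path


def is_authorized(path: str) -> bool:
    """Read-only patterns win; otherwise the path must match the authorized surface."""
    if any(fnmatch.fnmatchcase(path, pattern) for pattern in READ_ONLY):
        return False
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in AUTHORIZED)


def run_hook(stdin_text: str) -> int:
    """Return 0 to allow the tool call, 2 to deny it with a message on stderr."""
    event = json.loads(stdin_text)
    if event.get("tool_name") not in GATED_TOOLS:
        return 0  # only file-modifying tools are gated
    raw_path = event.get("tool_input", {}).get("file_path", "")
    path = relative_to_repo(raw_path, event.get("cwd", ""))
    if is_authorized(path):
        return 0
    print(
        f"Denied: {path} is outside the authorized refactor surface "
        f"({', '.join(AUTHORIZED)}). Propose the change in a plan instead.",
        file=sys.stderr,
    )
    return 2


# As a script entry point: sys.exit(run_hook(sys.stdin.read()))
```

Normalizing the path before matching matters: without it, a path such as `components/templates/../../convex/schema.ts` would match the authorized pattern while pointing outside it.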

3. Plan-confirm-execute. The agent emits a structured plan first (files to touch, abstractions to introduce, public-API changes). A human or orchestrator approves. Only then does the agent write code. Per the /concepts/plan-mode page: "The plan-mode loop has four decision points. Explore: Claude reads the codebase, writes a diagnosis. Propose: Claude offers 2-3 options with explicit tradeoffs. Decide: you approve, request changes, or reject. Execute: only after approval, Claude modifies files and runs tests."
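
For an orchestrator that implements the handshake itself rather than relying on Plan Mode, the shape might look like the sketch below. It is an assumption-laden illustration: `Plan`, `Decision`, `requires_plan`, and `execute_if_approved` are hypothetical names, and `apply_change` stands in for whatever actually edits a file.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Decision(Enum):
    APPROVE = "approve"
    REQUEST_CHANGES = "request_changes"
    REJECT = "reject"


@dataclass
class Plan:
    """The structured plan the agent must emit before touching any file."""
    files_to_touch: list[str]
    new_abstractions: list[str] = field(default_factory=list)
    public_api_changes: list[str] = field(default_factory=list)


def requires_plan(step_count: int, architectural_impact: bool) -> bool:
    """Trigger heuristic: 3+ steps or any architectural impact needs a plan."""
    return step_count >= 3 or architectural_impact


def execute_if_approved(
    plan: Plan,
    decision: Decision,
    apply_change: Callable[[str], str],
) -> list[str]:
    """Run the edits only after explicit approval; otherwise change nothing."""
    if decision is not Decision.APPROVE:
        return []
    return [apply_change(path) for path in plan.files_to_touch]
```

The important property is that `execute_if_approved` is the only code path that reaches `apply_change`, so a rejected or unreviewed plan cannot produce a diff.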

4. Retrieval over codebase embeddings. Build an embedding index over the repo. When the agent needs to understand a function it has not seen, it queries the index instead of dumping the whole file into context. The query returns 5-10 relevant snippets - the contract surface, not the implementation. This saves tokens, preserves attention, and prevents the "invented a function that already exists" failure mode. Memory-injected context is the D5 reliability lever.
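
The query flow can be shown with a toy index. To stay self-contained, this sketch swaps in bag-of-words term counts for real embeddings; a production index would call an embedding model instead, but the index-then-rank-by-cosine structure is the same. `SnippetIndex`, `embed`, and `cosine` are hypothetical names, and the indexed signatures are invented examples.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: term counts, with camelCase split."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return Counter(re.findall(r"[a-z0-9]+", spaced.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SnippetIndex:
    """Stores signatures (the contract surface), not implementations."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str, Counter]] = []

    def add(self, path: str, signature: str) -> None:
        self._entries.append((path, signature, embed(signature)))

    def query(self, question: str, k: int = 5) -> list[tuple[str, str]]:
        """Return up to k (path, signature) pairs, most similar first."""
        q = embed(question)
        scored = [(cosine(q, vec), path, sig) for path, sig, vec in self._entries]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(path, sig) for _, path, sig in scored[:k]]


index = SnippetIndex()
index.add("lib/markup.tsx", "export function renderMarkup(source: string): ReactNode")
index.add("lib/dates.ts", "export function formatDate(value: Date): string")
```

Exposed as a tool, `index.query(...)` lets the agent check whether a helper already exists before writing a new one, which is how retrieval prevents the duplicate-function failure mode.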

5. Architecture linter in CI. The contract is also a linter config. Dependency-cruiser, ArchUnit, Deptrac, import-linter - pick the one for your language. The linter fails the build on boundary violations. The orchestrator picks up the failure, feeds it back to the agent, and the agent fixes the drift in the next iteration. Deterministic enforcement at the CI boundary catches what the gate at the prompt layer missed.

The controls compound. The contract sets the rules. The scoped authorization enforces them per tool call. The handshake gates the plan. Retrieval keeps the agent informed. The CI linter catches everything else. Skip any one and the others get exercised harder; skip two and the failure mode you were trying to prevent starts shipping again.

04 · Decision rule and checklist

Seven steps to a useful coding-agent workflow

  1. Generate the architecture contract from source. Build-system graph, lint config, real file paths. Not a README. Not a wiki page.
  2. Inject the contract as a system-prompt prefix. Not as a CLAUDE.md hint; subagents do not see it. Prefix travels everywhere the prompt goes.
  3. Wire a PreToolUse hook for scoped refactor authorization. Match on Write/Edit; check the path against the authorized list; deny with a clear message routed back to the model.
  4. Require Plan Mode for any change of 3+ steps or with architectural impact. Skip it for trivial edits. The bar is "can I describe the change without exploring the codebase first"; if no, require plan.
  5. Stand up an embedding index over the repo. Refresh on every merge. Expose as a retrieval tool the agent can call before reading files.
  6. Add an architecture linter to CI. Same rules as the contract; failure on boundary violations; output piped back to the orchestrator.
  7. Track over-refactor rate as a workflow KPI. Count diffs that touch more files than the plan declared. Trend it. If it climbs, your gate is leaking.
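
The KPI in step 7 is cheap to compute once each change records both its approved plan and its final diff. A minimal sketch, assuming the orchestrator logs both file sets; `ChangeRecord`, `is_over_refactor`, and `over_refactor_rate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    """One agent change: what the plan declared versus what the diff touched."""
    planned_files: frozenset[str]
    touched_files: frozenset[str]


def is_over_refactor(record: ChangeRecord) -> bool:
    """A diff over-refactors when it touches more files than the plan declared."""
    return len(record.touched_files) > len(record.planned_files)


def over_refactor_rate(records: list[ChangeRecord]) -> float:
    """Fraction of changes whose diff exceeded the plan; 0.0 for no data."""
    if not records:
        return 0.0
    return sum(is_over_refactor(r) for r in records) / len(records)


RECORDS = [
    ChangeRecord(frozenset({"lib/markup.tsx"}), frozenset({"lib/markup.tsx"})),
    ChangeRecord(
        frozenset({"lib/markup.tsx"}),
        frozenset({"lib/markup.tsx", "components/templates/card.tsx"}),
    ),
]
```

A stricter variant would flag any touched file that is missing from the plan, even when the counts match; the count-based version follows the definition in step 7.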

05 · Common anti-patterns

Five recurring failures

  1. Documentation-as-contract. A README that says "follow our architecture." Cause: confusing human-readable intent with machine-readable input. Fix: a YAML/JSON manifest the orchestrator reads and injects.
  2. Hopeful scoping. The system prompt politely asks the agent to stay inside components/. Cause: no enforcement. Fix: a PreToolUse hook checking the path on every Write/Edit.
  3. Plan Mode for everything or for nothing. Either every trivial edit blocks on approval, or nothing does. Cause: missing trigger heuristic. Fix: the 3-step threshold and the "can I describe it cold" test.
  4. Codebase context dumped wholesale. The agent loads ten 2KB files to find one function signature. Cause: no retrieval layer. Fix: an embedding index that returns just the contract surface.
  5. No CI gate. The contract lives in the prompt only; nothing catches drift after the merge. Cause: treating the prompt as the last line of defense. Fix: an architecture linter that fails the build deterministically.

06 · CCA-F exam mapping

How this shows up on the exam

Domains
D1 Agentic Architectures (27%) · D2 Tool Design (18%) · D5 Context + Reliability (15%)
What is tested
Whether you reach for a workflow control when the symptom is an over-refactor or boundary violation. The exam expects scope and handshake answers, not model-capability or window-size answers.
Stem pattern
A coding agent produces a refactor that violates module boundaries. Which single workflow control would have prevented it?
Distractor to reject
"Use a more capable model." The defect is workflow scope, not model capability. Model vs Design heuristic per ACP-T03 §6.
Second distractor
"Add more detailed prompt instructions." Prompt-only policies are probabilistic; deterministic enforcement (hooks, gates, linters) beats prompt guidance for high-stakes decisions.
Third distractor
"Increase the context window so the agent sees more files." Larger windows do not introduce module boundaries; retrieval over the contract surface does.

07 · Sources

Vault and external references

  • Vault: data/aeo/reports/2026-05-17-recommendations.md §Signal for architecture-aware agentic workflows - source of the five-control framing.
  • Vault: data/aeo/reports/2026-05-16-recommendations.md §Signal - earliest formulation of the architecture-aware-AI-coding-agent recommendation.
  • Vault: public/concepts/plan-mode.md §What it is / §How it works - authoritative Plan Mode four-phase loop and 3-step trigger threshold.
  • Vault: public/scenarios/agentic-tool-design.md §PreToolUse Hook - "The single-most-effective lever for converting probabilistic prompt-only policies into 100%-deterministic gates."
  • Vault: public/concepts/system-prompts.md §How it works - deterministic enforcement beats prompt guidance; two-layer enforcement model.
  • Vault: 06-knowledge/digital-marketing/aeo-geo/gregisenberg/aeg-k0755-be-a-10x-vibe-coder-claude-code-cursor-mcp.md - external validation of the "force AI to outline steps via Plan Mode before executing" workflow tactic.
  • Vault: 06-knowledge/coding/cod-k16-ohmyopenagent-architecture-review.md - architecture-review pattern with LSP + AST-grep + TDD verification feeding the refactor loop.
  • Vault: 02-tasks/acp-t08-route-content-and-design-specs.md §Template B v3 - canonical IA sequence: user request → plan mode → approval → skill invocation → tool calls → validation.