Knowledge · D3 · How-To

Debugging Claude Code Hooks That Fire Silently or Don't Run At All.

We spent an afternoon last sprint convinced our PreToolUse hook was working because the agent did not crash - turns out the hook had never registered and Claude was happily running unblocked. Four silent-failure causes, the canonical debug loop with --debug, the registered-but-skipped vs erroring-silently vs never-registered taxonomy, --hooks-dry-run, and the CCA-F anti-pattern of confusing missing output with successful execution.

D3 Agent OperationsTroubleshootingHowTo · 5 steps

Last updated

01 · TLDR

The short version

A silent hook is the worst failure mode in agent operations: no error, no log, no signal that anything is wrong - except the side effect you depended on never happened. Three failure modes (registered-but-skipped, registered-and-erroring, never-registered) need three different fixes. Run claude --debug to see the registration list and per-invocation log. Always write a heartbeat line to a known sink so absence of the line is a positive failure signal. The exam trap under D3 is "no output means no problem" - it does not; it means you cannot distinguish a crashed hook from a skipped hook.

02 · Why this matters in production

When the deterministic gate stops being deterministic

Hooks exist because prompt-only enforcement is probabilistic. Per the /scenarios/conversational-ai-patterns scenario in the vault: "Prompt-only enforcement is probabilistic (roughly 92% in this scenario). Hooks are deterministic. They read structured state, exit 2, and route Claude back. For business-bearing guarantees (do not re-ask answered questions, do not process unverified accounts), the 8% leak from prompt-only is unacceptable." A hook that silently fails to register is even worse than prompt-only: you believe you have a deterministic gate, you do not, and the failure is invisible until an auditor catches the violation weeks later.

The most painful version of this: a PreToolUse hook designed to block refunds above $500 that never registered because the JSON config had the hooks key one level too deep. The agent processed three refunds for $750, $1,200, and $4,300 before anyone noticed. The hook script was correct. The wiring was not. There was no error message; there was just an absence of the protection the team thought it had.

03 · The mechanics

How hooks load, why they go silent, and how to diagnose each case

Per the /concepts/hooks page in the vault, the lifecycle has three steps: "Step 1: Tool Interception. Claude emits a tool_use block; the SDK extracts the tool name and input JSON. Step 2: Hook Execution. Your hook command runs as a subprocess; stdin contains a JSON object with hook_event_name, tool_name, tool_input, and optionally tool_response (PostToolUse only). Step 3: Decision Point. PreToolUse: exit code 0 (allow), 2 (deny), 1 (error)." Each of those three steps can break silently, and the diagnostic flow depends on which one.

Never registered. Symptom: the hook does not appear in the registration list when you run claude --debug. Cause: JSON parse failure, wrong key path, malformed event name. Fix: validate the config with jq, confirm the hooks key is at the top level (not nested in a comment block or wrong section), and confirm event names match the case-sensitive SDK expectation (PreToolUse, not pre_tool_use).

Registered but skipped.Symptom: claude --debug shows registration on startup, but no invocation log appears on the tool you expected. Cause: the matcher string does not match the tool name. Per /concepts/hooks: "Matcher syntax is critical. A matcher is a pipe-separated list of tool names: 'Read|Edit|Grep' fires on any of those three. The wildcard '*' matches all tools. Matchers are exact string matches; 'Read' does not match 'ReadFile'." Fix: dump the actual tool name from a debug session, copy it exactly into the matcher.

Registered and erroring silently.Symptom: invocation appears in debug log; exit code is non-zero; nothing else happens. Cause: stderr is being discarded. The hook crashed but you cannot see why. Fix: explicit "2>> ~/.claude/hook.log" redirection in the hook command string, or wrap the hook in a shell wrapper that always logs.

A minimal heartbeat-instrumented hook in shell:

#!/usr/bin/env bash
# /opt/hooks/refund-guard.sh
LOG=~/.claude/hook.log
exec 2>> "$LOG"

INPUT=$(cat)
EVENT=$(echo "$INPUT" | jq -r '.hook_event_name')
TOOL=$(echo "$INPUT" | jq -r '.tool_name')
AMOUNT=$(echo "$INPUT" | jq -r '.tool_input.amount // 0')

echo "$(date -u +%FT%TZ) event=$EVENT tool=$TOOL amount=$AMOUNT" >> "$LOG"

if [[ "$TOOL" == "process_refund" && "$AMOUNT" -gt 500 ]]; then
  echo "Refund amount $AMOUNT exceeds policy 500" >&2
  exit 2
fi
exit 0

Three things matter about that snippet. Every invocation writes a one-line heartbeat to a known file before any decision logic - so absence of the line proves the hook never ran. The shell redirects stderr to the same log so a crash leaves a trace. The exit codes are explicit (2 for deny with a clear message, 0 for allow). With that pattern in place, the failure modes above become falsifiable: grep the log, see whether the heartbeat is there, classify the problem, fix the right thing.

For richer iteration, use the --hooks-dry-run pattern. Record one real session with --debug capturing the full stdin payloads. Replay the recorded payloads through your hook script directly, as many times as you want, without re-incurring the tool-call cost or risking state mutation. This is the cheapest iteration loop for hook authoring; treat it as the test harness.

04 · Decision rule and checklist

Seven checks when a hook fails silently

  1. Run claude --debug. Confirm the hook is in the registration list. If not, the config did not parse - jump to step 2.
  2. Validate the config with jq.Confirm the hooks key is at the top level and event names match the SDK's case-sensitive expectations.
  3. Check the matcher against the real tool name. Grep the debug log for the actual tool_name on the call you expected to intercept. Exact match only.
  4. Check file permissions. The hook script must be executable (chmod +x). Path must be absolute.
  5. Capture stderr to a log.2>> ~/.claude/hook.log on the command string. Without this, a crash is invisible.
  6. Write a heartbeat line on every invocation. Even on success. Absence of the line is the positive failure signal.
  7. Iterate with --hooks-dry-run on a recorded session. Do not test against live tool calls; the iteration loop is too slow and too destructive.
05 · Common anti-patterns

Five recurring hook-debugging mistakes

  1. Silence as success. Assuming the hook ran because nothing crashed. Cause: no heartbeat logging. Fix: heartbeat on every invocation.
  2. Relative paths. The hook script path is relative; the working directory changes mid-session; the hook fails to launch. Fix: absolute paths always; use $${HOME} expansion if needed.
  3. Exit 0 with a printed warning. The hook detects a violation but exits 0 with a stderr warning. The warning is invisible to the model. Fix: exit 2 to deny; the stderr message routes back to Claude.
  4. Wrong matcher casing or substring.Matcher is 'Read' but the tool is 'ReadFile'. The hook never fires. Fix: copy the exact tool name from the debug log.
  5. Testing only in the parent session. The hook works in the parent but does not fire on subagent tool calls because configuration scope differs. Fix: test in the actual execution context, not just the parent.
06 · CCA-F exam mapping

How this shows up on the exam

Domain
D3 Agent Operations (20%) · D2 Tool Design overlap (18%)
What is tested
Whether you can name the FIRST diagnostic step on a silently-failing hook and whether you understand the three failure modes are distinct. The exam is diagnostic, not implementation; you read the symptom and pick the cause.
Stem pattern
A PostToolUse hook is configured but the side effect never occurs. Three causes are listed. Which is the FIRST diagnostic step?
Distractor to reject
"Restart Claude Code." Clears state but does not reveal which of the three failure modes you are in.
Second distractor
"Switch from PostToolUse to PreToolUse." Changes semantics, not diagnostics. If the hook is never registered, the event name does not matter.
Third distractor
"Add more detail to the system prompt." Prompt vs Hook heuristic per ACP-T03 §6. A prompt cannot fix a wiring problem with a deterministic gate.
07 · Sources

Vault and external references

  • Vault: data/aeo/reports/2026-05-17-recommendations.md §Signal - source of the four silent-failure causes and the missing-output anti-pattern.
  • Vault: public/concepts/hooks.md §How it works - the three-step lifecycle, matcher syntax, exact-match constraints, absolute-path requirement.
  • Vault: public/concepts/hooks.md§What it is - "Production hooks fail in two ways: incorrect matchers and silent errors."
  • Vault: 99-attachements/asc-a01-skilljar-course-content/course-02-claude-code-101/lesson-12-hooks.md - Skilljar canonical exit-code semantics (0 allow, 2 deny, 1 error) and PreToolUse block-tool-call behavior.
  • Vault: public/scenarios/conversational-ai-patterns.md- "Prompt-only enforcement is probabilistic (roughly 92%). Hooks are deterministic." - establishes why silent hook failure is so corrosive.
  • Vault: public/scenarios/agentic-tool-design.md §PreToolUse and PostToolUse - canonical hook patterns and configuration examples.
  • Vault: public/scenarios/claude-code-for-cicd.md - headless mode and --debug flag usage in CI contexts that surface hook behavior.
08 · Related

Adjacent reads