# Agent Skills for Enterprise KM

> An enterprise-scale Skills registry. Each Skill is a markdown file with frontmatter (name + version + description + tags + dependencies + access_level), stored in .claude/skills/{team}/{name}.md so naming collisions become structural impossibilities. The registry indexer rebuilds on every commit, the search service surfaces the right Skill from 200+, semver gates breaking changes, and a permission-aware layer enforces ACLs before invocation (support agents cannot invoke finance Skills, no matter how cleverly prompted). Empirically confirmed on the real CCA-F exam by multiple pass-takers as one of the highest-leverage beyond-guide scenarios.

**Sub-marker:** P3.11
**Domains:** D3 · Agent Operations, D2 · Tool Design + Integration
**Exam weight:** 38% of CCA-F (D3 + D2)
**Build time:** 26 minutes
**Source:** 🟢 Beyond-guide scenario · empirically witnessed on the real CCA-F exam
**Canonical:** https://claudearchitectcertification.com/scenarios/agent-skills-for-enterprise-km
**Last reviewed:** 2026-05-04

## In plain English

Think of this as the way a 5,000-person company stops re-inventing the same agent prompt fifteen times. Every team writes its own Skills (refund handling, expense reporting, deployment runbooks), and they all live in one shared library, organised by team and version, just like a code repo. When an agent on any team needs to do something, it searches the library, finds the right Skill, checks that the user is allowed to use it (finance Skills are not for the support team), and runs it. The whole point is that knowledge gets re-used safely at enterprise scale, not copy-pasted into a hundred system prompts.

## Exam impact

Domain 3 (Claude Code Configuration, 20%) tests Skills frontmatter, namespace conventions, and dependency resolution. Domain 2 (Tool Design, 18%) tests permission-aware invocation and the search-service contract. The scenario was confirmed on the real exam by two independent pass-takers (per ACP-T05 §Scenario 11 catalog), and it is one of the two beyond-guide scenarios with the highest known exam frequency, which makes it a high-value scenario to drill.

## The problem

### What the customer needs
- One source of truth across 15 teams. No copy-pasted Skill prompts drifting in 15 different repos.
- Discoverable at enterprise scale. An agent on the marketing team finds the right finance Skill in seconds, not hours.
- Permission-aware. Finance's budget-approval Skill must be unreachable from support's agent, no matter how the support agent is prompted.

### Why naive approaches fail
- 200+ Skills in one flat folder → collision week 1 (refund-resolver exists in support/, growth/, AND finance/, all mean different things).
- No semver → v2 silently breaks v1 callers when frontmatter shape changes; agents start failing silently across the org.
- No ACL → support agent invokes finance/budget-approval because the Skill description sounded relevant; policy violation at scale.

### Definition of done
- Naming collision rate = 0 (team namespace prefix enforced)
- Breaking-change incidents = 0 (semver in frontmatter, callers pin major)
- Cross-team unauthorized invocation rate = 0 (ACL check before execution)
- Skill-discovery p95 latency < 200ms (embeddings or full-text index)
- Reindex SLA < 60s from commit to searchable
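
The latency target above can be checked mechanically. The following is a minimal sketch, assuming latency samples are collected in milliseconds and that a nearest-rank percentile is acceptable; `p95` and `meets_search_slo` are hypothetical helper names, not part of any registry API.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of latency samples (milliseconds)."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) with integer arithmetic to avoid float rounding surprises
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_search_slo(samples_ms: list[float], budget_ms: float = 200.0) -> bool:
    """True when the p95 of the samples is under the budget (default 200 ms)."""
    return p95(samples_ms) < budget_ms
```

With 20 samples the nearest rank is the 19th value, so a single slow outlier does not fail the check but two do.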

## Concepts in play

- 🟢 **Skills** (`skills`), Markdown + frontmatter as the unit of reusable knowledge
- 🟢 **Project memory** (`claude-md-hierarchy`), Skills extend project-level CLAUDE.md across teams
- 🟢 **Tool calling** (`tool-calling`), Skill invocation as a structured tool call
- 🟢 **Attention engineering** (`attention-engineering`), Frontmatter routes the LLM to the right Skill
- 🟢 **Evaluation** (`evaluation`), Pre-execution ACL check is a gate, not a heuristic
- 🟢 **Context window** (`context-window`), Search returns top-k Skills, not all 200+
- 🟢 **Subagents** (`subagents`), Some Skills spawn isolated subagents internally
- 🟢 **Structured outputs** (`structured-outputs`), Skill frontmatter is the contract

## Components

### Skill Definition File, .claude/skills/{team}/{name}.md

The unit of enterprise knowledge. Markdown body holds the instructions; YAML frontmatter holds the metadata the registry indexes (name, version, description, tags, depends_on, access_level). Lives in version control next to code, reviewed via PR like any other team artifact.

**Configuration:** Path convention: .claude/skills/{team}/{name}.md. Frontmatter required: name, version (semver), description, tags, depends_on, access_level. Body: the actual prompt + examples. Reviewed in PRs.
**Concept:** `skills`

### Shared Registry & Indexer, rebuilds on every commit

A CI job that walks `.claude/skills/**/*.md`, parses frontmatter, validates schema, builds a searchable index, and publishes it to the registry service. It is idempotent and fast (sub-minute on 500 Skills). When a Skill commit lands, the index is fresh within 60s and the new version is discoverable.

**Configuration:** Triggered on push to main. Steps: glob skills, parse YAML, validate (semver, ACL, deps exist), upload index to registry. Reindex SLA: <60s. Failed parses fail the CI; bad Skills never reach the registry.
**Concept:** `structured-outputs`

### Search Service, embeddings-based at scale

Indexes Skill descriptions + tags + frontmatter. Agents query in natural language ('find a Skill for processing customer refunds') and get the top-k matches with their metadata. Full-text works at <50 Skills; embeddings (OpenAI / Voyage) become essential past 100; org-wide deployments use a hybrid (embeddings for recall, full-text for precision).

**Configuration:** POST /search { query, k=5, filters: { team?, access_level?, tag? } } → [{slug, version, description, score}]. Latency p95 < 200ms. Cache embeddings keyed by (skill_slug, content_hash); recompute only on content change.
**Concept:** `context-window`

### Git-Based Versioning, semver in frontmatter + Git tags

Every Skill carries a semver version in its frontmatter; every release tags Git so older versions stay reachable. Callers pin a MAJOR version (support/refund-resolver:1.x); the registry serves the latest minor/patch release within that major. Breaking changes bump the major; old callers keep working until they migrate.

**Configuration:** Frontmatter: version: 1.2.3. Caller: depends_on: ['support/refund-resolver:1.x']. Registry resolves to the latest release within the pinned major. Deprecated versions stay queryable for 6 months before archive.
**Concept:** `tool-calling`

### Access Control Layer, permission-aware invocation

Sits between Skill discovery and Skill execution. Reads the calling agent's role + the Skill's access_level (public | team | role-restricted | sensitive). Denies invocation when the agent's role isn't in the allowlist. Returns a structured permission-denied error. The agent observes it and can request access via the org's standard flow, not bypass it.

**Configuration:** Pre-invocation: { agent_role, skill_acl } → { allowed: bool, reason }. ACL stored in frontmatter access_level + team-level org config. Denied: structured error { code: 'ACL_DENIED', skill, reason, request_url }.
**Concept:** `evaluation`

## Build steps

### 1. Lay out the team-namespaced directory

Create .claude/skills/{team}/{name}.md per team. Even on day one with 5 Skills, namespace from the start. Retrofitting a flat layout into namespaces at 100 Skills is painful. The directory IS the registry's source of truth.

**Python:**

```python
# Repository layout
# .claude/
# └── skills/
#     ├── support/
#     │   ├── refund-resolver.md       # support/refund-resolver
#     │   └── escalation-router.md     # support/escalation-router
#     ├── platform/
#     │   ├── deploy-runbook.md
#     │   └── incident-triage.md
#     ├── data/
#     │   ├── query-builder.md
#     │   └── pii-redactor.md
#     └── finance/
#         └── budget-approval.md       # access_level: sensitive

# Bootstrap script for a fresh repo
import os
TEAMS = ["support", "platform", "data", "growth", "finance"]
for t in TEAMS:
    os.makedirs(f".claude/skills/{t}", exist_ok=True)
    with open(f".claude/skills/{t}/.gitkeep", "w") as f:
        pass
print("namespace-by-team layout ready; commit and start authoring.")
```

**TypeScript:**

```typescript
// Repository layout
// .claude/
// └── skills/
//     ├── support/
//     │   ├── refund-resolver.md       // support/refund-resolver
//     │   └── escalation-router.md     // support/escalation-router
//     ├── platform/
//     │   ├── deploy-runbook.md
//     │   └── incident-triage.md
//     ├── data/
//     │   ├── query-builder.md
//     │   └── pii-redactor.md
//     └── finance/
//         └── budget-approval.md       // access_level: sensitive

// Bootstrap script for a fresh repo
import { mkdirSync, writeFileSync } from "node:fs";
const teams = ["support", "platform", "data", "growth", "finance"];
for (const t of teams) {
  mkdirSync(`.claude/skills/${t}`, { recursive: true });
  writeFileSync(`.claude/skills/${t}/.gitkeep`, "");
}
console.log("namespace-by-team layout ready; commit and start authoring.");
```

Concept: `skills`
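
The directory only prevents collisions if the `name` declared in frontmatter agrees with the file's location. One way to enforce that in CI is sketched below, assuming POSIX-style repo-relative paths; `expected_name` and `check_name` are hypothetical helpers, not part of the indexer shown later.

```python
from pathlib import PurePosixPath

SKILLS_ROOT = PurePosixPath(".claude/skills")


def expected_name(path: str) -> str:
    """Derive 'team/skill-name' from '.claude/skills/{team}/{name}.md'."""
    rel = PurePosixPath(path).relative_to(SKILLS_ROOT)  # ValueError if outside root
    if len(rel.parts) != 2 or rel.suffix != ".md":
        raise ValueError(f"{path}: expected .claude/skills/{{team}}/{{name}}.md")
    team, filename = rel.parts
    return f"{team}/{PurePosixPath(filename).stem}"


def check_name(path: str, declared_name: str) -> None:
    """Raise if the frontmatter name does not match the namespaced path."""
    expected = expected_name(path)
    if declared_name != expected:
        raise ValueError(
            f"{path}: frontmatter name {declared_name!r} != {expected!r}"
        )
```

A Skill saved directly under `.claude/skills/` (the flat layout) is rejected, as is a file whose declared name points at another team's namespace.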

### 2. Define the Skill frontmatter schema

Every Skill carries the same YAML frontmatter shape, validated by the indexer. Required: name, version (semver), description, tags, access_level. Optional: depends_on, deprecated, owners. Schema lives in the repo so PRs that break it fail CI before merging.

**Python:**

```python
# .claude/skills/_schema.yaml: the frontmatter contract
# Validated by the indexer; PRs that violate this schema fail CI.

required:
  - name              # team/skill-name (e.g. support/refund-resolver)
  - version           # semver: MAJOR.MINOR.PATCH
  - description       # 1-2 sentence description, search-indexed
  - tags              # array, search-indexed
  - access_level      # public | team | role-restricted | sensitive

optional:
  - depends_on        # ['support/case-facts:1.x', ...]
  - deprecated        # 'use support/refund-resolver-v2 instead'
  - owners            # ['@support-team', '@jane.doe']

# Example Skill: .claude/skills/support/refund-resolver.md
---
name: support/refund-resolver
version: 1.2.3
description: |
  Resolves customer refund requests up to $500 using the case-facts
  block and escalation queue. For amounts above cap, escalates.
tags: [refund, customer-support, payment]
access_level: team
depends_on:
  - support/case-facts:1.x
  - shared/escalation-queue:2.x
owners:
  - "@support-team"
---

# Body: the actual instructions and examples ...
```

**TypeScript:**

```typescript
// .claude/skills/_schema.yaml: the frontmatter contract
// Validated by the indexer; PRs that violate this schema fail CI.
//
// required:
//   - name              // team/skill-name (e.g. support/refund-resolver)
//   - version           // semver: MAJOR.MINOR.PATCH
//   - description       // 1-2 sentence description, search-indexed
//   - tags              // array, search-indexed
//   - access_level      // public | team | role-restricted | sensitive
//
// optional:
//   - depends_on        // ['support/case-facts:1.x', ...]
//   - deprecated        // 'use support/refund-resolver-v2 instead'
//   - owners            // ['@support-team', '@jane.doe']

// Example Skill: .claude/skills/support/refund-resolver.md
// ---
// name: support/refund-resolver
// version: 1.2.3
// description: |
//   Resolves customer refund requests up to $500 using the case-facts
//   block and escalation queue. For amounts above cap, escalates.
// tags: [refund, customer-support, payment]
// access_level: team
// depends_on:
//   - support/case-facts:1.x
//   - shared/escalation-queue:2.x
// owners:
//   - "@support-team"
// ---
//
// # Body: the actual instructions and examples ...
```

Concept: `structured-outputs`
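
The `depends_on` entries in the example frontmatter follow a `team/skill:MAJOR.x` shape. A minimal sketch of validating that shape is below, assuming lowercase kebab-case team and Skill names and assuming only the major-pin form is allowed; `DepSpec` and `parse_dep_spec` are hypothetical names.

```python
import re
from typing import NamedTuple

# Assumption: names are lowercase kebab-case; the pin is always 'MAJOR.x'.
DEP_SPEC = re.compile(
    r"^(?P<team>[a-z0-9][a-z0-9-]*)/(?P<skill>[a-z0-9][a-z0-9-]*)"
    r":(?P<major>0|[1-9]\d*)\.x$"
)


class DepSpec(NamedTuple):
    name: str   # 'team/skill'
    major: int  # pinned major version


def parse_dep_spec(spec: str) -> DepSpec:
    """Parse 'support/case-facts:1.x' into (name, major) or raise ValueError."""
    m = DEP_SPEC.match(spec)
    if m is None:
        raise ValueError(f"bad depends_on entry: {spec!r} (want team/skill:MAJOR.x)")
    return DepSpec(f"{m['team']}/{m['skill']}", int(m["major"]))
```

Rejecting an exact pin such as `support/case-facts:1.2.3` at schema-validation time keeps every dependency edge on the major-pin convention.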

### 3. Build the registry indexer

A CI job walks `.claude/skills/**/*.md`, parses each Skill's frontmatter, validates the schema, resolves dependencies, and writes a searchable index. Runs on every push to main; reindex SLA <60s on 500 Skills. Bad Skills (broken schema, missing dep, semver violation) fail the CI. They never reach the registry.

**Python:**

```python
# scripts/index_skills.py. Runs in CI on push to main
import yaml, json, glob, sys, hashlib, semver
from pathlib import Path

REQUIRED = {"name", "version", "description", "tags", "access_level"}
ACCESS_LEVELS = {"public", "team", "role-restricted", "sensitive"}

def parse(path: Path) -> dict:
    text = path.read_text()
    if not text.startswith("---"):
        raise ValueError(f"{path}: missing frontmatter")
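    _, fm, body = text.split("---", 2)
    meta = yaml.safe_load(fm)
    if not isinstance(meta, dict):  # empty or non-mapping frontmatter
        raise ValueError(f"{path}: frontmatter is not a YAML mapping")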
    missing = REQUIRED - set(meta)
    if missing:
        raise ValueError(f"{path}: missing keys: {missing}")
    if meta["access_level"] not in ACCESS_LEVELS:
        raise ValueError(f"{path}: bad access_level: {meta['access_level']}")
    semver.VersionInfo.parse(meta["version"])  # raises if invalid
    meta["body_hash"] = hashlib.sha256(body.encode()).hexdigest()[:12]
    meta["path"] = str(path)
    return meta

def build_index() -> list[dict]:
    skills = [parse(Path(p)) for p in glob.glob(".claude/skills/**/*.md", recursive=True)]
    # Resolve dependencies. Every depends_on must exist
    names = {s["name"] for s in skills}
    for s in skills:
        for dep in s.get("depends_on", []):
            dep_name = dep.split(":")[0]
            if dep_name not in names:
                raise ValueError(f"{s['name']}: missing dep {dep_name}")
    return skills

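if __name__ == "__main__":
    try:
        index = build_index()
        out = Path("dist/skill-registry.json")
        out.parent.mkdir(parents=True, exist_ok=True)  # dist/ may not exist in a fresh checkout
        out.write_text(json.dumps(index, indent=2))
        print(f"indexed {len(index)} skills; wrote {out}")
    except (ValueError, yaml.YAMLError) as e:  # malformed YAML should also fail CI cleanly
        print(f"::error::{e}", file=sys.stderr)
        sys.exit(1)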
```

**TypeScript:**

```typescript
// scripts/index-skills.ts. Runs in CI on push to main
import { readFileSync, writeFileSync } from "node:fs";
import { glob } from "glob";
import { parse as parseYaml } from "yaml";
import { createHash } from "node:crypto";
import semver from "semver";

const REQUIRED = ["name", "version", "description", "tags", "access_level"];
const ACCESS_LEVELS = ["public", "team", "role-restricted", "sensitive"];

interface Skill {
  name: string;
  version: string;
  description: string;
  tags: string[];
  access_level: string;
  depends_on?: string[];
  deprecated?: string;
  owners?: string[];
  body_hash: string;
  path: string;
}

function parse(path: string): Skill {
  const text = readFileSync(path, "utf8");
  if (!text.startsWith("---")) throw new Error(`${path}: missing frontmatter`);
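  // String.prototype.split's limit drops the remainder, so slice on the
  // closing delimiter instead; this keeps any later '---' rules in the body.
  const end = text.indexOf("---", 3);
  if (end === -1) throw new Error(`${path}: unterminated frontmatter`);
  const fm = text.slice(3, end);
  const body = text.slice(end + 3);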
  const meta = parseYaml(fm) as Record<string, unknown>;
  for (const k of REQUIRED) {
    if (!(k in meta)) throw new Error(`${path}: missing key ${k}`);
  }
  if (!ACCESS_LEVELS.includes(meta.access_level as string)) {
    throw new Error(`${path}: bad access_level: ${meta.access_level}`);
  }
  if (!semver.valid(meta.version as string)) {
    throw new Error(`${path}: invalid semver ${meta.version}`);
  }
  const body_hash = createHash("sha256").update(body).digest("hex").slice(0, 12);
  return { ...meta, body_hash, path } as Skill;
}

async function buildIndex(): Promise<Skill[]> {
  const paths = await glob(".claude/skills/**/*.md");
  const skills = paths.map((p) => parse(p));
  const names = new Set(skills.map((s) => s.name));
  for (const s of skills) {
    for (const dep of s.depends_on ?? []) {
      const depName = dep.split(":")[0];
      if (!names.has(depName)) throw new Error(`${s.name}: missing dep ${depName}`);
    }
  }
  return skills;
}

try {
  const index = await buildIndex();
  writeFileSync("dist/skill-registry.json", JSON.stringify(index, null, 2));
  console.log(`indexed ${index.length} skills; pushed to registry`);
} catch (e) {
  console.error(`::error::${(e as Error).message}`);
  process.exit(1);
}
```

Concept: `structured-outputs`
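
The indexer above checks that every dependency exists, but a pair of Skills that depend on each other would still pass. A sketch of a cycle check that could run in the same CI step follows, assuming the input is a mapping from Skill name to dependency names with the version constraint already stripped; `find_cycle` is a hypothetical helper.

```python
from __future__ import annotations


def find_cycle(deps: dict[str, list[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in deps}
    stack: list[str] = []

    def visit(node: str) -> list[str] | None:
        color[node] = GREY
        stack.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:  # back edge: dep is already on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None
```

Called with `{s["name"]: [d.split(":")[0] for d in s.get("depends_on", [])] for s in skills}`, a non-None result is a reason to fail the build and print the offending chain.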

### 4. Add semantic search over the registry

At <50 Skills, full-text on description+tags is enough. Past 100, agents need to discover by intent rather than keyword (a query like 'give the customer their money back' should match refund-resolver even though it never says 'refund'). Embeddings plus a vector index over each Skill's description+tags is the play; cache embeddings keyed by body_hash so re-embedding only fires on content change.

**Python:**

```python
# scripts/search_service.py
import numpy as np, json
from pathlib import Path

# Index loaded from registry
SKILLS = json.loads(Path("dist/skill-registry.json").read_text())

# Embedding cache keyed by (skill name, body_hash)
_emb_cache: dict[tuple[str, str], list[float]] = {}

def embed_text(text: str) -> list[float]:
    """Stand-in for any embeddings provider (Voyage, OpenAI, etc.)."""
    # In production, batch-embed once at index time and cache the vectors;
    # replace this stub with a call to your provider's embeddings API.
    raise NotImplementedError

def index_skill(s: dict):
    key = (s["name"], s["body_hash"])
    if key not in _emb_cache:
        text = f"{s['description']} {' '.join(s['tags'])}"
        _emb_cache[key] = embed_text(text)

def cosine(a, b):
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query: str, k: int = 5,
           team: str | None = None,
           access_level: str | None = None) -> list[dict]:
    qv = embed_text(query)
    scored = []
    for s in SKILLS:
        if team and not s["name"].startswith(f"{team}/"):
            continue
        if access_level and s["access_level"] != access_level:
            continue
        index_skill(s)
        sim = cosine(qv, _emb_cache[(s["name"], s["body_hash"])])
        scored.append((sim, s))
    scored.sort(key=lambda x: -x[0])
    return [
        {**s, "score": round(score, 3)}
        for score, s in scored[:k]
    ]

# Example
# results = search("handle a customer refund up to $500", k=3, team="support")
```

**TypeScript:**

```typescript
// scripts/search-service.ts
import { readFileSync } from "node:fs";

interface Skill {
  name: string;
  version: string;
  description: string;
  tags: string[];
  access_level: string;
  body_hash: string;
}

const SKILLS: Skill[] = JSON.parse(
  readFileSync("dist/skill-registry.json", "utf8"),
);

const embCache = new Map<string, number[]>();

async function embedText(text: string): Promise<number[]> {
  // Stand-in for Voyage / OpenAI / etc.
  // In production: batch-embed at index time and cache.
  throw new Error("not implemented");
}

async function indexSkill(s: Skill) {
  const key = `${s.name}|${s.body_hash}`;
  if (!embCache.has(key)) {
    const text = `${s.description} ${s.tags.join(" ")}`;
    embCache.set(key, await embedText(text));
  }
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

export async function search(
  query: string,
  opts: { k?: number; team?: string; access_level?: string } = {},
) {
  const { k = 5, team, access_level } = opts;
  const qv = await embedText(query);
  const scored: Array<{ score: number; skill: Skill }> = [];
  for (const s of SKILLS) {
    if (team && !s.name.startsWith(`${team}/`)) continue;
    if (access_level && s.access_level !== access_level) continue;
    await indexSkill(s);
    const sim = cosine(qv, embCache.get(`${s.name}|${s.body_hash}`)!);
    scored.push({ score: sim, skill: s });
  }
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, k).map(({ score, skill }) => ({
    ...skill,
    score: Math.round(score * 1000) / 1000,
  }));
}
```

Concept: `context-window`
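
For the small-registry case (under roughly 50 Skills), the full-text option mentioned above needs no embeddings provider at all. The following is a rough sketch of keyword-overlap scoring over description and tags, assuming the same registry entry shape as the index; `tokens` and `fulltext_search` are hypothetical names and the scoring rule is an illustrative choice, not the registry's actual ranking.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; 'customer-support' yields two tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fulltext_search(query: str, skills: list[dict], k: int = 5) -> list[dict]:
    """Rank Skills by the fraction of query tokens found in description + tags."""
    q = tokens(query)
    scored = []
    for s in skills:
        doc = tokens(s["description"]) | tokens(" ".join(s["tags"]))
        overlap = len(q & doc)
        if overlap:  # overlap > 0 implies q is non-empty, so the division is safe
            scored.append((overlap / len(q), s["name"], s))
    scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, then by name
    return [{**s, "score": round(score, 3)} for score, _, s in scored[:k]]
```

This finds `refund-resolver` for a query containing "refund" but returns nothing for "give the customer their money back" unless those words appear in the description, which is the gap embeddings close.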

### 5. Pin versions on every dependency edge

Every depends_on in a Skill's frontmatter pins a MAJOR version (support/case-facts:1.x), not an exact MAJOR.MINOR.PATCH. The registry resolves to the latest release within the pinned major. When case-facts ships a breaking change, it bumps to v2. Old callers continue against v1.x; new callers opt in to v2 explicitly. This is the same idea as npm's caret ranges (^1.2.3) or a pip specifier such as >=1.0,<2.0, applied to Skills.

**Python:**

```python
# scripts/resolve_deps.py. Given a Skill, resolve its depends_on graph
import json, semver
from pathlib import Path

SKILLS = json.loads(Path("dist/skill-registry.json").read_text())
INDEX = {s["name"]: [] for s in SKILLS}
for s in SKILLS:
    INDEX[s["name"]].append(s)
for name in INDEX:
    INDEX[name].sort(key=lambda s: semver.VersionInfo.parse(s["version"]))

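def resolve(spec: str) -> dict:
    """spec: 'team/skill', 'team/skill:1.x', or 'team/skill:>=2.0.0 <3.0.0'."""
    name, _, constraint = spec.partition(":")
    versions = INDEX.get(name, [])
    if not versions:
        raise LookupError(f"unknown skill: {name}")

    if constraint in ("", "*"):
        candidates = versions  # no constraint: any version qualifies
    elif constraint.endswith(".x"):
        major = int(constraint.split(".")[0])
        candidates = [
            v for v in versions
            if semver.VersionInfo.parse(v["version"]).major == major
        ]
    else:
        # python-semver matches one comparison at a time, so split
        # space-separated clauses and require every clause to hold.
        clauses = constraint.split()
        candidates = [
            v for v in versions
            if all(semver.VersionInfo.parse(v["version"]).match(c) for c in clauses)
        ]

    if not candidates:
        raise LookupError(f"{name}: no version satisfies {constraint}")
    return candidates[-1]  # latest matching (INDEX is sorted ascending)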

def topo_resolve(skill_spec: str, seen: set | None = None) -> list[dict]:
    """Resolve full dep graph in topological order."""
    seen = seen or set()
    skill = resolve(skill_spec)
    if skill["name"] in seen:
        return []
    seen.add(skill["name"])
    out = []
    for dep_spec in skill.get("depends_on", []):
        out.extend(topo_resolve(dep_spec, seen))
    out.append(skill)
    return out
```

**TypeScript:**

```typescript
// scripts/resolve-deps.ts
import { readFileSync } from "node:fs";
import semver from "semver";

interface Skill {
  name: string;
  version: string;
  depends_on?: string[];
  [k: string]: unknown;
}

const SKILLS: Skill[] = JSON.parse(
  readFileSync("dist/skill-registry.json", "utf8"),
);
const INDEX = new Map<string, Skill[]>();
for (const s of SKILLS) {
  if (!INDEX.has(s.name)) INDEX.set(s.name, []);
  INDEX.get(s.name)!.push(s);
}
for (const versions of INDEX.values()) {
  versions.sort((a, b) => semver.compare(a.version, b.version));
}

export function resolve(spec: string): Skill {
  // spec: 'team/skill:1.x' or 'team/skill:>=2.0.0 <3.0.0'
  // Default to '*' so a bare 'team/skill' spec matches any version.
  const [name, constraint = "*"] = spec.split(":");
  const versions = INDEX.get(name);
  if (!versions) throw new Error(`unknown skill: ${name}`);

  let candidates: Skill[];
  if (constraint?.endsWith(".x")) {
    const major = Number(constraint.split(".")[0]);
    candidates = versions.filter((v) => semver.major(v.version) === major);
  } else {
    candidates = versions.filter((v) => semver.satisfies(v.version, constraint));
  }
  if (candidates.length === 0) {
    throw new Error(`${name}: no version satisfies ${constraint}`);
  }
  return candidates[candidates.length - 1];
}

export function topoResolve(spec: string, seen = new Set<string>()): Skill[] {
  const skill = resolve(spec);
  if (seen.has(skill.name)) return [];
  seen.add(skill.name);
  const out: Skill[] = [];
  for (const dep of skill.depends_on ?? []) {
    out.push(...topoResolve(dep, seen));
  }
  out.push(skill);
  return out;
}
```

Concept: `tool-calling`
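
As a worked example of what "latest release within the pinned major" means, here is a dependency-free sketch. It assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes; `parse_version` and `latest_in_major` are hypothetical helpers, separate from the resolver above.

```python
def parse_version(v: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers so comparison is numeric."""
    major, minor, patch = (int(part) for part in v.split("."))
    return major, minor, patch


def latest_in_major(versions: list[str], pin: str) -> str:
    """pin is 'MAJOR.x'; return the highest published version with that major."""
    major = int(pin.split(".")[0])
    matching = [v for v in versions if parse_version(v)[0] == major]
    if not matching:
        raise LookupError(f"no version satisfies {pin}")
    return max(matching, key=parse_version)
```

Given published versions 1.2.3, 1.9.9, 1.10.0, and 2.0.0, a caller pinned to `1.x` gets 1.10.0: the comparison is numeric (10 > 9), and 2.0.0 is never selected until the caller changes its pin.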

### 6. Enforce ACLs before invocation

Permission-aware invocation isn't built into Claude; you implement it. Read the calling agent's role + the Skill's access_level, run a hard check before invoking, and return a structured error on deny. This is a deterministic gate, not a prompt-language constraint; the Skill's body never executes if the ACL check fails.

**Python:**

```python
# scripts/acl_gate.py
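from typing import TypedDict, Literal

class _SkillRequired(TypedDict):
    name: str
    access_level: Literal["public", "team", "role-restricted", "sensitive"]

class Skill(_SkillRequired, total=False):
    allowed_roles: list[str]  # read by the 'role-restricted' rule below
    owners: list[str]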

class AgentContext(TypedDict):
    role: str           # e.g. 'support-agent', 'finance-agent'
    teams: list[str]    # ['support', 'shared']
    elevated: bool      # has the user explicitly elevated to invoke sensitive skills?

ROLE_ACL = {
    # Each access_level → which roles may invoke
    "public": lambda ctx, s: True,
    "team": lambda ctx, s: any(s["name"].startswith(f"{t}/") for t in ctx["teams"]),
    "role-restricted": lambda ctx, s: ctx["role"] in s.get("allowed_roles", []),
    "sensitive": lambda ctx, s: ctx["elevated"] and any(
        s["name"].startswith(f"{t}/") for t in ctx["teams"]
    ),
}

def check(ctx: AgentContext, skill: Skill) -> dict:
    """Returns {allowed, reason, request_url?}."""
    rule = ROLE_ACL[skill["access_level"]]
    if rule(ctx, skill):
        return {"allowed": True, "reason": "access_granted"}
    return {
        "allowed": False,
        "reason": f"agent role={ctx['role']} cannot invoke {skill['name']} (access_level={skill['access_level']})",
        "request_url": f"https://internal.example.com/skills/request-access?skill={skill['name']}",
    }

# Usage in the agent loop
def invoke_skill(ctx: AgentContext, skill_spec: str, payload: dict):
    from resolve_deps import resolve
    skill = resolve(skill_spec)
    decision = check(ctx, skill)
    if not decision["allowed"]:
        return {"error": "ACL_DENIED", **decision}
    # ...actually invoke the Skill body...
```

**TypeScript:**

```typescript
// scripts/acl-gate.ts
type AccessLevel = "public" | "team" | "role-restricted" | "sensitive";

interface Skill {
  name: string;
  access_level: AccessLevel;
  allowed_roles?: string[];
  owners?: string[];
}

interface AgentContext {
  role: string;        // e.g. 'support-agent', 'finance-agent'
  teams: string[];     // ['support', 'shared']
  elevated: boolean;   // has the user explicitly elevated to invoke sensitive skills?
}

const ROLE_ACL: Record<AccessLevel, (ctx: AgentContext, s: Skill) => boolean> = {
  public: () => true,
  team: (ctx, s) => ctx.teams.some((t) => s.name.startsWith(`${t}/`)),
  "role-restricted": (ctx, s) => (s.allowed_roles ?? []).includes(ctx.role),
  sensitive: (ctx, s) =>
    ctx.elevated && ctx.teams.some((t) => s.name.startsWith(`${t}/`)),
};

export function check(ctx: AgentContext, skill: Skill) {
  if (ROLE_ACL[skill.access_level](ctx, skill)) {
    return { allowed: true as const, reason: "access_granted" };
  }
  return {
    allowed: false as const,
    reason: `agent role=${ctx.role} cannot invoke ${skill.name} (access_level=${skill.access_level})`,
    request_url: `https://internal.example.com/skills/request-access?skill=${skill.name}`,
  };
}

// Usage in the agent loop
export async function invokeSkill(
  ctx: AgentContext,
  skillSpec: string,
  payload: Record<string, unknown>,
) {
  const { resolve } = await import("./resolve-deps");
  const skill = resolve(skillSpec) as Skill;
  const decision = check(ctx, skill);
  if (!decision.allowed) {
    return { error: "ACL_DENIED" as const, ...decision };
  }
  // ...actually invoke the Skill body...
}
```

Concept: `evaluation`
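
To make the "no matter how cleverly prompted" property concrete, here is a compact, self-contained restatement of the gate with a deny-by-default branch and the error shape described in the Access Control Layer configuration. It is a sketch under assumptions: `is_allowed` and `acl_denied` are hypothetical names, and URL-encoding the Skill name in `request_url` is an added precaution rather than something the original specifies.

```python
from urllib.parse import quote


def is_allowed(ctx: dict, skill: dict) -> bool:
    """Decide from role/team/elevation only; prompt text is never an input."""
    level = skill["access_level"]
    on_team = skill["name"].split("/", 1)[0] in ctx["teams"]
    if level == "public":
        return True
    if level == "team":
        return on_team
    if level == "role-restricted":
        return ctx["role"] in skill.get("allowed_roles", [])
    if level == "sensitive":
        return bool(ctx["elevated"]) and on_team
    return False  # unknown access_level: deny by default


def acl_denied(ctx: dict, skill: dict) -> dict:
    """Structured error: { code, skill, reason, request_url }."""
    name = skill["name"]
    return {
        "code": "ACL_DENIED",
        "skill": name,
        "reason": f"role={ctx['role']} cannot invoke {name} "
                  f"(access_level={skill['access_level']})",
        # quote(..., safe='') also encodes the '/' in 'team/skill'
        "request_url": "https://internal.example.com/skills/request-access"
                       f"?skill={quote(name, safe='')}",
    }
```

Because the decision depends only on `ctx` and the Skill's metadata, an elevated support agent is still denied `finance/budget-approval`: elevation alone does not grant membership of the finance team.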

### 7. Wire the agent's Skill discovery into its tool loop

Expose two tools to every agent: search_skills(query, filters) and invoke_skill(name, version_constraint, payload). The agent finds Skills by intent, the ACL gate runs inside invoke_skill, and the Skill body executes only on allow. The agent never sees the registry's raw 200+ entries, only the top-k matches for its query, optionally filtered by team and access_level.

**Python:**

```python
# Skills are exposed as two tools to every agent
TOOLS = [
    {
        "name": "search_skills",
        "description": (
            "Find a Skill in the enterprise registry by natural-language query. "
            "Returns up to k matches with name, version, description, score. "
            "Use BEFORE invoke_skill so you have a name+version to invoke."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "k": {"type": "integer", "default": 5},
                "team": {"type": "string"},  # optional filter
                "access_level": {"type": "string"},  # optional filter
            },
            "required": ["query"],
        },
    },
    {
        "name": "invoke_skill",
        "description": (
            "Invoke a Skill from the registry. ACL is checked before the "
            "Skill body executes; if denied, returns ACL_DENIED with a "
            "request_url for access. Always pin a major version (e.g. 1.x)."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "version_constraint": {"type": "string", "default": "*"},
                "payload": {"type": "object"},
            },
            "required": ["name", "payload"],
        },
    },
]
```

**TypeScript:**

```typescript
// Skills are exposed as two tools to every agent
import type Anthropic from "@anthropic-ai/sdk";

export const tools: Anthropic.Tool[] = [
  {
    name: "search_skills",
    description:
      "Find a Skill in the enterprise registry by natural-language query. " +
      "Returns up to k matches with name, version, description, score. " +
      "Use BEFORE invoke_skill so you have a name+version to invoke.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" },
        k: { type: "integer", default: 5 },
        team: { type: "string" },
        access_level: { type: "string" },
      },
      required: ["query"],
    },
  },
  {
    name: "invoke_skill",
    description:
      "Invoke a Skill from the registry. ACL is checked before the Skill " +
      "body executes; if denied, returns ACL_DENIED with a request_url for " +
      "access. Always pin a major version (e.g. 1.x).",
    input_schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        version_constraint: { type: "string", default: "*" },
        payload: { type: "object" },
      },
      required: ["name", "payload"],
    },
  },
];
```

Concept: `tool-calling`
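
The tool definitions still need something on the application side that routes each tool call to the right handler. A minimal sketch of that routing follows; `make_dispatcher` is a hypothetical helper, the handlers are injected so the example stays self-contained, and the returned dict is an illustrative shape that you would map onto a `tool_result` block (its `content` and `is_error` fields) in your own agent loop.

```python
from typing import Any, Callable

Handler = Callable[[dict], Any]


def make_dispatcher(handlers: dict[str, Handler]) -> Callable[[str, dict], dict]:
    """Build a function that routes (tool_name, tool_input) to a handler."""

    def dispatch(tool_name: str, tool_input: dict) -> dict:
        handler = handlers.get(tool_name)
        if handler is None:
            return {"is_error": True, "content": f"unknown tool: {tool_name}"}
        try:
            return {"is_error": False, "content": handler(tool_input)}
        except Exception as exc:  # report failures to the model instead of crashing
            return {"is_error": True, "content": f"{type(exc).__name__}: {exc}"}

    return dispatch


# Stub handlers for illustration; real ones would call search() and invoke_skill().
def search_skills_stub(args: dict) -> list[dict]:
    return [{"name": "support/refund-resolver", "version": "1.2.3"}]


def invoke_skill_stub(args: dict) -> dict:
    return {"invoked": args["name"]}


dispatch = make_dispatcher(
    {"search_skills": search_skills_stub, "invoke_skill": invoke_skill_stub}
)
```

Keeping the ACL check inside the `invoke_skill` handler, rather than in the dispatcher, means there is exactly one code path to the Skill body and it always passes through the gate.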

### 8. Track usage + deprecation lifecycle

Once Skills are in production, the registry needs to know which Skills are hot, which are stale, which have known broken versions. Log every invoke_skill call with name, version, agent role, outcome. Surface a deprecation notice in search_skills results when an old version is queried. Auto-archive Skills with zero invocations in 6 months.

**Python:**

```python
# scripts/usage_tracker.py
from datetime import datetime, timedelta, timezone
from collections import Counter
import json
from pathlib import Path

LOG_PATH = Path("logs/skill-invocations.jsonl")

def utc_now() -> datetime:
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time.
    return datetime.now(timezone.utc)

# Append-only log of every invoke_skill call
def log_invocation(name: str, version: str, agent_role: str, outcome: str):
    record = {
        "ts": utc_now().isoformat().replace("+00:00", "Z"),
        "name": name,
        "version": version,
        "agent_role": agent_role,
        "outcome": outcome,  # 'success' | 'acl_denied' | 'error'
    }
    LOG_PATH.parent.mkdir(parents=True, exist_ok=True)  # logs/ may not exist yet
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(record) + "\n")

# Nightly: surface deprecation candidates + hot Skills
def nightly_report():
    cutoff = utc_now() - timedelta(days=180)
    invocations = [
        json.loads(line)
        for line in LOG_PATH.read_text().splitlines()
        if line.strip()  # tolerate blank lines in the log
    ]
    recent = [
        r for r in invocations
        if datetime.fromisoformat(r["ts"].replace("Z", "+00:00")) > cutoff
    ]
    hot = Counter((r["name"], r["version"]) for r in recent).most_common(20)
    invoked_names = {r["name"] for r in recent}
    all_names = {s["name"] for s in json.loads(Path("dist/skill-registry.json").read_text())}
    cold = sorted(all_names - invoked_names)

    print("=== Hot Skills (last 180d) ===")
    for (name, ver), count in hot:
        print(f"  {name}:{ver}  {count}")
    print(f"\n=== Cold Skills (deprecation candidates) ({len(cold)}) ===")
    for name in cold:
        print(f"  {name}")
```

**TypeScript:**

```typescript
// scripts/usage-tracker.ts
import { appendFileSync, readFileSync } from "node:fs";

// Named InvocationRecord to avoid shadowing TypeScript's built-in Record<K, V> utility type
interface InvocationRecord {
  ts: string;
  name: string;
  version: string;
  agent_role: string;
  outcome: "success" | "acl_denied" | "error";
}

export function logInvocation(
  name: string,
  version: string,
  agent_role: string,
  outcome: InvocationRecord["outcome"],
) {
  const r: InvocationRecord = { ts: new Date().toISOString(), name, version, agent_role, outcome };
  appendFileSync("logs/skill-invocations.jsonl", JSON.stringify(r) + "\n");
}

export function nightlyReport() {
  const cutoff = Date.now() - 180 * 24 * 3600 * 1000;
  const invocations = readFileSync("logs/skill-invocations.jsonl", "utf8")
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line) as InvocationRecord);
  const recent = invocations.filter((r) => new Date(r.ts).getTime() > cutoff);

  const hotMap = new Map<string, number>();
  for (const r of recent) {
    const k = `${r.name}:${r.version}`;
    hotMap.set(k, (hotMap.get(k) ?? 0) + 1);
  }
  const hot = [...hotMap.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, 20);

  const invokedNames = new Set(recent.map((r) => r.name));
  const allNames = new Set(
    JSON.parse(readFileSync("dist/skill-registry.json", "utf8")).map(
      (s: { name: string }) => s.name,
    ),
  );
  const cold = [...allNames].filter((n) => !invokedNames.has(n)).sort();

  console.log("=== Hot Skills (last 180d) ===");
  for (const [k, n] of hot) console.log(`  ${k}  ${n}`);
  console.log(`\n=== Cold Skills (deprecation candidates) (${cold.length}) ===`);
  for (const name of cold) console.log(`  ${name}`);
}
```

Concept: `evaluation`

## Decision matrix

| Decision | Right answer | Wrong answer | Why |
|---|---|---|---|
| Org has 200+ Skills across 15 teams | Team-namespaced directory ({team}/{name}) + shared registry + embeddings search + ACL layer | Flat folder, full-text search, no ACL ('we'll add permissions later') | Naming collisions, version drift, and cross-team ACL violations all become structural impossibilities at the directory + frontmatter level. Retrofitting them at 200 Skills costs an order of magnitude more than starting clean. |
| Skill case-facts is shipping a breaking change | Bump major (v2.0.0); existing callers stay on v1.x until they migrate; deprecation notice in v1's frontmatter | Edit v1 in place; tell teams to update their callers | Semver + Git tags let old callers keep working while new callers opt into v2 deliberately. Editing in place silently breaks every agent in the org that depends on the Skill, a class of incident that's painful to debug because the symptoms surface in agent loops, not in the Skill itself. |
| Support agent's prompt suggests calling finance/budget-approval | ACL gate denies pre-execution; structured ACL_DENIED error with request_url returned to the agent | Trust the prompt; finance Skills not in support agent's tool list | Prompt-only restriction leaks under prompt injection or clever phrasing. A deterministic ACL gate that runs before the Skill body executes is the only real boundary. Tool-list restriction is the second layer; ACL is the first. |
| Agent needs to find a Skill but doesn't know its exact name | search_skills(query): embeddings/full-text search returns top-k matches with metadata | Show all 200+ Skills in the agent's tool list | Putting 200 tools in a single agent's tool list destroys routing accuracy (per Scenario P3.1's tool-count rule). Search-by-intent surfaces only the top-k relevant matches; the agent picks one and invokes it. Two tools (search + invoke) cover the whole space. |

## Failure modes

| Anti-pattern | Failure | Fix |
|---|---|---|
| AP-20 · Unbounded skill count in flat layout | 200+ Skills in .claude/skills/ flat folder. Naming collisions appear in week 1 (refund-resolver exists in support, growth, and finance contexts, all meaning different things). Discovery becomes a grep contest. | Team-namespaced layout: .claude/skills/{team}/{name}.md. Collisions become structurally impossible because support/refund-resolver and growth/refund-resolver are distinct paths. Past 50 Skills, add an embeddings-based search service. |
| AP-21 · No versioning | Skill case-facts ships a breaking change (frontmatter shape changes). Every agent in the org that depends on it starts failing silently. No way to roll back a single Skill's update. | Semver in frontmatter (version: 1.2.3) + Git tags. Callers pin major (case-facts:1.x); registry resolves to latest patch. Breaking changes bump major, callers migrate deliberately. |
| AP-22 · Naming collisions across teams | Two teams independently author a refund-resolver Skill. Both end up in .claude/skills/refund-resolver.md (last commit wins). Agents call the wrong one; nobody notices for weeks. | Team namespace prefix: support/refund-resolver vs growth/refund-resolver. The directory layout enforces uniqueness; the indexer rejects duplicates. PR review surfaces collisions before merge. |
| AP-23 · No access control | Support agent's prompt is cleverly engineered (or injected via PR content) to invoke finance/budget-approval. The Skill executes; an unauthorized $50K refund is approved. Audit log shows the agent did it; ACL log shows nothing because there is no ACL. | ACL gate (access_level: public \| team \| role-restricted \| sensitive) on every Skill, checked pre-invocation. Denied calls return a structured ACL_DENIED error; the agent observes it and either escalates or routes differently. Deterministic, not prompt-based. |
| AP-24 · Skills as one-off prompts | Each agent's system prompt copy-pastes the relevant Skill content inline. When the Skill changes, 12 agents need updating. Nobody updates them all; behavior drifts over months. | Skills are reusable, composable, versioned units. Agents reference them via invoke_skill('support/refund-resolver:1.x', payload). One source of truth; one Skill update propagates to every caller automatically. |

## Implementation checklist

- [ ] Team-namespaced directory layout: .claude/skills/{team}/{name}.md (`skills`)
- [ ] Frontmatter schema documented and validated by the indexer (name, version, description, tags, access_level required) (`structured-outputs`)
- [ ] Semver enforced: every Skill has a valid semver in frontmatter (`tool-calling`)
- [ ] CI indexer runs on push to main; reindex SLA <60s on 500 Skills
- [ ] Search service deployed with p95 <200ms; embeddings cached by body_hash (`context-window`)
- [ ] Two tools exposed to every agent: search_skills + invoke_skill (`tool-calling`)
- [ ] ACL gate runs PRE-invocation; denied calls return structured ACL_DENIED (`evaluation`)
- [ ] Dependency resolution: callers pin major; registry resolves to latest patch
- [ ] Deprecation lifecycle: zero-invocation Skills auto-flagged at 180d
- [ ] Usage log appended on every invoke_skill call (jsonl)
- [ ] PR review on every Skill change, including changes to the frontmatter shape

## Cost & latency

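- **Skill execution (avg 800 tokens):** ~$0.0036 per invocation at Sonnet 4.5 list pricing ($3 per million input tokens, $15 per million output tokens). Skill body ~500 tokens system + ~200 input = 700 input tokens (~$0.0021), plus ~100 output tokens (~$0.0015). Most Skills are narrow, focused units, so there is no inflation from generic prompt scaffolding.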
- **Search service (embeddings):** ~$0.0001 per query. Voyage / OpenAI embeddings at ~512 dims cost a small fraction of a cent per query, and embeddings are cached by body_hash so re-embedding only fires on content change. At 1M queries/month, ~$100.
- **Reindex CI job (per push to main):** ~$0 (compute) + ~$0.01 (embedding refresh). The indexer is pure parsing on the GitHub Actions free tier; the only cost is re-embedding Skills whose content changed, typically <5% of the registry per push.
- **ACL check overhead:** ~0.01 ms per invocation, ~0% token cost. The ACL check is a deterministic dictionary lookup against frontmatter + agent role, with no LLM call. Latency is negligible in the pipeline; the cost is in maintenance, not execution.
- **Annual registry hosting (5K Skills, 20K queries/day):** ~$3K-8K/year. Covers the embeddings store, search service, and reindex compute. This is small relative to the per-invocation Skill execution cost, which dominates total spend at scale.
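The per-invocation and search figures above are simple arithmetic over token counts and unit prices. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the price constants are assumptions to be replaced with the current price table rather than values taken from this scenario's tooling.

```python
# Sketch of the cost arithmetic. Prices are explicit inputs, not hard-coded facts.

def invocation_cost(input_tokens: int, output_tokens: int,
                    input_usd_per_mtok: float, output_usd_per_mtok: float) -> float:
    """Cost in USD of one Skill invocation, given per-million-token prices."""
    return (input_tokens * input_usd_per_mtok
            + output_tokens * output_usd_per_mtok) / 1_000_000


def monthly_search_cost(queries_per_month: int, usd_per_query: float = 0.0001) -> float:
    """Cost in USD of the search service at a flat per-query embedding price."""
    return queries_per_month * usd_per_query


# Assumed list prices (USD per million tokens); verify before relying on them.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0

# ~500 system + ~200 input tokens, ~100 output tokens, as in the bullet above.
per_call = invocation_cost(500 + 200, 100, INPUT_USD_PER_MTOK, OUTPUT_USD_PER_MTOK)
print(f"per invocation: ${per_call:.4f}")
print(f"search at 1M queries/month: ${monthly_search_cost(1_000_000):.2f}")
```

Output tokens are priced separately from input tokens, so the input/output split matters as much as the total token count.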

## Domain weights

- **D3 · Agent Operations (20%):** Skill definition file + frontmatter schema + Git versioning + deprecation lifecycle
- **D2 · Tool Design + Integration (18%):** search_skills + invoke_skill tool design + ACL gate + dependency resolution

## Practice questions

### Q1. An enterprise has 200+ Skills across 15 teams. Skill-name collisions occur weekly (refund-resolver exists in support/, growth/, AND finance/, all meaning different things). How should you structure the registry to prevent this structurally?

Adopt a team namespace prefix convention enforced by the directory layout: .claude/skills/{team}/{name}.md, so the canonical name is support/refund-resolver vs growth/refund-resolver. Collisions become impossible at the filesystem level (different paths) and at the registry level (the indexer rejects duplicate name fields in frontmatter). Pair with PR review on every Skill change to catch deliberate naming drift before merge. Tagged to AP-22.

### Q2. A Skill for customer-support refund processing is updated frequently. Last week, an in-place edit broke 12 dependent agents silently. How do you prevent this?

Semver in frontmatter plus Git tags. Every Skill carries a version: MAJOR.MINOR.PATCH; every release tags the Git history. Callers pin a major (support/refund-resolver:1.x); the registry resolves to the latest patch within that major. Breaking changes bump the major (v2.0.0); existing callers continue against v1.x until they migrate deliberately. The in-place edit becomes structurally impossible: the indexer rejects two Skills with the same name and version. Tagged to AP-21.
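A minimal sketch of how a major pin could be resolved, assuming the registry exposes a mapping from canonical Skill name to its published version strings. `parse_version`, `resolve_pin`, and the `registry` shape are hypothetical names for illustration, and the sketch picks the highest version within the pinned major (minor and patch releases alike).

```python
# Hypothetical resolver for 'name:MAJOR.x' pins; not the registry's actual API.
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into ints so versions compare numerically."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def resolve_pin(ref: str, registry: Dict[str, List[str]]) -> str:
    """Return the highest published version that matches a 'name:MAJOR.x' pin."""
    name, _, constraint = ref.partition(":")
    if name not in registry:
        raise KeyError(f"unknown Skill: {name}")
    pinned_major = int(constraint.split(".")[0])
    candidates = [v for v in registry[name] if parse_version(v)[0] == pinned_major]
    if not candidates:
        raise LookupError(f"no published version of {name} matches {constraint}")
    return max(candidates, key=parse_version)


registry = {"support/refund-resolver": ["1.2.3", "1.10.0", "2.0.0"]}
print(resolve_pin("support/refund-resolver:1.x", registry))  # 1.10.0
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.2.3" sorts after "1.10.0", which would pin callers to a stale release.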

### Q3. Finance team has sensitive Skills (e.g. budget-approval). The support team's agent must NEVER invoke them, no matter how cleverly prompted (or prompt-injected). How do you enforce this architecturally?

Every Skill carries an access_level in frontmatter (public | team | role-restricted | sensitive). An ACL gate runs before Skill invocation: read the calling agent's role + the Skill's access_level, deny pre-execution if not allowed, return structured {error: 'ACL_DENIED', skill, reason, request_url}. The agent observes the denial and either escalates or routes differently; it cannot bypass the gate. This is deterministic, not prompt-based; cleverness in the prompt cannot defeat a hard pre-invocation check. Tagged to AP-23.
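The gate can be sketched in a few lines. The policy below is an assumption made for illustration (public = anyone, team = same team namespace, everything else = explicit allow-list), and `allowed_roles`, `Agent`, `is_allowed`, and the example.com request URL are hypothetical names, not part of the scenario's schema.

```python
# Sketch of a pre-invocation ACL gate under an assumed policy.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Skill:
    name: str                       # canonical name, e.g. "finance/budget-approval"
    access_level: str               # public | team | role-restricted | sensitive
    allowed_roles: List[str] = field(default_factory=list)  # hypothetical allow-list

    @property
    def team(self) -> str:
        # The team is the namespace prefix of the canonical name.
        return self.name.split("/", 1)[0]


@dataclass
class Agent:
    role: str
    team: str


def is_allowed(agent: Agent, skill: Skill) -> bool:
    """Assumed policy: public = anyone, team = same team, otherwise allow-list only."""
    if skill.access_level == "public":
        return True
    if skill.access_level == "team":
        return agent.team == skill.team
    # role-restricted, sensitive, and any unknown level fail closed.
    return agent.role in skill.allowed_roles


def invoke_skill(agent: Agent, skill: Skill, payload: Dict[str, Any],
                 run: Callable[[Dict[str, Any]], Any]) -> Any:
    """Check the ACL before the Skill body (`run`) is ever called."""
    if not is_allowed(agent, skill):
        return {
            "error": "ACL_DENIED",
            "skill": skill.name,
            "reason": f"role '{agent.role}' may not invoke a {skill.access_level} Skill",
            # Placeholder URL; a real deployment would point at its own request form.
            "request_url": f"https://example.com/access-requests?skill={skill.name}",
        }
    return run(payload)


budget = Skill("finance/budget-approval", "sensitive", allowed_roles=["finance-approver"])
support_agent = Agent(role="support-agent", team="support")
result = invoke_skill(support_agent, budget, {"amount": 50_000}, run=lambda p: "approved")
print(result["error"])  # ACL_DENIED
```

Because the check is ordinary code that runs before `run`, nothing in the agent's prompt or in the payload can change its outcome.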

### Q4. An agent on the marketing team needs to discover the right Skill from 50+ available. Searching by exact name is slow and requires the agent to already know what's there. What infrastructure should you add?

An embeddings-based search service keyed over description + tags with optional filters by team and access_level. The agent calls search_skills('process customer refund up to $500', k=5) and gets the top-5 matches with {name, version, description, score}. Pair with a full-text fallback for exact-keyword queries. Cache embeddings keyed by body_hash so re-embedding only fires on content change. p95 query latency stays <200ms even at 5,000 Skills.
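The caching idea can be sketched independently of any embedding provider. In the sketch below, `EmbeddingCache` and `embed_fn` are hypothetical: `embed_fn` stands in for whichever embedding call the search service uses, and the hash is taken over the exact text being embedded, which is an assumption about what body_hash covers.

```python
# Sketch: cache embeddings by content hash so unchanged text is never re-embedded.
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


class EmbeddingCache:
    def __init__(self, embed_fn: Callable[[str], Vector]):
        self._embed_fn = embed_fn           # provider call, injected by the caller
        self._by_hash: Dict[str, Vector] = {}
        self.misses = 0                     # number of real embedding calls made

    @staticmethod
    def body_hash(text: str) -> str:
        """Stable content hash; any edit to the text yields a new key."""
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get(self, text: str) -> Vector:
        key = self.body_hash(text)
        if key not in self._by_hash:
            self.misses += 1
            self._by_hash[key] = self._embed_fn(text)
        return self._by_hash[key]


# Stand-in embedder for the demo; a real service would call its embedding provider.
cache = EmbeddingCache(embed_fn=lambda text: [float(len(text))])
cache.get("Process customer refunds up to $500. tags: refund, support")
cache.get("Process customer refunds up to $500. tags: refund, support")  # cache hit
print(cache.misses)  # 1
```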

### Q5. A Skill captures enterprise knowledge (policies, procedures) for support. Should it be a single 500-line markdown file or modular across multiple files with depends_on?

Modular composition via depends_on. The Skill's frontmatter declares its dependencies (depends_on: ['support/case-facts:1.x', 'shared/escalation-queue:2.x']); the registry resolves them topologically at invocation time. Benefits: each unit is independently versioned (case-facts evolves separately from escalation-queue), reusable across multiple parent Skills, and easier to PR-review (smaller files). The dep resolver rejects cycles and verifies that every referenced version exists.
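Topological resolution with cycle and missing-dependency checks can be sketched with Python's standard-library `graphlib` (3.9+). The sketch assumes version pins have already been resolved, so the `depends_on` mapping uses bare canonical names; `resolution_order` is a hypothetical helper.

```python
# Sketch: load order for a Skill and its transitive dependencies.
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def resolution_order(root: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return root and its transitive dependencies, dependencies first."""
    graph: Dict[str, List[str]] = {}
    stack = [root]
    while stack:
        name = stack.pop()
        if name in graph:
            continue
        if name not in depends_on:
            raise LookupError(f"missing dependency: {name}")
        graph[name] = depends_on[name]
        stack.extend(depends_on[name])
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())


deps = {
    "support/refund-resolver": ["support/case-facts", "shared/escalation-queue"],
    "shared/escalation-queue": ["support/case-facts"],
    "support/case-facts": [],
}
print(resolution_order("support/refund-resolver", deps))
# ['support/case-facts', 'shared/escalation-queue', 'support/refund-resolver']

try:
    resolution_order("a", {"a": ["b"], "b": ["a"]})
except CycleError:
    print("cycle rejected")
```

Running the same function in CI against every Skill in the index would surface missing dependencies and cycles before merge rather than at invocation time.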

## FAQ

### Q1. What's the maximum number of Skills per organization?

Unbounded with the right infrastructure. Per-project (a single agent's working set), keep <12 for discoverability. Per-team, low hundreds is comfortable with a search service. Org-wide, thousands work with embeddings + namespaces + ACLs. The bottleneck is rarely raw Skill count. It's how the agent finds the right one and how the org governs change.

### Q2. Can a Skill depend on other Skills?

Yes, declared in frontmatter, for example depends_on: ['support/case-facts:1.x', 'shared/escalation-queue:2.x']. The registry validates that dependencies exist at index time (CI fails on a missing dep) and resolves them topologically at invocation time. Avoid cycles: the dep resolver detects and rejects them.

### Q3. How do you version Skills without breaking existing agents?

Semver in frontmatter + callers pin major. A Skill at v1.2.3 keeps backward compatibility for all v1.x callers. When a breaking change is needed, bump to v2.0.0; existing callers continue against v1.x until they migrate deliberately. Deprecation notices in the v1 frontmatter point to v2; the registry surfaces the warning in search_skills results.

### Q4. Is permission-aware RAG built into Claude?

No. You implement it. Claude's tool layer doesn't know about your org's roles. Implement an ACL gate that runs pre-invocation: read agent role + Skill access_level, deny if not allowed, return structured ACL_DENIED. The Skill body never executes if the ACL check fails. This is the same pattern as authorization middleware in any HTTP service. Deterministic, not LLM-judged.

### Q5. Should sensitive Skills be versioned differently?

No. Same versioning, different access control. Versioning is about backward compatibility; access control is about who can invoke. They're orthogonal. A sensitive Skill ships v1.2.3 just like a public one; the ACL gate controls who can call it, regardless of version.

### Q6. How do you find the right Skill from 200+?

Two tools, one query. First, search_skills(query, k=5, filters): embeddings search returns the top-k matches by intent. Second, invoke_skill(name, version, payload): runs the chosen Skill behind the ACL check. The agent never gets raw access to the registry; it queries through the search tool. This keeps the agent's tool list small (just 2 tools) while exposing the entire Skills library.

### Q7. What happens to old Skill versions when a new major ships?

They stay queryable for 6 months by default. The deprecation lifecycle: ship v2.0.0 → mark v1.x with a deprecation note in frontmatter → registry serves v1.x to existing callers but flags the deprecation in search results → after 6 months of zero invocations, auto-archive. Active Skills stay forever; truly cold ones get cleaned up.
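The lifecycle above reduces to a small classification rule. A minimal sketch, assuming the registry can supply each version's deprecation flag and the timestamp of its last activity (last invocation, or publish date if never invoked); `lifecycle_state` and its state names are hypothetical.

```python
# Sketch: classify a Skill version for the nightly deprecation report.
from datetime import datetime, timedelta, timezone

ARCHIVE_AFTER = timedelta(days=180)  # "6 months of zero invocations"


def lifecycle_state(deprecated: bool, last_activity: datetime, now: datetime) -> str:
    """Return 'archive', 'deprecated', or 'active' for one Skill version."""
    if now - last_activity >= ARCHIVE_AFTER:
        return "archive"        # cold: no invocations inside the window
    if deprecated:
        return "deprecated"     # still served, but flagged in search results
    return "active"


now = datetime(2026, 5, 4, tzinfo=timezone.utc)
print(lifecycle_state(True, now - timedelta(days=30), now))    # deprecated
print(lifecycle_state(True, now - timedelta(days=200), now))   # archive
print(lifecycle_state(False, now - timedelta(days=1), now))    # active
```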

## Production readiness

- [ ] All Skills have valid frontmatter (CI fails the merge if not)
- [ ] Indexer reindex SLA <60s on the live registry
- [ ] Search p95 latency <200ms under steady-state load
- [ ] ACL gate unit-tested per access_level (public, team, role-restricted, sensitive)
- [ ] Dep resolver tested with cycle, missing-dep, and major-bump scenarios
- [ ] Usage log persisted append-only (jsonl) with retention policy documented
- [ ] Nightly deprecation report runs; cold Skills surfaced to owners
- [ ] Prod deploy of search service has fallback to full-text on embeddings outage

---

**Source:** https://claudearchitectcertification.com/scenarios/agent-skills-for-enterprise-km
**Vault sources:** ACP-T05 §Scenario 11 (🟢 confirmed beyond-guide; u/ZealousidealFill6044); ACP-T07 §Lab 11 spec (HIGHEST PRIORITY beyond-guide lab); ACP-T08 §3.11 (multi-file skills, registry, search, version, ACL); Course 15 Introduction to Agent Skills. Overview + lesson 5 (sharing skills); ACP-T06 (5 practice Qs tagged to components); COD-K12 Hermes agent architecture review (self-improving skill systems)
**Last reviewed:** 2026-05-04

**Evidence tiers:** 🟢 official Anthropic doc · 🟡 partial doc / inferred · 🟠 community-derived · 🔴 disputed.
