You'll walk away with
- How to enable Claude models in Vertex AI Model Garden and authenticate via gcloud Application Default Credentials
- How the AnthropicVertex Python SDK differs from the direct Anthropic SDK (project_id, region, model name format)
- Which Claude features ride on top of Vertex unchanged (prompt caching, vision, PDF support, citations, extended thinking, batch)
- How regional model availability and quota management work on Vertex compared to the direct API
- How IAM, VPC Service Controls, and Cloud Logging fit into a Vertex-hosted Claude deployment
- When to choose Vertex AI vs the direct Anthropic API vs Amazon Bedrock for a given workload
Read these first
Lesson outline
Every lesson from Claude with Google Cloud's Vertex AI, each paired with our one-line simplification. The Skilljar course is the source; we summarize.
Show all 93 lessons
| # | Skilljar lesson | Our simplification |
|---|---|---|
| 1 | Welcome to the course | Course intro and what Vertex AI adds vs the direct Anthropic API. |
| 2 | Overview of Claude models | Model family overview; same models, accessed via Vertex. |
| 3 | Accessing the API | Request lifecycle: client to server to Vertex to model and back; never call from browser. |
| 4 | Vertex AI Setup | DEPLOYMENT-SPECIFIC: enable Anthropic models in Model Garden, install gcloud CLI, run gcloud auth application-default login. |
| 5 | Making a request | DEPLOYMENT-SPECIFIC: pip install anthropic[vertex], instantiate AnthropicVertex(region=..., project_id=...), model id format claude-sonnet-4@20250514. |
| 6 | Multi-turn conversations | Append assistant + user messages to maintain dialogue state; mirrors Course 6. |
| 7 | Chat exercise | Hands-on: build a minimal chat loop against Vertex. |
| 8 | System prompts | Set role and behavior via the system parameter; identical semantics to direct API. |
| 9 | System prompts exercise | Hands-on: experiment with system prompt variations. |
| 10 | Temperature | Sampling control 0 to 1; lower for deterministic, higher for creative. |
| 11 | Course satisfaction survey | Mid-course feedback prompt. |
| 12 | Response streaming | Stream tokens as they generate via SSE; reduces TTFT for chat UIs. |
| 13 | Controlling model output | max_tokens, stop_sequences, and response shaping basics. |
| 14 | Structured data | Coax JSON via prompting; introduce schema thinking before tool use. |
| 15 | Structured data exercise | Hands-on: extract structured fields with prompted JSON. |
| 16 | Quiz on accessing Claude with the API | Section quiz on auth, request shape, and response handling. |
| 17 | Prompt evaluation | Why systematic eval beats vibes; same content as Course 6. |
| 18 | A typical eval workflow | Test set + grader + iteration loop; canonical pattern. |
| 19 | Generating test datasets | Use Claude itself (via Vertex) to synthesize test inputs at scale. |
| 20 | Running the eval | Loop the test set through your prompt and capture outputs. |
| 21 | Model-based grading | LLM-as-judge: rubric-driven grading with a separate Claude call. |
| 22 | Code-based grading | Deterministic graders for format, regex, schema validation. |
| 23 | Exercise on prompt evals | Hands-on: stand up a small eval harness. |
| 24 | Quiz on prompt evaluation | Section quiz on eval workflow. |
| 25 | Prompt engineering | Intro to the canonical Anthropic prompt-engineering techniques. |
| 26 | Being clear and direct | State the task plainly; ambiguity costs more than verbosity. |
| 27 | Being specific | Specificity collapses the response space; vague asks invite drift. |
| 28 | Structure with XML tags | XML tags as structural anchors Claude attends to reliably. |
| 29 | Providing examples | Few-shot examples in the prompt steer style and format. |
| 30 | Exercise on prompting | Hands-on: refactor a weak prompt using the four techniques. |
| 31 | Quiz on prompt engineering techniques | Section quiz on the prompting toolkit. |
| 32 | Introducing tool use | Tools let Claude call your functions; same protocol on Vertex. |
| 33 | Project overview | Multi-tool project setup for the section. |
| 34 | Tool functions | Define the Python functions Claude will call. |
| 35 | Tool schemas | JSON schema for tool inputs; the model only sees this schema. |
| 36 | Handling message blocks | Iterate the response content blocks (text + tool_use) and dispatch. |
| 37 | Sending tool results | Append tool_result messages and re-call the model to continue. |
| 38 | Multi-turn conversations with tools | Maintain the agentic loop across multiple tool calls. |
| 39 | Implementing multiple turns | Hands-on: code the loop with stop_reason guards. |
| 40 | Using multiple tools | Register a tool list; let Claude pick the right one per turn. |
| 41 | The batch tool | Submit many requests at once for cost-and-throughput wins. |
| 42 | Tools for structured data | Tool schemas as the cleanest path to structured outputs. |
| 43 | The text edit tool | Built-in text-editor tool for file-edit workflows. |
| 44 | The web search tool | Built-in web search tool (note: availability differs across deployment platforms; check Vertex docs). |
| 45 | Quiz on tool use with Claude | Section quiz on tool-use mechanics. |
| 46 | Introducing retrieval-augmented generation | RAG = retrieve relevant docs and add them to context; mitigates the model's knowledge limits. |
| 47 | Text chunking strategies | Fixed-size, sentence-boundary, and semantic chunking tradeoffs. |
| 48 | Text embeddings | Dense vector representations for semantic search; on GCP often Vertex-hosted embedding models. |
| 49 | The full RAG flow | Embed -> store -> retrieve top-k -> augment prompt -> generate. |
| 50 | Implementing the RAG flow | Hands-on RAG pipeline. |
| 51 | BM25 lexical search | Keyword/term-frequency retrieval as a complement to embedding search. |
| 52 | A multi-index RAG pipeline | Combine BM25 + embeddings with reciprocal rank fusion. |
| 53 | Reranking results | Cross-encoder rerank of top-k retrievals before context insertion. |
| 54 | Contextual retrieval | Anthropic's contextual retrieval: prepend a chunk-aware summary before embedding. |
| 55 | Quiz on retrieval-augmented generation | Section quiz on RAG components. |
| 56 | Extended thinking | Reasoning mode where the model thinks before answering; surfaced as a content block. |
| 57 | Image support | Vision: pass images as base64 or URL content blocks. |
| 58 | PDF support | Native PDF inputs; large docs feed the Working Memory cliff. |
| 59 | Citations | Built-in citations: model returns spans tying claims back to source docs. |
| 60 | Prompt caching | Cache stable prefixes (system prompt, RAG context) for major cost wins. |
| 61 | Rules of prompt caching | TTLs, breakpoints, minimum sizes; cache-hit accounting on Vertex. |
| 62 | Prompt caching in action | Hands-on: measure cache-hit savings on a realistic workload. |
| 63 | Quiz on features of Claude | Section quiz on cross-cutting features. |
| 64 | Introducing MCP | Model Context Protocol: standard for connecting tools/data to Claude. |
| 65 | MCP clients | Claude Desktop, Claude Code, custom clients; all speak the same protocol. |
| 66 | Project setup | Stand up an MCP server scaffold for the section. |
| 67 | Defining tools with MCP | Expose tools through MCP rather than per-app tool schemas. |
| 68 | The server inspector | Anthropic's MCP inspector for debugging server output. |
| 69 | Implementing a client | Build a custom MCP client around Claude on Vertex. |
| 70 | Defining resources | MCP resources: model-pull data sources. |
| 71 | Accessing resources | Wire resources into the client and let Claude read them. |
| 72 | Defining prompts | MCP prompts: server-provided prompt templates. |
| 73 | Prompts in the client | Surface MCP-served prompts in client UI. |
| 74 | MCP review | Recap of tools/resources/prompts split. |
| 75 | Quiz on Model Context Protocol | Section quiz on MCP architecture. |
| 76 | Anthropic apps | Overview of Claude Desktop and Claude Code as Vertex-compatible clients. |
| 77 | Claude Code setup | Install + configure Claude Code; works against Vertex-backed deployments. |
| 78 | Claude Code in action | Live coding session demoing common workflows. |
| 79 | Enhancements with MCP servers | Plug MCP servers into Claude Code for repo/db/issue access. |
| 80 | Parallelizing Claude Code | git worktrees + multiple sessions for parallel feature work. |
| 81 | Automated debugging | Subagent-driven bug repro and fix loop. |
| 82 | Computer use | Computer-use tool: Claude controls a virtual desktop via screenshots + actions. |
| 83 | How computer use works | Action loop: screenshot -> reason -> click/type -> repeat. |
| 84 | Agents and workflows | Distinction: workflows are scripted, agents choose their own path. |
| 85 | Parallelization workflows | Fan-out workflow: same task across many inputs in parallel. |
| 86 | Chaining workflows | Sequential workflow: output of step N feeds step N+1. |
| 87 | Routing workflows | Classifier-driven routing to specialized downstream prompts. |
| 88 | Agents and tools | Agentic loop with a curated tool whitelist; same primitives on Vertex. |
| 89 | Environment inspection | Let the agent probe its environment before acting. |
| 90 | Workflows vs agents | Choose workflow when steps are known; choose agent when path is open. |
| 91 | Quiz on agents and workflows | Section quiz on agent design. |
| 92 | Final assessment quiz | End-of-course assessment across all sections. |
| 93 | Course wrap-up | Recap and pointers to deeper Vertex + Anthropic resources. |
The course in 7 paragraphs
This course is the platform-agnostic Claude API course (Course 6 `claude-api-foundations`) wrapped in Google Cloud; same prompt engineering, same eval workflow, same tool-use mechanics, same RAG patterns, same MCP protocol, same agent design. If you have done Course 6, roughly 85 of the 93 lessons will feel familiar verbatim. The ~8 lessons that *justify a separate Knowledge page* are the deployment seam: how you authenticate, how you address models, how regions and quotas work, and how Vertex's enterprise controls (IAM, VPC-SC, Cloud Logging) fit on top. This page focuses on those seams; for everything else, lean on claude-api-foundations as the canonical reference.
Authentication on Vertex is gcloud-mediated, not API-key-mediated. You install the gcloud CLI, run gcloud init and gcloud auth application-default login, set a project with gcloud config set project YOUR_PROJECT_ID, and from then on the AnthropicVertex SDK picks up Application Default Credentials automatically. There is no ANTHROPIC_API_KEY. The implication for your architecture is that auth is bound to a Google Cloud identity (a user account in dev, a service account in production), which means your IAM model becomes the security boundary. Grant roles/aiplatform.user to the service account, scope it to the project, and rotate via Google Cloud's normal service-account-key lifecycle (or Workload Identity Federation if you are running outside GCP).
The SDK surface differs in three load-bearing ways. First, the import: from anthropic import AnthropicVertex instead of from anthropic import Anthropic. Second, the constructor: AnthropicVertex(region="global", project_id="your-project-id"); the region and project_id are mandatory and bind every request to a specific Vertex tenant. Third, the model id format: Vertex uses `claude-sonnet-4@20250514` rather than `claude-sonnet-4-20250514` (note the @ instead of trailing dash). Everything below those three lines (messages.create, content blocks, tool schemas, streaming, prompt caching) is byte-for-byte identical to the direct API. pip install "anthropic[vertex]" pulls the right extras.
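A minimal sketch of those three deltas, assuming `pip install "anthropic[vertex]"` has run and gcloud ADC is configured; `"your-project-id"` and the request contents are placeholders. Note how the request kwargs are shared verbatim with the direct API, so they are factored into a helper that works without the SDK installed:

```python
# Sketch of the three Vertex-specific deltas: import, constructor, model id.
# Assumes `pip install "anthropic[vertex]"` and gcloud ADC already configured;
# "your-project-id" is a placeholder.

def build_request(model: str = "claude-sonnet-4@20250514") -> dict:
    """Request kwargs -- identical on the direct API and on Vertex."""
    return {
        "model": model,  # Vertex form: @ before the date, not a dash
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Say hello"}],
    }


def main() -> None:
    # Deferred import so the payload helper above works without the SDK.
    from anthropic import AnthropicVertex

    client = AnthropicVertex(region="global", project_id="your-project-id")
    message = client.messages.create(**build_request())
    print(message.content[0].text)
```

Call `main()` once credentials are in place; everything below the constructor line behaves exactly as it does against the direct API.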
Regional model availability is a real operational concern. Not every Claude model is hosted in every Vertex region; new models often launch in us-east5 or us-central1 first and roll out elsewhere over weeks. The Vertex region="global" setting routes to the nearest available region and is usually the right default for production unless you have data-residency constraints. If you do need a specific region (EU data residency, regulated workloads), check the Model Garden listing for that region before you commit; a model that exists in `us-east5` will return a `not found` error in `europe-west4` even though both are valid Vertex regions. Quotas are per-project per-region and are managed in the Cloud Console under Quotas & system limits; default quotas are conservative and you will likely raise them before production traffic.
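One way to handle the regional gap in code is a simple fallback loop. This is a sketch, not SDK code: `invoke` stands in for your own function that builds an `AnthropicVertex` client for a given region and makes the call, and the broad `except` is illustrative; in real code you would catch the SDK's not-found error specifically.

```python
# Sketch: prefer a residency-constrained region, fall back to "global" when
# the model isn't hosted there. `invoke` is injected so the fallback logic
# stands alone and can be tested without any GCP dependency.

def call_with_region_fallback(invoke, regions=("europe-west4", "global")):
    """Try regions in order; return (region, result) of the first success."""
    last_error = None
    for region in regions:
        try:
            return region, invoke(region)
        except Exception as exc:  # in real code: catch the SDK's NotFoundError
            last_error = exc
    raise last_error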
Enterprise controls ride on top of Vertex unchanged. This is the main reason customers choose Vertex over the direct Anthropic API: VPC Service Controls confine traffic to a security perimeter, Cloud Audit Logs capture every messages.create invocation with caller identity, Customer-Managed Encryption Keys (CMEK) wrap inputs and outputs, and Private Service Connect avoids public-internet egress. None of those are Anthropic features per se; they are GCP features that Vertex inherits because Claude is served as a first-class Vertex AI model. The compliance story is Vertex's, not Anthropic's: SOC 2, ISO 27001, HIPAA BAA (where applicable), FedRAMP for government workloads. If your org has a Google Cloud landing zone, deploying Claude on Vertex slots into existing policy controls instead of standing up a parallel data-flow review.
Feature parity is high but not perfect, and the gaps move over time. Prompt caching, vision, PDF support, citations, extended thinking, and the batch API generally land on Vertex within weeks of the direct API release; the message-format protocol is identical. The exceptions tend to be at the *tool* layer: the built-in web search tool and computer use have shipped with deployment-specific availability gates, so check the Anthropic on Vertex docs before depending on them. Pricing is set by Google Cloud (not Anthropic) and is typically priced per 1M input/output tokens at parity with the direct API, billed through your GCP invoice. Cache hits and batch requests get the same multipliers you see on direct API.
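Because the message-format protocol is identical, a caching-aware payload built for the direct API works unchanged on Vertex. A sketch of marking a stable system prompt as cacheable, using the Messages API's `cache_control` convention (payload construction only, no network call):

```python
# Sketch: mark a stable prefix (long system prompt, RAG context) cacheable.
# The cache_control block is the same shape on Vertex as on the direct API;
# this only builds the request dict -- no network call is made.

def cached_system(system_text: str) -> list:
    """System prompt as a content block flagged for prompt caching."""
    return [
        {
            "type": "text",
            "text": system_text,
            "cache_control": {"type": "ephemeral"},
        }
    ]


def request_with_cached_system(system_text: str, user_text: str) -> dict:
    return {
        "model": "claude-sonnet-4@20250514",  # Vertex-form model id
        "max_tokens": 1024,
        "system": cached_system(system_text),
        "messages": [{"role": "user", "content": user_text}],
    }
```

Pass the resulting dict to `client.messages.create(**...)` on either client; only the model id differs between platforms.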
When to choose Vertex vs the direct Anthropic API vs Bedrock. Pick Vertex when your stack is already on Google Cloud; the auth, billing, IAM, audit, and data-residency stories all consolidate, and you avoid a second vendor relationship. Pick the direct Anthropic API when you want fastest access to new models and features, simpler key-based auth, and no cloud lock-in. Pick Bedrock (covered in claude-with-bedrock) when your stack is on AWS, for the symmetric reasons. The application code is roughly 95% portable across all three; the differences are auth, model id format, the SDK constructor, and operational integrations. Picking a deployment platform is mostly an organizational decision, not a technical one, and the exam expects you to recognize that.
5 things that change when you move from the direct Anthropic API to Vertex
If you already know the direct API (Course 6), these are the deltas you actually need to internalize. Everything else is unchanged.
- Auth: gcloud ADC instead of API key
  No `ANTHROPIC_API_KEY`. Run `gcloud auth application-default login` in dev; use a service account with `roles/aiplatform.user` in prod. Auth is bound to a Google Cloud identity, which means IAM is your security boundary.
- SDK constructor: `AnthropicVertex(region, project_id)`
  Use `from anthropic import AnthropicVertex` and pass `region` and `project_id`. `region="global"` is a sensible default unless data residency dictates otherwise. Install with `pip install "anthropic[vertex]"`.
- Model id format uses `@` not `-`
  `claude-sonnet-4@20250514` on Vertex, `claude-sonnet-4-20250514` on the direct API. A small but easy-to-trip-on difference; copy from the Model Garden listing rather than from the Anthropic docs.
- Regional availability is real
  Not every model is in every region. New models tend to launch in `us-east5` first. Check Model Garden for your target region before you commit; `region="global"` routes to the nearest available region.
- Enterprise controls come from GCP
  VPC-SC, CMEK, Cloud Audit Logs, Private Service Connect, and the GCP compliance posture (SOC 2, ISO 27001, HIPAA BAA, FedRAMP) all apply because Claude is served as a Vertex model. The compliance story is GCP's, not Anthropic's directly.
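The model-id delta is mechanical enough to automate in code that targets both platforms. A hypothetical helper (not part of the anthropic SDK) that swaps the dash before the 8-digit date suffix for the Vertex `@`:

```python
import re


def to_vertex_model_id(direct_id: str) -> str:
    """Convert a direct-API model id to the Vertex form.

    claude-sonnet-4-20250514 -> claude-sonnet-4@20250514
    Ids already in Vertex form (or without a date suffix) pass through
    unchanged. Hypothetical helper, not part of the anthropic SDK.
    """
    return re.sub(r"-(\d{8})$", r"@\1", direct_id)
```

Keeping the id in a single config value and converting at the edge avoids scattering two spellings of the same model through your codebase.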
6 takeaways with cross-pillar bridges
Vertex deployment is the same Claude API surface as claude-api-foundations plus a different auth and addressing model; about 85 of 93 lessons mirror Course 6 verbatim.
Authentication is gcloud Application Default Credentials, not an API key; production uses a service account with roles/aiplatform.user and IAM is the security boundary.
The AnthropicVertex SDK requires region and project_id, and the model id format uses @ (e.g. claude-sonnet-4@20250514) instead of a trailing dash.
Regional model availability is a real operational gate; new models launch in specific regions first and region="global" is the sensible production default unless data residency requires otherwise.
VPC Service Controls, CMEK, Cloud Audit Logs, and the GCP compliance posture (SOC 2, HIPAA BAA, FedRAMP) ride on top of Vertex unchanged; that is the main reason enterprises pick Vertex over the direct API.
Application code is roughly 95% portable across direct API, Vertex, and Bedrock; choosing a deployment platform is mostly an organizational decision (where your cloud landing zone lives), not a technical one.
How this maps to the CCA-F exam
2 hand-picked extras
These amplify the Skilljar course beyond what the course itself covers. Each was picked for a specific reason.
Claude on Google Cloud Vertex AI; Anthropic API documentation
Canonical reference for the SDK surface, model id formats, and feature-availability table on Vertex. Pair with Skilljar Lessons 4-5 when you start authenticating against a real GCP project.
Read source ↗
Anthropic Claude in Vertex AI Model Garden
Google's own integration docs covering Model Garden enablement, regional availability, quota management, and the IAM permissions required to invoke Claude on Vertex.
Read source ↗
Concepts in this course
Tool calling
Same protocol on Vertex; tool schemas are byte-identical to direct API
Concept: tool-calling ↗
Prompt caching
Available on Vertex with same TTL/breakpoint rules as direct API
Concept: prompt-caching ↗
Batch API
Available on Vertex; useful for cost-sensitive bulk workloads
Concept: batch-api ↗
MCP
Protocol-level, deployment-independent; works with Vertex-backed Claude
Concept: mcp ↗
Vision and multimodal
Image and PDF inputs work identically on Vertex
Concept: vision-multimodal ↗
Where you'll see this in production
Claude for operations
Vertex's IAM, audit, and VPC-SC controls are the operational substrate this scenario depends on for enterprise rollouts
Scenario: claude-for-operations ↗
Structured data extraction
Common Vertex workload: route extraction jobs through Cloud Run / Cloud Functions backed by Claude on Vertex
Scenario: structured-data-extraction ↗
Other course mirrors you may want next
8 questions answered
Phrased the way real students search. Tagged by intent so you can scan to what you actually need.
Comparison: What is the difference between using Claude through the Anthropic API and through Google Vertex AI?
Vertex drops `ANTHROPIC_API_KEY` entirely, requires `AnthropicVertex(region, project_id)` instead of `Anthropic()`, uses `@` in the model id (e.g. `claude-sonnet-4@20250514`), and inherits Google Cloud's IAM, audit, VPC-SC, and compliance controls. Choose Vertex when your stack is on GCP; choose the direct API for simpler auth and fastest access to new features.

How-to: How do I authenticate with Claude on Vertex AI?
Run `gcloud init` and `gcloud auth login`, set your project with `gcloud config set project YOUR_PROJECT_ID`, then run `gcloud auth application-default login`. The AnthropicVertex SDK picks up Application Default Credentials automatically. There is no API key; auth is bound to a Google Cloud identity (your user account in dev, a service account with `roles/aiplatform.user` in production).

Troubleshoot: Why does my Vertex AI request return a model-not-found error when the model exists?
The model is most likely not hosted in the region you requested; new models tend to launch in `us-east5` or `us-central1` first. Check the Model Garden listing for your target region before you commit. The fix is usually to switch to `region="global"`, which routes to the nearest available region, unless data residency dictates a specific region. Also confirm the model id format; Vertex uses `claude-sonnet-4@20250514` with an `@`, not a trailing dash.

Scope: Does prompt caching work on Claude through Vertex AI?
Yes. Prompt caching works on Vertex with the same TTL, breakpoint, and minimum-size rules as the direct API, and cache hits get the same pricing multipliers. Cache stable prefixes such as the system prompt or RAG context for the biggest savings.

Comparison: Should I use Vertex AI or the direct Anthropic API for my production deployment?
Pick Vertex when your stack is already on Google Cloud: auth, billing, IAM, audit, and data residency all consolidate, and enterprise controls like VPC Service Controls and CMEK apply without extra work. Pick the direct API when you want the fastest access to new models and features, simpler key-based auth, and no cloud lock-in. Application code is roughly 95% portable either way, so the choice is mostly organizational.

How-to: How do I install the right Anthropic SDK for Vertex AI in Python?
Run `pip install "anthropic[vertex]"`; the `[vertex]` extras pull in the Google Auth dependencies needed to connect to Vertex. Then import `AnthropicVertex` (not `Anthropic`) and instantiate it with `region` and `project_id`. The same `messages.create` API works on both clients, so application code below the constructor line is unchanged.

How-to: What IAM permissions does my service account need to call Claude on Vertex?
At minimum, `roles/aiplatform.user` on the project. For production, scope tightly: grant only that role on only the project hosting your AI workloads, and prefer Workload Identity Federation over long-lived service-account keys if your code runs outside GCP. Auth is bound to identity, so any IAM policy you apply to the service account flows through to your Claude calls, including organizational policies on which regions are allowed and which models are enabled.