
Artifact Registry and Container Registry

Professional Cloud Developer deep dive into Artifact Registry: repository modes, formats, CMEK, IAM, VPC-SC, vulnerability scanning, Binary Authorization, cleanup policies, and replication.

Introduction to Artifact Registry

Artifact Registry is Google Cloud's universal, fully managed artifact storage service. It is the strategic successor to Container Registry (GCR) and serves as the single repository for container images (Docker / OCI) plus language packages (Maven, npm, PyPI, Go, NuGet) and OS packages (APT, YUM). It is deeply integrated with Cloud Build, Cloud Run, GKE, Cloud Deploy, Binary Authorization, and Container Analysis.

For the Professional Cloud Developer (PCD) exam, Artifact Registry shows up in pipeline-design questions, IAM-permission scenarios, supply-chain-security trade-offs, and CMEK/VPC-SC compliance questions. The recurring decision points are: which repository mode to pick, which format to choose, which IAM role is sufficient, and when to enable vulnerability scanning + Binary Authorization.

The canonical hostname pattern is LOCATION-FORMAT.pkg.dev/PROJECT/REPOSITORY/PATH — for example asia-east1-docker.pkg.dev/my-project/web-images/api:v1.2.3. Remembering this URL shape alone solves a surprising number of exam traps.
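
To make the URL shape concrete, here is a minimal Python sketch that splits a reference into its parts. The names `ArtifactRef` and `parse_artifact_url` are hypothetical helpers for illustration and are not part of any Google SDK.

```python
from typing import NamedTuple


class ArtifactRef(NamedTuple):
    location: str
    fmt: str
    project: str
    repository: str
    path: str


def parse_artifact_url(url: str) -> ArtifactRef:
    """Split LOCATION-FORMAT.pkg.dev/PROJECT/REPOSITORY/PATH into its parts."""
    host, _, rest = url.partition("/")
    suffix = ".pkg.dev"
    if not host.endswith(suffix):
        raise ValueError(f"not a pkg.dev hostname: {host!r}")
    # Locations such as asia-east1 contain hyphens themselves, so the format
    # is whatever follows the LAST hyphen of the host prefix.
    location, sep, fmt = host[: -len(suffix)].rpartition("-")
    if not sep or not location or not fmt:
        raise ValueError(f"expected LOCATION-FORMAT.pkg.dev, got {host!r}")
    parts = rest.split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("expected PROJECT/REPOSITORY after the hostname")
    path = parts[2] if len(parts) == 3 else ""
    return ArtifactRef(location, fmt, parts[0], parts[1], path)


ref = parse_artifact_url("asia-east1-docker.pkg.dev/my-project/web-images/api:v1.2.3")
print(ref.location, ref.fmt, ref.project, ref.repository, ref.path)
```

Splitting on the last hyphen is the one design choice worth noting: it keeps both regional (asia-east1) and multi-region (us) locations intact.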


Plain English Explanation

Before we touch IAM bindings and CMEK keys, three real-world analogies make the moving parts click.

Analogy 1 — The Bonded Warehouse with Multiple Sections

If your source code is the blueprint and Cloud Build is the assembly line, Artifact Registry is a bonded warehouse at the port. The warehouse has clearly labelled sections — one for shipping containers (Docker images), one for car parts (Maven JARs), one for chemical drums (npm tarballs), one for groceries (APT debs). Each section has its own loading dock (format-specific protocol) and its own security guard (IAM role on the repository). A customs officer (Container Analysis) inspects every container before it leaves, and a border patrol officer (Binary Authorization) refuses to let any container off the dock unless it carries the right paperwork (attestation). The warehouse manager (Cleanup Policy) periodically throws out crates older than 90 days that nobody touched.

Analogy 2 — The Private App Store with a Mirror Mode

Picture a corporate "App Store" that has three counter modes. Standard mode is the in-house counter where employees publish their own apps. Remote mode is a mirror counter that re-sells apps from a public store (Docker Hub, Maven Central, PyPI), but caches them locally so the office Wi-Fi doesn't get hammered when 50 engineers download the same version. Virtual mode is a help desk that presents one URL but transparently forwards requests to the right counter behind the scenes — "give me pandas, I don't care if it came from our internal counter or from PyPI's mirror." Developers only learn one address; the store handles the routing.

Analogy 3 — The Bank Vault with Customer-Held Keys

Imagine a bank vault where every safe-deposit box is encrypted, but customers can either let the bank manage the master key (the default Google-managed encryption) or bring their own key from a personal safe at home (CMEK via Cloud KMS). If the customer revokes the key, the bank physically cannot open the box anymore — not even with a court order. This is what CMEK on Artifact Registry does for regulated industries: the moment your KMS key is disabled, all docker pull operations fail with a PERMISSION_DENIED from the KMS layer, even for the project owner.


Artifact Registry vs Container Registry — The GCR Deprecation Story

Container Registry (gcr.io, us.gcr.io, eu.gcr.io, asia.gcr.io) was Google's first-generation container store, built on top of a Cloud Storage bucket named artifacts.PROJECT.appspot.com. Artifact Registry replaces it. On 15 May 2024, Google froze new GCR repository creation, and existing gcr.io traffic is now transparently redirected to Artifact Registry repositories with the same name.

Why the Replacement Was Necessary

  • Single-format limitation. GCR only stored Docker images. Multi-language teams had to manage GCR + a third-party Nexus/Artifactory + a GCS bucket for *.tar.gz. Artifact Registry collapses all three into one product.
  • Project-level IAM only. GCR inherited storage permissions from the underlying GCS bucket — you couldn't grant "push to repository A only." Artifact Registry has repository-level IAM.
  • No regional control. GCR had four multi-region buckets. Artifact Registry supports all GCP regions and multi-regions, with predictable hostnames per region.
  • No remote/virtual modes, no CMEK on early GCR, no native cleanup policies.

gcr.io Redirect — Effective 2024, Google routes gcr.io/PROJECT/IMAGE to an automatically-created Artifact Registry repo gcr.io in your project. The redirect preserves digests and tags but bills against the Artifact Registry SKU, not the legacy GCS bucket. You enable it via gcloud artifacts settings enable-upgrade-redirection --project=PROJECT.

Migration Mechanics

The gcloud artifacts docker upgrade migrate command copies images from gcr.io to a new *-docker.pkg.dev repo, preserves digests, and (optionally) configures redirection so existing docker pull gcr.io/... clients keep working with no code change. For language packages, no migration tool exists — teams republish or front the old store with a remote repository during transition.


Supported Formats — One Service, Ten Artifact Types

Artifact Registry supports ten formats, most with a dedicated hostname suffix (-docker.pkg.dev, -maven.pkg.dev, -npm.pkg.dev, etc.) and a native client (docker, mvn, npm, pip, helm, apt).

Container & OCI Formats

  • Docker — the most-used format; stores OCI-compliant images. Hostname: LOCATION-docker.pkg.dev. Pushed via docker push after gcloud auth configure-docker LOCATION-docker.pkg.dev.
  • Helm (OCI) — Helm 3.8+ charts pushed as OCI artifacts. Hostname: LOCATION-docker.pkg.dev (Helm rides the Docker hostname; it is not a separate *-helm.pkg.dev).
  • Kubeflow Pipelines — Stores compiled ML pipeline templates as a dedicated format used by Vertex AI Pipelines.

Language Package Formats

  • Maven — Java/Kotlin packages. Hostname: LOCATION-maven.pkg.dev. Authenticate using the artifactregistry-maven-wagon extension, declared in pom.xml or .mvn/extensions.xml.
  • npm — Node.js packages. Hostname: LOCATION-npm.pkg.dev. Uses an .npmrc with a Google Cloud-issued access token.
  • PyPI (Python) — Hostname: LOCATION-python.pkg.dev. Uses keyring + keyrings.google-artifactregistry-auth.
  • Go modules — Hostname: LOCATION-go.pkg.dev. Authenticated via gcloud auth print-access-token as a Go module proxy.
  • NuGet — .NET packages on LOCATION-nuget.pkg.dev.

OS Package Formats

  • APT — Debian/Ubuntu .deb packages, served as a full APT repository (Release, Packages.gz).
  • YUM — RHEL/CentOS/Fedora .rpm packages.

Generic Format

  • Generic — Any binary blob (*.tar.gz, *.zip, firmware images). No protocol semantics — pushed/pulled via gcloud artifacts generic upload/download. Useful for storing Terraform modules, FPGA bitstreams, or signed firmware.

On the PCD exam, if a question says "store both Docker images and Maven packages in one regional repository," the trap answer is "one Artifact Registry repository." You cannot mix formats in a single repo — each repository binds to exactly one format at creation time. The correct answer is "two repositories under the same project and region, sharing the same KMS key and IAM policy template."


Repository Modes — Standard, Remote, Virtual

Every Artifact Registry repository is created with one of three modes, set via --mode=STANDARD-REPOSITORY|REMOTE-REPOSITORY|VIRTUAL-REPOSITORY. The mode is immutable after creation.

Standard Repository

The default. Stores artifacts your team publishes. Pushes go here; pulls return what was pushed. This is what teams need for their own application builds. A standard repo for Docker created in asia-east1 is reachable at asia-east1-docker.pkg.dev/my-project/app-images/....

Remote Repository

A caching proxy for an upstream public registry. Configure once; thereafter every docker pull to the remote repo first checks the local cache, and only on miss fetches from upstream and stores the layer locally. Supported upstreams as of 2024: Docker Hub, Maven Central, npm Registry, PyPI, Debian, Ubuntu, CentOS, and a generic HTTP "custom remote" mode.

Use cases:

  • Build reliability — Docker Hub rate limits free anonymous pulls to 100 per 6 hours per IP. A remote repository serves cached layers from your GCP region and is not subject to Docker Hub rate limits for cached artifacts.
  • Egress cost savings — pulls stay inside the GCP backbone after the first miss.
  • VPC-SC compliance — internal builders never talk to hub.docker.com directly; they only talk to pkg.dev.
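
The cache-then-upstream behaviour behind these use cases can be illustrated with a toy model. This is only a sketch of the idea, not how Artifact Registry is implemented; `PullThroughCache` and `fake_docker_hub` are hypothetical names.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy model of a remote repository: serve from cache, fetch upstream on a miss."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_fetches = 0

    def pull(self, ref: str) -> bytes:
        if ref not in self._cache:
            # Cache miss: the only time the upstream registry is contacted.
            self.upstream_fetches += 1
            self._cache[ref] = self._fetch_upstream(ref)
        return self._cache[ref]


def fake_docker_hub(ref: str) -> bytes:
    """Hypothetical stand-in for the upstream public registry."""
    return f"layers-of-{ref}".encode()


cache = PullThroughCache(fake_docker_hub)
for _ in range(50):  # fifty builds pull the same base image
    cache.pull("library/python:3.12-slim")
print(cache.upstream_fetches)  # prints 1
```

Fifty pulls cost a single upstream request, which is why cached artifacts sidestep upstream rate limits.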

Virtual Repository

A logical front door that aggregates several upstream repositories (standard or remote) under one URL. Resolution order is configurable. Common pattern: a virtual python-all repo combines (1) an internal standard repo with your team's private libraries, (2) a remote PyPI repo. Developers configure pip against one URL, and the virtual repo serves the internal package if it exists, falling back to remote PyPI otherwise — exactly the same behaviour as JFrog Artifactory's "virtual repository" concept.

For supply-chain security, configure the corporate python virtual repo so that the internal standard repo takes precedence over the remote-PyPI repo (virtual repositories express this as a priority on each upstream). If an attacker uploads a malicious package to PyPI with the same name as your internal library, the virtual repo's resolution order ensures developers receive your package, not the attacker's. This pattern mitigates the classic "dependency confusion" attack discovered by Alex Birsan.
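
The resolution rule can be sketched in a few lines of Python. The sketch assumes each upstream carries a numeric priority where the higher value wins. The repository names, package names, and the `resolve` helper are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (priority, repository name, {package name: artifact served})
Upstream = Tuple[int, str, Dict[str, str]]


def resolve(package: str, upstreams: List[Upstream]) -> Optional[Tuple[str, str]]:
    """Return (repository, artifact) from the highest-priority upstream holding the package."""
    for _priority, repo, packages in sorted(upstreams, key=lambda u: u[0], reverse=True):
        if package in packages:
            return repo, packages[package]
    return None


upstreams: List[Upstream] = [
    (10, "pypi-remote", {"acme-billing": "acme-billing 99.0.0 (attacker)", "pandas": "pandas (public)"}),
    (100, "internal-python", {"acme-billing": "acme-billing 1.4.0 (ours)"}),
]

print(resolve("acme-billing", upstreams))  # served by internal-python
print(resolve("pandas", upstreams))        # falls back to pypi-remote
```

Because the internal repository outranks the remote one, a same-named public package is never reached for names the team already publishes.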


IAM Roles — Repository-Level Authorization

Artifact Registry IAM is granted at three levels: organization, project, or repository. Repository-level scoping is the PCD exam's favourite trap.

Core Predefined Roles

  • roles/artifactregistry.reader — Read metadata and pull artifacts. Granted to runtime service accounts on GKE, Cloud Run, Cloud Functions — they only need to pull images.
  • roles/artifactregistry.writer — Reader + push artifacts and create new tags. Granted to Cloud Build service accounts and CI pipelines.
  • roles/artifactregistry.repoAdmin — Writer + delete artifacts, manage tags, manage cleanup policies. Granted sparingly to release managers.
  • roles/artifactregistry.admin — Full control: create/delete repositories, set IAM policies, configure CMEK. Granted at project level to platform teams.
  • roles/artifactregistry.serviceAgent — Internal role for the service agent service-PROJECT_NUMBER@gcp-sa-artifactregistry.iam.gserviceaccount.com that needs to encrypt/decrypt with your CMEK key.

Typical Bindings

  • Cloud Run runtime SA: roles/artifactregistry.reader on the specific repo.
  • Cloud Build worker SA: roles/artifactregistry.writer on the build-output repo.
  • GitHub Actions Workload Identity Federation SA: writer scoped to a per-team repo, not project-wide.
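  • Developer humans: usually reader only — pushes happen via CI, not laptops.

For example, to grant the reader role on a single repository to a runtime service account (SERVICE_ACCOUNT_EMAIL is a placeholder):

gcloud artifacts repositories add-iam-policy-binding web-images \
  --location=asia-east1 \
  --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
  --role="roles/artifactregistry.reader"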
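
As a study aid, the role ladder described above can be turned into a small least-privilege picker. The action labels ("pull", "push", and so on) are hypothetical shorthand for this sketch and are not real IAM permission names.

```python
from typing import List, Set, Tuple

# Each role adds capabilities on top of the roles listed before it.
ROLE_LADDER: List[Tuple[str, Set[str]]] = [
    ("roles/artifactregistry.reader", {"pull"}),
    ("roles/artifactregistry.writer", {"push", "tag"}),
    ("roles/artifactregistry.repoAdmin", {"delete-artifact", "manage-cleanup-policy"}),
    ("roles/artifactregistry.admin", {"create-repository", "delete-repository", "set-iam-policy"}),
]


def least_privileged_role(needed: Set[str]) -> str:
    """Return the first role on the ladder whose cumulative capabilities cover `needed`."""
    granted: Set[str] = set()
    for role, extra in ROLE_LADDER:
        granted |= extra
        if needed <= granted:
            return role
    raise ValueError(f"unknown actions: {sorted(needed - granted)}")


print(least_privileged_role({"pull"}))          # runtime service account
print(least_privileged_role({"pull", "push"}))  # CI pipeline
```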

The Storage IAM Trap

A common PCD exam trap: granting roles/storage.objectViewer to a Cloud Run service so it can pull from gcr.io. After the GCR-to-AR redirect this stops working because the underlying GCS bucket is no longer in the request path. The correct fix is roles/artifactregistry.reader on the redirected repo.


CMEK Encryption — Customer-Managed Keys

By default, Artifact Registry encrypts artifacts at rest with Google-managed keys (no extra cost). For regulated workloads — banking, healthcare, public sector — you bind a Customer-Managed Encryption Key (CMEK) from Cloud KMS to the repository at creation time.

Configuration

gcloud artifacts repositories create regulated-images \
  --location=asia-east1 \
  --repository-format=docker \
  --kms-key=projects/sec-prj/locations/asia-east1/keyRings/ar/cryptoKeys/ar-key

Constraints

  • Region match required. The KMS key must live in the same region (or a global multi-region superset) as the repository. A regional asia-east1 repo cannot use a us-central1 KMS key.
  • Service Agent binding. The Artifact Registry service agent (service-PROJECT_NUMBER@gcp-sa-artifactregistry.iam.gserviceaccount.com) needs roles/cloudkms.cryptoKeyEncrypterDecrypter on the key.
  • Key revocation = pull failure. Disabling the key blocks all pulls within minutes (read failures surface as PERMISSION_DENIED).
  • Immutable assignment. You cannot change the CMEK key on an existing repository — create a new one and migrate.

For CMEK + multi-region repositories, you must use a multi-region KMS key matching the repo's multi-region. A repo in the us multi-region cannot use a regional us-central1 key, even though us-central1 is inside us. This region-alignment trap appears regularly on PCD CMEK scenarios.
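
A pre-flight check for the location rule might look like the sketch below. It assumes the strict reading, where the key's location must equal the repository's location, and does not model any exception. `kms_key_location` and `key_matches_repo` are hypothetical helper names.

```python
import re

_KEY_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<ring>[^/]+)/cryptoKeys/(?P<key>[^/]+)$"
)


def kms_key_location(key_name: str) -> str:
    """Extract the location segment from a Cloud KMS key resource name."""
    match = _KEY_RE.match(key_name)
    if match is None:
        raise ValueError(f"not a KMS key resource name: {key_name!r}")
    return match.group("location")


def key_matches_repo(repo_location: str, key_name: str) -> bool:
    """Strict rule assumed here: key location must equal the repository location."""
    return kms_key_location(key_name).lower() == repo_location.lower()


key = "projects/sec-prj/locations/asia-east1/keyRings/ar/cryptoKeys/ar-key"
print(key_matches_repo("asia-east1", key))  # prints True
print(key_matches_repo("us", "projects/sec-prj/locations/us-central1/keyRings/ar/cryptoKeys/ar-key"))  # prints False
```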


VPC Service Controls and Private Connectivity

For workloads that may not egress to the public internet, Artifact Registry supports VPC Service Controls (VPC-SC) and Private Google Access.

VPC-SC Integration

Artifact Registry is a VPC-SC-supported service. Add it to a service perimeter to deny pulls/pushes from outside the perimeter, even if the caller has valid IAM. Inside the perimeter, builds from on-prem (via Cloud Interconnect + Private Google Access) and from Cloud Build private pools can still pull images, while a laptop on a public network is rejected with a 403 "Request is prohibited by organization's policy" error that carries a vpcServiceControlsUniqueIdentifier.

Private Service Connect Endpoint

For zero-egress topologies, configure a PSC endpoint for pkg.dev. Internal traffic reaches Artifact Registry through a private VIP inside your VPC — no external IP is needed on the workload, satisfying constraints/compute.vmExternalIpAccess org policy.

Cloud Build Private Pools

Cloud Build private pools can be placed inside your VPC and configured with --peered-network, so builders pull base images and push outputs to Artifact Registry over private addressing. This is the recommended pattern for regulated CI.


Container Analysis — Vulnerability Scanning

When you push a Docker image, Container Analysis (part of Artifact Analysis) automatically scans it for CVEs against OS vulnerability feeds (Debian, Ubuntu, Alpine, Red Hat) and language-package feeds (Maven, npm, Go modules, etc.), provided the service is enabled.

Scan Modes

  • On-Push Scanning — runs immediately when an image is pushed. Triggered by enabling the Container Scanning API (containerscanning.googleapis.com). Finding records are written to the Container Analysis service and viewable in the Artifact Registry UI.
  • Continuous Analysis — re-scans images already in the registry whenever the upstream vulnerability database publishes new CVEs, up to 30 days after the last pull. After 30 days of inactivity, an image stops being re-scanned. This means a fast-moving production image stays continuously monitored, while an abandoned test image eventually drops out.
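
The 30-day window can be expressed as a one-line date check. This sketch treats the boundary as inclusive, which is an assumption, and `is_continuously_analyzed` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

RESCAN_WINDOW = timedelta(days=30)


def is_continuously_analyzed(last_pull: datetime, now: datetime) -> bool:
    """True while the image was pulled within the last 30 days (boundary assumed inclusive)."""
    return now - last_pull <= RESCAN_WINDOW


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
print(is_continuously_analyzed(now - timedelta(days=10), now))  # active image: True
print(is_continuously_analyzed(now - timedelta(days=45), now))  # abandoned image: False
```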

Vulnerability Findings API

Every finding is stored as an Occurrence attached to the image's digest, with fields: cveId, severity (CRITICAL/HIGH/MEDIUM/LOW), fixAvailable, affectedPackage, affectedVersion, fixedVersion. Programmatic access is available via gcloud artifacts vulnerabilities list.

Severity SLOs

A common policy pattern: block deployment if any CRITICAL finding exists and fixAvailable=true. This becomes a Binary Authorization rule rather than a manual gate.
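
The gate described above can be sketched as a filter over finding records. The dictionaries below reuse the field names from the findings list (severity, fixAvailable); the CVE identifiers are made-up placeholders, and `blocking_findings` / `may_deploy` are hypothetical helpers.

```python
from typing import Dict, Iterable, List

Finding = Dict[str, object]


def blocking_findings(findings: Iterable[Finding]) -> List[Finding]:
    """Findings that should stop a deploy: CRITICAL severity with a fix available."""
    return [
        f for f in findings
        if f.get("severity") == "CRITICAL" and f.get("fixAvailable") is True
    ]


def may_deploy(findings: Iterable[Finding]) -> bool:
    return not blocking_findings(findings)


findings: List[Finding] = [
    {"cveId": "CVE-EXAMPLE-1", "severity": "CRITICAL", "fixAvailable": True},
    {"cveId": "CVE-EXAMPLE-2", "severity": "CRITICAL", "fixAvailable": False},
    {"cveId": "CVE-EXAMPLE-3", "severity": "HIGH", "fixAvailable": True},
]

print(may_deploy(findings))      # prints False
print(may_deploy(findings[1:]))  # prints True
```

In a pipeline, a check of this kind would decide whether the attestation step runs at all.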

Cheat sheet: Enable containerscanning.googleapis.com → push triggers on-push scan → findings written as Container Analysis Occurrences → continuous analysis re-scans for 30 days post-last-pull → query with gcloud artifacts vulnerabilities list → enforce via Binary Authorization attestor.


Binary Authorization — Deploy-Time Policy Enforcement

Vulnerability scanning is informational; Binary Authorization is enforcement. It is a deploy-time admission controller for GKE, Cloud Run, Anthos, and Cloud Run for Anthos that blocks pods from starting if the image fails policy.

Core Objects

  • Attestor — a named entity (e.g., built-by-cloud-build, vuln-scan-passed) holding a public PGP/PKIX key.
  • Attestation — a signed assertion that a specific image digest passes a check. Stored as a Container Analysis Occurrence.
  • Policy — bound to a GKE cluster or Cloud Run service: "require attestations from attestors A and B for any image, with evaluationMode = REQUIRE_ATTESTATION."

Typical Supply-Chain Chain

  1. Cloud Build builds the image and pushes to Artifact Registry.
  2. The push triggers Container Analysis vulnerability scan.
  3. A Cloud Build step queries vulnerabilities; if zero CRITICAL findings, it calls gcloud beta container binauthz attestations sign-and-create to attach an attestation from the vuln-scan-passed attestor.
  4. The deploy step pushes to GKE. The Binary Authorization admission webhook inspects the image digest, finds the attestation, and allows the pod. If the attestation is missing, the pod is denied and an event surfaces in Cloud Audit Logs.
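
The admission decision in step 4 boils down to a set comparison on the image digest. The toy model below assumes attestations are looked up by digest and that every required attestor must have signed. The digests and the `admit` helper are hypothetical.

```python
from typing import Dict, Set


def admit(image_digest: str, required_attestors: Set[str], attestations: Dict[str, Set[str]]) -> bool:
    """Admit only if every required attestor has signed this exact digest.

    `attestations` maps an image digest to the attestors that signed it.
    No scanning happens here: the check only looks for existing signatures.
    """
    signed_by = attestations.get(image_digest, set())
    return required_attestors <= signed_by


required = {"built-by-cloud-build", "vuln-scan-passed"}
attestations = {
    "sha256:aaa": {"built-by-cloud-build", "vuln-scan-passed"},
    "sha256:bbb": {"built-by-cloud-build"},  # the pipeline skipped the signing step
}

print(admit("sha256:aaa", required, attestations))  # prints True
print(admit("sha256:bbb", required, attestations))  # prints False
print(admit("sha256:ccc", required, attestations))  # prints False
```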

Break-Glass and Continuous Validation

  • Break-glass deployments — add the label image-policy.k8s.io/break-glass: "true" to the pod spec (older setups used the alpha.image-policy.k8s.io/break-glass annotation) to bypass enforcement; the deployment is allowed but logged as a high-severity audit event.
  • Continuous Validation (CV) — re-checks running pods against the latest policy and emits Cloud Logging events if a pod no longer complies (e.g., the attestation was revoked).

The most common PCD trap on Binary Authorization is assuming it scans the image at deploy time. It does not. Binary Authorization is a policy checker; it looks for pre-existing attestations attached to the digest. If your pipeline forgets to sign the image, deployment fails even when the image is perfectly safe. The fix is to add a binauthz sign-and-create step in Cloud Build before the deploy step — not to weaken the policy.


Cleanup Policies — Automated Garbage Collection

Storage costs creep when CI pushes a tagged image on every commit. Artifact Registry supports cleanup policies to automatically delete or dry-run old artifacts.

Policy Types

  • Keep policy — protects matching artifacts from deletion. Example: keep any image whose tag starts with prod- or v (conditions match tag prefixes, not regular expressions).
  • Delete policy — deletes artifacts matching a condition. Example: delete untagged images older than 14 days.

Conditions

  • tagState — TAGGED, UNTAGGED, or ANY.
  • tagPrefixes — match by tag prefix list.
  • olderThan (version age) — 30d, 90d, etc.; only match versions older than the given duration.
  • packageNamePrefixes — match by package name.
  • mostRecentVersions.keepCount — keep the N most recent versions of each package.

Common Recipe

gcloud artifacts repositories set-cleanup-policies web-images \
  --location=asia-east1 \
  --policy=cleanup-policy.json

A canonical policy keeps the latest 10 tagged versions of every package and deletes any untagged version older than 7 days. Run with --dry-run for 30 days first to confirm no production tag gets caught.
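
A policy file along these lines could be generated with a short script. The field names (action.type, mostRecentVersions.keepCount, condition.tagState, and the age condition written as olderThan) reflect my understanding of the policy-file schema and should be checked against the current documentation. The policy names and `build_cleanup_policies` are hypothetical, and the keep rule here keeps the most recent versions regardless of tag.

```python
import json
from typing import Any, Dict, List


def build_cleanup_policies(keep_count: int = 10, untagged_max_age_days: int = 7) -> List[Dict[str, Any]]:
    """Build a keep rule plus a delete rule as plain dictionaries."""
    return [
        {
            "name": "keep-most-recent",
            "action": {"type": "Keep"},
            "mostRecentVersions": {"keepCount": keep_count},
        },
        {
            "name": "delete-stale-untagged",
            "action": {"type": "Delete"},
            "condition": {
                "tagState": "untagged",
                "olderThan": f"{untagged_max_age_days}d",  # assumed field name
            },
        },
    ]


# Print the JSON; redirect it into cleanup-policy.json to use it with the command above.
print(json.dumps(build_cleanup_policies(), indent=2))
```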

Cost Impact

A team pushing two images per merge across 50 microservices accumulates several TB of layers per quarter; cleanup policies typically reclaim 60-80% of that storage. Storage is billed at $0.10/GB-month for standard repos.
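
As a rough worked example of that arithmetic, the sketch below assumes 3,000 GB of accumulated layers (a hypothetical figure standing in for "several TB") and the $0.10/GB-month rate quoted above.

```python
from typing import Tuple


def monthly_storage_cost(stored_gb: float, rate_per_gb_month: float = 0.10) -> float:
    """Monthly storage bill in dollars for the given volume."""
    return stored_gb * rate_per_gb_month


def monthly_savings_range(
    stored_gb: float,
    reclaim_low: float = 0.60,
    reclaim_high: float = 0.80,
    rate_per_gb_month: float = 0.10,
) -> Tuple[float, float]:
    """Savings per month if cleanup reclaims between 60% and 80% of storage."""
    base = monthly_storage_cost(stored_gb, rate_per_gb_month)
    return base * reclaim_low, base * reclaim_high


stored_gb = 3000.0  # hypothetical volume
low, high = monthly_savings_range(stored_gb)
print(f"bill without cleanup: ${monthly_storage_cost(stored_gb):.2f}/month")
print(f"savings with cleanup: ${low:.2f} to ${high:.2f}/month")
```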


Replication — Multi-Region for Disaster Recovery

Artifact Registry supports multi-region repositories for built-in geo-replication.

Regional vs Multi-Region

  • Regional repository (e.g., asia-east1) — single-region storage, single-region availability SLA.
  • Multi-region repository (e.g., us, eu, asia) — automatically replicates artifacts across at least two regions in the multi-region group. Higher durability and broader read locality at higher cost.

Cross-Region Copy Pattern

For workloads that need active presence in multiple specific regions (not one multi-region grouping), the recommended pattern is one regional repo per region, with a Cloud Build step that copies new images with a registry-to-registry tool such as crane copy (or gcrane cp) immediately after the primary push. This pattern gives precise control over which images replicate where (e.g., only production tags, not commit-sha tags).

Cross-Project Replication

There is no first-party "Artifact Registry replication" service across projects. The supported pattern is: grant roles/artifactregistry.reader to the destination project's CI SA on the source repo, and use crane copy (or gcrane cp) to mirror images. Tools like the OSS gcr-mirror can automate this with retention rules.

On the PCD exam, when a question describes "fail over between us-central1 and us-east1 with RTO < 5 minutes," the correct architecture is one multi-region us repository, not "two regional repos with a Cloud Function trigger to copy on push." Multi-region replication is automatic and faster than any custom copier.


Performance, Cost, and Quotas

Pricing Dimensions

  • Storage: $0.10/GB-month for both standard and remote repositories; multi-region storage is higher.
  • Network egress: standard GCP egress rates apply when pulls leave a region. Pulls within the same region from GCE/GKE/Cloud Run are free.
  • Container Analysis: charged per scanned image (first scan + continuous analysis).
  • Remote repository upstream fetches: free for the public mirrors; you pay for the cached storage.

Quotas

  • Repository count: 1000 per project per location (raisable).
  • Artifact size: up to 10 GiB per uploaded layer.
  • Tag count: no hard cap, but cleanup policies become essential past 10K tags per package.

Latency

Same-region pulls from Cloud Run or GKE typically complete in milliseconds for cached layers and seconds for large new layers. Cross-region pulls add inter-region latency and egress cost — another argument for multi-region or regional-mirror patterns.


FAQ — Artifact Registry

Q1. Should I still use gcr.io for new projects?

No. Container Registry is deprecated; create new repositories in Artifact Registry under the *-docker.pkg.dev hostname. Existing gcr.io references continue to work via redirect, but new tooling — CMEK, cleanup policies, remote repositories — only exists in Artifact Registry.

Q2. How do I authenticate Docker to push images?

Run gcloud auth configure-docker LOCATION-docker.pkg.dev once on the workstation or CI runner; this adds a Google Cloud credential helper to ~/.docker/config.json. In Cloud Build the credential helper is automatic.

Q3. Can one repository store both Docker and Maven artifacts?

No. A repository is bound to exactly one format at creation. Most teams create one repository per format per region (e.g., web-images for Docker, java-libs for Maven).

Q4. What is the difference between roles/artifactregistry.reader and roles/storage.objectViewer?

After the GCR-to-AR redirect, only roles/artifactregistry.reader works. The legacy roles/storage.objectViewer on the artifacts.*.appspot.com bucket no longer authorizes pulls because the bucket is not in the request path.

Q5. How do I block deployment of vulnerable images?

Enable Container Analysis (containerscanning.googleapis.com) to scan on push, then configure Binary Authorization with an attestor that signs only when zero CRITICAL findings exist. The deploy target (GKE, Cloud Run) refuses any image without the attestation.

Q6. How does CMEK affect pull latency?

Negligibly. Decryption happens at the storage layer with hardware-backed KMS keys; the added latency is sub-millisecond. The real risk is key revocation, which immediately blocks all pulls.

Q7. Can I mirror Docker Hub to avoid rate limits?

Yes. Create a remote repository with --mode=remote-repository and --remote-docker-repo=DOCKER-HUB. All subsequent pulls go through your Artifact Registry, cached after the first miss; Docker Hub rate limits no longer apply to cached layers.


Final PCD Tip

On the PCD exam, when you see keywords like "private package store," "vulnerability scanning," "signed image attestation," "prevent dependency confusion," or "CMEK + private connectivity for container images," the answer is almost always a combination of Artifact Registry + Container Analysis + Binary Authorization + VPC-SC. Memorize the four-piece chain and the exam questions practically answer themselves.
