IAM for Application Identities

Deep dive into GCP IAM for applications: ADC, metadata server flow, service account impersonation, Workload Identity, Workload Identity Federation, IAM Conditions, and Deny policies.

Introduction to IAM and App Identities

Identity and Access Management (IAM) is the security backbone of Google Cloud, and for the Professional Cloud Developer exam it is one of the most heavily weighted scenario domains. Where a human accesses GCP through a Google user identity (an email address such as a Workspace or Cloud Identity account), an application accesses GCP through a service account (<name>@<project>.iam.gserviceaccount.com) or, increasingly, through a federated identity that exchanges an external OIDC/SAML token for a short-lived Google access token. Modern GCP design pushes hard away from long-lived JSON key files (type: service_account keys) and toward short-lived credentials minted by the IAM Credentials API and Security Token Service (STS). Understanding that pipeline (how a Compute Engine VM, a GKE pod, or a GitHub Actions runner obtains a token, what role lets it do so, and which IAM condition can block it) is the centre of this topic.

This study note expands the original stub to cover the full app-identity lifecycle: Application Default Credentials (ADC) resolution, the GCE metadata server flow that powers VM/GKE/Cloud Run identity, service account impersonation with roles/iam.serviceAccountTokenCreator, GKE Workload Identity (the GSA↔KSA binding), Workload Identity Federation for AWS / Azure / generic OIDC providers including GitHub Actions, IAM Conditions written in CEL, lateral movement prevention, the iam.disableServiceAccountKeyCreation organization policy, IAM Deny policies, Resource Manager Tags combined with IAM Conditions, and short-lived credentials via STS. Concrete gcloud commands, API field names, and exam traps are included in every section.

Application Default Credentials (ADC) Resolution

How client libraries find credentials

google-auth libraries (Go, Python, Node, Java) all implement the same ADC search order. They check, in this exact sequence: (1) the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to a JSON file; (2) the well-known gcloud user credentials at ~/.config/gcloud/application_default_credentials.json (created by gcloud auth application-default login); (3) the attached service account on a Google Cloud compute resource via the metadata server. The library stops at the first match. This is why a developer who set GOOGLE_APPLICATION_CREDENTIALS=/tmp/key.json locally and then deploys to Cloud Run is surprised when production uses the Cloud Run service identity — the env var is not set in the container, so ADC falls through to the metadata server.
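The search order above can be modelled in a few lines of Python. This is only a sketch of the decision logic, not the google-auth implementation: the function name resolve_adc_source, its parameters, and the returned labels are hypothetical and chosen for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_adc_source(
    env: Mapping[str, str],
    home: Path,
    on_google_compute: bool,
) -> Optional[str]:
    """Return a label for the ADC source that would win (hypothetical labels)."""
    # 1. Explicit credential file named by the environment variable.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "env:GOOGLE_APPLICATION_CREDENTIALS"
    # 2. User credentials written by `gcloud auth application-default login`.
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    if well_known.is_file():
        return "gcloud-user-credentials"
    # 3. Attached service account served by the metadata server.
    if on_google_compute:
        return "metadata-server"
    # Nothing found: a real client library would raise an error here.
    return None


if __name__ == "__main__":
    # On a laptop this prints whichever local source is configured, or None.
    print(resolve_adc_source(os.environ, Path.home(), on_google_compute=False))
```

In application code the whole search is normally a single call to google.auth.default() from the google-auth package; the sketch only makes the fall-through behaviour visible.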

Local dev vs production parity

For local development the recommended pattern is gcloud auth application-default login --impersonate-service-account=<sa>@<proj>.iam.gserviceaccount.com. This lets the developer prove who they are with their human Google identity, then impersonate the deployment's runtime service account so that local code receives exactly the same scopes and IAM bindings as production. The alternative — downloading a JSON key for the production SA — is what iam.disableServiceAccountKeyCreation and IAM Deny policies are designed to stop.

Application Default Credentials (ADC): The Google client library convention that picks credentials from GOOGLE_APPLICATION_CREDENTIALS, then ~/.config/gcloud/application_default_credentials.json, then the GCE metadata server at 169.254.169.254, in that order, returning the first set found.

The Metadata Server Flow on GCE, GKE, and Cloud Run

The 169.254.169.254 endpoint

Every Google-managed compute resource (GCE VM, GKE node, Cloud Run revision, Cloud Functions instance, App Engine instance) reaches its identity by calling the metadata server at http://metadata.google.internal/ (which resolves to the link-local address 169.254.169.254). The two key paths are /computeMetadata/v1/instance/service-accounts/default/token (returns a short-lived OAuth2 access token) and /computeMetadata/v1/instance/service-accounts/default/identity?audience=... (returns a signed Google OIDC ID token). All requests must include the header Metadata-Flavor: Google; because a naively forwarded request will not carry that header, this blocks trivial SSRF attacks that try to reach the endpoint through a vulnerable application.
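As a rough illustration, the two metadata requests can be constructed with the Python standard library. The helper names below are hypothetical, and the sketch only builds the requests; actually sending them works only from inside a Google Cloud compute resource.

```python
import urllib.parse
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"
REQUIRED_HEADERS = {"Metadata-Flavor": "Google"}


def build_token_request(account: str = "default") -> urllib.request.Request:
    """Request for a short-lived OAuth2 access token."""
    url = f"{METADATA_ROOT}/instance/service-accounts/{account}/token"
    return urllib.request.Request(url, headers=REQUIRED_HEADERS)


def build_identity_request(audience: str, account: str = "default") -> urllib.request.Request:
    """Request for a signed OIDC ID token for the given audience."""
    query = urllib.parse.urlencode({"audience": audience})
    url = f"{METADATA_ROOT}/instance/service-accounts/{account}/identity?{query}"
    return urllib.request.Request(url, headers=REQUIRED_HEADERS)


if __name__ == "__main__":
    req = build_token_request()
    # urllib normalises header names to "Metadata-flavor"; HTTP header
    # names are case-insensitive, so the server accepts it.
    print(req.full_url, dict(req.header_items()))
```

On a VM or Cloud Run instance, passing the first request to urllib.request.urlopen returns a JSON document containing access_token, expires_in, and token_type.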

Per-runtime differences

On a GCE VM you assign the service account at instance creation with --service-account=<sa>@... plus --scopes=cloud-platform; the access scopes act as an extra filter on top of IAM. On Cloud Run the attached identity is set via --service-account= on the service or revision, and access scopes are implicitly cloud-platform. On GKE the node's default behaviour is to expose the node service account through the metadata server, which is why GKE Workload Identity replaces that with a per-pod identity served by gke-metadata-server. On Cloud Functions Gen2 the underlying Cloud Run service identity is what the function sees.

Tokens from the metadata server are cached by the client library and are valid for up to 3600 seconds. Never write code that re-fetches the token on every request; the library handles refresh ~5 minutes before expiry. Tightly polling /token from inside a hot loop is a classic anti-pattern that shows up in exam traps.
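The caching behaviour described above can be sketched as follows. This is a simplified model, not the client library's code: the class name CachedToken, the injected fetch callable, and the fixed 300-second skew are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, Tuple

REFRESH_SKEW_SECONDS = 300  # refresh roughly 5 minutes before expiry


class CachedToken:
    """Holds one token and re-fetches it only when it is close to expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # returns (token, lifetime_seconds)
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Fetch on first use, or once we are inside the refresh window.
        if self._token is None or now >= self._expires_at - REFRESH_SKEW_SECONDS:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Every request handler calls get(); the metadata server is contacted roughly once per hour instead of once per request.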

Service Account Impersonation and Token Creator

The IAM Credentials API

Impersonation is implemented by the IAM Credentials API (iamcredentials.googleapis.com), which exposes methods such as generateAccessToken, generateIdToken, signBlob, and signJwt. To call any of these against a target service account, the caller principal must hold roles/iam.serviceAccountTokenCreator, ideally granted on the target service account resource rather than on the project. This is the single most important impersonation role to memorise; its key underlying permission is iam.serviceAccounts.getAccessToken.

gcloud flags

The flag --impersonate-service-account=<target-sa> works on virtually every gcloud command, and Terraform's google provider offers an equivalent impersonate_service_account setting. Behind the scenes, gcloud authenticates as the caller, then calls generateAccessToken to mint a 1-hour token for the target SA, and uses that token for the actual API call. The lifetime field on generateAccessToken accepts up to 3600 seconds by default and up to 43,200 seconds (12 hours) only if the organization policy constraint constraints/iam.allowServiceAccountCredentialLifetimeExtension lists the SA.

Impersonation chains

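You can chain impersonation through the delegates field of generateAccessToken (gcloud accepts the chain as a comma-separated list in --impersonate-service-account, with the final entry as the target): principal A → SA B → SA C. Each link in the chain needs Token Creator on the next. This is how break-glass and just-in-time elevation are implemented without anyone ever holding the destination role permanently.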
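To make the request shape concrete, here is a minimal sketch that builds the URL and JSON body for a generateAccessToken call, including an optional delegation chain. The helper names and the example e-mail addresses are hypothetical; the sketch does not send the request or attach the caller's own bearer token.

```python
import json
from typing import Dict, Sequence, Tuple

IAMCREDENTIALS = "https://iamcredentials.googleapis.com/v1"
MAX_EXTENDED_LIFETIME = 43200  # 12 hours, only with the org policy extension


def sa_resource(email: str) -> str:
    """Resource name of a service account, using the '-' project wildcard."""
    return f"projects/-/serviceAccounts/{email}"


def build_generate_access_token(
    target_sa: str,
    delegates: Sequence[str] = (),
    lifetime_seconds: int = 3600,
    scopes: Sequence[str] = ("https://www.googleapis.com/auth/cloud-platform",),
) -> Tuple[str, Dict[str, object]]:
    if not 0 < lifetime_seconds <= MAX_EXTENDED_LIFETIME:
        raise ValueError("lifetime must be between 1 and 43200 seconds")
    url = f"{IAMCREDENTIALS}/{sa_resource(target_sa)}:generateAccessToken"
    body: Dict[str, object] = {
        "scope": list(scopes),
        "lifetime": f"{lifetime_seconds}s",
    }
    if delegates:
        # Intermediate service accounts, in order, between caller and target.
        body["delegates"] = [sa_resource(d) for d in delegates]
    return url, body


if __name__ == "__main__":
    url, body = build_generate_access_token(
        "sa-c@example-project.iam.gserviceaccount.com",
        delegates=["sa-b@example-project.iam.gserviceaccount.com"],
    )
    print(url)
    print(json.dumps(body, indent=2))
```

The caller would POST this body with its own access token in the Authorization header; the response carries the impersonated token in accessToken and its expiry in expireTime.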

Replace every gcloud iam service-accounts keys create script in your CI with --impersonate-service-account. The CI runner authenticates via Workload Identity Federation, then impersonates the deployment SA — zero JSON keys leave Google Cloud, and the audit log on the target SA shows exactly who impersonated whom and when.

GKE Workload Identity

GSA ↔ KSA binding

GKE Workload Identity is the supported way to give pods a unique Google service account identity without mounting JSON keys. The mechanism is a three-part setup that links a Kubernetes ServiceAccount (KSA) to a Google service account (GSA): (1) on the cluster you enable Workload Identity (--workload-pool=PROJECT_ID.svc.id.goog) and, per node pool, the GKE_METADATA workload metadata mode; (2) in IAM you grant the Kubernetes ServiceAccount principal serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME] the role roles/iam.workloadIdentityUser on the target GSA; (3) on the KSA you set the annotation iam.gke.io/gcp-service-account=<gsa-name>@<project>.iam.gserviceaccount.com. Pods using that KSA now receive tokens for the GSA via the in-cluster gke-metadata-server proxy.
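The two strings that most often get mistyped in this setup are the IAM member and the KSA annotation. The sketch below builds both from their parts; the helper names are hypothetical, and the gcloud argument list is only assembled, not executed.

```python
from typing import Dict, List


def workload_identity_member(project_id: str, namespace: str, ksa: str) -> str:
    """IAM member string for a Kubernetes ServiceAccount in the workload pool."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa}]"


def ksa_annotation(gsa_email: str) -> Dict[str, str]:
    """Annotation to place on the KSA so pods resolve to the GSA."""
    return {"iam.gke.io/gcp-service-account": gsa_email}


def binding_command(gsa_email: str, project_id: str, namespace: str, ksa: str) -> List[str]:
    """Argument list for granting workloadIdentityUser on the GSA."""
    return [
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding", gsa_email,
        "--role=roles/iam.workloadIdentityUser",
        f"--member={workload_identity_member(project_id, namespace, ksa)}",
    ]


if __name__ == "__main__":
    gsa = "app-gsa@example-project.iam.gserviceaccount.com"  # hypothetical
    print(" ".join(binding_command(gsa, "example-project", "prod", "app-ksa")))
    print(ksa_annotation(gsa))
```

Generating both values from the same namespace and KSA name makes it harder for the binding and the annotation to drift apart.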

Why the node SA disappears

Before Workload Identity, every pod on a node could read the node's Compute Engine default service account from the metadata server. With Workload Identity enabled on a node pool, requests to 169.254.169.254 are intercepted: pods see only the GSA bound to their KSA, and the underlying Compute Engine node SA is no longer reachable from inside pods. This eliminates a huge category of GKE privilege escalation paths.

Common failure modes

A 403 in GKE almost always traces to one of three things: the KSA annotation is missing or typo'd; the workloadIdentityUser binding lists the wrong [namespace/ksa]; or the node pool was created without --workload-metadata=GKE_METADATA. Use gcloud container node-pools describe to verify the workload metadata mode.

The deprecated Metadata Concealment feature (--workload-metadata-from-node=SECURE) is not the same as Workload Identity. Concealment only hides sensitive node metadata (such as kube-env and the VM instance identity token) from pods; pods still inherit the node SA. Only Workload Identity provides per-pod GSA identity. Any exam answer that fixes a GKE permission problem with metadata concealment is wrong.

Workload Identity Federation for AWS, Azure, and OIDC

Workload Identity Pools and Providers

Workload Identity Federation (WIF) lets external workloads (AWS roles, Azure managed identities, on-prem OIDC providers, GitHub Actions, GitLab, Terraform Cloud, CircleCI) call Google APIs without service account keys. You create a Workload Identity Pool (gcloud iam workload-identity-pools create) and inside it a Provider of type AWS, OIDC, or SAML. The Provider declares --issuer-uri, allowed audiences, and attribute mappings (google.subject = assertion.sub) and an attribute condition in CEL (e.g. assertion.repository == 'org/repo').

Token exchange via STS

The external workload presents its native token (an AWS GetCallerIdentity request, an Azure AD JWT, a GitHub Actions OIDC JWT) to Google's Security Token Service at sts.googleapis.com/v1/token with grant_type=urn:ietf:params:oauth:grant-type:token-exchange. STS validates the token against the provider config, applies attribute mapping, and returns a federated access token. That federated token is then used to impersonate a real GSA via generateAccessToken — so the external workload still needs roles/iam.workloadIdentityUser on the target GSA, scoped to the federated principal principal://iam.googleapis.com/projects/.../locations/global/workloadIdentityPools/POOL/subject/SUBJECT.
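The first leg of that exchange can be sketched as a form body for the STS token endpoint. The helper names and the placeholder pool and provider IDs are hypothetical; the sketch assumes an OIDC (JWT) subject token and only builds the payload without sending it.

```python
from typing import Dict
from urllib.parse import urlencode

STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def provider_audience(project_number: str, pool: str, provider: str) -> str:
    """Audience that identifies the workload identity pool provider."""
    return (
        f"//iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool}/providers/{provider}"
    )


def build_sts_exchange(
    subject_token: str,
    audience: str,
    subject_token_type: str = "urn:ietf:params:oauth:token-type:jwt",
) -> Dict[str, str]:
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": subject_token_type,
        "subject_token": subject_token,  # the external workload's native token
    }


if __name__ == "__main__":
    aud = provider_audience("123456789", "ci-pool", "github")  # hypothetical IDs
    body = build_sts_exchange("EXTERNAL.OIDC.JWT", aud)
    print(STS_TOKEN_URL)
    print(urlencode(body))
```

The federated token returned by STS is then presented as the bearer token on the generateAccessToken call for the target GSA, which is the second leg described above.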

Per-provider notes

For AWS, the Provider is created with --account-id and federates from any IAM role/user in that account; the attribute condition typically pins assertion.arn. For Azure, the provider takes --issuer-uri=https://sts.windows.net/<tenant>/ and the attribute condition pins assertion.aud and the managed identity's sub. For generic OIDC the issuer URI must be reachable and serve a JWKS document.

Workload Identity Federation eliminates the need to download a JSON service account key for any external CI/CD or cross-cloud workload. This is the single biggest security win the PCD exam expects you to recommend whenever a question shows a JSON key stored in Jenkins, GitHub Secrets, or CircleCI environment variables.

GitHub Actions OIDC to GCP

The end-to-end wire-up

GitHub Actions issues an OIDC JWT to every job (when the job has permissions: id-token: write). The token's iss is https://token.actions.githubusercontent.com, sub looks like repo:org/repo:ref:refs/heads/main, and aud is whatever audience the job requests (by default, the URL of the repository owner). To wire this to GCP: create a WIF pool, add an OIDC provider with --issuer-uri=https://token.actions.githubusercontent.com, set --attribute-mapping=google.subject=assertion.sub,attribute.repository=assertion.repository, and set --attribute-condition=assertion.repository=='myorg/myrepo'. Then bind roles/iam.workloadIdentityUser on the target GSA to the principalSet principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL/attribute.repository/myorg/myrepo.

The workflow step

The official google-github-actions/auth@v2 action handles the STS exchange. You give it workload_identity_provider (the full resource name) and service_account (the GSA to impersonate). The action sets GOOGLE_APPLICATION_CREDENTIALS to a short-lived credential file backed by the federated token, so every subsequent gcloud and SDK call inside the job authenticates as the GSA.

Pinning to branches and environments

The exam loves this: a generic repo:org/repo:* subject lets any branch (including a malicious PR branch) federate. Always narrow the attribute condition to assertion.ref == 'refs/heads/main' for production deploys, or use GitHub Environments and pin assertion.environment == 'prod'. An owner-only rule such as attribute.repository_owner == 'myorg' is a common misconfiguration, because it trusts every repository in the organization, including ones an attacker can create or modify.
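The effect of pinning both repository and ref can be illustrated with a plain Python stand-in for the CEL attribute condition. The real check runs inside Google's STS against the provider configuration; the function names and the sample claims below are hypothetical.

```python
from typing import Mapping


def allow_github_claims(
    claims: Mapping[str, str],
    repository: str,
    ref: str = "refs/heads/main",
) -> bool:
    """Python equivalent of:
    assertion.repository == '<repository>' && assertion.ref == '<ref>'
    """
    return claims.get("repository") == repository and claims.get("ref") == ref


def repository_principal_set(project_number: str, pool: str, repository: str) -> str:
    """principalSet that matches every identity mapped to one repository."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool}/attribute.repository/{repository}"
    )


if __name__ == "__main__":
    main_branch = {"repository": "myorg/myrepo", "ref": "refs/heads/main"}
    pr_branch = {"repository": "myorg/myrepo", "ref": "refs/pull/42/merge"}
    print(allow_github_claims(main_branch, "myorg/myrepo"))  # allowed
    print(allow_github_claims(pr_branch, "myorg/myrepo"))    # rejected
```

A condition that checks only the repository would accept both sets of claims, which is exactly the gap the ref or environment pin closes.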

Lateral Movement Prevention

The impersonation graph problem

Every serviceAccountTokenCreator binding is an edge in a directed graph: principal A can become SA B. If A can become B, and B has Token Creator on C, then A transitively can become C. Attackers who compromise a low-privilege identity often walk this graph. The defence is hygiene: never grant Token Creator at the project level (roles/iam.serviceAccountTokenCreator on a project gives the principal impersonation rights over every SA in that project), always grant it on the specific target SA, and audit the graph using Policy Analyzer (gcloud asset analyze-iam-policy) with --permissions=iam.serviceAccounts.getAccessToken.
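The transitive walk described above is an ordinary graph reachability problem. The sketch below assumes the Token Creator bindings have already been exported into a dictionary of edges (for example from Policy Analyzer output); the function name and the sample identities are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def reachable_identities(edges: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return every service account `start` can become, directly or transitively.

    `edges[a]` is the set of service accounts on which `a` holds Token Creator.
    """
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in edges.get(current, set()):
            if target not in seen and target != start:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    bindings = {
        "user:dev@example.com": {"sa-build"},
        "sa-build": {"sa-deploy"},
        "sa-deploy": {"sa-prod-admin"},
    }
    # The developer holds one direct binding but can reach three identities.
    print(sorted(reachable_identities(bindings, "user:dev@example.com")))
```

Running this over the real binding export highlights low-privilege principals whose reachable set includes a high-privilege service account.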

Removing the legacy Editor anti-pattern

The roles/editor primitive role includes iam.serviceAccounts.actAs, which on a project lets an editor attach any SA in the project to a new GCE VM or Cloud Function, and then read the SA's token from the metadata server. Replace Editor with roles/iam.serviceAccountUser granted on specific SAs only.

roles/iam.serviceAccountUser (act-as) and roles/iam.serviceAccountTokenCreator (impersonate) are different. ActAs is required to attach an SA to a resource (VM, Cloud Run service, Cloud Build job) at create time; TokenCreator is required to mint a token for an SA on demand. The exam frequently swaps these to test if you know which is needed for gcloud run deploy --service-account=... (answer: ActAs on the target SA).

Disabling Service Account Key Creation

The org policy constraint

The boolean organization policy constraint constraints/iam.disableServiceAccountKeyCreation enforced at the organization or folder level blocks iam.serviceAccountKeys.create for everyone — including project owners. Apply it once at the org root and the entire .json key file attack surface disappears. The companion constraints worth knowing are constraints/iam.disableServiceAccountKeyUpload (block bring-your-own keys), constraints/iam.disableServiceAccountCreation (prevent SA sprawl), and constraints/iam.allowedPolicyMemberDomains (block external @gmail.com principals).

What to grant instead

Once keys are disabled, every workflow must use one of: an attached service account (GCE, GKE, Cloud Run, Cloud Functions, Cloud Build), Workload Identity (GKE), Workload Identity Federation (external CI / other clouds), or impersonation via Token Creator. There is no fifth option, and the exam expects you to make exactly this enumeration.

Per-project exception

If a single legacy workload genuinely needs a key, keep the constraint enforced (enforce: true) at the org level and add an explicit enforce: false rule conditioned on a tag attached to that single project, rather than disabling the constraint org-wide. Combine with Secret Manager + 90-day automated rotation if the key truly cannot be eliminated.

IAM Conditions with CEL

The conditional binding shape

An IAM policy binding has fields role, members, and optionally condition. The condition is an object { title, description, expression } where expression is a Common Expression Language (CEL) snippet evaluated at access time. Example: bind roles/storage.objectViewer to serviceAccount:reports@... with the condition resource.name.startsWith('projects/_/buckets/finance-reports/'). The role applies only when the matching bucket prefix is touched.
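As a sketch of that shape, the helper below assembles a policy containing one conditional binding. The function name and the member used in the example are hypothetical; the one detail worth noting is that a policy carrying conditions must declare version 3.

```python
import json
from typing import Dict, List


def make_conditional_policy(
    role: str,
    members: List[str],
    title: str,
    expression: str,
    description: str = "",
) -> Dict[str, object]:
    """Build an allow policy with a single conditional binding."""
    binding = {
        "role": role,
        "members": list(members),
        "condition": {
            "title": title,
            "description": description,
            "expression": expression,  # CEL, evaluated at access time
        },
    }
    # Policies that contain conditional bindings must use policy version 3.
    return {"version": 3, "bindings": [binding]}


if __name__ == "__main__":
    policy = make_conditional_policy(
        role="roles/storage.objectViewer",
        members=["serviceAccount:reports@example-project.iam.gserviceaccount.com"],
        title="finance-reports-only",
        expression="resource.name.startsWith('projects/_/buckets/finance-reports/')",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON is the same structure that appears in a policy file read or written with get-iam-policy and set-iam-policy.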

Supported attributes

The most useful CEL functions and attributes for app developers: resource.name, resource.type, resource.service, request.time < timestamp('2026-12-31T23:59:59Z') for expiry, request.time.getHours('Asia/Taipei') < 18 for business-hours-only access, and resource.matchTag('123456789/env', 'prod') for resource-tag-based gating. Conditional bindings are evaluated per request; an expression that returns true grants the role for that one call.

Where conditions don't work

IAM Conditions are not supported on every resource type. Project-level Owner/Editor/Viewer primitive roles cannot be conditional, and some legacy services (e.g. some App Engine APIs) ignore conditions entirely. Always check the docs' "Conditional bindings supported resources" table before designing around them.

CEL operators you must recognise on the exam: request.time < timestamp(...) for time-bound grants, resource.name.startsWith(...) for prefix scoping, resource.matchTag(KEY, VALUE) for tag-based scoping, request.auth.access_levels for Access Context Manager access levels (context-aware access), and has(...) for safe attribute presence checks.

IAM Deny Policies

Deny vs Allow

IAM Deny policies are a separate, higher-precedence layer that explicitly forbids specific permissions for specific principals on a resource hierarchy node (organization, folder, or project). A matching Deny rule overrides every Allow binding, including primitive Owner. Deny policies are a separate resource type in the IAM v2 API and are managed with gcloud iam policies create with --kind=denypolicies, together with --attachment-point and --policy-file.

Common Deny rules

The textbook examples are: deny iam.serviceAccountKeys.create to everyone except a hardened bootstrap group; deny storage.buckets.delete on the finance folder to all principals outside the SRE group; deny compute.instances.setServiceAccount to prevent attackers from swapping a low-priv VM's SA for a high-priv one. Each rule supports a deniedPrincipals list plus an exceptionPrincipals list and an optional CEL condition.
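The first textbook rule can be written out as a policy document. This is a sketch under stated assumptions: the function name, display name, and group address are hypothetical. Note that deny policies spell permissions in the service-qualified form iam.googleapis.com/serviceAccountKeys.create rather than the allow-policy form iam.serviceAccountKeys.create.

```python
import json
from typing import Dict


def make_key_creation_deny_policy(exception_group_email: str) -> Dict[str, object]:
    """Deny service account key creation to everyone except one group."""
    return {
        "displayName": "Block service account key creation",
        "rules": [
            {
                "denyRule": {
                    # Every principal, including unauthenticated callers.
                    "deniedPrincipals": ["principalSet://goog/public:all"],
                    # The hardened bootstrap group is carved out.
                    "exceptionPrincipals": [
                        f"principalSet://goog/group/{exception_group_email}"
                    ],
                    "deniedPermissions": [
                        "iam.googleapis.com/serviceAccountKeys.create"
                    ],
                }
            }
        ],
    }


if __name__ == "__main__":
    policy = make_key_creation_deny_policy("bootstrap-admins@example.com")
    print(json.dumps(policy, indent=2))
```

Saved to a file, this document is the kind of input passed via --policy-file when creating the deny policy at an organization, folder, or project attachment point.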

Where Deny shines

Deny policies are the right tool when you want a "no matter what anyone else grants below me" guarantee at the org or folder level. Allow bindings still need to be tidy, but Deny gives you a backstop that survives accidental over-grants by project owners.

Resource Manager Tags + IAM Conditions

Tags as a control plane

Resource Manager Tags are key-value pairs attached to resources that participate in IAM Conditions. You create a tag key env and values prod, staging, dev at the organization level, then attach env=prod to specific resources (projects, VMs, GCS buckets, BQ datasets). In an IAM Condition you can then write resource.matchTag('123456789/env', 'prod') and grant a role that only fires for prod-tagged resources.

Tag-bound bindings

This is fundamentally different from labels: tags participate in IAM policy evaluation while labels are pure metadata. A common pattern is to grant roles/run.invoker to a service account, conditional on resource.matchTag('123456789/env', 'prod'), so the binding follows the tag: when you re-tag a Cloud Run service from staging to prod, access flips without editing IAM policy.

Bootstrapping warning

Tag-based conditions evaluate to "no match" for resources without the tag, so a permission that depends on env=prod will return 403 on any prod resource that someone forgot to tag. Build the tagging into your IaC (Terraform google_tags_tag_binding) rather than relying on manual gcloud resource-manager tags bindings create.
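The "forgotten tag" failure mode is easy to model. The sketch below is a plain Python stand-in for resource.matchTag plus a small audit helper; the function names, the tag key, and the inventory dictionary are all hypothetical and would in practice come from an asset export.

```python
from typing import List, Mapping


def match_tag(resource_tags: Mapping[str, str], key: str, value: str) -> bool:
    """Stand-in for resource.matchTag(key, value): a missing tag never matches."""
    return resource_tags.get(key) == value


def untagged_resources(inventory: Mapping[str, Mapping[str, str]], key: str) -> List[str]:
    """Names of resources that carry no value at all for the given tag key."""
    return sorted(name for name, tags in inventory.items() if key not in tags)


if __name__ == "__main__":
    env_key = "123456789/env"
    inventory = {
        "run/orders-api": {env_key: "prod"},
        "run/billing-api": {},  # someone forgot to tag this one
        "run/orders-staging": {env_key: "staging"},
    }
    for name, tags in inventory.items():
        print(name, match_tag(tags, env_key, "prod"))
    print("needs tagging:", untagged_resources(inventory, env_key))
```

A check like untagged_resources run in CI catches the 403 before a caller does.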

Short-Lived Credentials via STS

Why STS

Google's Security Token Service (sts.googleapis.com) is the OAuth2 token-exchange endpoint that backs every modern credential flow on GCP: WIF token exchange, downscoped credentials for Cloud Storage, and credential federation for third parties. STS issues short-lived access tokens scoped to a target principal, typically valid for 3600 seconds; the 12-hour extension covered earlier applies to service account tokens minted through the IAM Credentials API and requires org policy approval.

Downscoped credentials for GCS

A specialised STS flow is Credential Access Boundaries for Cloud Storage: the holder of a broad GCS token can call STS with a boundary policy that says "the resulting token can only access bucket customer-123-data, with the permissions in roles/storage.objectViewer," producing a tightly-scoped child token to hand to an untrusted process. This is the recommended pattern for multi-tenant SaaS that needs to hand a per-customer process or client a narrowly scoped token without exposing the parent SA's full access.
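A boundary of that kind can be sketched as a JSON document. The helper name and the optional object prefix parameter are hypothetical; note that a Credential Access Boundary expresses permissions as inRole: entries that reference a role rather than as individual permissions.

```python
import json
from typing import Dict, Optional


def make_access_boundary(
    bucket: str,
    role: str = "roles/storage.objectViewer",
    object_prefix: Optional[str] = None,
) -> Dict[str, object]:
    """Build a Credential Access Boundary limited to one bucket and one role."""
    rule: Dict[str, object] = {
        "availableResource": f"//storage.googleapis.com/projects/_/buckets/{bucket}",
        "availablePermissions": [f"inRole:{role}"],
    }
    if object_prefix is not None:
        # Optional CEL condition narrowing access to objects under a prefix.
        rule["availabilityCondition"] = {
            "expression": (
                "resource.name.startsWith("
                f"'projects/_/buckets/{bucket}/objects/{object_prefix}')"
            )
        }
    return {"accessBoundary": {"accessBoundaryRules": [rule]}}


if __name__ == "__main__":
    boundary = make_access_boundary("customer-123-data", object_prefix="exports/")
    print(json.dumps(boundary, indent=2))
```

The token broker exchanges its own broad token plus this boundary at STS and passes only the resulting downscoped token to the untrusted process.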

Why short lifetimes matter

A leaked 1-hour token is a much smaller incident than a leaked JSON key valid for 10 years. Combined with VPC-SC perimeters, conditional bindings, and Cloud Audit Logs, short-lived credentials make the blast radius of a compromise time-boxed by design. The PCD exam consistently picks the short-lived-token answer over any answer that involves "store the key in Secret Manager and rotate every 90 days".

Plain English Explanation

Analogy 1: The Employee Badge

A service account is like an employee badge for your application. It isn't a person, but it represents the application's identity. When the app wants to enter a room (access a resource), it taps the badge against the reader. The badge knows exactly which doors it can open. Workload Identity Federation is like letting a contractor from another company swipe their own corporate badge at your front desk; you don't issue them a permanent badge of your own, the receptionist checks their identity with their employer in real time and prints them a one-day visitor pass.

Analogy 2: The Hotel Master Keycard

The metadata server is the room phone in a hotel suite. You don't carry the master keycard around; instead, when housekeeping needs it, they pick up the in-room phone and the front desk authenticates the call and dispatches a one-hour-valid keycard to the door. That short-lived keycard is the access token. The hotel never gives you a copy of the master, and even if a thief steals one keycard, it stops working in 60 minutes.

Analogy 3: The Notarised Letter of Authority

Service account impersonation with Token Creator is like having a notarised letter of authority. The original principal (you) hands a notary public (the IAM Credentials API) a notarised letter saying "I authorise this person to act as the CFO for the next hour." The notary verifies your signature against the file (the Token Creator binding) and stamps a time-limited proxy. Everyone downstream sees the CFO's authority, but the audit trail records that you initiated the proxy. IAM Conditions are the fine print on the letter that says "valid only between 9am and 5pm Taipei time, only for transactions under NT$100k, only in the prod ledger."

Frequently Asked Questions (FAQs)

Q1: What is the difference between a User Account and a Service Account?

User accounts are tied to human identities (@gmail.com or your Workspace/Cloud Identity domain) and are managed in Google Workspace or Cloud Identity. Service accounts are resources inside a GCP project (<name>@<project>.iam.gserviceaccount.com) and represent non-human workloads. Service accounts can be attached to compute resources, impersonated, and federated; user accounts cannot be attached to a VM and have no JSON key concept.

Q2: When should I use Workload Identity Federation instead of a service account key?

Always, unless the workload is on a fully air-gapped network with no outbound HTTPS to sts.googleapis.com. WIF is the supported answer for GitHub Actions, GitLab CI, CircleCI, Jenkins on AWS or Azure, Terraform Cloud, on-prem Kubernetes, and any third-party SaaS. Service account keys are an anti-pattern in 2026; the exam reflects this.

Q3: How do serviceAccountUser and serviceAccountTokenCreator differ?

serviceAccountUser (iam.serviceAccounts.actAs) lets a principal attach the target SA to a new resource (VM, Cloud Run service, Cloud Build trigger). serviceAccountTokenCreator (iam.serviceAccounts.getAccessToken, signBlob, signJwt) lets a principal mint a token for the SA via the IAM Credentials API. Deploying needs ActAs; ad-hoc impersonation needs TokenCreator.

Q4: Can I block all JSON key creation across my org?

Yes. Enforce constraints/iam.disableServiceAccountKeyCreation at the organization node and key creation (iam.serviceAccountKeys.create) is blocked regardless of role grants. Pair with constraints/iam.disableServiceAccountKeyUpload to also block bring-your-own-key.

Q5: What is the right way to give my GitHub Actions pipeline access to GCS?

Configure a Workload Identity Pool with a GitHub OIDC provider, pin the attribute condition to assertion.repository == 'org/repo' (and ideally assertion.ref == 'refs/heads/main'), bind roles/iam.workloadIdentityUser on a deployment SA to the federated principalSet, grant that SA roles/storage.objectAdmin on the target bucket, and use google-github-actions/auth@v2 in the workflow. No JSON key ever leaves Google.

Q6: How long is an access token from the metadata server valid?

Up to 3600 seconds by default. The client library auto-refreshes ~5 minutes before expiry. Custom lifetimes via the IAM Credentials API can be shorter, or up to 12 hours if the SA is listed in the iam.allowServiceAccountCredentialLifetimeExtension org policy.

Q7: What is the difference between IAM Conditions and IAM Deny policies?

IAM Conditions attach CEL expressions to Allow bindings — the role grants only when the condition is true. IAM Deny policies are a separate top-level resource that explicitly forbids permissions for principals, overriding any Allow. Use Conditions for fine-grained "only on this bucket prefix"; use Deny for organization-wide "nobody outside SRE can delete finance buckets, ever".
