Resource Tagging and Labeling — GCP PCA Study Notes

Introduction to Resource Metadata

In a large-scale cloud environment, visibility and control are paramount. Google Cloud provides two primary mechanisms for organizing and identifying resources: Labels and Tags. While they sound similar, they serve distinct purposes. A Professional Cloud Architect must design a consistent metadata taxonomy to support billing, automation, security, and compliance.

Plain-Language Explanation: Labels and Tags

Analogy 1 — The Post-it Note vs. The Security Badge

Think of a Label as a Post-it note you stick on a laptop. It says "Property of Marketing" or "Used for Project X." It's great for taking inventory and knowing who pays the bill. A Tag, however, is like a Security Badge. It doesn't just describe the person; it actually grants them access to certain doors (Firewall rules) in the building.

Analogy 2 — The Supermarket Inventory

Labels are like the price tags and department signs in a supermarket (Produce, Dairy). They help the manager see which department is making the most money. Tags are like the Barcode that the automated system uses to decide if an item is allowed through the "Express Lane" (Network Security Policy).

Analogy 3 — The Library System

Labels are the subject categories (History, Science) on the spine of the book used for organizing the shelves. Tags are the RFID chips that trigger an alarm if you try to take the book out without checking it out (Policy Enforcement).

Labels: Key-value pairs attached to resources, used for organizing resources and allocating cost. They are visible in billing exports.

Tags: Namespaced keys and values that let you conditionally allow or deny policies. They are used heavily in network security.

Labels vs. Tags: The Key Differences

Feature	Labels	Tags
Primary Use Case	Billing, Organization, Inventory	IAM Policy, Network Firewall, Security
Visibility	Billing Exports, Console, SDK	Resource Manager, IAM, Firewalls
Scope	Resource-level	Organization/Project level (Namespaced)
Inheritance	No	Yes (Hierarchical)
Format	`key: value` (lowercase, specific chars)	`tagKeys/tagValues` (namespaced)

Designing a Taxonomy

A successful tagging and labeling strategy requires a consistent Taxonomy. Common categories include:

Ownership: owner: marketing, team: data-eng.
Environment: env: prod, env: sandbox.
Cost Center: cost-center: 12345.
Application: app: checkout-service.
Compliance: compliance: hipaa.
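
A taxonomy like the one above is easier to keep consistent when it is written down as data and checked by code. The following is a minimal sketch, not a prescribed implementation: the TAXONOMY dictionary, its allowed env values, and the taxonomy_violations helper are all hypothetical names chosen for illustration.

```python
# Hypothetical taxonomy: label key -> allowed values (None means any non-empty value).
TAXONOMY = {
    "owner": None,
    "env": {"prod", "staging", "dev", "sandbox"},
    "cost-center": None,
    "app": None,
}


def taxonomy_violations(labels, taxonomy=TAXONOMY):
    """Return human-readable problems found in one resource's labels."""
    problems = []
    for key, allowed in taxonomy.items():
        value = labels.get(key)
        if not value:
            # Absent keys and empty strings are both treated as missing.
            problems.append(f"missing label: {key}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid value for {key}: {value}")
    return problems


print(taxonomy_violations(
    {"owner": "marketing", "env": "production", "app": "checkout-service"}
))
```

Keeping the allowed values in one place means the same dictionary can feed a CI check, a Terraform variable validation, or an audit script.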

Enforcing Policies at Scale

You cannot rely on manual entry for metadata.

Terraform: Define labels and tags in your IaC modules. Use default_labels in the Google provider to apply global labels to all resources.
Organization Policy: Use the "Required labels" constraint (where available) to prevent the creation of resources that lack mandatory metadata.
Cloud Asset Inventory: Regularly query your inventory to find "orphaned" resources or those with non-compliant labels.

Architect's Insight: For the PCA exam, remember that Tags (introduced more recently as a successor to Network Tags) are now the preferred way to manage Firewall Policies across projects because they are governed by IAM and can be inherited down the resource hierarchy.

Resource Manager Tags: TagKey and TagValue Architecture

Resource Manager Tags are first-class resources with permanent IDs of the form tagKeys/{numeric_id} and tagValues/{numeric_id}. A TagKey is created under an Organization or a Project, and each TagValue is created under its TagKey. Each TagKey is unique within its parent, which gives it a namespaced name (e.g. 123456789012/environment), and each TagValue represents one allowed value (e.g. production, staging). A tag is attached to an Organization, Folder, Project, or other supported resource through a TagBinding that points at the target via its full resource name (//compute.googleapis.com/projects/p/zones/z/instances/i).

Lifecycle and IAM separation

Creating tag keys/values requires roles/resourcemanager.tagAdmin at the parent (Org or Project).
Binding existing tags to resources requires roles/resourcemanager.tagUser on the tag value plus tagUser on the target. This split lets a central platform team curate the taxonomy while application teams self-serve bindings.
Tag bindings are inherited down the resource hierarchy. A tag on a Folder applies to every Project (and most child resources) underneath it unless overridden.
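
The inheritance rule can be pictured as a walk from the root of the hierarchy down to the resource, where a binding lower down replaces the inherited value for the same key. The sketch below is a simplified model for study purposes only: effective_tags and the sample node names are hypothetical, and real evaluation is done by Resource Manager, not by client code.

```python
def effective_tags(ancestry, bindings):
    """Resolve the tags that apply to the last node in `ancestry`.

    `ancestry` is ordered root -> leaf. `bindings` maps a node name to the
    {tag_key: tag_value} pairs bound directly on that node. A binding on a
    lower node overrides the inherited value for the same key.
    """
    resolved = {}
    for node in ancestry:
        resolved.update(bindings.get(node, {}))
    return resolved


# Hypothetical hierarchy: a folder-level default overridden on one project.
bindings = {
    "folders/apps": {"environment": "production", "data-class": "internal"},
    "projects/my-test-proj": {"environment": "staging"},
}
ancestry = ["organizations/123456789012", "folders/apps", "projects/my-test-proj"]
print(effective_tags(ancestry, bindings))
```

Because each key resolves to exactly one value, an override on a child replaces the parent's value rather than adding a second one.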

gcloud workflow

gcloud resource-manager tags keys create environment \
  --parent=organizations/123456789012
gcloud resource-manager tags values create production \
  --parent=123456789012/environment
gcloud resource-manager tags bindings create \
  --tag-value=123456789012/environment/production \
  --parent=//cloudresourcemanager.googleapis.com/projects/my-prod-proj

Unlike labels, you cannot mass-edit tag bindings through gcloud ... update on the underlying resource — tags are managed through their own API surface. This is what makes them safe to use in IAM Conditions: the policy can trust that the tag was set by a principal with explicit tagUser permission, not by anyone with edit rights on the workload.

IAM Conditions Driven by Tags

Tags become powerful when combined with IAM Conditions. You can grant a role only when the target resource carries a specific tagValue, letting you build attribute-based access control (ABAC) without proliferating service accounts or projects.

Example: read-only on production buckets

bindings:
- role: roles/storage.objectViewer
  members:
  - group:[email protected]
  condition:
    title: only-prod-buckets
    expression: |
      resource.matchTag('123456789012/environment', 'production')

The resource.matchTag() CEL function evaluates inherited tag bindings, so a tag set at the Folder level is honoured for every bucket beneath it. Combine this with resource.matchTagId() if you want to pin to a numeric tag id that survives rename operations.

IAM Conditions on tags are evaluated at request time, so if you remove a tag binding the access loss is effectively immediate (subject to credential cache TTLs of a few minutes). This makes tags an excellent kill switch for time-boxed access — much faster than rotating IAM bindings across hundreds of projects.
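
When many conditional bindings follow the same shape, generating them avoids typos in the CEL expression. This is an illustrative sketch only: tag_condition_binding is a hypothetical helper, the group address is a placeholder, and the resulting dictionary would still need to be merged into a policy and applied with your usual tooling.

```python
import json


def tag_condition_binding(role, members, tag_key, tag_value, title):
    """Build one IAM policy binding gated on a tag, in the shape of the example above."""
    return {
        "role": role,
        "members": list(members),
        "condition": {
            "title": title,
            "expression": f"resource.matchTag('{tag_key}', '{tag_value}')",
        },
    }


binding = tag_condition_binding(
    role="roles/storage.objectViewer",
    members=["group:analysts@example.com"],  # hypothetical group address
    tag_key="123456789012/environment",
    tag_value="production",
    title="only-prod-buckets",
)
print(json.dumps(binding, indent=2))
```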

Common ABAC patterns on GCP

Environment isolation: env=prod → block storage.objects.delete from CI service accounts.
Data classification: data-class=restricted → require principal to be in a privileged group AND access from inside VPC-SC perimeter.
Cost-recovery: cost-center=12345 → only members of that cost centre can spin up new instances.

Labels cannot do any of this. They are not part of the resource CEL object surfaced to IAM Conditions.

Billing Exports and Label-Based Cost Allocation

GCP billing is exported to BigQuery via the standard billing export. The schema includes a repeated labels field of STRUCT<key STRING, value STRING>, plus a separate system_labels field that GCP fills in for resources like GKE clusters, Cloud SQL instances, and Compute Engine VMs.

Example: cost per application per month

SELECT
  (SELECT value FROM UNNEST(labels) WHERE key = 'app') AS app,
  -- usage_start_time is a TIMESTAMP, so use FORMAT_TIMESTAMP rather than FORMAT_DATE
  FORMAT_TIMESTAMP('%Y-%m', usage_start_time) AS month,
  ROUND(SUM(cost), 2) AS cost_usd
FROM `billing.gcp_billing_export_v1_0123ABC`
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
GROUP BY app, month
ORDER BY month DESC, cost_usd DESC;

Watch-outs

Labels are recorded at the time of usage — relabelling a VM today does NOT retroactively rewrite yesterday's billing rows. Get the taxonomy right at creation.

The BigQuery billing export's labels column is immutable historical metadata. If your chargeback model needs to fix a wrong label after the fact, you must layer a mapping table on top (old_app -> new_app) and join it in your reports, rather than trying to mutate the export. This is a recurring source of finance disputes that the PCA-level architect is expected to anticipate.

The labels array only contains labels that were actually applied. Unlabelled resources show up as app IS NULL; surface these as a "leakage" metric on your FinOps dashboard.
For BigQuery itself, the labels on a query job propagate into the billing export, which lets you attribute on-demand query costs to a specific dashboard, dbt model, or ad-hoc analyst.

Wrap the unnest-by-key pattern in a SQL UDF such as fn.label(labels, 'app'). Every analyst then writes fn.label(labels, 'team') instead of copy-pasting a subquery — this dramatically improves consistency across FinOps reports.
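
The same ideas (look up a label by key, remap historical values, and keep unlabelled rows visible) can be prototyped outside BigQuery before committing to a UDF and a mapping table. The sketch below works on plain Python dictionaries shaped like billing rows. The label, cost_by_app, and APP_REMAP names and the sample rows are hypothetical.

```python
def label(labels, key):
    """Return the value for `key` from a billing-style labels array, or None."""
    for entry in labels:
        if entry["key"] == key:
            return entry["value"]
    return None


# Hypothetical correction table: historical label value -> value to report under.
APP_REMAP = {"legacy-portal-old": "legacy-portal"}


def cost_by_app(rows, remap=APP_REMAP):
    """Sum cost per app after remapping; unlabelled rows are grouped under None."""
    totals = {}
    for row in rows:
        app = label(row["labels"], "app")
        app = remap.get(app, app)  # leave the value untouched when no remap exists
        totals[app] = totals.get(app, 0.0) + row["cost"]
    return totals


rows = [
    {"labels": [{"key": "app", "value": "legacy-portal-old"}], "cost": 10.0},
    {"labels": [{"key": "app", "value": "legacy-portal"}], "cost": 5.0},
    {"labels": [], "cost": 2.5},
]
print(cost_by_app(rows))
```

The export rows themselves are never modified. The correction lives entirely in the reporting layer, which is the point of the mapping-table approach.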

GKE Workload Labels and System Labels

On Google Kubernetes Engine, the cluster-level GCP labels and the Kubernetes pod labels are distinct but cooperate to enable cost allocation per namespace, workload, or team.

Two layers of labels

GCP resource labels on the cluster, node pool, and underlying Compute Engine VMs. These show up in the billing export against compute SKUs.
GKE cost allocation labels, enabled with --enable-cost-allocation on the cluster. GKE annotates billing rows with labels such as goog-k8s-cluster-name, k8s-namespace, and k8s-workload-name, even though you never set them in GCP directly.

Practical setup

gcloud container clusters update prod-cluster \
  --enable-cost-allocation \
  --region=asia-east1

Once enabled, you can split a single bin-packed node pool's bill across the workloads sharing it:

SELECT
  (SELECT value FROM UNNEST(labels) WHERE key = 'k8s-namespace') AS ns,
  (SELECT value FROM UNNEST(labels) WHERE key = 'k8s-workload-name') AS workload,
  SUM(cost) AS cost_usd
FROM `billing.gcp_billing_export_v1_0123ABC`
-- Keep only rows that GKE attributed to this cluster
WHERE EXISTS (
  SELECT 1 FROM UNNEST(labels)
  WHERE key = 'goog-k8s-cluster-name' AND value = 'prod-cluster'
)
GROUP BY ns, workload;

For label hygiene on the Kubernetes side, enforce mandatory pod labels (such as team, app.kubernetes.io/name) with Policy Controller (Gatekeeper) constraints. This keeps the Kubernetes-side labels aligned with the GCP-side taxonomy so dashboards stay coherent.

Cloud Asset Inventory: Filtering and Auditing by Metadata

Cloud Asset Inventory (CAI) is the canonical inventory of every resource and IAM policy across an Organization, and it indexes both labels and tagKeys on every asset. CAI is the right tool for "find me everything that's missing a cost-center label" or "list every VM with env=prod but no data-class tag".

Search and export

# Find all resources missing cost-center label across the org
gcloud asset search-all-resources \
  --scope=organizations/123456789012 \
  --query="NOT labels.cost-center:*"

# Find all resources with prod environment tag
gcloud asset search-all-resources \
  --scope=organizations/123456789012 \
  --query="tagKeys:123456789012/environment AND tagValues:production"

Continuous compliance

Feeds: Configure a CAI feed to a Pub/Sub topic so any change to a resource's labels or tag bindings emits a real-time event. Wire this to a Cloud Function that re-labels or alerts.
Exports to BigQuery: Daily exports of the full asset set let you join inventory to billing and spot resources that consume cost but lack required metadata.
Saved queries: Build a library of --query=... snippets for each compliance rule (missing label, wrong tag, orphan service account) and run them in scheduled GitHub Actions.

CAI is the only API that consistently lets you query metadata across services — gcloud compute instances list --filter only sees Compute Engine.
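
A saved-query library can be as simple as a function that turns each mandatory key into the search expression shown earlier. The sketch below only builds the query strings and the corresponding gcloud argument lists and does not call any API. missing_label_queries and gcloud_command are hypothetical helper names.

```python
import shlex


def missing_label_queries(mandatory_keys):
    """Map each mandatory label key to a query for resources lacking that label."""
    return {key: f"NOT labels.{key}:*" for key in mandatory_keys}


def gcloud_command(scope, query):
    """Build the argument list for one gcloud asset search-all-resources call."""
    return [
        "gcloud", "asset", "search-all-resources",
        f"--scope={scope}",
        f"--query={query}",
    ]


for key, query in missing_label_queries(["cost-center", "env", "app"]).items():
    # shlex.join quotes arguments so the printed line can be pasted into a shell.
    print(key, "->", shlex.join(gcloud_command("organizations/123456789012", query)))
```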

Label and Tag Quotas and Format Limits

Labels and tags have hard limits that frequently surprise teams designing their first taxonomy. Knowing these for the PCA exam saves you from designing strategies that quietly break at scale.

Labels

64 labels per resource (most services).
Key: 1–63 characters; lowercase letters, digits, underscores, or dashes; must start with a lowercase letter or an international character.
Value: 0–63 characters, same character set.
Uppercase letters are not allowed, so Owner: Alice is invalid; use owner: alice. International characters are permitted as long as they are UTF-8 encoded.
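
These limits are easy to check before a resource is ever created. The following is a minimal sketch that is deliberately restricted to the ASCII subset of the rules (lowercase letters, digits, underscores, dashes); label_errors and the constant names are hypothetical.

```python
import re

MAX_LABELS = 64
# ASCII-only subset of the rules: a key starts with a lowercase letter, 1-63 chars total.
KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
# A value may be empty and is at most 63 chars.
VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def label_errors(labels):
    """Return a list of format problems for a {key: value} label mapping."""
    errors = []
    if len(labels) > MAX_LABELS:
        errors.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        if not KEY_RE.fullmatch(key):
            errors.append(f"invalid key: {key!r}")
        if not VALUE_RE.fullmatch(value):
            errors.append(f"invalid value for {key!r}: {value!r}")
    return errors


print(label_errors({"owner": "alice"}))
print(label_errors({"Owner": "Alice"}))
```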

A Cost-Allocation Strategy That Survives an Audit

A defensible cost-allocation model relies on a small, mandatory set of labels applied consistently across every cost-bearing resource.

The minimum viable label set

Label key	Cardinality	Source of truth
`cost-center`	~50 values	Finance ERP
`business-unit`	~10 values	HR/Org chart
`app`	~500 values	Service catalogue / CMDB
`env`	4–5 values	Deployment pipeline
`data-class`	3–4 values	Data governance team

How the numbers flow

Each resource is labelled at creation by Terraform using the team's module defaults.
Billing export streams to BigQuery hourly.
A scheduled query splits cost by cost-center and joins to the finance ERP for chargeback codes.
Looker Studio (or Looker) renders a chargeback dashboard per business unit.
Unlabelled spend is shown as a separate "leakage" line item and assigned a target of < 2% of total spend.
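
The roll-up and the leakage target in the steps above can be sketched in a few lines. This is an illustration over in-memory rows rather than the real scheduled query: chargeback, leakage_ratio, and the sample figures are hypothetical.

```python
LEAKAGE_TARGET = 0.02  # the "< 2% of total spend" target described above


def chargeback(rows):
    """Roll up cost per cost-center; rows without the label go to 'unlabelled'."""
    totals = {}
    for row in rows:
        center = row.get("labels", {}).get("cost-center") or "unlabelled"
        totals[center] = totals.get(center, 0.0) + row["cost"]
    return totals


def leakage_ratio(totals):
    """Share of total spend that carries no cost-center label."""
    total = sum(totals.values())
    return totals.get("unlabelled", 0.0) / total if total else 0.0


rows = [
    {"labels": {"cost-center": "12345"}, "cost": 900.0},
    {"labels": {"cost-center": "67890"}, "cost": 90.0},
    {"labels": {}, "cost": 10.0},
]
totals = chargeback(rows)
ratio = leakage_ratio(totals)
status = "within target" if ratio < LEAKAGE_TARGET else "over target"
print(totals, f"leakage={ratio:.1%}", status)
```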

Why not just use projects?

Projects already isolate billing, but you typically have far more applications and cost centres than you can comfortably model as projects. Labels let you slice within a project (multiple microservices in one project, multiple feature teams sharing a GKE cluster) while projects remain the unit of IAM and quota. The two work together.

Label Hygiene Policies: Making Mandatory Keys Stick

A label strategy is only as good as the enforcement around mandatory keys. There are several enforcement layers you should stack rather than picking one.

Layered enforcement

IaC defaults: In Terraform, set default_labels on the google provider so every resource Terraform creates inherits cost-center, env, app. Make these variables required (type = string, no default).
Module validation: Use validation blocks in your shared Terraform modules to reject empty labels at plan time.
Pre-commit hooks: Run tflint with a custom rule that fails the PR if any google_* resource is missing the mandatory keys.
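Organization Policy: Where the service supports it, custom Organization Policy constraints on labels prevent un-labelled resources from being created in the first place. (gcp.resourceLocations restricts where resources may be created; it does not enforce labels.)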
CAI feed + Cloud Function: As a backstop for resources created outside Terraform (console clickops, third-party agents), a Pub/Sub-triggered function flags non-compliant resources (for example with a compliance=non-compliant label), clears the flag once they are fixed, and notifies the owner.
Scheduled audits: A BigQuery scheduled query joins billing export to a list of mandatory keys and posts a weekly Slack report of teams with the worst label coverage.

For the PCA exam, remember the canonical enforcement pyramid: prevent (Org Policy) → enforce in IaC (Terraform defaults + module validation) → detect (CAI feeds + scheduled BigQuery queries) → remediate (Cloud Function or Workflows). Multi-layer beats any single control.

Auto-Remediation with Cloud Functions and CAI Feeds

You can close the loop on label drift with an event-driven remediation pipeline. The pattern is identical regardless of which mandatory label is missing.

Reference architecture

Cloud Asset Inventory feed is created at the Organization scope, filtered to the asset types you care about (compute.googleapis.com/Instance, storage.googleapis.com/Bucket, bigquery.googleapis.com/Dataset), publishing every change to a Pub/Sub topic.
Cloud Function (2nd gen) subscribes to the topic. For each message:
- Parse the asset payload.
- Check if mandatory keys (cost-center, env, app) are present and non-empty.
- If missing and the parent project carries a default-cost-center label, copy that value down to the child resource.
- Otherwise, send a notification to the owning team via Slack/email and tag the resource with compliance=non-compliant.
Dead-letter topic captures messages the function could not process; an on-call engineer drains it weekly.

Code skeleton (Python)

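import base64
import json

MANDATORY_KEYS = ("cost-center", "env", "app")

def remediate(event, context):
    """Pub/Sub-triggered entry point: check one CAI feed message for missing labels."""
    message = json.loads(base64.b64decode(event["data"]))
    asset = message.get("asset")
    if message.get("deleted") or not asset:
        return  # nothing to remediate on a deleted asset
    labels = asset.get("resource", {}).get("data", {}).get("labels", {})
    # Treat empty values as missing, matching the "present and non-empty" check.
    missing = [k for k in MANDATORY_KEYS if not labels.get(k)]
    if not missing:
        return
    project = asset["ancestors"][0]  # closest ancestor, normally the project
    # lookup_project_default, apply_labels and notify_owner are helpers you supply.
    default = lookup_project_default(project)
    if default:
        apply_labels(asset["name"], default)
    else:
        notify_owner(asset, missing)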

Operational notes

Make the function idempotent: it will see the asset again after it applies labels, so check current state before writing.
Rate-limit: a noisy resource (e.g. a Dataflow job creating thousands of temp resources) can swamp the function. Use Pub/Sub flow control and a max-instance limit on the function.
Log every label change with the principal of the function for auditability.
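
Idempotency is simplest when the function first computes the exact change it would make and skips the write when that change is empty. The sketch below shows one way to do that. labels_to_add is a hypothetical helper, and it assumes project defaults arrive as a plain dictionary.

```python
def labels_to_add(current, defaults, mandatory=("cost-center", "env", "app")):
    """Return only the mandatory labels that are missing or empty on the resource
    and for which a default exists. An empty result means no write is needed."""
    return {
        key: defaults[key]
        for key in mandatory
        if not current.get(key) and defaults.get(key)
    }


current = {"env": "prod"}
defaults = {"cost-center": "12345", "env": "dev"}

patch = labels_to_add(current, defaults)
print(patch)

# Second pass, after the patch has been applied: nothing left to write.
print(labels_to_add({**current, **patch}, defaults))
```

Existing values are never overwritten, so a label set deliberately by the owning team survives remediation, and the second feed event triggered by the function's own write results in no further change.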

BigQuery View: A Unified Label Catalog

A single BigQuery view that exposes "every resource and its labels, joined to billing and inventory" becomes the workhorse for FinOps, security, and platform teams. Build it once, query it everywhere.

View design

CREATE OR REPLACE VIEW finops.resource_catalog AS
SELECT
  r.name AS resource_name,
  r.asset_type,
  r.ancestors[OFFSET(0)] AS project,
  r.labels,
  (SELECT value FROM UNNEST(r.labels) WHERE key = 'cost-center') AS cost_center,
  (SELECT value FROM UNNEST(r.labels) WHERE key = 'env')        AS env,
  (SELECT value FROM UNNEST(r.labels) WHERE key = 'app')        AS app,
  b.last_30d_cost_usd  -- total for the whole (project, app) pair, not this single resource
FROM `cai_export.assets` r
LEFT JOIN (
  SELECT
    project.number AS project_number,  -- CAI ancestors use project numbers, not project IDs
    (SELECT value FROM UNNEST(labels) WHERE key = 'app') AS app,
    ROUND(SUM(cost), 2) AS last_30d_cost_usd
  FROM `billing.gcp_billing_export_v1_0123ABC`
  WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  GROUP BY project_number, app
) b
ON r.ancestors[OFFSET(0)] = CONCAT('projects/', b.project_number)
  AND (SELECT value FROM UNNEST(r.labels) WHERE key = 'app') = b.app;

Why this matters

One join replaces ten ad-hoc queries scattered across teams.
New columns (e.g. data_class) can be added without breaking downstream dashboards because the underlying labels array is preserved.
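Row-level security can scope each business unit to its own cost_center slice. BigQuery row access policies are defined on tables rather than views, so apply them to the underlying tables or publish one authorized view per business unit.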
The same view is consumed by Looker Studio (cost dashboards), Looker (chargeback reports), and ad-hoc analytics, ensuring everyone sees the same numbers.

This pattern — CAI export + billing export + view — is the canonical "single pane of glass" for resource governance on GCP and is well worth memorising for the PCA exam.

FAQ — Resource Metadata

Q1. Can I use labels for security?

No. Labels are for organization and billing. They do not have an IAM-linked enforcement mechanism. Use Tags or IAM Conditions for security-related logic.

Q2. Do labels show up in the BigQuery billing export?

Yes. This is their superpower. You can run SQL queries to see exactly how much everything labelled app: legacy-portal costs you each month.

Q3. What is the limit for labels?

Most GCP resources allow up to 64 labels per resource.

Q4. What are "Network Tags"?

Network Tags are an older version of tags specifically for Compute Engine VMs and VPC Firewalls. While still widely used, the newer Resource Manager Tags are more powerful as they work across different resource types and folders.

Q5. How do I update labels on 1000 VMs?

Use the gcloud compute instances add-labels command with a script, or preferably, update your Terraform code and run a plan/apply to bring the environment into the desired state.
