Introduction to Google Cloud Pub/Sub
Google Cloud Pub/Sub is a globally distributed, fully managed, asynchronous messaging service that decouples publishers from subscribers. Unlike traditional regional message brokers, a Pub/Sub topic is a global resource: a publisher in us-central1 can publish to the same topic that a subscriber consumes from in asia-east1, and the Pub/Sub service plane automatically replicates and routes the messages. This makes it the default backbone for event-driven services on Google Cloud, ingestion pipelines into BigQuery and Dataflow, and asynchronous workloads behind Cloud Run, Cloud Functions, and GKE.
Pub/Sub is at-least-once by default, with optional exactly-once delivery for pull subscriptions within a single region. It durably stores each published message until every attached subscription acknowledges it or the retention window expires: 7 days by default on a subscription, and up to 31 days with topic-level retention. For the Professional Cloud Developer (PCD) exam, you need to know not just the data model (topic, subscription, message, ack) but also how delivery modes, ordering, schemas, and dead-letter behavior interact with Cloud Run, Cloud Functions, and BigQuery.
Core Resources and Vocabulary
- Topic: a named resource (projects/PROJECT/topics/TOPIC) where publishers send messages. A topic does not retain messages by default; retention is configured on the subscription, or optionally on the topic itself to support replay.
- Subscription: a named resource attached to exactly one topic. Each subscription has an independent backlog, ack state, and delivery configuration. Multiple subscriptions on one topic are the basis for fan-out.
- Message: up to 10 MB of data (binary) plus optional string attributes and an ordering_key.
- Ack deadline: the per-subscription ackDeadlineSeconds (10 to 600 seconds) controls how long a subscriber has to process a message before Pub/Sub redelivers it.
A subscription in Pub/Sub is not a passive listener like a Kafka consumer group offset — it is a server-side resource with its own backlog, retention, ack state, and delivery type (pull, push, bigquery, or cloudStorage). Deleting a subscription deletes the backlog; creating a new one starts with an empty backlog (unless you seek to a snapshot or timestamp).
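As a minimal sketch of the vocabulary above, the following models resource names and the two numeric limits in plain Python. It does not call Pub/Sub. The `Message` class and helper names are hypothetical, and reading "10 MB" as 10,000,000 bytes is an assumption of this sketch.

```python
from dataclasses import dataclass, field

# Assumption: "10 MB" is read as 10,000,000 bytes for this sketch.
MAX_MESSAGE_BYTES = 10_000_000
ACK_DEADLINE_RANGE = (10, 600)  # seconds


def topic_path(project: str, topic: str) -> str:
    """Build the fully qualified topic name."""
    return f"projects/{project}/topics/{topic}"


def subscription_path(project: str, subscription: str) -> str:
    """Build the fully qualified subscription name."""
    return f"projects/{project}/subscriptions/{subscription}"


@dataclass
class Message:
    """Hypothetical local model of a message, not a client-library type."""
    data: bytes
    attributes: dict = field(default_factory=dict)
    ordering_key: str = ""

    def validate(self) -> None:
        """Check only the limits named in the text; the service enforces more."""
        if len(self.data) > MAX_MESSAGE_BYTES:
            raise ValueError("data exceeds the 10 MB message limit")
        for key, value in self.attributes.items():
            if not isinstance(key, str) or not isinstance(value, str):
                raise TypeError("attributes must map strings to strings")


def validate_ack_deadline(seconds: int) -> int:
    """Return the value unchanged if it lies in the allowed range."""
    low, high = ACK_DEADLINE_RANGE
    if not low <= seconds <= high:
        raise ValueError(f"ackDeadlineSeconds must be in [{low}, {high}]")
    return seconds
```

Validating locally before publishing is only a convenience; the service remains the authority on what it accepts.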
Push vs Pull Subscriptions
Push and pull are the two long-standing delivery modes, and they map to very different runtime models. Picking the wrong one is one of the most common architecture mistakes the PCD exam tests.
Pull Subscriptions
In a pull subscription, the subscriber application (or a Dataflow pipeline, or the client library's StreamingPull) explicitly asks Pub/Sub for messages. The client library opens a long-lived StreamingPull gRPC stream and Pub/Sub delivers messages as they become available. Pull is the right choice when:
- You need very high throughput (millions of messages per second per topic).
- The consumer is behind a firewall or has no public HTTPS endpoint.
- You need exactly-once delivery (only pull subscriptions support it).
- You want backpressure: the client controls how many messages are outstanding via flow_control_settings.max_messages and max_bytes.
A Compute Engine fleet, GKE workload, or Dataflow streaming job typically uses pull. The gcloud pubsub subscriptions pull SUB --auto-ack command is useful for manual debugging.
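The backpressure idea behind max_messages and max_bytes can be illustrated with a small local model. This is a sketch of the bookkeeping only; `FlowController` is a hypothetical class, not the client library's implementation, and the real library may treat edge cases (such as a single oversized message) differently.

```python
class FlowController:
    """Toy model: admit a message only while both limits have headroom."""

    def __init__(self, max_messages: int, max_bytes: int):
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.outstanding_messages = 0
        self.outstanding_bytes = 0

    def try_acquire(self, size: int) -> bool:
        """Reserve capacity for one message; False means 'apply backpressure'."""
        if (self.outstanding_messages + 1 > self.max_messages
                or self.outstanding_bytes + size > self.max_bytes):
            return False
        self.outstanding_messages += 1
        self.outstanding_bytes += size
        return True

    def release(self, size: int) -> None:
        """Free capacity once the message has been acked or nacked."""
        self.outstanding_messages -= 1
        self.outstanding_bytes -= size
```

A subscriber loop would call `try_acquire` before handing a message to a worker and `release` after the ack, so slow workers naturally reduce the number of messages in flight.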
Push Subscriptions
A push subscription delivers each message as an HTTPS POST to a pushEndpoint you configure. Pub/Sub will retry with exponential backoff (minimum 10 seconds, maximum 600 seconds) on any non-2xx response or timeout. Push is the natural fit for serverless consumers:
- Cloud Run, Cloud Run functions, and App Engine services that are scaled by HTTPS requests.
- Cross-project or cross-cloud webhooks that already accept signed HTTPS calls.
- Use cases where you want Pub/Sub itself to handle the rate limiting via the built-in flow control (slow-start algorithm based on ack rate).
Push has a per-subscription maximum delivery rate (Pub/Sub auto-tunes this from observed ack latency) and is generally less efficient than pull at very high QPS, but it is far simpler to operate.
The exam frequently asks "Cloud Run service that handles spiky event traffic — push or pull?" The answer is push with OIDC authentication. Pull from a Cloud Run service is awkward because Cloud Run scales on request count and a pull loop is a single long-running connection, not a request. Use push and let Cloud Run autoscale on the HTTP POSTs from Pub/Sub.
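To make the push model concrete, here is a hedged sketch of what a push endpoint does with the request body, using only the standard library. It assumes the usual JSON push envelope (a `message` object with base64-encoded `data`, `attributes`, and `messageId`, plus a `subscription` field). The function names and the `process` callback are hypothetical, and the HTTP server wiring is left out.

```python
import base64
import json


def parse_push_envelope(body: bytes) -> dict:
    """Decode the JSON envelope that Pub/Sub POSTs to a push endpoint."""
    envelope = json.loads(body)
    message = envelope["message"]
    return {
        "data": base64.b64decode(message.get("data", "")),
        "attributes": message.get("attributes", {}),
        "message_id": message.get("messageId"),
        "subscription": envelope.get("subscription"),
    }


def handle_push(body: bytes, process) -> int:
    """Return the HTTP status to send back: 2xx acks, anything else nacks."""
    try:
        msg = parse_push_envelope(body)
    except (ValueError, KeyError, TypeError):
        # Malformed envelope. A non-2xx status means Pub/Sub retries it.
        return 400
    try:
        process(msg)  # hypothetical application callback
    except Exception:
        return 500  # processing failed, ask for redelivery
    return 204  # success, the message is acknowledged
```

Returning a non-2xx status for a malformed body means the message keeps being retried, so pairing this with a dead-letter topic is a reasonable design choice.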
Choosing Between Push and Pull
| Scenario | Recommended | Reason |
|---|---|---|
| Cloud Run / Cloud Functions consumer | Push (OIDC) | Serverless scales per HTTPS request |
| Dataflow streaming into BigQuery | Pull (via I/O connector) | Throughput + ordering control |
| Need exactly-once delivery | Pull | Push does not support exactly-once |
| On-prem worker behind NAT | Pull | No inbound HTTPS required |
| Webhook to third-party SaaS | Push | Direct HTTPS POST |
Exactly-Once Delivery
By default Pub/Sub is at-least-once: duplicate deliveries are possible (for example when an ack arrives at the server after the ack deadline has expired). Since 2022, Pub/Sub also supports exactly-once delivery as a subscription option.
What Exactly-Once Actually Guarantees
When enableExactlyOnceDelivery: true is set on a pull subscription, Pub/Sub guarantees:
- No redelivery of an acknowledged message once the ack has been confirmed by the server (the client library exposes this confirmation via a future).
- No redelivery of a message while it is within the lease (ack deadline) of a subscriber.
- The guarantee applies only within a single Cloud region. To get the regional behavior, you must configure messageStoragePolicy and use a regional endpoint such as us-east1-pubsub.googleapis.com:443.
Limitations and Trade-offs
- Exactly-once is pull only. Push subscriptions cannot enable it.
- It does not deduplicate across publishes: if your publisher retries a publish RPC and both attempts succeed, you get two messages with distinct messageIds. Because messageId is assigned by the server, it cannot identify such duplicates; carry an application-level idempotency token in the message attributes for cross-publish dedup.
- Throughput per subscription is somewhat lower than at-least-once because of the additional server-side coordination.
- The ack deadline minimum increases effectively (the server uses a longer lease internally).
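Because exactly-once delivery does not cover duplicate publishes, consumers often deduplicate on a token the publisher sets. The sketch below shows one way to do that. The attribute name `idempotency_token` and the `IdempotentHandler` class are hypothetical, and the in-memory set stands in for what would need to be a durable store in a real system.

```python
class IdempotentHandler:
    """Apply a side effect at most once per idempotency token."""

    def __init__(self, apply_effect):
        self._apply = apply_effect  # hypothetical callback doing the real work
        self._seen = set()          # assumption: replace with a durable store

    def handle(self, attributes: dict, data: bytes) -> bool:
        """Return True if the effect ran, False if this was a duplicate."""
        token = attributes.get("idempotency_token")
        if token is None:
            raise ValueError("message has no idempotency_token attribute")
        if token in self._seen:
            return False
        self._apply(data)
        # Record the token only after the effect succeeded, so a failure
        # leaves the message eligible for a later retry.
        self._seen.add(token)
        return True
```

In either case the caller can ack the message: a duplicate has already been applied, so there is nothing left to redo.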
"Exactly-once" does not mean "exactly-once end to end across the entire pipeline." It means "no redelivery from the subscription once acked, within one region." If your subscriber writes to BigQuery and then crashes before acking, the message will be redelivered and you must make the BigQuery write idempotent (for example with insert_id).
Message Ordering with Ordering Keys
By default Pub/Sub does not preserve order — messages can be delivered in any order, especially across different publisher hosts or regions. To get ordering, you must:
- Enable enableMessageOrdering: true on the subscription.
- Set an ordering_key (a string) on each published message.
- Publish from a client that respects ordering (the official client libraries do: they serialize publishes per key).
Pub/Sub then guarantees that messages with the same ordering_key are delivered in publish order to a given subscriber. Different keys can still be delivered in parallel.
When Ordering Breaks Down
- If a publish for an ordering key fails, the client library will pause that key until you call resume_publish(ordering_key). Failing to do so silently halts that key's throughput.
- If you exceed 1 MB/s throughput per ordering key, you will see backpressure on that key.
- Ordering applies only within a region; for cross-region subscribers, ordering is not guaranteed.
- Using seek resets the position and can deliver messages in a different relative order to the original publish.
A common pattern is to use the entity ID (user ID, order ID, device ID) as the ordering key. This gives per-entity ordering with massive parallelism across entities — no global bottleneck, but each entity's events stay in order.
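The pause-and-resume behavior for a failed ordering key can be illustrated with a toy publisher. This is a local simulation under stated assumptions, not the client library: `OrderedPublisher` and the `send` transport are hypothetical, and only the per-key pausing logic is modeled.

```python
from collections import defaultdict


class OrderedPublisher:
    """Toy model: a failed publish pauses its ordering key until resumed."""

    def __init__(self, send):
        self._send = send  # hypothetical transport; raises on failure
        self._paused = set()
        self.delivered = defaultdict(list)  # per-key publish order

    def publish(self, ordering_key: str, data: str) -> None:
        if ordering_key in self._paused:
            raise RuntimeError(f"ordering key {ordering_key!r} is paused")
        try:
            self._send(ordering_key, data)
        except Exception:
            # Pause only this key so later messages cannot overtake the
            # failed one; other keys keep flowing.
            self._paused.add(ordering_key)
            raise
        self.delivered[ordering_key].append(data)

    def resume_publish(self, ordering_key: str) -> None:
        """Explicitly re-enable a key after handling the failure."""
        self._paused.discard(ordering_key)
```

With entity IDs as keys, a failure for one entity stalls only that entity's stream, which is why forgetting to resume shows up as a single silent key rather than a global outage.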
Message Retention, Snapshots, and Seek
Retention
Pub/Sub stores every message until it is acked by every subscription, up to the subscription's messageRetentionDuration. Defaults and limits:
- Default subscription retention: 7 days.
- Maximum subscription retention: 7 days for un-acked messages, but with retainAckedMessages: true you keep acked messages for the configured duration too (useful for replay).
- Topic message retention: optional, 10 minutes to 31 days. When set on the topic, a new subscription can seek back to messages that were published before the subscription existed.
Snapshots
A snapshot captures the message ack state of a subscription at a point in time. You can later seek a subscription back to that snapshot, effectively replaying every message that was un-acked at snapshot time. Snapshots live up to 7 days and are subscription-scoped.
gcloud pubsub snapshots create my-snap --subscription=orders-sub
# ... deploy a buggy consumer that mis-acks messages ...
gcloud pubsub subscriptions seek orders-sub --snapshot=my-snap
Seek to Timestamp
You can also seek to an arbitrary RFC 3339 timestamp:
gcloud pubsub subscriptions seek orders-sub \
--time=2026-05-01T00:00:00Z
This requires that the topic (or subscription, with retainAckedMessages) has retention covering that timestamp. Seek is a destructive operation in the sense that it changes the subscription's read cursor — there is no per-message rewind.
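Before issuing a seek, it can help to check whether the target time is still covered by retention. The following is a small sketch of that arithmetic plus RFC 3339 formatting for the --time flag. The helper names are hypothetical, and the check is a local approximation of what the service enforces.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def covered_by_retention(target: datetime, now: datetime,
                         retention: timedelta) -> bool:
    """True if messages published at `target` are still within retention."""
    return target >= now - retention


# Example: a seek target one week back, checked against two retention settings.
now = datetime(2026, 5, 8, tzinfo=timezone.utc)
target = datetime(2026, 5, 1, tzinfo=timezone.utc)
seven_days_ok = covered_by_retention(target, now, timedelta(days=7))
six_days_ok = covered_by_retention(target, now, timedelta(days=6))
```

Here `seven_days_ok` is true and `six_days_ok` is false, which mirrors why a 7-day window is not enough to replay anything older than a week.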
Snapshots are tied to the subscription, not the topic. If you delete and recreate the subscription, the snapshot becomes useless. For long-term replay across recreated subscriptions, enable topic-level retention with --message-retention-duration up to 31 days.
Dead-Letter Topics and Retry Policy
A dead-letter topic (DLT) is a separate Pub/Sub topic that receives messages a subscription has failed to ack a configurable number of times. This isolates poison-pill messages from blocking your main pipeline.
Configuring a DLT
gcloud pubsub subscriptions create orders-sub \
--topic=orders \
--dead-letter-topic=orders-dlq \
--max-delivery-attempts=5 \
--min-retry-delay=10s \
--max-retry-delay=600s
- max-delivery-attempts ranges from 5 to 100.
- The Pub/Sub service account service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com must have pubsub.publisher on the DLT and pubsub.subscriber on the source subscription. The gcloud flag automates this, but if you provision via Terraform you must grant these explicitly.
Retry Policy
Independently of DLT, you can set an exponential backoff retry policy with min-retry-delay and max-retry-delay. This applies to both push and pull subscriptions. The original ack deadline still applies; the retry delay is the wait between redeliveries after a nack or ack deadline expiration.
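To build intuition for how min-retry-delay and max-retry-delay bound the wait between redeliveries, here is an illustrative model. The doubling curve is an assumption of this sketch; the text only states that the backoff is exponential and bounded, so the exact delays Pub/Sub uses may differ.

```python
def retry_delay(attempt: int, min_delay: float = 10.0,
                max_delay: float = 600.0) -> float:
    """Delay in seconds before redelivery number `attempt` (1-based).

    Assumption: the delay doubles each attempt, clamped to [min, max].
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    exponent = min(attempt - 1, 64)  # cap to keep the arithmetic bounded
    return min(max_delay, min_delay * 2 ** exponent)


# Delays for the first seven redeliveries with the 10s / 600s bounds.
schedule = [retry_delay(n) for n in range(1, 8)]
```

Under this model the delay reaches the 600 second ceiling by the seventh redelivery and stays there.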
What to Put on the DLT Side
Have a dedicated subscriber on the dead-letter topic that logs to Cloud Logging, writes to BigQuery, or pages on-call via Cloud Monitoring alerts. Never silently drop DLT messages — they almost always indicate a schema mismatch or downstream outage.
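The routing rule behind max-delivery-attempts can be simulated locally. This sketch treats any exception from the handler as a nack and moves a message to a dead-letter list once its attempts are exhausted. The function and handler signature are hypothetical, and the real service counts attempts on a best-effort basis rather than this strictly.

```python
def deliver_with_dead_letter(messages, handler, max_delivery_attempts=5):
    """Return (acked, dead_lettered) after simulating redelivery."""
    if not 5 <= max_delivery_attempts <= 100:
        raise ValueError("max_delivery_attempts must be between 5 and 100")
    acked, dead_lettered = [], []
    for msg in messages:
        for attempt in range(1, max_delivery_attempts + 1):
            try:
                handler(msg, attempt)
            except Exception:
                continue  # treated as a nack: try the next delivery attempt
            acked.append(msg)
            break
        else:
            # Every attempt failed: forward to the dead-letter topic.
            dead_lettered.append(msg)
    return acked, dead_lettered
```

A transient failure that clears within the attempt budget never reaches the dead-letter list, while a poison message stops consuming retries after the limit.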
Message Filtering
Subscriptions can filter messages by attributes (not payload). A filter is a CEL-style boolean expression evaluated against the message attributes:
attributes.eventType = "order.created" AND attributes.region = "us"
- Filter syntax supports = and != on attribute values, : to test that an attribute key exists (for example attributes:region), the hasPrefix function, and the NOT, AND, and OR operators.
- Filters are evaluated server-side. Filtered-out messages are auto-acked and do not count against your subscription throughput or your ack deadline.
- The filter is immutable after subscription creation; you must recreate the subscription to change it.
Filtering reduces cost and load when only a slice of a topic's messages is relevant to one consumer. It is also a clean way to share one topic across multiple microservices that each care about different event types.
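For unit-testing which messages a filter would let through, a tiny local evaluator can be handy. The sketch below handles only a conjunction of equality and hasPrefix clauses, expressed as tuples rather than parsed from the filter string. It is a simplified stand-in, not the service's filter engine, and the `matches` helper is hypothetical.

```python
def matches(attributes: dict, clauses) -> bool:
    """Evaluate ANDed clauses of the form (key, op, value).

    Supported ops in this sketch: "=" and "hasPrefix".
    """
    for key, op, value in clauses:
        actual = attributes.get(key)
        if op == "=":
            ok = actual == value
        elif op == "hasPrefix":
            ok = actual is not None and actual.startswith(value)
        else:
            raise ValueError(f"unsupported operator in sketch: {op}")
        if not ok:
            return False
    return True


# Mirrors: attributes.eventType = "order.created" AND attributes.region = "us"
ORDER_CREATED_US = [("eventType", "=", "order.created"), ("region", "=", "us")]
```

Keeping routing information in attributes rather than in the payload is what makes this kind of server-side selection possible.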
Schemas and Schema Validation
Pub/Sub supports topic-level schemas to enforce message structure at publish time. Two formats are supported:
- Avro (AVRO): JSON schema definition; payload can be Avro binary or JSON.
- Protocol Buffers (PROTOCOL_BUFFER): .proto definition; payload can be Protobuf wire format or JSON.
Workflow
- Create the schema as a top-level resource: gcloud pubsub schemas create order-v1 --type=AVRO --definition-file=order.avsc.
- Attach it when creating the topic: --schema=order-v1 --message-encoding=BINARY.
- Publishers that send messages violating the schema receive INVALID_ARGUMENT from Publish. Subscribers receive validated messages.
Schemas are versioned via revisions. You can evolve a schema (add optional fields, add enum values) and pin a topic to a specific revision range to control compatibility.
For a BigQuery subscription, an Avro or Protobuf schema on the topic makes the destination table mapping straightforward — Pub/Sub maps schema fields to columns automatically and rejects malformed messages before they reach BigQuery.
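As an illustration of schema evolution, the sketch below defines a small Avro record as a Python dict (the JSON form that would go in a file such as order.avsc) and checks one common compatibility rule: fields added in a new revision should carry a default. The `Order` record and its field names are hypothetical, and this check covers only that single rule, not full Avro schema resolution.

```python
import json

# Hypothetical schema contents; field names are illustrative only.
ORDER_V1 = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount_cents", "type": "long"},
    ],
}

# A revision that adds an optional field with a default.
ORDER_V2 = {
    "type": "record",
    "name": "Order",
    "fields": ORDER_V1["fields"] + [
        {"name": "coupon", "type": ["null", "string"], "default": None},
    ],
}


def added_fields_have_defaults(old: dict, new: dict) -> bool:
    """True if every field that is new in `new` declares a default."""
    old_names = {f["name"] for f in old["fields"]}
    return all("default" in f
               for f in new["fields"] if f["name"] not in old_names)


# The JSON text that would be written to the schema definition file.
avsc_text = json.dumps(ORDER_V1, indent=2)
```

A default on each added field lets readers using the new revision still decode messages written with the old one.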
BigQuery Subscriptions
A BigQuery subscription writes messages directly to a BigQuery table without any user-managed Dataflow or Cloud Run consumer. You set the destination table and Pub/Sub handles the streaming insert.
Two Write Modes
- Topic schema mode (useTopicSchema: true): the topic has an Avro or Protobuf schema and the destination table has matching columns. Each schema field maps to a column.
- writeMetadata mode: the message data is written into a single data column (bytes or string), with optional metadata columns subscription_name, message_id, publish_time, attributes.
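To visualize the writeMetadata layout, the following sketch builds the row shape described above from a message's fields. It is a local illustration only. Decoding data as UTF-8 text and serializing attributes as a JSON string are assumptions of this sketch, and `metadata_row` is a hypothetical helper.

```python
import json


def metadata_row(subscription: str, message_id: str, publish_time: str,
                 data: bytes, attributes: dict) -> dict:
    """Build a dict shaped like a row with the metadata columns."""
    return {
        "subscription_name": subscription,
        "message_id": message_id,
        "publish_time": publish_time,
        # Assumption: the data column is a string column holding UTF-8 text.
        "data": data.decode("utf-8"),
        # Assumption: attributes are stored as one JSON-encoded value.
        "attributes": json.dumps(attributes, sort_keys=True),
    }
```

Because the payload lands in a single column, any parsing of its contents happens later in SQL, which is the main trade-off against topic schema mode.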
Permissions
The Pub/Sub service agent (service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com) needs roles/bigquery.dataEditor and roles/bigquery.metadataViewer on the destination dataset.
Trade-offs vs Dataflow
BigQuery subscriptions are the cheapest and simplest path for raw ingest, but they cannot transform, enrich, join, or window data. If you need any of those, fall back to a Dataflow streaming pipeline reading from a regular pull subscription.
Cloud Storage Subscriptions
A Cloud Storage subscription writes batched messages to a GCS bucket as files. This is the right choice for archival, data lake landing zones, and offline ML training data.
Configuration
You specify:
- bucket and filenamePrefix / filenameSuffix.
- A batching trigger: maxDuration (default 5 minutes, range 1 minute to 10 minutes) and maxBytes (1 KB to 10 GiB).
- Output format: text (newline-delimited) or avro (with optional writeMetadata).
Permissions
The Pub/Sub service agent needs roles/storage.objectCreator on the bucket. Files are named <prefix><timestamp>-<UUID><suffix> and arrive in near-real-time depending on your batching settings.
This subscription type replaces a common older pattern of "Pub/Sub → Dataflow → GCS" for simple archival, removing the Dataflow cost and operational burden.
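The interaction between maxDuration and maxBytes can be sketched as a batcher that closes a file when either trigger fires. This is a simplified local model: `Batcher` is hypothetical, time is passed in explicitly, and the triggers are only checked when a message arrives, whereas the real service also flushes on a timer.

```python
class Batcher:
    """Toy model of a size-or-age batching trigger."""

    def __init__(self, max_duration_s: float = 300.0,
                 max_bytes: int = 10 * 1024 ** 3):
        self.max_duration_s = max_duration_s
        self.max_bytes = max_bytes
        self._buffer = []
        self._size = 0
        self._opened_at = None  # time the current batch received its first message

    def add(self, data: bytes, now: float):
        """Add a message; return the finished batch if a trigger fired, else None."""
        if self._opened_at is None:
            self._opened_at = now
        self._buffer.append(data)
        self._size += len(data)
        too_big = self._size >= self.max_bytes
        too_old = now - self._opened_at >= self.max_duration_s
        return self.flush() if (too_big or too_old) else None

    def flush(self):
        """Close the current batch and start a fresh one."""
        batch, self._buffer = self._buffer, []
        self._size = 0
        self._opened_at = None
        return batch
```

Lower values produce fresher but smaller files; higher values produce fewer, larger objects that are usually cheaper to scan later.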
OIDC Push Authentication
Public push endpoints are a security risk: anyone who knows the URL could POST to it. Pub/Sub solves this with OIDC token-based push authentication.
How It Works
- You attach a service account to the push subscription via pushConfig.oidcToken.serviceAccountEmail.
- Optionally set pushConfig.oidcToken.audience (defaults to the push endpoint URL).
- On every push, Pub/Sub generates a Google-signed OIDC ID token for that service account and sends it in the Authorization: Bearer ... header.
- Your endpoint (Cloud Run, Cloud Functions, App Engine, or any service behind IAP / your own validator) verifies the token.
Required Permissions
The Pub/Sub service agent must have roles/iam.serviceAccountTokenCreator on the push service account. The push service account must have whatever permission is needed to invoke the endpoint — for example roles/run.invoker on a Cloud Run service.
For private Cloud Run services, OIDC push is the only supported authentication mode. Set pushConfig.oidcToken.serviceAccountEmail and grant that SA roles/run.invoker. Cloud Run automatically validates the OIDC token; you do not write verification code.
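For endpoints that are not fronted by Cloud Run or IAP, the claim checks a validator performs can be sketched as below. This example deliberately does NOT verify the token signature or expiry, so it is unsafe on its own. A real validator must first verify the JWT against Google's public keys with a proper library. The helper names are hypothetical, and the expected audience and email values are supplied by the caller.

```python
import base64
import json


def unverified_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (sketch only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected header.payload.signature")
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_look_right(claims: dict, expected_audience: str,
                      expected_email: str) -> bool:
    """Check issuer, audience, and the push service account identity."""
    return (
        claims.get("iss") in ("https://accounts.google.com",
                              "accounts.google.com")
        and claims.get("aud") == expected_audience
        and claims.get("email") == expected_email
        and claims.get("email_verified") is True
    )
```

Checking the audience and the service account email matters even after signature verification, because any Google-signed ID token would otherwise be accepted.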
IAM Roles and Security
Pub/Sub has a small, well-defined role surface. Use the principle of least privilege:
- roles/pubsub.publisher: can Publish to a topic. Grant on the topic, not the project.
- roles/pubsub.subscriber: can Pull, Ack, ModifyAckDeadline. Grant on the subscription or topic.
- roles/pubsub.viewer: read-only metadata, useful for dashboards.
- roles/pubsub.editor: create / update / delete topics and subscriptions in a project.
- roles/pubsub.admin: full control including IAM policy changes. Reserve for platform team.
VPC Service Controls
Pub/Sub supports VPC Service Controls so that publish and pull traffic from inside the perimeter cannot egress data to topics in projects outside the perimeter. This is the standard control for regulated environments. Combine with CMEK on the topic (messageStoragePolicy plus customer-managed Cloud KMS keys) for envelope encryption.
Service Account Impersonation
For cross-project event flows, prefer giving the producer service account the roles/pubsub.publisher role on the specific topic in the consumer project, rather than copying credentials. This keeps the audit trail clean in Cloud Audit Logs.
Pub/Sub Lite: When and Why
Pub/Sub Lite is a separate service (different APIs, different client library) optimized for cost over convenience. Key differences from regular Pub/Sub:
- Zonal or regional, not global. You choose the location at topic creation time.
- Pre-provisioned capacity in publish/subscribe MiB/s and storage GiB — billed for the capacity reservation, not per-operation. This makes it dramatically cheaper at sustained high throughput.
- Partitioned (Kafka-like): you choose a partition count and ordering is per-partition.
- No push subscriptions, no BigQuery subscriptions, no schemas, no exactly-once, no filtering. You get raw partitioned messaging only.
When to Choose Lite
- Predictable, very high throughput (hundreds of MB/s sustained) where standard Pub/Sub cost would be prohibitive.
- You already have a Kafka-style consumer and just want a managed broker.
- You can tolerate zonal availability or are willing to use regional Lite for HA.
The exam often tempts you with Pub/Sub Lite as a "cheaper Pub/Sub." It is cheaper, but the missing features matter: no BigQuery subscription, no push, no exactly-once, no schemas, and you must pre-provision capacity. Unless the scenario explicitly mentions sustained high throughput and Kafka migration, the answer is regular Pub/Sub.
Monitoring and Operations
Pub/Sub exposes per-topic and per-subscription metrics in Cloud Monitoring:
- pubsub.googleapis.com/subscription/num_undelivered_messages: the backlog.
- pubsub.googleapis.com/subscription/oldest_unacked_message_age: the age of the oldest un-acked message; critical for SLOs.
- pubsub.googleapis.com/subscription/ack_message_count: successful acks per minute.
- pubsub.googleapis.com/topic/send_request_count: publish QPS.
Alert on oldest_unacked_message_age exceeding (say) 60% of your retention to catch broken consumers before data is lost. Alert on DLT publish count being non-zero to catch poison messages.
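The 60% rule of thumb translates into a concrete threshold with simple arithmetic. This sketch uses integer math to avoid rounding surprises; the helper names are hypothetical, and the percentage is the example value from the text rather than a recommendation from Google.

```python
def alert_threshold_seconds(retention_seconds: int, percent: int = 60) -> int:
    """Age (in seconds) of the oldest un-acked message that should alert."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return retention_seconds * percent // 100


def should_alert(oldest_unacked_age_s: int, retention_seconds: int,
                 percent: int = 60) -> bool:
    """True once the backlog age crosses the chosen share of retention."""
    return oldest_unacked_age_s >= alert_threshold_seconds(
        retention_seconds, percent)


SEVEN_DAYS = 7 * 24 * 3600
threshold = alert_threshold_seconds(SEVEN_DAYS)
```

For the default 7-day retention this gives 362,880 seconds, or a little over four days, which leaves roughly three days to repair a broken consumer before messages expire.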
For per-message tracing, enable OpenTelemetry in your client and propagate the googclient_OpenTelemetrySpanContext attribute, which Pub/Sub preserves through the topic.
Pub/Sub fast-lookup numbers: max message size 10 MB; default subscription retention 7 days; max retention 7 days (subscription) or 31 days (topic with messageRetentionDuration); ack deadline range 10–600 seconds; DLT maxDeliveryAttempts range 5–100; ordering throughput 1 MB/s per key; exactly-once is pull-only and regional.
Plain English Explanation
Analogy 1: The Newspaper Delivery Service
A topic is the newspaper. Publishers are journalists; they file articles without knowing who reads them. Each subscription is a different reader's mailbox: the sports fan and the politics nerd subscribe to the same paper but their copies are independent — if the sports fan tears up his copy unread (fails to ack), the politics nerd still gets her copy. A push subscription is home delivery; a pull subscription is the reader walking to the newsstand on her own schedule. Ordering keys are like serial numbers on a comic book series — the publisher promises that within one series, issue #2 always arrives after issue #1.
Analogy 2: The Restaurant Order Ticket Rail
Imagine a busy kitchen with an order rail. Waiters (publishers) clip tickets to the rail (the topic). Cooks (subscribers) take tickets one by one. If a cook drops a ticket (no ack), it stays on the rail and another cook eventually picks it up — that is at-least-once delivery. Exactly-once delivery is when a kitchen manager stands at the rail and physically prevents two cooks from grabbing the same ticket. A dead-letter topic is the "weird orders" board: if five different cooks can't figure out what "extra spicy invisible noodles" means, the ticket moves to that board so the manager can investigate, instead of clogging the main rail.
Analogy 3: The Office Mail Room
The mail room (Pub/Sub) accepts envelopes from any department and routes copies into recipient pigeonholes (subscriptions). Each pigeonhole is independent — emptying yours doesn't empty anyone else's. The mail room keeps every envelope for seven days (default retention); if you take vacation and your pigeonhole fills up, on day eight the oldest envelopes are shredded. A snapshot is a Polaroid of your pigeonhole on Monday morning; if you accidentally throw out important mail on Wednesday, you can ask the mail room to "seek" your pigeonhole back to the Monday-morning state and redeliver everything that was there.
Frequently Asked Questions (FAQs)
Q1: How long are messages retained in Pub/Sub?
A1: A subscription retains un-acked messages for messageRetentionDuration, default 7 days, range 10 minutes to 7 days. A topic can independently retain messages for up to 31 days via its own messageRetentionDuration, which enables seek-to-timestamp and lets newly created subscriptions read historical messages.
Q2: What is the maximum message size?
A2: 10 MB per message. For larger payloads, store the blob in Cloud Storage and publish a Pub/Sub message containing the GCS URI. This is the "claim-check" pattern.
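A hedged sketch of the claim-check decision follows. The `upload` callback is a hypothetical function that stores the blob and returns its gs:// URI; the `claim_check` attribute, the `gcs_uri` field, and reading 10 MB as 10,000,000 bytes are all assumptions of this example.

```python
import json

MAX_MESSAGE_BYTES = 10_000_000  # assumption: "10 MB" as 10,000,000 bytes


def to_publishable(payload: bytes, upload, limit: int = MAX_MESSAGE_BYTES):
    """Return (data, attributes) that fit within the message size limit."""
    if len(payload) <= limit:
        return payload, {"claim_check": "false"}
    # Too large to publish inline: store the blob and send a pointer instead.
    uri = upload(payload)  # hypothetical uploader returning a gs:// URI
    pointer = json.dumps({"gcs_uri": uri}).encode("utf-8")
    return pointer, {"claim_check": "true"}
```

The subscriber inspects the attribute and fetches the object only when it is set, so small messages keep their low-latency path.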
Q3: Does Pub/Sub guarantee exactly-once delivery?
A3: Yes, but with caveats. Exactly-once delivery is opt-in (enableExactlyOnceDelivery: true), pull subscriptions only, and single-region only. It prevents redelivery after a successful ack within that region. It does not deduplicate at publish time and does not span regions.
Q4: Push or pull for Cloud Run?
A4: Push with OIDC. Cloud Run autoscales on HTTPS request count, so a push subscription that POSTs each message to the Cloud Run URL gives you native autoscaling. Configure pushConfig.oidcToken.serviceAccountEmail and grant that SA roles/run.invoker on the Cloud Run service. Private Cloud Run services require OIDC push — there is no other supported pattern.
Q5: When should I use Pub/Sub Lite instead of Pub/Sub?
A5: Use Pub/Sub Lite only when you have sustained high throughput (hundreds of MB/s), can tolerate zonal or single-region operation, and do not need push, BigQuery subscriptions, exactly-once, schemas, or filtering. Lite is significantly cheaper because you pre-provision MiB/s and GiB capacity, but it is a different API and a different service tier — not a drop-in cheaper Pub/Sub.
Q6: How do I replay messages after a buggy consumer?
A6: Either create a snapshot before the bad deploy (gcloud pubsub snapshots create) and seek back to it after, or enable topic-level message retention (up to 31 days) and seek the subscription to a timestamp. Snapshots are subscription-scoped and last up to 7 days; topic retention persists across subscription recreation.
Q7: How do I prevent unauthorized POSTs to my push endpoint?
A7: Enable OIDC token authentication on the push subscription via pushConfig.oidcToken.serviceAccountEmail. Pub/Sub will send a signed ID token in the Authorization header on every request, and your endpoint (or Cloud Run / IAP) validates it. The Pub/Sub service agent needs roles/iam.serviceAccountTokenCreator on the push SA.