Configuring Serverless Environments

Professional Cloud Architect deep dive into GCP serverless: Cloud Run, Cloud Functions, Eventarc, traffic splitting, and cold start optimization.

Introduction to Serverless Compute on GCP

For a Professional Cloud Architect, "Serverless" is about more than just "no servers." It is a philosophy of abstracting away infrastructure to focus on code, achieving automatic scalability, and implementing a pay-as-you-go cost model.

Google Cloud's serverless ecosystem is centered around Cloud Run (for containers) and Cloud Functions (for snippets of code), tied together by Eventarc for event-driven orchestration.

Cloud Run is a managed compute platform that runs stateless containers invocable via web requests or Pub/Sub events. It abstracts away all infrastructure management and scales automatically from zero to thousands of instances. Reference: https://cloud.google.com/run/docs/overview/what-is-cloud-run


Plain-Language Explanation: Serverless Compute

Serverless is like the difference between owning a kitchen and ordering from a high-end food delivery service.

Analogy 1 — The Ghost Kitchen (Cloud Run)

In a traditional restaurant (Compute Engine), you pay for the rent, the stoves, and the staff even if no customers show up. Cloud Run is like a Ghost Kitchen. You bring your own specialized pots and pans (Container Image). When an order (Web Request) comes in, the kitchen lights turn on, the chefs cook the meal, and as soon as the order is gone, the lights turn off and you stop paying. If 1,000 orders come in at once, 1,000 kitchens instantly appear.

Analogy 2 — The Automatic Door Sensor (Cloud Functions & Eventarc)

Cloud Functions are like automatic door sensors. They don't do anything until someone walks by (An Event occurs). When the sensor detects a person (A file is uploaded to GCS), it triggers a specific, small action: "Open the door." You don't need a full security guard (A VM) standing there 24/7 just to open the door.

Analogy 3 — The Water Meter (Pay-as-you-go)

Traditional servers are like paying a flat fee for unlimited water, even if you only drink one glass. Serverless is like having a precision water meter. You only pay for the water that actually passes through the tap (CPU and memory, billed in 100-millisecond increments on Cloud Run). If the tap is off, your bill is zero.

On the PCA exam, if a scenario asks to "run a website with highly variable traffic while minimizing costs during idle periods," the answer is Cloud Run. Reference: https://cloud.google.com/run/docs/overview/what-is-cloud-run


Deploying to Cloud Run

Cloud Run is the modern standard for serverless.

  • Container-based: If it can run in a container, it can run on Cloud Run.
  • Concurrency: Unlike 1st Gen Cloud Functions, which handle one request per instance, a single Cloud Run instance can handle multiple requests at the same time (80 by default, configurable up to 1,000), which is more efficient for web apps.

On the PCA exam, when a scenario explicitly requires reaching a Cloud SQL private IP or a Filestore share from Cloud Run or Cloud Functions, the expected answer is to attach a Serverless VPC Access connector (Cloud Run also offers Direct VPC egress, which sends traffic into the VPC without a connector). VPC peering or Shared VPC alone will not work because serverless services run in a Google-managed tenant project outside your VPC. Reference: https://cloud.google.com/run/docs/overview/what-is-cloud-run

Memorize these Cloud Functions 2nd Gen limits because PCA scenarios use them as elimination clues: max execution time 60 minutes for HTTP-triggered functions (event-driven functions are capped at 9 minutes, the same limit that applied to all of 1st Gen), max instance size 32 GB RAM, and concurrency above 1 request per instance (1st Gen was hard-capped at 1). Cloud Run shares the same underlying platform and supports up to 1,000 concurrent requests per instance (default 80).


Traffic Splitting and Revisions

Every time you deploy to Cloud Run, it creates a new Revision.

  • Blue-Green Deployment: Deploy a new version, test it, and then flip 100% of traffic to it.
  • Canary Testing: Send 5% of traffic to the new revision and 95% to the old one to ensure stability.
  • Rollback: If the new version has a bug, you can instantly revert traffic to the previous healthy revision.
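The canary and rollback patterns above can be sketched as a small helper that builds the arguments for gcloud run services update-traffic with its --to-revisions flag. The service name checkout and the revision names are hypothetical, and the helper insists that percentages sum to exactly 100 as a safety choice of this sketch, not a rule taken from the platform.

```python
def update_traffic_args(service: str, split: dict[str, int]) -> list[str]:
    """Build argv for `gcloud run services update-traffic` from a revision -> percent map."""
    if not split:
        raise ValueError("split must name at least one revision")
    if any(pct < 0 or pct > 100 for pct in split.values()):
        raise ValueError("each percentage must be between 0 and 100")
    # Stricter than strictly necessary: this sketch wants the split to be unambiguous.
    if sum(split.values()) != 100:
        raise ValueError("percentages must sum to 100")
    assignments = ",".join(f"{rev}={pct}" for rev, pct in split.items())
    return ["gcloud", "run", "services", "update-traffic", service,
            f"--to-revisions={assignments}"]


# Hypothetical revisions: 00007 is the known-good one, 00008 is the new deploy.
canary = update_traffic_args("checkout", {"checkout-00007": 95, "checkout-00008": 5})
rollback = update_traffic_args("checkout", {"checkout-00007": 100})

print(" ".join(canary))
print(" ".join(rollback))
```

A rollback is just another traffic assignment: no redeploy happens, because the old revision still exists and only the routing changes.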

Cloud Functions (1st vs. 2nd Gen)

  • 1st Gen: Original version, limited execution time (9 mins) and limited concurrency (1 request per instance).
  • 2nd Gen: (Recommended) Built on top of Cloud Run. Supports longer execution times (up to 60 mins for HTTP-triggered functions), larger instance sizes (up to 32GB RAM), more than one concurrent request per instance, and better event integration through Eventarc.

Event-driven Triggers with Eventarc

Eventarc allows you to build event-driven architectures by routing events from 90+ Google Cloud sources to Cloud Run, Cloud Functions, or GKE.

  • Sources: Cloud Storage (file uploads), Pub/Sub (messages), Cloud Audit Logs (any API call that writes an audit log entry).
  • Architecture Tip: Use Eventarc to decouple your services. Instead of Service A calling Service B, have Service A emit an event that Eventarc picks up.
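To make the receiving side of that decoupling concrete, here is a minimal sketch of how a service might interpret an Eventarc delivery. It assumes the event arrives as an HTTP POST in CloudEvents binary mode (attributes in ce-* headers, payload as JSON in the body) and that the trigger is a direct Cloud Storage trigger; the bucket and object names are made up, and the function name parse_storage_event is hypothetical.

```python
import json

# CloudEvents type for "object finalized" on a direct Cloud Storage trigger.
FINALIZED = "google.cloud.storage.object.v1.finalized"


def parse_storage_event(headers: dict[str, str], body: bytes):
    """Return (bucket, object_name) for a finalized-object event, else None."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    ce = {key.lower(): value for key, value in headers.items()}
    if ce.get("ce-type") != FINALIZED:
        return None  # not an event this handler cares about
    data = json.loads(body)
    return data["bucket"], data["name"]


# Hypothetical delivery, shaped like the request a Cloud Run service would receive.
sample_headers = {
    "Ce-Id": "1234567890",
    "Ce-Specversion": "1.0",
    "Ce-Type": FINALIZED,
    "Ce-Source": "//storage.googleapis.com/projects/_/buckets/uploads-example",
}
sample_body = json.dumps({"bucket": "uploads-example", "name": "reports/q1.csv"}).encode()

result = parse_storage_event(sample_headers, sample_body)
print(result)
```

The handler never learns which service uploaded the file, which is the point of routing through events rather than direct calls.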

Serverless VPC Access Connectors

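Serverless services run in a Google-managed tenant project. To access resources in your private VPC (like a Cloud SQL instance with a private IP or a Filestore share), you typically attach a Serverless VPC Access connector; Cloud Run can alternatively use Direct VPC egress.

  • How it works: The connector is a small group of Google-managed instances placed in a dedicated /28 subnet of your VPC. Outbound traffic from the serverless service is forwarded through them, so it reaches private resources from an internal IP address.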

Secrets Management in Serverless

Never hardcode API keys or passwords in your container or code.

  • Secret Manager: Store your secrets centrally.
  • Integration: Cloud Run and Cloud Functions can expose secrets to your code as environment variables or mount them as files in a volume at runtime.
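From the application's point of view, both integration styles reduce to reading an environment variable or a file. The sketch below shows one way to support either; the variable name DB_PASSWORD and the file path are hypothetical, and the temporary directory only stands in for a mounted secret volume so the example runs anywhere.

```python
import os
import tempfile
from pathlib import Path


def load_secret(env_var: str, file_path: str) -> str:
    """Prefer the environment variable, fall back to the mounted file."""
    value = os.environ.get(env_var)
    if value:
        return value
    path = Path(file_path)
    if path.is_file():
        # Strip the trailing newline that secret payloads often carry.
        return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"secret not found in ${env_var} or {file_path}")


# Demo: simulate a mounted secret file, then an injected environment variable.
with tempfile.TemporaryDirectory() as tmp:
    mounted = Path(tmp) / "db-password"
    mounted.write_text("from-file\n", encoding="utf-8")

    os.environ.pop("DB_PASSWORD", None)
    from_file = load_secret("DB_PASSWORD", str(mounted))

    os.environ["DB_PASSWORD"] = "from-env"
    from_env = load_secret("DB_PASSWORD", str(mounted))
    os.environ.pop("DB_PASSWORD", None)

print(from_file, from_env)
```

Either way, the secret value lives in Secret Manager and never appears in the container image or the source repository.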

Concurrency and Cold Start Optimization

  • Cold Start: The delay when a serverless platform has to start a new instance from scratch to handle a request.
  • Optimization:
    • Min Instances: Keep a minimum number of instances "warm" (but you pay for them).
    • Language Choice: Go and Node.js generally start faster than Java or Python.
    • Concurrency: Higher concurrency reduces the frequency of cold starts as one instance can handle more traffic.

A frequent PCA misconception: setting max-instances does not eliminate cold starts. max-instances only caps the upper bound to protect downstream databases from runaway scaling — it does nothing for latency. To actually warm instances and remove the cold-start delay, you must set min-instances to a non-zero value, which removes the "scale-to-zero" cost benefit. The trade-off is latency vs. idle cost, not a free lunch.

For latency-sensitive APIs where occasional first-request delays are unacceptable, the PCA-expected pattern is Cloud Run with min-instances >= 1 plus a tuned concurrency value (default 80, max 1,000). Raising concurrency lets each warm instance absorb more traffic before a new instance must cold-start, which compounds the benefit of min-instances instead of paying for many idle instances.
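The interaction between concurrency and min-instances can be estimated with Little's law (requests in flight = arrival rate x average latency). The sketch below is a back-of-the-envelope model, not how the Cloud Run autoscaler actually decides; the traffic figures are invented for illustration.

```python
import math


def instances_needed(rps: int, avg_latency_ms: int, concurrency: int) -> int:
    """Instances required to hold the steady-state in-flight requests."""
    # Little's law: in-flight requests = rps * latency (in seconds).
    return math.ceil(rps * avg_latency_ms / (1000 * concurrency))


def cold_started_instances(rps: int, avg_latency_ms: int,
                           concurrency: int, min_instances: int) -> int:
    """Instances that must start from scratch beyond the warm pool."""
    needed = instances_needed(rps, avg_latency_ms, concurrency)
    return max(0, needed - min_instances)


# Hypothetical peak: 400 requests/s at 200 ms each, with 2 warm instances.
for concurrency in (1, 10, 80):
    needed = instances_needed(400, 200, concurrency)
    cold = cold_started_instances(400, 200, concurrency, min_instances=2)
    print(f"concurrency={concurrency}: needs {needed}, cold-starts {cold}")
```

Under these assumed numbers, raising concurrency from 10 to 80 lets the two warm instances cover the whole peak, so no request waits on a cold start.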


Custom Domains and SSL

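  • Automatic SSL: Cloud Run provides a *.run.app URL with a managed TLS certificate out of the box.
  • Custom Domains: You can map your own domain (e.g., api.example.com) by placing a global external Application Load Balancer (formerly HTTP(S) Load Balancing) in front of your Cloud Run service, with a serverless network endpoint group (NEG) as the backend.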

Observability for Serverless

  • Cloud Logging: All stdout and stderr from your container are automatically captured.
  • Cloud Monitoring: Track request count, latency, and instance count.
  • Cloud Trace: Use distributed tracing to see how long requests take as they move through your serverless services.
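Because stdout is captured automatically, the simplest way to get filterable logs is to print one JSON object per line; Cloud Logging treats fields such as severity and message specially when it parses such a line. The helper below is a minimal sketch, and the field names order_id and attempt are hypothetical application fields.

```python
import json
import sys


def log_entry(severity: str, message: str, **fields) -> str:
    """Write one structured log line to stdout and return it."""
    entry = {"severity": severity, "message": message, **fields}
    line = json.dumps(entry)
    # One JSON object per line; flush so the line is not lost on shutdown.
    print(line, file=sys.stdout, flush=True)
    return line


info_line = log_entry("INFO", "order accepted", order_id="A-1001")
error_line = log_entry("ERROR", "payment provider timeout", order_id="A-1001", attempt=3)
```

Extra keyword arguments end up as searchable fields on the log entry, which is usually more useful than embedding them in the message string.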

IAM Roles for Serverless Services

  • Service Identity: Each Cloud Run/Function service should have its own dedicated Service Account.
  • Permissions: Use the principle of least privilege. A service that only reads from GCS should only have the roles/storage.objectViewer role.

FAQ — Serverless Compute Configuration

Q1. When should I choose Cloud Run over Cloud Functions?

Choose Cloud Run for most web applications and APIs, or when you need a language, system package, or binary that the standard Cloud Functions runtimes do not provide. Choose Cloud Functions for simple, single-purpose event handlers or small snippets of code.

Q2. Can Cloud Run scale to zero?

Yes. By default, Cloud Run will stop all instances if there is no traffic, meaning you pay $0 when your app is not in use.

Q3. What is the "Concurrency" setting in Cloud Run?

It is the maximum number of simultaneous requests that a single container instance can handle. Tuning this correctly is key to minimizing cold starts and optimizing costs.

Q4. How do I connect Cloud Run to a private database?

Use a Serverless VPC Access Connector and ensure your database is configured for Private IP access within the VPC.

Q5. Is serverless always cheaper than VMs?

No. For applications with constant, high traffic, Compute Engine VMs or a managed instance group (MIG) covered by committed use discounts might be more cost-effective. Serverless is cheapest for bursty or low-to-medium traffic.
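The reasoning behind that answer is a break-even calculation: serverless costs more per busy hour but nothing when idle. The rates below are placeholders chosen only to make the arithmetic visible, not real Google Cloud prices, and the model ignores per-request fees and free tiers.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cost_cents(busy_fraction: float,
                       serverless_cents_per_busy_hour: float,
                       vm_cents_per_hour: float) -> tuple[float, float]:
    """Return (serverless_cost, vm_cost) for one month, in cents."""
    serverless = busy_fraction * HOURS_PER_MONTH * serverless_cents_per_busy_hour
    vm = HOURS_PER_MONTH * vm_cents_per_hour  # the VM bills whether busy or idle
    return serverless, vm


def break_even_busy_fraction(serverless_cents_per_busy_hour: float,
                             vm_cents_per_hour: float) -> float:
    """Busy fraction above which the always-on VM becomes cheaper."""
    return vm_cents_per_hour / serverless_cents_per_busy_hour


# Placeholder rates (NOT real prices): serverless 10 cents per busy hour, VM 4 cents per hour.
threshold = break_even_busy_fraction(10, 4)
bursty = monthly_cost_cents(0.05, 10, 4)   # busy 5% of the time
steady = monthly_cost_cents(0.90, 10, 4)   # busy 90% of the time

print(f"break-even busy fraction: {threshold:.2f}")
print(f"bursty: serverless={bursty[0]:.0f} vm={bursty[1]:.0f}")
print(f"steady: serverless={steady[0]:.0f} vm={steady[1]:.0f}")
```

With real numbers plugged in, the same comparison tells you whether a workload sits on the serverless side or the VM side of the break-even point.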


Final Architect Tip

For the PCA exam, remember that Cloud Run behaves like containers on Kubernetes without a cluster to manage: it is compatible with the Knative Serving API, the same standard used to run serverless workloads on GKE. If a question involves "moving a containerized app to a serverless platform," Cloud Run is almost always the intended answer. Also, pay attention to VPC access, which is a common stumbling block in architectural scenarios.
