Cloud Load Balancing

Introduction

Google Cloud Load Balancing (GCLB) is a fully distributed, software-defined managed service deployed across Google's global edge points of presence (PoPs). Unlike traditional appliance-based load balancers, GCLB has no hardware to provision, scales automatically to handle one million queries per second without pre-warming, and presents a single global anycast IPv4 or IPv6 address for global products. PCNE candidates must internalise the 2022 naming refresh: legacy "HTTP(S) Load Balancer" is now Application Load Balancer, "TCP/SSL Proxy" is the Proxy Network Load Balancer, and legacy "Network LB" is the Passthrough Network Load Balancer. The exam tests both names. This study note walks through every load balancer family, backend type (Instance Group, Zonal NEG, Internet NEG, Hybrid NEG, Serverless NEG, Private Service Connect NEG), protocol option (HTTP/1.1, HTTP/2, HTTP/3+QUIC, gRPC, TCP, UDP, ESP), and the advanced traffic features — URL maps, weighted backend services, session affinity, mTLS to backend, and traffic mirroring — that account for roughly 20 percent of PCNE scenario questions.

Forwarding rule + Target proxy + URL map + Backend service + Backend (MIG/NEG) is the canonical five-tier GCLB resource hierarchy. Every Application Load Balancer composes exactly these five resources, and each is created independently with its own gcloud compute verb. Knowing this hierarchy lets you debug any LB misconfiguration from the IP down to the VM.

Global vs Regional Application Load Balancer

Global External Application Load Balancer

The Global External Application Load Balancer (Premium Tier) uses Google's global network and anycast IP. A client in Tokyo and a client in Frankfurt hit the same 34.x.x.x IP but are absorbed at the nearest PoP, where the GFE (Google Front End) terminates TLS and forwards over Google's backbone to the closest healthy backend with capacity. It supports cross-region failover, Cloud CDN integration, Cloud Armor, Identity-Aware Proxy, and serverless NEGs. Both variants use the same resource types (forwarding rule, target HTTP(S) proxy, URL map, backend service); they differ in data plane. The classic variant (--load-balancing-scheme=EXTERNAL) runs on the legacy GFE data plane alone; the new variant (--load-balancing-scheme=EXTERNAL_MANAGED) uses Envoy-based proxies and unlocks advanced routing (header-based, weighted, fault injection).

Regional External Application Load Balancer

The Regional External Application Load Balancer is Envoy-based, lives entirely inside one region, and is the right choice when data residency demands traffic never leave a region (e.g. EU data sovereignty under europe-west4). It cannot use Cloud CDN or a global anycast IP. It supports both Premium and Standard Tier networking; Standard Tier is only available on regional load balancers, which is why exam scenarios often pair the two. Cost can be lower because traffic stays in-region.

Cross-region Internal Application Load Balancer

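Released GA in 2023, the Cross-region Internal Application Load Balancer spans multiple regions yet keeps its VIPs private (RFC 1918). It is the modern answer for active-active internal microservices that must serve clients from backends in, say, us-central1 and europe-west1. It does not use anycast: each forwarding rule takes a regional internal IP address from a subnet in its region, any of those VIPs can reach healthy backends in every configured region, and clients are typically steered to the nearest VIP with Cloud DNS geolocation routing policies.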

The Global External Application Load Balancer is the only Google Cloud LB family that supports Cloud CDN, Cloud Armor edge security policies, and HTTP/3 QUIC simultaneously on a single forwarding rule. If a scenario mentions any two of {global anycast IP, CDN, Cloud Armor, IAP}, the answer is Global External Application LB — never Network LB and never Regional LB.

Network Load Balancer: Passthrough vs Proxy

Passthrough Network Load Balancer (External / Internal)

The Passthrough Network Load Balancer delivers packets directly to backend VMs without rewriting the source or destination IP; the external variant is built on Maglev consistent hashing. The backend sees the original client IP (no X-Forwarded-For needed). It is regional, supports TCP, UDP, ESP, GRE, and ICMP, and is the only GCLB choice for non-TCP/UDP protocols. Use it for game servers, SIP, IPsec gateways, or when you absolutely need source-IP preservation. Health check probes for the internal variant come from 35.191.0.0/16 and 130.211.0.0/22; the external variant is probed from 35.191.0.0/16, 209.85.152.0/22, and 209.85.204.0/22.

Proxy Network Load Balancer (External / Internal)

The Proxy Network Load Balancer terminates TCP at the proxy layer (GFE or Envoy, depending on the variant) and opens a new TCP connection to the backend. It supports SSL offload (formerly SSL Proxy LB), client-IP preservation via PROXY protocol version 1 (--proxy-header=PROXY_V1), and global anycast (External variant only). Use it for non-HTTP TCP services that still need TLS termination, e.g. a custom MQTT broker behind TLS.
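
Because the proxy opens a new TCP connection, a backend that wants the original client address has to read it from the PROXY protocol header that arrives before the application bytes. The sketch below parses the human-readable version 1 line as described in the public HAProxy PROXY protocol specification; the function and type names are illustrative and not part of any Google library.

```python
import ipaddress
from typing import NamedTuple, Optional, Tuple


class ProxyInfo(NamedTuple):
    family: str      # "TCP4" or "TCP6"
    src_ip: str      # original client address
    dst_ip: str      # address the client connected to
    src_port: int
    dst_port: int


def parse_proxy_v1(data: bytes) -> Tuple[Optional[ProxyInfo], bytes]:
    """Split a PROXY protocol v1 header from the start of a TCP stream.

    Returns (info, remaining_payload). info is None for "PROXY UNKNOWN".
    Raises ValueError when the header is missing or malformed.
    """
    end = data.find(b"\r\n")
    # The v1 spec caps the header line at 107 bytes including the CRLF.
    if not data.startswith(b"PROXY ") or end == -1 or end + 2 > 107:
        raise ValueError("no PROXY v1 header")
    fields = data[:end].decode("ascii").split(" ")
    payload = data[end + 2:]
    if fields[1] == "UNKNOWN":
        return None, payload
    if fields[1] not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY v1 header")
    src, dst = fields[2], fields[3]
    ipaddress.ip_address(src)  # raises ValueError on a bad address
    ipaddress.ip_address(dst)
    return ProxyInfo(fields[1], src, dst, int(fields[4]), int(fields[5])), payload
```

The backend must only accept this header from the load balancer's proxies, since any client that can reach the port directly could otherwise spoof its source address.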

Decision rule: passthrough vs proxy

Pick passthrough when you need client-IP preservation, non-TCP protocols, ultra-low latency (no proxy hop), or UDP. Pick proxy when you need TLS offload, global anycast, or session-aware features.

Candidates routinely pick "Passthrough Network LB" for a global anycast scenario. Passthrough is regional only. Only the External Proxy Network LB and the Global External Application LB give you a global anycast VIP. If the requirement says "single IP advertised from multiple continents", eliminate every passthrough option.

Internal vs External Load Balancing

External load balancers

External load balancers face the public internet with a routable public IP. They integrate with Cloud Armor, Cloud CDN, and Identity-Aware Proxy. Premium Tier delivers traffic over Google's backbone; Standard Tier uses the public internet from the region's edge.

Internal load balancers

Internal load balancers (Internal Application LB, Internal Proxy Network LB, Internal Passthrough Network LB) advertise an RFC 1918 IP inside a VPC. They are reachable by VMs in the same VPC, by peered VPCs over VPC Network Peering, by consumers in other VPCs through Private Service Connect, and by on-premises clients connected through Cloud VPN or Cloud Interconnect. Clients outside the load balancer's region need global access enabled on the forwarding rule. Internal LBs have no public surface and no Cloud Armor edge integration, but the Envoy-based Internal Application LB (scheme INTERNAL_MANAGED) supports security policies attached to the backend service.

Proxy-only subnet requirement

Every Envoy-based regional LB (Regional External App LB, Regional Internal App LB, Regional Proxy Network LB) requires a proxy-only subnet with --purpose=REGIONAL_MANAGED_PROXY and --role=ACTIVE, sized at least /26 (64 addresses). Forgetting this subnet is the most common create-time error.
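
The sizing rule is easy to check before running any create command. The following is a minimal sketch using only the Python standard library; the function name and the sample CIDRs are illustrative assumptions.

```python
import ipaddress

MIN_PREFIX_LEN = 26  # a /26 holds 64 addresses, the minimum stated above


def proxy_only_subnet_ok(cidr: str) -> bool:
    """Return True when an IPv4 CIDR is at least as large as a /26."""
    net = ipaddress.ip_network(cidr, strict=True)
    # A smaller prefix length means a larger subnet.
    return net.version == 4 and net.prefixlen <= MIN_PREFIX_LEN
```

For example, proxy_only_subnet_ok("10.129.0.0/23") passes, while a /28 is rejected because it holds only 16 addresses.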

Backend Services, MIGs, and Weighted Traffic Splitting

Backend service balancing modes

A backend service distributes load using a balancing mode: UTILIZATION (target CPU 0.0-1.0), RATE (target requests-per-second per instance or per endpoint), or CONNECTION (target concurrent connections, used for TCP/SSL Proxy and Passthrough). Each backend in the service has a maxUtilization, maxRate, or maxConnections cap plus an optional capacityScaler from 0.0 to 1.0 used to drain traffic gracefully.
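
To see why capacityScaler works as a drain knob, it helps to treat each backend's effective capacity as its target multiplied by the scaler. The sketch below assumes RATE mode and splits traffic in proportion to effective capacity; that proportional split is a deliberate simplification (the real load balancer also prefers backends close to the client), and all names are illustrative.

```python
from typing import Dict, Tuple


def traffic_shares(backends: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Map backend name -> share of traffic.

    Each value in `backends` is (max_rate, capacity_scaler).
    """
    caps = {}
    for name, (max_rate, scaler) in backends.items():
        if not 0.0 <= scaler <= 1.0:
            raise ValueError(f"capacityScaler for {name} must be within 0.0-1.0")
        caps[name] = max_rate * scaler  # effective capacity
    total = sum(caps.values())
    if total == 0:
        raise ValueError("every backend is drained")
    return {name: cap / total for name, cap in caps.items()}
```

Setting a backend's scaler to 0.0 gives it a share of zero, which is the graceful drain described above: no new requests are assigned, while the backend stays attached to the service.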

Weighted backend services for traffic splitting

The new EXTERNAL_MANAGED Envoy data plane supports weighted backend services inside a URL map's routeAction.weightedBackendServices. You can send 90 percent of traffic to backend-v1 and 10 percent to backend-v2 for canary releases without DNS changes. Weights are integers 0-1000 and must sum to a non-zero value across the route.
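
A weighted split can be pictured as a weighted random choice per request. The sketch below is a simulation of that idea, not the load balancer's actual algorithm; the backend names come from the example above and the function name is illustrative.

```python
import random
from typing import Dict


def pick_backend(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a backend service name with probability proportional to its weight."""
    if any(not 0 <= w <= 1000 for w in weights.values()):
        raise ValueError("weights must be integers between 0 and 1000")
    if sum(weights.values()) == 0:
        raise ValueError("weights must sum to a non-zero value")
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

With weights of 900 and 100, roughly one request in ten lands on backend-v2 over a large sample; any pair of weights in the same 9:1 ratio produces the same split.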

Health checks

Health checks live as a separate resource. For most load balancers the probes originate from 35.191.0.0/16 and 130.211.0.0/22; for hybrid NEG backends behind Envoy-based load balancers, distributed Envoy health checks originate from the proxy-only subnet instead. Configure --check-interval, --timeout, --healthy-threshold, and --unhealthy-threshold. For VPC-native GKE pods, use a BackendConfig with a customised health check path.

Firewall rules for health checks must allow ingress from 35.191.0.0/16 and 130.211.0.0/22. This is the single most-tested firewall fact on PCNE. The Envoy-based managed proxies additionally require allowing the proxy-only subnet CIDR. Without these rules, every backend shows as UNHEALTHY even though the service itself responds fine.
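
When backends show as UNHEALTHY, a quick audit is to compare the firewall's allowed source ranges against the two probe ranges. This is a minimal sketch using the standard library; the function names are illustrative, and the allowed ranges would come from your own firewall rules.

```python
import ipaddress
from typing import Iterable, List

HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("35.191.0.0/16"),
    ipaddress.ip_network("130.211.0.0/22"),
]


def is_health_check_probe(src_ip: str) -> bool:
    """Return True when a source address falls inside a probe range."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in HEALTH_CHECK_RANGES)


def uncovered_ranges(allowed_cidrs: Iterable[str]) -> List[str]:
    """List the probe ranges that no allowed source range fully covers."""
    allowed = [ipaddress.ip_network(c) for c in allowed_cidrs]
    return [
        str(hc)
        for hc in HEALTH_CHECK_RANGES
        if not any(a.version == hc.version and hc.subnet_of(a) for a in allowed)
    ]
```

An empty result from uncovered_ranges means both probe ranges are admitted; for Envoy-based regional load balancers the proxy-only subnet CIDR still has to be allowed separately.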

URL Map and Path Matchers

Host rules and path matchers

The URL map is the Layer-7 routing table. It contains host rules (matching the Host: header) that point to path matchers; each path matcher contains path rules and an optional default service. Path matching uses prefix match (/api/*), full match, or regex (Envoy data plane only). Example: shop.example.com → path matcher shop-paths → /checkout/* to checkout-service, default to frontend-service.
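
The lookup order (host rule, then path matcher, then path rule or default) can be sketched in a few lines. This is a simplified model covering only exact and trailing-wildcard path rules, with the longest matching rule winning; the dictionaries mirror the shop.example.com example, and default-service is a hypothetical URL map default.

```python
from typing import Dict

HOST_RULES: Dict[str, str] = {"shop.example.com": "shop-paths"}

PATH_MATCHERS = {
    "shop-paths": {
        "default": "frontend-service",
        "rules": [("/checkout/*", "checkout-service")],
    }
}

URL_MAP_DEFAULT = "default-service"  # hypothetical; used when no host rule matches


def route(host: str, path: str) -> str:
    """Return the backend service name for a request's host and path."""
    matcher = PATH_MATCHERS.get(HOST_RULES.get(host, ""))
    if matcher is None:
        return URL_MAP_DEFAULT
    best_len, best = -1, matcher["default"]
    for pattern, service in matcher["rules"]:
        if pattern.endswith("/*"):
            # "/checkout/*" matches anything under "/checkout/".
            hit = path.startswith(pattern[:-1])
        else:
            hit = path == pattern
        if hit and len(pattern) > best_len:
            best_len, best = len(pattern), service
    return best
```

Note that in this model /checkout/* does not match the bare path /checkout, so a rule set that needs both usually lists both patterns.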

Advanced routing actions

On the EXTERNAL_MANAGED scheme, route rules expose routeAction with:

urlRewrite — rewrite host or path before forwarding.
corsPolicy — server-side CORS handling.
faultInjectionPolicy — inject delay or HTTP abort (5xx) at a configurable percentage.
retryPolicy — automatic retries on 5xx, gateway-error, connect-failure.
requestMirrorPolicy — copy traffic to a shadow backend (see Traffic Mirroring).
timeout — per-route timeout override (default backend service timeout is 30 s).

Header-based routing

Match rules support headerMatches (exact, prefix, suffix, regex, range) and queryParameterMatches, enabling A/B testing by X-Beta-User or canarying by cookie.

Session Affinity

Affinity types

Session affinity sends requests from the same client to the same backend instance. Available types depend on the LB:

NONE — pure round robin (default).
CLIENT_IP — hash of client IP; works on all LB types including Passthrough.
CLIENT_IP_PORT_PROTO — five-tuple hash (Passthrough only).
GENERATED_COOKIE — GCLB cookie set by the LB (Application LB only).
HEADER_FIELD — hash of a named HTTP header (Envoy data plane only).
HTTP_COOKIE — custom cookie name + TTL (Envoy data plane only).

Affinity vs balancing mode interaction

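When a backend exceeds its maxRate/maxUtilization, GCLB breaks affinity to avoid overload. Affinity is also lost when the backend becomes unhealthy or is removed. To keep affinity as stable as possible, leave headroom in the balancing target; on the Envoy data plane, cookie and header affinity rely on localityLbPolicy set to RING_HASH or MAGLEV, with consistentHash.minimumRingSize controlling how many ring entries are used.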

For sticky shopping carts on multi-region MIGs, use HTTP_COOKIE affinity with a TTL of 3600 seconds and consistentHash.minimumRingSize: 1024. This combination survives backend scale-up events without re-hashing every client, unlike CLIENT_IP which thrashes when corporate NATs change source IPs.
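
The reason a hash ring survives scale-up is that adding a backend only inserts new points on the ring, so a cookie either keeps its old backend or moves to the new one. The sketch below is a generic consistent-hash ring, not Envoy's implementation; VNODES_PER_BACKEND is an illustrative knob loosely analogous to ring size.

```python
import bisect
import hashlib
from typing import List

VNODES_PER_BACKEND = 100  # illustrative; more points give a smoother spread


def _hash(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, backends: List[str]):
        # Place several virtual points per backend on the ring.
        self._points = sorted(
            (_hash(f"{b}#{i}"), b) for b in backends for i in range(VNODES_PER_BACKEND)
        )
        self._keys = [point for point, _ in self._points]

    def lookup(self, cookie: str) -> str:
        """Return the backend owning the first ring point after the cookie's hash."""
        idx = bisect.bisect(self._keys, _hash(cookie)) % len(self._keys)
        return self._points[idx][1]
```

Growing the ring from three backends to four moves only the cookies that now belong to the fourth backend; every other client stays on the backend it already had.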

Hybrid NEG for On-Premises and Multi-Cloud

What is a hybrid NEG

A Hybrid Connectivity Network Endpoint Group (--network-endpoint-type=NON_GCP_PRIVATE_IP_PORT) lets a Google Cloud load balancer point to backends located on-premises, in AWS, in Azure, or in another Google Cloud project, reachable via Cloud VPN, Cloud Interconnect, or Cross-Cloud Interconnect. Each endpoint is an IP:port tuple of the remote target.

Use cases

Migration: front your on-prem datacenter with a global anycast IP during cloud migration.
Multi-cloud failover: primary backends in Google Cloud, failover hybrid NEG pointing to AWS ALB.
Compliance: keep workloads on-prem but expose them via Google's edge for DDoS protection.

Constraints

Hybrid NEGs are supported by the Global and Regional External Application LB, the Internal Application LB, and the Proxy Network LB families, but not by the Passthrough Network LB. Global load balancers that front them require Premium Tier networking. On Envoy-based load balancers, health check probes for hybrid NEG endpoints originate from the proxy-only subnet, so the on-premises firewall must admit that range.

Serverless NEG: Cloud Run, App Engine, Cloud Functions

Anatomy of a Serverless NEG

A Serverless NEG (--network-endpoint-type=SERVERLESS) contains no IP endpoints; it points the LB at a serverless service instead. You bind it with --cloud-run-service (optionally with --cloud-run-tag), --app-engine-service and --app-engine-version, or --cloud-function-name. This unlocks Cloud Armor, Cloud CDN, custom domains with managed SSL certificates, and IAP for serverless workloads.

URL map fan-out

A single URL map can fan out /api/* to a Cloud Run NEG, /legacy/* to an App Engine NEG, and /static/* to a Cloud Storage backend bucket — letting you replatform incrementally behind one anycast IP.

Limits

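Serverless NEGs attach only to Application Load Balancers. The Global External Application LB (classic and EXTERNAL_MANAGED) accepts Cloud Run, App Engine, and Cloud Functions NEGs; the Regional External and Internal Application LBs accept Cloud Run NEGs only, and the NEG must be in the same region as the load balancer. Passthrough and Proxy Network LBs do not support serverless NEGs.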

Private Service Connect Endpoints as Backends

PSC NEG types

Two PSC NEG types matter:

--network-endpoint-type=PRIVATE_SERVICE_CONNECT with --psc-target-service — point the LB to a published service via a PSC endpoint, used to consume a partner SaaS or a service in another VPC.
--network-endpoint-type=GCE_VM_IP_PORTMAP — used by the producer side to expose a service.

Consuming Google APIs through PSC

For Internal Application LB or Internal Proxy NLB, you can attach a PSC NEG that targets a Google-managed API (e.g. vertexai-googleapis-com) and reach that API from on-prem clients over Interconnect without using public IPs.

Why PSC NEG matters

PSC NEGs solve the "no transitive peering" problem: a hub VPC can expose downstream service VPCs to on-prem clients without flattening the IP space. Combined with cross-region Internal Application LB, you build a private global API gateway.

HTTP/2, HTTP/3 / QUIC, and gRPC

HTTP/2 to backends

Set the backend service --protocol=HTTP2 (or --protocol=H2C for cleartext) so the LB multiplexes streams to the backend. Required for gRPC services, since gRPC is HTTP/2 only. Also recommended for any latency-sensitive backend to avoid head-of-line blocking.

HTTP/3 and QUIC at the edge

Enable HTTP/3 with --quic-override=ENABLE on the target HTTPS proxy. The LB advertises alt-svc: h3=":443", and supporting browsers (Chrome, Edge, Firefox, Safari 16+) switch to QUIC over UDP/443. QUIC reduces handshake to 1-RTT or 0-RTT and recovers faster from packet loss on mobile networks. Backend protocol remains HTTPS/HTTP2 — QUIC is edge-only.

gRPC routing

For gRPC, the URL map supports gRPC-style serviceName and methodName matching when the LB scheme is INTERNAL_SELF_MANAGED (Traffic Director) or EXTERNAL_MANAGED. For simple deployments, treat gRPC like HTTP/2 and route by :path.

mTLS to Backend and Backend Authenticated TLS

mTLS from client to LB

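On the Global External Application LB, attach a ServerTlsPolicy (Network Security API) that references a Certificate Manager trust config to the target HTTPS proxy to enable mTLS at the edge. Clients must present a certificate that chains to the trust config; the LB can pass the validation result and certificate details to the backend through custom request headers that you define (for example, headers named in the X-Client-Cert-* style).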

Authenticated TLS to backend

Set --protocol=HTTPS on the backend service and bind a BackendAuthenticationConfig to require the LB to verify the backend's certificate against a Certificate Manager trust config. This is the only way to achieve true end-to-end TLS where the LB validates (not just trusts) the backend.

Use cases

End-to-end encryption for regulated workloads (PCI-DSS, HIPAA), zero-trust microservices where every hop is mutually authenticated, and replacing self-signed certificate workarounds for legacy backends.

Traffic Mirroring (Shadow Traffic)

How mirroring works

routeAction.requestMirrorPolicy.backendService copies the request to a shadow backend service while the original request continues to the primary. The mirrored response is discarded; only the primary's response reaches the client. Available on the EXTERNAL_MANAGED Application LB.

Why use it

Pre-production validation of a v2 deployment with real production traffic, capacity testing without user impact, and security inspection (forward a copy to an IDS backend).

Caveats

Mirrored requests count toward backend service quotas. Body is mirrored fully, so be mindful of cost and privacy — strip PII before forwarding to non-production environments.

Cloud Armor and Identity-Aware Proxy Integration

Cloud Armor on the LB

Cloud Armor attaches a securityPolicy to a backend service for edge WAF, OWASP Top 10 rules, geo-based blocks, rate limiting with --action=rate-based-ban, and bot management. Adaptive Protection automatically detects L7 DDoS patterns and recommends rule additions. Edge policies (a separate --type=CLOUD_ARMOR_EDGE) sit closer to the client and run before backend selection — use them for IP allow lists and cookie-based filtering.

Identity-Aware Proxy

Setting --iap=enabled on the backend service forces every request to authenticate via Google Identity, OIDC, or external identities. IAP adds a signed JWT header (X-Goog-IAP-JWT-Assertion) that the backend must verify. IAP is supported on Global External Application LB, Regional External Application LB, and Internal Application LB. The combination of IAP plus Cloud Armor plus signed URLs is the recommended zero-trust pattern for any internal admin surface.

Service Extensions

Service Extensions (callouts) let you plug a custom gRPC service into the request path for header rewriting, custom auth, or routing decisions. Callouts speak Envoy's ext_proc protocol and are configured as extension chains inside traffic extension or route extension resources. Available on EXTERNAL_MANAGED Application LB and Internal Application LB. Useful when you outgrow the URL map and want code-driven routing without writing your own ingress.

End-to-End gcloud Walkthrough

Reserve a global anycast IP

Run gcloud compute addresses create web-ip --ip-version=IPV4 --global to reserve a static anycast IPv4 address. For dual-stack, reserve a second address with --ip-version=IPV6. A reserved static address that is not attached to a forwarding rule is billed as unused, so release addresses you no longer need and check the current pricing page for exact rates.

Create the backend service and attach NEGs

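Use gcloud compute backend-services create web-bes --global --load-balancing-scheme=EXTERNAL_MANAGED --protocol=HTTPS --enable-cdn. Attach a Cloud Run Serverless NEG with gcloud compute backend-services add-backend web-bes --global --network-endpoint-group=run-neg --network-endpoint-group-region=us-central1. Backend services that front serverless NEGs do not take health checks or a named port, so omit --health-checks and --port-name here; add them (--port-name=https --health-checks=https-hc) on backend services that front MIGs or zonal NEGs. For on-prem failover, create a second backend service with a hybrid NEG and its own health check, because serverless NEGs cannot share a backend service with other backend types.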

Build the URL map with weighted routing

Author a YAML URL map with defaultRouteAction.weightedBackendServices containing web-bes-v1 weight 900 and web-bes-v2 weight 100, then import with gcloud compute url-maps import web-map --source=web-map.yaml --global. Bind a target HTTPS proxy with managed SSL certificate, then create the forwarding rule with --load-balancing-scheme=EXTERNAL_MANAGED --ports=443 --address=web-ip.

Tier selection at create time

Network Service Tier is set per forwarding rule via --network-tier=PREMIUM or STANDARD. Premium gives global anycast and Google backbone routing; Standard restricts you to a single region and the public internet. Mixing tiers within one load balancer is not allowed — the tier applies to all forwarding rules of the same scheme.

Logging, Monitoring, and Quotas

Cloud Logging fields

Enable LB logging with --enable-logging --logging-sample-rate=1.0 on the backend service. Each entry contains httpRequest, jsonPayload.statusDetails (e.g. backend_timeout, failed_to_connect_to_backend, client_disconnected_before_any_response), and jsonPayload.backendTargetProjectNumber. The statusDetails field is the single best debugging signal for any 5xx case — memorize the common values.
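
Once logs are exported, a tally of statusDetails across 5xx entries usually points at the failing layer. The sketch below assumes each entry is a dictionary shaped like an exported log entry, with httpRequest.status and jsonPayload.statusDetails present; the function name is illustrative.

```python
from collections import Counter
from typing import Iterable, Mapping


def tally_5xx(entries: Iterable[Mapping]) -> Counter:
    """Count statusDetails values across log entries with a 5xx status."""
    counts: Counter = Counter()
    for entry in entries:
        status = int(entry.get("httpRequest", {}).get("status", 0))
        if 500 <= status <= 599:
            details = entry.get("jsonPayload", {}).get("statusDetails", "unknown")
            counts[details] += 1
    return counts
```

Calling tally_5xx(entries).most_common(3) gives the three most frequent failure reasons, which is often enough to tell a backend problem from a routing or client-side one.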

Cloud Monitoring metrics

Key metrics under loadbalancing.googleapis.com: https/request_count, https/backend_latencies, https/request_bytes_count. Cross-reference with https/backend_request_count to detect a routing-layer drop (LB receives but does not forward). For QUIC, https/frontend_tcp_rtt is replaced by https/frontend_quic_rtt.

Quotas worth knowing

Forwarding rules per project default to 75; URL maps default to 50; backend services per project default to 75; managed SSL certificates default to 100. Defaults vary by project and change over time, so inspect the current values with gcloud compute project-info describe --project=PROJECT --format='value(quotas)' and request increases on the Quotas page. The hard limit on URL map size is 256 KB serialized.

Set --logging-sample-rate=1.0 (100 percent sampling) only for non-production or during incident investigation; sustained 100 percent sampling at high QPS can drive Cloud Logging ingestion costs above the LB itself. Production default of 0.1 (10 percent) is a sensible balance, and statusDetails errors always emit regardless of sample rate.

Exam Tips and Common Traps

Decision tree shortcuts

HTTPS + global anycast + Cloud CDN → Global External Application LB.
TCP/UDP + preserve client IP → Passthrough Network LB (regional).
TCP + TLS offload + global anycast → External Proxy Network LB.
Internal microservice over HTTPS → Internal Application LB (regional or cross-region).
Cloud Run + Cloud Armor → Global External App LB + Serverless NEG.
On-prem backend behind global IP → Global External App LB + Hybrid NEG.
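
The shortcuts above reduce to a small decision function. This sketch encodes only the rules listed in this note, with illustrative parameter names; it ignores backend type (serverless or hybrid NEG), which the last two shortcuts add on top of the same Global External Application LB answer.

```python
def pick_load_balancer(protocol: str, external: bool,
                       global_anycast: bool = False,
                       preserve_client_ip: bool = False) -> str:
    """Return the load balancer family suggested by the shortcuts above."""
    proto = protocol.upper()
    if proto in ("HTTP", "HTTPS"):
        if not external:
            return "Internal Application LB"
        if global_anycast:
            return "Global External Application LB"
        return "Regional External Application LB"
    # UDP, ESP, GRE, ICMP, or any need for the original client IP: passthrough.
    if proto != "TCP" or preserve_client_ip:
        if global_anycast:
            raise ValueError("Passthrough Network LB is regional only")
        return "Passthrough Network LB"
    return "External Proxy Network LB" if external else "Internal Proxy Network LB"
```

The ValueError branch mirrors the common trap: a requirement for both client-IP preservation at the packet level and a global anycast VIP has no passthrough answer.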

Backend protocol mismatch

If you set --protocol=HTTP2 on the backend service but your backend only speaks HTTP/1.1, the LB returns 502s. Match the backend service protocol to what the backend actually speaks.

Standard vs Premium Tier

Standard Tier networking forces regional egress and disables global anycast. Most exam scenarios mentioning "global" imply Premium Tier.

FAQs

Q: When should I pick Internal Passthrough Network LB over Internal Application LB? A: Pick Internal Passthrough when you need to load balance non-HTTP protocols (databases, custom TCP, UDP DNS, IPsec) or when you must preserve the original client IP for whitelisting on the backend. Pick Internal Application LB when you have HTTP(S) microservices that benefit from URL map routing, header-based traffic splitting, or Cloud Trace integration.

Q: Can a single forwarding rule handle both IPv4 and IPv6? A: No. You create two forwarding rules (one IPv4, one IPv6) pointing to the same target proxy and URL map. The Global External Application LB supports dual-stack this way; budget two static external IPs.

Q: How does GCLB handle WebSocket connections? A: The Application LB supports WebSocket on both HTTP/1.1 (Upgrade header) and HTTP/2 (Extended CONNECT). Long-lived WebSockets are subject to the backend service timeoutSec (default 30 s; raise it to 3600 for hour-long sessions, or to 86400 for day-long ones) and to TCP keepalive on the proxy.

Q: What is the difference between EXTERNAL and EXTERNAL_MANAGED load balancing schemes? A: EXTERNAL is the classic global GFE-based data plane. EXTERNAL_MANAGED is the new Envoy-based data plane that unlocks advanced routing (weighted backend services, fault injection, header rewrites, traffic mirroring). Internal Application LBs, including the cross-region variant, use the separate INTERNAL_MANAGED scheme. New deployments should default to EXTERNAL_MANAGED unless they need a feature only the classic scheme supports.

Q: Can I use Cloud Armor with Internal Application LB? A: Yes, since 2023 Cloud Armor supports Internal Application LB via security policies attached to the backend service (not edge policies). You get WAF, rate limiting, and Adaptive Protection on internal traffic — useful for east-west zero-trust enforcement.

Q: Does Passthrough Network LB support TLS termination? A: No. Passthrough delivers raw packets to the VM, so TLS is terminated on the VM itself. If you need TLS offload at the LB layer for a non-HTTP protocol, use the External Proxy Network LB instead.

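Q: How do I migrate a classic Global External HTTP(S) LB to the new EXTERNAL_MANAGED scheme? A: The conservative path is a parallel deployment. Create a new LB with --load-balancing-scheme=EXTERNAL_MANAGED, create matching backend services under that scheme that point at the same instance groups or NEGs (a backend service carries its own load balancing scheme, so the classic ones cannot be attached directly), shift a percentage of DNS traffic to the new IP via weighted DNS, validate, then cut over. Google has also introduced a staged migration workflow for moving classic resources to EXTERNAL_MANAGED, so check the current documentation before assuming a parallel build is required.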
