Introduction
Cloud VPN securely connects your peer network (on-premises data center, branch office, or another cloud provider) to your Google Cloud VPC network through an IPsec VPN connection. Traffic is encrypted by one VPN gateway and then decrypted by the other VPN gateway, protecting your data as it traverses the public internet. Cloud VPN is the cheaper, faster-to-provision alternative to Cloud Interconnect, designed for workloads that need site-to-site encryption but can tolerate per-tunnel bandwidth caps around 3 Gbps and the variable latency of the public internet path. For the Professional Cloud Network Engineer (PCNE) exam, you must understand the architectural differences between HA VPN and Classic VPN, the cryptographic parameters Google supports (AES-256-GCM, SHA-256, IKEv2), and how BGP via Cloud Router enables dynamic failover.
Google offers two flavors of Cloud VPN: HA VPN (the modern, recommended product with a 99.99% gateway SLA) and Classic VPN (legacy, single-interface, 99.9% SLA, scheduled for deprecation in many configurations). The PCNE blueprint heavily favors HA VPN in scenario questions because it is the only option that meets enterprise SLA requirements and is the only VPN compatible with Network Connectivity Center spoke topologies.
Cloud VPN is a managed IPsec VPN service that establishes encrypted tunnels between a Google-managed VPN gateway (the vpnGateway resource for HA VPN) and a peer VPN device, using IKEv2 or IKEv1 over UDP 500 / UDP 4500 (NAT-T). Each tunnel carries an ESP payload encrypted with ciphers like AES-256-GCM; integrity comes either from a separate HMAC such as SHA-256 or from the AEAD cipher itself.
HA VPN: The 99.99% SLA Architecture
HA VPN is a single regional resource in Google Cloud that exposes two external IPv4 interfaces (interface 0 and interface 1), each on a separate Google-managed cluster in different availability zones within the region. To qualify for the 99.99% SLA, you must build two tunnels from your HA VPN gateway and terminate them on either (a) two peer VPN devices, or (b) one peer device with two interfaces (active/active or active/standby), with BGP sessions established on both tunnels.
Gateway Resource Model
The gcloud compute vpn-gateways create command provisions an HA VPN gateway and immediately allocates the two external IPs. You cannot pick the IPs yourself — Google assigns them from its address pool. Each interface has a stable IP for the lifetime of the gateway, which matters because your on-premises firewall must allow IKE and ESP from those exact addresses.
Two-Tunnel Topology
A supported HA VPN topology runs tunnel 1 from interface 0 to peer device A and tunnel 2 from interface 1 to peer device B. If you instead build only one tunnel from one interface, you fall back to the 99.9% SLA (single-tunnel). Building four tunnels (two from each interface) is valid for cross-cloud topologies but doesn't increase the SLA; it only adds bandwidth via ECMP.
BGP Requirement
HA VPN only supports dynamic routing via BGP; static routes are rejected at tunnel-create time. Each tunnel runs its own BGP session on a /30 link-local subnet (e.g., 169.254.0.0/30). The Cloud Router on the GCP side advertises VPC subnets, and the peer router advertises on-prem prefixes. BGP MED and AS-path prepending let you steer traffic between the two tunnels for active/active or active/passive behavior.
HA VPN's 99.99% SLA only applies when both tunnels are up and BGP is established on both. If you misconfigure one tunnel and only one BGP session is up, you are unknowingly operating at 99.9% SLA. Use Cloud Monitoring's vpn.googleapis.com/tunnel_established and router.googleapis.com/bgp/session_up metrics to alert on either tunnel/session dropping.
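The SLA conditions above can be sketched as a small check. This is a minimal illustration, not a Google API: the Tunnel class and effective_sla function are hypothetical names, and the two boolean fields stand in for the tunnel_established and bgp/session_up metrics.

```python
from dataclasses import dataclass


@dataclass
class Tunnel:
    """One HA VPN tunnel as seen from the Google side (hypothetical model)."""
    gateway_interface: int    # 0 or 1
    tunnel_established: bool  # mirrors vpn.googleapis.com/tunnel_established
    bgp_session_up: bool      # mirrors router.googleapis.com/bgp/session_up


def effective_sla(tunnels: list[Tunnel]) -> str:
    """Return the SLA tier the text describes for the given tunnel states."""
    # Gateway interfaces that have a healthy tunnel with BGP established.
    healthy_interfaces = {
        t.gateway_interface
        for t in tunnels
        if t.tunnel_established and t.bgp_session_up
    }
    if healthy_interfaces >= {0, 1}:
        return "99.99%"
    if healthy_interfaces:
        return "99.9%"
    return "down"
```

With both tunnels and both BGP sessions up, the function returns "99.99%". If one BGP session is down it returns "99.9%", which is the silent degradation the paragraph warns about.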
Classic VPN: Legacy Behavior and Limitations
Classic VPN (resource type targetVpnGateway) is the original Cloud VPN product. It exposes a single external IP on a single gateway, and supports either static (policy-based or route-based) routing or BGP-based dynamic routing. The maximum SLA is 99.9%, and Google has announced that creating new Classic VPN gateways for IPv4 BGP-based topologies has been restricted in most regions in favor of HA VPN. Existing Classic VPN gateways continue to operate.
Policy-Based vs Route-Based
Policy-based VPN matches traffic against IKE traffic selectors (left/right subnets). The peer's selectors must exactly mirror the GCP-side selectors; any mismatch causes Phase 2 to fail with TS_UNACCEPTABLE. Policy-based is brittle when on-prem subnets change.
Route-based VPN (Classic with static routes) negotiates 0.0.0.0/0 as the selector and uses VPC routes to direct traffic into the tunnel. This is more flexible but still static — no automatic failover if the peer device dies.
When Classic VPN Is Still Acceptable
- Lab or proof-of-concept environments where 99.9% is sufficient
- Peer devices that genuinely do not support BGP (rare in 2026)
- Migrating existing tunnels in place — no need to rebuild for the exam unless the scenario explicitly demands 99.99% SLA
Exam scenarios often describe a workload requiring "four nines SLA" and present Classic VPN as a candidate answer. Classic VPN cannot achieve 99.99% under any configuration — only HA VPN with two tunnels and BGP qualifies. Reject any answer that pairs Classic VPN with a 99.99% requirement.
IPsec IKEv2 and Supported Ciphers
Cloud VPN supports IKEv2 (RFC 7296) and IKEv1 (legacy). Google strongly recommends IKEv2 for new deployments because it supports modern AEAD ciphers, MOBIKE for endpoint mobility, and is more efficient at re-keying.
Phase 1 (IKE SA) Ciphers
Cloud VPN proposes a curated set of Phase 1 algorithms during IKE negotiation. Common combinations include:
- Encryption: AES-CBC-256, AES-GCM-256, AES-GCM-128
- Integrity (PRF): HMAC-SHA2-256, HMAC-SHA2-384, HMAC-SHA2-512
- DH Group: Group 14 (2048-bit MODP), Group 19/20 (ECP), Group 24
Phase 2 (Child SA / ESP) Ciphers
Phase 2 protects the actual data flowing in the tunnel:
- AES-256-GCM (AEAD — preferred; combines encryption + authentication)
- AES-256-CBC with HMAC-SHA2-256
- AES-128-GCM for lower-CPU peer devices
Pre-Shared Key (PSK)
Cloud VPN uses PSK authentication — there is no certificate-based IKE support. The shared secret is stored in the tunnel resource and should be at least 32 random characters. Rotate PSKs periodically by creating a new tunnel with a fresh secret and tearing down the old one (you cannot mutate a tunnel's PSK in place).
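One way to produce a secret that meets the 32-random-character guidance is Python's standard secrets module. This is a sketch only; generate_psk is a hypothetical helper, and the alphanumeric alphabet is an assumption chosen to avoid quoting problems on peer devices.

```python
import secrets
import string


def generate_psk(length: int = 32) -> str:
    """Generate a random alphanumeric pre-shared key of the given length."""
    if length < 32:
        raise ValueError("use at least 32 characters for a Cloud VPN PSK")
    # Letters and digits only (assumption): avoids shell and CLI quoting issues.
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from the OS cryptographic random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Calling generate_psk() yields a different 32-character string each time; pass a larger length if your peer device allows it.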
Cloud VPN AEAD recommendation: IKEv2 + AES-256-GCM (Phase 2) + SHA-256 PRF (Phase 1) + DH Group 14 or 19. UDP ports: 500 (IKE) and 4500 (NAT-T / IPsec encapsulation). ESP IP protocol number is 50 but Cloud VPN always wraps ESP in UDP 4500.
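To see why a cipher mismatch prevents Phase 1 from completing, the negotiation can be modeled as picking the first offered proposal that the other side also supports. This is a deliberately simplified sketch: real IKEv2 negotiates per transform type inside each proposal, and the Proposal class, select_proposal function, and algorithm label strings are all illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Proposal(NamedTuple):
    """A simplified IKE proposal: one algorithm per transform type."""
    encryption: str
    prf: str
    dh_group: int


def select_proposal(
    offered: list[Proposal], supported: list[Proposal]
) -> Optional[Proposal]:
    """Return the first offered proposal the responder supports, else None."""
    supported_set = set(supported)
    for proposal in offered:
        if proposal in supported_set:
            return proposal
    # No overlap: the negotiation fails and the tunnel never establishes.
    return None
```

For example, a peer offering only AES-128-CBC with DH Group 5 against a responder limited to the recommended AES-256-GCM combinations gets None back, which corresponds to a tunnel that never leaves the handshake stage.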
BGP Dynamic Routing via Cloud Router
Cloud Router is the BGP speaker on the GCP side of every HA VPN deployment. It is a managed, regional service that runs BGP without you provisioning any VMs.
ASN Assignment
You configure Cloud Router with a private ASN in the range 64512–65534 (16-bit private) or 4200000000–4294967294 (32-bit private). Public ASNs are also accepted if you own one. The peer's ASN is whatever your on-prem router uses; the two ASNs must be different (eBGP) for HA VPN. Do not confuse this with ASN 16550, which Google reserves for Partner Interconnect and which does not apply to Cloud VPN.
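The ASN rules above are easy to encode as a pre-flight check. A minimal sketch follows; is_private_asn and validate_ebgp_pair are hypothetical helper names, and the ranges are the ones stated in the text.

```python
# Private ASN ranges as given in the text (inclusive bounds).
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in one of the private ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def validate_ebgp_pair(cloud_router_asn: int, peer_asn: int) -> None:
    """Raise ValueError if the two ASNs cannot form an eBGP session."""
    if cloud_router_asn == peer_asn:
        raise ValueError("Cloud Router and peer ASNs must differ (eBGP)")
```

For instance, is_private_asn(65001) is True, while is_private_asn(16550) is False because 16550 is a public ASN.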
BGP Session Establishment
Each tunnel gets a /30 IPv4 link-local range. Cloud Router takes .1, peer takes .2. Once the tunnel is up, Cloud Router opens TCP 179 to the peer and exchanges OPEN messages. A successful session shows BGP state = Established in gcloud compute routers get-status.
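The /30 addressing can be checked with Python's standard ipaddress module. This sketch assumes the convention described above (Cloud Router on the first usable address, peer on the second); bgp_link_addresses is a hypothetical helper name.

```python
import ipaddress


def bgp_link_addresses(cidr: str) -> tuple[str, str]:
    """Return (cloud_router_ip, peer_ip) for a /30 link-local BGP range."""
    network = ipaddress.ip_network(cidr)
    if network.prefixlen != 30:
        raise ValueError("BGP link range must be a /30")
    if not network.is_link_local:
        raise ValueError("BGP link range must be inside 169.254.0.0/16")
    # A /30 has exactly two usable host addresses: .1 and .2.
    cloud_router_ip, peer_ip = network.hosts()
    return str(cloud_router_ip), str(peer_ip)
```

For 169.254.0.0/30 this returns ("169.254.0.1", "169.254.0.2"). A second tunnel would use a different /30, such as 169.254.0.4/30.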
Route Advertisement Modes
- Default mode: Cloud Router advertises all VPC subnets in the region (and globally if dynamic routing mode is GLOBAL).
- Custom mode: You explicitly list prefixes with --set-advertisement-ranges to control which ones leak to the peer.
Active/Passive vs Active/Active
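Default BGP behavior with equal-cost routes on both tunnels yields active/active ECMP: flows hash across both tunnels for ~2x bandwidth. To force active/passive, make the standby tunnel less preferred in both directions. On the Cloud Router side, set --advertised-route-priority (which Cloud Router sends as MED) to, for example, 100 on the primary BGP session and 200 on the backup. On the peer side, prepend the AS path or raise the MED on routes advertised over the standby tunnel.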
For maximum throughput across HA VPN, use active/active routing and ensure both peer devices accept asymmetric return paths. Stateful firewalls on the peer side often break ECMP because the return flow may arrive on a different tunnel than the egress. Pin flows with policy-based routing or accept the asymmetry on stateless ACLs.
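The difference between active/active and active/passive can be sketched as a two-step decision: keep only the routes with the lowest MED, then hash each flow across whatever remains. This is an illustration under stated assumptions: the Route class and both functions are hypothetical, and SHA-256 over the 5-tuple merely stands in for the undocumented hash a real router uses.

```python
import hashlib
from typing import NamedTuple


class Route(NamedTuple):
    """A route to the same prefix learned over one tunnel."""
    tunnel: str
    med: int  # lower MED is preferred


def candidate_tunnels(routes: list[Route]) -> list[str]:
    """Return the tunnels whose routes share the lowest MED."""
    best_med = min(route.med for route in routes)
    return sorted(route.tunnel for route in routes if route.med == best_med)


def pick_tunnel(routes: list[Route], flow: tuple) -> str:
    """Pick a tunnel for a flow by hashing its 5-tuple across the candidates."""
    candidates = candidate_tunnels(routes)
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(candidates)
    return candidates[index]
```

With MED 100 on both tunnels, both are candidates and flows spread across them (active/active). With MED 100 and 200, only the primary is a candidate, so every flow uses it until its route is withdrawn (active/passive). Because the hash is per flow, the return direction may choose a different tunnel, which is the asymmetry stateful firewalls object to.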
Traffic Selectors and Subnet Negotiation
Traffic selectors define which source/destination prefixes are allowed to enter the tunnel. They are negotiated in IKE Phase 2.
HA VPN Traffic Selectors
HA VPN tunnels always use 0.0.0.0/0 ↔ 0.0.0.0/0 as the negotiated selector (effectively "route-based"). The actual routing decisions happen at the VPC route table and BGP layer — not at IKE. This dramatically simplifies negotiation and means you never see Phase 2 mismatches on HA VPN.
Classic VPN Policy-Based Selectors
Classic policy-based VPN allows you to specify left/right CIDRs (--local-traffic-selector, --remote-traffic-selector). The peer must propose the exact mirror; a single-octet difference causes Phase 2 to fail with TS_UNACCEPTABLE and the tunnel won't establish.
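The "exact mirror" requirement can be checked before touching either device. The sketch below uses the standard ipaddress module; selectors_mirror is a hypothetical helper, and it compares selector sets only, without modeling the narrowing behavior some IKEv2 implementations allow.

```python
import ipaddress


def selectors_mirror(
    gcp_local: list[str],
    gcp_remote: list[str],
    peer_local: list[str],
    peer_remote: list[str],
) -> bool:
    """Return True if the peer's selectors exactly mirror the GCP side."""

    def normalize(cidrs: list[str]) -> set:
        # Parse into network objects so ordering and formatting do not matter.
        return {ipaddress.ip_network(cidr) for cidr in cidrs}

    return (
        normalize(gcp_local) == normalize(peer_remote)
        and normalize(gcp_remote) == normalize(peer_local)
    )
```

If the GCP side uses local 10.1.0.0/16 and remote 192.168.0.0/24, the peer must use local 192.168.0.0/24 and remote 10.1.0.0/16. A peer configured with 192.168.1.0/24 fails the check.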
Selector Best Practice
For any new build, prefer HA VPN with BGP. If you must use Classic policy-based VPN (e.g., legacy firewall vendor with no BGP), keep selectors as broad as possible (10.0.0.0/8 ↔ 10.0.0.0/8) to avoid re-negotiating every time a subnet is added.
MTU, Fragmentation, and Performance Tuning
IPsec adds ESP header + trailer + IV + ICV overhead, which reduces the effective payload MTU. Misconfigured MTU causes silent packet drops, fragmentation black holes, and TCP throughput collapses.
Cloud VPN MTU Defaults
- Cloud VPN tunnel MTU: 1460 bytes (IPv4)
- Cloud VPN tunnel MTU (IPv6 underlay): 1280 bytes minimum
- VPC default MTU: 1460 bytes (or 1500/8896 if you set jumbo)
TCP MSS Clamping
On the VPC side, GCP automatically clamps TCP MSS to MTU − 40 (1420 for default MTU). On the peer side, configure ip tcp adjust-mss 1380 (or similar) on the tunnel interface so TCP three-way handshakes negotiate an MSS that fits inside the IPsec payload after ESP overhead.
Don't Fragment (DF) Bit Handling
Cloud VPN respects the DF bit. If a 1500-byte packet with DF set hits the tunnel, Cloud VPN sends an ICMP "fragmentation needed" (type 3 code 4) back to the source and drops the packet. Some on-prem firewalls block ICMP, creating a PMTUD black hole — TCP connections hang at certain payload sizes. Mitigation: clamp MSS aggressively and ensure ICMP type 3 is allowed end-to-end.
The most common HA VPN performance complaint is "TCP works for small requests but stalls on large transfers." The root cause is usually an MTU/MSS mismatch combined with ICMP being filtered. Set the MTU to 1460 explicitly on the peer device's tunnel interface, clamp peer-side MSS to 1380, and verify with ping -M do -s 1432 from a VM (1432 bytes of payload + 28 bytes of IPv4 and ICMP headers = 1460).
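The numbers in this section all follow from fixed header sizes. The sketch below reproduces the arithmetic; the function names are hypothetical, and the header sizes assume IPv4 and TCP without options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def ping_payload(mtu: int) -> int:
    """Value for ping -s that produces a packet of exactly the given MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def headroom(mtu: int, clamped_mss: int) -> int:
    """Bytes left under the MTU when TCP segments carry clamped_mss bytes."""
    return mtu - (clamped_mss + IPV4_HEADER_BYTES + TCP_HEADER_BYTES)
```

For the default MTU of 1460, tcp_mss gives 1420 and ping_payload gives 1432, matching the figures in the text. Clamping the peer side to 1380 leaves headroom(1460, 1380) = 40 bytes of slack for encapsulation overhead.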
VPN Tunnel Monitoring and Operations
Cloud VPN exports first-class Cloud Monitoring metrics for both the gateway and the tunnel.
Key Metrics
- vpn.googleapis.com/tunnel_established: gauge, 1 = up, 0 = down
- vpn.googleapis.com/network/sent_bytes_count and received_bytes_count: per-tunnel throughput
- vpn.googleapis.com/network/dropped_sent_packets_count: drops due to MTU/encryption errors
- router.googleapis.com/bgp/session_up: BGP session state per peer
Logs
Cloud VPN does not emit per-packet logs (that's the job of VPC Flow Logs), but the gateway resource writes IKE negotiation events to Cloud Logging under resource.type="vpn_gateway". Filter on jsonPayload.event_type for failures like IKE_AUTH_FAILED, DH_GROUP_MISMATCH, or LIFETIME_MISMATCH.
Alerting Patterns
A production alert policy should fire if either tunnel drops for more than 60 seconds, or either BGP session goes down. Don't alert only on "both tunnels down" — by then your SLA is already broken.
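The alerting rule can be expressed as a small evaluation over metric samples. This is a sketch of the logic only, not a Cloud Monitoring policy definition: should_alert and any_tunnel_alerting are hypothetical names, and the samples are assumed to be (timestamp in seconds, established flag) pairs sorted by time.

```python
def should_alert(samples: list[tuple[int, bool]], threshold_s: int = 60) -> bool:
    """Return True if the tunnel stayed down for longer than threshold_s."""
    down_since = None
    for timestamp, established in samples:
        if established:
            down_since = None  # tunnel recovered, reset the outage window
            continue
        if down_since is None:
            down_since = timestamp  # first sample of a new outage
        if timestamp - down_since > threshold_s:
            return True
    return False


def any_tunnel_alerting(per_tunnel: dict[str, list[tuple[int, bool]]]) -> bool:
    """Fire if ANY single tunnel breaches the threshold, not only all of them."""
    return any(should_alert(samples) for samples in per_tunnel.values())
```

The key design choice is any() rather than all(): one tunnel down for more than a minute already means the deployment is no longer meeting the 99.99% conditions, even though traffic still flows over the other tunnel.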
gcloud Operational Commands
gcloud compute vpn-tunnels list --region=us-central1
gcloud compute vpn-tunnels describe my-tunnel --region=us-central1
gcloud compute routers get-status my-router --region=us-central1
The get-status output includes per-peer BGP state, advertised/received route counts, and uptime.
Cross-Cloud and Multi-Region Topologies
HA VPN's two-interface design makes it the natural choice for cross-cloud and multi-region hybrid topologies.
GCP-to-AWS
Pair HA VPN with AWS Virtual Private Gateway (VGW) or Transit Gateway (TGW). AWS VGW exposes two public IPs per VPN connection, which maps cleanly onto HA VPN's two interfaces. Build interface 0 → AWS tunnel-A, interface 1 → AWS tunnel-B, and run BGP on both. The ASNs on the two sides must differ: for example, 65001 on Cloud Router, while an AWS VGW defaults to 64512.
GCP-to-Azure
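Azure VPN Gateway in active-active mode also exposes two public IPs. Same pattern: cross-connect interfaces. Note that the BGP timer defaults differ: Cloud Router uses a 20s keepalive (60s hold), while Azure VPN Gateway uses a 60s keepalive (180s hold). BGP negotiates the lower hold time, so confirm the negotiated values on both sides if you see sessions flapping.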
GCP-to-GCP (Inter-Region)
Two HA VPN gateways in different regions can be peered to create a region-to-region encrypted backbone, useful when VPC Network Peering's lack of transitivity is a problem. Each gateway sees the other's two IPs and builds four tunnels (full mesh).
When connecting HA VPN to AWS VGW, AWS terminates each of the VPN connection's two tunnels on a separate, redundant endpoint. HA VPN-to-AWS therefore gives you redundancy on both sides only if you wire interface 0 to AWS tunnel-A and interface 1 to AWS tunnel-B (not both to the same AWS tunnel).
Bandwidth, Quotas, and Scaling
Each VPN tunnel is rate-limited by Google.
Per-Tunnel Throughput
- Maximum per tunnel: ~3 Gbps aggregate (ingress + egress combined), variable based on packet size and cipher choice
- AES-GCM is faster than AES-CBC + HMAC-SHA because it's AEAD and uses CPU AES-NI more efficiently
- Small packets (64-byte) cap throughput at hundreds of Mbps due to per-packet overhead
Scaling Beyond 3 Gbps
Use ECMP with multiple tunnels — build 4 tunnels from a single HA VPN gateway (2 per interface) and BGP-balance traffic for ~12 Gbps aggregate. Beyond that, switch to Cloud Interconnect (Dedicated 10/100 Gbps or Partner Interconnect 50 Mbps–50 Gbps), which is not encrypted in transit at L2 but lives on private fiber.
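The scaling arithmetic is simple enough to sketch. The 3 Gbps per-tunnel figure is the approximate ceiling from the text, not a guarantee, and the function names are hypothetical.

```python
import math

PER_TUNNEL_GBPS = 3.0  # approximate per-tunnel ceiling from the text


def tunnels_needed(target_gbps: float) -> int:
    """Minimum number of ECMP tunnels to reach the target aggregate rate."""
    return math.ceil(target_gbps / PER_TUNNEL_GBPS)


def aggregate_gbps(tunnel_count: int) -> float:
    """Best-case aggregate throughput across equally loaded tunnels."""
    return tunnel_count * PER_TUNNEL_GBPS
```

Four tunnels give aggregate_gbps(4) = 12.0, the figure quoted above. A 10 Gbps target also needs four tunnels, because three would top out at about 9 Gbps. Real throughput depends on packet size and on how evenly flows hash across tunnels.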
Quotas
- VPN gateways per region: 15 (soft, raisable)
- Tunnels per gateway: 8
- Cloud Router BGP peers per router: 8
- Routes advertised per BGP session: 100 (custom advertisement)
Security Hardening
Beyond the cipher choice, several knobs harden Cloud VPN further.
IKE Lifetime
Cloud VPN defaults to 36000 seconds (10 hours) for IKE SA and 10800 seconds (3 hours) for Child SA. Shorter lifetimes mean more frequent re-keying, which costs more CPU but limits how much traffic any single key protects. Match the peer's setting exactly or you'll see periodic 1-second blips.
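To put the default lifetimes in perspective, the sketch below converts them into re-keys per day. The function names are hypothetical, and effective_rekey_interval rests on the assumption that whichever side has the shorter lifetime initiates the re-key first.

```python
SECONDS_PER_DAY = 24 * 60 * 60

IKE_SA_LIFETIME_S = 36000    # Cloud VPN default from the text (10 hours)
CHILD_SA_LIFETIME_S = 10800  # Cloud VPN default from the text (3 hours)


def rekeys_per_day(lifetime_s: int) -> float:
    """Average number of re-key events per day for a given SA lifetime."""
    return SECONDS_PER_DAY / lifetime_s


def effective_rekey_interval(local_lifetime_s: int, peer_lifetime_s: int) -> int:
    """Assumed behavior: the side with the shorter lifetime re-keys first."""
    return min(local_lifetime_s, peer_lifetime_s)
```

With the defaults, the IKE SA re-keys about 2.4 times per day and each Child SA 8 times per day. A peer configured with a 3600-second Child SA lifetime would, under the stated assumption, drive re-keys every hour regardless of the Cloud VPN default.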
Perfect Forward Secrecy (PFS)
PFS is enabled by default on Cloud VPN (negotiated via DH group in Phase 2). Always confirm the peer also has PFS on. Without PFS, every Child SA key is derived from the same IKE SA keying material, so compromising that material exposes all traffic protected under that IKE SA.
Dead Peer Detection (DPD)
Cloud VPN sends DPD probes every 10 seconds and tears down the tunnel after 3 missed responses (~30 seconds). This is what triggers BGP failover. The peer device should mirror this setting.
Firewall Rules
On GCP, you don't need ingress firewall rules on the VPN gateway itself (it's a Google-managed resource). But VMs receiving traffic from the VPN tunnel see source IPs from the on-prem range — you need VPC firewall rules permitting that traffic. Use network tags or service accounts to scope rules tightly.
Plain English Explanation
Analogy 1: The Diplomatic Mail Pouch
Imagine two embassies on opposite sides of a hostile city. They need to exchange documents but the city's postal system reads every letter. So each embassy gets a diplomatic pouch (the IPsec tunnel) with a tamper-evident seal (HMAC-SHA256) and a lock (AES-256-GCM) that only the receiving embassy can open. HA VPN is like having two pouches carried by two different couriers on two different routes — if one courier gets stuck in traffic (a tunnel drops), the other still gets through, guaranteeing a 99.99% on-time delivery rate.
Analogy 2: The Two-Phone Hotline
IKE Phase 1 is like two diplomats picking up encrypted phones and verifying each other's identity with a shared password (PSK). Once they trust each other, they negotiate a second, faster line for actual conversation — that's Phase 2 (Child SA) carrying the AES-256-GCM-encrypted business talk. BGP via Cloud Router is like a third line where the two embassies' switchboards continuously gossip about which streets are open ("here's how to reach 10.0.0.0/16 today") — so if a route changes, both sides adapt automatically without humans rewriting the address book.
Analogy 3: The Container Ship and the Customs Sticker
A regular packet is like a 1500-pound shipping container. IPsec adds a heavy customs sticker (ESP header + IV + ICV) that weighs about 40 pounds. If you stuff a full 1500-pound container into a tunnel that only allows 1460-pound total cargo (the MTU), the dockworkers either refuse the container (drop) or saw it in half (fragment). The fix is MSS clamping — telling the sender "ship containers no bigger than 1420 pounds so the sticker fits." When peer firewalls block the "container too big" ICMP message back to the sender, you get a PMTUD black hole — the sender keeps trying 1500-pound containers and nothing flows.
FAQs
Q: Can I use HA VPN with a peer device that doesn't speak BGP?
A: No. HA VPN mandates BGP. If your peer device only supports static routes, you have two options: (1) deploy a small BGP-capable device (e.g., a Linux VM running FRR or BIRD) in front of it, or (2) fall back to Classic VPN with static routes, but you lose the 99.99% SLA and Network Connectivity Center spoke compatibility.
Q: How do I rotate the pre-shared key without an outage?
A: PSKs are immutable on a tunnel resource. To rotate, create a second tunnel on a different /30 BGP subnet with the new PSK, bring up BGP, verify routes are exchanged, then drain traffic off the old tunnel by AS-path prepending and finally delete it. Keep both up for at least 5 minutes to avoid flow disruption.
Q: Why is my HA VPN tunnel showing "First Handshake" forever?
A: This usually means Phase 1 cipher mismatch. Run gcloud compute vpn-tunnels describe and check detailedStatus. The most common culprits are: peer proposing IKEv1 while GCP expects IKEv2, DH group mismatch (peer using Group 5 when GCP requires Group 14+), or a typo in the PSK. Inspect peer device IKE logs side-by-side with Cloud Logging entries.
Q: Can Cloud VPN traffic skip the public internet?
A: Cloud VPN gateway IPs are public, but the encrypted ESP packets can be sent over Cloud Interconnect via the HA VPN over Interconnect topology. This gives you encrypted, private-fiber transit, which is useful for compliance regimes that mandate encryption-in-transit even on private circuits.
Q: Does Cloud VPN support IPv6?
A: Yes. HA VPN supports dual-stack tunnels — IPv4 outer with IPv6 inner, or IPv4-only inner. Pure IPv6 outer is not yet supported. Configure with --ip-version=IPV6 on the BGP session and ensure the peer supports IPv6 BGP address-family negotiation (RFC 4760 multiprotocol extensions).
Q: What's the difference between HA VPN over Interconnect and regular HA VPN?
A: Regular HA VPN sends encrypted traffic over the public internet. HA VPN over Interconnect uses your Dedicated or Partner Interconnect circuits as the underlay, so packets travel encrypted on private fiber. You still get IPsec encryption (which Interconnect alone doesn't provide), plus the private path's predictable latency. It's the highest-tier hybrid connectivity option in GCP.
Q: Does the 99.99% SLA cover the peer side?
A: No. Google's SLA covers only the Google-managed HA VPN gateway: its two interfaces, the IPsec termination, and the BGP speaker on Cloud Router. Your peer device's availability, your on-premises power, and the internet path between your peer and Google are not covered. To get end-to-end 99.99%, you must engineer redundancy on your side too (two peer routers, dual ISPs, etc.).
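The end-to-end point can be made concrete with basic availability arithmetic. This sketch assumes independent failures, and the peer availability figures in the usage note are illustrative, not measured values; the function names are hypothetical.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def serial(*availabilities: float) -> float:
    """Availability of components that must ALL be up (a chain)."""
    return math.prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Availability of redundant components where ANY one suffices."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime implied by an availability figure."""
    return (1.0 - availability) * MINUTES_PER_YEAR
```

A 99.99% gateway in series with a single peer router assumed to be 99.9% available yields serial(0.9999, 0.999), roughly 99.89% end to end. Replacing that router with a redundant pair, serial(0.9999, parallel(0.999, 0.999)), brings the result back to roughly 99.99%, which is why the answer above calls for redundancy on your side.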