examlab .net The most efficient path to the most valuable certifications.

Managing Google Kubernetes Engine (GKE)

3,850 words · ≈ 20 min read

Professional Cloud Architect deep dive into GKE: Autopilot vs Standard, Regional clusters, HPA/VPA, GKE networking, and Workload Identity.


Introduction to GKE Cluster Management

Google Kubernetes Engine (GKE) is Google Cloud's managed Kubernetes service. For a Professional Cloud Architect, GKE is often the preferred platform for modern, containerized microservices. The focus is on operational efficiency, security, and multi-zone reliability.

GKE removes much of the "undifferentiated heavy lifting" of managing a Kubernetes control plane, allowing you to focus on the workloads.

GKE Autopilot: A mode of operation in GKE in which Google manages the cluster infrastructure, including nodes and node pools. By default you are billed for the resources your pods request rather than for the underlying nodes, and Google handles scaling, security hardening, and maintenance. Reference: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview


Plain-Language Explanation: GKE Cluster Management

Managing Kubernetes can be complex, so let's use some analogies to simplify it.

Analogy 1 — The Apartment Building (Standard vs. Autopilot)

  • GKE Standard is like owning the whole apartment building. You are responsible for maintaining the hallways (Nodes), the roof, and the heating system. You have total control, but you also have to do all the work.
  • GKE Autopilot is like renting a fully serviced apartment. You just show up with your furniture (Pods). The landlord (Google) takes care of the building maintenance, security, and the elevators. You only pay for the square footage you actually use.

Analogy 2 — The Elastic Waistband (HPA & Cluster Autoscaler)

Imagine you are wearing a pair of pants with an elastic waistband.

  • Horizontal Pod Autoscaler (HPA) is the elastic stretching as you eat more (traffic increases). It adds more "room" (Pods) inside the existing waistband.
  • Cluster Autoscaler is like buying a larger pair of pants when the elastic can't stretch any further. It adds more physical material (Nodes) to make sure everything fits.

Analogy 3 — The ID Badge (Workload Identity)

In a high-security office, you don't give every employee a master key to the whole building (Service Account Key files). Instead, you give them an ID Badge (Workload Identity) that is programmed only to open the doors they need. If they lose the badge, it expires automatically. It's the safest way for your apps to talk to other Google services like BigQuery or Cloud Storage.

On the PCA exam, if the requirement is "minimize operational overhead" and "pay only for pod resources," the answer is GKE Autopilot. Reference: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview


GKE Standard vs. Autopilot Clusters

  • Standard: Best for highly custom configurations, specific kernel tuning, or fine-grained control over how specialized hardware such as TPUs is provisioned. You manage the node pools.
  • Autopilot: (Recommended for most) Best for reducing operational toil. Google manages node health, scaling, and security hardening.

Regional vs. Zonal Clusters

  • Zonal Clusters: The control plane and nodes are in a single zone. Low cost, but vulnerable to zone failures.
  • Regional Clusters: The control plane is replicated across three zones, and nodes are spread across those zones. This is the standard for production as it survives the loss of a whole zone.

For any PCA scenario that mentions "production GKE" or "must survive a zone outage with zero control-plane downtime," pick a Regional cluster, not a Zonal one. A Zonal cluster has a single-zone control plane, so if that zone fails you cannot run kubectl operations, scale node pools, or roll out new Deployments until the zone recovers — the existing pods keep serving, but the cluster is effectively frozen. Regional clusters replicate the control plane across three zones in the region.
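A practical consequence of the zonal/regional split is node count. The sketch below assumes the per-zone node count you request is applied to every zone of a regional cluster, which is how the gcloud `--num-nodes` flag is documented to behave. The helper name `total_nodes` and the default of three zones are illustrative assumptions, not part of any GKE API.

```python
def total_nodes(nodes_per_zone: int, regional: bool, zones: int = 3) -> int:
    """Estimate the total node count of a node pool.

    Assumption: a regional cluster spreads the pool across `zones` zones
    (three by default) and the per-zone count applies to each of them.
    A zonal cluster runs the pool in a single zone.
    """
    if nodes_per_zone < 0 or zones < 1:
        raise ValueError("nodes_per_zone must be >= 0 and zones must be >= 1")
    return nodes_per_zone * zones if regional else nodes_per_zone


print(total_nodes(2, regional=False))  # zonal cluster: 2 nodes
print(total_nodes(2, regional=True))   # regional cluster: 6 nodes
```

Keep this multiplier in mind when estimating cost: asking for two nodes in a regional cluster typically means paying for six.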


Node Pool Management and Auto-repair

  • Node Pools: A subset of nodes within a cluster that all have the same configuration (e.g., a "high-mem" pool and a "gpu" pool).
  • Auto-repair: GKE periodically checks node health. If a node stays unhealthy across consecutive checks, GKE drains it and recreates it.
  • Auto-upgrade: GKE keeps your nodes in step with the control plane's Kubernetes version and applies security patches, following the cluster's release channel and maintenance policy.

Cluster Autoscaler and HPA

  • Horizontal Pod Autoscaler (HPA): Scales the number of pod replicas based on CPU, memory, or custom and external metrics.
  • Vertical Pod Autoscaler (VPA): Recommends, and can automatically apply, CPU and memory requests for your pods based on historical usage (limits are scaled to keep their ratio to requests).
  • Cluster Autoscaler: Adds nodes to a node pool when pods cannot be scheduled for lack of resources, and removes nodes that are underutilized.
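To make the HPA behaviour concrete, here is a minimal sketch of the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0. The function name, its parameters, and the min/max defaults are illustrative assumptions. The real controller also handles missing metrics, pod readiness, and stabilization windows, which are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Simplified HPA rule: ceil(current_replicas * current / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Do not scale when usage is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured on the autoscaler.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, current_metric=90, target_metric=60))   # 6
print(desired_replicas(4, current_metric=62, target_metric=60))   # 4 (within tolerance)
print(desired_replicas(8, current_metric=200, target_metric=50))  # 10 (capped at max)
```

If the extra replicas do not fit on the existing nodes, they stay Pending, and that is the signal the Cluster Autoscaler reacts to.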

Configuring GKE Networking

  • VPC-Native Clusters (alias IP ranges): (Recommended) Pod IPs are part of the VPC network, making them directly routable and allowing for better integration with other GCP services.
  • Gateway API: The modern way to manage service networking in GKE, providing more expressive and role-oriented load balancing than the traditional Ingress.

GKE Ingress and Load Balancing

  • GKE Ingress Controller: Automatically creates a Google Cloud HTTP(S) Load Balancer when you create an Ingress resource.
  • Container-Native Load Balancing: Traffic goes directly from the Load Balancer to the Pod IP (using Network Endpoint Groups or NEGs), reducing latency and hop count.
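Container-native load balancing is requested on the Service through the `cloud.google.com/neg` annotation. The sketch below builds such a Service manifest as a plain Python dict. The helper name `neg_service` and the sample names and ports are hypothetical, and on recent VPC-native clusters GKE may apply this behaviour by default, so treat the explicit annotation as illustrative.

```python
import json


def neg_service(name: str, app_label: str, port: int, target_port: int) -> dict:
    """Build a Service manifest that asks GKE Ingress to use NEGs."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # The annotation value is itself a JSON string.
            "annotations": {"cloud.google.com/neg": json.dumps({"ingress": True})},
        },
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


manifest = neg_service("web", app_label="web", port=80, target_port=8080)
print(json.dumps(manifest, indent=2))
```

The printed JSON is a valid Kubernetes manifest, so it could be saved to a file and applied with kubectl.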

Workload Identity Configuration

Workload Identity is the recommended way for GKE applications to consume Google Cloud services.

  • How it works: It maps a Kubernetes Service Account to a Google Cloud Service Account.
  • Benefit: Eliminates the need to create, store, and rotate JSON service account keys. Long-lived keys are a major security risk.

When a PCA question asks how a GKE pod should call BigQuery, Cloud Storage, or Pub/Sub, the correct answer is Workload Identity (bind a Kubernetes Service Account to a Google Cloud Service Account via the roles/iam.workloadIdentityUser role), never "mount a JSON key as a Kubernetes Secret." Workload Identity issues short-lived, automatically rotated credentials per pod, removing the long-lived key material that would otherwise need manual rotation and audit.
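The binding has two halves: an IAM policy member on the Google Cloud Service Account, and an annotation on the Kubernetes Service Account. The sketch below only assembles the strings involved. The helper name `workload_identity_binding` and the sample project, namespace, and account names are hypothetical. The member format, role, and annotation key follow the GKE documentation.

```python
def workload_identity_binding(project_id: str, namespace: str,
                              ksa_name: str, gsa_name: str) -> dict:
    """Return the IAM member, role, and KSA annotation for one binding."""
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    return {
        # Role granted on the Google Cloud Service Account.
        "iam_role": "roles/iam.workloadIdentityUser",
        # Principal that represents the Kubernetes Service Account.
        "iam_member": f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]",
        # Annotation placed on the Kubernetes Service Account.
        "ksa_annotation": {"iam.gke.io/gcp-service-account": gsa_email},
    }


binding = workload_identity_binding("my-project", "prod", "reports-ksa", "reports-gsa")
for key, value in binding.items():
    print(key, "=", value)
```

Note that nothing in this flow involves a key file: the pod obtains short-lived tokens at runtime through the cluster's metadata server.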


Managing ConfigMaps and Secrets

  • ConfigMaps: Store non-sensitive configuration data (e.g., environment variables).
  • Secrets: Store sensitive data (e.g., passwords, API keys).
  • Architect Tip: For production, consider storing secrets in Secret Manager and mounting them into GKE pods with the Secrets Store CSI Driver for better security and auditing.
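One reason behind that tip: the `data` field of a Kubernetes Secret is base64-encoded, which is an encoding and not encryption. The short sketch below shows that anyone who can read the manifest can recover the value. The helper names and the sample password are illustrative.

```python
import base64


def encode_secret_data(values: dict) -> dict:
    """Encode plain strings the way a Secret's `data` field stores them."""
    return {k: base64.b64encode(v.encode("utf-8")).decode("ascii")
            for k, v in values.items()}


def decode_secret_data(data: dict) -> dict:
    """Reverse the encoding; no key or password is required."""
    return {k: base64.b64decode(v).decode("utf-8") for k, v in data.items()}


encoded = encode_secret_data({"db-password": "password"})
print(encoded)                      # {'db-password': 'cGFzc3dvcmQ='}
print(decode_secret_data(encoded))  # {'db-password': 'password'}
```

This is why access to Secrets should be tightly restricted with RBAC, and why an external store with its own IAM and audit logging is attractive for production.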

Deploying Applications with Helm and Kustomize

  • Helm: A package manager for Kubernetes (using "Charts"). Great for complex, multi-component apps.
  • Kustomize: A template-free way to customize Kubernetes manifests for different environments (Dev, Staging, Prod). Built directly into kubectl.

Upgrading Clusters and Node Pools

  • Surge Upgrades: Allows GKE to create extra nodes during an upgrade to ensure there is no drop in capacity.
  • Maintenance Windows: Define exactly when Google is allowed to perform automated upgrades to your cluster.
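Surge upgrades are tuned with two node pool settings, max surge and max unavailable, and the number of nodes upgraded at the same time is their sum. The sketch below is a rough planning aid under that assumption. The function name and the returned keys are hypothetical, and it ignores any platform-side caps on parallelism.

```python
import math


def surge_upgrade_plan(node_count: int, max_surge: int = 1,
                       max_unavailable: int = 0) -> dict:
    """Estimate parallelism and capacity during a surge upgrade."""
    parallel = max_surge + max_unavailable
    if parallel < 1:
        raise ValueError("max_surge + max_unavailable must be at least 1")
    return {
        "nodes_upgraded_in_parallel": parallel,
        "batches": math.ceil(node_count / parallel),
        # Capacity floor: only max_unavailable nodes may be down at once.
        "min_available_nodes": max(0, node_count - max_unavailable),
        # Cost ceiling: surge nodes exist temporarily on top of the pool.
        "peak_nodes": node_count + max_surge,
    }


print(surge_upgrade_plan(10, max_surge=2, max_unavailable=0))
# {'nodes_upgraded_in_parallel': 2, 'batches': 5, 'min_available_nodes': 10, 'peak_nodes': 12}
```

A higher max surge shortens the upgrade at the cost of temporary extra nodes; a non-zero max unavailable is faster and cheaper but reduces capacity while the upgrade runs.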

FAQ — GKE Cluster Management

Q1. Can I switch from GKE Standard to Autopilot?

No. The choice between Standard and Autopilot is made at cluster creation time and cannot be changed. You must create a new cluster and migrate your workloads.

It is tempting to assume you can "upgrade" a GKE Standard cluster into Autopilot once the team wants less node-pool maintenance, but the mode is fixed at cluster creation. The only path is to provision a new Autopilot cluster and migrate workloads — typically by re-applying manifests and cutting traffic over via a regional HTTP(S) Load Balancer or a multi-cluster Gateway. On the exam, any answer that promises an in-place conversion between Standard and Autopilot is wrong.

Three GKE autoscalers, three different jobs: HPA changes the number of pod replicas based on CPU or custom metrics; VPA changes each pod's CPU/memory requests based on historical usage; Cluster Autoscaler adds nodes to a node pool when pods are unschedulable and removes nodes that are underutilized. Do not enable HPA and VPA on the same metric (e.g. both on CPU): they will fight each other and produce thrashing.

Q2. What is a "VPC-Native" cluster?

It is a cluster where pod IP addresses are allocated from a secondary IP range in your VPC subnet (alias IP ranges). Pod IPs are natively routable inside the VPC without custom static routes, and you can use VPC features such as VPC Flow Logs and Shared VPC with your pods.
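Because pod IPs come from a fixed secondary range, the size of that range caps how many nodes the cluster can hold. The sketch below assumes the GKE Standard default of one /24 pod block per node (which corresponds to the default maximum of 110 pods per node). The function name is hypothetical, and clusters with a different max-pods setting use a different per-node prefix.

```python
import ipaddress


def max_nodes_for_pod_range(pod_cidr: str, per_node_prefix: int = 24) -> int:
    """Count how many per-node blocks fit into the pod secondary range."""
    network = ipaddress.ip_network(pod_cidr)
    if network.prefixlen > per_node_prefix:
        return 0  # the range is smaller than a single node's block
    return 2 ** (per_node_prefix - network.prefixlen)


print(max_nodes_for_pod_range("10.4.0.0/14"))  # 1024
print(max_nodes_for_pod_range("10.0.0.0/20"))  # 16
```

Sizing the pod range too small is hard to fix later, so it is worth running this arithmetic before the subnet is created.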

Q3. How do I handle stateful applications in GKE?

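Use StatefulSets and PersistentVolumeClaims (PVCs). GKE will automatically provision Persistent Disks and re-attach them to the correct pods, even if they move between nodes. A zonal Persistent Disk can only follow a pod within its zone; use regional Persistent Disks if the pod must be able to restart in another zone.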
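The reason disks find their way back to "the correct pods" is that StatefulSet identities are stable: pods are named with an ordinal suffix, and each PVC is named after its claim template and pod. The sketch below only reproduces that naming convention. The function name and the sample names are illustrative.

```python
def statefulset_identities(name: str, replicas: int, claim_template: str) -> list:
    """Return (pod name, PVC name) pairs for a StatefulSet."""
    return [
        (f"{name}-{ordinal}", f"{claim_template}-{name}-{ordinal}")
        for ordinal in range(replicas)
    ]


for pod, pvc in statefulset_identities("db", replicas=3, claim_template="data"):
    print(pod, "->", pvc)
# db-0 -> data-db-0
# db-1 -> data-db-1
# db-2 -> data-db-2
```

When pod db-1 is rescheduled, it comes back with the same name and binds to the same claim, so it sees the same data.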

Q4. What is the difference between HPA and VPA?

HPA adds more pods (horizontal scaling). VPA makes existing pods "bigger" by giving them more CPU/RAM (vertical scaling). In most cases, you should not use both on the same metric (like CPU).
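The "not on the same metric" rule can be checked mechanically. The sketch below is a hypothetical lint-style helper, not part of any Kubernetes tooling: it compares the resource metrics an HPA scales on with the resources a VPA controls, and treats a VPA in "Off" mode (recommendation only) as harmless.

```python
def hpa_vpa_conflicts(hpa_resource_metrics: set, vpa_controlled_resources: set,
                      vpa_update_mode: str = "Auto") -> list:
    """Return the resources that both autoscalers would act on."""
    if vpa_update_mode == "Off":
        # VPA only publishes recommendations, so it cannot fight the HPA.
        return []
    return sorted(hpa_resource_metrics & vpa_controlled_resources)


print(hpa_vpa_conflicts({"cpu"}, {"cpu", "memory"}))         # ['cpu']
print(hpa_vpa_conflicts({"cpu"}, {"memory"}))                # []
print(hpa_vpa_conflicts({"cpu"}, {"cpu", "memory"}, "Off"))  # []
```

A common way to combine the two is to let HPA scale on CPU or a custom metric while VPA manages memory only.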

Q5. Why is Regional GKE better for production?

Because the Kubernetes Control Plane (Master) is replicated across three zones. In a Zonal cluster, if the single zone hosting the master goes down, you cannot change anything in your cluster until it recovers.


Final Architect Tip

For the PCA exam, always remember the shared responsibility model. Google manages the control plane and (in Autopilot) the nodes, but you are responsible for the security of your containers and your application code. Use Binary Authorization to ensure that only images meeting your policy (for example, images attested after passing vulnerability scanning) are deployed to your GKE clusters.
