Introduction to Data Sovereignty and Compliance Design
Data Sovereignty and Compliance Design is the discipline of architecting GCP workloads so that data location, encryption, access paths, and audit trails satisfy the laws of the jurisdiction the data belongs to. On the PDE exam, this shows up as a constraint baked into almost every scenario: customer records must stay in Frankfurt, healthcare claims need HIPAA controls, a defense customer needs FedRAMP High. The right answer is rarely the cheapest or fastest pipeline. It is the one that does not violate the regulator.
This guide walks through every control surface a Data Engineer touches when Data Sovereignty and Compliance Design becomes a hard requirement: where bytes physically live, who can decrypt them, who can look at them, and how you prove all of that to an auditor six months later.
Plain English Explanation
Before diving into Assured Workloads folders and EKM key URIs, it helps to anchor Data Sovereignty and Compliance Design with three concrete pictures. Compliance is abstract; warehouses, hotels, and bank vaults are not.
Think of Data Residency Like a Bonded Warehouse
Imagine a shipping company that imports wine into Germany. German customs law says the wine must sit inside a bonded warehouse on German soil until duty is paid. The warehouse operator can be Swiss, the trucks can be Polish, the forklifts can be Japanese, but the bottles themselves cannot leave German ground until the paperwork clears.
Data residency works the same way. A europe-west3 Cloud Storage bucket is your bonded warehouse in Frankfurt. The control plane that manages the bucket may be operated globally by Google, but the customer rows inside it never leave Germany unless you explicitly copy them. Org Policy constraints like constraints/gcp.resourceLocations are the customs officers who refuse to stamp the manifest if you try to provision a bucket in us-central1.
Think of Assured Workloads Like a Hotel Floor for Diplomats
Some hotels reserve an entire floor for visiting diplomats. The elevator will not stop on floor 14 unless you have the right keycard. The cleaning crew is background-checked. The kitchen is separate. Room service is logged. From the outside it looks like a normal hotel; inside, every routine has been hardened for one specific guest profile.
Assured Workloads carves out exactly this kind of floor inside your GCP organization. You pick a compliance regime, like FedRAMP High or HIPAA, and Google spins up a folder where the resource locations, the support personnel, and even the underlying hardware pool are restricted to match that regime. You did not build a separate hotel; you bought guarded rooms inside the one you already use.
Think of CMEK and EKM Like a Safe Deposit Box
A safe deposit box at a bank needs two keys: the bank's guard key and your personal key. Without both, the box does not open. The bank cannot reach your gold bars on its own, and you cannot reach them without showing up to the branch.
Customer-Managed Encryption Keys (CMEK) put the second key in your hands but still inside the bank. External Key Manager (EKM) goes further: your key lives in your own vault across the street, and the bank has to phone you every single time it wants to touch the box. If you stop answering the phone, no one reads your data, not even Google. That is the strongest expression of Data Sovereignty and Compliance Design that GCP offers.
Core Concepts of Data Sovereignty and Compliance Design
Several terms get thrown around interchangeably in vendor marketing, and the PDE exam writers exploit that confusion. Pin these definitions down before anything else.
Data residency: a guarantee about the physical geography where data at rest is stored. Saying europe-west3 for a Cloud Storage bucket gives you residency in Frankfurt. Residency does not say anything about who can access the data, only where the bytes live.
Reference: https://cloud.google.com/architecture/framework/security/data-residency-sovereignty
Data sovereignty: a stronger claim that the data is subject to the laws of a specific jurisdiction and cannot be compelled by any other government. Sovereignty implies residency plus operational, technical, and personnel controls. EU data sovereignty, for example, requires that no US-based Google engineer can be compelled by a US warrant to hand over EU customer data.
Reference: https://cloud.google.com/sovereign-cloud
Data boundary: a logical perimeter, often enforced by VPC Service Controls and Org Policy, that prevents data from crossing into services or regions outside the boundary even when a misconfigured IAM grant would otherwise allow it.
Reference: https://cloud.google.com/vpc-service-controls/docs/overview
The other big set of vocabulary belongs to compliance frameworks. Each framework prescribes a different mix of controls, and Data Sovereignty and Compliance Design is the work of mapping framework requirements to concrete GCP knobs.
- GDPR (EU): focuses on lawful basis for processing, data subject rights (access, deletion, portability), and breach notification within 72 hours. Cross-border transfers out of the EEA need Standard Contractual Clauses (SCCs) or an adequacy decision.
- HIPAA (US healthcare): protects PHI. Requires a signed Business Associate Agreement (BAA) with Google. Audit trails, encryption, and minimum-necessary access are mandatory.
- SOC 2: an attestation report, not a law. Auditors evaluate Security, Availability, Processing Integrity, Confidentiality, and Privacy against Trust Services Criteria. GCP itself is SOC 2 Type II audited, which gives customers inheritable controls.
- FedRAMP (US federal): tiered as Low, Moderate, and High. Workloads need US-person support, US-located resources, and continuous monitoring. Assured Workloads has a dedicated FedRAMP High package.
Residency, sovereignty, and a compliance certification are not the same thing. A Frankfurt bucket gives you residency. Assured Workloads for EU Sovereign Controls gives you sovereignty. A SOC 2 report gives you a third-party attestation. The PDE exam will test whether you can pick the minimum control that satisfies the stated requirement without over-engineering.
Reference: https://cloud.google.com/architecture/framework/security/data-residency-sovereignty
Architecture and Design Patterns
Most regulated data platforms on GCP converge on a small set of patterns. Recognize them and the exam scenarios become pattern-matching exercises.
The first pattern is the single-region locked tenant. A folder is created with constraints/gcp.resourceLocations set to a single region, plus VPC Service Controls wrapping every API used by the pipeline. BigQuery datasets are pinned to that region. Cloud Storage buckets are regional. Dataflow workers use regional endpoints. Logs route to a regional log bucket. Nothing leaves the geography even if a developer tries.
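The effect of a single-region lock can be pictured with a small local model. The sketch below is not the Org Policy service, which evaluates constraints/gcp.resourceLocations on Google's side. The helper names `is_location_allowed` and `check_resources`, the allow-list, and the resource names are all illustrative assumptions.

```python
# Illustrative model only: the real constraint is enforced server-side by Org Policy.
ALLOWED_LOCATIONS = frozenset({"europe-west3"})  # assumed single-region lock


def is_location_allowed(location: str, allowed: frozenset = ALLOWED_LOCATIONS) -> bool:
    """Return True when the requested location is in the allow-list."""
    return location.strip().lower() in allowed


def check_resources(resources: dict) -> list:
    """Return the names of planned resources whose location violates the lock."""
    return sorted(
        name for name, location in resources.items()
        if not is_location_allowed(location)
    )


if __name__ == "__main__":
    planned = {  # hypothetical resource names
        "claims-bucket": "europe-west3",
        "claims-dataset": "europe-west3",
        "scratch-bucket": "us-central1",
    }
    print(check_resources(planned))  # ['scratch-bucket']
```

A check like this is only useful as a pre-flight lint in a deployment pipeline; the enforcement that auditors care about still comes from the Org Policy constraint itself.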
The second pattern is the dual-region paired tenant. When a single region is not durable enough but the data still cannot leave a continent, dual-region buckets like eur4 (Netherlands and Finland) or nam4 (Iowa and South Carolina) replicate objects across two regions in the same legal area (asynchronously by default, with a 15-minute RPO when turbo replication is enabled). BigQuery offers similar multi-region locations (EU and US) that stay within a continent. This pattern survives a regional outage without violating residency.
The third pattern is the sovereign enclave. Assured Workloads provisions a hardened folder where Google support is restricted to vetted personnel, Access Transparency is on by default, and CMEK is enforced. This is the answer for FedRAMP High, IL4, EU Sovereign Controls, and similar regimes where contractual residency is not enough.
The fourth pattern is the hybrid key escrow. EKM is wired into BigQuery and Cloud Storage. The actual key material lives at a third-party HSM provider like Fortanix, Thales, or Equinix SmartKey. Each decrypt operation triggers an outbound RPC to the external HSM. If the customer revokes the key, the data becomes unreadable in seconds. This is how companies prove cryptographic sovereignty even when they choose a US-headquartered cloud.
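The revocation behaviour in this pattern can be sketched with a toy envelope-encryption model. Everything here is an assumption made for illustration: `ExternalKeyManager` is a hypothetical stand-in for the customer-operated HSM, and the XOR keystream is deliberately not real cryptography (it only keeps the example within the Python standard library). The point is the control flow: each decrypt has to call the external key holder, and a revoked key makes the ciphertext unreadable.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR cipher (NOT secure); applying it twice with the same key is a no-op."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ExternalKeyManager:
    """Hypothetical external HSM that holds the key-encryption key (KEK)."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)
        self.revoked = False

    def wrap(self, dek: bytes) -> bytes:
        return _keystream_xor(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        # Every decrypt passes through here, so the customer can refuse it.
        if self.revoked:
            raise PermissionError("key access revoked by customer")
        return _keystream_xor(self._kek, wrapped_dek)


def encrypt(ekm: ExternalKeyManager, plaintext: bytes) -> tuple:
    """Encrypt with a fresh data key (DEK) and store only the wrapped DEK."""
    dek = secrets.token_bytes(32)
    return ekm.wrap(dek), _keystream_xor(dek, plaintext)


def decrypt(ekm: ExternalKeyManager, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    """Ask the external key manager to unwrap the DEK, then decrypt."""
    dek = ekm.unwrap(wrapped_dek)
    return _keystream_xor(dek, ciphertext)
```

Setting `revoked = True` on the key manager makes every later `decrypt` call raise `PermissionError`, which mirrors the "stop answering the phone" behaviour described earlier.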
For multi-region BigQuery, remember that EU and US are continent-bounded multi-regions, not global. A dataset in the EU location stores data only across European regions. This is usually enough for GDPR residency without paying for a strict single-region lock.
A pattern worth calling out separately is log routing for compliance. Default Cloud Audit Logs live for 400 days for Admin Activity and 30 days for Data Access. Most regulated industries demand 6 or 7 years. The pattern is to create an aggregated log sink at the organization or folder level, route to a Cloud Storage bucket with bucket lock, and set a retention policy. Bucket lock makes the retention policy immutable, which is what auditors want to see.
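As a rough sketch of that pattern, the Python below builds the two configuration fragments involved: an aggregated log sink and a bucket retention policy. The field names follow the Cloud Logging sink resource (`destination`, `filter`, `includeChildren`) and the Cloud Storage bucket resource (`retentionPolicy.retentionPeriod` in seconds). The sink name, bucket name, helper names, and the 365-day year are assumptions for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_seconds(years: int) -> int:
    """Convert a retention requirement in years to seconds (365-day years assumed)."""
    return years * 365 * SECONDS_PER_DAY


def build_audit_sink(bucket_name: str) -> dict:
    """Describe an aggregated sink that routes audit logs to a Cloud Storage bucket."""
    return {
        "name": "org-audit-archive",  # hypothetical sink name
        "destination": f"storage.googleapis.com/{bucket_name}",
        "filter": 'logName:"cloudaudit.googleapis.com"',
        "includeChildren": True,  # aggregate logs from child folders and projects
    }


def build_retention_policy(years: int) -> dict:
    """Describe the retention policy to set on the bucket before locking it."""
    return {"retentionPolicy": {"retentionPeriod": str(retention_seconds(years))}}


if __name__ == "__main__":
    print(build_audit_sink("example-audit-archive"))  # hypothetical bucket name
    print(build_retention_policy(7))
```

Locking the policy is a separate, irreversible step performed on the bucket; the sketch stops short of it on purpose.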
GCP Service Deep Dive
Each GCP service exposes its own knobs for Data Sovereignty and Compliance Design. The PDE exam tests whether you know the right knob on the right service.
Cloud Storage offers regional, dual-region, and multi-region buckets. The location is set at creation and cannot be changed. Dual-region buckets (e.g. eur4) replicate asynchronously; enabling turbo replication for an extra fee brings the RPO down to 15 minutes, backed by SLA. Bucket lock plus retention policy creates Write-Once-Read-Many (WORM) storage that satisfies SEC 17a-4(f), FINRA, and CFTC retention rules. CMEK is configured per bucket via the --default-kms-key flag.
BigQuery datasets carry a location attribute. Once set, you cannot move a dataset; you must export and re-create. Cross-region copies require the BigQuery Data Transfer Service or a managed copy job, and the destination must be in a compatible region. CMEK applies at dataset and table level. Authorized views, row-level security, and column-level security via policy tags layer on top for fine-grained control.
Dataflow jobs run in a region you specify, but the staging bucket, the worker subnet, and the regional endpoint must all line up. A common mistake is to set --region=europe-west3 while leaving the staging bucket in us-central1, which causes intermediate shuffle data to land in the US and breaks residency.
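A pre-launch check for this mistake might look like the sketch below. The function name `residency_mismatches` and the component labels are hypothetical; the locations would have to be looked up separately (for example from the bucket and subnet metadata). The comparison is case-insensitive because Cloud Storage reports bucket locations in upper case.

```python
def residency_mismatches(job_region: str, component_locations: dict) -> dict:
    """Return the components whose location differs from the Dataflow job region."""
    expected = job_region.strip().lower()
    return {
        component: location
        for component, location in component_locations.items()
        if location.strip().lower() != expected
    }


if __name__ == "__main__":
    problems = residency_mismatches(
        "europe-west3",
        {
            "staging_bucket": "US-CENTRAL1",   # the classic mistake
            "temp_bucket": "EUROPE-WEST3",
            "worker_subnet": "europe-west3",
        },
    )
    print(problems)  # {'staging_bucket': 'US-CENTRAL1'}
```

An empty result means every listed component lines up with the job region; anything else should block the launch.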
Pub/Sub is a global service: by default a message is stored in the region nearest to where it was published, so a publisher outside your intended geography can cause data to persist outside it. For strict residency, use Pub/Sub message storage policies to pin the allowed persistence regions. Without such a policy, persistence follows the publisher, not your compliance boundary.
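A message storage policy is a short list of allowed regions on the topic. The sketch below builds that fragment using the topic resource's field names (`messageStoragePolicy.allowedPersistenceRegions`) and checks it against an approved set. The approved set and the helper names are assumptions; which regions count as approved is a legal decision, not something the code can infer.

```python
APPROVED_REGIONS = frozenset({"europe-west3", "europe-west4"})  # assumed, set by compliance


def build_topic_config(allowed_regions: list) -> dict:
    """Build the topic fragment that pins where messages may be persisted."""
    return {"messageStoragePolicy": {"allowedPersistenceRegions": list(allowed_regions)}}


def unapproved_regions(topic_config: dict, approved: frozenset = APPROVED_REGIONS) -> list:
    """Return configured persistence regions that are not in the approved set."""
    policy = topic_config.get("messageStoragePolicy", {})
    regions = policy.get("allowedPersistenceRegions", [])
    return sorted(region for region in regions if region not in approved)


if __name__ == "__main__":
    config = build_topic_config(["europe-west3", "us-east1"])
    print(unapproved_regions(config))  # ['us-east1']
```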
Cloud KMS provides software-backed and HSM-backed keys, both region-pinned. External Key Manager (EKM) lets a key reference point at a URI hosted by a third-party HSM. EKM via VPC means the HSM call traverses your private network, which removes the public internet from the trust path. Each decrypt operation generates a justification field that the external HSM can inspect and reject.
Assured Workloads is the umbrella service. You pick a compliance regime when creating the workload folder. Supported regimes include FedRAMP Moderate, FedRAMP High, IL4, IL5, HIPAA, CJIS, ITAR, EU Regions and Support, EU Sovereign Controls (with T-Systems for Germany), and Canada Controlled Goods. The folder enforces resource location restrictions, blocks non-compliant services, restricts Google support personnel, and pre-wires Access Transparency and Access Approval.
VPC Service Controls draws a perimeter around projects so that even with valid IAM, a service like BigQuery cannot read from or write to a resource outside the perimeter. This stops the classic exfiltration scenario where a stolen service-account key tries to copy a dataset into an attacker-controlled project. Ingress and egress rules control the narrow exceptions.
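The interaction between IAM and a perimeter can be reduced to a two-step decision. The sketch below is a deliberately simplified model: it ignores ingress and egress rules, perimeter bridges, and access levels, and the function name and project IDs are hypothetical. It shows only why a valid IAM grant is not enough once a request would cross the perimeter.

```python
def copy_allowed(
    caller_has_iam: bool,
    source_project: str,
    dest_project: str,
    perimeter: frozenset,
) -> bool:
    """Simplified model: allow a copy only if IAM permits it AND it stays on one side of the perimeter."""
    if not caller_has_iam:
        return False  # IAM is still the first gate
    crosses_perimeter = (source_project in perimeter) != (dest_project in perimeter)
    return not crosses_perimeter


if __name__ == "__main__":
    perimeter = frozenset({"claims-prod", "claims-analytics"})  # hypothetical projects
    # Stolen credentials with valid IAM, copying to an attacker-controlled project:
    print(copy_allowed(True, "claims-prod", "attacker-project", perimeter))  # False
    # Same credentials, copy between two projects inside the perimeter:
    print(copy_allowed(True, "claims-prod", "claims-analytics", perimeter))  # True
```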
Cloud DLP (Sensitive Data Protection) does not enforce residency on its own, but it is the workhorse for finding PII before it crosses a boundary. Pair it with Dataflow templates to redact, tokenize, or pseudonymize records prior to export.
Access Transparency logs every action a Google employee takes against your customer content. Access Approval turns those actions into explicit approval requests; without your approval, the Google engineer cannot proceed. The two are often confused.
Access Transparency and Access Approval are not the same feature. Access Transparency is read-only logging, available on most regulated GCP services. Access Approval is gating, available on a smaller set of services and required for some Assured Workloads regimes. Confusing the two is a frequent PDE exam distractor.
Reference: https://cloud.google.com/cloud-provider-access-management/access-approval/docs/overview
Common Pitfalls and Trade-offs
Real teams trip on the same set of mistakes when implementing Data Sovereignty and Compliance Design. The exam mirrors these.
Forgetting non-data services. Engineers lock down BigQuery and Cloud Storage but forget that Cloud Logging, Cloud Monitoring, Error Reporting, and Cloud Build all store data too. Build logs may contain customer schemas. Default log buckets may sit outside your region. Org Policy must cover every service, not just the obvious data services.
Over-using multi-region for residency. A team picks the US BigQuery multi-region thinking it is "global" and then realizes a Canadian regulator considers Iowa-stored data a cross-border transfer. Multi-region is convenient for durability and locality but does not satisfy a strict country-level residency claim.
CMEK theatre. Enabling CMEK with a Google-managed key project gives you a control plane illusion without real sovereignty. If Google still operates the KMS instance, a US warrant could in principle compel disclosure. EKM with a customer-operated HSM is the only architecture that breaks that chain.
Ignoring the staging bucket. Dataflow, Dataproc, and Vertex AI training all need a staging bucket. Picking a multi-region staging bucket while running a regional job leaks intermediate data to other regions. Staging buckets must match the job region.
Treating BAA as encryption. Some teams assume that signing a HIPAA BAA with Google magically makes their workload compliant. The BAA only allows you to use the covered services to store PHI; you still have to encrypt, log, restrict access, and avoid non-covered services. Pub/Sub Lite, for example, has historically not been on the BAA list. Always check the current covered-services page.
Audit log gaps. Data Access logs are off by default for most services because they are noisy and expensive. If a regulator asks "who read this row last March?" and you never enabled Data Access logs, you have nothing. Enabling them retroactively does not recover history.
Data Access audit logs are disabled by default for almost every GCP service except BigQuery. If your compliance regime requires read auditability, you must explicitly enable the DATA_READ and DATA_WRITE log types on the relevant services through the IAM audit config. Discovering this gap during an audit is painful.
Reference: https://cloud.google.com/logging/docs/audit
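The audit configuration lives in the `auditConfigs` section of an IAM policy. As a sketch, the helper below generates that fragment for a list of services; the function name and the choice of services are assumptions, and the result would still have to be merged into the existing policy rather than applied on its own.

```python
import json


def build_audit_configs(services: list, log_types: tuple = ("DATA_READ", "DATA_WRITE")) -> dict:
    """Build the auditConfigs fragment that enables Data Access logs per service."""
    return {
        "auditConfigs": [
            {
                "service": service,
                "auditLogConfigs": [{"logType": log_type} for log_type in log_types],
            }
            for service in services
        ]
    }


if __name__ == "__main__":
    fragment = build_audit_configs(["storage.googleapis.com", "pubsub.googleapis.com"])
    print(json.dumps(fragment, indent=2))
```

Enabling the log types per service rather than for all services keeps log volume and cost closer to what the compliance scope actually requires.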
The trade-offs are real. Strict single-region lock costs more (no cheap multi-region). EKM adds latency on every decrypt. VPC Service Controls breaks innocent automation that crosses perimeters. Assured Workloads is roughly 20 percent more expensive than vanilla GCP. The exam expects you to recognize when these costs are justified by the compliance requirement and when they are over-engineering.
Best Practices
A short, opinionated list of what consistently works in production for Data Sovereignty and Compliance Design.
- Set Org Policy first, build resources second. Apply constraints/gcp.resourceLocations, constraints/storage.uniformBucketLevelAccess, constraints/iam.disableServiceAccountKeyCreation, and constraints/compute.requireOsLogin at the organization or folder level before any project gets provisioned. Retrofitting policy onto live resources is an order of magnitude harder.
- Aggregate logs at the org level into a locked bucket. One log sink, one Cloud Storage destination with bucket lock and a retention policy that matches your longest regulatory requirement. Per-project log sinks fragment your audit story.
- Use Assured Workloads for any FedRAMP, IL4, IL5, or EU Sovereign Controls scope. Manual hardening cannot match what the service enforces, and auditors recognize the Assured Workloads attestation directly.
- Pair CMEK with key rotation. Set automatic rotation on KMS keys (90 days is a common cadence). Use separate keys per environment and per data classification level. A single global key is a single global blast radius.
- Wrap data perimeters with VPC Service Controls in dry-run before enforcement. Dry-run mode logs what would have been blocked without breaking traffic. Two weeks of dry-run data lets you build accurate ingress and egress rules before flipping to enforce.
- Tag everything for data classification. Use Data Catalog and BigQuery policy tags to mark PII, PHI, and confidential columns. Column-level access control then becomes declarative instead of buried in view SQL.
- Document the lawful basis. For GDPR especially, the technical control matters less than your ability to point at a Record of Processing Activities (RoPA) and explain why each dataset exists. Engineering owns the evidence; legal owns the document.
- Test data subject access and deletion paths. GDPR Article 15 (access) and Article 17 (erasure) require that you can find and remove a single user's data, and the response deadline is one month from the request. If your data lake has no user-id index, you cannot answer the request.
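The user-id index mentioned in the last item can be as simple as a mapping from a user to every place their data is stored. The sketch below is an in-memory illustration only: the class name `UserDataIndex` and the location strings are hypothetical, and a production index would live in a durable store and trigger the actual deletions.

```python
from collections import defaultdict


class UserDataIndex:
    """Minimal in-memory index of where each user's data lives."""

    def __init__(self) -> None:
        self._locations = defaultdict(set)

    def record(self, user_id: str, location: str) -> None:
        """Register that data for user_id exists at the given location."""
        self._locations[user_id].add(location)

    def locate(self, user_id: str) -> list:
        """Answer an access request: list every known location for the user."""
        return sorted(self._locations.get(user_id, ()))

    def erase(self, user_id: str) -> list:
        """Answer an erasure request: drop the user and return locations to purge."""
        return sorted(self._locations.pop(user_id, ()))


if __name__ == "__main__":
    index = UserDataIndex()
    index.record("user-123", "bigquery:claims.policyholders")   # hypothetical locations
    index.record("user-123", "gcs:claims-archive/2023/user-123.json")
    print(index.locate("user-123"))
    print(index.erase("user-123"))
    print(index.locate("user-123"))  # []
```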
A practical rule of thumb: if a customer can sue you in court for storing their data in the wrong country, use single-region or sovereign multi-region. If they only care about uptime and latency, multi-region is fine. The exam scenarios almost always hint at the regulatory threat model in the first two sentences.
Real-World Use Case
A mid-sized European insurance company with about 4 million policyholders moves its legacy mainframe-driven claims warehouse to GCP. The board has three hard requirements: claims data must stay in Germany, no US-resident Google personnel can read claims content, and the SEC-equivalent regulator BaFin requires 10 years of immutable audit trails.
The team designs the platform around Data Sovereignty and Compliance Design from day one. They request an Assured Workloads folder under the EU Sovereign Controls package, partnered with T-Systems as the German sovereign operator. All resources are pinned to europe-west3 (Frankfurt) and europe-west10 (Berlin) for dual-region durability. BigQuery datasets are created in europe-west3; Cloud Storage uses configurable dual-region buckets pairing those two German regions rather than the predefined eur4 pairing (Netherlands and Finland), which would move claims data outside Germany. The buckets are encrypted with CMEK keys hosted in Cloud EKM, with the actual key material at a Fortanix HSM cluster physically located in a Frankfurt colocation facility.
Pipelines run on Dataflow with the --region=europe-west3 flag, staging buckets matched to the same region, and Pub/Sub topics configured with message storage policies restricting persistence to EU regions only. VPC Service Controls wrap the entire data perimeter; ingress rules allow the policyholder portal subnet, egress rules allow exports only to a regulator-controlled SFTP endpoint.
For audit, the team creates an aggregated org-level log sink routing every Admin Activity, Data Access, System Event, and Policy Denied log to a single Cloud Storage bucket with bucket lock and a 10-year retention policy. Access Transparency is enabled. Access Approval is wired to the CISO's PagerDuty so that any Google support engineer requesting access generates a real-time alert; without approval within four hours, the request expires.
The result satisfies BaFin, GDPR, and the board's sovereignty mandate. It costs roughly 25 percent more than a vanilla GCP deployment, but the alternative was a private data center, which would have cost 4x. Data Sovereignty and Compliance Design here is not a cost center; it is the reason the migration was approved at all.
Notice how every layer reinforces the same boundary: Org Policy at the perimeter, Assured Workloads inside, CMEK with EKM at the encryption layer, and VPC Service Controls at the API boundary. Defense in depth is the operating principle of Data Sovereignty and Compliance Design. A single layer can fail; multiple layers all failing simultaneously is the kind of event that makes news.
Reference: https://cloud.google.com/architecture/framework/security/data-residency-sovereignty
Exam Tips
The PDE exam tests Data Sovereignty and Compliance Design through scenario questions, not vocabulary recall. The pattern is: a paragraph of business context with a regulatory hint, followed by four architectures, three of which fail one specific compliance check.
Read for the jurisdiction in the first sentence. Phrases like "German healthcare provider", "US federal contractor", "Canadian bank" tell you which regulatory regime drives the answer. The wrong answers usually pick the cheapest or most familiar service without checking residency.
Watch for the residency vs. sovereignty distinction. If the prompt says "data must stay in the EU", any EU multi-region works. If the prompt says "must satisfy German digital sovereignty law" or "no US person may access", the answer is Assured Workloads with EU Sovereign Controls or EKM, not just a Frankfurt bucket.
Memorize the Assured Workloads regimes and which controls each one enforces. FedRAMP High and IL4 are the ones the exam most often tests. Both restrict support personnel to US persons; FedRAMP allows continental US, IL4 narrows further.
Remember that VPC Service Controls operates at the API layer, not the network layer. It blocks the BigQuery Storage Read API call regardless of where the calling VM lives, which is why it stops both internet exfil and lateral movement.
Cheat sheet for the exam:
- Residency only -> regional or dual-region resource + Org Policy constraints/gcp.resourceLocations
- Sovereignty / no foreign access -> Assured Workloads + CMEK + Access Approval
- Cryptographic sovereignty -> EKM with customer-operated HSM
- Boundary enforcement -> VPC Service Controls perimeter
- Long-term audit retention -> aggregated log sink to Cloud Storage with bucket lock
- HIPAA on GCP -> sign BAA + only covered services + Data Access logs on
- GDPR data subject deletion -> tag PII columns + maintain user-id index for erasure
Reference: https://cloud.google.com/architecture/framework/security
The exam will sometimes offer Cloud HSM as the answer when the correct one is EKM. Cloud HSM is a Google-operated FIPS 140-2 Level 3 HSM; key material never leaves the HSM, but the HSM itself is run by Google. EKM places the HSM outside Google. If the prompt requires that "Google must not be able to access the keys," EKM is the only correct answer.
Finally, do not over-pick Assured Workloads. It is the right answer for FedRAMP, IL4/5, and EU Sovereign Controls, but it is overkill for plain HIPAA or plain GDPR, where regular GCP plus a BAA plus residency policies suffice.
Frequently Asked Questions (FAQ)
Q1: What is the difference between data residency and data sovereignty in GCP?
Data residency is a guarantee about geography: your bytes physically live in a named region, like europe-west3. Data sovereignty is a stronger guarantee that the data is governed by a specific jurisdiction's laws and cannot be compelled by foreign governments. A regional Cloud Storage bucket gives you residency. Assured Workloads with EU Sovereign Controls plus EKM gives you sovereignty. The exam loves to test whether you know that residency alone does not satisfy sovereignty requirements.
Q2: When should I use Cloud EKM instead of Cloud KMS or Cloud HSM?
Use Cloud KMS for general-purpose encryption where you want customer-managed keys but trust Google to operate the key material. Use Cloud HSM when you need FIPS 140-2 Level 3 attestation but still let Google operate the hardware. Use Cloud EKM when the regulatory or contractual requirement is that Google must not have access to the key material at all. EKM places the key in a third-party HSM you control, and every decrypt operation triggers an outbound RPC that you can audit, throttle, or revoke. The trade-off is added latency and the operational overhead of running an HSM.
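That decision order can be written down as a tiny helper. The function and parameter names are illustrative assumptions; the logic only restates the answer above, with the strictest requirement checked first.

```python
def choose_key_service(google_must_not_hold_keys: bool, needs_fips_140_2_level_3: bool) -> str:
    """Pick the least restrictive key service that still meets the stated requirement."""
    if google_must_not_hold_keys:
        return "Cloud EKM"   # key material stays in an HSM outside Google
    if needs_fips_140_2_level_3:
        return "Cloud HSM"   # Google-operated hardware with FIPS 140-2 Level 3
    return "Cloud KMS"       # customer-managed keys, Google-operated key material


if __name__ == "__main__":
    print(choose_key_service(google_must_not_hold_keys=True, needs_fips_140_2_level_3=True))
    print(choose_key_service(google_must_not_hold_keys=False, needs_fips_140_2_level_3=True))
    print(choose_key_service(google_must_not_hold_keys=False, needs_fips_140_2_level_3=False))
```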
Q3: Do I need Assured Workloads for HIPAA compliance on GCP?
No. Standard HIPAA compliance on GCP requires that you sign a Business Associate Agreement (BAA) with Google, restrict your workload to BAA-covered services, enable encryption (CMEK is recommended but not strictly required), enable Data Access audit logs, and follow minimum-necessary access principles. Assured Workloads with the HIPAA package adds extra controls like resource-location enforcement and personnel restrictions, which simplify audit but are not strictly required by HIPAA itself. For US federal healthcare workloads under FedRAMP, Assured Workloads becomes mandatory.
Q4: How do VPC Service Controls differ from IAM and firewall rules?
IAM controls who can call an API. Firewall rules control which network packets can reach a VM. VPC Service Controls draw a perimeter around projects so that even a fully-IAM-authorized service account cannot move data outside the perimeter via a service API. The classic example: a developer with valid BigQuery permissions cannot copy a dataset to a personal project outside the perimeter, because the API call itself is blocked at the perimeter, regardless of IAM. VPC SC stops data exfiltration that IAM and firewalls cannot. It is essential for Data Sovereignty and Compliance Design where exfiltration risk is part of the threat model.
Q5: How long should I retain Cloud Audit Logs for compliance?
The right answer depends on the regulation. SOX usually requires 7 years for financial audit trails. HIPAA requires 6 years for PHI access logs. PCI DSS requires 1 year minimum with 3 months immediately accessible. GDPR does not specify a duration but expects retention proportional to the processing purpose. The default Cloud Logging retention is 400 days for Admin Activity logs and 30 days for Data Access logs, which is rarely enough. The standard pattern is to create an aggregated log sink at the organization level that routes to a Cloud Storage bucket with bucket lock and a retention policy matching your longest regulatory requirement.
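The "longest requirement wins" rule is easy to check with arithmetic. The sketch below uses only the figures quoted in this answer; the dictionary keys, helper names, and the 365-day year are assumptions, and the year counts should be confirmed against the regulation text that applies to you.

```python
# Retention figures as quoted in the answer above.
RETENTION_YEARS = {"SOX": 7, "HIPAA": 6, "PCI DSS": 1}
DEFAULT_RETENTION_DAYS = {"admin_activity": 400, "data_access": 30}


def required_retention_years(regimes: list) -> int:
    """Return the longest retention requirement among the applicable regimes."""
    return max(RETENTION_YEARS[regime] for regime in regimes)


def default_is_sufficient(log_kind: str, regimes: list) -> bool:
    """Check whether the default Cloud Logging retention already covers the requirement."""
    required_days = required_retention_years(regimes) * 365
    return DEFAULT_RETENTION_DAYS[log_kind] >= required_days


if __name__ == "__main__":
    regimes = ["HIPAA", "PCI DSS"]
    print(required_retention_years(regimes))                  # 6
    print(default_is_sufficient("admin_activity", regimes))   # False
    print(default_is_sufficient("data_access", ["PCI DSS"]))  # False
```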
Q6: What is the role of dual-region buckets in Data Sovereignty and Compliance Design?
Dual-region Cloud Storage buckets like eur4 and nam4 replicate data asynchronously between two regions in the same continent. With turbo replication enabled, they give you a 15-minute RPO SLA and survive a regional outage without violating continental residency requirements. They are a sweet spot for GDPR workloads that need disaster recovery: data stays in the EU, but you survive losing a single region. They are not appropriate for strict country-level residency, where you need a regional bucket plus a separate regional disaster recovery target.
Q7: How do I prove to an auditor that no Google employee accessed my data?
Enable Access Transparency, which logs every action Google personnel take against your customer content with justification codes. Route those Access Transparency logs to a Cloud Storage bucket with retention. For stronger evidence, enable Access Approval, which requires explicit customer approval before any Google support action proceeds; without approval, the action does not happen. Combine both with EKM, where the external HSM logs every decrypt request, and you have three independent evidence trails that converge on the same answer. Most regulated industries accept this combination as proof of "no unauthorized access by the cloud provider".
Related Topics
- Storage Security and IAM Best Practices
- PII De-identification with Cloud DLP
- Cloud Storage Data Lake Design
Further Reading
- Architecture Framework: Data residency and sovereignty — official guidance on mapping residency and sovereignty requirements to GCP controls.
- Assured Workloads documentation — full list of supported compliance regimes and the controls each one enforces.
- Cloud External Key Manager (EKM) — reference for wiring third-party HSMs into BigQuery, Cloud Storage, and other services.
- VPC Service Controls overview — perimeter design patterns and ingress/egress rule reference.
- Google Cloud compliance offerings — the canonical index of certifications, attestations, and frameworks GCP supports.