IAM and Lake Formation Fine-Grained Access - DEA-C01 Data Engineer Study Notes

IAM and Lake Formation fine-grained access control is the most-cited exam mistake in DEA-C01 community write-ups, because the two layers look like they do the same thing but actually compose in non-obvious ways that catch even experienced data engineers. On the DEA-C01 exam Domain 4 Task 4.2 plants this in roughly one out of every five questions: a candidate who thinks "Lake Formation grants are sufficient" or "IAM S3 policies are sufficient" or "Lake Formation deny overrides IAM allow" will pick the wrong answer in a high-stakes scenario about column-level access, cross-account sharing, or row-level filtering.

This guide untangles IAM and Lake Formation through the Data Engineer / MLOps lens: what each layer controls, how they compose, where the two-layer model bites in production, and how to design fine-grained access without leaving holes. It covers IAM identities and policies for data services, the Lake Formation permissions model (database, table, column, row, cell), data filters, LF-TBAC tag-based access control, registered locations, cross-account sharing via RAM and resource links, IAM Access Analyzer for data stores, and the canonical exam traps planted around the two-layer enforcement model.

Authentication vs Authorization — Two Different Questions

Before talking about IAM or Lake Formation, separate the two concerns.

Authentication — Who Are You

Authentication verifies identity: who is making this request? IAM users (with passwords or access keys), IAM roles (assumed by services or federated identities), and IAM Identity Center (formerly SSO) users are the three identity types DEA-C01 candidates need to know. Authentication answers "is this caller really Alice from the marketing team?" before authorization runs.

Authorization — What Can You Do

Authorization answers "given that we know it is Alice, can she read the customer table?" Authorization is where IAM policies and Lake Formation permissions both live, and where the two-layer composition creates the famous DEA-C01 trap.

Where DEA-C01 Tests Each

Task 4.1 tests authentication mechanisms: federated identity setup, IAM roles for services like Glue and Redshift, IAM Identity Center for analyst SSO. Task 4.2 tests authorization: IAM policies, Lake Formation grants, S3 bucket policies, KMS key policies, and how they compose. The two-layer trap is firmly in Task 4.2 territory.

IAM Identity Types For Data Services

The DEA-C01 exam expects fluency in IAM identity types as they relate to data engineering.

IAM Users And Long-Lived Access Keys

IAM users have passwords for console access and access keys for programmatic access. AWS guidance: minimize IAM users in favor of IAM Identity Center and assumed roles. The DEA-C01 trap: scenarios that suggest creating IAM users for analysts is wrong; the right answer is IAM Identity Center with role assumption.

IAM Roles For Services

Glue jobs, EMR clusters, Redshift clusters, Athena workgroups, and Lambda functions all assume IAM roles to access AWS resources on their behalf. The Glue ETL job role, for example, needs permissions on S3 source buckets, Glue catalog, and CloudWatch logs. The role's trust policy specifies which service can assume it (glue.amazonaws.com).
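As a rough illustration of the trust policy and permission set described above, the sketch below builds both documents as Python dictionaries. The bucket name is a hypothetical placeholder, and the permission list is a minimal assumption rather than a complete Glue job policy.

```python
import json

# Trust policy: states WHO may assume the role (here, the AWS Glue service).
GLUE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def glue_job_permissions(source_bucket: str) -> dict:
    """Identity policy sketch: WHAT the role may do once assumed.

    `source_bucket` is a hypothetical bucket name supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # read source objects
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{source_bucket}/*",
            },
            {   # list the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{source_bucket}",
            },
            {   # read catalog metadata
                "Effect": "Allow",
                "Action": ["glue:GetTable", "glue:GetPartitions"],
                "Resource": "*",
            },
            {   # write job logs
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*:/aws-glue/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(GLUE_TRUST_POLICY, indent=2))
    print(json.dumps(glue_job_permissions("example-raw-zone"), indent=2))
```

The two documents answer different questions: the trust policy controls assumption of the role, the identity policy controls what the assumed role can call.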

IAM Roles For Federated Users

Analysts and engineers authenticated via SAML, OIDC, or IAM Identity Center assume roles temporarily for the duration of their session. The role's trust policy specifies which IdP can assume it. Federated access is the right pattern for human users in production.

Cross-Account IAM Roles

A role in account A can be assumed by a principal in account B if the trust policy allows it. The cross-account pattern is critical for centralized data lake architectures where the lake lives in one account and consumers live in many.

Service-Linked Roles

Some AWS services create their own service-linked roles automatically (Lake Formation, Macie, OpenSearch). The permissions on these roles are defined by the owning service and cannot be edited by you.

IAM Policies For Data Engineering

IAM policies are JSON documents that allow or deny actions on resources. The DEA-C01 exam expects familiarity with policy structure and common condition keys.

Identity-Based vs Resource-Based Policies

Identity-based policies attach to an IAM user, group, or role and define what that identity can do. Resource-based policies attach to the resource (S3 bucket policy, KMS key policy, Glue Catalog resource policy, Lambda permission) and define who can access that resource. Both are evaluated at access time.

Policy Evaluation Logic

A request is allowed if any policy explicitly allows it AND no policy explicitly denies it. Explicit deny always wins. Within a single account, an allow in either identity-based or resource-based policy is sufficient (with caveats around organization SCPs, permission boundaries, and session policies).
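The evaluation order above can be sketched as a toy function. This is a deliberate simplification: it matches action names exactly and ignores resources, conditions, wildcards, SCPs, permission boundaries, and session policies, and it assumes each statement's `Action` is a list.

```python
from typing import Dict, List


def evaluate(statements: List[Dict], action: str) -> str:
    """Return "Deny", "Allow", or "ImplicitDeny" for a single action.

    Toy model of single-account evaluation: explicit deny wins,
    then any allow, otherwise the request is implicitly denied.
    """
    matching = [s for s in statements if action in s["Action"]]
    if any(s["Effect"] == "Deny" for s in matching):
        return "Deny"            # explicit deny always wins
    if any(s["Effect"] == "Allow" for s in matching):
        return "Allow"           # at least one allow, no deny
    return "ImplicitDeny"        # nothing matched: default deny


if __name__ == "__main__":
    identity_policy = [{"Effect": "Allow", "Action": ["s3:GetObject"]}]
    bucket_policy = [{"Effect": "Deny", "Action": ["s3:GetObject"]}]
    print(evaluate(identity_policy, "s3:GetObject"))
    print(evaluate(identity_policy + bucket_policy, "s3:GetObject"))
```

Passing the identity-based and resource-based statements in one list mirrors the single-account case, where an allow in either policy type is enough unless something denies.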

Common Condition Keys For Data Engineering

The condition keys that come up most often in data engineering scenarios are aws:SourceVpc (allow only from a specific VPC), aws:SourceIp (allow only from a specific IP range), aws:RequestTag/* (require specific tags on resource creation), s3:prefix (allow listing only under specific S3 prefixes), kms:ViaService (allow KMS use only when the request comes through a specific service such as S3), and aws:PrincipalTag/* (the ABAC pattern based on principal tags).
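Two of these keys are shown below in policy statements built as Python dictionaries. The bucket, prefix, key ARN, and region are hypothetical inputs, and the statements are a sketch of the condition syntax rather than a complete policy.

```python
from typing import Dict, List


def prefix_and_kms_statements(
    bucket: str, prefix: str, key_arn: str, region: str
) -> List[Dict]:
    """Build two identity-policy statements that use condition keys.

    All four arguments are placeholders chosen by the caller.
    """
    return [
        {   # s3:prefix limits which part of the bucket can be listed
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
            "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
        },
        {   # kms:ViaService allows decryption only for requests made via S3
            "Effect": "Allow",
            "Action": "kms:Decrypt",
            "Resource": key_arn,
            "Condition": {
                "StringEquals": {"kms:ViaService": f"s3.{region}.amazonaws.com"}
            },
        },
    ]
```

With the second statement, the role can decrypt objects when S3 calls KMS on its behalf but cannot call kms:Decrypt directly with the same key.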

Permission Boundaries

A permission boundary is a managed policy that sets the maximum permissions an IAM user or role can have. It is used to constrain delegated administrators: "you can create roles, but the roles cannot exceed this boundary." DEA-C01 plants this in scenarios about delegating IAM administration to a data engineering team without handing them full administrator access.
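One way to picture a boundary is as a set intersection: an action is effective only if both the identity policy and the boundary allow it. The sketch below models that with plain sets of action names and ignores explicit denies, resources, and conditions.

```python
from typing import Set


def effective_permissions(identity_allows: Set[str], boundary_allows: Set[str]) -> Set[str]:
    """Actions that survive a permission boundary (toy model).

    The boundary grants nothing by itself; it only caps what the
    identity policy can grant.
    """
    return identity_allows & boundary_allows


if __name__ == "__main__":
    identity = {"glue:GetTable", "s3:GetObject", "iam:CreateUser"}
    boundary = {"glue:GetTable", "s3:GetObject", "athena:StartQueryExecution"}
    print(sorted(effective_permissions(identity, boundary)))
```

In the example, iam:CreateUser is dropped because the boundary does not include it, and athena:StartQueryExecution is not gained because the identity policy never allowed it.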

Lake Formation — Centralized Data Lake Governance

Lake Formation is the AWS service for centralized data lake permission management built on top of the Glue Data Catalog.

What Lake Formation Adds Over IAM

IAM controls access to AWS resources at the API level: glue:GetTable, s3:GetObject. Lake Formation adds fine-grained data permissions at the database, table, column, row, and cell level, so you can grant SELECT on three columns of a 100-column table without the analyst seeing the other 97. IAM cannot express this: Glue and S3 actions apply to whole tables and objects, so an IAM-only design has to fall back on separate copies or views of the data. Lake Formation makes column-level access a first-class permission.

The Lake Formation Permissions Hierarchy

From coarsest to finest: Catalog (all databases), Database (all tables in a database), Table (one table), Column (subset of columns), Row (filter expression on rows), Cell (column + row filter combined). Grants flow downward: SELECT granted on all tables in a database covers every table in it, and a table-level SELECT covers all columns unless restricted.

Granting And Revoking Permissions

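Lake Formation uses a GRANT and REVOKE model familiar from SQL. In shorthand: GRANT SELECT ON ALL TABLES IN DATABASE marketing TO ROLE analyst-role, or GRANT SELECT (customer_id, order_total) ON TABLE marketing.orders TO ROLE analyst-role. Treat these statements as notation rather than literal syntax; in practice grants are issued through the Lake Formation console, the GrantPermissions API, or the aws lakeformation grant-permissions CLI command. The grants are stored in Lake Formation's permission store and evaluated on every Athena, Redshift Spectrum, EMR, or Glue access.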
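For reference, a column-level grant issued through the API looks roughly like the sketch below, which builds the keyword arguments for boto3's `grant_permissions` call. The role ARN and account ID are hypothetical, and the helper name `column_grant_request` is invented for this example.

```python
from typing import Dict, Iterable


def column_grant_request(
    role_arn: str, database: str, table: str, columns: Iterable[str]
) -> Dict:
    """Build kwargs for lakeformation.grant_permissions (column-level SELECT)."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),  # only these columns are readable
            }
        },
        "Permissions": ["SELECT"],
    }


if __name__ == "__main__":
    request = column_grant_request(
        "arn:aws:iam::111122223333:role/analyst-role",  # hypothetical role
        "marketing",
        "orders",
        ["customer_id", "order_total"],
    )
    print(request)
    # With credentials configured, the request would be sent as:
    #   import boto3
    #   boto3.client("lakeformation").grant_permissions(**request)
```

Building the request separately from sending it makes the grant easy to review or unit-test before it touches a real catalog.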

Data Lake Administrators

A Lake Formation data lake admin has full control over Lake Formation permissions and can grant access to others. The first task when adopting Lake Formation is designating one or more data lake admins; do not use the AWS root account.

IAMAllowedPrincipals — The Backward-Compat Mode

By default, new Glue databases have IAMAllowedPrincipals granted, meaning Lake Formation defers to IAM for permission checks. This is the backward-compatible mode that lets existing IAM-based pipelines continue working. To enable Lake Formation enforcement, revoke IAMAllowedPrincipals and grant explicit Lake Formation permissions. The DEA-C01 trap: candidates who set up Lake Formation but forget to revoke IAMAllowedPrincipals find that IAM permissions still apply, and Lake Formation grants are not enforced as expected.

Lake Formation enforces fine-grained data permissions at database, table, column, row, and cell granularity, but only after IAMAllowedPrincipals is revoked from the catalog database and its tables. Otherwise Lake Formation defers to IAM and your fine-grained grants do not take effect. The DEA-C01 exam plants this as the most-cited trap: a scenario describes "Lake Formation grants are configured but analysts still see all columns." The right answer is to revoke IAMAllowedPrincipals on the database and the affected tables (and to change the Data Catalog default settings so that new databases and tables stop receiving it), not to add more Lake Formation grants. Always check the IAMAllowedPrincipals state when troubleshooting Lake Formation enforcement; it is the single most common source of "Lake Formation is not working" support tickets.
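The revoke step can be sketched as a request for boto3's `revoke_permissions`. The special principal identifier `IAM_ALLOWED_PRINCIPALS` is the one used by the Lake Formation API; the helper name and the database and table names are illustrative.

```python
from typing import Dict, Optional


def revoke_iam_allowed_principals(database: str, table: Optional[str] = None) -> Dict:
    """Build kwargs for lakeformation.revoke_permissions.

    Targets the database when `table` is None, otherwise one table in it.
    """
    if table is None:
        resource = {"Database": {"Name": database}}
    else:
        resource = {"Table": {"DatabaseName": database, "Name": table}}
    return {
        "Principal": {"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
        "Resource": resource,
        "Permissions": ["ALL"],  # the group holds Super (ALL) by default
    }


if __name__ == "__main__":
    print(revoke_iam_allowed_principals("marketing"))
    print(revoke_iam_allowed_principals("marketing", "orders"))
    # With credentials configured:
    #   import boto3
    #   lf = boto3.client("lakeformation")
    #   lf.revoke_permissions(**revoke_iam_allowed_principals("marketing"))
```

Run it for the database and for each existing table, and only after the explicit Lake Formation grants that replace the IAM-only access are in place.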

The Two-Layer Model — Lake Formation Plus S3 Bucket Policies

The most-cited DEA-C01 trap centers on how Lake Formation and S3 bucket policies compose.

Why Two Layers Exist

Lake Formation governs catalog-level access (database, table, column). S3 bucket policies govern object-level access (which IAM principals can read or write objects). When Athena, Redshift Spectrum, or EMR query a Glue table backed by S3 data, BOTH layers must allow the access — Lake Formation must grant SELECT on the table, AND the underlying S3 bucket policy or IAM policy must allow s3:GetObject on the data files.

Registered Locations

Lake Formation introduces "registered locations": S3 paths that Lake Formation governs on behalf of consumers. When a path is registered, Lake Formation vends temporary credentials to integrated query engines (Athena, Redshift Spectrum) so they can read the underlying S3 objects on the user's behalf. The querying principal then needs no direct S3 permissions of its own; only the IAM role used to register the location must be able to read the bucket, and an explicit deny in the bucket policy still applies to that role. When a path is NOT registered, the querying principal needs S3 access directly through a bucket policy or IAM policy.
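Registration itself is a single API call. The sketch below builds the arguments for boto3's `register_resource`; the bucket, prefix, and role ARN are hypothetical, and the choice between the service-linked role and a custom role is left to the caller.

```python
from typing import Dict, Optional


def register_location_request(
    bucket: str, prefix: str = "", role_arn: Optional[str] = None
) -> Dict:
    """Build kwargs for lakeformation.register_resource.

    If `role_arn` is omitted, the Lake Formation service-linked role is used.
    """
    arn = f"arn:aws:s3:::{bucket}"
    if prefix:
        arn = f"{arn}/{prefix.strip('/')}"
    if role_arn is None:
        return {"ResourceArn": arn, "UseServiceLinkedRole": True}
    return {"ResourceArn": arn, "UseServiceLinkedRole": False, "RoleArn": role_arn}


if __name__ == "__main__":
    print(register_location_request("example-lake", "curated/"))
    # With credentials configured:
    #   import boto3
    #   boto3.client("lakeformation").register_resource(
    #       **register_location_request("example-lake", "curated/"))
```

Whichever role is named here is the one that must be able to read the bucket, since credentials vended to query engines are derived from it.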

The Registered-Location Decision

Register S3 locations under Lake Formation governance when you want Lake Formation to be the single source of truth for access decisions on that data. Leave S3 locations unregistered when you want IAM-based access (legacy mode). The DEA-C01 trap: scenarios that mix registered and unregistered locations create complex permission flows — the right answer almost always involves registering all data lake S3 paths under Lake Formation.

Effect Of Combined Policies

If Lake Formation grants SELECT and the S3 bucket policy denies (or doesn't allow) — the access fails. If S3 allows and Lake Formation denies — fails. Both must allow. This is the layered defense-in-depth model that AWS recommends for data lakes.

IAM allow on S3 plus a missing Lake Formation grant equals access denied: the two-layer model requires BOTH layers to allow access, not either one. (Lake Formation has no explicit deny statement; "deny" here means the absence of a grant.) The most-cited DEA-C01 community trap describes a scenario where an analyst has s3:GetObject on a bucket via IAM but cannot read the table via Athena. The candidate who picks "broaden the IAM S3 policy" or "make the S3 bucket public" is wrong. The right answer is to grant Lake Formation SELECT on the table to the analyst's role (and ensure the S3 path is registered under Lake Formation governance). Conversely, a Lake Formation grant without underlying S3 access also fails when the location is unregistered. Always verify both layers when troubleshooting access failures, and prefer registering all data lake paths under Lake Formation so that Lake Formation grants alone are sufficient and the bucket policy can be locked down to the role Lake Formation uses for the registered location.

Column-Level, Row-Level, And Cell-Level Security

Lake Formation's fine-grained permissions are the headline feature DEA-C01 tests.

Column-Level Security

Grant SELECT on specific columns, in shorthand: GRANT SELECT (customer_id, order_total, order_date) ON TABLE orders TO ROLE analyst. The analyst sees only those three columns; queries that reference other columns fail, typically as if the column did not exist. The restriction is applied at query time: the catalog returns only the allowed columns to the query engine.

Row-Level Security Via Data Filters

Lake Formation data filters express row-level filter conditions on a table. A data filter on the orders table with expression region = 'US-WEST' restricts the analyst to rows in the US-WEST region. Combine with column grants to deliver cell-level security: "analyst sees only customer_id and order_total columns, only for US-WEST rows."

Data Filter Anatomy

A data filter has a name, a target table, an optional column-include list, and a row-filter expression. Multiple filters can apply to the same table for different roles. The row-filter expression supports column references, comparison operators, AND/OR logic, and a subset of SQL functions.
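The anatomy above maps onto the `TableData` structure accepted by boto3's `create_data_cells_filter`. The sketch below builds that structure plus the grant that binds the filter to a role; the account ID, role ARN, and filter name are hypothetical.

```python
from typing import Dict, Iterable


def data_cells_filter(
    account_id: str, database: str, table: str, name: str,
    columns: Iterable[str], row_expression: str,
) -> Dict:
    """Build the TableData argument for lakeformation.create_data_cells_filter."""
    return {
        "TableCatalogId": account_id,      # account that owns the catalog
        "DatabaseName": database,
        "TableName": table,
        "Name": name,                      # filter name
        "ColumnNames": list(columns),      # column include-list
        "RowFilter": {"FilterExpression": row_expression},
    }


def filter_grant(role_arn: str, table_data: Dict) -> Dict:
    """Build kwargs for grant_permissions that grant SELECT through the filter."""
    keys = ("TableCatalogId", "DatabaseName", "TableName", "Name")
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {"DataCellsFilter": {k: table_data[k] for k in keys}},
        "Permissions": ["SELECT"],
    }


if __name__ == "__main__":
    td = data_cells_filter(
        "111122223333", "marketing", "orders", "us_west_totals",
        ["customer_id", "order_total"], "region = 'US-WEST'",
    )
    print(td)
    print(filter_grant("arn:aws:iam::111122223333:role/analyst-role", td))
    # With credentials configured:
    #   import boto3
    #   lf = boto3.client("lakeformation")
    #   lf.create_data_cells_filter(TableData=td)
    #   lf.grant_permissions(**filter_grant(role_arn, td))
```

Creating the filter and granting on it are two separate steps, which is what lets several roles share one filter or one table carry several filters.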

Cell-Level Security (CLS)

Cell-level security is the combination of column-level and row-level filters — the analyst sees only specific cells (rows × columns). This is the GDPR/HIPAA pattern: allow analysts to see customer order totals but not credit card numbers, and only for customers who have consented to analytics use.

How Query Engines Honor Filters

Athena, Redshift Spectrum, EMR (with the Lake Formation integration), and Glue all consult Lake Formation at query time and apply column projection plus row filtering at the scan layer — bytes from columns the user cannot see are never returned to the query, and rows that fail the filter expression are dropped before aggregation. The exam tests that all these engines honor Lake Formation filters; standalone Spark on EC2 does not.
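The combined effect of column projection and row filtering can be pictured with a toy in-memory model. This is not how the engines implement it; it only shows which cells survive, using made-up rows.

```python
from typing import Callable, Dict, Iterable, List


def apply_cell_filter(
    rows: Iterable[Dict],
    allowed_columns: List[str],
    row_predicate: Callable[[Dict], bool],
) -> List[Dict]:
    """Keep rows that pass the predicate, then keep only allowed columns."""
    return [
        {col: row[col] for col in allowed_columns}
        for row in rows
        if row_predicate(row)
    ]


if __name__ == "__main__":
    orders = [  # made-up sample data
        {"customer_id": 1, "order_total": 40.0, "card_ref": "tok-a", "region": "US-WEST"},
        {"customer_id": 2, "order_total": 15.5, "card_ref": "tok-b", "region": "EU"},
    ]
    visible = apply_cell_filter(
        orders,
        allowed_columns=["customer_id", "order_total"],
        row_predicate=lambda r: r["region"] == "US-WEST",
    )
    print(visible)
```

Note that the predicate may test a column (region) that is not in the include-list; the analyst is filtered by it without being able to read it.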

Lake Formation Tag-Based Access Control (LF-TBAC)

LF-TBAC is the attribute-based access pattern that scales to large catalogs without per-table grants.

What LF-TBAC Does

Attach LF-Tags to resources (databases, tables, columns), then grant principals permissions on LF-Tag expressions instead of on named resources: "the marketing analyst role can SELECT any resource tagged Department=Marketing." The number of grants is O(roles × tag categories), not O(roles × tables), a massive simplification for catalogs with thousands of tables.

LF-Tags

LF-Tags are key-value pairs managed in Lake Formation. Common patterns: Sensitivity=Public/Internal/Confidential/Restricted, Department=Engineering/Marketing/Finance, Region=US/EU/APAC, DataClass=PII/PHI/Financial.

Policy Tags vs Resource Tags

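LF-Tags are applied to catalog resources (database, table, column), and tables and columns inherit the tags of their parent unless overridden. Principals do not carry LF-Tags; instead, a grant binds a principal to an LF-Tag expression. A grant might say: "the internal-marketing role can SELECT resources with Sensitivity=Internal AND Department=Marketing." Within an expression, different tag keys are combined with AND, and multiple values for one key are combined with OR.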
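A tag-based grant can be sketched as an `LFTagPolicy` resource passed to boto3's `grant_permissions`, alongside a small local matcher that shows the matching semantics. The sketch assumes grants are written against LF-Tag expressions on resources, with AND across keys and OR across values of one key; the role ARN and tag values are illustrative.

```python
from typing import Dict, List


def lf_tag_policy_grant(
    role_arn: str, expression: Dict[str, List[str]], resource_type: str = "TABLE"
) -> Dict:
    """Build kwargs for grant_permissions using an LF-Tag expression."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": resource_type,  # "DATABASE" or "TABLE"
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": ["SELECT"],
    }


def matches(resource_tags: Dict[str, str], expression: Dict[str, List[str]]) -> bool:
    """Local model: every key must match (AND), any listed value may match (OR)."""
    return all(resource_tags.get(key) in values for key, values in expression.items())


if __name__ == "__main__":
    expr = {"Sensitivity": ["Public", "Internal"], "Department": ["Marketing"]}
    print(lf_tag_policy_grant("arn:aws:iam::111122223333:role/mkt-analyst", expr))
    print(matches({"Sensitivity": "Internal", "Department": "Marketing"}, expr))
    print(matches({"Sensitivity": "Restricted", "Department": "Marketing"}, expr))
```

A table tagged later with matching values is covered by the existing grant, which is why no new grant is needed per table.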

When To Use LF-TBAC

LF-TBAC is the right answer for large catalogs where named-resource grants would explode. The DEA-C01 exam plants this with scenarios mentioning "5000 tables across 50 teams" or "scale-friendly permission model." The wrong-answer distractors are "individual table grants" (does not scale) or "use IAM tag-based policies" (cannot do column-level).

Rule Of Thumb

For catalogs under 100 tables or stable team structures, named grants are simpler. For catalogs over 1000 tables or rapidly-changing team structures, LF-TBAC is the right pattern.

Cross-Account Sharing With RAM And Resource Links

Cross-account data lake patterns are a Domain 4 staple.

Why Cross-Account

Multi-account architectures separate workloads, blast radius, and billing. The data lake often lives in a central "data" account and consumer accounts (marketing, finance, ML) need read access without copying the data.

AWS Resource Access Manager (RAM)

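RAM is the AWS service for sharing resources across accounts, and Lake Formation uses it to share databases or tables with consumer accounts. When the producer grants Lake Formation permissions on a Glue catalog database or table to an external account, a RAM resource share is created for that resource. The consumer account accepts the share invitation; when both accounts belong to the same AWS Organization with RAM sharing enabled, no separate acceptance step is needed.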

Resource Links

In the consumer account, a "resource link" is created that points at the shared database or table in the producer account. The consumer queries via the resource link as if it were a local table; behind the scenes Lake Formation forwards the catalog calls to the producer account and applies the producer's grants.

Cross-Account Permission Flow

Producer account: data lake admin grants SELECT on the table to the consumer account principal (account ID or IAM role ARN). RAM share is created. Consumer account: accepts the share, creates a resource link, grants its own users SELECT on the resource link. The producer's grants and the consumer's grants both apply — defense in depth across accounts.
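The two halves of that flow can be sketched as request payloads: a producer-side `grant_permissions` call whose principal is the consumer account ID, and a consumer-side Glue `create_table` call whose `TargetTable` makes the new entry a resource link. Account IDs and names are hypothetical, and granting with the grant option is an assumption made so the consumer's data lake admin can re-grant to local users.

```python
from typing import Dict


def producer_grant(consumer_account_id: str, database: str, table: str) -> Dict:
    """Producer side: kwargs for lakeformation.grant_permissions."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": consumer_account_id},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["SELECT"],
        # Assumption: allow the consumer's admin to re-grant to its own users.
        "PermissionsWithGrantOption": ["SELECT"],
    }


def resource_link_input(
    link_name: str, producer_account_id: str, database: str, table: str
) -> Dict:
    """Consumer side: TableInput for glue.create_table that creates a resource link."""
    return {
        "Name": link_name,
        "TargetTable": {
            "CatalogId": producer_account_id,  # catalog that owns the real table
            "DatabaseName": database,
            "Name": table,
        },
    }


if __name__ == "__main__":
    print(producer_grant("444455556666", "marketing", "orders"))
    print(resource_link_input("orders_link", "111122223333", "marketing", "orders"))
    # With credentials configured in each account:
    #   boto3.client("lakeformation").grant_permissions(**producer_grant(...))
    #   boto3.client("glue").create_table(
    #       DatabaseName="shared_local_db", TableInput=resource_link_input(...))
```

The consumer then grants its own users access on the link and the underlying shared table, and queries the link name from Athena like any local table.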

S3 Cross-Account Considerations

The S3 bucket in the producer account must allow access from the consumer account principals OR the path must be registered under Lake Formation with credentials vending. The simplest production pattern: register the path under Lake Formation, lock the bucket policy down to the role Lake Formation uses for the registered location, and let Lake Formation handle cross-account access.

IAM Access Analyzer For Data Stores

IAM Access Analyzer detects unintended cross-account access in S3 buckets, KMS keys, IAM roles, and Glue catalogs.

What It Detects

External principals (other accounts, federated users, public access) that have access to your resources via resource policies. The analyzer continuously evaluates policies and flags findings.

Glue Catalog Coverage

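Access Analyzer evaluates resource-based policies for external access, but only for the resource types it supports. Before relying on it to audit Glue Data Catalog resource policies, confirm in the current Access Analyzer documentation that the Glue Data Catalog is on the supported list; if it is not, audit cross-account catalog sharing through the Lake Formation permissions and RAM shares themselves.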

S3 Coverage

Access Analyzer flags S3 buckets with public-read access or cross-account ACLs that grant unintended access. Critical for data lakes where one misconfigured bucket can leak the entire catalog.

Integration With Security Hub

Access Analyzer findings feed into Security Hub alongside Config and GuardDuty findings. The DEA-C01 exam plants this as the right answer for "ongoing audit of cross-account data access" or "automated detection of unintended public S3 buckets."

Plain-Language Explanation: IAM And Lake Formation Fine-Grained Access

Three concrete analogies make the two-layer model intuitive.

Analogy 1 — The Office Building With Both Building Security And Department Locks

IAM is the office building's main security desk: every employee needs a building badge to enter, and the badge is checked at the front door (authentication) and at the elevator (authorization to certain floors). The building security knows who you are and what floors you can visit. Lake Formation is the department-level locks: even after you reach the marketing-department floor, the file cabinets are locked individually — confidential client files require an additional department-issued key (column-level grant), and the cabinet for the European clients has a regional key (row-level filter). The DEA-C01 trap is thinking "I have a building badge, so I can read every file" — wrong, the cabinet locks are separate. The correct mental model is two layers: building security (IAM) handles "can you enter the data lake at all," department locks (Lake Formation) handle "which specific tables, columns, and rows you can read." Both layers must allow access; either one denying is enough to block the request. Cross-account sharing is the sister office in another building — your home building's badge is recognized, the sister building issues a visitor badge, and the file-cabinet keys are separately granted.

Analogy 2 — The Library With Library Cards And Restricted-Access Shelves

IAM is the library card you get at the front desk: it identifies you and lets you check out general-collection books. Lake Formation is the closed-stack and rare-book section permissions: even with a library card, you cannot enter the rare-book reading room without a researcher badge (table-level grant), you cannot photocopy specific pages without a special permit (column-level grant), and you cannot view donor-restricted materials before 1990 without a release (row-level filter). LF-TBAC is the academic-discipline tagging — books and researcher badges both carry "history department" or "biology department" tags, and the rule is "history-tagged researchers can read history-tagged shelves." Cross-account sharing via RAM is interlibrary loan: your home library agrees to lend a book to another library, the other library issues a temporary checkout to its patron, and the access flow involves both libraries' policies. The DEA-C01 trap is forgetting that your library card alone (IAM) does not unlock the rare-book section (Lake Formation grants) — both layers are needed.

Analogy 3 — The Safety Deposit Box With Bank Vault And Box Key

IAM is the bank vault door: only authenticated customers with a valid ID can enter the vault area at all. Lake Formation is the individual safety deposit box keys: even inside the vault, each customer's box requires a separate key, and the bank has master records of which customer can open which box. Column-level security is the box that has multiple internal compartments, each with its own combination — you can access the "documents" compartment but not the "jewelry" compartment. Row-level filters are the time-locked compartments that only open within certain conditions. Cell-level security is the combined effect: specific compartments accessible only at specific times. Registered locations are the boxes the bank governs centrally — you go to the bank, present your customer ID, and the bank issues the key to the right box; the box's actual lock mechanism is bank-controlled. Unregistered locations are the safe deposit boxes in the bank's old wing where customers must present both their ID at the vault door AND their personal box key — the dual-control pattern that the two-layer Lake Formation + IAM/S3 model formalizes.

Common Exam Traps For IAM And Lake Formation

Memorize all five.

Trap 1 — IAMAllowedPrincipals Still Granted

A scenario: "Lake Formation is set up but column-level grants do not take effect." Wrong: add more Lake Formation grants. Right: revoke IAMAllowedPrincipals on the database so Lake Formation enforcement actually engages.

Trap 2 — Bucket Policy Allows But Lake Formation Denies

A scenario: "Analyst has S3 GetObject permission but Athena query returns access denied on the table." Wrong: broaden S3 policy. Right: grant Lake Formation SELECT on the table; the two-layer model requires both.

Trap 3 — IAM Tag-Based For Column-Level

A scenario: "Need column-level access on a 1000-table catalog scaled by tags." Wrong: IAM ABAC with PrincipalTag conditions. Right: Lake Formation LF-TBAC. IAM cannot natively express column-level grants.

Trap 4 — Cross-Account Without Resource Links

A scenario: "Consumer account queries the shared database directly by name." Wrong: query producer_account.database.table directly. Right: create a resource link in the consumer account and query through the link.

Trap 5 — Lake Formation Encrypts Data

A scenario suggests Lake Formation encrypts data at rest. Wrong — Lake Formation governs access, KMS encrypts data. The two are complementary but distinct. Always look for KMS or S3 encryption settings as the answer to "encrypt the data."

Adopt Lake Formation tag-based access control (LF-TBAC) from day one for any data lake with more than 100 tables or more than 10 distinct teams. Named-resource grants scale as O(roles × tables) and become unmaintainable at production scale; LF-TBAC scales as O(roles × tag categories), typically a 10x to 100x reduction in grant count. The pattern: define a small number of LF-Tag categories (Sensitivity, Department, Region, DataClass), tag every catalog resource at creation time, and grant each role permissions on the LF-Tag expressions that match its responsibilities. New tables are covered automatically once they receive the right tags; new teams get access through a handful of tag-expression grants on their role. The DEA-C01 exam plants this as the right answer for "scale fine-grained permissions across a large enterprise data lake without explosion of grants"; never pick "individual table grants per team" when LF-TBAC is an option.
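The scaling argument is simple arithmetic. The sketch below compares worst-case grant counts under the two models, using the "5000 tables across 50 teams" figures and the four tag categories named in this guide; it assumes every role needs a grant on every table or tag category, which overstates real catalogs.

```python
def named_resource_grants(roles: int, tables: int) -> int:
    """Worst case: one grant per (role, table) pair."""
    return roles * tables


def tag_based_grants(roles: int, tag_categories: int) -> int:
    """Worst case: one grant per (role, tag category) pair."""
    return roles * tag_categories


if __name__ == "__main__":
    roles, tables, categories = 50, 5000, 4
    named = named_resource_grants(roles, tables)
    tagged = tag_based_grants(roles, categories)
    print(f"named-resource grants: {named}")
    print(f"LF-TBAC grants:        {tagged}")
    print(f"reduction factor:      {named // tagged}x")
```

Real reductions are smaller because no team needs every table, but the named-resource count still grows with every new table while the tag-based count does not.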

Lake Formation Permissions Reference — The Canonical List

The DEA-C01 exam expects familiarity with the permission verbs.

Database-Level Permissions

CREATE_TABLE, DROP, ALTER, DESCRIBE
CREATE_TABLE_READ_WRITE (full access to create and manage tables)

Table-Level Permissions

SELECT (read), INSERT (write), DELETE (delete rows)
DESCRIBE (see metadata), ALTER (modify schema), DROP (delete table)
ALL (super-user access to the table)

Grantable Permissions

WITH GRANT OPTION — recipient can re-grant the permission to others
Used for delegating administration to data stewards

Data Filter Permissions

Column-include lists on SELECT
Row-filter expressions on SELECT
Combined into cell-level access patterns

Cross-Account Permissions

Granted to AWS account ID or to specific IAM role ARN
Recipient must accept RAM share in their account

Memorize the four-layer enforcement chain for Athena/Redshift/EMR queries on data lake S3 data: (1) IAM identity policy must allow the catalog API and Athena/Redshift action, (2) Lake Formation must grant SELECT (or column-level / row-level grants) on the table, (3) S3 access must be allowed (either via bucket policy + IAM, OR via Lake Formation registered location credentials vending), (4) KMS key policy must allow decryption if SSE-KMS is in use. All four layers must allow the access; any one denying causes the request to fail. The DEA-C01 exam plants permission-failure scenarios that test which layer is the culprit. Common-cause checklist when troubleshooting: did you revoke IAMAllowedPrincipals? Did you register the S3 location? Does the role have kms:Decrypt on the key? Is the resource link created in the consumer account? Memorize the four layers; nine out of ten Lake Formation troubleshooting scenarios are one of them.
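The four-layer chain lends itself to a small decision function that returns the first layer that would block a query. It is a mental model only: the boolean fields are assumptions about what you have already verified by hand, not values fetched from AWS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessState:
    iam_allows_api: bool        # catalog and query-engine actions allowed
    lf_grants_select: bool      # Lake Formation grant covers the table/columns
    location_registered: bool   # S3 path registered with Lake Formation
    direct_s3_allowed: bool     # principal can read S3 via IAM/bucket policy
    sse_kms: bool               # data encrypted with SSE-KMS
    kms_decrypt_allowed: bool   # relevant role has kms:Decrypt on the key


def first_failing_layer(state: AccessState) -> Optional[str]:
    """Return the first blocking layer in check order, or None if all pass."""
    if not state.iam_allows_api:
        return "IAM"
    if not state.lf_grants_select:
        return "Lake Formation"
    # S3 passes via credential vending OR direct S3 permissions.
    if not (state.location_registered or state.direct_s3_allowed):
        return "S3"
    if state.sse_kms and not state.kms_decrypt_allowed:
        return "KMS"
    return None


if __name__ == "__main__":
    analyst = AccessState(
        iam_allows_api=True, lf_grants_select=False,
        location_registered=True, direct_s3_allowed=True,
        sse_kms=True, kms_decrypt_allowed=True,
    )
    print(first_failing_layer(analyst))
```

The example reproduces the classic trap: S3 access is fine, yet the query is blocked at the Lake Formation layer.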

FAQ — IAM And Lake Formation Fine-Grained Access Top Questions

Q1 — When should I use IAM versus Lake Formation for data lake access control?

Use IAM for AWS API-level access — who can call glue:GetTable, athena:StartQueryExecution, s3:PutObject. Use Lake Formation for data-level access — who can SELECT from which database, table, column, and which rows. The two layers compose: IAM authenticates the caller and authorizes the API call, Lake Formation authorizes the data access. For new data lakes, adopt Lake Formation as the primary data permission model (revoke IAMAllowedPrincipals, grant explicitly). For legacy pipelines built on IAM-only, migrate gradually by enabling Lake Formation per database and revoking IAMAllowedPrincipals only after verifying Lake Formation grants are complete. The DEA-C01 exam plants this with scenario detail — "column-level access" is always Lake Formation; "API-level access for the Glue ETL job" is IAM.

Q2 — How do I grant column-level access to a sensitive PII table without copying the table?

In Lake Formation, grant SELECT on only the non-sensitive columns. In shorthand: GRANT SELECT (customer_id, order_total, order_date) ON TABLE orders TO ROLE analyst-role; in practice the grant is issued through the console, the GrantPermissions API, or the CLI. The analyst can SELECT only those three columns; queries referencing other columns fail. The query engine (Athena, Redshift Spectrum, EMR) projects only the allowed columns at the scan layer, so the analyst never receives data from the PII columns. For row-level filtering, add a Lake Formation data filter with an expression like region = 'US-WEST'. For cell-level security, combine a column include-list and a row filter in the same data filter. The DEA-C01 exam plants this as the canonical answer to "fine-grained access without copying data."

Q3 — How do I share a Lake Formation table with another AWS account?

The producer account (data lake) creates a Lake Formation grant with a consumer account ID or IAM role ARN as the recipient, and the database or table is shared with the consumer through a RAM share. The consumer account accepts the RAM share, creates a resource link in its own catalog pointing at the shared database, grants its own users SELECT on the resource link, and queries the resource link from Athena/Redshift/EMR. The S3 bucket in the producer account must either allow cross-account access via bucket policy OR (preferred) the path must be registered under Lake Formation with credentials vending. The DEA-C01 exam plants this with multi-account scenarios: the right answer involves RAM share + resource link + Lake Formation grants on both sides.

Q4 — What is IAMAllowedPrincipals and why does it matter?

IAMAllowedPrincipals is a backward-compatibility setting on Glue databases that, when granted, makes Lake Formation defer to IAM for permission checks on that database. New databases created in the Glue console default to having IAMAllowedPrincipals granted. As long as IAMAllowedPrincipals is granted, your Lake Formation column-level or row-level grants are NOT enforced — IAM permissions on the Glue catalog and S3 take precedence. To enable Lake Formation enforcement, revoke IAMAllowedPrincipals on the database (and table) and grant explicit Lake Formation permissions to the roles that need access. This is the single most-cited "why is Lake Formation not working" troubleshooting issue. The DEA-C01 exam plants this trap directly — the answer to "Lake Formation grants not taking effect" is almost always "revoke IAMAllowedPrincipals."

Q5 — When should I use LF-TBAC versus named-resource grants?

Use LF-TBAC when the catalog has hundreds or thousands of tables, when teams change frequently, or when the permission model can be expressed as attributes (department, sensitivity, region, data class). LF-TBAC scales as O(roles × tag categories), while named-resource grants scale as O(roles × tables). Use named-resource grants for small catalogs (under ~50 tables) where the simplicity of "grant SELECT on this specific table to this specific role" outweighs the operational benefit of attributes, or for one-off permissions that don't fit a tag taxonomy. The DEA-C01 exam plants LF-TBAC as the right answer for enterprise-scale scenarios; for small focused scenarios, named grants are sufficient.
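
To make the scaling comparison concrete, here is a small worked example. The counts (20 roles, 1,000 tables, 4 tag keys) are invented for illustration, and both formulas are worst-case upper bounds in which every role needs a grant on every table or tag key.

```python
# Illustrative arithmetic only; the counts below are made up.


def named_resource_grants(roles, tables):
    """Upper bound on grants when each role is granted each table by name."""
    return roles * tables


def tag_based_grants(roles, tag_keys):
    """Upper bound on grants when each role is granted one expression per tag key."""
    return roles * tag_keys


roles, tables, tag_keys = 20, 1000, 4
named = named_resource_grants(roles, tables)  # 20 * 1000 = 20000
tagged = tag_based_grants(roles, tag_keys)    # 20 * 4 = 80
print(f"named-resource: {named} grants, LF-TBAC: {tagged} grants")
```

Adding a table under named-resource grants means up to one new grant per role, while under LF-TBAC the new table only needs its tags assigned.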

Q6 — Does Lake Formation encrypt data?

No. Lake Formation governs ACCESS to data; it does not encrypt data at rest or in transit. Data at rest is encrypted via S3 server-side encryption (SSE-S3, SSE-KMS) on the bucket, by Redshift cluster encryption with KMS, and by Glue job security configuration for intermediate results. Data in transit is protected by TLS on the connection. Lake Formation and KMS are complementary — Lake Formation says "Alice can read columns A and B," KMS says "the bytes are encrypted at rest with key K, and only roles with kms:Decrypt on K can read them." The DEA-C01 exam plants "Lake Formation encrypts" as a wrong-answer distractor; the right encryption answer always involves KMS or service-native encryption.

Q7 — How do I troubleshoot "access denied" errors on a Lake Formation-governed table?

Check four layers in order. Layer 1 (IAM): does the calling role have glue:GetTable, glue:GetPartitions, athena:StartQueryExecution (or equivalent), and lakeformation:GetDataAccess? Without these the query never reaches Lake Formation or cannot obtain vended credentials. Layer 2 (Lake Formation): is IAMAllowedPrincipals revoked on the database? Is the role granted SELECT on the table or on the specific columns? Is there a data filter that excludes the role's rows? Layer 3 (S3): is the underlying S3 path registered under Lake Formation? If yes, the query engine should get credentials from Lake Formation. If no, the role needs s3:GetObject directly via IAM or bucket policy. Layer 4 (KMS): if the data is encrypted with SSE-KMS, does the role (or the Lake Formation service role) have kms:Decrypt on the key? Walking through these four layers in order finds the problem in 95 percent of Lake Formation access-denied tickets.
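
The ordering matters because a failure at an earlier layer hides everything after it. A minimal sketch of that idea as a checklist helper follows; the function and the keys of the facts dictionary are invented for illustration and do not correspond to any AWS API.

```python
# Sketch only: a hypothetical checklist helper. The keys of `facts` are
# invented for illustration and are not fields of any AWS API.

LAYERS = [
    ("IAM", "iam_actions_allowed"),
    ("Lake Formation", "lf_grant_present"),
    ("S3", "s3_path_reachable"),
    ("KMS", "kms_decrypt_allowed"),
]


def first_failing_layer(facts):
    """Return the first layer whose check is False or missing, else None."""
    for layer, key in LAYERS:
        if not facts.get(key, False):
            return layer
    return None


result = first_failing_layer({
    "iam_actions_allowed": True,
    "lf_grant_present": True,
    "s3_path_reachable": False,
    "kms_decrypt_allowed": True,
})
```

Here result names the S3 layer, so the KMS check is deferred until the S3 question is resolved.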
