SageMaker Role Manager and Least-Privilege Patterns — MLA-C01 ML Engineer Study Notes

SageMaker Role Manager and least-privilege patterns is the MLA-C01 topic that separates engineers who hand out AmazonSageMakerFullAccess to every team member from engineers who can design persona-scoped IAM that satisfies a security audit. Domain 4 Task 4.3 tests whether you can pick the right persona template for an ML engineer versus a data scientist versus an MLOps engineer, whether you understand how activity groups compose into role policies, when AWS-managed policies are sufficient and when you must hand-craft custom policies, and how IAM condition keys, ABAC tagging strategies, and Service Control Policies turn least-privilege from a principle into an enforceable guardrail. The community pain-point signal is loud and consistent: K21 Academy reports IAM least-privilege patterns specific to SageMaker Role Manager are under-studied, and live-exam reflections from Sourabh Sinha confirm security topics including Role Manager appeared more than expected.

This guide is engineered for the MLA-C01 ML engineer perspective — practical IAM design rather than IAM theory. We cover what SageMaker Role Manager actually generates, the four built-in personas and the activity groups that compose them, the gap between AWS-managed policies and what production teams need, the ABAC tagging blueprint that scales to many teams, the IAM condition keys that turn cost guardrails and security guardrails into enforceable code, the Service Control Policies that lock entire ML accounts into approved configurations, and the troubleshooting decision trees for when SageMaker Role Manager produces a role that does not work. Every callout points at the official AWS source so the canonical reference for SageMaker Role Manager and least-privilege patterns is one click away.

What Is SageMaker Role Manager?

SageMaker Role Manager is an AWS-built tool inside the SageMaker console that generates IAM execution roles from persona-based templates rather than requiring engineers to hand-craft IAM policy JSON. You pick a persona (ML engineer, data scientist, MLOps engineer, business analyst), pick activity groups that match the workflows that persona will perform, optionally apply IAM conditions and tags, and Role Manager produces a role with a tightly-scoped inline policy. The output is a starting point for least-privilege — not the final answer, but a well-structured baseline that beats hand-rolled policies and dramatically beats AmazonSageMakerFullAccess.

Why SageMaker Role Manager Exists

Before SageMaker Role Manager, two things happened in real organizations. First, teams attached AmazonSageMakerFullAccess to everything: the policy is overly permissive (S3 object read, write, and delete on any bucket or object whose name contains "sagemaker", and iam:PassRole on every role as long as it is passed to SageMaker or a related service), but writing custom policies was painful. Second, security teams hand-crafted custom policies in a vacuum, often producing roles that broke at the first novel workflow. SageMaker Role Manager addresses both: it gives platform teams a structured starting point that maps cleanly to job functions, and it gives security teams a vocabulary (personas, activity groups) for reasoning about what an ML role should be allowed to do.

Where SageMaker Role Manager Sits in the Console

Navigate to SageMaker AI → Admin configurations → Role Manager. The flow has four steps: select persona, configure permissions and conditions, configure network access, review and create. The output is an IAM role in your account with an inline policy and a trust policy targeting the SageMaker service principal. From the moment of creation, the role behaves like any other IAM role — you can attach managed policies, edit the inline policy, attach permission boundaries, and audit it with IAM Access Analyzer.

What Role Manager Does Not Replace

SageMaker Role Manager does not configure resource-level permissions on the destination side — it does not write S3 bucket policies, KMS key policies, or VPC endpoint policies. It produces the principal-side IAM. For least-privilege patterns to actually work, the resource-side policies (bucket policy, key policy, endpoint policy) must align with what the role's IAM allows. SageMaker Role Manager and least-privilege patterns are a system, not a single tool.

Plain-Language Explanation: SageMaker Role Manager and Least-Privilege

Three concrete analogies make SageMaker Role Manager and least-privilege patterns stick.

Analogy 1 — The Hospital Staff Badge System

Imagine a hospital with four staff personas — surgeon, nurse, lab technician, billing clerk. The hospital does not hand every staff member a master key card that opens every door. Instead, HR runs a badge issuance system that says "if your job title is surgeon, your badge opens operating rooms, supply closets in the surgical wing, and the surgeon's lounge — nothing else." That badge issuance system is SageMaker Role Manager. The personas are the four built-in templates. The activity groups are the modular units — "perform surgery," "handle controlled substances," "review patient charts" — that each persona is granted in different combinations. A nurse and a surgeon both have "review patient charts" but only the surgeon has "perform surgery." A billing clerk has neither. When the hospital onboards a new staff member, HR picks the persona, the activity groups, and possibly extra conditions ("only badges valid 7am to 7pm") and the badge prints with exactly those rights. SageMaker Role Manager and least-privilege patterns work the same way for ML teams.

Analogy 2 — The Locksmith's Master Key System

Picture a corporate office with a locksmith who manages master keys. There is a building master key (AmazonSageMakerFullAccess) that opens every door — the locksmith hates handing it out because anyone with it can do anything. There are departmental master keys (the four built-in personas) that open all doors in a single department — IT, finance, marketing, legal — but nothing outside. There are floor master keys (custom personas with handpicked activity groups) that open a specific subset of doors on one floor. There are role-specific keys (IAM-condition-restricted roles) that only open doors during business hours, only when the holder's badge is in the building, and only if the door is in the approved list. A mature ML platform issues role-specific keys via SageMaker Role Manager and treats the building master key as a break-glass artifact stored in the safe. The least-privilege journey is moving every team from building master keys to role-specific keys, one persona at a time.

Analogy 3 — The Restaurant Kitchen Stations

Imagine a high-end restaurant kitchen with stations — pastry, grill, sauté, garde manger, dishwasher. Each station has its own tools, ingredients, and heat sources. A head chef supervises everyone but does not personally hand out knife sets — the restaurant uses a station-based equipment policy: pastry chefs get rolling pins and pastry torches but not chef's knives; grill cooks get tongs and chef's knives but not pastry torches; dishwashers get sponges and access to the dish pit but no cooking equipment. Equipment access is mapped to station (persona), not to the individual person. New cooks rotating into pastry inherit the pastry kit; cooks moving to grill turn in the pastry kit and pick up the grill kit. The kitchen runs efficiently because tools match jobs and nobody has tools they shouldn't. SageMaker Role Manager is the equipment policy; activity groups are the individual tool kits; personas are the stations.

The Four Built-In SageMaker Role Manager Personas

SageMaker Role Manager ships with four pre-defined personas that map to the four common ML team functions. Memorizing them is mandatory for MLA-C01.

Persona 1 — ML Engineer

The ML engineer persona is the broadest. It covers training jobs, processing jobs, model creation, endpoint creation and deployment, and pipeline orchestration. The role can read training data from approved S3 buckets, write model artifacts to approved buckets, register models in the Model Registry, and deploy to endpoints. The ML engineer persona is the closest thing to a "ship it to production" role and is the persona used by SageMaker Pipelines execution roles in many teams.

Persona 2 — Data Scientist

The data scientist persona is scoped to experimentation. It covers SageMaker Studio access, notebook usage, training jobs, processing jobs, and Feature Store reads — but not endpoint deployment, model registry approval, or production CI/CD operations. This split is intentional — data scientists experiment, ML engineers ship. A data scientist who needs to deploy to production uses a separate elevated role with explicit approval.

Persona 3 — MLOps Engineer

The MLOps engineer persona is the operations-focused role. It covers model deployment, endpoint update operations, deployment guardrails, A/B testing configuration, Model Registry approval workflows, monitoring schedule management, and CloudWatch alarm configuration. It does not automatically include training job creation — MLOps engineers operate the production fleet, they do not author models.

Persona 4 — Business Analyst

The business analyst persona is read-mostly. It covers SageMaker Canvas (no-code ML), reading dashboards, querying Model Monitor reports, and consuming model predictions through endpoints — but not training, deployment, or any write to production resources. This persona is for stakeholders who need ML insights without touching the ML platform.

The four SageMaker Role Manager personas (ML engineer, data scientist, MLOps engineer, business analyst) map directly to the four most common ML team functions, and each persona's permissions are intentionally separated at the production-write level. A data scientist cannot deploy endpoints. An MLOps engineer cannot launch training jobs. A business analyst cannot create training jobs or deploy anything. This separation enforces the "experiment → ship → operate" pipeline: different people own each phase, and IAM enforces the boundary. On the MLA-C01 exam, scenario stems with phrases like "the data science team needs to experiment but the production deployment must be controlled" point at this persona split. The right answer is two roles, not one role with broad permissions.

Activity Groups — The Building Blocks of Role Manager Permissions

Each persona is composed of activity groups — modular permission sets that map to specific ML workflows.

What an Activity Group Is

An activity group (AWS documentation calls these ML activities) is a named bundle of IAM actions plus the resource scope and condition keys that make those actions safe. Examples include "Run training jobs," "Manage Model Registry approvals," "Deploy endpoints," "Read from Feature Store online store," "Write to Feature Store offline store." Each activity group corresponds to a distinct workflow step.

How Activity Groups Compose

The ML engineer persona enables activity groups for training, processing, model creation, endpoint creation, pipeline execution, and Model Registry write. The data scientist persona enables activity groups for training, processing, Studio access, and Feature Store read — but not endpoint creation. By picking activity groups, the platform admin builds custom personas without writing IAM JSON.

Custom Personas

The four built-in personas are starting points. SageMaker Role Manager allows defining custom personas with hand-picked activity groups. A typical custom persona pattern: "ML engineer minus production deployment" for a junior engineer, or "MLOps engineer plus training job emergency-launch" for a senior reliability engineer.

Activity Groups vs Raw IAM Statements

Each activity group ultimately renders to IAM policy statements with Action, Resource, and Condition blocks. The output role's inline policy is human-readable IAM, not opaque metadata — you can audit the rendered policy and edit it after generation. Many teams use SageMaker Role Manager to bootstrap a role and then version-control the rendered policy in CloudFormation or Terraform.

AWS-Managed Policies vs SageMaker Role Manager Output vs Custom

Understanding the policy hierarchy is essential for SageMaker Role Manager and least-privilege patterns.

AmazonSageMakerFullAccess — The Anti-Pattern

AmazonSageMakerFullAccess is the AWS-managed policy attached by the console wizard. It grants S3 object read, write, and delete on any bucket or object whose name contains "sagemaker" (which an attacker can exploit simply by choosing a name that contains that string), iam:PassRole on every role in the account (limited only by a condition that the role is passed to SageMaker or a small set of related services), and a wide swath of SageMaker actions. It is acceptable for prototyping, never for production. The MLA-C01 exam will plant scenarios where the answer is "replace AmazonSageMakerFullAccess with a scoped policy."

AWS-Managed Service-Specific Policies

AmazonSageMakerCanvasFullAccess, AmazonSageMakerPipelinesIntegrations, AmazonSageMakerFeatureStoreAccess — these are narrower AWS-managed policies for specific SageMaker features. They are still AWS-managed, so they update when AWS adds new actions. For least-privilege, prefer them over the FullAccess super-policy, but be aware they may grant more than your specific workload needs.

SageMaker Role Manager Output

The role inline policy generated by Role Manager sits between AWS-managed and fully custom. It is hand-tuned by AWS engineering for specific personas and activity groups. It is generally significantly tighter than AmazonSageMakerFullAccess. For most production teams, this is the right starting point.

Fully Custom Inline Policies

For regulated workloads — HIPAA, PCI-DSS, FedRAMP — a fully custom inline policy may be required, hand-crafted from the SageMaker API permissions reference. The custom policy is version-controlled in IaC, peer-reviewed, and audited via IAM Access Analyzer. This is the gold standard but takes engineering effort.

There is no single "right" choice between AWS-managed policies, SageMaker Role Manager output, and fully custom inline policies — the right choice depends on the workload's compliance posture and the team's IAM maturity. AWS-managed policies auto-update as SageMaker adds features, which is good for staying current and bad for predictable least-privilege. SageMaker Role Manager output is hand-tuned and snapshot-stable but not version-controlled in your IaC. Fully custom policies give you complete control but require continuous maintenance as SageMaker adds new features. For SageMaker Role Manager and least-privilege patterns, a typical mature pattern is: bootstrap with Role Manager, export the rendered policy, version-control in CloudFormation, peer-review on every change, and run IAM Access Analyzer continuously to detect drift.

ABAC — Tag-Based Access Control for SageMaker

Attribute-Based Access Control (ABAC) using resource tags is the second pillar of SageMaker Role Manager and least-privilege patterns at scale.

The Problem ABAC Solves

A team of 50 ML engineers cannot have 50 hand-crafted IAM roles. Maintenance is impossible. Adding a new project would require updating every role. ABAC solves this with a single role policy that checks resource tags at API call time — "you can call CreateTrainingJob if the job's Project tag matches your Project principal tag." One role serves all engineers; the tag matching enforces project-level isolation.

Principal Tags vs Resource Tags

Principal tags are attached to IAM principals (roles, users, federated identities). Resource tags are attached to AWS resources (training jobs, endpoints, models). A SageMaker IAM policy with the condition aws:RequestTag/Project = aws:PrincipalTag/Project requires that any new training job carry a Project tag matching the caller's principal tag. The same condition on the resource — aws:ResourceTag/Project = aws:PrincipalTag/Project — restricts read/update operations to resources matching the caller's project.
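As a rough illustration, the two tag conditions might be written as policy statements like the ones below, built as Python dictionaries so they can be serialized with json.dumps. The action lists, the Sid values, and the Project tag key are assumptions chosen for the example; the ${aws:PrincipalTag/...} policy variable is standard IAM syntax.

```python
import json

TAG_KEY = "Project"  # assumed tag key, matching the example in the text


def abac_statements(tag_key=TAG_KEY):
    """Return two Allow statements: tag-on-create and tag-on-access."""
    principal_var = "${aws:PrincipalTag/" + tag_key + "}"
    return [
        {
            # New resources must carry the caller's own project tag.
            "Sid": "CreateOnlyWithOwnProjectTag",
            "Effect": "Allow",
            "Action": ["sagemaker:CreateTrainingJob", "sagemaker:AddTags"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {"aws:RequestTag/" + tag_key: principal_var}
            },
        },
        {
            # Existing resources are reachable only when their tag matches.
            "Sid": "TouchOnlyOwnProjectResources",
            "Effect": "Allow",
            "Action": ["sagemaker:DescribeTrainingJob", "sagemaker:StopTrainingJob"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {"aws:ResourceTag/" + tag_key: principal_var}
            },
        },
    ]


policy = {"Version": "2012-10-17", "Statement": abac_statements()}
print(json.dumps(policy, indent=2))
```

Because the policy compares tags to a variable rather than to a literal project name, the same document can be attached to one shared role and still isolate projects from each other.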

The Tag Enforcement Pattern

For ABAC to work, every SageMaker resource must be created with the right tags. Two enforcement layers: (1) An IAM policy denies CreateTrainingJob without a Project tag. Inside a Deny statement, "Condition": {"Null": {"aws:RequestTag/Project": "true"}} matches any request that omits the tag, so the request is rejected. (2) AWS Config detects untagged resources after creation and triggers SSM remediation. The double layer ensures no resource escapes tagging.
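The first enforcement layer can be sketched as a single Deny statement. The helper request_is_denied is a hypothetical local approximation of how the Null operator behaves for this one statement, not a general IAM evaluator.

```python
def deny_untagged_create(tag_key="Project"):
    """Deny CreateTrainingJob when the request carries no such tag."""
    return {
        "Sid": "DenyCreateWithoutProjectTag",
        "Effect": "Deny",
        "Action": "sagemaker:CreateTrainingJob",
        "Resource": "*",
        # Null with "true" matches when the key is absent from the request.
        "Condition": {"Null": {"aws:RequestTag/" + tag_key: "true"}},
    }


def request_is_denied(statement, request_tags):
    """Approximate the Null check for one statement (illustration only)."""
    (cond_key, expect_absent), = statement["Condition"]["Null"].items()
    tag_key = cond_key.split("/", 1)[1]
    is_absent = tag_key not in request_tags
    return is_absent == (expect_absent == "true")


stmt = deny_untagged_create()
print(request_is_denied(stmt, {}))                    # untagged request
print(request_is_denied(stmt, {"Project": "fraud"}))  # tagged request
```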

ABAC vs RBAC Trade-Offs

ABAC scales beautifully — one policy for many principals — but requires disciplined tagging. RBAC (one role per project, hand-crafted) is simpler for small teams but does not scale past 10 projects. SageMaker Role Manager and least-privilege patterns at scale use ABAC; RBAC is the bootstrapping pattern for new platforms.

IAM Condition Keys for SageMaker

Condition keys are the lever that turns SageMaker Role Manager and least-privilege patterns from a permissions concept into a hard guardrail.

sagemaker:InstanceTypes

Restricts which ML instance types CreateTrainingJob and CreateEndpointConfig can request. Example condition for a Deny statement: "Condition": {"ForAnyValue:StringNotEquals": {"sagemaker:InstanceTypes": ["ml.m5.large", "ml.m5.xlarge", "ml.p3.2xlarge"]}}. Any request that includes an instance type outside the list is denied, which blocks expensive instance types like ml.p4d.24xlarge unless explicitly approved. Cost guardrail and security guardrail in one.
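A minimal sketch of the allow-list deny, with a simplified local model of the set operator. The function any_value_not_in_list is a hypothetical stand-in for ForAnyValue:StringNotEquals and assumes the operator matches when at least one requested value is outside the policy's list; it is not a full IAM evaluator.

```python
ALLOWED_TYPES = ["ml.m5.large", "ml.m5.xlarge", "ml.p3.2xlarge"]

deny_unapproved_types = {
    "Sid": "DenyUnapprovedInstanceTypes",
    "Effect": "Deny",
    "Action": ["sagemaker:CreateTrainingJob", "sagemaker:CreateEndpointConfig"],
    "Resource": "*",
    "Condition": {
        "ForAnyValue:StringNotEquals": {"sagemaker:InstanceTypes": ALLOWED_TYPES}
    },
}


def any_value_not_in_list(requested, allowed):
    """True when at least one requested value is outside the allow-list."""
    return any(value not in allowed for value in requested)


def is_denied(requested_types):
    cond = deny_unapproved_types["Condition"]["ForAnyValue:StringNotEquals"]
    return any_value_not_in_list(requested_types, cond["sagemaker:InstanceTypes"])


print(is_denied(["ml.p4d.24xlarge"]))             # outside the list
print(is_denied(["ml.m5.large", "ml.m5.xlarge"]))  # all approved
```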

sagemaker:VolumeKmsKey

Requires that training jobs and endpoint configurations specify a KMS key for EBS volume encryption. Example deny: "Condition": {"Null": {"sagemaker:VolumeKmsKey": "true"}} rejects jobs that omit the key. Combine with an ARN condition on sagemaker:VolumeKmsKey (for example ArnNotEquals in a Deny statement) that lists the approved customer-managed key ARNs for full control.
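The two volume-key checks might look like the sketch below. The key ARN is a hypothetical placeholder, and the choice of two separate Deny statements (one for a missing key, one for an unapproved key) is an assumption made for readability.

```python
import json

# Hypothetical placeholder ARN; substitute the approved customer-managed keys.
APPROVED_KEY_ARNS = [
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
]

volume_key_statements = [
    {
        "Sid": "DenyMissingVolumeKey",
        "Effect": "Deny",
        "Action": ["sagemaker:CreateTrainingJob", "sagemaker:CreateEndpointConfig"],
        "Resource": "*",
        # Matches requests that do not specify a volume KMS key at all.
        "Condition": {"Null": {"sagemaker:VolumeKmsKey": "true"}},
    },
    {
        "Sid": "DenyUnapprovedVolumeKey",
        "Effect": "Deny",
        "Action": ["sagemaker:CreateTrainingJob", "sagemaker:CreateEndpointConfig"],
        "Resource": "*",
        # Matches requests whose key is not one of the approved ARNs.
        "Condition": {"ArnNotEquals": {"sagemaker:VolumeKmsKey": APPROVED_KEY_ARNS}},
    },
]

print(json.dumps(volume_key_statements, indent=2))
```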

sagemaker:NetworkIsolation

Forces network isolation mode on every training job. Example deny: "Condition": {"BoolIfExists": {"sagemaker:NetworkIsolation": "false"}}. Pair with sagemaker:VpcSubnets to require VPC config and sagemaker:VpcSecurityGroupIds to require approved security groups.

sagemaker:RootAccess

Controls whether SageMaker notebook instances allow root access for the user. Set to Disabled via condition to prevent users from installing arbitrary packages or modifying the instance.

sagemaker:OutputKmsKey

Forces customer-managed KMS key for model artifact output. Combined with VolumeKmsKey, ensures all data at rest in the training pipeline is customer-encrypted.

The most powerful pattern in SageMaker Role Manager and least-privilege is enforcing five SageMaker IAM condition keys together in one SCP: sagemaker:InstanceTypes, sagemaker:VolumeKmsKey, sagemaker:OutputKmsKey, sagemaker:NetworkIsolation, sagemaker:VpcSubnets. Because conditions inside a single statement are ANDed, "deny CreateTrainingJob if any of these checks fails" is written as one Deny statement per check within the same policy. The result turns every account under the SCP into a mandatory-encrypted, mandatory-isolated, cost-bounded ML environment. Engineers can iterate freely within those boundaries; nobody can ship outside them. For MLA-C01 stems asking "how do I enforce these properties at scale," the SCP plus condition keys answer beats every alternative.
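A sketch of such a combined SCP follows. Multiple condition keys inside one statement are ANDed, so the example uses one Deny statement per requirement to get "deny if any check fails" behavior. The allow-listed instance types and Sid names are assumptions; the size check reflects the 5,120-character SCP limit.

```python
import json

ACTION = "sagemaker:CreateTrainingJob"
ALLOWED_INSTANCE_TYPES = ["ml.m5.large", "ml.m5.xlarge", "ml.p3.2xlarge"]  # assumed
SCP_MAX_CHARS = 5120


def deny(sid, condition):
    """Build one Deny statement for a single requirement."""
    return {"Sid": sid, "Effect": "Deny", "Action": ACTION,
            "Resource": "*", "Condition": condition}


scp = {
    "Version": "2012-10-17",
    "Statement": [
        deny("DenyUnapprovedInstanceTypes",
             {"ForAnyValue:StringNotEquals":
                  {"sagemaker:InstanceTypes": ALLOWED_INSTANCE_TYPES}}),
        deny("DenyMissingVolumeKey", {"Null": {"sagemaker:VolumeKmsKey": "true"}}),
        deny("DenyMissingOutputKey", {"Null": {"sagemaker:OutputKmsKey": "true"}}),
        deny("DenyNoNetworkIsolation",
             {"BoolIfExists": {"sagemaker:NetworkIsolation": "false"}}),
        deny("DenyMissingVpcSubnets", {"Null": {"sagemaker:VpcSubnets": "true"}}),
    ],
}

# Compact serialization, as it would be counted against the size limit.
compact = json.dumps(scp, separators=(",", ":"))
print(len(compact), "characters; limit is", SCP_MAX_CHARS)
```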

Service Control Policies for ML Accounts

SCPs are the organization-wide guardrail that backstops account-level IAM.

Why SCPs for ML Accounts

Account-level IAM policies can be edited by anyone with iam:* in the account. SCPs cannot — only the organization management account can modify them. For mandatory controls (no public S3 buckets containing training data, no internet-mode training jobs in regulated OUs, no instance types above approved size), SCPs are the only enforcement that survives compromised account-level admin.

Common SCP Patterns for ML

Pattern 1 — "Deny all SageMaker actions in non-approved regions" — restricts ML to us-east-1 and us-west-2 to satisfy data residency. Pattern 2 — "Deny CreateTrainingJob without VPC config and KMS keys" — enforces the full IAM, KMS, VPC stack. Pattern 3 — "Deny iam:CreatePolicy and iam:AttachRolePolicy to non-Security-account principals" — prevents teams from elevating their own roles. Pattern 4 — "Deny kms:DisableKey and kms:ScheduleKeyDeletion" — protects training data keys from accidental destruction.
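Pattern 1 can be sketched with the global aws:RequestedRegion condition key. The region list comes from the text; the Sid and the helper region_blocked are hypothetical names used only for illustration.

```python
import json

APPROVED_REGIONS = ["us-east-1", "us-west-2"]

region_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenySageMakerOutsideApprovedRegions",
            "Effect": "Deny",
            "Action": "sagemaker:*",
            "Resource": "*",
            # Matches when the call targets a region outside the approved list.
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": APPROVED_REGIONS}
            },
        }
    ],
}


def region_blocked(region):
    """Local approximation of the StringNotEquals check above."""
    cond = region_scp["Statement"][0]["Condition"]["StringNotEquals"]
    return region not in cond["aws:RequestedRegion"]


print(json.dumps(region_scp, indent=2))
print(region_blocked("eu-west-1"), region_blocked("us-east-1"))
```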

SCPs Apply to Roles, Not Just Users

SCPs apply to IAM users and roles in member accounts, including SageMaker execution roles and roles assumed from other accounts. A SageMaker training execution role hits the SCP boundary just like a human IAM user. If the SCP denies an action, the role cannot perform it regardless of inline policy. Two exceptions matter: SCPs do not restrict service-linked roles, and they do not apply to principals in the organization's management account.

SCPs and Confused Deputy

SCPs do not protect against confused deputy at the resource policy layer. An SCP attached to account A limits only principals that belong to account A; if a bucket policy in account A grants access to a role in account B, that role is not evaluated against account A's SCPs. For confused-deputy protection, use aws:SourceArn and aws:SourceAccount conditions on resource policies. SCPs and confused-deputy conditions are complementary, not substitutes.

Cross-Account ML Workflows and Role Chaining

SageMaker Role Manager and least-privilege patterns must extend across account boundaries.

The Three-Account ML Pattern

A mature ML org has three accounts: Data (curated datasets in S3), Modelling (SageMaker training and experimentation), Production (endpoints and Model Registry). Each account has its own Role Manager personas. Cross-account flows use role chaining — Modelling-account training role assumes a Data-account read role to fetch data, Production-account deployment role assumes a Modelling-account model-pull role to fetch artifacts.

Cross-Account Model Registry

Production deploys from a Model Registry in the Modelling account. The Production-account deployment role needs sagemaker:DescribeModelPackage and sagemaker:CreateModel with cross-account permissions. The Modelling-account Model Registry has a resource policy granting Production-account role principals the right to read approved packages. Combined with KMS key sharing for the model artifact bucket, this is the canonical multi-account deployment pattern.

Confused Deputy in Cross-Account ML

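Any role that the SageMaker service principal assumes on your behalf should include aws:SourceAccount (and ideally aws:SourceArn) conditions in its trust policy; otherwise the service could be induced to assume the role on behalf of a resource in someone else's account. These keys are populated only when an AWS service makes the call, so for a Modelling-account role assumed directly by a Production-account principal, the equivalent control is to name the specific Production role ARN in the trust policy rather than the whole account, adding sts:ExternalId when a third party is involved. The MLA-C01 exam plants stems where the trust policy lacks SourceAccount and asks "what is the security flaw"; the answer is confused deputy.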
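A sketch of a trust policy that carries the confused-deputy conditions, for a role assumed by the SageMaker service principal. The account ID and region are hypothetical placeholders, and the wildcard aws:SourceArn pattern is one reasonable choice rather than the only one.

```python
import json


def sagemaker_trust_policy(account_id, region="us-east-1"):
    """Trust policy letting SageMaker assume the role only for this account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # Both keys are set when the service calls on behalf of
                    # a resource; they pin that resource to our account.
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": "arn:aws:sagemaker:%s:%s:*"
                        % (region, account_id)
                    },
                },
            }
        ],
    }


# 111122223333 is a placeholder account ID, not a real account.
print(json.dumps(sagemaker_trust_policy("111122223333"), indent=2))
```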

Audit and Drift Detection — IAM Access Analyzer for SageMaker

Defining roles is half the job. Continuously verifying they remain least-privilege is the other half.

IAM Access Analyzer for External Access

Access Analyzer continuously monitors IAM resource policies (S3 bucket policies, KMS key policies, IAM role trust policies) for unintended external access. For SageMaker workloads, it flags S3 buckets shared with other accounts via bucket policy, KMS keys shared via key policy grants, and IAM roles trusted by external accounts via trust policy. Findings appear in Security Hub for centralized review.

IAM Access Analyzer Unused Access

Access Analyzer's "Unused Access" feature flags IAM permissions that have not been used within its tracking period (90 days by default). For SageMaker Role Manager and least-privilege patterns, this is the audit loop: generate the role, watch which permissions remain unused after a quarter, and trim them. The output is a successively tighter role over time.

Custom Policy Checks

For pre-deployment validation, IAM Access Analyzer policy checks evaluate proposed policies against custom requirements — "no policy may allow s3:* on *," "no policy may grant iam:PassRole without a resource constraint." Run these checks in CI/CD before deploying CloudFormation or Terraform.
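To make the two example requirements concrete, here is a purely local lint sketch that could run in CI alongside the Access Analyzer checks. It does not call Access Analyzer; find_violations and its rules are hypothetical and cover only the two patterns quoted above.

```python
def _as_list(value):
    """IAM allows a string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def find_violations(policy):
    """Return human-readable findings for the two rules in the text."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action", []))]
        wildcard_resource = "*" in _as_list(stmt.get("Resource", []))
        if wildcard_resource and "s3:*" in actions:
            findings.append("statement %d: s3:* on *" % index)
        if wildcard_resource and "iam:passrole" in actions:
            findings.append(
                "statement %d: iam:PassRole without a resource constraint" % index
            )
    return findings


proposed = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["iam:PassRole"],
         "Resource": "arn:aws:iam::111122223333:role/training-exec"},
    ],
}
for finding in find_violations(proposed):
    print(finding)
```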

Cost Guardrails as Least-Privilege

Cost is a security concern. A role that can launch ml.p5.48xlarge instances has the power to create five-figure cost spikes.

Instance Type Allow-Lists

The sagemaker:InstanceTypes condition key restricts the instance types a role can request. For most teams: ml.m5.large, ml.m5.xlarge, ml.m5.2xlarge for CPU; ml.g4dn.xlarge for entry GPU; ml.p3.2xlarge for production training. Anything above requires a separate approval workflow and a separate elevated role.

Max Runtime Conditions

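The sagemaker:MaxRuntimeInSeconds condition key lets a policy reject training jobs whose requested stopping condition exceeds a cap. A common pattern: cap at 24 hours (86,400 seconds) for general-purpose roles, 168 hours (7 days, 604,800 seconds) for foundation-model fine-tuning roles. Because every accepted job then carries a bounded MaxRuntimeInSeconds, SageMaker stops runaway training jobs automatically when they reach it.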
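The hour-based caps translate to seconds in the policy. A small sketch of that arithmetic and the resulting Deny statement follows; the function name and Sid are hypothetical, and NumericGreaterThan is used on the assumption that the key is numeric.

```python
SECONDS_PER_HOUR = 3600


def max_runtime_deny(cap_hours):
    """Deny CreateTrainingJob when the requested runtime exceeds the cap."""
    cap_seconds = cap_hours * SECONDS_PER_HOUR
    return {
        "Sid": "DenyRuntimeAbove%dHours" % cap_hours,
        "Effect": "Deny",
        "Action": "sagemaker:CreateTrainingJob",
        "Resource": "*",
        "Condition": {
            "NumericGreaterThan": {
                "sagemaker:MaxRuntimeInSeconds": str(cap_seconds)
            }
        },
    }


general = max_runtime_deny(24)       # general-purpose roles
fine_tuning = max_runtime_deny(168)  # foundation-model fine-tuning roles
print(general["Condition"], fine_tuning["Condition"])
```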

Spot vs On-Demand Conditions

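A Spot-related condition (written here as sagemaker:UseSpot; confirm the exact key name in the Service Authorization Reference for SageMaker, since the CreateTrainingJob parameter itself is EnableManagedSpotTraining) can require Spot training for cost-conscious teams. A role whose only Allow for CreateTrainingJob carries "Bool": {"sagemaker:UseSpot": "true"} cannot create on-demand training jobs at all.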

SageMaker Role Manager does not auto-apply to existing roles. Generating a new role via Role Manager creates a new IAM role; it does not modify roles that already exist. Many teams expect Role Manager to "audit and tighten" existing AmazonSageMakerFullAccess roles — it does not. The migration path is: generate the new persona-scoped role, test it on a non-production workload, update SageMaker Pipelines and CodePipeline configurations to use the new role ARN, validate in production, then remove the old over-permissioned role. The MLA-C01 exam plants stems where teams "ran Role Manager but the audit still flags the old role" — the answer is Role Manager does not retroactively edit; you must explicitly migrate and decommission.

Common Exam Traps for SageMaker Role Manager and Least-Privilege

Trap 1 — Role Manager Updates Existing Roles

Wrong. It only generates new roles.

Trap 2 — AWS-Managed Policies Are Always Tighter Than Hand-Crafted

Wrong. Often broader, especially AmazonSageMakerFullAccess.

Trap 3 — ABAC Tags Are Optional When the Role Has the Right Permissions

Wrong. ABAC requires tags to enforce — without tags, the policy collapses to allow-all or deny-all depending on the condition operator.

Trap 4 — SCPs Override Resource-Side Policies Like Bucket Policies

Wrong. SCPs apply to principals in member accounts, not to resource policies in other accounts. Confused-deputy protection still requires aws:SourceArn on resource policies.

Trap 5 — sagemaker:InstanceTypes Applies to Endpoint Invocation

Wrong. It applies to CreateTrainingJob and CreateEndpointConfig — the launch-time decisions. InvokeEndpoint inherits the endpoint's existing instance type.

Trap 6 — Permission Boundaries Replace Inline Policies

Wrong. Permission boundaries cap the maximum effective permission. The role still needs an inline or attached policy that grants the actual permissions; the boundary just limits how broad those grants can be.

Trap 7 — IAM Access Analyzer Detects Drift Inside Inline Policies

Partial. Access Analyzer detects external access exposure and unused permissions, not policy-syntax drift. For drift, use AWS Config managed rules or external IaC drift detection.

Trap 8 — Each Persona Maps to Exactly One Activity Group

Wrong. Each persona is composed of multiple activity groups. The data scientist persona enables training, processing, Studio, and Feature Store read groups simultaneously.

Trap 9 — Cross-Account Role Trust Does Not Need aws:SourceAccount

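Wrong. A trust policy for a service principal that lacks aws:SourceAccount or aws:SourceArn leaves a confused deputy vulnerability, and trust between accounts should name specific principals rather than a whole account.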

Trap 10 — A Single Role Per Team Is the Least-Privilege Pattern

Wrong. A single role per function — separated by ABAC tags for project — is the scalable pattern. One role for all ML engineers across all projects, with project tags enforcing isolation.

The canonical SageMaker Role Manager and least-privilege production pattern combines four layers: persona-scoped role from Role Manager, ABAC tags for project-level isolation, IAM condition keys for instance/KMS/network mandates, and SCP at the OU level for organization-wide guardrails. Layer 1 (Role Manager persona) gives you the right baseline permissions. Layer 2 (ABAC) gives you project isolation without creating one role per project. Layer 3 (condition keys) enforces encrypt/isolate/cost-bound at API call time. Layer 4 (SCP) prevents account-level admins from disabling layers 1-3. All four together produce a SageMaker environment where engineers can iterate freely within a hard boundary that survives credential compromise, configuration drift, and well-meaning misconfiguration. On the MLA-C01 exam, scenario stems with "enforce these constraints across the entire ML organization" point at this four-layer pattern.

Key Numbers and Must-Memorize Facts for SageMaker Role Manager and Least-Privilege

SageMaker Role Manager

Four built-in personas: ML engineer, data scientist, MLOps engineer, business analyst
Activity groups compose into personas; custom personas pick activity groups directly
Output is an IAM role with inline policy in your account
Does not modify existing roles; only creates new ones

IAM Condition Keys

sagemaker:InstanceTypes — restrict instance types
sagemaker:VolumeKmsKey — require customer-managed EBS KMS
sagemaker:OutputKmsKey — require customer-managed output KMS
sagemaker:NetworkIsolation — force network isolation
sagemaker:VpcSubnets and sagemaker:VpcSecurityGroupIds — require VPC config
sagemaker:RootAccess — control notebook root
sagemaker:MaxRuntimeInSeconds — cap training job duration

ABAC

aws:PrincipalTag/Project = aws:RequestTag/Project enforces project tag at create
aws:PrincipalTag/Project = aws:ResourceTag/Project enforces project tag on read/update
Both layers needed for end-to-end ABAC

SCPs

Apply to IAM users and roles in member accounts, including execution roles; service-linked roles and the management account are not affected
Only modifiable from organization management account
Do not apply to resource policies in other accounts

AWS-Managed Policies

AmazonSageMakerFullAccess: prototyping only, never production
AmazonSageMakerCanvasFullAccess: Canvas-specific
AmazonSageMakerPipelinesIntegrations: Pipelines-specific

MLA-C01 exam priority — SageMaker Role Manager and Least-Privilege Patterns — MLA-C01 ML Engineer Study Notes. This topic carries weight on the MLA-C01 exam. Master the trade-offs, decision boundaries, and the cost/performance triggers each AWS service exposes — the exam will test scenarios that hinge on knowing which service is the wrong answer, not just which is right.

FAQ — SageMaker Role Manager and Least-Privilege Top Questions

Q1 — A team of 30 ML engineers across 6 projects needs SageMaker access. How many roles do I create?

One role, with ABAC. The role has the ML engineer persona's inline policy plus condition keys requiring aws:RequestTag/Project = aws:PrincipalTag/Project on every CreateTrainingJob, CreateEndpointConfig, and similar. Each engineer's federated identity carries a Project principal tag corresponding to their project assignment. New engineers join by adding the principal tag; project moves are a tag change, not a role change. The alternative — 30 hand-crafted roles or 6 project roles — does not scale.

Q2 — SageMaker Role Manager generated a role but the training job fails with AccessDenied to the input S3 bucket. What did Role Manager miss?

Role Manager generates principal-side IAM, not resource-side policies. The training role needs s3:GetObject (which Role Manager provides scoped to the buckets you configured), but if the input bucket lives in another account, or its bucket policy contains explicit denies, the bucket policy must also grant the role principal access, and Role Manager does not edit bucket policies. Cross-check three places: the role's inline policy lists the bucket ARN, the bucket policy allows (or at least does not deny) the role ARN, and, if the bucket is KMS-encrypted, the role can use the key under the KMS key policy. Missing any of the three produces AccessDenied at job start.
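The three-place cross-check can be roughed out as a script that looks for the expected ARNs in each policy document. This is a crude text search under the assumption that the policies reference full ARNs; the function name and the example ARNs are hypothetical, and it does not evaluate conditions, wildcards, or explicit denies.

```python
import json


def missing_grants(role_arn, bucket_arn, role_policy, bucket_policy,
                   key_policy=None):
    """List which of the three places fails to mention the expected ARN."""
    checks = {
        "role policy references bucket": bucket_arn in json.dumps(role_policy),
        "bucket policy references role": role_arn in json.dumps(bucket_policy),
    }
    if key_policy is not None:  # only relevant for KMS-encrypted buckets
        checks["key policy references role"] = role_arn in json.dumps(key_policy)
    return [name for name, ok in checks.items() if not ok]


# Placeholder ARNs for illustration only.
ROLE = "arn:aws:iam::111122223333:role/training-exec"
BUCKET = "arn:aws:s3:::example-training-data"

role_policy = {"Statement": [{"Effect": "Allow", "Action": "s3:GetObject",
                              "Resource": BUCKET + "/*"}]}
bucket_policy = {"Statement": []}  # role not granted here yet

print(missing_grants(ROLE, BUCKET, role_policy, bucket_policy))
```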

Q3 — How do I enforce that no SageMaker training job in our organization can use ml.p4d.24xlarge or larger instances?

Attach an SCP at the organization root with a Deny statement on sagemaker:CreateTrainingJob when sagemaker:InstanceTypes includes any of the disallowed types: "Condition": {"ForAnyValue:StringEquals": {"sagemaker:InstanceTypes": ["ml.p4d.24xlarge", "ml.p4de.24xlarge", "ml.p5.48xlarge"]}}. The SCP applies to every principal in every member account (the management account itself is not restricted by SCPs) and survives any inline-policy change in member accounts. For exception workflows, create a separate elevated role in the Modelling account, exempt it in the SCP with an ArnNotLike condition on aws:PrincipalArn, and require manual approval to assume that role.
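A minimal sketch of the full SCP document, built in Python so the deny-list stays in one place. The function name, the `Sid`, and the exception role name `ml-large-instance-exception` are hypothetical; the exemption is expressed as an `ArnNotLike` condition on `aws:PrincipalArn`, which is one common way to exempt a principal and is an assumption about how the exception role would be wired in.

```python
import json

DISALLOWED = ["ml.p4d.24xlarge", "ml.p4de.24xlarge", "ml.p5.48xlarge"]

def instance_type_scp(disallowed, exempt_role_arn=None):
    """Build an SCP that denies training jobs on the listed instance types."""
    condition = {
        "ForAnyValue:StringEquals": {"sagemaker:InstanceTypes": list(disallowed)}
    }
    if exempt_role_arn:
        # Both condition operators must match for the Deny to apply,
        # so the exempt role falls outside the Deny.
        condition["ArnNotLike"] = {"aws:PrincipalArn": exempt_role_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyLargeTrainingInstances",  # hypothetical Sid
            "Effect": "Deny",
            "Action": "sagemaker:CreateTrainingJob",
            "Resource": "*",
            "Condition": condition,
        }],
    }

scp = instance_type_scp(
    DISALLOWED,
    exempt_role_arn="arn:aws:iam::*:role/ml-large-instance-exception",
)
print(json.dumps(scp, indent=2))
```

Keeping the deny-list as a single Python constant makes it easier to extend when new large instance families appear.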

Q4 — A new MLOps engineer joined and needs to deploy endpoints, but I do not want to give them training-job creation. Which persona?

The MLOps engineer persona, generated via Role Manager. It includes endpoint creation, endpoint update, deployment guardrails, monitoring schedule management, and Model Registry approval, but not training job creation, processing job creation, or model authoring. The persona enforces the separation between the operational role and the authoring role. If the MLOps engineer occasionally needs to launch a training job in an emergency, create a separate "MLOps emergency" role with sagemaker:CreateTrainingJob, put a CloudWatch alarm on its use, and require a break-glass procedure to assume it.

Q5 — How do I detect over-permissioned SageMaker roles in production?

Three layers. First, IAM Access Analyzer "Unused Access" identifies role permissions that have not been used within the analyzer's tracking period (90 days by default); trim those to tighten the role. Second, IAM Access Analyzer "External Access" flags S3 bucket policies, KMS key policies, and role trust policies that grant access outside your account or organization; review and tighten them. Third, run the AWS Config managed rule iam-policy-no-statements-with-admin-access and a custom rule checking for AmazonSageMakerFullAccess attachment in production. Send findings to Security Hub for centralized triage.
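The core logic of the custom check for AmazonSageMakerFullAccess attachment can be sketched as a pure function over role-to-policy attachments. The function name and the sample role names are hypothetical; in practice the attachment data would come from IAM (for example, boto3's `list_attached_role_policies`), which this sketch deliberately does not call.

```python
FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"

def roles_with_full_access(attachments):
    """Return the names of roles that have AmazonSageMakerFullAccess attached.

    attachments maps a role name to the list of managed policy ARNs attached to it.
    """
    return sorted(
        role for role, arns in attachments.items() if FULL_ACCESS_ARN in arns
    )

# Hypothetical attachment snapshot for illustration only.
sample = {
    "prod-training-role": [FULL_ACCESS_ARN],
    "prod-endpoint-role": ["arn:aws:iam::111122223333:policy/endpoint-deploy"],
    "sandbox-notebook-role": [],
}

flagged = roles_with_full_access(sample)
print(flagged)
```

A Config custom rule or a scheduled job could wrap the same check and mark each flagged role as noncompliant.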

Q6 — Cross-account SageMaker training: the Data account holds the dataset, the Modelling account runs training. How do I configure the principals and resources?

Four steps. Step 1: in the Modelling account, generate an ML engineer execution role via Role Manager scoped to the Data-account bucket ARN and KMS key ARN. Step 2: in the Data account, add a bucket policy statement naming the Modelling-account execution role as principal, allowing s3:GetObject on objects under the dataset prefix and s3:ListBucket on the bucket, restricted to that prefix with an s3:prefix condition. Do not put aws:SourceAccount or aws:SourceArn on this statement: those keys are present only when an AWS service principal makes the call, so they would block the role's own requests. Confused-deputy protection belongs in the execution role's trust policy for sagemaker.amazonaws.com instead. Step 3: in the Data account, add the Modelling-account role principal to the KMS key policy with kms:Decrypt permission. Step 4: validate with a minimal training job that simply downloads a single file and exits. The same four steps repeat for each additional dataset account.
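Steps 2 and 3 can be sketched as the two Data-account resource-policy fragments. The function names, account ID, role name, bucket name, and prefix below are all hypothetical. Only the prefix restriction on `s3:ListBucket` is included as a condition; any other condition keys are left out of this sketch.

```python
import json

def dataset_bucket_policy(role_arn, bucket, prefix):
    """Bucket policy letting one cross-account role read a single dataset prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDatasetRead",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                "Sid": "AllowDatasetList",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                # ListBucket targets the bucket itself; the prefix is a condition.
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
        ],
    }

def kms_decrypt_statement(role_arn):
    """Key policy statement letting the cross-account role decrypt with this key."""
    return {
        "Sid": "AllowModellingRoleDecrypt",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "kms:Decrypt",
        "Resource": "*",  # in a key policy, "*" refers to the key the policy is attached to
    }

ROLE_ARN = "arn:aws:iam::444455556666:role/example-modelling-execution-role"
bucket_policy = dataset_bucket_policy(ROLE_ARN, "example-dataset-bucket", "curated/")
kms_statement = kms_decrypt_statement(ROLE_ARN)
print(json.dumps(bucket_policy, indent=2))
print(json.dumps(kms_statement, indent=2))
```

The execution role in the Modelling account still needs matching Allow statements of its own, because cross-account access requires a grant on both sides.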

Q7 — IAM Access Analyzer flagged my SageMaker execution role as having external access. Where is the leak?

External access in Access Analyzer means a resource policy in your account grants access to a principal outside your zone of trust (your account or organization). For SageMaker, the most common sources are: (1) an S3 bucket policy on a model artifact bucket allowing a partner account, (2) a KMS key policy on a training data key allowing another account, (3) an IAM role trust policy allowing another account's principals to assume the role. Open the Access Analyzer finding to see the specific resource and the external principal. If the external access is intentional (a planned cross-account ML pipeline), archive the finding, or create an archive rule for it, with a documented justification. If it is unintentional, tighten the resource policy immediately and investigate how it was added, typically through a careless wildcard or an outdated cross-account configuration.
