AWS IAM and Role Design
Intro: This page treats IAM as one of the main architectural control planes in AWS. Good role design reduces blast radius, makes review easier, and prevents cloud access from turning into a hard-to-audit maze of standing privilege.
What this page includes
- a role model for humans, workloads, automation, and break-glass access
- practical design choices for trust policies, permissions boundaries, ABAC, and session controls
- review questions that expose identity sprawl early
Working assumptions
- long-lived credentials and role sprawl are signs of weak operating design, not unavoidable cloud reality
AWS IAM design is strongest when it starts with federation and temporary credentials, then narrows access through purpose-built roles, policy guardrails, and auditable exceptions.
Design principle
Treat every role as a statement of three things:
- who or what may assume it;
- under which trust conditions;
- what maximum action set is actually needed.
If any of those answers is vague, the role is usually too broad.
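One way to make the three answers reviewable is to record them as data. The sketch below is illustrative only: `RoleStatement` and `vague_answers` are hypothetical names, and the vagueness rules (wildcards, missing conditions) are assumptions, not an AWS feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleStatement:
    """Hypothetical record of the three answers a role should give."""
    name: str
    callers: tuple[str, ...]           # who or what may assume it
    trust_conditions: tuple[str, ...]  # under which trust conditions
    max_actions: tuple[str, ...]       # maximum action set actually needed


def vague_answers(role: RoleStatement) -> list[str]:
    """Return the names of the answers that look vague (assumed heuristics)."""
    problems = []
    if not role.callers or any("*" in caller for caller in role.callers):
        problems.append("callers")
    if not role.trust_conditions:
        problems.append("trust_conditions")
    if not role.max_actions or any(
        action == "*" or action.endswith(":*") for action in role.max_actions
    ):
        problems.append("max_actions")
    return problems
```

A role that lists a wildcard caller, no conditions, and `s3:*` would be flagged on all three answers, which is the signal that it is probably too broad.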
Role families worth separating
| Role family | Typical caller | Security goal |
|---|---|---|
| Human access roles | engineers, platform operators, responders | short-lived interactive access with clear accountability |
| CI/CD automation roles | trusted delivery pipelines | narrowly scoped deployment and artifact actions |
| Workload runtime roles | applications and controllers | least-privilege service access without embedded keys |
| Admin platform roles | cloud platform owners | privileged configuration changes with tighter review |
| Break-glass roles | emergency responders | exceptional access with extra approval and logging |
Recommended baseline
Federate humans first
Prefer identity-provider federation for people and issue temporary credentials through roles. Standing IAM users should be rare and justified.
Separate human, machine, and workload access
Do not reuse the same role family across interactive engineers, pipelines, and runtime workloads. Their trust conditions, audit expectations, and blast radius are different.
Design for small, named trust boundaries
Good examples include:
- one runtime role per workload or tightly related workload set;
- one deployment role per environment tier or platform function;
- read-only discovery roles distinct from mutation-capable roles;
- a separate break-glass path with tighter scrutiny.
Trust policy patterns
A trust policy is not boilerplate. It is the gate that decides which principal can even begin to request the permissions of a role.
Human access patterns
For human roles, strong patterns include:
- federation from the corporate identity provider;
- session duration aligned to the task, not the whole day by default;
- clear role naming by environment and privilege level;
- MFA and contextual restrictions where applicable;
- source identity or session tagging to preserve attribution.
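The human access patterns above can be sketched as a role definition for SAML federation. This is a minimal sketch under assumptions: the account ID, provider name, role name, and one-hour session length are placeholders, and the function name is hypothetical. The dictionary keys mirror the IAM `CreateRole` parameters, and `sts:TagSession` and `sts:SetSourceIdentity` are included because session tags and source identity need them in the trust policy.

```python
import json

# Audience value used for SAML sign-in to AWS.
SAML_AUDIENCE = "https://signin.aws.amazon.com/saml"


def human_role_definition(account_id: str, saml_provider_name: str,
                          role_name: str, max_session_seconds: int = 3600) -> dict:
    """Build CreateRole-style parameters for a federated human role (sketch)."""
    provider_arn = f"arn:aws:iam::{account_id}:saml-provider/{saml_provider_name}"
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Only the corporate identity provider may request this role.
            "Principal": {"Federated": provider_arn},
            "Action": [
                "sts:AssumeRoleWithSAML",
                "sts:TagSession",         # allow session tags from the IdP
                "sts:SetSourceIdentity",  # preserve attribution across sessions
            ],
            "Condition": {"StringEquals": {"SAML:aud": SAML_AUDIENCE}},
        }],
    }
    return {
        "RoleName": role_name,
        # IAM accepts 3600 to 43200 seconds; keep it aligned to the task.
        "MaxSessionDuration": max_session_seconds,
        "AssumeRolePolicyDocument": json.dumps(trust_policy),
    }
```

Naming the role by environment and privilege level, for example `eng-readonly-prod`, keeps the intent visible in logs and reviews.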
Workload access patterns
For workloads, make the caller identity explicit:
- EKS workloads should prefer workload identity patterns such as IRSA rather than borrowing a node role;
- serverless or service-native workloads should use the service's native role attachment model;
- CI/CD deploy roles should trust only the pipeline identity path that actually needs them.
Cross-account patterns
Cross-account access should make the trust boundary obvious:
- specify exactly which principal or role path may assume the role;
- use conditions when they materially narrow trust;
- review the necessity of every wildcard in the trust relationship;
- make external access auditable at the account and organization layers.
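As an illustration of the cross-account points above, the sketch below builds a trust policy that names one exact role and narrows it with `aws:PrincipalOrgID`, then lists any wildcard principals for review. The function names, role ARN, and organization ID are hypothetical; a third-party integration would more likely use `sts:ExternalId` as the condition.

```python
def cross_account_trust_policy(trusted_role_arn: str, org_id: str) -> dict:
    """Trust exactly one role, and only from inside one organization (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": trusted_role_arn},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
        }],
    }


def wildcard_principals(policy: dict) -> list[str]:
    """Return every principal value in a trust policy that contains '*'."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    found = []
    for stmt in statements:
        principal = stmt.get("Principal", {})
        if isinstance(principal, str):
            values = [principal]
        else:
            values = []
            for entry in principal.values():
                values.extend(entry if isinstance(entry, list) else [entry])
        found.extend(value for value in values if "*" in value)
    return found
```

An empty result from `wildcard_principals` does not prove the trust is narrow enough; an account root principal, for instance, contains no wildcard but still trusts the whole account.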
Permissions design patterns
Prefer role-per-boundary over "one giant platform role"
Broad reusable roles save short-term setup time but create long-term review failure. Instead, design by boundary:
- repository or pipeline trust level;
- workload identity;
- environment tier;
- business domain;
- admin versus runtime action set.
Use permissions boundaries when delegating role creation
If teams can create or modify roles, permissions boundaries help define the maximum permissions those identities may ever receive, even if an identity-based policy is broader than intended.
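A common way to enforce this is to let delegated teams manage roles only when a specific boundary is attached. The sketch below assumes a hypothetical boundary policy name and a `/teams/` role path; the function name is illustrative. It relies on the `iam:PermissionsBoundary` condition key and should be adapted and validated before use.

```python
def delegated_role_admin_policy(account_id: str, boundary_policy_name: str,
                                role_path: str = "/teams/") -> dict:
    """Allow role management only when the named boundary is attached (sketch)."""
    boundary_arn = f"arn:aws:iam::{account_id}:policy/{boundary_policy_name}"
    role_arn = f"arn:aws:iam::{account_id}:role{role_path}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ManageRolesOnlyWithBoundary",
                "Effect": "Allow",
                "Action": [
                    "iam:CreateRole",
                    "iam:AttachRolePolicy",
                    "iam:PutRolePolicy",
                    "iam:PutRolePermissionsBoundary",
                ],
                "Resource": role_arn,
                # The request succeeds only if this boundary is the one in effect.
                "Condition": {
                    "StringEquals": {"iam:PermissionsBoundary": boundary_arn}
                },
            },
            {
                "Sid": "NeverRemoveBoundary",
                "Effect": "Deny",
                "Action": "iam:DeleteRolePermissionsBoundary",
                "Resource": role_arn,
            },
        ],
    }
```

The boundary itself grants nothing. A role's effective permissions are the overlap between its identity-based policies and the boundary, so an overly broad team policy is still capped.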
Use ABAC where it simplifies scale, not where it hides complexity
ABAC can reduce policy sprawl when your tagging model is disciplined. It works best when:
- principal tags come from a trusted identity source or controlled role design;
- resource tags are required and reviewed;
- service coverage for tag-based authorization is understood;
- broad admin policies do not silently bypass the model.
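A minimal ABAC sketch, assuming a hypothetical `project` tag: the statement allows an action only when the resource tag equals the caller's principal tag, using the `${aws:PrincipalTag/...}` policy variable. The helper `tags_match` is a local illustration of the comparison for a single tag key, not how IAM evaluates policies in full.

```python
def abac_statement(actions: list[str], tag_key: str) -> dict:
    """Allow actions only where resource and principal share a tag value."""
    return {
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                # Built by concatenation so the ${...} policy variable stays literal.
                f"aws:ResourceTag/{tag_key}": "${aws:PrincipalTag/" + tag_key + "}"
            }
        },
    }


def tags_match(principal_tags: dict, resource_tags: dict, tag_key: str) -> bool:
    """Local illustration: both sides must carry the tag with the same value."""
    value = principal_tags.get(tag_key)
    return value is not None and resource_tags.get(tag_key) == value
```

The missing-tag case matters: an untagged resource or an untagged principal should not match, which is why required and reviewed tags are a precondition for this model. Whether a given service honors `aws:ResourceTag` for a given action has to be checked per service.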
Validate policies before attachment
Use policy validation and access analysis before production use. The goal goes beyond syntactic correctness: it is to catch accidental broad access, weak conditions, and public or cross-account exposure paths early.
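A local pre-check can catch the most obvious problems before a policy reaches a managed validator. The sketch below is an assumed heuristic with hypothetical function names; it flags wildcard actions and unconditioned all-resource grants, and it does not replace IAM Access Analyzer policy validation.

```python
def _as_list(value) -> list:
    """Normalize a policy field that may be absent, a single value, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def broad_statements(policy: dict) -> list[str]:
    """Return findings for Allow statements that look accidentally broad."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        label = stmt.get("Sid", f"statement[{index}]")
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        if any(action == "*" or action.endswith(":*") for action in actions):
            findings.append(f"{label}: wildcard action")
        if "*" in resources and "Condition" not in stmt:
            findings.append(f"{label}: all resources without a condition")
    return findings
```

A check like this fits in a pipeline step that runs on every policy change, with the managed validation and access analysis run afterwards for the findings a simple lint cannot see.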
EKS and workload identity
If Kubernetes workloads need AWS access, a common target state is:
- bind a dedicated Kubernetes service account to the workload;
- map that service account to a dedicated IAM role;
- scope the role to the workload's actual AWS calls;
- keep node roles smaller because they no longer need to carry application permissions.
This keeps runtime identity closer to workload ownership and makes review more understandable.
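The service-account-to-role mapping is expressed in the role's trust policy. The sketch below shows the usual IRSA shape; the function name, account ID, OIDC provider path, namespace, and service account name are placeholders to replace with real values.

```python
def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Trust one Kubernetes service account through the cluster's OIDC provider.

    oidc_provider is the issuer host and path without the https:// prefix.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": provider_arn},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Pin the role to exactly one namespace and service account.
                    f"{oidc_provider}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }
```

On the Kubernetes side, the service account carries the `eks.amazonaws.com/role-arn` annotation pointing at the role. Using `StringEquals` on the `sub` claim rather than a pattern match keeps other service accounts in the cluster from assuming the role.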
Example review checklist
Ask these questions in every IAM design review:
- Are humans using federation and temporary credentials by default?
- Which roles still rely on long-lived credentials or standing users?
- Does each role have a single clear purpose and owner?
- Is each trust policy scoped tightly enough for the permissions it gates?
- Can a pipeline or workload assume a role that is broader than its business function?
- Where are permissions boundaries used, and where should they be?
- Which roles use ABAC or session tags, and who controls the tag source?
- Are workload identities separated from node or host identities?
- Is there a distinct break-glass path with logging and review?
- Has policy validation or access analysis been performed before rollout?
Common anti-patterns
- keeping permanent IAM users because migration to federation feels inconvenient;
- using the same admin-like role for people, pipelines, and workloads;
- attaching broad managed policies first and never narrowing them later;
- letting node roles carry application permissions in EKS when workload identity is available;
- writing trust policies with weak principals or broad wildcard assumptions;
- treating tags as ABAC truth when the tag assignment process itself is untrusted.
Example role catalog
| Example role | Intended caller | Typical scope |
|---|---|---|
| eng-readonly-prod | humans | read-only production inspection |
| platform-admin-nonprod | cloud platform owners | controlled admin changes outside production |
| gitlab-deploy-prod-service-a | protected pipeline lane | deploy one service to one environment |
| eks-sa-payments-writer | payments workload service account | scoped access to required AWS services only |
| breakglass-security-admin | emergency response only | time-bound exceptional admin access |
Related pages
- Infrastructure and Cloud Security
- AWS Networking and Policy Baseline
- IaC and Policy as Code
- Kubernetes Hardening
- Cloud Attack Chains
Suggested reference links
- AWS IAM best practices
- AWS permissions boundaries
- AWS ABAC introduction
- AWS session tags
- AWS IAM Access Analyzer policy validation
- IAM roles for service accounts in EKS
Author attribution: Ivan Piskunov, 2026 - Educational and defensive-engineering use.