🧭 Threat Modeling Methods and Workflows
Intro: In a healthy DevSecOps program, threat modeling is not a rare architecture ritual. It is a repeatable engineering workflow that helps teams understand what changed, what can go wrong, what control decisions matter, and what must be monitored after release.
What this page includes
- a practical threat-modeling workflow for fast-moving product teams
- how to model systems when design docs are incomplete or changing
- how to slice attack surface by layer for cloud-native systems
- how to turn model output into backlog items, gates, and detections
The core problem in modern delivery
In waterfall delivery, security teams often expected a stable design document before review. In Agile and DevSecOps, that assumption breaks down. The design is often partial, the architecture evolves continuously, and teams may ship small increments before the whole system is visible. That does not remove the need for threat modeling. It changes the operating model.
Threat modeling in DevSecOps should assume:
- design is incremental;
- scope will change;
- release frequency is high;
- many risks are introduced by integration, automation, identity, and infrastructure choices rather than only by application code.
A practical four-question loop
A simple and durable way to run threat modeling is to keep returning to four questions:
- What are we building or changing?
- What can go wrong?
- What are we going to do about it?
- Did we do a good enough job?
This sounds basic, but it works well because it scales from a 30-minute feature review to a multi-session design review.
Use three modeling depths
| Review depth | Best fit | Duration | Expected output |
|---|---|---|---|
| Lightning model | new endpoint, integration, admin workflow, release-path change | 30-45 minutes | top abuse paths, 2-3 control actions, owner |
| Design review model | new service, new data flow, authz or tenancy change | 60-90 minutes | trust-boundary diagram, attack paths, control plan |
| Deep-dive model | new platform pattern, payment flow, admin plane, multi-tenant boundary | 2-4 sessions | design decisions, risk register entries, release gates, detection asks |
How to model when the design is still moving
When teams say, "the design is not final yet," the right move is not to wait. Instead:
- model the current intended path;
- record assumptions explicitly;
- tag unresolved points as follow-up design risks;
- require a quick delta review when the design changes materially.
This keeps threat modeling aligned with real delivery instead of turning it into a late-stage audit.
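As a sketch of that workflow, unresolved assumptions can be tracked as first-class items so that a material design change, or a leftover assumption, triggers the delta review automatically. The class and field names below are illustrative, not from any specific tool:

```python
from dataclasses import dataclass, field

@dataclass
class Assumption:
    text: str
    resolved: bool = False  # flipped once the design question is settled

@dataclass
class ThreatModelRecord:
    change: str  # the delta being modeled, e.g. "new export endpoint"
    assumptions: list[Assumption] = field(default_factory=list)

    def open_assumptions(self) -> list[Assumption]:
        return [a for a in self.assumptions if not a.resolved]

    def needs_delta_review(self, design_changed_materially: bool) -> bool:
        # A delta review is due if the design moved materially, or if
        # assumptions recorded in the last pass are still unresolved.
        return design_changed_materially or bool(self.open_assumptions())
```

The point of the structure is that "not final yet" becomes a tracked state with an exit condition, rather than a reason to skip the review.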
Model by change, not by universe
Most teams fail when they try to model the whole platform every time. Start with the delta:
- new public endpoint;
- new trust boundary;
- new service identity;
- new queue, topic, or async worker;
- new third-party integration;
- new deployment control;
- new sensitive data flow.
If the change is small, the model can still be small. The quality bar is not document size. It is decision quality.
Slice the system by layer
One of the most useful ideas from cloud-native security teaching is to avoid stopping at the node or Pod level. Attackers do not stop there. Your model should look through the stack.
Minimum layer stack for cloud-native systems
- edge and ingress
- application or API service
- service-to-service calls
- identity and authorization path
- data stores and caches
- secrets and key handling
- container image and runtime
- cluster and orchestration layer
- cloud control plane and IAM
- CI/CD and artifact path
- logs, telemetry, and admin actions
That structure usually surfaces better questions than a generic list of web vulnerabilities.
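One way to keep reviews honest against this stack is a simple coverage check: record which layers the session actually discussed and list the ones it skipped. A minimal sketch:

```python
# The minimum layer stack from the list above, in review order.
LAYER_STACK = [
    "edge and ingress",
    "application or API service",
    "service-to-service calls",
    "identity and authorization path",
    "data stores and caches",
    "secrets and key handling",
    "container image and runtime",
    "cluster and orchestration layer",
    "cloud control plane and IAM",
    "CI/CD and artifact path",
    "logs, telemetry, and admin actions",
]

def coverage_gaps(discussed: list[str]) -> list[str]:
    """Return the layers a review session never touched."""
    seen = {layer.lower() for layer in discussed}
    return [layer for layer in LAYER_STACK if layer.lower() not in seen]
```

A review that only covered the edge and the database would come back with nine gaps, which is itself a finding.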
Attack-surface prompts that produce useful findings
For application and API changes
- what object identifiers can be guessed, enumerated, replayed, or tampered with?
- where is authorization decided, and can downstream services bypass it?
- what metadata, documentation, or API discovery behavior helps an attacker map the system?
- what bulk-read, export, admin, or recovery actions would hurt most if abused?
For cloud and platform changes
- what happens if a service account, runner token, or cloud role is abused?
- can an attacker move from app access to deployment or infrastructure control?
- can a workload read metadata services, node-level credentials, or broad secrets?
- if one namespace, node, or runner is compromised, what is the blast radius?
For container and Kubernetes changes
- does the workload need root, write access, extra capabilities, host namespaces, or host paths?
- are service account tokens mounted by default?
- does network segmentation actually isolate the workload?
- what admission, scanning, or policy controls would stop an unsafe deployment?
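The container prompts above can be partly mechanized. The sketch below inspects a simplified Pod spec, passed as a plain dict with field names following the Kubernetes Pod API, for common over-privilege patterns. It is a review aid under those assumptions, not a replacement for admission control:

```python
def risky_pod_settings(pod: dict) -> list[str]:
    """Flag Pod settings a threat-modeling session should question."""
    findings = []
    spec = pod.get("spec", {})
    for ns in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(ns):
            findings.append(f"host namespace enabled: {ns}")
    # Kubernetes mounts a service account token unless explicitly disabled.
    if spec.get("automountServiceAccountToken", True):
        findings.append("service account token mounted (default)")
    for vol in spec.get("volumes", []):
        if "hostPath" in vol:
            findings.append(f"hostPath volume: {vol['hostPath'].get('path')}")
    for c in spec.get("containers", []):
        name = c.get("name", "?")
        sc = c.get("securityContext", {})
        if sc.get("privileged"):
            findings.append(f"{name}: privileged container")
        if sc.get("runAsUser") == 0:
            findings.append(f"{name}: runs as root (runAsUser: 0)")
        for cap in sc.get("capabilities", {}).get("add", []):
            findings.append(f"{name}: extra capability {cap}")
    return findings
```

Each finding maps back to one of the questions above; an empty result does not mean the workload is safe, only that the obvious flags are absent.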
What good outputs look like
A useful threat model should produce concrete action types:
- design change: e.g. move authorization to a shared decision point;
- platform guardrail: e.g. require restricted Pod Security Admission or signed images;
- release gate: e.g. contract linting or DAST before promotion;
- detection requirement: e.g. alert on privilege escalation, token abuse, or unusual export activity;
- accepted residual risk: recorded with owner, rationale, expiration, and review date.
Add a business-integrity lens, not only an attacker-technique lens
A repeated weakness in real systems is that the design is technically consistent but business-incorrect. Threat modeling should therefore ask not only "what exploit class exists?" but also:
- what state transition would violate the business invariant even if auth still passes;
- what negative quantity, replay, duplicate event, or stale approval would create loss;
- what hidden assumption about inventory, credits, approvals, pricing, or ownership is currently implicit.
This catches the kind of shallow-model failures that scanner-driven review often misses.
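To make the invariant lens concrete, the sketch below checks an illustrative order event for the failure shapes listed above. The field names and rules are hypothetical examples, not a real schema:

```python
def order_invariant_violations(order: dict, seen_event_ids: set) -> list[str]:
    """Business-invariant checks for a hypothetical order event: each rule
    catches a state that can pass authentication yet still create loss."""
    violations = []
    if order["quantity"] <= 0:
        violations.append("non-positive quantity")  # negative-quantity abuse
    if order["event_id"] in seen_event_ids:
        violations.append("replayed or duplicate event")
    if order["unit_price"] < 0:
        violations.append("negative price")
    if order.get("approved_at") and order["approved_at"] < order["submitted_at"]:
        violations.append("stale approval: approved before submission")
    return violations
```

None of these conditions is an "exploit class" in the scanner sense; all of them are states a threat model should have declared impossible.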
Use DFDs and explicit trust boundaries when the architecture is fuzzy
If the design is still changing, create a simple dataflow view anyway:
- external clients and browser/mobile layers;
- API edge or gateway;
- core services and workers;
- data stores;
- upstream and downstream services;
- telemetry and audit sinks.
Mark the trust boundaries explicitly. For APIs, the most valuable findings usually sit on the flows that cross those boundaries.
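Once zones are marked, the boundary-crossing flows fall out mechanically. In this sketch the node names and zones are a hypothetical example:

```python
# Each node is tagged with a trust zone; a flow whose endpoints sit in
# different zones crosses a trust boundary and deserves the most scrutiny.
ZONES = {
    "browser": "external",
    "api-gateway": "edge",
    "orders-svc": "internal",
    "billing-worker": "internal",
    "orders-db": "data",
}

FLOWS = [
    ("browser", "api-gateway"),
    ("api-gateway", "orders-svc"),
    ("orders-svc", "billing-worker"),  # same zone: lower review priority
    ("orders-svc", "orders-db"),
]

def boundary_crossings(zones: dict, flows: list) -> list:
    return [(a, b) for a, b in flows if zones[a] != zones[b]]
```

Sorting the review agenda by these crossings is usually enough to spend the session on the flows that matter.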
A STRIDE-style pass is still useful when kept lightweight
After drawing the flow, quickly ask where the change could introduce:
- spoofing: false caller or workload identity;
- tampering: object, event, or state modification;
- repudiation: weak evidence or actor attribution;
- information disclosure: tenant crossover, property leaks, debug or cache exposure;
- denial of service: resource exhaustion or high-cost workflow abuse;
- elevation of privilege: support, admin, service, or workflow bypass.
This works best as a compact thinking aid, not a paperwork ceremony.
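Kept at that weight, the pass can even be scripted as a prompt list for a review template or bot. A minimal sketch using the categories above:

```python
# STRIDE categories with the hints from the list above.
STRIDE = {
    "spoofing": "false caller or workload identity",
    "tampering": "object, event, or state modification",
    "repudiation": "weak evidence or actor attribution",
    "information disclosure": "tenant crossover, property leaks, or debug exposure",
    "denial of service": "resource exhaustion or high-cost workflow abuse",
    "elevation of privilege": "support, admin, service, or workflow bypass",
}

def stride_prompts(change: str) -> list[str]:
    """One review question per STRIDE category for a given change."""
    return [f"Could '{change}' introduce {category} ({hint})?"
            for category, hint in STRIDE.items()]
```

Six questions per change is roughly the right ceiling; if the pass produces forty, it has become the paperwork ceremony it was meant to avoid.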
Recommended artifact format
Keep the artifact short and operational:
- architecture snapshot;
- key assets and trust boundaries;
- top abuse paths;
- selected controls;
- unresolved assumptions;
- owner, due date, and re-review trigger.
Common failure modes
- the team lists OWASP buzzwords but never models tenancy, identity, or deployment trust;
- the review ignores CI/CD and artifact trust even though releases are automated;
- findings are written as generic advice instead of decisions and owners;
- teams say "we need logs" but never specify which exact events matter;
- the model is created once and never revisited after the architecture changes.
Practical review cadence
| Trigger | Review expectation |
|---|---|
| New service or major feature | design review model |
| New authn/authz flow | design review or deep dive |
| New tenant boundary or admin capability | deep dive |
| New third-party integration | lightning or design review |
| New deployment path or runner model | lightning or design review |
| Material architecture change after prior review | delta review |
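The cadence table translates directly into a lookup that a review-intake form or chat bot could use. The trigger keys here are illustrative strings, not a fixed taxonomy:

```python
# Mapping from the cadence table above: trigger -> expected review depth.
CADENCE = {
    "new service or major feature": "design review model",
    "new authn/authz flow": "design review or deep dive",
    "new tenant boundary or admin capability": "deep dive",
    "new third-party integration": "lightning or design review",
    "new deployment path or runner model": "lightning or design review",
    "material architecture change after prior review": "delta review",
}

def review_depth(trigger: str) -> str:
    """Map a change trigger to its expected review depth.
    Unrecognized (small) changes default to the cheapest review."""
    return CADENCE.get(trigger.strip().lower(), "lightning model")
```

Defaulting unknown triggers to a lightning model keeps the rule consistent with "model by change, not by universe": every change gets some review, and only the listed triggers escalate.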
Cross-links
- Multi-Tenant and Microservice Threat Modeling
- Architecture Review Question Bank and Decision Records
- API Design and Contract Security
- Runner Isolation and Trust Boundaries
- Cloud Attack Chains Overview
Author attribution: Ivan Piskunov, 2026. Educational and defensive-engineering use.