🛡️ Security as Policy for Terraform and Infrastructure as Code
Intro: Teams often understand Infrastructure as Code first: infrastructure is described, reviewed, versioned, and deployed as code. Security as Policy applies the same idea to guardrails. Instead of relying on tribal memory and manual review alone, the team writes security expectations in testable rules and executes them continuously.
What this page includes
- what Security as Policy means in a Terraform-centered workflow
- the difference between ad hoc scanner usage and real policy enforcement
- where tools like Checkov, OPA/Conftest, Sentinel, and platform policy engines fit
- an implementation roadmap for product and platform teams
Working assumptions
- policies should clarify engineering decisions, not replace engineering judgment
- the goal is not maximum blocking; the goal is repeatable, explainable security guardrails
A simple definition
Security as Policy means expressing security expectations as versioned, reviewable, executable rules that run in the same workflow as code changes.
In practice, that often means:
- a rule says production storage must be encrypted
- a rule says public ingress to admin ports is forbidden
- a rule says certain tags, owners, or logging controls are mandatory
- a rule says only approved module sources may be used
- a rule says production deployments require evidence and approvals
The key shift is this:
- manual review asks humans to remember the rule
- policy as code makes the rule explicit and testable
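To make "explicit and testable" concrete, here is a minimal Python sketch of the admin-port rule written as a function over the JSON form of a Terraform plan (the output of `terraform show -json`). It is an illustration of the idea, not a replacement for Checkov, OPA, or Sentinel. The function name, the `ADMIN_PORTS` set, and the sample plan are assumptions made for this example, and it only inspects inline `ingress` blocks on `aws_security_group` resources.

```python
# Illustrative only: names and the admin-port list are assumptions for this sketch.
ADMIN_PORTS = {22, 3389}            # assumption: SSH and RDP are the admin ports
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # "anywhere" for IPv4 and IPv6


def public_admin_ingress(plan: dict) -> list:
    """Return addresses of security groups that expose an admin port to the internet."""
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group":
            continue
        after = (rc.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            cidrs = set(rule.get("cidr_blocks") or []) | set(rule.get("ipv6_cidr_blocks") or [])
            if not (cidrs & OPEN_CIDRS):
                continue
            all_ports = str(rule.get("protocol")) == "-1"  # "-1" means every protocol and port
            low = rule.get("from_port") or 0
            high = rule.get("to_port") or 0
            if all_ports or any(low <= port <= high for port in ADMIN_PORTS):
                violations.append(rc["address"])
                break  # one hit per resource is enough
    return violations


# Hypothetical plan fragment, shaped like `terraform show -json` output.
SAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": "aws_security_group.bastion",
            "type": "aws_security_group",
            "change": {"after": {"ingress": [
                {"protocol": "tcp", "from_port": 22, "to_port": 22,
                 "cidr_blocks": ["0.0.0.0/0"]},
            ]}},
        },
        {
            "address": "aws_security_group.web",
            "type": "aws_security_group",
            "change": {"after": {"ingress": [
                {"protocol": "tcp", "from_port": 443, "to_port": 443,
                 "cidr_blocks": ["0.0.0.0/0"]},
            ]}},
        },
    ]
}

print(public_admin_ingress(SAMPLE_PLAN))
```

Because the rule is a plain function, it can be unit-tested and reviewed like any other change, which is the point of the shift described above.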
Security as Policy vs Infrastructure as Code
Infrastructure as Code answers:
"What infrastructure do we want to create?"
Security as Policy answers:
"What security conditions must be true before that infrastructure is allowed to ship?"
They work best together.
| Pattern | Main concern | Output |
|---|---|---|
| Infrastructure as Code | desired infrastructure state | Terraform modules, variables, plans |
| Security as Policy | acceptable security constraints | policy rules, deny conditions, approval logic |
| Runtime posture | what is actually deployed | cloud findings, drift, asset state |
What Security as Policy is not
It is not:
- just running one scanner in CI
- a long spreadsheet of standards nobody tests
- a giant deny list that developers bypass
- a compliance theater layer disconnected from risk
A policy approach becomes real when:
- rules are versioned
- rules are visible to engineers
- rules are tied to stages in the workflow
- exceptions are documented
- outputs affect approvals or release decisions
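One way to keep exceptions documented rather than silent is to store them as data with an owner and an expiry date. The sketch below assumes a hypothetical `PolicyException` record and rule ID scheme (`ORG_SG_001`); neither comes from a specific tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    """Hypothetical exception record; field names are assumptions for this sketch."""
    rule_id: str
    resource: str
    owner: str
    reason: str
    expires: date


def is_waived(rule_id: str, resource: str, exceptions: list, today: date) -> bool:
    """True only if a matching exception exists and has not expired."""
    return any(
        e.rule_id == rule_id and e.resource == resource and e.expires >= today
        for e in exceptions
    )


EXCEPTIONS = [
    PolicyException(
        rule_id="ORG_SG_001",
        resource="aws_security_group.bastion",
        owner="team-platform",
        reason="temporary access during migration",
        expires=date(2025, 6, 30),
    ),
]
```

An expired exception stops matching automatically, so the finding returns to the normal block or warn path without anyone having to remember to remove it.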
Where policy can run
A mature Terraform workflow often applies policy in several places:
- authoring time: pre-commit, IDE hooks, local scan
- merge request time: CI validation, plan review, blocking checks
- platform approval time: policy evaluation on Terraform plan or run
- post-deploy time: drift, CSPM, runtime posture, evidence collection
Main implementation patterns
1. Scanner-enforced policy
Examples: Checkov, Trivy config, KICS, Terrascan
Best when you want:
- quick coverage
- fast onboarding
- built-in rule packs
- lightweight CI integration
2. Rego-based policy
Examples: OPA, Conftest, policy bundles
Best when you want:
- reusable logic
- policy repos
- multi-tool consistency
- clear "deny" and "warn" semantics across Terraform, Kubernetes, APIs, and CI
3. Platform-native policy
Examples: Sentinel in HCP Terraform and Terraform Enterprise, and similar platform guardrails
Best when you want:
- policy tightly bound to run approvals
- organization-wide governance for Terraform runs
- policy decisions close to the plan/apply flow
Why this approach is valuable
For the engineering team
- fewer late surprises in release windows
- clearer expectations
- reusable safe defaults
- less argument about obvious guardrails
For the product security team
- repeatable evidence
- lower review load
- better consistency across services and teams
- a path from standards text to enforceable controls
For leadership
- more predictable release governance
- measurable control adoption
- fewer ad hoc exceptions
- easier audit and board communication
Common policy categories for Terraform
Start with rules that matter operationally.
Identity and access
- no wildcard IAM in production
- trust policies must be scoped
- workload identity over static credentials
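As a rough illustration of the wildcard rule, the Python sketch below inspects an IAM policy document (the JSON string that ends up in the `policy` attribute of a resource such as `aws_iam_policy`) and returns the `Allow` statements whose actions are `*` or `service:*`. Treating exactly those two shapes as "wildcard" is an assumption of this sketch; a real rule may also need to consider `NotAction` and resource scoping.

```python
import json


def _as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy_json: str) -> list:
    """Return Allow statements that grant '*' or 'service:*' actions."""
    document = json.loads(policy_json)
    flagged = []
    for statement in _as_list(document.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        if any(a == "*" or a.endswith(":*") for a in actions):
            flagged.append(statement)
    return flagged


# Hypothetical policy document used only for illustration.
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
})

print(len(wildcard_statements(SAMPLE_POLICY)))
```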
Network exposure
- no internet-exposed admin ports
- private subnets for stateful services where required
- approved ingress sources only
Data protection
- encryption at rest
- versioning and retention on critical storage
- logging for sensitive storage and network boundaries
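An encryption-at-rest rule can be sketched the same way. The Python example below checks a few AWS resource types against the boolean attribute that turns encryption on. The mapping covers only three types and is an assumption about scope, and the sample plan is hypothetical. The check fails closed: a value that is missing or not yet known in the plan is reported.

```python
# Resource type -> attribute that must be True. Deliberately small; extend per environment.
ENCRYPTION_FLAGS = {
    "aws_ebs_volume": "encrypted",
    "aws_db_instance": "storage_encrypted",
    "aws_efs_file_system": "encrypted",
}


def unencrypted_resources(plan: dict) -> list:
    """Return addresses of planned resources whose encryption flag is not True."""
    violations = []
    for rc in plan.get("resource_changes", []):
        flag = ENCRYPTION_FLAGS.get(rc.get("type"))
        if flag is None:
            continue
        after = (rc.get("change") or {}).get("after")
        if after is None:
            continue  # resource is being deleted; nothing to enforce
        if after.get(flag) is not True:
            violations.append(rc["address"])
    return violations


# Hypothetical plan fragment shaped like `terraform show -json` output.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_ebs_volume.data", "type": "aws_ebs_volume",
         "change": {"actions": ["create"], "after": {"encrypted": False}}},
        {"address": "aws_db_instance.main", "type": "aws_db_instance",
         "change": {"actions": ["create"], "after": {"storage_encrypted": True}}},
        {"address": "aws_efs_file_system.old", "type": "aws_efs_file_system",
         "change": {"actions": ["delete"], "after": None}},
        {"address": "aws_ebs_volume.scratch", "type": "aws_ebs_volume",
         "change": {"actions": ["create"], "after": {"size": 10}}},
    ]
}

print(unencrypted_resources(SAMPLE_PLAN))
```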
Observability and audit
- flow logs, CloudTrail, or equivalent
- log retention standards
- required tags for ownership and environment
Platform hygiene
- approved module sources
- pinned module versions
- required provider constraints
- no disallowed services in restricted environments
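The module hygiene rules can also be checked from plan JSON, which records each module call's `source` and `version_constraint` under `configuration.root_module.module_calls`. The Python sketch below is one possible reading of "approved" and "pinned": the registry prefix is a hypothetical placeholder, "pinned" is taken to mean an exact `X.Y.Z` version, and only root-level module calls are inspected.

```python
import re

# Hypothetical approved registry namespace; replace with the organization's own.
APPROVED_PREFIXES = ("app.terraform.io/example-org/",)
# Assumption: "pinned" means an exact version such as "3.1.0" or "= 3.1.0".
EXACT_VERSION = re.compile(r"^(=\s*)?\d+\.\d+\.\d+$")


def module_violations(plan: dict) -> list:
    """Return (module_name, problem) pairs for root-level module calls."""
    calls = plan.get("configuration", {}).get("root_module", {}).get("module_calls", {})
    problems = []
    for name, call in sorted(calls.items()):
        if not call.get("source", "").startswith(APPROVED_PREFIXES):
            problems.append((name, "unapproved source"))
        if not EXACT_VERSION.match(call.get("version_constraint", "").strip()):
            problems.append((name, "version not pinned"))
    return problems


# Hypothetical configuration fragment for illustration.
SAMPLE_PLAN = {
    "configuration": {"root_module": {"module_calls": {
        "vpc": {"source": "app.terraform.io/example-org/vpc/aws", "version_constraint": "3.1.0"},
        "db": {"source": "github.com/someone/terraform-db"},
        "cache": {"source": "app.terraform.io/example-org/cache/aws", "version_constraint": "~> 2.0"},
    }}}
}

for name, problem in module_violations(SAMPLE_PLAN):
    print(f"{name}: {problem}")
```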
Example tools and how they fit
Checkov
Great for broad, fast Terraform and plan scanning with built-in policies plus custom YAML/Python policies.
OPA / Conftest
Great for writing explicit deny rules in Rego and sharing bundles across repos and teams.
Sentinel
Great when Terraform run approvals need tighter policy control inside the Terraform delivery platform.
Vendor-neutral reference architecture
- engineers write Terraform
- pre-commit catches obvious issues
- CI runs validate + scanner checks
- a Terraform plan is generated
- policy runs against plan output
- failures create block, warn, or exception paths
- approved changes deploy
- runtime posture feeds back into policy and modules
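The "block, warn, or exception" step in this flow can be reduced to a small decision function. The Python sketch below is one possible shape under stated assumptions: the `Finding` record, the severity names, and the choice that only `critical` findings block are all hypothetical. The `enforce` flag models a report-only rollout in which results are published but never block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; severity names are assumptions."""
    rule_id: str
    resource: str
    severity: str          # e.g. "critical", "high", "medium", "low"
    waived: bool = False   # True when a documented, unexpired exception applies


BLOCKING_SEVERITIES = {"critical"}  # assumption: only critical findings block


def decide(findings: list, enforce: bool = True) -> str:
    """Return 'block', 'warn', 'exception', or 'pass' for a set of findings."""
    active = [f for f in findings if not f.waived]
    if enforce and any(f.severity in BLOCKING_SEVERITIES for f in active):
        return "block"
    if active:
        return "warn"
    if findings:
        return "exception"  # everything that fired is covered by an exception
    return "pass"
```

Returning a distinct `exception` outcome, instead of folding waived findings into `pass`, keeps the exception visible in release evidence.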
Semgrep as a complementary policy layer
Semgrep is useful when policy teams want fast custom checks for Terraform, YAML, CI config, or organization-specific anti-patterns that do not justify a full policy-engine rollout yet.
Practical use cases
- ban clearly dangerous Terraform resource patterns;
- catch open-to-the-world ingress fragments before broader IaC scanners run;
- create reviewer-readable checks for internal module conventions.
Important note
Historically, teams often used Semgrep generic pattern matching for Terraform and YAML because native support was limited. The current state is better, but the lesson still holds: Semgrep works best as a targeted custom-check layer, not as fake CSPM.
Example: Conftest policy against Terraform HCL
Example command:
conftest test --policy snippets/policy snippets/terraform/checkov-demo/main.tf
Example: Sentinel-style guardrail
See ../snippets/policy/sentinel-restrict-public-sg.sentinel.
Use this pattern when the organization wants to say:
- this change may be reviewed, but not applied unless it satisfies policy
- exceptions should be explicit, not silent
Implementation roadmap
Phase 1: baseline
- pick one scanner
- pick 10 to 20 high-signal rules
- publish results without blocking
- document ownership
Phase 2: narrow enforcement
- block critical misconfigurations on new changes
- create an exception path
- define who approves exceptions
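Blocking only on new changes usually implies comparing current findings against a recorded baseline. A minimal Python sketch, assuming findings are dictionaries keyed by a hypothetical `rule_id` and the Terraform resource address:

```python
def new_findings(current: list, baseline: list) -> list:
    """Return findings in `current` that are not already present in `baseline`.

    A finding is identified by (rule_id, resource); both keys are assumptions
    about how scanner output has been normalized.
    """
    known = {(f["rule_id"], f["resource"]) for f in baseline}
    return [f for f in current if (f["rule_id"], f["resource"]) not in known]


# Hypothetical data: one legacy finding, one introduced by the current change.
BASELINE = [
    {"rule_id": "ORG_ENC_001", "resource": "aws_ebs_volume.legacy"},
]
CURRENT = [
    {"rule_id": "ORG_ENC_001", "resource": "aws_ebs_volume.legacy"},
    {"rule_id": "ORG_SG_001", "resource": "aws_security_group.bastion"},
]

print(new_findings(CURRENT, BASELINE))
```

Legacy findings stay visible in reporting, but only the difference feeds the blocking gate, which keeps the first enforcement step narrow.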
Phase 3: module-first hardening
- move repeated remediations into shared modules
- reduce the need for repeated scanner-only fixes
Phase 4: policy repository
- store custom policies in a dedicated repo
- version and review policy changes like application code
- publish policy bundles to CI consumers
Phase 5: business linkage
- connect policy results to service inventory, exposure, and release evidence
- report adoption, exception debt, and risk reduction to leadership
A simple workflow model
| Stage | What happens |
|---|---|
| Author | Developer writes or changes Terraform. |
| Validate | terraform fmt, terraform validate, basic lint, scanner pass. |
| Evaluate policy | Check source + plan against organization rules. |
| Triage | Choose pass, fail, or exception. |
| Approve | Evidence and approvals recorded. |
| Deploy | Approved plan moves forward. |
| Measure | Metrics roll up by app, team, environment, and release train. |
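The Measure stage can start as a simple roll-up of policy results by grouping keys. The Python sketch below assumes each record carries hypothetical `team`, `environment`, and `status` fields; the status values shown are examples, not a fixed vocabulary.

```python
from collections import Counter, defaultdict


def roll_up(records: list, keys: tuple = ("team", "environment")) -> dict:
    """Count records per status, grouped by the given keys."""
    summary = defaultdict(Counter)
    for record in records:
        group = tuple(record.get(k, "unknown") for k in keys)
        summary[group][record["status"]] += 1
    return {group: dict(counts) for group, counts in summary.items()}


# Hypothetical policy results for illustration.
RECORDS = [
    {"team": "payments", "environment": "prod", "status": "fail"},
    {"team": "payments", "environment": "prod", "status": "exception"},
    {"team": "payments", "environment": "prod", "status": "fail"},
    {"team": "search", "environment": "dev", "status": "warn"},
]

for group, counts in roll_up(RECORDS).items():
    print(group, counts)
```

Counting exceptions alongside failures gives an early view of exception debt rather than raw finding totals alone.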
What makes rollout fail
Security as Policy often fails when:
- the first version blocks everything
- policies are copied from a benchmark with no business context
- nobody owns exceptions
- the team measures only finding count
- there is no shared module strategy
What good adoption looks like
Good adoption looks like this:
- fewer recurring Terraform mistakes
- clearer release expectations
- smaller review burden for security teams
- better consistency across services
- more useful leadership reporting
Quick start plan
If a team asks, "Where do we begin?" the shortest sensible answer is:
- choose Checkov as a baseline scanner
- add a small policy pack for the highest-risk controls
- define pass / warn / exception states
- run in CI on merge requests
- review exception debt monthly
- feed the learnings back into shared Terraform modules
Cross-links
- 🧱 Terraform Security Scanning and Checkov
- Terraform Snippet Pack
- Security Quality Gates and Release Blocking
- Product Security Director Metrics
Footer note: The best policy is the one engineers can understand, reviewers can defend, and CI can enforce consistently.
Testing infrastructure as code
Policy checks are only one part of IaC confidence. A healthy workflow tests infrastructure at several levels.
| Test layer | Goal | Typical tools |
|---|---|---|
| Syntax and semantics | catch broken HCL or template logic early | terraform fmt, terraform validate, tflint |
| Misconfiguration scan | catch known risky patterns | Checkov, Terrascan, KICS, Trivy config |
| Policy test | enforce organization rules | OPA / Conftest, Sentinel, platform policy |
| Module test | prove a module behaves as expected | Terratest, Kitchen-Terraform, Testinfra-style flows |
| Integration / environment test | confirm the deployed stack really behaves correctly | ephemeral environments, smoke tests, cloud API assertions |
Example local flow
terraform fmt -check
terraform validate
tflint
checkov -d .
conftest test --policy ../snippets/policy .
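The same flow can be wrapped in a small fail-fast runner so every engineer executes the steps in the same order. This Python sketch is an assumption about how a team might script it; `LOCAL_FLOW` simply repeats the commands listed above, and a missing tool is reported with exit code 127 by convention.

```python
import subprocess

# The commands from the local flow above, in order.
LOCAL_FLOW = [
    ["terraform", "fmt", "-check"],
    ["terraform", "validate"],
    ["tflint"],
    ["checkov", "-d", "."],
    ["conftest", "test", "--policy", "../snippets/policy", "."],
]


def run_flow(commands: list, cwd: str = ".") -> list:
    """Run commands in order and stop at the first non-zero exit.

    Returns (command, return_code) pairs for the commands that actually ran.
    """
    results = []
    for command in commands:
        try:
            code = subprocess.run(command, cwd=cwd).returncode
        except FileNotFoundError:
            code = 127  # tool is not installed or not on PATH
        results.append((command, code))
        if code != 0:
            break
    return results
```

Calling `run_flow(LOCAL_FLOW)` from the Terraform working directory returns the steps that ran; the last pair shows which step failed, if any.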
Why this matters
A lot of teams stop at "the plan compiles."
That proves very little about:
- network exposure;
- encryption settings;
- IAM trust boundaries;
- module regressions;
- cloud-provider behavior after apply.
Tool-selection guidance
If you are building an IaC control stack from scratch:
- start with one formatter/linter and one misconfiguration scanner;
- add policy enforcement only after the team understands the baseline findings;
- add module or integration tests where the blast radius is high;
- prefer simple, repeatable commands over a giant, fragile validation tower.
Policy without testing is brittle
Security policy works best when it is only one layer in the IaC test stack.
A practical order is:
- format and validate
- static misconfiguration checks
- policy checks
- plan-aware review
- selective environment tests
- drift-aware follow-up
For a fuller progression model, see 🧱 Infrastructure as Code Maturity and Test Strategy.