🛡️ Security as Policy for Terraform and Infrastructure as Code

Intro: Teams often understand Infrastructure as Code first: infrastructure is described, reviewed, versioned, and deployed as code. Security as Policy applies the same idea to guardrails. Instead of relying on tribal memory and manual review alone, the team writes security expectations in testable rules and executes them continuously.

What this page includes

what Security as Policy means in a Terraform-centered workflow

the difference between ad hoc scanner usage and real policy enforcement

where tools like Checkov, OPA/Conftest, Sentinel, and platform policy engines fit

an implementation roadmap for product and platform teams

Working assumptions

policies should clarify engineering decisions, not replace engineering judgment

the goal is not maximum blocking; the goal is repeatable, explainable security guardrails

A simple definition

Security as Policy means expressing security expectations as versioned, reviewable, executable rules that run in the same workflow as code changes.

In practice, that often means:

a rule says production storage must be encrypted
a rule says public ingress to admin ports is forbidden
a rule says certain tags, owners, or logging controls are mandatory
a rule says only approved module sources may be used
a rule says production deployments require evidence and approvals

The key shift is this:

manual review asks humans to remember the rule
policy as code makes the rule explicit and testable
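To make the shift concrete, here is a minimal sketch of the first rule above ("storage must be encrypted") written as an executable check. It assumes the input is the parsed JSON produced by `terraform show -json` for a saved plan. The `ENCRYPTION_ATTRIBUTE` mapping and the function name are illustrative choices for this page, not part of any tool.

```python
# Illustrative rule: planned storage resources must be encrypted.
# Input: the parsed output of `terraform show -json <planfile>`.

# Resource type -> attribute that must be true. This mapping is an assumption
# for the example; extend it to match the providers you actually use.
ENCRYPTION_ATTRIBUTE = {
    "aws_ebs_volume": "encrypted",
    "aws_db_instance": "storage_encrypted",
}


def unencrypted_storage(plan: dict) -> list[str]:
    """Return addresses of planned storage resources without encryption."""
    violations = []
    for rc in plan.get("resource_changes", []):
        attribute = ENCRYPTION_ATTRIBUTE.get(rc.get("type"))
        after = (rc.get("change") or {}).get("after")
        if attribute is None or after is None:
            continue  # not a type we check, or the resource is being deleted
        if after.get(attribute) is not True:
            violations.append(rc["address"])
    return violations


# Hand-written sample that mimics the shape of a plan; not real output.
sample_plan = {
    "resource_changes": [
        {"address": "aws_ebs_volume.data", "type": "aws_ebs_volume",
         "change": {"actions": ["create"], "after": {"encrypted": True}}},
        {"address": "aws_ebs_volume.scratch", "type": "aws_ebs_volume",
         "change": {"actions": ["create"], "after": {"encrypted": False}}},
        {"address": "aws_db_instance.legacy", "type": "aws_db_instance",
         "change": {"actions": ["delete"], "after": None}},
    ]
}

print(unencrypted_storage(sample_plan))  # ['aws_ebs_volume.scratch']
```

The rule now lives in version control, can be reviewed like any other change, and can be unit tested against sample plans. In practice a scanner or policy engine would usually host this logic, and the Python form is only meant to show what "explicit and testable" looks like.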

Security as Policy vs Infrastructure as Code

Infrastructure as Code answers:

"What infrastructure do we want to create?"

Security as Policy answers:

"What security conditions must be true before that infrastructure is allowed to ship?"

They work best together.

Pattern	Main concern	Output
Infrastructure as Code	desired infrastructure state	Terraform modules, variables, plans
Security as Policy	acceptable security constraints	policy rules, deny conditions, approval logic
Runtime posture	what is actually deployed	cloud findings, drift, asset state

What Security as Policy is not

It is not:

just running one scanner in CI
a long spreadsheet of standards nobody tests
a giant deny list that developers bypass
a compliance theater layer disconnected from risk

A policy approach becomes real when:

rules are versioned
rules are visible to engineers
rules are tied to stages in the workflow
exceptions are documented
outputs affect approvals or release decisions

Where policy can run

A mature Terraform workflow often applies policy in several places:

authoring time: pre-commit, IDE hooks, local scan
merge request time: CI validation, plan review, blocking checks
platform approval time: policy evaluation on Terraform plan or run
post-deploy time: drift, CSPM, runtime posture, evidence collection

Main implementation patterns

1. Scanner-enforced policy

Examples: Checkov, Trivy config, KICS, Terrascan

Best when you want:

quick coverage
fast onboarding
built-in rule packs
lightweight CI integration

2. Rego-based policy

Examples: OPA, Conftest, policy bundles

Best when you want:

reusable logic
policy repos
multi-tool consistency
clear "deny" and "warn" semantics across Terraform, Kubernetes, APIs, and CI
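The "deny" and "warn" split is easier to see in a small example. The sketch below is written in Python rather than Rego, purely to illustrate the semantics: internet-facing admin ports produce a deny, other internet-facing ingress produces a warning. It assumes plan JSON from `terraform show -json` and the `aws_security_group` inline `ingress` attributes. The port list and function name are assumptions made for this example.

```python
# Sketch of deny/warn semantics over a Terraform plan (illustrative only).

ADMIN_PORTS = {22, 3389}            # assumed definition of "admin ports"
ANYWHERE = {"0.0.0.0/0", "::/0"}


def evaluate(plan: dict) -> dict[str, list[str]]:
    """Classify security group ingress rules into deny and warn messages."""
    result = {"deny": [], "warn": []}
    for rc in plan.get("resource_changes", []):
        after = (rc.get("change") or {}).get("after")
        if rc.get("type") != "aws_security_group" or after is None:
            continue
        for rule in after.get("ingress") or []:
            sources = set(rule.get("cidr_blocks") or [])
            sources |= set(rule.get("ipv6_cidr_blocks") or [])
            if not sources & ANYWHERE:
                continue  # not reachable from the whole internet
            if rule.get("protocol") == "-1":
                low, high = 0, 65535  # "-1" means all protocols and ports
            else:
                low = rule.get("from_port") or 0
                high = rule.get("to_port") or 0
            exposed = sorted(p for p in ADMIN_PORTS if low <= p <= high)
            if exposed:
                result["deny"].append(
                    f"{rc['address']}: admin ports {exposed} open to the internet")
            else:
                result["warn"].append(
                    f"{rc['address']}: ports {low}-{high} open to the internet")
    return result


# Hand-written sample input, not real plan output.
sample_plan = {"resource_changes": [{
    "address": "aws_security_group.bastion",
    "type": "aws_security_group",
    "change": {"after": {"ingress": [
        {"protocol": "tcp", "from_port": 22, "to_port": 22,
         "cidr_blocks": ["0.0.0.0/0"]},
        {"protocol": "tcp", "from_port": 443, "to_port": 443,
         "cidr_blocks": ["0.0.0.0/0"]},
        {"protocol": "tcp", "from_port": 22, "to_port": 22,
         "cidr_blocks": ["10.0.0.0/8"]},
    ]}},
}]}

outcome = evaluate(sample_plan)
print(len(outcome["deny"]), len(outcome["warn"]))  # 1 1
```

A CI job would typically fail on any deny message and surface warnings in the merge request without blocking it.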

3. Platform-native policy

Examples: Sentinel in HCP Terraform and Terraform Enterprise, and similar platform guardrails

Best when you want:

policy tightly bound to run approvals
organization-wide governance for Terraform runs
policy decisions close to the plan/apply flow

Why this approach is valuable

For the engineering team

fewer late surprises in release windows
clearer expectations
reusable safe defaults
less argument about obvious guardrails

For the product security team

repeatable evidence
lower review load
better consistency across services and teams
a path from standards text to enforceable controls

For leadership

more predictable release governance
measurable control adoption
fewer ad hoc exceptions
easier audit and board communication

Common policy categories for Terraform

Start with rules that matter operationally.

Identity and access

no wildcard IAM in production
trust policies must be scoped
workload identity over static credentials

Network exposure

no internet-exposed admin ports
private subnets for stateful services where required
approved ingress sources only

Data protection

encryption at rest
versioning and retention on critical storage
logging for sensitive storage and network boundaries

Observability and audit

flow logs, CloudTrail, or equivalent
log retention standards
required tags for ownership and environment

Platform hygiene

approved module sources
pinned module versions
required provider constraints
no disallowed services in restricted environments
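As one illustration of the platform hygiene category, the sketch below checks module sources and version pinning. It assumes the `configuration.root_module.module_calls` section of the plan JSON exposes `source` and `version_constraint` for each module call; verify that against the output of your Terraform version. The approved prefix and organization name are hypothetical.

```python
import re

# Hypothetical allow-list of module source prefixes for this example.
APPROVED_SOURCE_PREFIXES = ("app.terraform.io/example-org/",)

# "Pinned" is taken here to mean an exact x.y.z version, not a range.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def module_hygiene_findings(plan: dict) -> list[str]:
    """Report root-module calls with unapproved sources or unpinned versions."""
    calls = (plan.get("configuration", {})
                 .get("root_module", {})
                 .get("module_calls", {}))
    findings = []
    for name, call in sorted(calls.items()):
        source = call.get("source", "")
        if not source.startswith(APPROVED_SOURCE_PREFIXES):
            findings.append(f"module.{name}: source '{source}' is not approved")
        version = call.get("version_constraint", "")
        if not EXACT_VERSION.match(version):
            findings.append(f"module.{name}: version is not pinned exactly")
    return findings


# Hand-written sample input, not real plan output.
sample_plan = {"configuration": {"root_module": {"module_calls": {
    "network": {"source": "app.terraform.io/example-org/network/aws",
                "version_constraint": "2.3.1"},
    "queue": {"source": "app.terraform.io/example-org/queue/aws",
              "version_constraint": "~> 1.0"},
    "cache": {"source": "github.com/someone/terraform-cache"},
}}}}

for finding in module_hygiene_findings(sample_plan):
    print(finding)
```

This sketch only looks at the root module and only at registry-style sources. Nested modules and Git sources pinned with a `ref` would need additional handling.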

Example tools and how they fit

Checkov

Great for broad, fast Terraform and plan scanning with built-in policies plus custom YAML/Python policies.

OPA / Conftest

Great for writing explicit deny rules in Rego and sharing bundles across repos and teams.

Sentinel

Great when Terraform run approvals need tighter policy control inside the Terraform delivery platform.

Vendor-neutral reference architecture

engineers write Terraform
pre-commit catches obvious issues
CI runs validate + scanner checks
a Terraform plan is generated
policy runs against plan output
failures create block, warn, or exception paths
approved changes deploy
runtime posture feeds back into policy and modules

Semgrep as a complementary policy layer

Semgrep is useful when policy teams want fast custom checks for Terraform, YAML, CI config, or organization-specific anti-patterns that do not justify a full policy-engine rollout yet.

Practical use cases

ban clearly dangerous Terraform resource patterns;
catch open-to-the-world ingress fragments before broader IaC scanners run;
create reviewer-readable checks for internal module conventions.

Important note

Historically, teams often used Semgrep generic pattern matching for Terraform and YAML because native support was limited. The current state is better, but the lesson still holds: Semgrep works best as a targeted custom-check layer, not as a substitute for CSPM.

Example: Conftest policy against Terraform HCL

See:

Example command:

conftest test --policy snippets/policy snippets/terraform/checkov-demo/main.tf

Example: Sentinel-style guardrail

See ../snippets/policy/sentinel-restrict-public-sg.sentinel.

Use this pattern when the organization wants to say:

this change may be reviewed, but not applied unless it satisfies policy
exceptions should be explicit, not silent

Implementation roadmap

Phase 1 — baseline

pick one scanner
pick 10 to 20 high-signal rules
publish results without blocking
document ownership

Phase 2 — narrow enforcement

block critical misconfigurations on new changes
create an exception path
define who approves exceptions
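One way to "block critical misconfigurations on new changes" without failing every existing resource is to compare current findings against a recorded baseline. The sketch below assumes scanner output has already been normalized into simple records. The `Finding` shape and the rule IDs are hypothetical and do not correspond to any scanner's format.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    rule_id: str    # hypothetical organization rule ID
    resource: str   # Terraform resource address
    severity: str   # "critical", "high", "medium", or "low"


def blocking_findings(current, baseline, blocking=frozenset({"critical"})):
    """Return findings that are both blocking-severity and absent from the baseline."""
    known = {(f.rule_id, f.resource) for f in baseline}
    return [
        f for f in current
        if f.severity in blocking and (f.rule_id, f.resource) not in known
    ]


baseline = [
    Finding("ORG_001", "aws_ebs_volume.legacy", "critical"),
]
current = [
    Finding("ORG_001", "aws_ebs_volume.legacy", "critical"),   # pre-existing
    Finding("ORG_001", "aws_ebs_volume.new_data", "critical"), # introduced now
    Finding("ORG_007", "aws_instance.web", "medium"),          # not blocking
]

for finding in blocking_findings(current, baseline):
    print(f"BLOCK {finding.rule_id} on {finding.resource}")
```

Pre-existing findings still need an owner and a remediation plan. The baseline only keeps them from blocking unrelated changes while that work happens.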

Phase 3 — module-first hardening

move repeated remediations into shared modules
reduce the need for repeated scanner-only fixes

Phase 4 — policy repository

store custom policies in a dedicated repo
version and review policy changes like application code
publish policy bundles to CI consumers

Phase 5 — business linkage

connect policy results to service inventory, exposure, and release evidence
report adoption, exception debt, and risk reduction to leadership
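Adoption and exception debt can be reported with very little machinery. The sketch below shows one possible set of definitions: adoption as the share of inventoried services with enforcement enabled, and exception debt as counts of active and expired exceptions plus the age of the oldest one. The record shape and metric names are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    service: str
    rule_id: str
    approved_on: date
    expires_on: date


def exception_debt(exceptions: list, today: date) -> dict:
    """Summarize open exceptions: active, expired, and age of the oldest."""
    active = sum(1 for e in exceptions if e.expires_on >= today)
    expired = len(exceptions) - active
    oldest = max(((today - e.approved_on).days for e in exceptions), default=0)
    return {"active": active, "expired": expired, "oldest_age_days": oldest}


def adoption_rate(enforced_services: set, all_services: set) -> float:
    """Share of inventoried services that have policy enforcement enabled."""
    if not all_services:
        return 0.0
    return len(enforced_services & all_services) / len(all_services)


# Hypothetical records for illustration.
exceptions = [
    PolicyException("billing", "ORG_001", date(2025, 1, 1), date(2025, 7, 1)),
    PolicyException("search", "ORG_004", date(2024, 12, 1), date(2025, 3, 1)),
]

print(exception_debt(exceptions, today=date(2025, 6, 1)))
print(adoption_rate({"billing", "search"}, {"billing", "search", "auth", "mail"}))
```

Expired exceptions that are still in the register are a useful signal on their own: either the fix shipped and nobody closed the record, or the exception is being relied on without approval.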

A simple workflow model

Author
Developer writes or changes Terraform.

Validate
terraform fmt, terraform validate, basic lint, scanner pass.

Evaluate policy
Check source + plan against organization rules.

Triage
Choose pass, fail, or exception.

Approve
Evidence and approvals recorded.

Deploy
Approved plan moves forward.

Measure
Metrics roll up by app, team, environment, and release train.
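The triage step in this model can be sketched as a small decision function. This is one possible reading of "pass, fail, or exception": a change passes with no violations, fails if any violation lacks a current exception, and otherwise proceeds on the exception path. The data shapes and rule IDs are hypothetical.

```python
from datetime import date
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    EXCEPTION = "exception"


def triage(violations, exceptions, today):
    """Decide the outcome for one change.

    violations: list of (rule_id, resource_address) tuples
    exceptions: dict mapping (rule_id, resource_address) -> expiry date
    """
    if not violations:
        return Outcome.PASS
    for violation in violations:
        expiry = exceptions.get(violation)
        if expiry is None or expiry < today:
            return Outcome.FAIL  # no exception on record, or it has lapsed
    return Outcome.EXCEPTION     # every violation is covered by a live exception


today = date(2025, 6, 1)
exceptions = {("ORG_001", "aws_ebs_volume.legacy"): date(2025, 9, 30)}

print(triage([], exceptions, today))
print(triage([("ORG_001", "aws_ebs_volume.legacy")], exceptions, today))
print(triage([("ORG_002", "aws_security_group.web")], exceptions, today))
```

Recording the outcome alongside the plan gives the Approve step its evidence, and expiring exceptions keeps the exception path from turning into a silent bypass.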

What makes rollout fail

Security as Policy often fails when:

the first version blocks everything
policies are copied from a benchmark with no business context
nobody owns exceptions
the team measures only finding count
there is no shared module strategy

What good adoption looks like

Good adoption looks like this:

fewer recurring Terraform mistakes
clearer release expectations
smaller review burden for security teams
better consistency across services
more useful leadership reporting

Quick start plan

If a team asks, "Where do we begin?" the shortest sensible answer is:

choose Checkov as a baseline scanner
add a small policy pack for the highest-risk controls
define pass / warn / exception states
run in CI on merge requests
review exception debt monthly
feed the learnings back into shared Terraform modules

Footer note: The best policy is the one engineers can understand, reviewers can defend, and CI can enforce consistently.

Testing infrastructure as code

Policy checks are only one part of IaC confidence. A healthy workflow tests infrastructure at several levels.

Test layer	Goal	Typical tools
Syntax and semantics	catch broken HCL or template logic early	`terraform fmt`, `terraform validate`, `tflint`
Misconfiguration scan	catch known risky patterns	Checkov, Terrascan, KICS, Trivy config
Policy test	enforce organization rules	OPA / Conftest, Sentinel, platform policy
Module test	prove a module behaves as expected	Terratest, Kitchen-Terraform, Testinfra-style flows
Integration / environment test	confirm the deployed stack really behaves correctly	ephemeral environments, smoke tests, cloud API assertions

Example local flow

terraform fmt -check
terraform validate
tflint
checkov -d .
conftest test --policy ../snippets/policy .
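If the team prefers one entry point over five separate commands, the flow above can be wrapped in a small runner that stops at the first failure. This is a sketch under the assumption that the tools are installed and on `PATH`. The function name and the use of exit code 127 for a missing tool are choices made for this example.

```python
import subprocess

# The same commands as the local flow above, in the same order.
LOCAL_FLOW = [
    ["terraform", "fmt", "-check"],
    ["terraform", "validate"],
    ["tflint"],
    ["checkov", "-d", "."],
    ["conftest", "test", "--policy", "../snippets/policy", "."],
]


def run_flow(commands, cwd="."):
    """Run commands in order and stop at the first non-zero exit code.

    Returns a list of (command string, exit code) for the steps that ran.
    """
    results = []
    for command in commands:
        try:
            code = subprocess.run(command, cwd=cwd).returncode
        except FileNotFoundError:
            code = 127  # conventional shell status for "command not found"
        results.append((" ".join(command), code))
        if code != 0:
            break  # later checks are skipped once one step fails
    return results
```

Calling `run_flow(LOCAL_FLOW)` from a Terraform working directory returns the steps that ran and their exit codes, so the same function can back a pre-commit hook and a CI job.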

Why this matters

Many teams stop at “the plan succeeds.”
A successful plan proves very little about:

network exposure;
encryption settings;
IAM trust boundaries;
module regressions;
cloud-provider behavior after apply.

Tool-selection guidance

If you are building an IaC control stack from scratch:

start with one formatter/linter and one misconfiguration scanner;
add policy enforcement only after the team understands the baseline findings;
add module or integration tests where the blast radius is high;
prefer simple, repeatable commands over a giant, fragile validation tower.

Policy without testing is brittle

Security policy works best when it is only one layer in the IaC test stack.

A practical order is:

format and validate
static misconfiguration checks
policy checks
plan-aware review
selective environment tests
drift-aware follow-up

For a fuller progression model, see 🧱 Infrastructure as Code Maturity and Test Strategy.