Product Security Knowledge Base

🛡️ Security as Policy for Terraform and Infrastructure as Code

Figure: Security as Policy loop

Intro: Teams often understand Infrastructure as Code first: infrastructure is described, reviewed, versioned, and deployed as code. Security as Policy applies the same idea to guardrails. Instead of relying on tribal memory and manual review alone, the team writes security expectations in testable rules and executes them continuously.

What this page includes

  • what Security as Policy means in a Terraform-centered workflow
  • the difference between ad hoc scanner usage and real policy enforcement
  • where tools like Checkov, OPA/Conftest, Sentinel, and platform policy engines fit
  • an implementation roadmap for product and platform teams

Working assumptions

  • policies should clarify engineering decisions, not replace engineering judgment
  • the goal is not maximum blocking; the goal is repeatable, explainable security guardrails

A simple definition

Security as Policy means expressing security expectations as versioned, reviewable, executable rules that run in the same workflow as code changes.

In practice, that often means:

  • a rule says production storage must be encrypted
  • a rule says public ingress to admin ports is forbidden
  • a rule says certain tags, owners, or logging controls are mandatory
  • a rule says only approved module sources may be used
  • a rule says production deployments require evidence and approvals

The key shift is this:

  • manual review asks humans to remember the rule
  • policy as code makes the rule explicit and testable
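To make the shift concrete, here is a minimal sketch of one rule from the list above ("public ingress to admin ports is forbidden") written as a testable function. It assumes the input is the JSON form of a Terraform plan (as produced by `terraform show -json`) and that the AWS provider's `aws_security_group` resource is in use. The function name, the choice of admin ports, and the message format are illustrative assumptions, not part of any tool.

```python
# Assumption: SSH and RDP are the admin ports this organization cares about.
ADMIN_PORTS = {22, 3389}
PUBLIC_CIDRS = {"0.0.0.0/0", "::/0"}


def public_admin_ingress(plan: dict) -> list[str]:
    """Return one message per security group that opens an admin port to the internet."""
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group":
            continue
        after = (rc.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            cidrs = set(rule.get("cidr_blocks") or []) | set(rule.get("ipv6_cidr_blocks") or [])
            if not cidrs & PUBLIC_CIDRS:
                continue
            if str(rule.get("protocol")) == "-1":
                # Protocol "-1" means all protocols and all ports.
                low, high = 0, 65535
            else:
                low, high = rule.get("from_port") or 0, rule.get("to_port") or 0
            exposed = sorted(p for p in ADMIN_PORTS if low <= p <= high)
            if exposed:
                violations.append(f"{rc['address']}: admin ports {exposed} open to the internet")
    return violations


# Hypothetical plan fragment used only to exercise the rule.
example_plan = {
    "resource_changes": [
        {
            "address": "aws_security_group.admin",
            "type": "aws_security_group",
            "change": {
                "after": {
                    "ingress": [
                        {"from_port": 22, "to_port": 22, "protocol": "tcp",
                         "cidr_blocks": ["0.0.0.0/0"]}
                    ]
                }
            },
        }
    ]
}

for message in public_admin_ingress(example_plan):
    print(message)
```

Because the rule is an ordinary function, it can be unit-tested, reviewed in a merge request, and versioned like any other code.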

Security as Policy vs Infrastructure as Code

Infrastructure as Code answers:

"What infrastructure do we want to create?"

Security as Policy answers:

"What security conditions must be true before that infrastructure is allowed to ship?"

They work best together.

Pattern                | Main concern                    | Output
Infrastructure as Code | desired infrastructure state    | Terraform modules, variables, plans
Security as Policy     | acceptable security constraints | policy rules, deny conditions, approval logic
Runtime posture        | what is actually deployed       | cloud findings, drift, asset state

What Security as Policy is not

It is not:

  • just running one scanner in CI
  • a long spreadsheet of standards nobody tests
  • a giant deny list that developers bypass
  • a compliance theater layer disconnected from risk

A policy approach becomes real when:

  • rules are versioned
  • rules are visible to engineers
  • rules are tied to stages in the workflow
  • exceptions are documented
  • outputs affect approvals or release decisions

Where policy can run

A mature Terraform workflow often applies policy in several places:

  1. authoring time
    pre-commit, IDE hooks, local scan

  2. merge request time
    CI validation, plan review, blocking checks

  3. platform approval time
    policy evaluation on Terraform plan or run

  4. post-deploy time
    drift, CSPM, runtime posture, evidence collection

Main implementation patterns

1. Scanner-enforced policy

Examples: Checkov, Trivy config, KICS, Terrascan

Best when you want:

  • quick coverage
  • fast onboarding
  • built-in rule packs
  • lightweight CI integration

2. Rego-based policy

Examples: OPA, Conftest, policy bundles

Best when you want:

  • reusable logic
  • policy repos
  • multi-tool consistency
  • clear "deny" and "warn" semantics across Terraform, Kubernetes, APIs, and CI
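The "deny" and "warn" split can be illustrated without a policy engine. The sketch below mirrors the idea in Python rather than Rego: each rule is a small function, and rules are registered as either blocking or advisory. The resource shape (`address`, `type`, `values`) and the rule names are simplified assumptions for illustration, not the input format of OPA or Conftest.

```python
from typing import Callable, Iterable

# A rule takes one resource description and yields zero or more messages.
Rule = Callable[[dict], Iterable[str]]


def deny_unencrypted_volume(resource: dict) -> Iterable[str]:
    values = resource.get("values") or {}
    if resource.get("type") == "aws_ebs_volume" and not values.get("encrypted"):
        yield f"{resource['address']}: EBS volume must be encrypted"


def warn_missing_owner_tag(resource: dict) -> Iterable[str]:
    tags = (resource.get("values") or {}).get("tags") or {}
    if "owner" not in tags:
        yield f"{resource['address']}: missing 'owner' tag"


# Blocking rules and advisory rules are kept in separate registries.
DENY: list[Rule] = [deny_unencrypted_volume]
WARN: list[Rule] = [warn_missing_owner_tag]


def evaluate(resources: list[dict]) -> dict[str, list[str]]:
    """Run every rule against every resource and group messages by severity."""
    result: dict[str, list[str]] = {"deny": [], "warn": []}
    for resource in resources:
        for rule in DENY:
            result["deny"].extend(rule(resource))
        for rule in WARN:
            result["warn"].extend(rule(resource))
    return result
```

A CI job could then fail only when `result["deny"]` is non-empty and surface `result["warn"]` as review comments, which keeps the same semantics regardless of what kind of document is being evaluated.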

3. Platform-native policy

Examples: Sentinel in HCP Terraform or Terraform Enterprise, and similar platform guardrails

Best when you want:

  • policy tightly bound to run approvals
  • organization-wide governance for Terraform runs
  • policy decisions close to the plan/apply flow

Why this approach is valuable

For the engineering team

  • fewer late surprises in release windows
  • clearer expectations
  • reusable safe defaults
  • less argument about obvious guardrails

For the product security team

  • repeatable evidence
  • lower review load
  • better consistency across services and teams
  • a path from standards text to enforceable controls

For leadership

  • more predictable release governance
  • measurable control adoption
  • fewer ad hoc exceptions
  • easier audit and board communication

Common policy categories for Terraform

Start with rules that matter operationally.

Identity and access

  • no wildcard IAM in production
  • trust policies must be scoped
  • workload identity over static credentials

Network exposure

  • no internet-exposed admin ports
  • private subnets for stateful services where required
  • approved ingress sources only

Data protection

  • encryption at rest
  • versioning and retention on critical storage
  • logging for sensitive storage and network boundaries

Observability and audit

  • flow logs, CloudTrail, or equivalent
  • log retention standards
  • required tags for ownership and environment

Platform hygiene

  • approved module sources
  • pinned module versions
  • required provider constraints
  • no disallowed services in restricted environments
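As one worked example from these categories, the sketch below checks the "required tags for ownership and environment" rule against Terraform plan JSON (`terraform show -json`). The required tag keys and the function name are assumptions chosen for illustration. Tags whose values are only known after apply would need separate handling and are not covered here.

```python
# Assumption: these two tag keys are the organization's minimum.
REQUIRED_TAGS = frozenset({"owner", "environment"})


def missing_required_tags(plan: dict, required=REQUIRED_TAGS) -> dict[str, list[str]]:
    """Map resource address to the sorted list of required tag keys it lacks.

    Only resources being created or updated are checked, and only those
    whose planned values include a "tags" attribute at all.
    """
    missing = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change") or {}
        if not {"create", "update"} & set(change.get("actions") or []):
            continue
        after = change.get("after") or {}
        if "tags" not in after:
            # Resource type does not expose tags; nothing to enforce.
            continue
        gap = sorted(set(required) - set(after.get("tags") or {}))
        if gap:
            missing[rc["address"]] = gap
    return missing
```

Returning a mapping instead of a pass/fail flag keeps the output explainable: the engineer sees exactly which resource is missing which tag.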

Example tools and how they fit

Checkov

Great for broad, fast Terraform and plan scanning with built-in policies plus custom YAML/Python policies.

OPA / Conftest

Great for writing explicit deny rules in Rego and sharing bundles across repos and teams.

Sentinel

Great when Terraform run approvals need tighter policy control inside the Terraform delivery platform.

Vendor-neutral reference architecture

  1. engineers write Terraform
  2. pre-commit catches obvious issues
  3. CI runs validate + scanner checks
  4. a Terraform plan is generated
  5. policy runs against plan output
  6. failures create block, warn, or exception paths
  7. approved changes deploy
  8. runtime posture feeds back into policy and modules

Semgrep as a complementary policy layer

Semgrep is useful when policy teams want fast custom checks for Terraform, YAML, CI config, or organization-specific anti-patterns that do not justify a full policy-engine rollout yet.

Practical use cases

  • ban clearly dangerous Terraform resource patterns;
  • catch open-to-the-world ingress fragments before broader IaC scanners run;
  • create reviewer-readable checks for internal module conventions.

Important note

Historically, teams often used Semgrep generic pattern matching for Terraform and YAML because native support was limited. The current state is better, but the lesson still holds: Semgrep works best as a targeted custom-check layer, not as fake CSPM.

Example: Conftest policy against Terraform HCL

Example command:

conftest test --policy snippets/policy snippets/terraform/checkov-demo/main.tf

Example: Sentinel-style guardrail

See ../snippets/policy/sentinel-restrict-public-sg.sentinel.

Use this pattern when the organization wants to say:

  • this change may be reviewed, but not applied unless it satisfies policy
  • exceptions should be explicit, not silent

Implementation roadmap

Phase 1: baseline

  • pick one scanner
  • pick 10 to 20 high-signal rules
  • publish results without blocking
  • document ownership

Phase 2: narrow enforcement

  • block critical misconfigurations on new changes
  • create an exception path
  • define who approves exceptions

Phase 3: module-first hardening

  • move repeated remediations into shared modules
  • reduce the need for repeated scanner-only fixes

Phase 4: policy repository

  • store custom policies in a dedicated repo
  • version and review policy changes like application code
  • publish policy bundles to CI consumers

Phase 5: business linkage

  • connect policy results to service inventory, exposure, and release evidence
  • report adoption, exception debt, and risk reduction to leadership
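Reporting of this kind can start as a very small aggregation over policy results. The sketch below assumes each CI run emits a record with `team`, `environment`, and `outcome` fields; those field names and the outcome labels are illustrative assumptions, not the output of any particular tool.

```python
from collections import Counter, defaultdict


def roll_up(results):
    """Count outcomes per (team, environment) pair."""
    summary = defaultdict(Counter)
    for record in results:
        summary[(record["team"], record["environment"])][record["outcome"]] += 1
    return summary


def exception_rate(outcomes: Counter) -> float:
    """Share of evaluations that only passed because of an exception."""
    total = sum(outcomes.values())
    return outcomes["exception"] / total if total else 0.0


# Hypothetical records, for illustration only.
results = [
    {"team": "payments", "environment": "prod", "outcome": "pass"},
    {"team": "payments", "environment": "prod", "outcome": "exception"},
    {"team": "search", "environment": "dev", "outcome": "block"},
]

for key, outcomes in roll_up(results).items():
    print(key, dict(outcomes), exception_rate(outcomes))
```

Tracking the exception rate per team and environment over time gives leadership a trend rather than a raw finding count.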

A simple workflow model

Author
Developer writes or changes Terraform.

Validate
terraform fmt, terraform validate, basic lint, scanner pass.

Evaluate policy
Check source + plan against organization rules.

Triage
Choose pass, fail, or exception.

Approve
Evidence and approvals recorded.

Deploy
Approved plan moves forward.

Measure
Metrics roll up by app, team, environment, and release train.

What makes rollout fail

Security as Policy often fails when:

  • the first version blocks everything
  • policies are copied from a benchmark with no business context
  • nobody owns exceptions
  • the team measures only finding count
  • there is no shared module strategy

What good adoption looks like

Good adoption looks like this:

  • fewer recurring Terraform mistakes
  • clearer release expectations
  • smaller review burden for security teams
  • better consistency across services
  • more useful leadership reporting

Quick start plan

If a team asks, "Where do we begin?" the shortest sensible answer is:

  1. choose Checkov as a baseline scanner
  2. add a small policy pack for the highest-risk controls
  3. define pass / warn / block / exception states
  4. run in CI on merge requests
  5. review exception debt monthly
  6. feed the learnings back into shared Terraform modules

Footer note: The best policy is the one engineers can understand, reviewers can defend, and CI can enforce consistently.

Testing infrastructure as code

Policy checks are only one part of IaC confidence. A healthy workflow tests infrastructure at several levels.

Test layer                     | Goal                                                 | Typical tools
Syntax and semantics           | catch broken HCL or template logic early             | terraform fmt, terraform validate, tflint
Misconfiguration scan          | catch known risky patterns                           | Checkov, Terrascan, KICS, Trivy config
Policy test                    | enforce organization rules                           | OPA / Conftest, Sentinel, platform policy
Module test                    | prove a module behaves as expected                   | Terratest, Kitchen-Terraform, Testinfra-style flows
Integration / environment test | confirm the deployed stack really behaves correctly  | ephemeral environments, smoke tests, cloud API assertions

Example local flow

terraform fmt -check
terraform init -backend=false   # validate needs an initialized working directory
terraform validate
tflint
checkov -d .
conftest test --policy ../snippets/policy .
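If the team prefers a single entry point over remembering the order, the local flow can be wrapped in a small fail-fast runner. This is a sketch under assumptions: the tools are already installed and on `PATH`, the command list is run from the Terraform directory, and an `init` step is included because `terraform validate` expects an initialized working directory. The `run` parameter exists only so the runner can be tested without the real tools.

```python
import shlex
import subprocess

# Commands from the local flow, cheapest checks first.
CHECKS = [
    "terraform fmt -check",
    "terraform init -backend=false",
    "terraform validate",
    "tflint",
    "checkov -d .",
    "conftest test --policy ../snippets/policy .",
]


def run_checks(commands=CHECKS, run=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns the failing command string, or None if every command exits 0.
    """
    for command in commands:
        completed = run(shlex.split(command))
        if completed.returncode != 0:
            return command
    return None
```

A pre-commit hook or Makefile target could call `run_checks()` and exit non-zero when it returns a command, so the first broken layer is reported before slower checks run.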

Why this matters

A lot of teams stop at "the plan compiles."
That proves very little about:

  • network exposure;
  • encryption settings;
  • IAM trust boundaries;
  • module regressions;
  • cloud-provider behavior after apply.

Tool-selection guidance

If you are building an IaC control stack from scratch:

  • start with one formatter/linter and one misconfiguration scanner;
  • add policy enforcement only after the team understands the baseline findings;
  • add module or integration tests where the blast radius is high;
  • prefer simple, repeatable commands over a giant, fragile validation tower.

Policy without testing is brittle

Security policy works best when it is only one layer in the IaC test stack.

A practical order is:

  1. format and validate
  2. static misconfiguration checks
  3. policy checks
  4. plan-aware review
  5. selective environment tests
  6. drift-aware follow-up

For a fuller progression model, see 🧱 Infrastructure as Code Maturity and Test Strategy.