DevSecOps Engineer Configuration and Platform Review Drills

Purpose: train the exact skill interviewers look for in DevSecOps loops: quickly reading YAML, HCL, Dockerfiles, pipeline files, IAM/RBAC objects, and secret-delivery patterns, then reasoning out loud about risk, blast radius, and safe remediation.

Companion page

For a worked example of a small post-build scanner image that bundles multiple tools into one reproducible container, see Custom Security Toolbox Container for Post-Build Tests.

The five-pass method

When a config review task appears on screen, use this sequence:

  1. Workload identity and privilege pass. Who can do what? Which service account, role, runner, token, or instance profile is in play?
  2. Network and exposure pass. What is internet-facing? What can talk east-west? Where is ingress/egress unrestricted?
  3. Secret and trust pass. Where do secrets come from? How long do they live? Are they logged, baked into images, or mounted too broadly?
  4. Integrity and release pass. Can untrusted code influence the pipeline, artifact, image, deployment target, or production credentials?
  5. Observability and rollback pass. If this breaks or is abused, do we have audit logs, immutable evidence, and a safe rollback path?

High-signal verbal frame

"I would review this in terms of identity, privilege, trust boundaries, and blast radius. The first risky area I see is [control failure]. The likely impact is [impact]. I would remediate by [precise change], and then prove the fix through [test/evidence/alerting]."

Drill 1 - Kubernetes NetworkPolicy missing default deny

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector: {}

What is wrong

  1. It selects only pods labeled app=frontend; every other pod in the namespace is unselected and therefore still fully open.
  2. namespaceSelector: {} matches every namespace, so any pod in the cluster may connect, and no ports are restricted.
  3. No egress controls are defined.
  4. No namespace-wide default-deny policy exists, so the policy signals an intent to segment without actually enforcing it.

Better answer

"This policy looks restrictive at first glance, but namespaceSelector: {} is basically 'from anywhere in the cluster'. Also, because there is no namespace default deny, the control is narrow and easy to misunderstand. I would add explicit default-deny ingress and egress policies first, then create narrow allow rules."

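A minimal sketch of the remediation described in that answer: one namespace-wide default deny, then one narrow allow rule. The source namespace name (ingress-nginx), the policy names, and port 8080 are illustrative assumptions and do not come from the original config.

```yaml
# Default deny for every pod in the namespace, in both directions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # hypothetical name
  namespace: payments
spec:
  podSelector: {}                 # empty selector = all pods in this namespace
  policyTypes:
    - Ingress
    - Egress
---
# Narrow allow: frontend accepts traffic from one named namespace only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-from-ingress   # hypothetical name
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed source namespace
      ports:
        - protocol: TCP
          port: 8080              # assumed application port
```

A default-deny egress policy also blocks DNS lookups, so a real rollout would typically need an explicit egress allow for the cluster DNS service before workloads behave normally.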

Drill 2 - Dockerfile running as root with build leaks

FROM node:20
WORKDIR /app
COPY . .
RUN npm install
ENV NPM_TOKEN=hardcoded-token
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]

What is wrong

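  1. No USER instruction, so the container runs as root by default.
  2. COPY . . copies the entire build context before dependencies are installed, which risks pulling local secrets (.env, .npmrc, .git) into the image and invalidates the dependency cache on every source change.
  3. NPM_TOKEN is hardcoded in an ENV instruction, so it persists in the image configuration and layer history and is present in the runtime environment.
  4. No multi-stage build, so build tooling and dev dependencies ship in the runtime image.
  5. npm install is used instead of npm ci, so the install is not guaranteed to match the lockfile exactly.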

Strong answer

"This is a classic build-time and runtime trust problem. The image bakes a secret into layers, runs as root, and does not separate build and runtime stages. Even if the app is fine, compromise of the container gives an attacker more leverage than necessary."


Drill 3 - Terraform public storage bucket

resource "aws_s3_bucket" "logs" {
  bucket = "company-prod-logs"
}

resource "aws_s3_bucket_public_access_block" "logs" {
  bucket = aws_s3_bucket.logs.id
  block_public_acls       = false
  block_public_policy     = false
  ignore_public_acls      = false
  restrict_public_buckets = false
}

What is wrong

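  1. All four public access block settings are explicitly set to false, removing the guardrail that would stop public ACLs or bucket policies from taking effect.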
  2. No encryption configuration shown.
  3. No versioning or object lock / immutability for logs.
  4. No bucket policy restricting access to specific principals or VPC endpoints.

Interview language

"For a log bucket, the problem is not just confidentiality. It is also integrity and evidence retention. A public or weakly controlled log bucket makes both exposure and tampering more likely. I would turn on the public access blocks, server-side encryption, versioning, and, where evidence needs to be preserved, object lock or another immutable logging pattern."


Drill 4 - Kubernetes RBAC overreach

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ci-admin
subjects:
- kind: ServiceAccount
  name: ci-runner
  namespace: cicd
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

What is wrong

  1. CI runner gets cluster-admin.
  2. Compromise of a single pipeline token can become full-cluster takeover.
  3. Violates least privilege and separation of duties.
  4. Makes it hard to reason about safe approval paths.

Better answer

"The main issue is blast radius. A runner should almost never have universal cluster-admin because pipeline code is partly attacker-influenced by design. I would split build, deploy, and cluster-admin break-glass functions, bind only namespace-scoped rights where possible, and enforce environment-specific approval before any production deployment credential is usable."

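A minimal sketch of what "namespace-scoped rights" could look like for the same service account. The target namespace (payments), the role name (ci-deployer), and the exact resources and verbs are assumptions chosen for illustration; a real deploy role would be derived from what the pipeline actually touches.

```yaml
# Role limited to one namespace and one resource type.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-deployer               # hypothetical role name
  namespace: payments             # assumed target namespace
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch", "update"]
---
# RoleBinding (not ClusterRoleBinding) ties the runner to that Role only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: ci-runner
    namespace: cicd
roleRef:
  kind: Role
  name: ci-deployer
  apiGroup: rbac.authorization.k8s.io
```

One way to prove the fix is to ask the API server directly, for example `kubectl auth can-i create clusterrolebindings --as=system:serviceaccount:cicd:ci-runner`, which should answer "no" once the cluster-admin binding is removed.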

Drill 5 - GitHub Actions unsafe trigger + token scope

name: deploy
on:
  pull_request_target:
    branches: [main]
jobs:
  deploy:
    runs-on: self-hosted
    permissions: write-all
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

What is wrong

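  1. pull_request_target runs in the context of the base repository, with access to its secrets; combined with a self-hosted runner, that is dangerous if the workflow ever checks out or executes PR-controlled code.
  2. permissions: write-all gives the job token far more scope than a deploy needs.
  3. Long-lived cloud keys are injected as environment variables instead of short-lived, federated credentials.
  4. Deploying from a PR-triggered workflow is a high-risk pattern.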

Better answer

"This is a trust-boundary failure between untrusted contribution context and privileged execution. My concern is not only the action syntax. The bigger design flaw is that code associated with a PR can influence a privileged self-hosted runner with broad token scope and static cloud secrets."

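A hedged sketch of a safer shape for the same workflow: deploy only after merge, request the minimum token scope, and exchange an OIDC token for short-lived AWS credentials instead of storing static keys. The role ARN, region, hosted runner, and the "production" environment with required reviewers are all placeholders or assumptions, not details from the original.

```yaml
name: deploy
on:
  push:
    branches: [main]              # runs only on reviewed, merged code
permissions:
  contents: read                  # read-only repository access
  id-token: write                 # allows the job to request an OIDC token
jobs:
  deploy:
    runs-on: ubuntu-latest        # assumption: hosted runner; a self-hosted one should be ephemeral and isolated
    environment: production       # assumption: environment configured with required reviewers
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/deploy-role   # placeholder ARN
          aws-region: us-east-1                                         # placeholder region
      - run: ./deploy.sh
```

Pinning third-party actions to a full commit SHA rather than a tag would tighten the integrity story further, at the cost of more maintenance.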

Drill 6 - Vault secret delivery too broad

path "kv/data/prod/*" {
  capabilities = ["read", "list"]
}

What is wrong

  1. Wildcard read across all production secrets.
  2. list on prod secret paths can leak naming patterns and discovery information.
  3. No workload segmentation by service or namespace.
  4. No indication of short-lived, identity-bound access.

Better answer

"This is functionally a broad production read grant. In a real environment I would expect path scoping by application, environment, and possibly cluster or namespace, with short-lived auth tied to workload identity rather than a broadly shared policy."


Drill 7 - Kafka client TLS disabled

listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://kafka-0.kafka:9092
ssl.client.auth=none
authorizer.class.name=
super.users=User:admin

What is wrong

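  1. The only listener is PLAINTEXT, so there is no TLS in transit.
  2. No client authentication: ssl.client.auth=none, and no SASL listener is shown.
  3. authorizer.class.name is empty, so no ACLs are enforced.
  4. super.users lists a single admin principal, which suggests weak privilege separation; with no authorizer configured, the setting has no effect anyway.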

Better answer

"For Kafka I review transport security, authn, and authz together. Here all three are weak: plaintext transport, no client auth, no authorizer. In practice that means lateral movement or service impersonation becomes much easier."


Drill 8 - Misleading Kubernetes securityContext

securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: true
  privileged: false
  capabilities:
    add: ["SYS_ADMIN"]

What is wrong

  1. runAsNonRoot: true looks safe, but allowPrivilegeEscalation: true still lets a process gain more privileges than its parent, for example through setuid binaries.
  2. SYS_ADMIN is extremely broad and often described as "the new root".
  3. No seccomp, AppArmor, or SELinux settings are shown.
  4. The combination misleads reviewers who stop at one green-looking field.

Better answer

"This is a good interview trap because one setting looks compliant while the rest undermines it. I would call out that SYS_ADMIN plus privilege escalation can dramatically weaken isolation even without privileged: true."

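For contrast, a minimal sketch of a container-level securityContext where every field points in the same direction. It assumes the workload needs no extra capabilities and can tolerate a read-only root filesystem; both assumptions would need to be checked against the real application.

```yaml
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false   # sets no_new_privs for the container process
  privileged: false
  readOnlyRootFilesystem: true      # assumption: the app writes only to mounted volumes
  capabilities:
    drop: ["ALL"]                   # start from zero; add back a single capability only if proven necessary
  seccompProfile:
    type: RuntimeDefault            # use the container runtime's default seccomp filter
```

Apart from readOnlyRootFilesystem, which is an extra hardening step, these settings broadly line up with the "restricted" Pod Security Standard.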

Drill 9 - Redis exposed with weak controls

bind 0.0.0.0
protected-mode no
requirepass changeme
appendonly no

What is wrong

  1. Exposed on all interfaces.
  2. Protected mode disabled.
  3. Weak / default-like password.
  4. Append-only persistence is disabled, and nothing documents the intended durability or recovery posture.
  5. No TLS, ACL, or network scoping shown.

Better answer

"Redis risk here is usually a combination of network exposure, weak auth, and operational assumptions that it is 'internal only'. I would first limit network reachability, re-enable protected mode or equivalent deployment restrictions, move to ACL and TLS where supported, and ensure operational backups and audit expectations are addressed."


Drill 10 - Release gate bypass through manual override

release_gate:
  allow_manual_override: true
  required_approvals: 1
  security_scan_required: false
  provenance_required: false
  emergency_bypass_group: ["eng-managers", "release-engineers"]

What is wrong

  1. Manual override is enabled with no stated conditions or limits.
  2. No security scan requirement.
  3. No provenance requirement.
  4. The emergency bypass group is broad and not clearly restricted to break-glass use.
  5. A single required approval is weak for production-sensitive changes.

Better answer

"This is not a syntax bug. It is a governance weakness. The release path trusts human override more than evidence. I would require artifact provenance, explicit security gates for scoped change classes, tighter emergency controls, and immutable logging for every override."

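A sketch of how the same gate might look once tightened, reusing only the keys shown in the drill. The schema itself is whatever tool the drill imagines, and the group name release-break-glass is hypothetical.

```yaml
release_gate:
  allow_manual_override: false          # overrides go through the break-glass path only
  required_approvals: 2                 # assumption: two reviewers for production changes
  security_scan_required: true
  provenance_required: true
  emergency_bypass_group: ["release-break-glass"]   # hypothetical small, audited group
```

The immutable logging of overrides mentioned in the answer has no corresponding key in this schema, so it would have to be enforced and evidenced outside this file.
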
Quick cheat sheet - what to inspect first in common files

| File type | First things to inspect | Why it matters |
| --- | --- | --- |
| Kubernetes YAML | serviceAccount, securityContext, hostPath, hostNetwork, privileged, capabilities, image tag, NetworkPolicy, RBAC | Shows workload identity, escape surface, and cluster blast radius |
| Dockerfile | FROM, COPY . ., secrets in ENV, USER, package install, shell use, multi-stage | Reveals trust in build context, runtime privilege, and secret leakage |
| Terraform | public exposure, IAM wildcards, encryption, logging, deletion protection, state handling | Shows cloud blast radius and evidence integrity |
| CI/CD YAML | triggers, permissions, runner type, environment approvals, secret use, artifact signing | Reveals where untrusted code can influence privileged paths |
| Vault / IAM policy | wildcards, list/read breadth, environment scope, duration, subject identity | Reveals whether secrets and privileges are segmented or shared |