๐Ÿ› ๏ธ Product Security Incident Response Playbooks

Intro: These playbooks are intentionally product-facing. They assume that engineering and platform teams need clear first actions before a broader incident command structure has fully formed.

What this page includes

  • high-value scenarios for product and platform teams
  • what to do in the first 15 minutes
  • evidence to collect before containment destroys context
  • how to feed postmortem lessons back into code, policy, and infrastructure

Operating principles

  • preserve evidence before you erase context;
  • isolate the smallest useful scope first;
  • revoke or rotate any compromised identity quickly;
  • record exact artifacts, digests, and config state involved;
  • end every incident with at least one preventive, one detective, and one process improvement.
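The last principle is easy to check mechanically. The following is a minimal sketch, assuming follow-up actions are tracked as simple records with a `kind` field; the record shape, the field names, and the helper `missing_improvement_kinds` are hypothetical and not part of any specific tracker.

```python
# Categories taken from the principle above; the record shape is an assumption.
REQUIRED_KINDS = ("preventive", "detective", "process")


def missing_improvement_kinds(actions):
    """Return the required improvement kinds that no follow-up action covers."""
    covered = {action["kind"] for action in actions}
    return [kind for kind in REQUIRED_KINDS if kind not in covered]


# Hypothetical follow-ups from a leaked-token incident.
follow_ups = [
    {"kind": "preventive", "title": "Scope CI tokens to a single repository"},
    {"kind": "detective", "title": "Alert on token use from an unknown runner"},
]

print(missing_improvement_kinds(follow_ups))  # ['process']
```

A check like this could run when an incident ticket is closed, so that an incident without a process improvement is flagged rather than silently accepted.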

Scenario pack

Leaked Git or CI token

First 15 minutes

  • disable or revoke the token;
  • identify repo, runner, registry, and environment scope;
  • review pipeline, artifact, and image activity since suspected exposure.

Preserve

  • token creation and last-use audit trail;
  • related pipeline logs;
  • artifact digests and tag changes;
  • approval and deploy events.
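Reviewing activity "since suspected exposure" usually comes down to filtering an audit export by token and time. The sketch below assumes a hypothetical export format (`time`, `token_id`, `action`); real field names depend on your Git or CI platform, and `events_since_exposure` is an illustrative helper, not a platform API.

```python
from datetime import datetime, timezone


def events_since_exposure(events, token_id, exposed_at):
    """Return events made with token_id at or after exposed_at, oldest first."""
    hits = [
        event
        for event in events
        if event["token_id"] == token_id
        and datetime.fromisoformat(event["time"]) >= exposed_at
    ]
    return sorted(hits, key=lambda event: datetime.fromisoformat(event["time"]))


# Hypothetical audit export; timestamps are ISO 8601 with an explicit UTC offset.
audit_events = [
    {"time": "2026-01-10T09:05:00+00:00", "token_id": "tok-ci-1", "action": "image.push"},
    {"time": "2026-01-10T08:00:00+00:00", "token_id": "tok-ci-1", "action": "pipeline.run"},
    {"time": "2026-01-10T09:01:00+00:00", "token_id": "tok-ci-1", "action": "tag.update"},
    {"time": "2026-01-10T09:02:00+00:00", "token_id": "tok-other", "action": "pipeline.run"},
]
suspected_exposure = datetime(2026, 1, 10, 9, 0, tzinfo=timezone.utc)

for event in events_since_exposure(audit_events, "tok-ci-1", suspected_exposure):
    print(event["time"], event["action"])
```

Choosing the earliest plausible exposure time keeps the review window conservative; widening it later is cheaper than missing an early push or tag change.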

Compromised runner or build agent

First 15 minutes

  • quarantine the runner;
  • stop scheduling new jobs to it;
  • identify accessible secrets, workspaces, artifacts, and cloud credentials.

Preserve

  • runner config;
  • mounted volumes and credentials;
  • recent job list and logs;
  • outbound network destinations;
  • registry or artifact-store access.
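Once the recent job list is preserved, it can be used to work out which secrets the runner could have read. This is a rough sketch under the assumption that each job record lists the secret names injected into it; the record shape, the runner and secret names, and `secrets_exposed_to_runner` are all hypothetical.

```python
def secrets_exposed_to_runner(jobs, runner_id):
    """Map each secret name to the IDs of jobs on runner_id that could read it."""
    exposed = {}
    for job in jobs:
        if job["runner"] != runner_id:
            continue
        for name in job["secrets"]:
            exposed.setdefault(name, []).append(job["id"])
    return exposed


# Hypothetical job history exported from the CI system.
recent_jobs = [
    {"id": 101, "runner": "runner-7", "secrets": ["REGISTRY_TOKEN", "CLOUD_DEPLOY_KEY"]},
    {"id": 102, "runner": "runner-3", "secrets": ["NPM_TOKEN"]},
    {"id": 103, "runner": "runner-7", "secrets": ["REGISTRY_TOKEN"]},
]

for name, job_ids in sorted(secrets_exposed_to_runner(recent_jobs, "runner-7").items()):
    print(name, job_ids)
```

The resulting map doubles as a rotation list: every key is a secret to rotate, and the job IDs point at the logs worth reading first.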

Exposed bucket or artifact store

First 15 minutes

  • remove public access or overly permissive sharing;
  • determine whether only data was exposed or also code, manifests, or credentials;
  • preserve access logs before retention or rotation removes them.
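A quick first pass at the "only data, or also code, manifests, or credentials" question is to bucket the exposed object keys by name. The patterns below are illustrative assumptions, not a complete rule set, and a name-based pass does not replace inspecting object contents.

```python
from fnmatch import fnmatchcase

# Illustrative patterns only (an assumption); tune them to what your store holds.
# Order matters: the first matching category wins, so credentials are checked first.
CATEGORY_PATTERNS = {
    "credentials": ["*.pem", "*.key", "*.env", "*credentials*"],
    "manifests": ["*.yaml", "*.yml", "*.tf"],
    "code": ["*.py", "*.go", "*.js", "*.jar"],
}


def classify_object(key):
    """Return the first category whose pattern matches the key, else 'data'."""
    name = key.lower()
    for category, patterns in CATEGORY_PATTERNS.items():
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return category
    return "data"


def summarize_exposure(keys):
    """Group exposed object keys by category."""
    summary = {}
    for key in keys:
        summary.setdefault(classify_object(key), []).append(key)
    return summary


# Hypothetical listing of the exposed bucket.
exposed_keys = [
    "exports/customers.csv",
    "deploy/prod.env",
    "k8s/deployment.yaml",
    "src/app.py",
]
print(summarize_exposure(exposed_keys))
```

Anything that lands in the credentials bucket should trigger rotation regardless of whether the access logs show a read.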

Suspicious pod or workload behavior

First 15 minutes

  • decide whether the incident is confined to the runtime or also involves identity and cloud compromise;
  • isolate the workload or node according to platform guidance;
  • capture Pod spec, image digest, namespace, service account, and recent events.
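The capture step can be reduced to a small, repeatable summary. The sketch below assumes the Pod has already been exported as JSON shaped like the Kubernetes Pod object (for example, saved from the API before isolation); the sample values and the helper `summarize_pod` are hypothetical, and recent events would need to be collected separately.

```python
import json


def summarize_pod(pod):
    """Pull the identity and image fields worth preserving out of a Pod object."""
    meta = pod.get("metadata", {})
    spec = pod.get("spec", {})
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return {
        "name": meta.get("name"),
        "namespace": meta.get("namespace"),
        "service_account": spec.get("serviceAccountName"),
        "node": spec.get("nodeName"),
        # imageID carries the resolved digest, unlike a mutable image tag.
        "images": {status["name"]: status.get("imageID") for status in statuses},
    }


# Hypothetical Pod export; the digest is a placeholder, not a real image.
pod_json = json.dumps({
    "metadata": {"name": "api-5d9c", "namespace": "shop"},
    "spec": {"serviceAccountName": "api-sa", "nodeName": "node-12"},
    "status": {"containerStatuses": [
        {"name": "api", "imageID": "registry.example.com/shop/api@sha256:" + "0" * 64},
    ]},
})

print(json.dumps(summarize_pod(json.loads(pod_json)), indent=2))
```

Keeping the full exported JSON alongside the summary matters: the summary is for the timeline, while the full spec is what a later review will need.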

Public API key or webhook secret exposure

First 15 minutes

  • rotate the secret;
  • review the abuse window, from the earliest possible exposure until the rotation took effect;
  • identify replay, scraping, mass-callback, or unusual egress patterns.
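Replay is one of the simpler patterns to look for: the same signed payload arriving more than once between exposure and rotation. This is a minimal sketch, assuming request logs carry a timestamp and the request signature; the log shape and `replayed_in_window` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def replayed_in_window(requests, window_start, window_end):
    """Count signatures seen more than once between window_start and window_end."""
    in_window = [
        request
        for request in requests
        if window_start <= datetime.fromisoformat(request["time"]) <= window_end
    ]
    counts = Counter(request["signature"] for request in in_window)
    return {signature: count for signature, count in counts.items() if count > 1}


# Hypothetical webhook request log.
request_log = [
    {"time": "2026-02-01T10:00:00+00:00", "signature": "sig-a"},
    {"time": "2026-02-01T10:00:05+00:00", "signature": "sig-a"},
    {"time": "2026-02-01T10:01:00+00:00", "signature": "sig-b"},
    {"time": "2026-02-01T12:30:00+00:00", "signature": "sig-b"},
]
exposed_at = datetime(2026, 2, 1, 9, 0, tzinfo=timezone.utc)
rotated_at = datetime(2026, 2, 1, 11, 0, tzinfo=timezone.utc)

print(replayed_in_window(request_log, exposed_at, rotated_at))  # {'sig-a': 2}
```

Legitimate retries can also repeat a signature, so a hit is a lead for manual review rather than proof of abuse.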

Evidence classes that are worth collecting almost every time

  • audit logs;
  • workload and cloud identity used;
  • exact artifact and image digests;
  • deployed configuration or manifest state;
  • tenant, customer, or data scope impacted;
  • timeline of approvals, deploys, and runtime behavior.
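Recording exact digests is easiest when it is scripted. The following sketch builds a small evidence manifest with SHA-256 digests using only the Python standard library; the manifest layout and the sample file are assumptions for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Return one record per file with its path, size in bytes, and SHA-256 digest."""
    return [
        {"path": str(path), "size": Path(path).stat().st_size, "sha256": sha256_file(path)}
        for path in paths
    ]


# Demonstration against a throwaway file; in practice, pass the collected evidence paths.
with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "deploy-manifest.yaml"
    sample.write_bytes(b"replicas: 2\n")
    manifest = build_manifest([sample])
    print(json.dumps(manifest, indent=2))
```

Storing the manifest separately from the evidence itself makes it possible to show later that the files were not altered after collection.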

Containment to eradication to codification

One of the most valuable habits in modern response is to turn operational fixes into durable engineering controls.

Stage         Example outcome
containment   quarantine runner, revoke token, isolate node
eradication   remove malicious image, delete persistence, rebuild workload
recovery      redeploy trusted artifacts, validate authz and telemetry
codify        add pipeline gate, policy rule, secret-handling change, or IaC control

Postmortem questions that improve the platform

  • which signal should have detected this sooner?
  • which approval or trust boundary failed?
  • which credential or artifact path was too broad?
  • what can be encoded in IaC, admission policy, runner design, or image promotion rules so this is harder next time?

Author attribution: Ivan Piskunov, 2026 - Educational and defensive-engineering use.