🛡️ Containment and Eradication Automation Lab — SOAR, Remediations, and Postmortem-to-IaC Feedback

Intro: Detection is only half of the job. This lab teaches the next step: how to automate safe containment, preserve evidence, and then push the durable fix back into infrastructure-as-code, policies, or platform baselines.

What this page includes

how to structure a containment-and-eradication lab;
safe automation patterns with SOAR and native cloud tools;
examples using AWS Systems Manager and Cortex XSOAR-style playbooks;
how to convert a one-time incident into a codified control improvement.

Learning goal

A good automation lab teaches four skills:

know when to automate;
know what must stay manual;
preserve evidence before destroying context;
feed the durable fix back into code and policy.

Safe automation rules

Never automate a destructive response action until you can answer:

what evidence do we lose?
can the action break production?
who approves the containment?
how do we restore safely?
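
The four questions above can be turned into a pre-flight gate that automation must pass before it acts. The following is a minimal sketch, not a standard implementation: the ContainmentPlan fields and the preflight_blockers function are hypothetical names chosen for this lab, and it assumes approval is mandatory only when production impact is possible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ContainmentPlan:
    """Answers to the pre-flight questions (all field names are hypothetical)."""
    action: str
    evidence_preserved: bool          # has at-risk evidence been captured?
    may_break_production: bool        # could the action cause an outage?
    approver: Optional[str]           # who signed off, if anyone
    restore_procedure: Optional[str]  # how to undo the action


def preflight_blockers(plan: ContainmentPlan) -> List[str]:
    """Return the reasons automation must not proceed; an empty list means go."""
    blockers = []
    if not plan.evidence_preserved:
        blockers.append("evidence not preserved")
    if plan.may_break_production and not plan.approver:
        blockers.append("production impact without approver")
    if not plan.restore_procedure:
        blockers.append("no restore procedure")
    return blockers
```

A playbook would call preflight_blockers first and fall back to a manual task whenever the list is non-empty.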

Good starter scenarios

Scenario	Why it is good for a lab
suspicious EC2 or VM egress	teaches reversible containment
compromised IAM principal or service account	teaches identity-focused isolation
public security group or NSG	teaches quick risk reduction and codified fix
container or pod compromise	teaches runtime evidence + kill/replace discipline
leaked secret or token	teaches rotation, blast-radius review, and pipeline follow-up

AWS-native starter pattern

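AWS Systems Manager Automation ships managed runbooks that cover common containment actions, so the lab can start without custom code. Parameter sets differ between runbooks, so check each document's required parameters (for example with aws ssm describe-document --name followed by the runbook name) before running it. The commands below show only a minimal parameter set.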

Example: contain an EC2 instance

aws ssm start-automation-execution \
  --document-name AWSSupport-ContainEC2Instance \
  --parameters InstanceId=i-0123456789abcdef0

Example: quarantine an EC2 instance

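# Assumption: this runbook also expects an IsolationSecurityGroup parameter.
# Confirm with: aws ssm describe-document --name AWS-QuarantineEC2Instance
# ISOLATION_SG_PLACEHOLDER is a placeholder, not a real value.
aws ssm start-automation-execution \
  --document-name AWS-QuarantineEC2Instance \
  --parameters "InstanceId=i-0123456789abcdef0,IsolationSecurityGroup=ISOLATION_SG_PLACEHOLDER"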

Example: contain an IAM principal

aws ssm start-automation-execution \
  --document-name AWSSupport-ContainIAMPrincipal \
  --parameters IAMResourceArn=arn:aws:iam::123456789012:user/suspicious-user
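
The same runbooks can be started from a script, which makes it easier to capture the execution ID for the incident record. The sketch below assumes an object that behaves like boto3.client("ssm") is passed in. The function name run_containment_runbook and its polling defaults are hypothetical, and the set of terminal statuses is a subset chosen for illustration.

```python
import time


def run_containment_runbook(ssm, document_name, parameters,
                            poll_seconds=5, max_polls=60):
    """Start an SSM Automation runbook and poll until it reaches a terminal status.

    `ssm` is expected to behave like boto3.client("ssm").
    """
    started = ssm.start_automation_execution(
        DocumentName=document_name,
        # The API expects each parameter value as a list of strings.
        Parameters={key: [value] for key, value in parameters.items()},
    )
    execution_id = started["AutomationExecutionId"]

    # Subset of terminal statuses; extend as needed for your runbooks.
    terminal = {"Success", "Failed", "TimedOut", "Cancelled"}
    status = "Pending"
    for _ in range(max_polls):
        result = ssm.get_automation_execution(AutomationExecutionId=execution_id)
        status = result["AutomationExecution"]["AutomationExecutionStatus"]
        if status in terminal:
            break
        time.sleep(poll_seconds)

    # Return both values so the caller can store them in the incident record.
    return {"execution_id": execution_id, "status": status}
```

Passing the client in, rather than creating it inside the function, keeps the sketch testable in a lab without AWS credentials.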

Example custom SSM automation skeleton

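schemaVersion: '0.3'
description: Record the instance's current security groups, then isolate it
assumeRole: '{{ AutomationAssumeRole }}'
parameters:
  AutomationAssumeRole:
    type: String
    description: IAM role that Automation assumes to run the steps
  InstanceId:
    type: String
    description: ID of the instance to isolate
  ContainmentSecurityGroupId:
    type: String
    description: ID of a pre-created security group with no inbound or outbound rules
mainSteps:
  # Capture the current security groups first so the change can be reversed.
  - name: captureInstanceMetadata
    action: aws:executeAwsApi
    inputs:
      Service: ec2
      Api: DescribeInstances
      InstanceIds:
        - '{{ InstanceId }}'
    outputs:
      - Name: OriginalSecurityGroups
        Selector: '$.Reservations[0].Instances[0].SecurityGroups'
        Type: MapList
  # Replace all attached security groups with the containment group.
  - name: quarantineInstance
    action: aws:executeAwsApi
    inputs:
      Service: ec2
      Api: ModifyInstanceAttribute
      InstanceId: '{{ InstanceId }}'
      Groups:
        - '{{ ContainmentSecurityGroupId }}'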
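
Swapping security groups is only reversible if the original groups are written down first. The sketch below shows one way to pull them out of a DescribeInstances response and turn them into a restore note for the incident record. The function names and the shape of the note are hypothetical; the response keys follow the EC2 DescribeInstances output.

```python
def original_security_groups(describe_instances_response, instance_id):
    """Return the security group IDs attached to instance_id, or [] if not found."""
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("InstanceId") == instance_id:
                return [g["GroupId"] for g in instance.get("SecurityGroups", [])]
    return []


def build_restore_note(instance_id, group_ids):
    """Produce a small record to attach to the incident (hypothetical format)."""
    return {
        "instance_id": instance_id,
        "restore_security_groups": sorted(group_ids),
    }
```

Storing the note before the containment step runs means the restore path does not depend on anyone remembering the previous configuration.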

Cortex XSOAR-style lab pattern

A SOAR playbook is useful when your response needs:

ticketing;
analyst approval steps;
enrichment;
branching logic;
human-in-the-loop escalation.

Minimal playbook design idea

ingest incident;
enrich asset, identity, and tenant context;
decide whether manual approval is required;
perform reversible containment;
collect evidence references;
open remediation ticket;
trigger postmortem checklist.

Example pseudo-playbook logic

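if incident.type == "suspicious-ec2":
  collect cloudtrail + vpcflow + guardduty context
  ask analyst for approval
  if approved:
    run AWSSupport-ContainEC2Instance
    record the automation execution ID as evidence
    create jira ticket for source-of-truth fix
    notify platform owner
  else:
    record the decision and notify platform owner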
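
For learners who want to run the logic outside a SOAR product, the pseudo-playbook can be sketched in Python with each integration passed in as a callable. This is an illustration only, not XSOAR's playbook format: the function name, the callables, and the audit-trail strings are all hypothetical, and it assumes containment runs only after the analyst approves.

```python
def run_suspicious_ec2_playbook(incident, collect_context, request_approval,
                                contain, open_ticket, notify):
    """Drive the playbook steps through injected callables; return an audit trail."""
    trail = []
    if incident.get("type") != "suspicious-ec2":
        return trail  # not our scenario, nothing to do

    # Enrichment happens before any change is made to the asset.
    context = collect_context(incident)
    trail.append("context_collected")

    # Human-in-the-loop gate: stop here if the analyst declines.
    if not request_approval(incident, context):
        trail.append("containment_declined")
        notify(incident, "containment declined by analyst")
        return trail

    contain(incident["instance_id"])
    trail.append("contained")
    open_ticket(incident, "source-of-truth fix")
    trail.append("ticket_opened")
    notify(incident, "instance contained")
    trail.append("owner_notified")
    return trail
```

Injecting the callables lets the lab swap real integrations for stubs, so the branching can be exercised without touching a live account.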

Postmortem-to-IaC feedback loop

This is the most important part of the lab.

After containment, force the learner to answer:

what allowed the incident path?
what guardrail should have stopped it earlier?
what Terraform / Helm / policy / CI rule must change?
what new detection should exist next time?
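
One way to force those answers is to make them required fields of the postmortem record, so the record cannot be closed while any is blank. The sketch below assumes a simple in-memory record; the class and field names are hypothetical and map one-to-one to the four questions above.

```python
from dataclasses import dataclass, fields


@dataclass
class PostmortemAnswers:
    """One field per postmortem question (names are hypothetical)."""
    incident_path: str = ""     # what allowed the incident path?
    missed_guardrail: str = ""  # what guardrail should have stopped it earlier?
    code_change: str = ""       # what Terraform / Helm / policy / CI rule must change?
    new_detection: str = ""     # what new detection should exist next time?

    def unanswered(self):
        """Return the names of fields that are still empty or whitespace."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def can_close(self):
        """The postmortem may close only when every question has an answer."""
        return not self.unanswered()
```

In a lab, the instructor can review unanswered() to see which part of the feedback loop the learner skipped.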

Example: translate incident into code change

Incident: public admin port on a security group.

Temporary action: close the rule.

Durable action: update Terraform module defaults.

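# Closed by default: no admin ingress unless CIDRs are passed explicitly.
variable "allowed_admin_cidrs" {
  type    = list(string)
  default = []

  # Reject the exact mistake that caused the incident.
  validation {
    condition     = !contains(var.allowed_admin_cidrs, "0.0.0.0/0")
    error_message = "Admin ingress must not be open to 0.0.0.0/0."
  }
}

# Created only when at least one admin CIDR is supplied.
resource "aws_security_group_rule" "admin_ingress" {
  count             = length(var.allowed_admin_cidrs) > 0 ? 1 : 0
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = var.allowed_admin_cidrs
  security_group_id = aws_security_group.app.id
}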

Example validation after the fix

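# Static checks on the Terraform code, then review the planned change.
checkov -d infra/
terraform plan
# After apply, confirm the live account no longer exposes admin ports to the internet.
# Check IDs vary across Prowler versions; list them with: prowler aws --list-checks
prowler aws --checks ec2_securitygroup_allow_ingress_from_internet_to_tcp_port_22 ec2_securitygroup_allow_ingress_from_internet_to_tcp_port_3389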
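
The same validation can be codified as a small CI check on the plan itself, so the incident cannot recur through a later change. The sketch below reads the JSON produced by terraform show -json and flags standalone aws_security_group_rule resources that would expose port 22 or 3389 to the internet. It is an illustration under stated assumptions: the function name and constants are hypothetical, and it does not inspect inline ingress blocks on aws_security_group or rules that use protocol "-1".

```python
import json

ADMIN_PORTS = (22, 3389)
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_admin_rules(plan_json_text):
    """Return addresses of planned ingress rules exposing admin ports to the internet."""
    plan = json.loads(plan_json_text)
    findings = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group_rule":
            continue
        # "after" is null for resources that the plan deletes.
        after = (change.get("change") or {}).get("after") or {}
        if after.get("type") != "ingress":
            continue
        cidrs = set(after.get("cidr_blocks") or [])
        cidrs |= set(after.get("ipv6_cidr_blocks") or [])
        from_port, to_port = after.get("from_port"), after.get("to_port")
        if from_port is None or to_port is None:
            continue
        covers_admin = any(from_port <= port <= to_port for port in ADMIN_PORTS)
        if covers_admin and cidrs & OPEN_CIDRS:
            findings.append(change.get("address"))
    return findings
```

A pipeline could produce the input with terraform plan -out plan.out followed by terraform show -json plan.out, and fail the build when the returned list is not empty.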

Web UI how-to ideas for the lab

AWS console path

Systems Manager → Automation.
Search for a containment or quarantine runbook.
Review the required parameters and the role the automation will assume.
Execute against the target resource.
Save the execution ID into the incident record.

XSOAR path

open incident;
run enrichment tasks;
request manual approval if production risk exists;
execute containment task;
attach evidence and open engineering remediation ticket;
link the postmortem record.

Common mistakes

automating containment without an approval boundary for production systems;
deleting or rebuilding assets before preserving evidence;
fixing only the live resource and not the template or module;
stopping at detection and never building the response automation.

Author attribution: Ivan Piskunov, 2026. Educational and defensive-engineering use.