🛡️ Containment and Eradication Automation Lab: SOAR, Remediations, and Postmortem-to-IaC Feedback
Intro: Detection is only half of the job. This lab teaches the next step: how to automate safe containment, preserve evidence, and then push the durable fix back into infrastructure-as-code, policies, or platform baselines.
What this page includes
- how to structure a containment-and-eradication lab;
- safe automation patterns with SOAR and native cloud tools;
- examples using AWS Systems Manager and Cortex XSOAR style playbooks;
- how to convert a one-time incident into a codified control improvement.
Learning goal
A good automation lab teaches four skills:
- know when to automate;
- know what must stay manual;
- preserve evidence before destroying context;
- feed the durable fix back into code and policy.
Safe automation rules
Never automate a destructive response until you can answer:
- what evidence do we lose?
- can the action break production?
- who approves the containment?
- how do we restore safely?
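The four questions above can be turned into a pre-flight gate that automation must pass before it acts. The following is a minimal sketch in plain Python; `ContainmentPlan`, its field names, and `blocking_reasons` are hypothetical names chosen for this lab, not part of any SOAR product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ContainmentPlan:
    """Answers to the four pre-flight questions for one automated action (hypothetical model)."""
    evidence_at_risk: str        # what evidence the action could destroy ("none" is a valid answer)
    can_break_production: bool   # could the action cause an outage?
    approver: str                # who approves the containment
    restore_procedure: str       # how the action is rolled back
    evidence_preserved: bool = False
    approved: bool = False


def blocking_reasons(plan: ContainmentPlan) -> list[str]:
    """Return the reasons automation must not run yet; an empty list means go."""
    reasons = []
    if not plan.evidence_at_risk.strip():
        reasons.append("evidence impact not assessed")
    elif not plan.evidence_preserved:
        reasons.append("evidence not preserved yet")
    if not plan.approver.strip():
        reasons.append("no approver named")
    elif plan.can_break_production and not plan.approved:
        reasons.append("production risk requires approval")
    if not plan.restore_procedure.strip():
        reasons.append("no restore procedure")
    return reasons
```

A playbook step could call `blocking_reasons(plan)` and refuse to continue while the list is non-empty, which keeps the approval boundary explicit instead of implied.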
Good starter scenarios
| Scenario | Why it is good for a lab |
|---|---|
| suspicious EC2 or VM egress | teaches reversible containment |
| compromised IAM principal or service account | teaches identity-focused isolation |
| public security group or NSG | teaches quick risk reduction and codified fix |
| container or pod compromise | teaches runtime evidence + kill/replace discipline |
| leaked secret or token | teaches rotation, blast-radius review, and pipeline follow-up |
AWS-native starter pattern
AWS Systems Manager already ships useful containment runbooks.
Example: contain an EC2 instance
aws ssm start-automation-execution \
  --document-name AWSSupport-ContainEC2Instance \
  --parameters InstanceId=i-0123456789abcdef0
Example: quarantine an EC2 instance
aws ssm start-automation-execution \
  --document-name AWS-QuarantineEC2Instance \
  --parameters InstanceId=i-0123456789abcdef0
Example: contain an IAM principal
aws ssm start-automation-execution \
  --document-name AWSSupport-ContainIAMPrincipal \
  --parameters IAMResourceArn=arn:aws:iam::123456789012:user/suspicious-user
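The same calls can be made from a script, which makes it easier to store the execution ID in the incident record. This is a rough sketch, assuming a boto3 SSM client is passed in; `build_automation_request` and `start_containment` are hypothetical helper names. The one API detail it relies on is that `StartAutomationExecution` takes each parameter value as a list of strings.

```python
from __future__ import annotations


def build_automation_request(document_name: str, parameters: dict[str, str]) -> dict:
    """Build keyword arguments for SSM StartAutomationExecution.

    The API expects every parameter value as a list of strings,
    so single values are wrapped here.
    """
    if not document_name:
        raise ValueError("document_name is required")
    return {
        "DocumentName": document_name,
        "Parameters": {key: [value] for key, value in parameters.items()},
    }


def start_containment(ssm_client, document_name: str, parameters: dict[str, str]) -> str:
    """Start the runbook and return the execution ID for the incident record."""
    request = build_automation_request(document_name, parameters)
    response = ssm_client.start_automation_execution(**request)
    return response["AutomationExecutionId"]
```

With boto3 installed, `ssm_client` would typically be `boto3.client("ssm")`. Check each runbook's required parameters before running it, since the examples above pass only the target resource.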
Example custom SSM automation skeleton
schemaVersion: '0.3'
description: Isolate instance and snapshot evidence metadata
assumeRole: '{{ AutomationAssumeRole }}'
parameters:
  AutomationAssumeRole:
    type: String
  InstanceId:
    type: String
mainSteps:
  # Record the instance state, including its current security groups, before changing anything.
  - name: captureInstanceMetadata
    action: aws:executeAwsApi
    inputs:
      Service: ec2
      Api: DescribeInstances
      InstanceIds:
        - '{{ InstanceId }}'
  # Replace every attached security group with the containment group.
  - name: quarantineInstance
    action: aws:executeAwsApi
    inputs:
      Service: ec2
      Api: ModifyInstanceAttribute
      InstanceId: '{{ InstanceId }}'
      Groups:
        - sg-containment  # placeholder: use the ID of a real security group with no open rules
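Because the quarantine step overwrites the instance's security groups, the containment is only reversible if the original group IDs are saved first. The sketch below shows one way to pull them out of a `DescribeInstances` response; `original_security_groups` is a hypothetical helper, and the response layout (`Reservations` → `Instances` → `SecurityGroups` → `GroupId`) follows the EC2 API.

```python
from __future__ import annotations


def original_security_groups(describe_instances_response: dict, instance_id: str) -> list[str]:
    """Return the security group IDs attached to the instance before containment.

    Store the result in the incident record so the groups can be
    reattached when the instance is released.
    """
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("InstanceId") == instance_id:
                return [group["GroupId"] for group in instance.get("SecurityGroups", [])]
    raise LookupError(f"instance {instance_id} not found in response")
```

Raising on a missing instance is a deliberate choice here: containment should stop rather than proceed without a restore record.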
Cortex XSOAR style lab pattern
A SOAR playbook is useful when your response needs:
- ticketing;
- analyst approval steps;
- enrichment;
- branching logic;
- human-in-the-loop escalation.
Minimal playbook design idea
- ingest incident;
- enrich asset, identity, and tenant context;
- ask: manual approval required?
- perform reversible containment;
- collect evidence references;
- open remediation ticket;
- trigger postmortem checklist.
Example pseudo-playbook logic
if incident.type == suspicious-ec2:
    collect CloudTrail + VPC Flow Logs + GuardDuty context
    ask analyst for approval
    if approved:
        run AWSSupport-ContainEC2Instance
        create Jira ticket for source-of-truth fix
    notify platform owner
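The pseudo-playbook can also be prototyped outside a SOAR platform to test the ordering and the approval branch. The following is a minimal sketch in plain Python, not XSOAR playbook syntax; every callable argument (`collect_context`, `request_approval`, `contain`, `open_ticket`, `notify`) is a hypothetical hook that the lab would wire to real integrations.

```python
def run_suspicious_ec2_playbook(incident, *, collect_context, request_approval,
                                contain, open_ticket, notify):
    """Run the suspicious-EC2 flow: enrich, gate on approval, contain, ticket, notify."""
    if incident.get("type") != "suspicious-ec2":
        return {"status": "skipped"}

    # Collect evidence context first, before any change is made to the resource.
    evidence = collect_context(incident)

    # Human-in-the-loop gate: nothing is contained without analyst approval.
    if not request_approval(incident, evidence):
        notify(incident, "containment not approved; escalated to analyst")
        return {"status": "escalated", "evidence": evidence}

    # Reversible containment; keep the execution ID for the incident record.
    execution_id = contain(incident["instance_id"])

    # The ticket tracks the durable, source-of-truth fix.
    ticket = open_ticket(incident, execution_id)
    notify(incident, f"instance contained; remediation ticket {ticket}")
    return {
        "status": "contained",
        "evidence": evidence,
        "execution_id": execution_id,
        "ticket": ticket,
    }
```

Passing the integrations in as arguments keeps the flow testable with stubs, so learners can exercise the denial branch without touching a real account.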
Postmortem-to-IaC feedback loop
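This is the most important part of the lab: containment fixes one live resource, while the codified fix keeps the same incident path from reopening.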
After containment, force the learner to answer:
- what allowed the incident path?
- what guardrail should have stopped it earlier?
- what Terraform / Helm / policy / CI rule must change?
- what new detection should exist next time?
Example: translate incident into code change
Incident: public admin port on a security group.
Temporary action: close the rule.
Durable action: update Terraform module defaults.
variable "allowed_admin_cidrs" {
  type    = list(string)
  default = []
}

resource "aws_security_group_rule" "admin_ingress" {
  count             = length(var.allowed_admin_cidrs) > 0 ? 1 : 0
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = var.allowed_admin_cidrs
  security_group_id = aws_security_group.app.id
}
Example validation after the fix
checkov -d infra/
terraform plan
prowler aws --check ec2_securitygroup_allow_ingress_from_internet_to_tcp_port_22 ec2_securitygroup_allow_ingress_from_internet_to_tcp_port_3389
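The lab can also add its own CI guardrail that reads the plan directly. Below is a rough sketch that scans the JSON form of a Terraform plan (produced with `terraform plan -out tfplan` followed by `terraform show -json tfplan`) for ingress rules that expose admin ports to the internet. `open_admin_rules` and `ADMIN_PORTS` are hypothetical names, and the check covers only standalone `aws_security_group_rule` resources with IPv4 CIDRs, not inline `ingress` blocks or IPv6.

```python
from __future__ import annotations

ADMIN_PORTS = (22, 3389)  # assumption: SSH and RDP are the admin ports of interest


def open_admin_rules(plan: dict) -> list[str]:
    """Return addresses of planned ingress rules that open an admin port to 0.0.0.0/0."""
    findings = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group_rule":
            continue
        # "after" is null for resources that the plan deletes.
        after = (change.get("change") or {}).get("after") or {}
        if after.get("type") != "ingress":
            continue
        if "0.0.0.0/0" not in (after.get("cidr_blocks") or []):
            continue
        all_traffic = str(after.get("protocol")) == "-1"
        low, high = after.get("from_port"), after.get("to_port")
        in_range = (
            low is not None
            and high is not None
            and any(low <= port <= high for port in ADMIN_PORTS)
        )
        if all_traffic or in_range:
            findings.append(change["address"])
    return findings
```

A pipeline step could load the plan with `json.load` and fail the build when the returned list is non-empty, which turns the postmortem finding into a standing control.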
Web UI how-to ideas for the lab
AWS console path
- Systems Manager → Automation.
- Search for containment or quarantine runbook.
- Review required parameters and assume role.
- Execute against the target resource.
- Save execution ID into the incident record.
XSOAR path
- open incident;
- run enrichment tasks;
- request manual approval if production risk exists;
- execute containment task;
- attach evidence and open engineering remediation ticket;
- link the postmortem record.
Common mistakes
- automating containment without an approval boundary for production systems;
- deleting or rebuilding assets before preserving evidence;
- fixing only the live resource and not the template or module;
- stopping at detection and never building the response automation.
Cross-links
- Detection and Response
- Product Security Incident Response Playbooks
- Runtime Investigation Playbook for Kubernetes and Containers
- Cloud Compliance Scan Lab: Scan → Triage → Fix → Codify
---
Author attribution: Ivan Piskunov, 2026 - Educational and defensive-engineering use.