Creating your own Chaos Monkey with AWS Systems Manager Automation

Chaos Engineering on AWS

12 min read · Jun 15, 2020

I’d like to express my gratitude to my colleagues and friends Jason Byrne and Matt Fitzgerald for their valuable feedback.

In a recent post, I explained how to use AWS Systems Manager (SSM) Run Command to inject failures on EC2 instances. SSM Run Command is well suited to executing custom scripts on EC2 instances, especially scripts that inject latency or blackouts on the network interface or exhaust resources such as CPU, memory, and I/O.
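As a reminder of what that looks like in practice, here is a minimal sketch of a latency injection sent through Run Command with boto3. It is an illustration under stated assumptions, not the exact script from the earlier post: the helper names, the `eth0` interface, and the default delay and duration are all hypothetical, and the target instance is assumed to run Linux with `tc` available and the SSM agent installed.

```python
def build_latency_command(instance_ids, interface="eth0", delay_ms=200, duration_s=60):
    """Build the keyword arguments for an SSM SendCommand call.

    The interface name and default values are assumptions for illustration.
    """
    commands = [
        # Add artificial delay to all outgoing packets on the interface.
        f"tc qdisc add dev {interface} root netem delay {delay_ms}ms",
        # Keep the failure in place for the duration of the experiment.
        f"sleep {duration_s}",
        # Roll back: remove the netem rule so the instance recovers.
        f"tc qdisc del dev {interface} root netem",
    ]
    return {
        "InstanceIds": list(instance_ids),
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": commands},
    }


def inject_latency(instance_ids, **kwargs):
    """Send the command; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    ssm = boto3.client("ssm")
    return ssm.send_command(**build_latency_command(instance_ids, **kwargs))
```

Keeping the request builder separate from the API call makes it possible to review exactly which commands will run on the instance before anything is sent.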

However, we need more than that. Failure injection should target resources, network characteristics and dependencies, applications, processes and services, and the infrastructure itself.

We also need to have a broad set of controls and capabilities to perform chaos experiments safely. We might want to:

  • Execute commands and scripts directly on EC2 instances.
  • Invoke Lambda functions to run custom scripts.
  • Orchestrate several failure injections to form chaos scenarios.
  • Schedule them for execution at specific times.
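To make the list above more concrete, here is a rough sketch of how the first three capabilities could be combined in a single SSM Automation document, built as a Python dictionary. The `aws:runCommand` and `aws:invokeLambdaFunction` actions and schema version `0.3` come from SSM Automation; everything else is an assumption for illustration: the function name `build_chaos_scenario`, the step names, the Lambda function being passed in, and the presence of `stress-ng` on the target instance. Scheduling is not shown.

```python
import json


def build_chaos_scenario(function_name):
    """Return an SSM Automation document (as a dict) with two failure steps.

    Step names and the stress-ng command are hypothetical examples.
    """
    return {
        "schemaVersion": "0.3",
        "description": "Example chaos scenario: CPU stress, then a custom Lambda failure.",
        "parameters": {
            "InstanceId": {
                "type": "String",
                "description": "EC2 instance to target.",
            }
        },
        "mainSteps": [
            {
                # Step 1: run a script directly on the EC2 instance.
                "name": "stressCpu",
                "action": "aws:runCommand",
                "inputs": {
                    "DocumentName": "AWS-RunShellScript",
                    "InstanceIds": ["{{ InstanceId }}"],
                    # Assumes stress-ng is installed on the instance.
                    "Parameters": {"commands": ["stress-ng --cpu 0 --timeout 60s"]},
                },
            },
            {
                # Step 2: invoke a Lambda function that runs custom failure logic.
                "name": "invokeCustomFailure",
                "action": "aws:invokeLambdaFunction",
                "inputs": {"FunctionName": function_name, "Payload": "{}"},
            },
        ],
    }


if __name__ == "__main__":
    # Print the document as JSON so it can be inspected before registering it.
    print(json.dumps(build_chaos_scenario("my-failure-function"), indent=2))
```

The resulting JSON could then be registered with the boto3 `create_document` call using `DocumentType="Automation"` and `DocumentFormat="JSON"`. Because Automation runs `mainSteps` in order by default, the document itself expresses the orchestration of the scenario.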


Written by Adrian Hornsby

I help software organizations improve resilience and achieve operational excellence | Former Principal Engineer at AWS | Follow for posts on resilience
