
The five modes of chaos engineering experimentation

Unleashing the Full Potential of Chaos Engineering: Part 1



Welcome to this two-part blog post series on chaos engineering best practices! In this series, I’ll be sharing insights and practical strategies that I’ve gathered throughout a decade of chaos engineering.

In Part 1, we’ll dive into the different modes of experimentation for chaos engineering. From the exploratory ad-hoc mode to the challenging continuous experimentation in production, I’ll walk you through each mode and highlight its unique benefits. Part 2 focuses on the essential best practices that I’ve found to be crucial for achieving success in chaos engineering.

Before we jump into Part 1, I want to extend my sincere gratitude to everyone who was involved in reviewing and improving this blog post, in particular Gorik, Rudolf, Seth, Elaine, Varun, Olga, Klara, Jason, Yilong, Alan, Laurent, and Shllomi. Your valuable input, feedback, and suggestions have been instrumental in shaping the final version. I am truly grateful for your time, effort, and commitment to making this content the best it can be. It’s been a collaborative journey, and I couldn’t have done it without your support. Thank you all for your contributions.

Now, let’s dive in by first briefly discussing the concept of chaos engineering and its significance.

Chaos Engineering — what is it and why is it important?

Chaos engineering is a systematic process that involves deliberately subjecting an application to disruptive events in a risk-mitigated way, closely monitoring its response, and implementing necessary improvements. Its purpose is to validate or challenge assumptions about the application’s ability to handle such disruptions. Instead of leaving these events to chance, chaos engineering empowers engineers to orchestrate experiments in a controlled environment, typically during periods of low traffic and with readily available engineering support for effective mitigation.
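
To make the definition a little more concrete, here is a minimal sketch in Python of what "inject a disruption, observe the response, check an assumption" can look like. It is a toy simulation, not a real chaos engineering tool. The names `flaky_dependency`, `handle_request`, and `run_experiment`, as well as the 30% injected failure rate and the 95% success threshold, are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    baseline_success_rate: float
    observed_success_rate: float
    hypothesis_holds: bool


def flaky_dependency(failure_rate: float, rng: random.Random) -> str:
    """Hypothetical downstream call; fails with the injected probability."""
    if rng.random() < failure_rate:
        raise ConnectionError("injected fault")
    return "ok"


def handle_request(failure_rate: float, rng: random.Random, retries: int = 2) -> bool:
    """Hypothetical application code under test: retries the dependency call."""
    for _ in range(retries + 1):
        try:
            flaky_dependency(failure_rate, rng)
            return True
        except ConnectionError:
            continue
    return False


def success_rate(failure_rate: float, rng: random.Random, requests: int = 1000) -> float:
    """Fraction of simulated requests that succeed at a given fault level."""
    succeeded = sum(handle_request(failure_rate, rng) for _ in range(requests))
    return succeeded / requests


def run_experiment(
    injected_failure_rate: float = 0.3,  # assumed disruption level
    min_success_rate: float = 0.95,      # assumed hypothesis threshold
    seed: int = 42,
) -> ExperimentResult:
    rng = random.Random(seed)
    # 1. Observe behavior without any disruption.
    baseline = success_rate(0.0, rng)
    # 2. Inject the disruption and observe the response.
    observed = success_rate(injected_failure_rate, rng)
    # 3. Compare the observation with the assumption being tested.
    return ExperimentResult(baseline, observed, observed >= min_success_rate)


if __name__ == "__main__":
    print(run_experiment())
```

In this sketch the assumption under test is "retries keep the success rate at or above 95% even when 30% of dependency calls fail." Raising `injected_failure_rate` far enough makes the hypothesis fail, which is the kind of finding that would feed the "implementing necessary improvements" step.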

The foundation of chaos engineering lies in following a well-defined and scientific approach. It begins with understanding the normal operating…



Written by Adrian Hornsby

Former Principal Engineer @ AWS ☁️ I break stuff .. mostly.
