Patterns for Resilient Architecture — Part 1
The story of embracing failure at scale
Part 1 — Embracing Failure at Scale
Part 2 — Avoiding Cascading Failures
Part 3 — Preventing Service Failures with Health Check
Part 4 — Caching for Resiliency
As you may know, a quote that shaped the way I think about architecture is from Werner Vogels, CTO at Amazon.com. He said:
“Failures are a given, and everything will eventually fail over time.”
Having worked on large-scale systems for more than a decade, if I could summarize in a single animation what I think about managing systems at scale and handling failure, it would be something like this. (This is a real video, and the base jumper survived that failure.)
But why? Good question, Vincent!
The art of managing systems at scale lies in embracing failure and being at the edge — pushing the limits of your system and software performance ‘almost’ to breaking point, yet still being able to recover. From the outside it looks both impressive and…