Break me if you can: A practical guide to building fault-tolerant systems
You built your system, deployed it, and rolled it out to production, but that's just the beginning. The life of your system has just started. Alex Borysov and Mykyta Protsenko outline their practical guide to building fault-tolerant systems, with code and design patterns from the REST and gRPC ecosystems, the role of the right product decisions, and the importance of a proper communication culture.
Talk Title | Break me if you can: A practical guide to building fault-tolerant systems |
Speakers | Alex Borysov (Netflix), Mykyta Protsenko (Netflix) |
Conference | O’Reilly Open Source Software Conference |
Conf Tag | Fueling innovative software |
Location | Portland, Oregon |
Date | July 15-18, 2019 |
URL | Talk Page |
Slides | Talk Slides |
You built your system, deployed it, and rolled it out to production, but that’s just the beginning. The life of your system has just started. It will grow, evolve, and wake you up in the middle of the night. Usually, this is the point where you start thinking about fault tolerance and error handling. Fault-tolerance concepts sound simple, and modern frameworks promise to solve them for you effortlessly. But what’s hiding behind the simplicity? Alex Borysov and Mykyta Protsenko take you along for a sneak peek at how to design and build truly fault-tolerant Java systems. They make it real by running failure scenarios against a live system (you’ll watch it recover in real time) and then review the recipes (with gRPC and REST examples and a number of open source tools) that you can use right away to make your code more resilient and your system more robust.