You had one job! Learning to cope with failures in a complex distributed system
What are your perceptions of NHS IT? Not great? Well, the truth is very different from what you might expect. Ed Hiley and Dan Rathbone offer an overview of the technical renaissance going on in parts of the NHS, where things are being done in a modern way.
Talk Title | You had one job! Learning to cope with failures in a complex distributed system |
Speakers | Ed Hiley (NHS Digital), Dan Rathbone (Infinity Works) |
Conference | O’Reilly Velocity Conference |
Conf Tag | Build Resilient Distributed Systems |
Location | London, United Kingdom |
Date | October 18-20, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Stuff breaks. It’s one of the fundamentals of IT, but there are some things you expect to just work. What happens when those things let you down, especially when they are part of a large distributed system? Ed Hiley and Dan Rathbone offer an overview of the technical renaissance going on in parts of the NHS, where things are being done in a modern way. Ed and Dan explore a recently launched data processing system that uses Apache Spark, Riak, and Python, and discuss the incidents they encountered along the way, where things they took for granted simply stopped behaving as expected. They dive into some of these incidents to debunk the assumptions they made and explain how they troubleshot and fixed the issues.