January 21, 2020

How I failed to build a runbook automation system and what I learned

You're going to automate all the things, reduce toil, and make your systems smarter and able to recover automatically... except sometimes you're automating a house of cards built on the backs of individual people, and a well-meaning solution can fail to address the true problems in the system. Tim Bonci offers a postmortem of a solution that was designed to solve a common operational problem but failed.

Talk Title How I failed to build a runbook automation system and what I learned
Speakers Tim Bonci (Vistaprint)
Conference O’Reilly Velocity Conference
Conf Tag Building and maintaining complex distributed systems
Location San Jose, California
Date June 11-13, 2019

Our intentions can be good, the technical ability and time may be there, and we’re going to build the thing that makes work easier and more productive, allowing everyone to apply their labor to only the most valuable tasks. Yet sometimes it’s still not enough. This is a postmortem of a solution that was designed to solve a common operational problem but failed. Tim Bonci examines the scars and aims to offer insights into finding and addressing the right problems in the right places, insights that should be broadly useful when building and deploying your own transformational processes and tools. The talk is particularly relevant to brownfield teams looking for ways to modernize their processes and to anyone who struggles with needing humans to change how they work. Tim explains why shifting human processes to computer automation does not always produce the expected results and why treating nonurgent alerts as a work queue is an anti-pattern.
