December 14, 2019


Orchestrating chaos: Applying database research in the wild


Lineage-driven fault injection (LDFI), a novel approach to automating failure testing, can greatly reduce the number of faults that must be explored via fault injection. Peter Alvaro explores LDFI’s theoretical roots in the database research notion of provenance and presents early results from the field and opportunities for near- and long-term future research.

Talk Title: Orchestrating chaos: Applying database research in the wild
Speakers: Peter Alvaro (UC Santa Cruz)
Conference: O’Reilly Velocity Conference
Conf Tag: Build Resilient Distributed Systems
Location: San Jose, California
Date: June 20-22, 2017

Large-scale distributed systems must be built to anticipate and mitigate a variety of hardware and software failures. To build confidence that fault-tolerant systems are correctly implemented, an increasing number of large-scale sites regularly run failure drills in which faults are deliberately injected into production or staging systems. While fault injection infrastructures are becoming relatively mature, existing approaches either explore the combinatorial space of potential failures randomly or rely on the “hunches” of domain experts to guide the search. Random strategies waste resources testing “uninteresting” faults, while programmer-guided approaches are only as good as the programmer’s intuition and scale only with human effort. Lineage-driven fault injection (LDFI), a novel approach to automating failure testing, uses existing tracing or logging infrastructures to work backward from good outcomes, identifying redundant computations that allow it to aggressively prune the space of faults that must be explored via fault injection. Peter Alvaro explores LDFI’s theoretical roots in the database research notion of provenance and presents early results from the field and opportunities for near- and long-term future research.
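The pruning idea can be illustrated with a toy sketch in Python. This is not the implementation presented in the talk. It assumes the lineage of a good outcome has already been extracted from traces as a list of "supports", where each support is a set of components that together were sufficient to produce the outcome. The names `candidate_faults`, `supports`, and the component labels are hypothetical. Under that assumption, a fault combination is worth injecting only if it knocks out at least one component in every known support; anything else leaves a redundant path intact and can be skipped.

```python
from itertools import combinations


def candidate_faults(supports, max_faults):
    """Return minimal fault sets that intersect every known support.

    supports   -- list of sets; each set names components that together
                  were sufficient for the good outcome (hypothetical input,
                  assumed to come from tracing or logging data)
    max_faults -- largest number of simultaneous faults to consider
    """
    components = sorted(set().union(*supports))
    found = []
    for size in range(1, max_faults + 1):
        for combo in combinations(components, size):
            faults = set(combo)
            # Skip supersets of a smaller hypothesis already found.
            if any(prev <= faults for prev in found):
                continue
            # Keep the hypothesis only if it breaks every known support.
            if all(faults & support for support in supports):
                found.append(faults)
    return found


# Toy lineage: a write succeeded via the client and either of two replicas.
supports = [
    {"client", "replica_a"},
    {"client", "replica_b"},
]
hypotheses = candidate_faults(supports, max_faults=2)
print(hypotheses)
```

In this toy case there are six fault combinations of size one or two, but only two of them (losing the client, or losing both replicas) can defeat every observed path to the good outcome, so only those two would need to be injected. A brute-force enumeration like this is chosen for readability; it would not scale to large systems, where a solver-based search would be a more realistic choice.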
