Next-generation alerting and fault detection
Alerting on your stack is the key to happy customers and a healthy business. Dieter Plaetinck explains what's wrong with the oft-touted complicated alerting methods and explores how to get in-depth coverage and address complicated alerting needs using simple techniques, with a focus on the workflow of an alerting IDE.
Talk Title | Next-generation alerting and fault detection |
Speakers | Dieter Plaetinck (raintank) |
Conference | Velocity |
Conf Tag | Build resilient systems at scale |
Location | Santa Clara, California |
Date | June 21-23, 2016 |
URL | Talk Page |
Slides | Talk Slides |
Alerting on your stack is the key to happy customers and a healthy business. In the open source monitoring community, a common belief is that solving more advanced infrastructure and software alerting use cases, and getting more accurate alerting, requires complex, often math-heavy solutions such as machine learning and stream processing. Dieter Plaetinck explains what’s wrong with these oft-touted complicated alerting methods. Instead, Dieter zooms in on how we can get dramatically better alerting and make our lives a lot easier by using familiar concepts understandable to everyone, such as basic logic and metric metadata. Dieter also discusses how to optimize the overall experience of adjusting and maintaining alerting rules over time by focusing on the concept of a powerful alerting IDE, using Bosun (from Stack Exchange) as an example, and provides solutions to problems previously deemed solvable only with machine learning.
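To make "basic logic and metric metadata" concrete, here is a minimal Python sketch of that style of rule. It is not Bosun syntax and is not taken from the talk: the `Series` class, the `find` and `error_alerts` helpers, the tag names (`env`, `host`), and the thresholds are all hypothetical, chosen only for illustration. The rule fires for a host only when it is tagged as production, it is receiving meaningful traffic, and its error ratio is too high.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    """One metric series: a name, descriptive tags (metadata), and recent values."""
    name: str
    tags: dict
    values: list = field(default_factory=list)

    def avg(self):
        # Average of the recent values; 0.0 when there is no data.
        return sum(self.values) / len(self.values) if self.values else 0.0


def find(series, name, **tags):
    """Select series by metric name and tag values (the metadata lookup)."""
    return [s for s in series
            if s.name == name and all(s.tags.get(k) == v for k, v in tags.items())]


def error_alerts(series, max_error_ratio=0.05, min_requests=10.0):
    """Return hosts whose error ratio is too high while traffic is meaningful.

    Thresholds are illustrative assumptions, not recommended values.
    """
    firing = []
    for req in find(series, "requests", env="prod"):
        host = req.tags["host"]
        errs = find(series, "errors", env="prod", host=host)
        if not errs:
            continue  # no matching error series, nothing to compare
        requests, errors = req.avg(), errs[0].avg()
        # Basic logic: enough traffic AND too many errors.
        if requests >= min_requests and errors / requests > max_error_ratio:
            firing.append(host)
    return firing


# Hypothetical sample data: one noisy prod host, one idle prod host, one dev host.
metrics = [
    Series("requests", {"env": "prod", "host": "web1"}, [100, 120, 110]),
    Series("errors", {"env": "prod", "host": "web1"}, [9, 12, 10]),
    Series("requests", {"env": "prod", "host": "web2"}, [2, 1, 3]),
    Series("errors", {"env": "prod", "host": "web2"}, [1, 1, 1]),
    Series("requests", {"env": "dev", "host": "web3"}, [100, 100, 100]),
    Series("errors", {"env": "dev", "host": "web3"}, [50, 50, 50]),
]
print(error_alerts(metrics))
```

In this sketch the idle host is skipped because a high error ratio on a handful of requests is rarely actionable, and the dev host is skipped purely through its tags. Both decisions are plain boolean conditions rather than a statistical model.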