November 15, 2019

Logs are not human scale: How to build observable systems

In the complex world of microservices and distributed systems, we need to understand what our software is doing. Traditional tools such as logs, read by humans and filtered by crude rules, aren’t powerful enough. Sam Stokes explains why we need new, better tools and why this will also require us to design our systems to give those tools better data.

Talk Title: Logs are not human scale: How to build observable systems
Speakers: Sam Stokes (Honeycomb)
Conference: O’Reilly Software Architecture Conference
Conf Tag: Engineering the Future of Software
Location: New York, New York
Date: February 26-28, 2018

The world of microservices and distributed systems is complex. There are now more systems to keep an eye on, and more ways they can go wrong. We need to be able to understand what these systems are doing, especially when things break.

The traditional solution is logs: log everything, tune your log threshold just right, and away you go. But if you’re investigating an incident, you won’t find what you need unless the log threshold was already set low enough before things went wrong. Just turning down the log threshold permanently isn’t the solution either. If you’re running a high-volume system, you’ll have a correspondingly high volume of logs being produced, which gets out of hand very quickly.

The value of logs is in the questions you can answer with them: How busy is the system? How healthy is it? How is it performing for specific customers? But logs aren’t actually a good way of answering these sorts of questions. Logs are designed for humans to read, but our logs are no longer human scale; they are machine scale, so we need machines to help us make sense of them.

Sam Stokes explains why we need new, better tools and why this will also require us to design our systems to give those tools better data. What if, instead of emitting logs for humans to read, we emitted events for machines to analyze? What would those events look like? What sort of hints might we give to the machine? What sort of questions could we ask?
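As a rough illustration of those closing questions, the sketch below emits one structured event per request and then lets a program, rather than a human reader, answer "How healthy is it?" and "How is it performing for specific customers?". It is not taken from the talk and is not Honeycomb's API: the field names (`service`, `customer_id`, `endpoint`, `status`, `duration_ms`), the helper functions, and the in-memory `sink` list are all hypothetical, chosen only to show the shape of the idea.

```python
import json
import time
from collections import defaultdict
from statistics import mean


def make_event(service, customer_id, endpoint, status, duration_ms):
    """Build one structured event per unit of work (hypothetical schema)."""
    return {
        "timestamp": time.time(),
        "service": service,
        "customer_id": customer_id,  # a "hint" that lets the machine group by customer
        "endpoint": endpoint,
        "status": status,
        "duration_ms": duration_ms,
    }


def emit(event, sink):
    """Serialize the event as one JSON line; `sink` is a plain list standing in for a real pipeline."""
    sink.append(json.dumps(event, sort_keys=True))


def mean_duration_by(lines, field):
    """Group events by `field` and average their duration_ms."""
    groups = defaultdict(list)
    for line in lines:
        event = json.loads(line)
        groups[event[field]].append(event["duration_ms"])
    return {key: mean(values) for key, values in groups.items()}


def error_rate(lines):
    """Fraction of events with a 5xx status; 0.0 when there are no events."""
    events = [json.loads(line) for line in lines]
    if not events:
        return 0.0
    return sum(1 for e in events if e["status"] >= 500) / len(events)


# Made-up sample traffic, for illustration only.
sink = []
emit(make_event("api", "acme", "/checkout", 200, 120.0), sink)
emit(make_event("api", "acme", "/checkout", 500, 480.0), sink)
emit(make_event("api", "globex", "/search", 200, 40.0), sink)
emit(make_event("api", "globex", "/search", 200, 60.0), sink)

print(mean_duration_by(sink, "customer_id"))  # {'acme': 300.0, 'globex': 50.0}
print(error_rate(sink))                       # 0.25
```

Because every event carries the same named fields, the same data can be regrouped by `endpoint` or `service` without changing what the system emits, which is hard to do with free-text log lines filtered by a severity threshold.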
