January 18, 2020


Increasing visibility of distributed systems in production


Understanding the state of a running application is the key to efficiently troubleshooting production issues and ultimately anticipating outages. Pierre Vincent demonstrates how to make monitoring an integral part of development, using health checks, metrics, tracing, and other patterns to get a clearer picture of applications in production.

Talk Title: Increasing visibility of distributed systems in production
Speakers: Pierre Vincent (Poppulo)
Conference: O’Reilly Velocity Conference
Conf Tag: Build Resilient Distributed Systems
Location: London, United Kingdom
Date: October 18-20, 2017

Understanding the running state of an application is the key to efficiently troubleshooting production issues and, ultimately, anticipating outages. When systems grow larger and become distributed, the visibility of application health needs to become a first-class concern: as the likelihood of something going wrong increases, the focus shifts from increasing mean time between failures to reducing mean time to recovery. The best way to achieve this consistently is to make monitoring an integral part of product development rather than an afterthought.

Pierre Vincent demonstrates how to build in monitoring, using health checks, metrics, tracing, and other patterns to get a clearer picture of applications in production. Monitoring can start simple, with basic telemetry such as health checks, which increase visibility into the system’s status. Exposing more advanced metrics gives more detail on how the system is working at the system level (e.g., resource usage), the application level (e.g., response times), and the business level (e.g., completed sales). These health checks and metrics can then be used to trigger alerts when observed values fall outside expected thresholds. Pierre offers an overview of monitoring patterns and tools that will help you build a fuller picture of a running application.
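As a rough illustration of the ideas in the abstract, the sketch below combines a simple health check, metrics at the three levels mentioned (system, application, business), and threshold-based alerting. It is not taken from the talk: every name here (`Threshold`, `run_health_checks`, `evaluate_alerts`, the metric names, and the threshold values) is a hypothetical choice made for the example, and a real system would more likely rely on a dedicated monitoring stack than on hand-rolled code like this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Threshold:
    """Expected range for one metric; None means unbounded on that side."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        too_low = self.minimum is not None and value < self.minimum
        too_high = self.maximum is not None and value > self.maximum
        return too_low or too_high


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, object]:
    """Run every dependency check and report an overall UP/DOWN status."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as a failed dependency.
            results[name] = False
    status = "UP" if all(results.values()) else "DOWN"
    return {"status": status, "checks": results}


def evaluate_alerts(metrics: Dict[str, float],
                    thresholds: Dict[str, Threshold]) -> List[str]:
    """Return the names of metrics whose observed value is outside its threshold."""
    return [
        name
        for name, threshold in thresholds.items()
        if name in metrics and threshold.violated_by(metrics[name])
    ]


# Hypothetical checks: stand-ins for real probes of a database and a queue.
health = run_health_checks({
    "database": lambda: True,
    "message_queue": lambda: False,
})

# Hypothetical observed values, one per level named in the abstract.
metrics = {
    "system.cpu_usage_percent": 42.0,          # system level
    "app.response_time_ms": 850.0,             # application level
    "business.completed_sales_per_min": 3.0,   # business level
}

# Hypothetical expected ranges; real values depend on the system in question.
thresholds = {
    "system.cpu_usage_percent": Threshold(maximum=90.0),
    "app.response_time_ms": Threshold(maximum=500.0),
    "business.completed_sales_per_min": Threshold(minimum=1.0),
}

alerts = evaluate_alerts(metrics, thresholds)
print(health["status"], alerts)
```

With these made-up inputs, the overall health status is reported as down because one dependency check fails, and only the response-time metric is flagged, since it is the only value outside its expected range. Keeping health checks, metrics, and alert rules as separate pieces mirrors the progression the abstract describes: start with basic telemetry, add richer metrics, then alert on them.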
