December 14, 2019

Observability in a dynamically scheduled world

Over the past year, DigitalOcean's Delivery team has been building a runtime platform based on Kubernetes with the goal of making shipping code easier. A core component of this platform is a monitoring and alerting system based on Prometheus and Alertmanager. Sneha Inguva offers an overview of the system and shares problems encountered, potential solutions, and key lessons learned in the process.

Talk Title: Observability in a dynamically scheduled world
Speakers: Sneha Inguva (DigitalOcean)
Conference: O’Reilly Velocity Conference
Conf Tag: Build Resilient Distributed Systems
Location: San Jose, California
Date: June 20-22, 2017

The industry is moving toward microservices architectures, and many companies have embraced container orchestration solutions such as Kubernetes. DigitalOcean is no different. Over the past year, DigitalOcean’s Delivery team has been building a runtime platform based on Kubernetes with the goal of making shipping code easier. The platform lets service owners deploy and update their applications quickly and efficiently. A vital component is a white-box monitoring and alerting solution based on Prometheus and Alertmanager.

Sneha Inguva offers an overview of the system and shares problems encountered, potential solutions, and key lessons learned in the process. She dives into the Prometheus and Alertmanager setup that allows service owners to instrument their own metrics and alerts, explains both the service owner’s point of view and the internals that allow alerts to be added dynamically, and offers a glimpse of future modifications to the system. Join in to learn how to leverage open source tools for your monitoring and alerting needs.