High Altitude, Low Risk: Measuring Reliability in the Cloud Using Open Source Technology
Talk Title | High Altitude, Low Risk: Measuring Reliability in the Cloud Using Open Source Technology |
Speakers | Alex Kass (Engineering Manager, DigitalOcean) |
Conference | Open Source Summit North America |
Conf Tag | |
Location | Vancouver, BC, Canada |
Date | Aug 27-31, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
With the financial convenience and flexibility of per-instance spend that cloud hosting allows, it follows that companies of all sizes have migrated their resources to the virtual world, putting their trust in cloud hosting providers to be the rock upon which they build their companies. This comes, of course, with one significant assumption: cloud hosting is reliable. How can we prove that?

This talk will reveal a real-world, implemented use-case meant to address just this challenge. The speaker will cover how DigitalOcean leverages OSS (Prometheus/k8s/Spark/HDFS/PrestoDB/more) to:

- monitor performance across distributed systems
- consume, structure, and productize information
- track product SLOs
- support official SLAs

At DO, reliability is viewed as a core internal data product in and of itself, one that should ultimately drive engineering iterations and business decisions throughout the company.
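
To make "track product SLOs" a little more concrete, here is a minimal sketch of the arithmetic that usually sits behind an availability SLO. It is not taken from the talk and does not describe DigitalOcean's pipeline: the names `SloReport` and `evaluate_slo`, the request counts, and the 99.9% target are all hypothetical, chosen only for illustration. In practice the counts might come from a monitoring system such as Prometheus, but that wiring is left out here.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    """Result of checking one measurement window against a target (hypothetical type)."""
    availability: float      # fraction of requests that succeeded
    slo_met: bool            # True when availability >= target
    budget_remaining: float  # fraction of the error budget still unspent (negative if overspent)


def evaluate_slo(good: int, total: int, target: float) -> SloReport:
    """Compare observed availability with an SLO target.

    good   -- number of successful requests in the window (assumed input)
    total  -- number of all requests in the window (assumed input)
    target -- SLO target as a fraction, e.g. 0.999 for "three nines"
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    availability = good / total
    # The error budget is the number of failures the target tolerates.
    allowed_failures = (1.0 - target) * total
    actual_failures = total - good
    budget_remaining = 1.0 - actual_failures / allowed_failures
    return SloReport(availability, availability >= target, budget_remaining)


if __name__ == "__main__":
    # Hypothetical window: 10,000 requests, 5 failures, 99.9% target.
    print(evaluate_slo(good=9_995, total=10_000, target=0.999))
```

Expressing the result as remaining error budget, rather than only a pass/fail flag, is one common way to turn raw reliability measurements into a number that engineering and business decisions can be based on.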