February 29, 2020

Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes

Andy Kwiatkowski takes a deep dive into how Shopify saved a million dollars a year in infrastructure costs by rolling its own autoscaler.

Talk Title Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes
Speakers Andy Kwiatkowski (Shopify)
Conference O’Reilly Velocity Conference
Conf Tag Build systems that drive business
Location Berlin, Germany
Date November 5-7, 2019
URL Talk Page
Slides Talk Slides
Video

Cloud providers often offer a checkbox that enables a simple CPU-based autoscaler. However, if your application runs complex deployments on thousands of servers across multiple regions and has to wrestle with the occasional celebrity flash sale, you might need to go further: reacting more quickly, supporting more complex scaling rules, and adding extra fail-safes to prevent capacity shortages. Andy Kwiatkowski dives into what it took for Shopify to create its own autoscaler, from writing traffic-smoothing algorithms to dealing with regional evacuations to handling noise from a system that is continuously deployed 50 times a day. He details creating a more useful utilization signal and shares battle-tested ideas for creating a highly fault-tolerant tool you can trust to scale your entire infrastructure.
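To make "traffic smoothing" and "utilization signal" concrete, here is a minimal sketch of one generic approach. It is not Shopify's algorithm, which the abstract does not describe. It assumes utilization is sampled as a fraction of capacity, smooths the samples with an exponentially weighted moving average so that brief deploy-time noise does not trigger scaling, and then applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current * utilization / target)). The function names, the `alpha` weight, the 60% target, and the replica bounds are all hypothetical.

```python
import math


def smooth(samples, alpha=0.3):
    """Exponentially weighted moving average of utilization samples.

    alpha is a hypothetical smoothing weight: values closer to 1 react
    faster to new samples, values closer to 0 suppress more noise.
    """
    smoothed = []
    current = None
    for sample in samples:
        # The first sample seeds the average; later samples are blended in.
        current = sample if current is None else alpha * sample + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def desired_replicas(current_replicas, utilization, target=0.6,
                     min_replicas=2, max_replicas=1000):
    """Proportional scaling rule, clamped to assumed fail-safe bounds.

    The target and the min/max bounds are illustrative values only.
    """
    raw = math.ceil(current_replicas * utilization / target)
    # The floor guards against scaling to zero on a bad signal;
    # the ceiling guards against runaway scale-up.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # A short spike in otherwise steady traffic is damped by the smoothing.
    signal = smooth([0.5, 0.5, 1.0, 0.5, 0.5], alpha=0.5)
    print(signal)
    print(desired_replicas(10, signal[-1]))
```

The clamp is the simplest possible fail-safe. A production autoscaler would likely also limit how fast it scales down and fall back to a safe replica count when the metric source is unavailable.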
