Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes
Andy Kwiatkowski takes a deep dive into how Shopify saved a million dollars a year in infrastructure costs by rolling its own autoscaler.
Talk Title | Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes |
Speakers | Andy Kwiatkowski (Shopify) |
Conference | O’Reilly Velocity Conference |
Conf Tag | Build systems that drive business |
Location | Berlin, Germany |
Date | November 5-7, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Cloud providers often offer a checkbox to enable a simple CPU-based autoscaler. However, if your application runs complex deployments on thousands of servers across multiple regions and has to wrestle with the occasional celebrity flash sale, you might need to go further: react more quickly, allow for more complex scaling rules, and create extra fail-safes to prevent capacity shortages. Andy Kwiatkowski dives into what it took for Shopify to create its own autoscaler, from writing traffic-smoothing algorithms to dealing with regional evacuations and handling noise from a system continuously deployed 50 times a day. He details how to create a more useful utilization signal and shares battle-tested ideas for building a highly fault-tolerant tool you can trust to scale your entire infrastructure.
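To make the idea of a smoothed utilization signal concrete, here is a minimal sketch. It is not Shopify's autoscaler: the abstract does not describe the actual algorithm. The sketch assumes an exponentially weighted moving average as the smoothing step and uses the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current * utilization / target)). The function names, the `alpha` value, and the 0.6 target are all illustrative assumptions.

```python
import math


def smooth(samples, alpha=0.3):
    """Exponentially weighted moving average of utilization samples.

    Assumption: EWMA stands in for whatever traffic-smoothing algorithm
    the talk describes. A lower alpha damps short spikes, such as the
    noise caused by a deploy, more strongly.
    """
    if not samples:
        raise ValueError("need at least one utilization sample")
    smoothed = samples[0]
    for sample in samples[1:]:
        smoothed = alpha * sample + (1 - alpha) * smoothed
    return smoothed


def desired_replicas(current_replicas, utilization, target=0.6,
                     min_replicas=1, max_replicas=1000):
    """Proportional scaling rule, clamped to a safe range.

    The min/max clamp is a simple fail-safe (hypothetical values) so that
    a bad signal can neither scale the service to zero nor scale it
    without bound.
    """
    raw = math.ceil(current_replicas * utilization / target)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Hypothetical CPU utilization samples with a single noisy spike.
    cpu = [0.55, 0.58, 0.95, 0.57, 0.60]
    signal = smooth(cpu)
    print(f"smoothed utilization: {signal:.3f}")
    print(f"desired replicas: {desired_replicas(20, signal)}")
```

Feeding the scaling rule the smoothed value rather than the latest raw sample keeps one noisy reading from triggering a large scale-up, at the cost of reacting somewhat later to a genuine surge.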