February 29, 2020

Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes

Andy Kwiatkowski takes a deep dive into how Shopify saved a million dollars a year in infrastructure costs by rolling its own autoscaler.


Talk Title	Autoscaling in reality: Lessons learned from adaptively scaling Kubernetes
Speakers	Andy Kwiatkowski (Shopify)
Conference	O’Reilly Velocity Conference
Conf Tag	Build systems that drive business
Location	Berlin, Germany
Date	November 5-7, 2019

Cloud providers often come with a checkbox to enable a simple CPU-based autoscaler. However, if your application runs complex deployments on thousands of servers across multiple regions and has to cope with the occasional celebrity flash sale, you might need to go further: react more quickly, allow for more complex scaling rules, and create extra fail-safes to prevent capacity shortages. Andy Kwiatkowski dives into what it took for Shopify to create its own autoscaler, from writing traffic-smoothing algorithms to dealing with regional evacuations and handling noise from a system continuously deployed 50 times a day. He details how to create a more useful utilization signal and shares battle-tested ideas for building a highly fault-tolerant tool you can trust to scale your entire infrastructure.
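
To make the idea of a smoothed utilization signal concrete, here is a minimal sketch. It is not Shopify's autoscaler: the function names, the exponential moving average, the `alpha` value, and the 0.5 target are all assumptions chosen for illustration. The replica calculation follows the general proportional rule (current replicas times observed over target utilization, rounded up) that is similar to what the Kubernetes Horizontal Pod Autoscaler documents.

```python
import math


def smooth(samples, alpha=0.3):
    """Exponentially weighted moving average over utilization samples.

    Hypothetical helper: a lower alpha damps short spikes (for example,
    noise caused by a deploy) more strongly, at the cost of reacting
    more slowly to real traffic changes.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    value = samples[0]
    for sample in samples[1:]:
        value = alpha * sample + (1 - alpha) * value
    return value


def desired_replicas(current, utilization, target=0.5,
                     min_replicas=1, max_replicas=1000):
    """Proportional scaling rule, clamped to a safe range.

    The min/max bounds are an assumed fail-safe so that a bad signal
    cannot scale the fleet to zero or without limit.
    """
    wanted = math.ceil(current * utilization / target)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Three utilization samples (fractions of capacity), ending in a spike.
    signal = smooth([0.5, 0.5, 1.0], alpha=0.5)
    print(signal, desired_replicas(10, signal))
```

Smoothing before scaling is one way to keep a single noisy sample from triggering a large scale-up; the trade-off is slower reaction to genuine surges such as a flash sale, which is why the smoothing factor would need tuning against real traffic.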

autoscaling safe algorithm tosca infrastructure autoscale cloud kubernetes


About Space Invaders and automated scaling

February 22, 2020

Michael Friedrich and Stefanie Grunwald explore how an algorithm capable of playing Space Invaders can also improve your cloud service's automated scaling mechanism.

Running large-scale machine learning experiments in the cloud

February 3, 2020

Machine learning involves a lot of experimentation. Data scientists spend days, weeks, or months performing algorithm searches, model architecture searches, hyperparameter searches, etc. Shashank Prasanna breaks down how you can easily run large-scale machine learning experiments using containers, Kubernetes, Amazon ECS, and SageMaker.

Developing serverless applications on Kubernetes with Knative (sponsored by Pivotal)

January 30, 2020

Developers face too much fragmentation when choosing an open source FaaS solution. Bryan Friedman and Brian McClain detail Knative, an open source project from Google, Pivotal, and other industry leaders that provides a set of common tooling on top of Kubernetes to help developers build functions.

Serverless for data and AI

January 6, 2020

What is serverless, and how can it be utilized for data analysis and AI? Avner Braverman outlines the benefits and limitations of serverless with respect to data transformation (ETL), AI inference and training, and real-time streaming. This is a technical talk, so expect demos and code.

Serverless for data and AI

December 21, 2019

What is serverless, and how can it be utilized for data analysis and AI? Avner Braverman outlines the benefits and limitations of serverless with respect to data transformation (ETL), AI inference and training, and real-time streaming. This is a technical talk, so expect demos and code.

10 Weird Ways to Blow Up Your Kubernetes

December 15, 2019

It's a brand new world in infrastructure with the advent of microservices, containerization, Kubernetes, and service mesh. And all is well. Or is it? Find out how easy it is to break container runtime …