January 7, 2020

Scale Your Service on What Matters: Autoscaling on Latency


Talk Title	Scale Your Service on What Matters: Autoscaling on Latency
Speakers	Thomas Rampelberg (Software Engineer, Buoyant)
Conference	KubeCon + CloudNativeCon North America
Location	Seattle, WA, USA
Date	Dec 9-14, 2018
URL	Talk Page
Slides	Talk Slides

Scaling HTTP-based workloads is about more than CPU and memory. This talk will show why it is critical to scale based on latency, as well as how to do it for your own service by combining Linkerd, Prometheus, and Kubernetes. We demonstrate how to use Linkerd to instrument your service to collect aggregated service latency, store these metrics in Prometheus, and expose them as custom metrics for consumption by Kubernetes’s Horizontal Pod Autoscaler. We demonstrate how latency-based autoscaling outperforms CPU- and memory-based autoscaling under a variety of conditions, including live traffic from the attendees of this talk, and suggest ways to safely apply this technique to existing systems.
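To make the scaling decision concrete, here is a minimal sketch of the proportional rule that the Kubernetes documentation describes for the Horizontal Pod Autoscaler, applied to a latency metric. It is not taken from the talk: the function name, the millisecond units, and the replica bounds are hypothetical, and the 0.1 tolerance is assumed from the documented controller default.

```python
import math


def desired_replicas(current_replicas, current_latency_ms, target_latency_ms,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Sketch of the HPA rule: desired = ceil(current * metric / target).

    All parameter names and the replica bounds are illustrative assumptions.
    """
    ratio = current_latency_ms / target_latency_ms
    # Skip scaling when the metric is within the tolerance band around the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Example: 4 replicas, observed latency 300 ms against a 200 ms target.
print(desired_replicas(4, 300, 200))
```

Latency rarely falls in exact proportion to replica count, so this rule is best read as a heuristic that converges over several evaluation rounds. The replica bounds keep a latency spike from causing unbounded scale-out.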

autoscaling metrics safe tosca prometheus autoscale kubernetes

Autoscale your Kubernetes Workload with Prometheus

November 29, 2019

Time to autoscale your cloud native deployments, but how do you make it happen? In the past, easier said than done. Lack of guidance and inconsistent implementations of solutions have made autoscaling …

Intro: Autoscaling SIG

December 26, 2019

SIG Autoscaling develops and maintains the components related to automated scaling in Kubernetes: the Horizontal Pod Autoscaler, Vertical Pod Autoscaler, and Cluster Autoscaler. In this introduction, …

Machine learning at scale with Kubernetes

January 7, 2020

Christopher Cho demonstrates how Kubernetes can be easily leveraged to build a complete deep learning pipeline, including data ingestion and aggregation, preprocessing, ML training, and serving with the mighty Kubernetes APIs.

Scaling AI Inference Workloads with GPUs and Kubernetes

January 7, 2020

Deep Learning (DL) is a computationally intensive form of machine learning that has revolutionized many fields including computer vision, automated speech recognition, natural language processing and artif …

Rightsize Your Pods with Vertical Pod Autoscaling

January 2, 2020

Specifying CPU and memory needs for your application is often a fortune-telling exercise where time will almost certainly prove you wrong. Assigning too few resources endangers you with CPU starvation …

Intro to Agones: Scaling Multiplayer Game Servers with Kubernetes

December 29, 2019

Kubernetes provides an amazing toolset for running processes over potentially thousands of machines. However, Dedicated Game Servers for real-time multiplayer games, such as Fortnite, Overwatch, etc., …