Scale Your Service on What Matters: Autoscaling on Latency
Talk Title | Scale Your Service on What Matters: Autoscaling on Latency |
Speakers | Thomas Rampelberg (Software Engineer, Buoyant) |
Conference | KubeCon + CloudNativeCon North America |
Location | Seattle, WA, USA |
Date | Dec 9-14, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Scaling HTTP-based workloads is about more than CPU and memory. This talk shows why it is critical to scale based on latency, and how to do it for your own service by combining Linkerd, Prometheus, and Kubernetes. We demonstrate how to use Linkerd to instrument your service and collect aggregated service latency, store these metrics in Prometheus, and expose them as custom metrics for consumption by the Kubernetes Horizontal Pod Autoscaler. We also demonstrate how latency-based autoscaling outperforms CPU- and memory-based autoscaling under a variety of conditions, including live traffic from the attendees of this talk, and suggest ways to safely apply this technique to existing systems.
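To make the scaling decision concrete, here is a minimal sketch (not taken from the talk) of the replica calculation the Horizontal Pod Autoscaler documents, applied to a latency metric. The function name `desired_replicas`, the 200 ms target, the replica bounds, and the PromQL string are illustrative assumptions; the query assumes Linkerd's `response_latency_ms` histogram and a hypothetical `deployment="web"` label, and would need adapting to a real setup.

```python
import math

# Assumed PromQL for a p99 latency series built from Linkerd proxy metrics.
# The label selector is hypothetical; adjust it to your own deployment.
P99_LATENCY_QUERY = (
    'histogram_quantile(0.99, '
    'sum(rate(response_latency_ms_bucket{deployment="web"}[1m])) by (le))'
)


def desired_replicas(
    current_replicas: int,
    current_latency_ms: float,
    target_latency_ms: float,
    tolerance: float = 0.1,   # HPA skips scaling when the ratio is within this band
    min_replicas: int = 1,    # illustrative bounds, set per workload
    max_replicas: int = 10,
) -> int:
    """Approximate the HPA rule: ceil(currentReplicas * current / target)."""
    ratio = current_latency_ms / target_latency_ms
    # Within tolerance of the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Observed p99 of 300 ms against a 200 ms target with 4 replicas.
    print(desired_replicas(4, 300.0, 200.0))
```

Because the rule is proportional, a latency reading that is 50% over target asks for 50% more replicas. Capping with `max_replicas` is one simple way to limit the damage if latency spikes for reasons that more replicas cannot fix, such as a slow downstream dependency.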