To Infinite Scale and Beyond: Operating Kubernetes Past the Steady State
Talk Title | To Infinite Scale and Beyond: Operating Kubernetes Past the Steady State |
Speakers | Jago Macleod (Engineering Director, Kubernetes & GKE, Google), Austin Lamon (Group Product Manager, Spotify) |
Conference | KubeCon + CloudNativeCon North America |
Location | San Diego, CA, USA |
Date | Nov 15-21, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Operating large distributed systems at significant scale is challenging. Most discussions focus on scalability either at a single point in time under sustained load, or on the challenges that come with changes in incoming traffic. But running distributed systems at scale is about more than steady states and the transitions between them. Equally challenging, and often overlooked, are the operational problems of running at scale: upgrading many and/or large clusters; deploying applications to and across multiple clusters in a reasonable way; and balancing freedom and consistency across multiple teams. In this case study, Google and Spotify share some of the challenges of running Kubernetes at scale, together with concrete solutions, patterns, and common pitfalls we have found together. The talk is intended for cluster operators and developers from organizations of any size and on any provider.