To Infinite Scale and Beyond: Operating Kubernetes Past the Steady State


Talk Title	To Infinite Scale and Beyond: Operating Kubernetes Past the Steady State
Speakers	Jago Macleod (Engineering Director, Kubernetes & GKE, Google), Austin Lamon (Group Product Manager, Spotify)
Conference	KubeCon + CloudNativeCon North America
Conf Tag
Location	San Diego, CA, USA
Date	Nov 15-21, 2019
URL	Talk Page
Slides	Talk Slides
Video

Operating large distributed systems at significant scale is challenging. Most discussions focus on scalability either at a single point in time under sustained load, or on the challenges that come with changes in incoming traffic. But running distributed systems at scale is about more than steady states and the transitions between them. Equally difficult, and often overlooked, are the operational challenges of running at scale: upgrading many and/or large clusters; deploying applications to and across multiple clusters in a reasonable way; and balancing freedom and consistency across multiple teams. In this case study, Google and Spotify share some of the challenges of running Kubernetes at scale, together with the concrete solutions, patterns, and common pitfalls we have found together. The talk is intended for cluster operators and developers from organizations of any size and on any provider.