101 Ways to Crash Your Cluster [I]
Talk Title | 101 Ways to Crash Your Cluster [I] |
Speakers | Emmanuel Gomez (Principal Engineer, Nordstrom), Marius Grigoriu (Sr Manager, Nordstrom) |
Conference | KubeCon + CloudNativeCon North America |
Conf Tag | |
Location | Austin, TX, United States |
Date | Dec 4-8, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Running a Kubernetes cluster requires operating many components. One must be good at running and scaling etcd, multiple control plane components, a monitoring system, a logging pipeline, Docker, rkt, and Linux itself. And this list isn’t even close to complete. With such a long list of technologies comes the potential to make a mistake that brings the whole cluster down. Come hear war stories from Nordstrom’s Kubernetes cluster admins. Each is a true story of how the cluster melted down, how they recovered, and what they did to prevent it from happening again. Don’t let any of these happen to you…