November 27, 2019

Stories from the Playbook

Talk Title: Stories from the Playbook
Speakers: Fred van den Driessche (Site Reliability Engineer, Google), Tina Zhang (Site Reliability Engineer, Google)
Conference: KubeCon + CloudNativeCon Europe
Location: Copenhagen, Denmark
Date: Apr 30-May 4, 2018
URL: Talk Page
Slides: Talk Slides

Have you ever wondered how GKE Site Reliability Engineers (SREs) manage an entire fleet of GKE clusters in 15 regions around the world? This talk gives an overview of how the SRE team approaches this challenge, which tools it uses, the problems it has encountered, and the war stories and lessons learned along the way. It introduces the most frequently used parts of our playbook and shows how SREs work to save your cluster while on call in an effort to meet our SLOs.
