December 9, 2019

Running Large-Scale Stateful Workloads On Kubernetes at Lyft

Talk Title Running Large-Scale Stateful Workloads On Kubernetes at Lyft
Speakers Surinder Singh (Software Engineer, Lyft), Anmol Khurana (Software Engineer, Lyft)
Conference KubeCon + CloudNativeCon North America
Location San Diego, CA, USA
Date Nov 15-21, 2019
URL Talk Page
Slides Talk Slides

Along with core services, K8s at Lyft also forms the base for running a large variety of stateful data processing jobs, including Spark, Flink, and other jobs launched by various ML and data processing pipelines. At Lyft, K8s has become the driver for the majority of our data processing needs, running tens of thousands of concurrent jobs. Operating the platform at this scale presents a unique set of challenges, which become more complex with a highly variable load pattern.

In this talk, the speakers will share their journey through some of these challenges and what they learned:

- Potential pitfalls of running stateful jobs on K8s.
- Knobs and tweaks to optimize K8s for stateful jobs.
- Running K8s in a cloud environment.
- Building a fault-tolerant, self-healing system with multiple K8s clusters underneath.

The talk will also focus on optimizations done to support the widely used workloads at Lyft.
