Demystifying Data-Intensive Systems On Kubernetes - Alena Hall, Microsoft
Talk Title | Demystifying Data-Intensive Systems On Kubernetes - Alena Hall, Microsoft |
Speakers | Lena Hall (Senior Cloud Developer Advocate, Microsoft) |
Conference | KubeCon + CloudNativeCon North America |
Conf Tag | |
Location | Seattle, WA, USA |
Date | Dec 9-14, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Distributed databases, stateful stream processing workloads, caches, and machine learning frameworks often require persistence for storing data, operation progress, and more. Managing state while running systems like Cassandra, Kafka, Spark, Redis, or TensorFlow on Kubernetes is different from doing so on VMs or physical servers. Let’s examine why we might want to run these systems on Kubernetes, and look at foundational Kubernetes concepts (e.g. StatefulSets) that help us get those systems up and running. But up and running doesn’t always mean operating correctly. We will go over best practices for managing data-intensive systems on Kubernetes, existing challenges, and solutions (e.g. CRDs, custom controllers, operators), as well as a possible future. You will learn about operational considerations to take into account even if you haven’t worked with data systems on Kubernetes before.