October 9, 2019


Embracing Big Data Workload in Cloud-Native Environment with Data Locality



Talk Title Embracing Big Data Workload in Cloud-Native Environment with Data Locality
Speakers Sammi Chen (Software Engineer, Tencent), Xiaoyu Yao (Principal Software Engineer, Cloudera)
Conference KubeCon + CloudNativeCon
Location Shanghai, China
Date Jun 23-26, 2019
URL Talk Page
Slides Talk Slides

Kubernetes supports scheduling workloads based on CPU and memory resources, with node affinity, pod affinity, and anti-affinity. This works very well for stateless workloads. For stateful workloads, especially big data workloads, scheduling compute close to the data source can greatly boost performance, reliability, and availability. However, in many cloud-based storage systems, the data locality information is either unavailable or not exposed to the container orchestrator. In this talk, we will first compare the data locality support of mainstream container-attached storage for Kubernetes. Then we will introduce network topology support in Apache Hadoop Ozone and show how to use it as locality-aware container-attached storage via the Ozone CSI plugin for better workload scheduling. Finally, we will use Spark on K8s to demo the benefits of data-locality-aware scheduling with Apache Hadoop Ozone.
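To make the idea of locality-aware placement concrete, here is a minimal sketch in Python. It is not the Ozone or Kubernetes implementation. It assumes each node has a hypothetical topology path such as `/dc1/rack2/node-b` and measures closeness as a tree distance in the style of Hadoop's network topology. The function names, node names, and paths are all illustrative assumptions.

```python
def topology_distance(a: str, b: str) -> int:
    """Tree distance between two topology paths such as '/dc1/rack1/node-a'.

    Counts the hops from each leaf up to the deepest ancestor the two
    paths share. Identical paths have distance 0.
    """
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def pick_node(candidates: dict[str, str], replicas: list[str]) -> str:
    """Return the candidate node name closest to any data replica.

    `candidates` maps a node name to its (hypothetical) topology path;
    `replicas` lists the topology paths of the nodes holding the data.
    Ties are broken by node name so the result is deterministic.
    """
    def score(name: str) -> tuple[int, str]:
        nearest = min(topology_distance(candidates[name], r) for r in replicas)
        return (nearest, name)

    return min(candidates, key=score)


# Illustrative cluster: three schedulable nodes, two data replicas.
candidates = {
    "node-a": "/dc1/rack1/node-a",
    "node-b": "/dc1/rack2/node-b",
    "node-c": "/dc1/rack3/node-c",
}
replicas = ["/dc1/rack2/node-b", "/dc1/rack3/node-d"]

best = pick_node(candidates, replicas)
print(best)
```

In this example `node-b` holds a replica itself, so it is preferred over `node-c`, which only shares a rack with a replica, and over `node-a`, which shares only the data center. A real scheduler would combine such a locality score with the CPU, memory, and affinity constraints mentioned above.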
