October 9, 2019


Embracing Big Data Workload in Cloud-Native Environment with Data Locality



Talk Title	Embracing Big Data Workload in Cloud-Native Environment with Data Locality
Speakers	Sammi Chen (Software Engineer, Tencent), Xiaoyu Yao (Principal Software Engineer, Cloudera)
Conference	KubeCon + CloudNativeCon
Location	Shanghai, China
Date	Jun 23-26, 2019

Kubernetes supports scheduling workloads based on CPU and memory resources, with node affinity, pod affinity, and anti-affinity. This works very well for stateless workloads. For stateful workloads, especially big data workloads, scheduling compute close to the data source can greatly boost performance, reliability, and availability. However, in many cloud-based storage systems, data locality information is either unavailable or not exposed to the container orchestrator. In this talk, we will first compare the data locality support offered by mainstream container-attached storage for Kubernetes. Then we will introduce network topology support in Apache Hadoop Ozone and show how to use it as locality-aware container-attached storage via the Ozone CSI plugin for better workload scheduling. Finally, we will use Spark on K8s to demo the benefits of data-locality-aware scheduling with Apache Hadoop Ozone.
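
To make the idea of locality-aware scheduling concrete, here is a minimal sketch of ranking candidate nodes by their distance to the nearest data replica. It is not taken from the talk and is not Ozone's or Kubernetes' actual implementation: the function names, the node names, and the `/datacenter/rack/node` path layout are all hypothetical, and the distance is a simple tree-hop count through the closest common ancestor.

```python
from typing import Dict, List


def topology_distance(a: str, b: str) -> int:
    """Count hops between two leaves of a topology tree.

    Paths are hypothetical, e.g. "/dc1/rack1/node-a". The distance is the
    number of steps from each leaf up to their closest common ancestor.
    """
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def rank_nodes(candidates: Dict[str, str], replicas: List[str]) -> List[str]:
    """Sort candidate node names by distance to the nearest replica.

    Assumes `replicas` is non-empty. Ties are broken by node name so the
    ordering is deterministic.
    """
    def nearest(name: str) -> int:
        return min(topology_distance(candidates[name], r) for r in replicas)

    return sorted(candidates, key=lambda n: (nearest(n), n))


# Illustrative data only: three schedulable nodes and two replica locations.
nodes = {
    "node-a": "/dc1/rack1/node-a",
    "node-b": "/dc1/rack1/node-b",
    "node-c": "/dc1/rack2/node-c",
}
replicas = ["/dc1/rack1/node-b", "/dc1/rack2/node-d"]

print(rank_nodes(nodes, replicas))  # ['node-b', 'node-a', 'node-c']
```

In this sketch the node that holds a replica ranks first, followed by nodes that share a rack with a replica. A real scheduler would combine such a locality score with CPU, memory, and affinity constraints rather than use it alone.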

container reliability apache performance spark k8s hadoop network big data cloud kubernetes

