Spark on Kubernetes for data science

Spark on Kubernetes is a winning combination for data science that stitches together a flexible platform harnessing the best of both worlds. Jordan Volz gives a brief overview of Spark and Kubernetes, the Spark on Kubernetes project, why it's an ideal fit for data scientists who may have been dissatisfied with other iterations of Spark in the past, and some applications.


Talk Title	Spark on Kubernetes for data science
Speakers	Jordan Volz (Dataiku)
Conference	Strata Data Conference
Conf Tag	Make Data Work
Location	New York, New York
Date	September 24-26, 2019
URL	Talk Page
Slides	Talk Slides

Data science has benefited greatly from advances in big data and containerization technologies. Spark is the leading platform for data engineering and data science at scale. Kubernetes is the leading container orchestration service. Spark on Kubernetes is a winning combination for data science that stitches together a flexible platform harnessing the best of both worlds. Although still young and very experimental, Spark on Kubernetes shows tremendous promise, and all data science organizations should be aware of it. Jordan Volz gives a brief overview of Spark and Kubernetes, explaining the history of each and why they are so crucial to the modern data scientist. He explores the Spark on Kubernetes project and why it’s an ideal fit for data scientists who may have been dissatisfied with other iterations of Spark in the past. He also dives into Spark on Kubernetes as the go-to platform in cloud-native architectures as organizations begin to modernize their older on-premises architectures and ready them for cloud deployments. He shows some concrete examples to whet your appetite and get you excited to go home and start experimenting with Spark on Kubernetes for yourself.

Kubernetes for the Impatient

Big data analytics in the public cloud: Challenges and opportunities

Flyte: Cloud Native Machine Learning & Data Processing Platform

Open Source Weave Ignite - The GitOps VM

Delivering TV Everywhere with Cloud Native Solutions

Accelerating Your Cloud Native DevOps with Oracle Linux and VirtualBox