Modern Big Data Pipelines over Kubernetes [I]

Talk Title	Modern Big Data Pipelines over Kubernetes [I]
Speakers	Eliran Bivas (Senior Big Data Architect, iguazio)
Conference	KubeCon + CloudNativeCon North America
Location	Austin, TX, United States
Date	Dec 4-8, 2017
URL	Talk Page
Slides	Talk Slides

Big data used to be synonymous with Hadoop, but our ecosystem has evolved over time with new database, streaming and machine learning solutions that don't necessarily benefit from the Hadoop deployment model of Map/Reduce, YARN and HDFS. These solutions require a generic cluster scheduling layer to host multiple workloads such as Kafka, Spark and TensorFlow, alongside databases such as Cassandra, Elasticsearch and cloud-based storage.

Eliran Bivas is a senior big data architect with years of hands-on experience working on both big data and cloud native solutions. Eliran will go over a common solution framework for creating cloud native end-to-end analytics applications. It involves using Kubernetes as an alternative to YARN, running Spark, Presto, machine learning frameworks (TensorFlow, Python and Spark ML kits) and serverless functions coupled with local and cloud-based storage. The session will showcase customer use cases from IoT, automotive, cloud SaaS and finance. It will also include a live solution demo that demonstrates the benefits of using big data and analytics over a cloud native architecture, eliminating the existing challenges of complexity and moving towards a continuous integration and development architecture for big data.

Paint the landscape and secure your data center with Apache Spot

Real-time analytics using Kudu at petabyte scale

Distinguish pop music from heavy metal using Apache Spark MLlib

Hadoop and object stores: Can we do it better?

Running Mesos Frameworks on Kubernetes with the Open-Source Universal Resource Broker

The state of Spark in the cloud