December 3, 2019

259 words 2 mins read

Modern Big Data Pipelines over Kubernetes [I]

Modern Big Data Pipelines over Kubernetes [I]

Big data used to be synonymous with Hadoop, but our ecosystem has evolved over time with new database, streaming and machine learning solutions which dont necessarily benefit from the Hadoop deployme …

Talk Title Modern Big Data Pipelines over Kubernetes [I]
Speakers Eliran Bivas (Senior Big Data Architect, iguazio)
Conference KubeCon + CloudNativeCon North America
Conf Tag
Location Austin, TX, United States
Date Dec 4- 8, 2017
URL Talk Page
Slides Talk Slides
Video

Big data used to be synonymous with Hadoop, but our ecosystem has evolved over time with new database, streaming and machine learning solutions which don’t necessarily benefit from the Hadoop deployment model of Map/Reduce, YARN and HDFS. These solutions require a generic cluster scheduling layer to host multiple workloads such as Kafka, Spark and TensorFlow, alongside databases such as Cassandra, Elasticsearch and cloud-based storage. Eliran Bivas is a senior big data architect with years of hands-on experience working on both big data and cloud native solutions. Eliran will go over a common solution framework to create cloud native end-to-end analytics applications. It involves using Kubernetes as an alternative to Yarn, running Spark, Presto, machine learning frameworks (TensorFlow, Python and Spark ML kits) and serverless functions coupled with local and cloud-based storage. The session will showcase customer use-cases from IoT, automotive, cloud SaaS and finance. It will also include a live solution demo which demonstrates the benefits of using big data and analytics over a cloud native architecture, eliminating the existing challenges of complexity and moving towards a continuous integration and development architecture for big data.

comments powered by Disqus