January 18, 2020

Running multidisciplinary big data workloads in the cloud

Attend this tutorial to learn how to run a data analytics pipeline in the cloud, integrate data engineering and data analytics workflows, and explore considerations and best practices for cloud-based analytics pipelines. Along the way, you'll see how to share metadata across workloads in a big data PaaS.

Talk Title Running multidisciplinary big data workloads in the cloud
Speakers Sudhanshu Arora (Cloudera), Stefan Salandy (Cloudera), Suraj Acharya (Cloudera), Brandon Freeman (Cloudera), Jason Wang (Cloudera), Shravan Pabba (Cloudera)
Conference Strata Data Conference
Conf Tag Make Data Work
Location New York, New York
Date September 11-13, 2018
URL Talk Page
Slides Talk Slides

Organizations now run diverse, multidisciplinary big data workloads that span data engineering, analytic database, and data science applications. Many of these workloads operate on the same underlying data, and the workloads themselves can be transient or long-running. One challenge is keeping the data context consistent across them. Sudhanshu Arora, Stefan Salandy, Suraj Acharya, Brandon Freeman, Jason Wang, and Shravan Pabba demonstrate how to manage the shared data experience so that it stays consistent across all workloads. You’ll learn how to run a data analytics pipeline in the cloud, integrate data engineering and data analytics workflows, and explore considerations and best practices for cloud-based analytics pipelines. Along the way, you’ll see how to share metadata across workloads in a big data PaaS. You’ll use the Cloudera Altus PaaS offering, powered by Cloudera Altus SDX, to run various big data workloads.
