December 12, 2019

Hadoop application architectures: Architecting a next-generation data platform for real-time ETL, data analytics, and data warehousing

Jonathan Seidman, Gwen Shapira, Mark Grover, and Ted Malaska demonstrate how to architect a modern, real-time big data platform and explain how to leverage components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics such as real-time ETL, change data capture, and machine learning.

Talk Title Hadoop application architectures: Architecting a next-generation data platform for real-time ETL, data analytics, and data warehousing
Speakers Jonathan Seidman, Gwen Shapira, Mark Grover, Ted Malaska
Conference Strata + Hadoop World
Conf Tag Make Data Work
Location New York, New York
Date September 27-29, 2016
URL Talk Page
Slides Talk Slides
Video

Apache Hadoop is rapidly moving from its batch processing roots to a more flexible platform supporting both batch and real-time workloads. Rapid advancements in the Hadoop ecosystem are driving a dramatic evolution in both the storage and processing capabilities of the platform. These advancements include projects and components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL. While they are exciting, they also add a new array of tools that architects and developers need to understand when designing solutions with Hadoop.

Jonathan Seidman, Gwen Shapira, Mark Grover, and Ted Malaska explain how to leverage these components with Hadoop to enable new forms of data processing and analytics, such as real-time ETL, change data capture, and machine learning, as they walk attendees through an example architecture.

Along the way, Jonathan, Gwen, Mark, and Ted discuss considerations and best practices for using these components to implement solutions, cover common challenges and how to address them, and provide practical advice for building your own modern, real-time big data architectures.
