January 5, 2020

Architecting a next-generation data platform

Using Customer 360 and the IoT as examples, Jonathan Seidman, Mark Grover, and Gwen Shapira explain how to architect a modern, real-time big data platform leveraging recent advancements in the open source software world, using components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics.

Talk Title Architecting a next-generation data platform
Speakers Jonathan Seidman (Cloudera), Gwen Shapira (Confluent), Mark Grover (Lyft)
Conference Strata Data Conference
Conf Tag Make Data Work
Location New York, New York
Date September 26-28, 2017
URL Talk Page
Slides Talk Slides

Rapid advancements are driving a dramatic evolution in both the storage and processing capabilities of the open source big data software ecosystem, including projects like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL. Along with the Apache Hadoop platform, these storage and processing systems provide a powerful foundation for implementing data processing applications on batch and streaming data. While these advancements are exciting, they also add a new array of tools that architects and developers need to understand when designing solutions with Hadoop.

Using Customer 360 and the IoT as examples, Jonathan Seidman, Mark Grover, and Gwen Shapira explain how to architect a modern, real-time big data platform leveraging recent advancements in the open source software world, using components like Kafka, Impala, Kudu, Spark Streaming, and Spark SQL with Hadoop to enable new forms of data processing and analytics. Along the way, they discuss considerations and best practices for using these components to implement solutions, cover common challenges and how to address them, and provide practical advice for building your own modern, real-time big data architectures.
