Distributed systems for stream processing: Apache Kafka and Spark Streaming
Alena Hall walks you through setting up and building a distributed streaming architecture on Azure using open source frameworks like Apache Kafka and Spark Streaming. You'll use these distributed systems to process data coming from multiple sources in real time and perform machine learning tasks.
Talk Title | Distributed systems for stream processing: Apache Kafka and Spark Streaming |
Speakers | Lena Hall (Microsoft) |
Conference | O’Reilly Open Source Convention |
Conf Tag | Put open source to work |
Location | Portland, Oregon |
Date | July 16-19, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Everything is a data source, and today’s online activities, financial operations, and IoT devices and sensors generate data at an ever-increasing rate. So how do we ingest, process, and manage that data? We need a flexible, scalable, fast, and resilient architecture to handle this influx. Alena Hall walks you through setting up and building a distributed streaming architecture on Azure using open source frameworks like Apache Kafka and Spark Streaming. You’ll use these distributed systems to process data coming from multiple sources in real time and perform machine learning tasks. Along the way, you’ll discover how to experiment with streams effectively and interactively.