November 30, 2019

335 words 2 mins read

Rethinking stream processing with Apache Kafka: Applications versus clusters and streams versus databases

Michael Noll explains how Apache Kafka helps you radically simplify your data processing architectures by building normal applications to serve your real-time processing needs rather than building clusters or similar special-purpose infrastructure, while still benefiting from properties typically associated exclusively with cluster technologies.

Talk Title Rethinking stream processing with Apache Kafka: Applications versus clusters and streams versus databases
Speakers Michael Noll (Confluent)
Conference Strata Data Conference
Conf Tag Making Data Work
Location London, United Kingdom
Date May 23-25, 2017
URL Talk Page
Slides Talk Slides
Video

Modern businesses have data at their core, but this data is changing continuously. How can you harness this torrent of information in real time? The answer: stream processing. The core platform for streaming data is Apache Kafka, and thousands of companies are using Kafka to transform and reshape their industries, including Netflix, Uber, PayPal, Airbnb, Goldman Sachs, Cisco, and Oracle. Unfortunately, today’s common architectures for real-time data processing at scale suffer from complexity: to succeed, many technologies need to be stitched together and operated side by side, and each individual technology is often complex in its own right. This has led to a stark discrepancy between how we engineers would like to work and how we actually end up working in practice.

Michael Noll explains how Apache Kafka helps you radically simplify your data processing architectures by building normal applications to serve your real-time processing needs rather than building clusters or similar special-purpose infrastructure, while still benefiting from properties typically associated exclusively with cluster technologies, like high scalability, distributed computing, and fault tolerance. Michael also covers Kafka’s Streams API, its abstractions for streams and tables, and its recently introduced interactive queries functionality.

Along the way, Michael shares common use cases demonstrating that stream processing in practice often requires database-like functionality, and shows how Kafka allows you to bridge the worlds of streams and databases when implementing your own core business applications (for example, in the form of event-driven, containerized microservices). As you’ll see, Kafka makes such architectures equally viable for small-, medium-, and large-scale use cases.
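To make the "applications, not clusters" idea concrete, here is a minimal Kafka Streams sketch in the spirit of the talk (not taken from it): an ordinary Java application that consumes a stream of page-view events, continuously aggregates it into a table of per-user counts, and materializes that table in a named state store that interactive queries could expose. The topic names, application id, and broker address are illustrative assumptions, and the example uses the current Streams DSL rather than the API exactly as it looked at the time of the talk.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class PageViewCountsApp {

    public static void main(String[] args) {
        // Plain application configuration: no dedicated processing cluster, just a JVM process.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // A stream of page-view events keyed by user ID ("page-views" is a hypothetical topic).
        KStream<String, String> views = builder.stream("page-views");

        // Continuously aggregate the stream into a table of per-user view counts.
        // The named state store can be exposed via interactive queries, which is
        // where stream processing starts to look database-like.
        KTable<String, Long> viewsPerUser = views
                .groupByKey()
                .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("views-per-user-store"));

        // Publish the table's changelog as a stream of updates for downstream consumers.
        viewsPerUser.toStream()
                .to("views-per-user", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the application cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Because the whole thing is an ordinary main() method, it can be packaged into a container and scaled out or back simply by starting or stopping additional instances with the same application id; Kafka rebalances the work and the state across them, which is the fault-tolerance and elasticity the talk attributes to "normal applications."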
