November 20, 2019

Stream processing with Kafka

Tim Berglund leads a basic architectural introduction to Kafka and walks you through using Kafka Streams and KSQL to process streaming data.


Talk Title	Stream processing with Kafka
Speakers	Tim Berglund (Confluent)
Conference	Strata Data Conference
Conf Tag	Big Data Expo
Location	San Jose, California
Date	March 6-8, 2018

The toolset for building scalable data systems is maturing, having adapted well to our decades-old paradigm of update-in-place databases. We ingest events, we store them in high-volume OLTP databases, and we have new OLAP systems to analyze them at scale, even if the size of our operation requires us to grow to dozens or hundreds of servers in the distributed system. But something feels a little dated about the store-and-analyze paradigm, as if we are missing a new architectural insight that might more efficiently distribute the work of storing and processing the events that happen to our software. That new paradigm is stream processing.

Tim Berglund leads a basic architectural introduction to Apache Kafka and walks you through using Kafka Streams and KSQL to process streaming data. You'll learn the basics of Kafka as a messaging system, including the core concepts of topic, producer, consumer, and broker. Tim also explains how topics are partitioned among brokers and highlights the simple Java APIs for getting data in and out. But more importantly, you'll discover how to extend this scalable messaging system into a streaming data processing system, one that offers significant advantages in scalability and deployment agility while locating computation in your data pipeline in precisely the places it belongs: in your microservices and applications, not in costly, high-density systems.
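The concepts named in the abstract (topic, producer, consumer, partition) and the shift from store-and-analyze to processing each event as it arrives can be illustrated with a small in-memory model. This is only a sketch and not Kafka client code: the `Topic` class, `partition_for`, `running_counts`, and the partition count of 3 are all hypothetical names and values chosen for illustration, and CRC32 is an assumed stand-in for the hash (Kafka's Java default partitioner hashes key bytes with murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for the illustrative topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a record key to a partition with a stable hash.

    Assumption: CRC32 stands in for the real hash function. The point is
    only that the same key always lands in the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """Toy in-memory topic: one append-only list of records per partition."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        """Append a record and return its (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def consume(self, partition, offset=0):
        """Yield records from one partition, starting at the given offset."""
        yield from self.partitions[partition][offset:]


def running_counts(records):
    """Stream-processing step: update a per-key count as each record arrives,
    instead of storing everything first and analyzing it later."""
    counts = defaultdict(int)
    for key, _value in records:
        counts[key] += 1
        yield key, counts[key]
```

Because records with the same key always map to the same partition in this model, a consumer reading that partition sees them in the order they were produced. Feeding `topic.consume(p)` into `running_counts` then yields an updated count after every record, which is the incremental style of computation that the abstract contrasts with store-and-analyze.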

Apache Kafka + Apache Mesos = Highly scalable streaming microservices

November 18, 2019

Kai Wähner shares a highly scalable, mission-critical infrastructure using Apache Kafka and Apache Mesos: Kafka brokers are used as the distributed messaging backbone; Kafka's Streams API embeds stream processing into any external application without the need for a dedicated streaming cluster; and Mesos is used as a scalable infrastructure to leverage the benefits of a cloud-native platform.

Streaming applications as microservices using Kafka, Akka Streams, and Kafka Streams

November 20, 2019

Join Dean Wampler and Boris Lublinsky to learn how to build two microservice streaming applications based on Kafka using Akka Streams and Kafka Streams for data processing. You'll explore the strengths and weaknesses of each tool for particular design needs and contrast them with Spark Streaming and Flink, so you'll know when to choose them instead.

Data services: Processing big data the microservice way

November 17, 2019

Mario-Leander Reimer explores key JEE technologies that can be used to build JEE-powered data services and walks you through implementing the individual data processing tasks of a simplified showcase application. You'll then deploy and orchestrate the individual data services using OpenShift, illustrating the scalability of the overall processing pipeline.

Speed up mission-critical analytics in the cloud (sponsored by Kyligence)

November 20, 2019

As organizations look to scale their analytics capability, the need to grow beyond a traditional data warehouse becomes critical, and cloud-based solutions allow more flexibility while being more cost efficient. Billy Liu offers an overview of Kyligence Cloud, a managed Apache Kylin online service designed to speed up mission-critical analytics at web scale for big data.

Streaming SQL to unify batch and stream processing: Theory and practice with Apache Flink at Uber

November 20, 2019

Fabian Hueske and Shuyi Chen explore SQL's role in the world of streaming data and its implementation in Apache Flink and cover fundamental concepts, such as streaming semantics, event time, and incremental results. They also share their experience using Flink SQL in production at Uber, explaining how Uber leverages Flink SQL to solve its unique business challenges.

The secret sauce behind LinkedIn's self-managing Kafka clusters

November 20, 2019

LinkedIn runs more than 1,800 Kafka brokers that deliver more than two trillion messages a day. Running Kafka at such a scale makes automated operations a necessity. Jiangjie Qin shares lessons learned from operating Kafka at scale with minimum human intervention.