Open source streaming analytics with the Kafka, Flink, Cassandra (KFC) stack
Streaming analytics is a popular subject in enterprise organizations because customers want real-time experiences, such as notifications and advice based on their online behavior and other users' actions. Bas Geerdink details an open source reference solution for streaming analytics that covers many use cases that follow a "pipes and filters" pattern, built with Scala, Flink, Kafka, and Cassandra.
|Talk Title||Open source streaming analytics with the Kafka, Flink, Cassandra (KFC) stack|
|Speakers||Bas Geerdink (Aizonic)|
|Conference||O’Reilly Open Source Software Conference|
|Conf Tag||Fueling innovative software|
|Date||July 15-18, 2019|
Streaming analytics (or fast data) is an increasingly popular subject in enterprise organizations because customers want real-time experiences, such as notifications and advice based on their online behavior and other users’ actions. A typical streaming analytics solution follows a “pipes and filters” pattern that consists of three main steps: detecting patterns in raw event data (complex event processing), evaluating the outcomes with the aid of business rules and machine learning algorithms, and deciding on the next action. Bas Geerdink details an open source reference solution for streaming analytics that covers many use cases that follow this pattern, including actionable insights, fraud detection, log parsing, traffic analysis, factory data, and the IoT. The solution is built with the KFC stack (Kafka, Flink, and Cassandra), and all source code is written in Scala. Bas explores a few architecture challenges that arise when dealing with streaming data, such as latency issues, event time versus server time, and exactly-once processing. He provides architectural diagrams, explanations, a demo, and the source code. The solution (“Styx”) is open source and available on GitHub.
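The three steps of the “pipes and filters” pattern can be illustrated with a minimal sketch in plain Scala. This is not the Styx code and does not use the Flink, Kafka, or Cassandra APIs; every type and function name here (`Event`, `Pattern`, `Scored`, `Action`, `detect`, `evaluate`, `decide`) is hypothetical, and an in-memory `Seq` stands in for the stream. Windows are keyed on the timestamp carried by each event (event time) rather than on arrival order, which hints at why the distinction from server time matters. Timestamps are assumed to be non-negative milliseconds.

```scala
// Hypothetical types for illustration only; not taken from Styx.
final case class Event(userId: String, kind: String, eventTimeMs: Long)
final case class Pattern(userId: String, windowStartMs: Long, count: Int)
final case class Scored(pattern: Pattern, score: Double)
final case class Action(userId: String, message: String)

object PipesAndFilters {
  // Step 1: detect patterns. Count events per user per tumbling window,
  // assigning each event to a window by its own timestamp (event time),
  // so out-of-order arrival does not change the result.
  def detect(events: Seq[Event], windowMs: Long): Seq[Pattern] =
    events
      .groupBy(e => (e.userId, e.eventTimeMs / windowMs * windowMs))
      .map { case ((user, start), es) => Pattern(user, start, es.size) }
      .toSeq
      .sortBy(p => (p.windowStartMs, p.userId))

  // Step 2: evaluate. A simple threshold rule stands in for the
  // business rules or machine learning model; the score is capped at 1.0.
  def evaluate(p: Pattern, threshold: Int): Scored =
    Scored(p, math.min(1.0, p.count.toDouble / threshold))

  // Step 3: decide on the next action. Emit a notification only
  // when the score reaches the cap.
  def decide(s: Scored): Option[Action] =
    if (s.score >= 1.0)
      Some(Action(
        s.pattern.userId,
        s"${s.pattern.count} events in window starting at ${s.pattern.windowStartMs}"))
    else None

  // The pipeline: each filter's output is piped into the next filter.
  def run(events: Seq[Event], windowMs: Long, threshold: Int): Seq[Action] =
    detect(events, windowMs).map(evaluate(_, threshold)).flatMap(decide)
}
```

In a real deployment of the stack described above, one would expect the source and sink to be Kafka topics, the three functions to become stream operators, and state or results to be stored in Cassandra; the sketch only shows how the steps compose.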