Making sense of exactly-once semantics
Exactly-once semantics is a highly desirable property for streaming analytics. Ideally, all applications process events once and never twice, but making such guarantees in general either induces significant overhead or introduces other inconveniences, such as stalling. Flavio Junqueira explores what's possible and reasonable for streaming analytics to achieve when targeting exactly-once semantics.
Talk Title | Making sense of exactly-once semantics |
Speakers | Flavio Junqueira (Dell EMC) |
Conference | Strata + Hadoop World |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | June 1-3, 2016 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Exactly-once delivery is the holy grail of streaming analytics. Having duplicates of events processed in a streaming job is inconvenient and often undesirable depending on the nature of the application. For example, if billing applications miss an event or process an event twice, they could lose revenue or overcharge customers. Guaranteeing that such scenarios never happen is difficult; any project seeking such a property will need to make some choices with respect to availability and consistency. One main difficulty stems from the fact that a streaming pipeline might have multiple stages, and exactly-once delivery needs to happen at each stage. Another difficulty is that intermediate computations could potentially affect the final computation. And once results are exposed, retracting them causes problems. These observations lead to the conclusion that providing a general solution is very difficult, if not impossible. For some classes of applications, however, it is possible to obtain exactly-once behavior. If senders are willing to retry an unbounded number of times and the delivery is idempotent, then we can create a behavior equivalent to exactly-once semantics. One way to achieve the goal of making delivery idempotent is to rely on a distributed consensus primitive to have agreement on what has been delivered. For example, if the output has a key and a value, we can make sure that the key has been produced already. Assuming we rely on persistent queues to propagate events, we can use the queue position to also disambiguate updates using this consensus primitive. Flavio Junqueira explores what’s reasonable for streaming analytics to achieve when targeting exactly-once semantics, using examples based on systems that provide commit-log abstractions (like Apache Kafka and Apache BookKeeper) to demonstrate when it is possible to guarantee strong delivery semantics and how. Flavio shows that for a simple, but important, class of applications, it’s possible to provide strong delivery guarantees along with efficient implementations.