Live Aggregators: A scalable, cost-effective, and reliable way of aggregating billions of messages in real time

Osman Sarood and Chunky Gupta discuss Mists real-time data pipeline, focusing on Live Aggregators (LA)a highly reliable and scalable in-house real-time aggregation system that can autoscale for sudden changes in load. LA is 80% cheaper than competing streaming solutions due to running over AWS Spot Instances and having 70% CPU utilization.


Talk Title	Live Aggregators: A scalable, cost-effective, and reliable way of aggregating billions of messages in real time
Speakers	Osman Sarood (Mist Systems), Chunky Gupta (Mist Systems)
Conference	Strata Data Conference
Conf Tag	Big Data Expo
Location	San Francisco, California
Date	March 26-28, 2019
URL	Talk Page
Slides	Talk Slides
Video

Osman Sarood and Chunky Gupta discuss Mist’s real-time data pipeline, focusing on Live Aggregators (LA)—a highly reliable and scalable in-house real-time aggregation system that can autoscale for sudden changes in load, ensuring fault tolerance and scalability. LA consumes billions of messages a day from Kafka with a memory footprint of over 1 TB and aggregates over 100 million time series. Since it runs entirely on top of AWS Spot Instances, it’s highly reliable. LA can recover from hours-long EC2 outages using its checkpointing mechanism, which recovers the checkpoint from S3 and replays messages from Kafka where it left off, ensuring no data loss. LA writes the aggregated data to the configured storage system (either be Cassandra, S3 or Kafka). LA does over 1.5 billion writes to Cassandra per day and maintains over 100 million concurrent state machines. The characteristic that sets LA apart is its ability to autoscale by intelligently learning about resource usage (both seasonal and long-term trends) and allocating resources accordingly. LA emits custom metrics that track resource usage for different components, such as Kafka consumers and shared memory managers and aggregators, to achieve server utilization of over 70%. LA is horizontally scalable to thousands of cores. Mist also does multilevel aggregations in LA to intelligently solve load imbalance issues among different partitions for a Kafka topic. Osman and Chunky demonstrate multilevel aggregation using an example that aggregates indoor location data coming from different organizations both spatially and temporally. You’ll learn how changing partitioning keys and writing intermediate data back to Kafka in a new topic for the next level aggregators help Mist scale its solution.

Live Aggregators: A scalable, cost-effective, and reliable way of aggregating billions of messages in real time

Lightning Talk: Open Match - Matchmaking Framework

Serverless for data and AI

Serverless workflows for orchestration hybrid cluster-based and serverless processing

KEDA: Event Driven and Serverless Containers in Kubernetes

Tutorial: Debug Your Kubernetes Apps

Deep Dive: Kubernetes Metric APIs Using Prometheus