December 31, 2019

MacroBase: A search engine for fast data streams

Sahaana Suri offers an overview of MacroBase, a new analytics engine from Stanford designed to prioritize the scarcest resource in large-scale, fast-moving data streams: human attention. MacroBase allows reconfigurable, real-time root-cause analyses that have already diagnosed issues in production streams in mobile, data center, and industrial applications.

Talk Title: MacroBase: A search engine for fast data streams
Speakers: Sahaana Suri (Stanford University)
Conference: Strata Data Conference
Conf Tag: Make Data Work
Location: New York, New York
Date: September 26-28, 2017

MacroBase is a new open source analytics engine from the Stanford InfoLab designed to prioritize the scarcest resource in large-scale, fast-moving data streams: human attention. In many deployments at scale, the overwhelming majority of collected data is never read and is retained only for reactive failure analysis. MacroBase analyzes data as it arrives and provides high-level, interpretable explanations of stream behavior, which makes the data more useful and enables real-time root-cause analysis and anomaly detection.

At its core, MacroBase combines streaming classification and explanation operators to both identify individual points of interest and highlight commonalities across them. For example, the Android device ecosystem comprises over 24,000 distinct device types. How can you determine whether your mobile application is behaving correctly on all of them? MacroBase’s classification operators can identify abnormally behaving devices, while its explanation operators can aggregate many such devices, producing more interpretable outputs.

MacroBase is thus designed as both a set of reconfigurable dataflow operators and a series of end-to-end dataflow pipelines, which have already been used to diagnose issues in production streams in mobile, data center, and industrial applications. Sahaana Suri walks you through the core concepts behind MacroBase, its architecture, and key use cases and shares takeaways from the recent research literature for data scientists, data engineers, and DevOps engineers.
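
To make the classify-then-explain idea concrete, here is a minimal Python sketch of the two stages described above. It is not MacroBase's API. The function names (`classify`, `explain`), the record fields (`model`, `latency_ms`), the thresholds, and the sample data are all hypothetical. The specific techniques are also assumptions chosen for illustration: a robust z-score based on the median absolute deviation for classification, and a risk ratio over a single attribute for explanation.

```python
from collections import Counter
from statistics import median


def classify(records, metric, threshold=3.0):
    """Split records into (outliers, inliers) using a robust z-score.

    Illustrative only: the score is |value - median| / (1.4826 * MAD).
    """
    values = [r[metric] for r in records]
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        # No spread to measure against; treat every record as an inlier.
        return [], list(records)
    outliers, inliers = [], []
    for r in records:
        score = abs(r[metric] - med) / scale
        (outliers if score > threshold else inliers).append(r)
    return outliers, inliers


def explain(outliers, inliers, attribute, min_support=0.5, min_risk_ratio=3.0):
    """Return attribute values that are common among outliers and rare elsewhere.

    Each result is (value, support, risk_ratio), where risk_ratio is
    P(outlier | value) / P(outlier | not value).
    """
    out_counts = Counter(r[attribute] for r in outliers)
    in_counts = Counter(r[attribute] for r in inliers)
    total_out, total_in = len(outliers), len(inliers)
    results = []
    for value, a_out in out_counts.items():
        a_in = in_counts.get(value, 0)
        b_out = total_out - a_out   # outliers without this value
        b_in = total_in - a_in      # inliers without this value
        support = a_out / total_out
        exposed = a_out / (a_out + a_in)
        rest = b_out + b_in
        unexposed = b_out / rest if rest else 0.0
        risk_ratio = exposed / unexposed if unexposed > 0 else float("inf")
        if support >= min_support and risk_ratio >= min_risk_ratio:
            results.append((value, support, risk_ratio))
    return sorted(results, key=lambda e: e[1], reverse=True)


# Hypothetical sample: one device model shows abnormally high latency.
records = [{"model": "alpha", "latency_ms": 100 + i} for i in range(16)]
records += [
    {"model": "beta", "latency_ms": 900},
    {"model": "beta", "latency_ms": 910},
    {"model": "beta", "latency_ms": 920},
    {"model": "beta", "latency_ms": 108},
]

outliers, inliers = classify(records, metric="latency_ms")
explanations = explain(outliers, inliers, attribute="model")
for value, support, risk_ratio in explanations:
    print(f"model={value} support={support:.2f} risk_ratio={risk_ratio}")
```

The point of the second stage is the one the abstract makes: rather than handing an operator a list of individual abnormal records, the output is a short list of attribute values that the abnormal records share. This sketch works on a fixed batch; a streaming engine would need to maintain these statistics incrementally as data arrives.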
