February 10, 2020

Sketching data and other magic tricks

Go hands-on with Sophie Watson and William Benton to examine data structures that let you answer interesting queries about massive datasets in fixed amounts of space and constant time. This seems like magic, but they'll explain the key trick that makes it possible and show you how to use these structures for real-world machine learning and data engineering applications.

Talk Title Sketching data and other magic tricks
Speakers Sophie Watson (Red Hat), William Benton (Red Hat)
Conference Strata Data Conference
Conf Tag Make Data Work
Location New York, New York
Date September 24-26, 2019
URL Talk Page
Slides Talk Slides

Sophie Watson and William Benton explore a way to answer interesting queries about truly massive datasets almost instantly and with a fixed amount of space. It sounds like magic, but you’ll go hands-on with the sketching data structures that work this magic and learn the key trick that makes them possible. Sophie and William introduce truly scalable techniques for several fundamental problems: set membership, set and document similarity, counting kinds of events, and counting distinct elements. You’ll learn how and when to use these structures as well as how they work, and you’ll see how the same techniques apply to parallel, distributed, and stream processing at scale. You’ll leave able to put these techniques to work in real data engineering and machine learning applications like join processing, document classification, and content personalization.
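
The abstract does not name the specific structures covered in the session, so the following is only an illustration of the general idea, not the speakers' code. It assumes a Bloom filter as the structure for approximate set membership: a fixed number of bits, a few hash functions, and the guarantee that an added item is never reported missing (false positives are possible). The class name, the default parameters, and the use of salted SHA-256 as the hash family are all assumptions made for this sketch. The merge method hints at why such structures suit parallel and distributed processing: two filters built independently with the same parameters combine with a bitwise OR.

```python
import hashlib


class BloomFilter:
    """Approximate set in bounded space (hypothetical illustration).

    Added items are always found; items never added may occasionally
    be reported as present (false positives).
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit array of at most num_bits bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # Present only if every position for this item is set.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        # The union of two filters with identical parameters is a bitwise OR,
        # so partitions of a dataset can be sketched separately and combined.
        if (self.num_bits, self.num_hashes) != (other.num_bits, other.num_hashes):
            raise ValueError("filters must share num_bits and num_hashes")
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


# Two filters built independently, as if on separate data partitions.
left = BloomFilter()
right = BloomFilter()
left.add("alice")
right.add("bob")

both = left.merge(right)
print("alice" in both, "bob" in both)  # True True
```

The space used depends only on `num_bits`, not on how many items are added; the trade-off is that the false positive rate rises as the filter fills up, so the parameters would need to be sized for the expected number of items.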
