February 8, 2020

347 words 2 mins read

Trill: The crown jewel of Microsofts streaming pipeline explained

Trill: The crown jewel of Microsofts streaming pipeline explained

Trill has been open-sourced, making the streaming engine behind services like the Bing Ads platform available for all to use and extend. James Terwilliger, Badrish Chandramouli, and Jonathan Goldstein dive into the history of and insights from streaming data at Microsoft. They demonstrate how its API can power complex application logic and the performance that gives the engine its name.

Talk Title Trill: The crown jewel of Microsofts streaming pipeline explained
Speakers James Terwilliger (Microsoft Corporation), Badrish Chandramouli (Microsoft Research), Jonathan Goldstein (Microsoft Research)
Conference Strata Data Conference
Conf Tag Make Data Work
Location New York, New York
Date September 24-26, 2019
URL Talk Page
Slides Talk Slides
Video

The Trill data engine is the power behind many of Microsoft’s offerings, from products like Azure Stream Analytics to billion-dollar services like Bing Ads. It has now been open-sourced and is available to everyone. But it has been a long path to get there. James Terwilliger, Badrish Chandramouli, and Jonathan Goldstein explore the history of decades of streaming data processing at Microsoft: a beginning in research, a first product in StreamInsight, the transition to the cloud, and all the pain points along the way. A key result of that lineage and learning has been the Trill engine, which has three key properties a single standalone data processing engine for all temporal data, no matter if the data is streamed or stored; a simple API that integrates seamlessly with the programming language; and performance without ego, a willingness to use every lesson learned to improve throughput in every way possible. They dive deep into why each of those properties is important through examples. A simple application to demonstrate the basics of Trill: joins, aggregation, windowing; a more complicated application to demonstrate the power of Trill’s API: progressive windowing, regular expressions and pattern detection, data-dependent windows; and an overview of the kind of query used by Bing Ads, a query to run a multi-billion-dollar business. You’ll see a performance showcase: running the previous examples to demonstrate how Trill got its name—processing a trillion events per day on a single node.

comments powered by Disqus