December 22, 2019

Presto: Tuning performance of SQL-on-anything analytics

Kamil Bajda-Pawlikowski and Martin Traverso explore Presto's recently introduced cost-based optimizer, which must account for heterogeneous inputs with differing and often incomplete data statistics, and detail use cases for Presto across several industries. They also share recent Presto advancements, such as geospatial analytics at scale, and the project roadmap going forward.

Talk Title: Presto: Tuning performance of SQL-on-anything analytics
Speakers: Kamil Bajda-Pawlikowski (Starburst), Martin Traverso (Presto Software Foundation)
Conference: Strata Data Conference
Conf Tag: Big Data Expo
Location: San Francisco, California
Date: March 26-28, 2019

Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. It has been proven at scale in a variety of use cases at Airbnb, Bloomberg, Comcast, Facebook, FINRA, LinkedIn, Lyft, Netflix, Twitter, and Uber. Over the last few years, Presto has seen unprecedented growth in popularity in both on-premises and cloud deployments over object stores, HDFS, NoSQL, and RDBMS data stores. Because the list of connectors to new data sources keeps growing (Azure Blob Storage, Elasticsearch, Netflix Iceberg, Apache Kudu, and Apache Pulsar among them), Presto’s recently introduced cost-based optimizer must account for heterogeneous inputs with differing and often incomplete data statistics. Kamil Bajda-Pawlikowski and Martin Traverso explore this topic and detail use cases for Presto across several industries. They also share recent Presto advancements, such as geospatial analytics at scale, and the project roadmap going forward.
