November 25, 2019


How to use Impala's query plan and profile to fix performance issues

Apache Impala (incubating) is an exceptional, best-of-breed massively parallel processing SQL query engine that is a fundamental component of the big data software stack. Juan Yu demystifies the cost model the Impala planner uses, shows how Impala optimizes queries, and explains how to identify performance bottlenecks through the query plan and profile and how to drive Impala to its full potential.

Talk Title How to use Impala's query plan and profile to fix performance issues
Speakers Juan Yu (Cloudera)
Conference Strata Data Conference
Conf Tag Big Data Expo
Location San Jose, California
Date March 6-8, 2018
URL Talk Page
Slides Talk Slides
Video

Apache Impala (incubating) is an exceptional, best-of-breed massively parallel processing SQL query engine that is a fundamental component of the big data software stack. However, Impala is a complex engine, and using it fully requires a thorough technical understanding. When Impala is improperly configured or used, it can consume too many resources and perform very poorly. For many users, understanding Impala query performance is like a trip on the mystery bus. Impala provides a query plan and a query profile to help users choose an optimal plan and understand how a query is executed and how many resources it uses. But digging through query profiles isn’t fun for everyone. Juan Yu demystifies the cost model the Impala planner uses, shows how Impala optimizes queries, and explains how to identify performance bottlenecks through the query plan and profile and how to drive Impala to its full potential.
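For readers who have not yet looked at a plan or profile, the sketch below shows one way to request them from the command line. It is not taken from the talk. It only builds `impala-shell` invocations using the `-i` (impalad address), `-q` (query), and `-p` (show profiles) options together with the `EXPLAIN` statement. The helper names, the `localhost:21000` address, and the `sales` table are assumptions made for illustration.

```python
import shlex

# Assumed default address of an impalad; replace with a real coordinator host.
DEFAULT_IMPALAD = "localhost:21000"


def explain_command(query, impalad=DEFAULT_IMPALAD):
    """Build an impala-shell command that prints the query plan.

    EXPLAIN shows the plan the planner chose without executing the query.
    """
    return ["impala-shell", "-i", impalad, "-q", "EXPLAIN " + query]


def profile_command(query, impalad=DEFAULT_IMPALAD):
    """Build an impala-shell command that runs the query and prints its profile.

    The -p option asks impala-shell to show the profile of each query it runs.
    """
    return ["impala-shell", "-i", impalad, "-p", "-q", query]


if __name__ == "__main__":
    # Hypothetical table name, used only to make the example concrete.
    query = "SELECT COUNT(*) FROM sales"
    print(shlex.join(explain_command(query)))
    print(shlex.join(profile_command(query)))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` without shell quoting problems. Inside an interactive `impala-shell` session, the `SUMMARY` and `PROFILE` commands report on the most recently executed query in a similar way.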
