Faster conclusions using in-memory columnar SQL and machine learning
Hadoop's traditional batch technologies are quickly being supplanted by in-memory columnar execution, which shortens the path from data to value. Wes McKinney and Jacques Nadeau provide an overview of in-memory columnar execution, survey key related technologies, including Kudu, Ibis, Impala, and Drill, and cover a sample use case that pairs Ibis with Apache Drill to deliver real-time conclusions.
|Talk Title||Faster conclusions using in-memory columnar SQL and machine learning|
|Speakers||Wes McKinney (Two Sigma Investments), Jacques Nadeau (Dremio)|
|Conference||Strata + Hadoop World|
|Conf Tag||Big Data Expo|
|Location||San Jose, California|
|Date||March 29-31, 2016|
Data ages quickly. The longer it takes you to reach a conclusion, the less value that conclusion can provide. In-memory columnar execution is a powerful paradigm for analyzing large amounts of data very quickly, offering real-time response at Hadoop data scale. It lets multiple applications share a common data representation and perform operations using SIMD and vectorization.

A number of key big data technologies, including Kudu, Ibis, Drill, and Impala, have, or will soon have, in-memory columnar capabilities. Wes McKinney and Jacques Nadeau give a quick overview of how each of these tools benefits from in-memory columnar execution and then get practical, going into detail about the capabilities of Ibis and how in-memory execution can speed up key operations.

Wes and Jacques explore Apache Drill as the backdrop for executing high-speed in-memory transformations and machine learning algorithms and demonstrate how a powerful columnar UDF interface lets organizations apply the performance of in-memory columnar execution to their own custom requirements.
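To make the idea of a columnar representation and a column-at-a-time UDF concrete, here is a minimal sketch in plain Python. It is not the Ibis or Drill API: the sample data, the `columns` dictionary, and the `revenue_udf` function are hypothetical names chosen for illustration. Pure Python does not use SIMD, so the sketch only shows the data layout and the shape of a function that receives whole columns, which is what makes vectorized execution possible in a real engine.

```python
from array import array

# Row-oriented layout: one record per row, with fields interleaved.
rows = [
    {"price": 10.0, "qty": 3},
    {"price": 4.5, "qty": 10},
    {"price": 20.0, "qty": 1},
]

# Columnar layout: one contiguous, typed buffer per field.
# (Hypothetical structure; real engines use richer column formats.)
columns = {
    "price": array("d", (r["price"] for r in rows)),  # 64-bit floats
    "qty": array("q", (r["qty"] for r in rows)),      # 64-bit signed ints
}


def revenue_udf(price, qty):
    """Hypothetical columnar UDF: takes whole columns, returns a whole column."""
    return array("d", (p * q for p, q in zip(price, qty)))


# The UDF is called once per batch of values rather than once per row.
revenue = revenue_udf(columns["price"], columns["qty"])
total = sum(revenue)
```

Because each column is a single typed buffer, an engine can hand the same memory to several tools and process it in tight loops over homogeneous values, rather than unpacking one record at a time.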