Yellow Pages (Canada): Our journey to speed of thought interactive analytics on top of Hadoop
The self-service YP Analytics application allows advertisers to understand their digital presence and ROI. Richard Langlois explains how Yellow Pages used this expertise for an internal use case that delivers real-time analytics with Tableau, using OLAP on Hadoop and enabled by its stack, which includes HDFS, Parquet, Hive, Impala, and AtScale, for fast, real-time analytics and data exploration.
|Yellow Pages (Canada): Our journey to speed of thought interactive analytics on top of Hadoop
|Strata + Hadoop World
|Make Data Work
|New York, New York
|September 27-29, 2016
Using Hadoop and other big data technologies, the YP Analytics application allows advertisers and media and advertising consultants to understand their digital presence and ROI. Richard Langlois explains how Yellow Pages (YP) used this expertise for an internal use case that delivers real-time analytics with Tableau, using OLAP on Hadoop and enabled by its stack (HDFS, Parquet, Hive, Impala, and AtScale). Yellow Pages’ first big data analytics use case, the YP Analytics application, uses Hadoop (Cloudera) and other big data technologies to help YP’s 244,000 advertisers understand their digital presence (ranking) and ROI with regard to the products and services they use with YP. With the delivery of YP Analytics, YP realized that its nationwide media and advertising consultants (MAC) needed the same information when meeting the advertisers. The MACs were and are still using a different sales application called Compass. In order to ensure information consistencies between these two applications, built by different teams and technologies, the YP team created a series of data services that can be used by any consuming applications, such as YP Analytics and Compass. The successes of these applications led YP’s internal teams to ask, “What about us?” For YP Analytics and Compass, all queries were known in advance and always in the context of a merchant or an account, which allowed the team to do multiple optimizations. However, these optimizations were not great for different internal ad hoc queries with other contexts than a merchant or an account, so YP decided to use OLAP on Hadoop. Richard offers an overview of the stack that has enabled OLAP on Hadoop (with more than 75 billion rows in production). The stack includes HDFS, Parquet, Hive, Impala, and AtScale for incredibly fast, real-time analytics and data exploration through Tableau, the tool chosen by YP’s end users. Richard also describes other recent use cases in advanced analytics for marketing campaign automation and sales recommendation engines using Spark, as well as recent work on reducing data analytics silos and experiments with search-based analytics. This session is sponsored by Tableau Software.