October 24, 2019

How to build a successful data lake

It is fashionable today to declare doom and gloom for the data lake. Alex Gorelik discusses best practices for Hadoop data lake success and provides real-world examples of successful data lake implementations in a non-vendor-specific talk.

Talk Title How to build a successful data lake
Speakers Alex Gorelik (Waterline Data)
Conference Strata + Hadoop World
Conf Tag Big Data Expo
Location San Jose, California
Date March 29-31, 2016
URL Talk Page
Slides Talk Slides

Big data and data science promise to bring unprecedented levels of insight and efficiency to everything from working with data to working with customers to curing cancer. To deliver on this promise, traditional enterprises are building data lakes, which bridge the gap between enterprise data warehouses, where data is a precious commodity carefully tended by professional IT personnel, and the freewheeling culture of modern Internet companies. An enterprise data lake must provide three new capabilities: cost-effective, scalable storage and computing; cost-effective data access and governance; and tiered, governed access based on user needs, skill levels, and applicable data-governance policies. Drawing on a 30-year career developing leading-edge data technology and working with some of the world’s largest enterprises on their thorniest data problems, Alex Gorelik, author of the forthcoming O’Reilly book The Enterprise Data Lake, discusses the considerations and best practices for building data lakes, with examples taken from the world’s leading big data companies and enterprises. Topics include:
