January 1, 2020


From Kafka to BigQuery: A guide for delivering billions of daily events

What are the most important considerations when shipping billions of daily events for analysis? Ofir Sharony shares MyHeritage's journey to find a reliable and efficient way to achieve real-time analytics. Along the way, Ofir compares several data loading techniques, helping you make better choices when building your next data pipeline.

Talk Title: From Kafka to BigQuery: A guide for delivering billions of daily events
Speakers: Ofir Sharony (MyHeritage)
Conference: Strata + Hadoop World
Conf Tag: Make Data Work
Location: Singapore
Date: December 6-8, 2016
URL: Talk Page
Slides: Talk Slides
Video:

MyHeritage collects billions of events every day, including request logs from web servers and backend services, events describing user activities across different platforms, and change data capture logs recording every change made to its database records. Delivering these events for analysis is a complex task, requiring a robust and scalable data pipeline. Ofir Sharony shares MyHeritage's journey to find a reliable and efficient way to achieve real-time analytics and offers an overview of the system the company decided on: shipping events to Apache Kafka and loading them into Google BigQuery for analysis. Along the way, Ofir compares several data loading techniques, helping you make better choices when building your next data pipeline. For more information, take a look at Ofir's recent blog post on the subject.
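
One of the loading techniques such a pipeline can use is streaming inserts from a Kafka consumer directly into BigQuery. The sketch below illustrates that approach only; it is not MyHeritage's implementation. The topic name, consumer group, table ID, and batch size are hypothetical placeholders, and it assumes the kafka-python and google-cloud-bigquery client libraries.

```python
# Minimal sketch: consume JSON events from Kafka and stream them into BigQuery.
# All names (topic, group, project.dataset.table) are hypothetical placeholders.
import json

from kafka import KafkaConsumer          # pip install kafka-python
from google.cloud import bigquery        # pip install google-cloud-bigquery

TOPIC = "user-activity-events"           # hypothetical topic name
TABLE_ID = "my-project.analytics.events" # hypothetical project.dataset.table

def stream_events_to_bigquery(batch_size: int = 500) -> None:
    """Consume events from Kafka and push them to BigQuery via streaming inserts."""
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers="localhost:9092",
        group_id="bq-loader",                                    # hypothetical group
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        enable_auto_commit=False,
    )
    client = bigquery.Client()
    buffer = []
    for message in consumer:
        buffer.append(message.value)     # each event is a dict matching the table schema
        if len(buffer) >= batch_size:
            errors = client.insert_rows_json(TABLE_ID, buffer)
            if errors:
                raise RuntimeError(f"BigQuery insert errors: {errors}")
            consumer.commit()            # commit offsets only after a successful insert
            buffer = []

if __name__ == "__main__":
    stream_events_to_bigquery()
```

Streaming inserts favor low end-to-end latency, while periodic batch loads trade latency for lower cost and simpler deduplication; weighing such trade-offs is the kind of comparison the talk walks through.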
