Big data for operational insights
GoDaddy ingests and analyzes over 100,000 events per second of logs, metrics, and events. Felix Gorodishter shares GoDaddy's big data journey and explains how the company derives operational insights for its cloud from more than 10 TB of daily growth, leveraging Kafka, Hadoop, Spark, Pig, Hive, Cassandra, and Elasticsearch.
| Talk Title | Big data for operational insights |
| Speakers | Felix Gorodishter (GoDaddy) |
| Conference | Strata + Hadoop World |
| Conf Tag | Big Data Expo |
| Location | San Jose, California |
| Date | March 14-16, 2017 |
| URL | Talk Page |
| Slides | Talk Slides |
Harnessing the power of data is the future of business. GoDaddy continuously works to improve its customer experience and internal operations by making sense of massive amounts of data. The company is transforming its business by leveraging Hadoop in conjunction with an enterprise-wide, Kafka-backed data ingest pipeline, along with Elasticsearch, Spark, and Cassandra, to perform anomaly detection, real-time log visualization, alerting, remediation, and batch reporting on hundreds of thousands of events per second across its product and IT data.

Felix Gorodishter shares GoDaddy's big data journey from a farm of data silos to a centralized platform capable of supporting data ingest and visualization throughout the enterprise. Learn how GoDaddy collects and manages its data, which ranges from business units like hosting and domains to network and hardware events across its fleet of servers and network devices. As GoDaddy was transforming its data ingest, it also took on the challenge of understanding what it was collecting in order to answer key business questions. Felix discusses how GoDaddy went about answering those questions by leveraging a wide range of technologies, including Kafka, Spark, Hadoop, Elasticsearch, and other open source tools.
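The abstract mentions anomaly detection over hundreds of thousands of events per second. As a minimal sketch of that idea (not GoDaddy's actual implementation, which the talk does not detail), a rolling z-score check over per-second event counts could flag sudden deviations from recent history:

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(eps_samples, window=10, threshold=3.0):
    """Flag per-second event counts that deviate more than
    `threshold` standard deviations from the trailing window mean.
    Returns the indices of anomalous samples."""
    history = deque(maxlen=window)
    anomalies = []
    for i, count in enumerate(eps_samples):
        # Only score once we have a full window of history.
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(count - mu) > threshold * sigma:
                anomalies.append(i)
        history.append(count)
    return anomalies

# A steady ~100,000 EPS stream with one sudden spike at index 10.
stream = [100_000, 100_200, 99_800, 100_100, 99_900,
          100_050, 99_950, 100_150, 99_850, 100_000,
          250_000]
print(detect_anomalies(stream))  # → [10]
```

In a pipeline like the one described, such a check would typically run inside a stream processor (e.g., a Spark job consuming from Kafka), with flagged events feeding the alerting and remediation steps mentioned above.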