New data architectures for high performance netflow analytics
Talk Title | New data architectures for high performance netflow analytics |
Speakers | Fangjin Yang |
Conference | NANOG74 |
Location | Vancouver, BC, Canada |
Date | Oct 1 2018 - Oct 3 2018 |
URL | Talk Page |
Slides | Talk Slides |
Video | Talk Video |
Operational analytic databases are an emerging class of data systems built to store and analyze various types of operational data, including netflows. Popular systems in this area include Apache Druid (incubating at the Apache Software Foundation), Scuba (from Facebook), Pinot (from LinkedIn), and ClickHouse (from Yandex). In this session, we will describe the motivation and architecture behind operational analytic databases, and how some of the world’s largest companies use them to analyze netflows. This new class of data system enables rapid and flexible data ingestion, efficient storage of large volumes of dimensional data such as netflows, and queries that are extremely fast compared with traditional systems. We will use Druid as a case study to explain the performance benefits for netflows.
Speaker: Fangjin Yang is a co-author of the open source Druid project and a co-founder of Imply, a Silicon Valley technology company. Fangjin previously held senior engineering positions at Metamarkets (now a part of Snapchat) and Cisco. He holds a BASc in Electrical Engineering and an MASc in Computer Engineering from the University of Waterloo, Canada.