LSTM-based time series anomaly detection using Analytics Zoo for Spark and BigDL
Collecting and processing massive time series data (such as logs and sensor readings) and detecting anomalies in it in real time are critical for many emerging smart systems in areas such as industry, manufacturing, AIOps, and the IoT. Guoqiong Song explains how to detect anomalies in time series data at scale using Analytics Zoo and BigDL on a standard Spark cluster.
Talk Title | LSTM-based time series anomaly detection using Analytics Zoo for Spark and BigDL |
Speakers | Guoqiong Song (Intel) |
Conference | Strata Data Conference |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | April 30-May 2, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Collecting and processing massive time series data (such as logs and sensor readings) and detecting anomalies in it in real time are critical for many emerging smart systems in areas such as industry, manufacturing, AIOps, and the IoT. Long short-term memory networks (LSTMs) have proven effective on a variety of time series analysis tasks. They capture temporal information by learning the dynamics of sequences through recurrent connections (cycles) in the network. LSTMs can be readily built using any of today’s deep learning packages. However, most popular deep learning libraries use Python as their native language and run on GPU clusters to achieve state-of-the-art performance, which presents a real challenge when moving models into production.
Guoqiong Song explains how to apply time series anomaly detection to big data at scale, using the end-to-end Spark and BigDL pipeline provided by Analytics Zoo.
You’ll learn how to build the end-to-end flow on standard Hadoop/Spark clusters: preprocessing the raw time series data and extracting features, training an LSTM-based anomaly detector model, and evaluating both the model and the detected anomalies. This solution has been applied at Yunda, Travelsky, and Baosight, among others.
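To make the flow in the abstract concrete, here is a minimal single-machine sketch in plain Python. It is not the Analytics Zoo or BigDL API: the function names (`unroll`, `window_mean`, `detect_anomalies`) and the toy data are assumptions made for illustration, and a simple window mean stands in for the trained LSTM. The sketch shows the general idea only: turn the series into fixed-length windows, predict the next value from each window, and flag the points with the largest prediction errors.

```python
from typing import List, Sequence, Tuple


def unroll(series: Sequence[float], unroll_length: int) -> List[Tuple[List[float], float]]:
    """Turn a series into (window, next value) pairs for supervised training."""
    if unroll_length < 1:
        raise ValueError("unroll_length must be at least 1")
    return [
        (list(series[i:i + unroll_length]), series[i + unroll_length])
        for i in range(len(series) - unroll_length)
    ]


def window_mean(window: Sequence[float]) -> float:
    """Placeholder forecaster (assumption); a trained LSTM would take this role."""
    return sum(window) / len(window)


def detect_anomalies(y_true: Sequence[float], y_pred: Sequence[float],
                     anomaly_size: int) -> List[int]:
    """Return the positions of the anomaly_size largest absolute errors."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return sorted(ranked[:anomaly_size])


# Toy series (made up for illustration) with one obvious spike at index 5.
unroll_length = 3
series = [1.0, 1.1, 0.9, 1.0, 1.05, 9.0, 1.0, 0.95, 1.1, 1.0]

samples = unroll(series, unroll_length)
y_true = [target for _, target in samples]
y_pred = [window_mean(window) for window, _ in samples]

flagged = detect_anomalies(y_true, y_pred, anomaly_size=1)
# Sample i predicts the value at series index i + unroll_length.
spike_positions = [i + unroll_length for i in flagged]
print(spike_positions)
```

In a cluster setting, the windowing and scoring steps would run over distributed data rather than Python lists, and the forecaster would be a trained LSTM. Ranking by prediction error keeps the detection step independent of the model that produced the predictions.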