September 22, 2019

Anomaly Detection for Cloud Native Storage

Talk Title: Anomaly Detection for Cloud Native Storage
Speakers: Seiya Takei (Storage Engineer, Yahoo Japan Corporation), Xing Yang (Tech Lead, VMware)
Conference: KubeCon + CloudNativeCon
Location: Shanghai, China
Date: Jun 23-26, 2019
URL: Talk Page
Slides: Talk Slides

Integrating with heterogeneous storage in the Cloud Native environment has always been a challenge. Detecting problems and fixing them promptly is important for mission-critical workloads. In this session, Takei-san and Xing will describe a common volume metrics model designed to retrieve data from heterogeneous storage in the Cloud Native environment. They will also illustrate an ML module that analyzes the data to detect anomalous behavior, and discuss how it helps Yahoo Japan identify problems early to keep the Cloud Native storage system healthy. Volume metrics such as IOPS, bandwidth, latency, and capacity are collected from the storage backends serving workloads running on Kubernetes, and emitted to the Prometheus server. The ML module retrieves the data from Prometheus and applies algorithms to detect anomalies. Results are evaluated and alerts are issued when needed.
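The flow described in the abstract (read volume metrics from Prometheus, then look for anomalies) can be sketched roughly as follows. This is a minimal illustration, not the speakers' implementation: the abstract does not say which algorithms the ML module uses, so a simple rolling z-score stands in for them here. The metric name `volume_latency_ms`, the `volume` label, the host `prometheus.example`, and the sample data are all hypothetical. The only real interface assumed is the Prometheus HTTP API range query (`/api/v1/query_range`) and its "matrix" response shape.

```python
import json
import statistics
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, start, end, step):
    """Build a URL for the Prometheus HTTP API range query endpoint."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


def parse_matrix(payload):
    """Turn a 'matrix' query response into {volume: [(timestamp, value), ...]}."""
    series = {}
    for item in payload["data"]["result"]:
        # "volume" is a hypothetical label name; real exporters may differ.
        volume = item["metric"].get("volume", "unknown")
        # Prometheus encodes sample values as strings, so convert them.
        series[volume] = [(float(ts), float(v)) for ts, v in item["values"]]
    return series


def detect_anomalies(samples, window=5, threshold=3.0):
    """Flag samples whose z-score against the preceding window exceeds threshold.

    A rolling z-score is an assumption standing in for the talk's ML module.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = [value for _, value in samples[i - window:i]]
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history)
        if stdev == 0:
            # Flat window: the z-score is undefined, so this sketch skips it.
            continue
        ts, value = samples[i]
        if abs(value - mean) / stdev > threshold:
            anomalies.append((ts, value))
    return anomalies


# Made-up response in the shape returned by /api/v1/query_range.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [{
            "metric": {"__name__": "volume_latency_ms", "volume": "pvc-demo"},
            "values": [
                [0, "2.0"], [60, "2.1"], [120, "1.9"], [180, "2.0"],
                [240, "2.2"], [300, "2.1"], [360, "9.5"], [420, "2.0"],
            ],
        }],
    },
})

if __name__ == "__main__":
    # In a real setup the URL would be fetched over HTTP; here the sample
    # response is parsed directly so the sketch runs without a server.
    print(build_range_query_url(
        "http://prometheus.example:9090", "volume_latency_ms", 0, 420, "60s"))
    for volume, samples in parse_matrix(json.loads(SAMPLE_RESPONSE)).items():
        print(volume, detect_anomalies(samples))
```

With the sample data, the latency spike at timestamp 360 is the only point flagged. In a setup like the one described, flagged points would feed the evaluation and alerting step rather than being printed.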
