November 28, 2019


Classifying job execution using deep learning


Ash Munshi shares techniques for labeling big data apps using runtime measurements of CPU, memory, I/O, and network, and details a deep neural network that helps operators understand the types of apps running on the cluster so they can better predict runtimes, tune resource utilization, and increase efficiency. The talk presents these methods as new and as the first approach to classifying multivariate time series.

Talk Title: Classifying job execution using deep learning
Speakers: Ash Munshi (Pepperdata)
Conference: Strata Data Conference
Conf Tag: Big Data Expo
Location: San Jose, California
Date: March 6-8, 2018

Operators of big data clusters must understand the types of applications that run on these clusters in order to better predict runtimes, tune resource utilization, and increase efficiency. Unfortunately, application developers seldom provide enough meaningful information to make this possible. Ash Munshi shares techniques for labeling big data apps using runtime measurements of CPU, memory, I/O, and network, and details a deep neural network that helps operators understand the types of apps running on the cluster. This labeling groups applications into buckets with understandable characteristics, which can then be used to reason about the cluster and its performance. For example, members of a single group can be studied to understand variability in runtime, the effects of different queue assignments, the effects of the underlying system hardware architecture, and even the effects of start times for periodic applications. The machine learning techniques presented are described as new and as the first approach to classifying multivariate time series. The data for the models comes from sampling the task metrics of more than 22,000 servers every five seconds over a period of months.
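To make the labeling idea concrete, here is a minimal sketch of assigning a label to a job based on its multivariate time series. It is not the deep neural network from the talk, whose architecture is not described here. It swaps in a much simpler nearest-centroid classifier over per-metric summary statistics, purely to illustrate the input shape (four metrics sampled every five seconds) and the output (a bucket label). The function names, the label names, and all sample values are hypothetical.

```python
from math import dist
from statistics import mean, pstdev

SAMPLE_INTERVAL_S = 5  # sampling period mentioned in the abstract
METRICS = ("cpu", "memory", "io", "network")  # the four measured dimensions


def summarize(series):
    """Collapse a multivariate time series into a fixed-length feature vector.

    `series` maps each metric name to a list of utilization samples in [0, 1].
    The result holds (mean, population std dev) per metric, in METRICS order.
    """
    features = []
    for name in METRICS:
        samples = series[name]
        features.extend((mean(samples), pstdev(samples)))
    return features


def fit_centroids(labeled_runs):
    """Average the feature vectors of each label to get one centroid per label."""
    grouped = {}
    for label, series in labeled_runs:
        grouped.setdefault(label, []).append(summarize(series))
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(series, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    features = summarize(series)
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Hypothetical training runs: 12 samples each, i.e. one minute at 5 s intervals.
cpu_bound = {"cpu": [0.9, 0.95] * 6, "memory": [0.3] * 12,
             "io": [0.05] * 12, "network": [0.02] * 12}
io_bound = {"cpu": [0.2] * 12, "memory": [0.25] * 12,
            "io": [0.8, 0.9] * 6, "network": [0.1] * 12}
centroids = fit_centroids([("cpu-bound", cpu_bound), ("io-bound", io_bound)])

# An unlabeled run with high, steady CPU use and little I/O.
unknown = {"cpu": [0.85] * 12, "memory": [0.35] * 12,
           "io": [0.1] * 12, "network": [0.05] * 12}
label = classify(unknown, centroids)
print(label)  # cpu-bound
```

Summary statistics discard the ordering of samples, which is exactly the information a time series model can exploit, so this stand-in only conveys the framing of the problem: runtime measurements in, an application bucket out.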
