January 24, 2020

Deep learning on YARN: Running distributed TensorFlow, MXNet, Caffe, and XGBoost on Hadoop clusters

Training deep learning and machine learning models typically relies on frameworks and libraries such as TensorFlow, MXNet, Caffe, and XGBoost. Wangda Tan discusses new features in Apache Hadoop 3.x that better support deep learning workloads and demonstrates how to run these applications on YARN.

Talk Title: Deep learning on YARN: Running distributed TensorFlow, MXNet, Caffe, and XGBoost on Hadoop clusters
Speakers: Wangda Tan (Cloudera)
Conference: Strata Data Conference
Conf Tag: Make Data Work
Location: New York, New York
Date: September 11-13, 2018

Deep learning is useful for enterprise tasks such as speech recognition, image classification, AI chatbots, and machine translation, to name a few. Training deep learning and machine learning models typically relies on frameworks and libraries such as TensorFlow, MXNet, Caffe, and XGBoost. Wangda Tan discusses new features in Apache Hadoop 3.x that better support deep learning workloads, including first-class GPU support, container-DNS support, and scheduling improvements. These improvements make running distributed deep learning and machine learning applications on YARN as simple as running them locally, so machine learning engineers can focus on algorithms instead of the underlying infrastructure. Wangda then demonstrates how to run these applications on YARN.
