February 20, 2020

Deep learning with Horovod and Spark using GPUs and Docker containers

Today, organizations understand the need to keep pace with new technologies when it comes to performing data science with machine learning and deep learning, but these new technologies come with their own challenges. Thomas Phelan demonstrates the deployment of TensorFlow, Horovod, and Spark using the NVIDIA CUDA stack on Docker containers in a secure multitenant environment.

Talk Title Deep learning with Horovod and Spark using GPUs and Docker containers
Speakers Thomas Phelan (HPE BlueData)
Conference O’Reilly Artificial Intelligence Conference
Conf Tag Put AI to Work
Location London, United Kingdom
Date October 15-17, 2019
URL Talk Page
Slides Talk Slides

Data volume and complexity increase by the day, so it’s imperative that companies understand their business needs in order to stay ahead of their competition. Thanks to AI, ML, and deep learning (DL) projects such as Apache Spark, H2O, TensorFlow, and Horovod, these organizations no longer have to lock in to a specific vendor technology or proprietary solution to maintain this competitive advantage. These feature-rich deep learning applications are available directly from the open source community, with many different algorithms and options tailored for specific use cases.

One of the biggest challenges for the enterprise is how to deploy these open source tools in an easy and consistent manner (keeping in mind that some of them depend on operating system kernel and software components). For example, TensorFlow can leverage NVIDIA GPU resources, but running TensorFlow with GPUs requires users to set up NVIDIA CUDA libraries on the host and install and configure the TensorFlow application to make use of the GPU computing facility. The combination of device drivers, libraries, and software versions can be daunting and may end in failure for many users.

Moreover, since GPUs are a premium resource, organizations want to maximize their use. Clusters using these resources need to be configured on demand and freed immediately after computation is complete. Docker containers are ideal for enabling just this sort of instant cluster provisioning and deprovisioning. They also ensure reproducible and consistent deployment.

Thomas Phelan demonstrates how to deploy AI, ML, and DL applications, including Spark, TensorFlow, and Horovod, using GPU hardware acceleration on Docker containers in a secure multitenant environment. The use of GPU-based services within Docker containers does require some careful consideration, so he’ll also explore some best practices.
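As a rough illustration of the on-demand GPU container idea described above, the following Python sketch builds (but does not run) a `docker run` command that starts a throwaway container, asks TensorFlow which GPUs it can see, and removes the container on exit. It is not taken from the talk: the image name `tensorflow/tensorflow:latest-gpu`, the helper `gpu_check_command`, and the assumption that the host runs Docker 19.03 or later with the NVIDIA container runtime installed are all choices made for this example.

```python
import shlex

# Assumption: a public GPU-enabled TensorFlow image; the talk does not name one.
IMAGE = "tensorflow/tensorflow:latest-gpu"

# One-liner executed inside the container: list the GPUs TensorFlow can see.
CHECK = "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"


def gpu_check_command(image=IMAGE, gpus="all"):
    """Build (but do not run) a `docker run` command for a throwaway GPU check.

    Hypothetical helper for illustration only.
    """
    return [
        "docker", "run",
        "--rm",          # remove the container on exit so the GPU is freed
        "--gpus", gpus,  # expose host GPUs to the container (Docker 19.03+)
        image,
        "python", "-c", CHECK,
    ]


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(gpu_check_command()))
```

Passing `gpus="device=0"` instead of `"all"` would limit the container to a single GPU, which is one simple way to share a premium device among several jobs.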
