Deep learning with TensorFlow and Spark using GPUs and Docker containers
Organizations need to keep ahead of their competition by using the latest AI, ML, and DL technologies such as Spark, TensorFlow, and H2O. The challenge is deploying these tools and keeping them running consistently while maximizing the use of scarce hardware resources, such as GPUs. Thomas Phelan discusses the effective deployment of such applications in a container environment.
|Talk Title|Deep learning with TensorFlow and Spark using GPUs and Docker containers|
|Speakers|Thomas Phelan (HPE BlueData)|
|Conference|Strata Data Conference|
|Conf Tag|Making Data Work|
|Location|London, United Kingdom|
|Date|April 30-May 2, 2019|
Organizations understand the need to keep pace with newer technologies and methodologies for data science with machine learning and deep learning. Data volume and complexity are increasing by the day, so it’s imperative that companies understand their business better and stay on par with or ahead of the competition. Thanks to applications such as Apache Spark, H2O, and TensorFlow, these organizations no longer have to lock in to a specific vendor technology or proprietary solution. These rich deep learning applications are available in the open source community, with many different algorithms and options for various use cases.

However, one of the biggest challenges is getting all these open source tools up and running in an easy and consistent manner (keeping in mind that some of them depend on specific OS kernel and software components). For example, TensorFlow can leverage NVIDIA GPU resources, but installing TensorFlow for GPU requires users to set up the NVIDIA CUDA libraries on the machine and then install and configure TensorFlow to use the GPU. The combination of device drivers, libraries, and software versions can be daunting and may be a nonstarter for many users.

Since GPUs are a premium resource, organizations that want to leverage this capability need to bring up clusters with these resources on demand and then relinquish them once computation is complete. Docker containers can provide this rapid cluster provisioning and deprovisioning and can help ensure reproducible builds and easier deployment. Thomas Phelan demonstrates how to deploy a TensorFlow and Spark stack with NVIDIA CUDA on Docker containers in a multitenant environment.
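Once the driver, the CUDA libraries, and a GPU build of TensorFlow are in place, a quick check from inside the container can confirm that the pieces line up. The following is a minimal sketch and is not taken from the talk: the helper name `visible_gpus` is hypothetical, and it assumes a TensorFlow release that exposes `list_physical_devices` under `tf.config` (or under `tf.config.experimental` in some earlier releases). It returns an empty list when TensorFlow is not installed or no GPU is visible.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] if there are none.

    Hypothetical helper for illustration only; not part of the talk.
    """
    try:
        # Imported lazily so the check degrades gracefully without TensorFlow.
        import tensorflow as tf
    except ImportError:
        return []

    # Assumption: the device-listing call lives on tf.config in newer releases
    # and on tf.config.experimental in some older ones.
    config = getattr(tf, "config", None)
    lister = getattr(config, "list_physical_devices", None)
    if lister is None:
        experimental = getattr(config, "experimental", None)
        lister = getattr(experimental, "list_physical_devices", None)
    if lister is None:
        return []
    return [device.name for device in lister("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty result on a GPU host usually points to a mismatch somewhere in the driver, CUDA library, and TensorFlow version combination described above.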
Using GPU-based services with Docker containers requires careful consideration, so Thomas shares best practices covering the pros and cons of NVIDIA-Docker versus regular Docker containers, CUDA library usage in Docker containers, Docker run parameters for passing GPU devices to containers, storing results from transient clusters, and integration with Spark.
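To make the NVIDIA-Docker versus regular Docker distinction concrete, the sketch below builds the `docker run` argument list for each approach. It is an illustration under stated assumptions, not the configuration shown in the talk: the function names are hypothetical, the image `tensorflow/tensorflow:latest-gpu` is only an example, the first variant assumes the `nvidia` runtime from nvidia-docker 2 is registered with the Docker daemon, and the second assumes the usual NVIDIA device nodes exist on the host and that the image already carries driver libraries matching the host driver.

```python
import shlex

IMAGE = "tensorflow/tensorflow:latest-gpu"  # example image; substitute your own


def nvidia_runtime_args(image, gpu_ids, command):
    """Build docker run arguments that rely on the NVIDIA container runtime.

    Assumes the "nvidia" runtime (nvidia-docker 2) is registered with Docker.
    """
    visible = ",".join(str(i) for i in gpu_ids)
    return [
        "docker", "run", "--rm",
        "--runtime=nvidia",
        "-e", f"NVIDIA_VISIBLE_DEVICES={visible}",  # which GPUs the container sees
        image, *command,
    ]


def plain_docker_args(image, gpu_ids, command):
    """Build docker run arguments that pass GPU device nodes to a regular container.

    Assumes the image already contains driver libraries matching the host driver.
    """
    devices = ["/dev/nvidiactl", "/dev/nvidia-uvm"]   # control and unified-memory nodes
    devices += [f"/dev/nvidia{i}" for i in gpu_ids]   # one node per GPU
    args = ["docker", "run", "--rm"]
    for dev in devices:
        args += ["--device", dev]
    return args + [image, *command]


if __name__ == "__main__":
    check = ["python", "-c", "import tensorflow as tf; print(tf.__version__)"]
    for build in (nvidia_runtime_args, plain_docker_args):
        # Print the commands rather than running them, so this works without Docker.
        print(" ".join(shlex.quote(a) for a in build(IMAGE, [0, 1], check)))
```

The trade-off this exposes: the runtime-based form keeps driver details out of the image, while the plain form works without extra tooling on the host but ties the image to a specific driver version.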