High Performance Deep Learning on Containers
Talk Title | High Performance Deep Learning on Containers |
Speakers | Khalid Ahmed (Distinguished Engineer, IBM), Bruce D’Amora (Senior Technical Staff Member, IBM) |
Conference | Open Source Summit North America |
Location | Los Angeles, CA, United States |
Date | Sep 10-14, 2017 |
URL | Talk Page |
Slides | Talk Slides |
The field of deep learning has led to the emergence of new frameworks such as Caffe, Torch, and TensorFlow that tackle problems in image recognition, object classification, and machine translation. These systems must interact with containerized microservices that are built with DevOps tools and run on popular container management platforms such as Kubernetes. In this talk we examine work in the Kubernetes ecosystem that addresses the special requirements of deep learning: GPU support, high-speed networking, access to large data sets, better batch job scheduling, and distributed computing support. Using examples from research and industry, we show how the Kubernetes platform can support both CI/CD pipelines and high performance computing workloads.
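To make the GPU and batch scheduling requirements more concrete, here is a minimal sketch of a training workload expressed as a Kubernetes batch Job that requests a GPU. It is not taken from the talk. The job name, image, and command are hypothetical placeholders, and the `nvidia.com/gpu` resource name assumes a cluster with the NVIDIA device plugin installed (clusters from 2017 exposed GPUs under a different, alpha resource name).

```python
import json


def gpu_training_job(name, image, command, gpus=1):
    """Build a Kubernetes batch/v1 Job manifest that requests NVIDIA GPUs.

    Assumption: the cluster advertises GPUs as the extended resource
    "nvidia.com/gpu" (NVIDIA device plugin). All names are illustrative.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry a failed training run automatically.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    # Jobs require a restart policy of Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "trainer",
                            "image": image,
                            "command": command,
                            # GPUs are requested through resource limits.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ],
                }
            },
        },
    }


# Hypothetical image and entry point, used only for illustration.
manifest = gpu_training_job(
    name="dl-train",
    image="example.org/dl/trainer:latest",
    command=["python", "train.py"],
)
print(json.dumps(manifest, indent=2))
```

Saved to a file, the printed JSON could be submitted with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML. The scheduler would then place the pod only on a node that has a free GPU.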