GPU Sharing for Machine Learning Workload on Kubernetes
| Field | Value |
| --- | --- |
| Talk Title | GPU Sharing for Machine Learning Workload on Kubernetes |
| Speakers | (Haining Henry) Zhang (Chief Architect, VMware), Yang Yu (Software Engineer, VMware) |
| Conference | KubeCon + CloudNativeCon Europe |
| Conf Tag | |
| Location | Barcelona, Spain |
| Date | May 19-23, 2019 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Machine learning is becoming increasingly popular in the technology world, and the community has begun to leverage Kubernetes to deploy and manage machine learning workloads. One of the key challenges is scheduling GPU-intensive workloads. Kubernetes includes GPU support for applications, but GPU usage has two notable limitations:

1. GPU assignment is exclusive: containers cannot share GPU resources.
2. A container can request one or more GPUs, but it is not possible to request a fraction of a GPU.

This session introduces how to run GPU workloads in Kubernetes. In addition, it demonstrates an approach that uses virtual GPU (vGPU) technology to enable multiple pods to concurrently access the same physical GPU. This approach not only increases the utilization of GPU resources but also allows more GPU workloads to be scheduled on the same physical GPU.
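The exclusivity limitation shows up directly in a Pod spec. A minimal sketch, assuming the NVIDIA device plugin is installed (which exposes GPUs as the `nvidia.com/gpu` extended resource); the image name is a hypothetical choice for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: trainer
    image: tensorflow/tensorflow:latest-gpu  # hypothetical GPU-enabled image
    resources:
      limits:
        nvidia.com/gpu: 1  # extended resources accept whole numbers only;
                           # a fractional value such as 0.5 is rejected
```

Because extended resource quantities must be integers, this is exactly the gap the vGPU approach in the talk aims to fill: carving one physical GPU into multiple schedulable units.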