Volcano: Running AI/DL workload on Kubernetes
| Talk Title | Volcano: Running AI/DL workload on Kubernetes |
| --- | --- |
| Speakers | Da Ma (Software Architect, Huawei) |
| Conference | KubeCon + CloudNativeCon |
| Conf Tag | |
| Location | Shanghai, China |
| Date | Jun 23-26, 2019 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Kubernetes started as a general-purpose orchestration framework with a focus on serving jobs. But as it gains popularity, users want to run AI/DL workloads on Kubernetes, such as TensorFlow and PyTorch. Running these workloads on Kubernetes requires several advanced capabilities, e.g. fair-share scheduling, queues, job management (suspend/resume), and data management. This talk will demonstrate how to use Volcano to bring these "batch" capabilities to Kubernetes.
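As a rough illustration of the "batch" capabilities the talk covers, the sketch below shows what a Volcano Job manifest could look like: the job is handed to the Volcano scheduler, placed in a queue for fair sharing, and gang-scheduled via minAvailable so that parameter-server and worker pods start together. The job name, queue, image tag, and replica counts are illustrative assumptions, not taken from the talk.

```yaml
# Illustrative Volcano Job (batch.volcano.sh/v1alpha1); names and images are examples only.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tf-dist-example        # hypothetical job name
spec:
  schedulerName: volcano       # hand the job to the Volcano scheduler
  queue: default               # queue used for fair sharing across jobs/teams
  minAvailable: 3              # gang scheduling: all 3 pods must be schedulable before any start
  policies:
    - event: PodEvicted
      action: RestartJob       # lifecycle policy applied to the whole job
  tasks:
    - name: ps
      replicas: 1
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: tensorflow
              image: tensorflow/tensorflow:1.14.0   # illustrative image tag
              command: ["python", "/app/train.py", "--role=ps"]
    - name: worker
      replicas: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: tensorflow
              image: tensorflow/tensorflow:1.14.0
              command: ["python", "/app/train.py", "--role=worker"]
```

Such a manifest would be created with `kubectl apply -f job.yaml`; queue management and suspend/resume of jobs are the kind of operations the talk demonstrates on top of this Job abstraction.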