Supercharge Kubeflow Performance on GPU Clusters
Talk Title | Supercharge Kubeflow Performance on GPU Clusters |
Speakers | Meenakshi Kaushik (Product Manager, Cisco), Neelima Mukiri (Principal Engineer, Cisco) |
Conference | KubeCon + CloudNativeCon North America |
Conf Tag | |
Location | San Diego, CA, USA |
Date | Nov 15-21, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
AI/ML applications on Kubernetes can be optimized for performance at many levels. This presentation provides an overview of the optimizations, such as:
- Distributed training on multiple GPUs with optimal selection of interconnects between the GPUs and CPUs.
- Utilizing different types of GPUs/servers for different workloads like training and inference.
- OS-level optimizations to get optimal performance on the hardware.
- Usage of GPU passthrough for optimal utilization and performance.
This presentation will also cover how the selection of a machine learning framework, like Kubeflow, can impact performance and hardware utilization.
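As a rough illustration of interconnect-aware GPU selection for distributed training, here is a minimal Python sketch. It is not taken from the talk. The link labels loosely follow those printed by `nvidia-smi topo -m`, but the numeric ranking in `LINK_SCORE`, the 4-GPU `TOPOLOGY` map, and the helper names `group_score` and `best_group` are all hypothetical and exist only for this example.

```python
from itertools import combinations

# Assumed ranking of link types (higher = faster path). The labels mirror
# those shown by `nvidia-smi topo -m`; the scores are illustrative only.
LINK_SCORE = {"NV2": 5, "NV1": 4, "PIX": 3, "PHB": 2, "SYS": 1}

# Hypothetical 4-GPU server: TOPOLOGY[(i, j)] is the link between GPU i and j.
TOPOLOGY = {
    (0, 1): "NV2", (0, 2): "NV1", (0, 3): "SYS",
    (1, 2): "PHB", (1, 3): "SYS", (2, 3): "NV2",
}


def group_score(gpus, topology=TOPOLOGY):
    """Score a GPU group by its slowest pairwise link (the bottleneck)."""
    pairs = combinations(sorted(gpus), 2)
    return min(LINK_SCORE[topology[pair]] for pair in pairs)


def best_group(num_gpus, size, topology=TOPOLOGY):
    """Return the group of `size` GPUs whose bottleneck link is fastest."""
    candidates = combinations(range(num_gpus), size)
    return max(candidates, key=lambda group: group_score(group, topology))


if __name__ == "__main__":
    print(best_group(4, 2))
    print(best_group(4, 3))
```

Scoring a group by its slowest link reflects the assumption that collective operations such as all-reduce are limited by the weakest path between participants. A real scheduler would read the topology from the node rather than from a hard-coded map.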