Democratizing Machine Learning on Kubernetes [I]
One of the largest challenges facing the machine learning community today is understanding how to build a platform to run common open-source machine learning libraries such as Tensorflow. Both Joy and …
Talk Title | Democratizing Machine Learning on Kubernetes [I] |
Speakers | Lachlan Evenson (Principal Program Manager - Azure Container Compute, Microsoft), Joy Qiao (Senior Solution Architect - AI and Research Group, Microsoft) |
Conference | KubeCon + CloudNativeCon North America |
Conf Tag | |
Location | Austin, TX, United States |
Date | Dec 4- 8, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
One of the largest challenges facing the machine learning community today is understanding how to build a platform to run common open-source machine learning libraries such as Tensorflow. Both Joy and Lachie are both passionate about making machine learning accessible to the masses using Kubernetes. In this session they’ll share how to deploy a distributed Tensorflow training cluster complete with GPU scheduling on Kubernetes. We’ll also share how distributed Tensorflow training works, various options for distributed training, and when to choose what option. We’ll also share some best practices on using distributed Tensorflow on top of Kubernetes, based on our latest performance tests performed on public cloud providers. All work presented in this session will be accessible via a public Github repository.