November 30, 2019

Democratizing Machine Learning on Kubernetes [I]

Talk Title Democratizing Machine Learning on Kubernetes [I]
Speakers Lachlan Evenson (Principal Program Manager - Azure Container Compute, Microsoft), Joy Qiao (Senior Solution Architect - AI and Research Group, Microsoft)
Conference KubeCon + CloudNativeCon North America
Conf Tag
Location Austin, TX, United States
Date Dec 4-8, 2017
URL Talk Page
Slides Talk Slides
Video

One of the largest challenges facing the machine learning community today is understanding how to build a platform to run common open-source machine learning libraries such as TensorFlow. Joy and Lachie are both passionate about making machine learning accessible to the masses using Kubernetes. In this session, they’ll show how to deploy a distributed TensorFlow training cluster, complete with GPU scheduling, on Kubernetes. They’ll also explain how distributed TensorFlow training works, the various options for distributed training, and when to choose each one. Finally, they’ll share best practices for running distributed TensorFlow on Kubernetes, based on their latest performance tests on public cloud providers. All work presented in this session will be accessible via a public GitHub repository.
