Kubeflow: Portable machine learning on Kubernetes (sponsored by Google)
Michelle Casbon offers an overview of Kubeflow. By providing a platform that reduces variability between services and environments, Kubeflow enables applications that are more robust and resilient, reducing downtime, quality issues, and customer impact. It also supports specialized hardware such as GPUs, which can reduce operational costs and improve model performance.
| Talk Title | Kubeflow: Portable machine learning on Kubernetes (sponsored by Google) |
| Speakers | Michelle Casbon (Google) |
| Conference | Artificial Intelligence Conference |
| Conf Tag | Put AI to Work |
| Location | San Francisco, California |
| Date | September 5-7, 2018 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Practically speaking, some of the biggest challenges facing ML applications are composability, portability, and scalability. The Kubernetes framework is well suited to address these issues, which is why it's a great foundation for deploying ML products. Michelle Casbon offers an overview of Kubeflow, which is designed to take advantage of these benefits by providing a sustainable, repeatable platform that supports the full lifecycle of an ML application. Kubeflow removes the need for expertise in a large number of areas, lowering the barrier to entry for developing and maintaining ML products.

The composability problem is addressed by providing a single, unified tool for running common processes such as data ingestion, transformation, and analysis; model training, evaluation, and serving; and monitoring, logging, and other operational tasks. The portability problem is resolved by supporting use of the entire stack locally, on-premises, or on the cloud platform of your choice. Scalability is native to the Kubernetes platform and is leveraged by Kubeflow to run all aspects of the product, including resource-intensive model training tasks.

By providing a platform that reduces variability between services and environments, Kubeflow enables applications that are more robust and resilient, reducing downtime, quality issues, and customer impact. It also supports specialized hardware such as GPUs, which can reduce operational costs and improve model performance. This session is sponsored by Google.
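To make the scalability and GPU points concrete: Kubeflow expresses training workloads as Kubernetes custom resources, so scaling out workers or requesting accelerators is a matter of declaring them in a manifest. The sketch below shows a minimal TFJob; the job name, container image, replica count, and GPU count are all hypothetical, and the exact `apiVersion` varies by Kubeflow release:

```yaml
# Minimal sketch of a Kubeflow TFJob custom resource.
# Names, image, and resource values are illustrative only.
apiVersion: kubeflow.org/v1        # version string depends on the Kubeflow release
kind: TFJob
metadata:
  name: mnist-train                # hypothetical job name
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2                  # scale out training by adding workers
      template:
        spec:
          containers:
            - name: tensorflow
              image: gcr.io/example/mnist-train:latest   # hypothetical image
              resources:
                limits:
                  nvidia.com/gpu: 1    # request specialized hardware (GPU)
```

Because the same manifest can be applied to a local cluster, an on-premises deployment, or a managed cloud Kubernetes service, the declaration itself is what makes the workload portable.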