December 17, 2019

199 words 1 min read

Introducing KFServing: Serverless Model Serving on Kubernetes

Talk Title: Introducing KFServing: Serverless Model Serving on Kubernetes
Speakers: Dan Sun (Senior Software Engineer, Bloomberg), Ellis Bigelow (Software Engineer, Google)
Conference: KubeCon + CloudNativeCon North America
Location: San Diego, CA, USA
Date: Nov 15-21, 2019
URL: Talk Page
Slides: Talk Slides

Production-grade serving of ML models is a challenging task for data scientists. In this talk, we’ll discuss how KFServing powers some real-world examples of inference in production at Bloomberg, which supports the business domains of NLP, computer vision, and time-series analysis. KFServing (https://github.com/kubeflow/kfserving) provides a Kubernetes CRD for serving ML models on arbitrary frameworks. It aims to solve 80% of model-serving use cases by providing performant, high-abstraction interfaces for common ML frameworks. It offers a consistent and richly featured abstraction that supports bleeding-edge serving features like CPU/GPU auto-scaling, scale to and from zero, and canary rollouts. KFServing’s charter includes a rich roadmap to fulfill a complete story for mission-critical ML, including inference graphs, model explainability, outlier detection, and payload logging.
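To make the CRD-based workflow above concrete, here is a minimal sketch of creating a KFServing InferenceService through the Kubernetes custom-objects API. The v1alpha2 group/version, the sklearn predictor fields, and the sample storageUri are assumptions drawn from the KFServing samples of that era, not from this talk, so verify them against the CRD version installed in your cluster.

```python
# Sketch: submit a KFServing InferenceService via the Kubernetes Python client.
# The apiVersion, kind, and sklearn storageUri below are assumptions based on
# the KFServing v1alpha2 samples; adjust to match your cluster's CRD.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
api = client.CustomObjectsApi()

inference_service = {
    "apiVersion": "serving.kubeflow.org/v1alpha2",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-iris", "namespace": "default"},
    "spec": {
        # "default" serves all traffic; adding a "canary" spec plus
        # canaryTrafficPercent is how canary rollouts split traffic.
        "default": {
            "predictor": {
                "sklearn": {
                    "storageUri": "gs://kfserving-samples/models/sklearn/iris"
                }
            }
        }
    },
}

# Create the custom resource; KFServing's controller then provisions the
# serverless predictor (auto-scaling, scale to and from zero) behind it.
api.create_namespaced_custom_object(
    group="serving.kubeflow.org",
    version="v1alpha2",
    namespace="default",
    plural="inferenceservices",
    body=inference_service,
)
```

In practice the same manifest is usually applied as YAML with kubectl; the point of the sketch is that a single declarative object captures the model location and serving behavior, and the controller handles the rest.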
