January 23, 2020

Executive Briefing: Why machine-learned models crash and burn in production and what to do about it

Machine learning and data science systems often fail in production in unexpected ways. David Talby shares real-world case studies showing why this happens and explains what you can do about it, covering best practices and lessons learned from a decade of experience building and operating such systems at Fortune 500 companies across several industries.


Talk Title	Executive Briefing: Why machine-learned models crash and burn in production and what to do about it
Speakers	David Talby (Pacific AI)
Conference	Strata Data Conference
Conf Tag	Make Data Work
Location	New York, New York
Date	September 11-13, 2018
URL	Talk Page
Slides	Talk Slides
Video

Much progress has been made over the past decade on process and tooling for managing large-scale, multitier cloud apps and APIs. There is far less common knowledge on best practices for managing machine-learned models (classifiers, forecasters, etc.), especially once these models are in production, beyond the modeling, optimization, and deployment process. A key mindset shift required to address these issues is understanding that model development differs from software development in fundamental ways. David Talby shares real-world case studies showing why this is true and explains what you can do about it, covering key best practices that executives, solution architects, and delivery teams must take into account when committing to deliver and operate data science-intensive systems in the real world. Topics include:

Data science in the cloud

November 27, 2019

In this talk, Alex discusses lessons learned from AWS SageMaker, an integrated framework for handling all stages of analysis. AWS uses open source components such as Jupyter, Docker containers, Python, and well-established deep learning frameworks such as Apache MXNet and TensorFlow to provide an easy-to-learn workflow.

Leveraging Spark and deep learning frameworks to understand data at scale

January 21, 2020

Vartika Singh, Alan Silva, Alex Bleakley, Steven Totman, Mirko Kämpf, and Syed Nasar outline approaches for preprocessing, training, inference, and deployment across datasets (time series, audio, video, text, etc.) that leverage Spark, its extended ecosystem of libraries, and deep learning frameworks.

Tutorial: P4 and P4Runtime Technical Introduction and Use Cases for Service Providers

January 21, 2020

P4 has gained industry momentum in the last year. New language features allow describing more forwarding devices, not only programmable ones but also fixed-function conventional ones. P4Runtime was in …

OpenDaylight Project Breakout: Introduction and Demo of the Infrautils. Metrics API and Its Implementations, Incl. Support for Prometheus.io From the CNCF

January 20, 2020

The ODL infrautils project proposes a new API to expose metrics from ODL SDN applications. This API is currently in the process of being adopted across the genius and netvirt projects, and could be o …

Arm Mini-Summit: Qualcomm Centriq Arm-based Servers for Edge Computing

January 19, 2020

Edge Computing is emerging as a promising new opportunity to support novel use cases that require a trifecta of low latency (as perceived by the end user), intensive data computation, and energy effic …

Modernizing operational architecture with big data: Creating and implementing a modern data strategy

January 19, 2020

The use of data throughout Cerner had taxed the company's legacy operational data store, data warehouse, and enterprise reporting pipeline to the point where it would no longer scale to meet needs. Jennifer Lim explains how Cerner modernized its corporate data platform with the use of a hybrid cloud architecture.