December 22, 2019

Online evaluation of machine learning models

Evaluating machine learning models is surprisingly hard, particularly because these systems interact in very subtle ways. Ted Dunning breaks the problem of evaluation into operational and functional evaluation, demonstrating how to do each without unnecessary pain and suffering. Along the way, he shares exciting visualization techniques that will help make differences strikingly apparent.

Talk Title: Online evaluation of machine learning models
Speakers: Ted Dunning (MapR, now part of HPE)
Conference: Strata Data Conference
Conf Tag: Big Data Expo
Location: San Francisco, California
Date: March 26-28, 2019
URL: Talk Page
Slides: Talk Slides
Video:

Academic machine learning almost exclusively involves offline evaluation of models. In the real world this is, somewhat surprisingly, only good enough for a rough cut that eliminates the real dogs. For production work, online evaluation is often the only way to determine which of several final-round candidates should be chosen for further use.

As Einstein is rumored to have said, theory and practice are the same, in theory. In practice, they are different. So it is with models. Part of the problem is interaction with other models and systems. Part of it has to do with the variability of the real world. Often, there are adversaries at work. It may even be sunspots. One particular problem arises when models choose their own training data and thus couple back onto themselves.

In addition to these difficulties, production models almost always have service-level agreements that govern how quickly they must produce results and how often they are allowed to fail. These operational considerations can be as important as the accuracy of the model: the right results returned late are worse than slightly wrong results returned in time.

Ted Dunning offers a survey of useful ways to evaluate models in the real world, breaking the problem of evaluation into operational and functional evaluation and demonstrating how to do each without unnecessary pain and suffering. You’ll learn about decoy and canary models, nonlinear latency histogramming, model-delta diagrams, and more. These techniques may sound arcane, but each is simple at heart and doesn’t require any advanced mathematics to understand. Along the way, he shares exciting visualization techniques that will help make differences strikingly apparent.
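
The abstract names these techniques without describing them, so the following Python is only a rough sketch of how two of them might look, not the method presented in the talk. It assumes that "nonlinear latency histogramming" means buckets whose width grows with the latency (here, logarithmically spaced), and that a canary model is a stable reference model whose outputs are compared, request by request, against a candidate's. All function names, parameters, and the tolerance value are hypothetical.

```python
import math
from collections import Counter


def log_bucket(latency_ms, buckets_per_decade=10, floor_ms=0.001):
    """Return the index of the log-spaced bucket that holds latency_ms.

    Bucket i covers [10**(i/b), 10**((i+1)/b)) milliseconds, so bucket
    width grows in proportion to the latency itself.
    """
    clamped = max(latency_ms, floor_ms)  # avoid log10(0) for zero readings
    return math.floor(buckets_per_decade * math.log10(clamped))


def latency_histogram(latencies_ms, buckets_per_decade=10):
    """Count observations per log-spaced bucket."""
    return Counter(log_bucket(x, buckets_per_decade) for x in latencies_ms)


def bucket_bounds(index, buckets_per_decade=10):
    """Lower and upper latency (ms) covered by a bucket index."""
    return (10 ** (index / buckets_per_decade),
            10 ** ((index + 1) / buckets_per_decade))


def disagreement_rate(canary_scores, candidate_scores, tolerance=0.05):
    """Fraction of requests where the candidate's score differs from the
    canary's by more than tolerance. Lists must be aligned by request."""
    if len(canary_scores) != len(candidate_scores):
        raise ValueError("score lists must have the same length")
    if not canary_scores:
        return 0.0
    differing = sum(
        1 for a, b in zip(canary_scores, candidate_scores)
        if abs(a - b) > tolerance
    )
    return differing / len(canary_scores)


if __name__ == "__main__":
    # Made-up latencies (ms) and scores, purely for illustration.
    latencies = [1.5, 1.55, 15.0, 150.0, 155.0, 2000.0]
    for index, count in sorted(latency_histogram(latencies).items()):
        low, high = bucket_bounds(index)
        print(f"{low:8.1f} to {high:8.1f} ms: {count}")
    print(disagreement_rate([0.1, 0.5, 0.9, 0.3], [0.12, 0.7, 0.88, 0.3]))
```

Log-spaced buckets keep roughly constant relative resolution, so a shift from 1 ms to 2 ms and a shift from 1 s to 2 s are equally visible in the same plot. That matters when the long tail is what a service-level agreement constrains.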
