Machine learning versus machine learning in production
Acme Corporation is a global leader in commerce marketing. Manu Mukerji walks you through Acme Corporation's machine learning example for universal catalogs, explaining how the training and test sets are generated and annotated; how the model is pushed to production, automatically evaluated, and used; the issues that arise when applying ML at scale in production; lessons learned; and more.
Talk Title | Machine learning versus machine learning in production |
Speakers | Manu Mukerji (8x8) |
Conference | Strata Data Conference |
Conf Tag | Big Data Expo |
Location | San Jose, California |
Date | March 6-8, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Acme Corporation, a global leader in commerce marketing, classifies 4.5B products a day into ~4,500 categories using Google Taxonomy. At 600 TB of data per day, Acme Corporation has the largest Hadoop cluster in Europe. Manu Mukerji walks you through Acme Corporation’s machine learning example for universal catalogs, explaining how the training and test sets are generated and annotated; how those sets were created when no public training data was available; how the model is pushed to production, automatically evaluated, and used; how Acme Corporation built a Hadoop/Spark pipeline with different types of models that predict various values; the issues that arise when applying ML at scale in production; and lessons learned along the way.