Building high-performance text classifiers on a limited labeling budget
| Talk Title | Building high-performance text classifiers on a limited labeling budget |
|------------|--------------------------------------------------------------------------|
| Speakers   | Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Ali Zaidi (Microsoft) |
| Conference | Strata Data Conference |
| Conf Tag   | Big Data Expo |
| Location   | San Francisco, California |
| Date       | March 26-28, 2019 |
| URL        | Talk Page |
| Slides     | Talk Slides |
| Video      | |
Robert Horton, Mario Inchiosa, and Ali Zaidi demonstrate how to use three cutting-edge machine learning techniques—transfer learning from pretrained language models, active learning to make more effective use of a limited labeling budget, and hyperparameter tuning to maximize model performance—to up your modeling game.

Though plentiful data is available in many domains, often the limiting factor in applying supervised machine learning techniques is the availability of useful labels. Labels often represent the interpretation that a human applies to an example, and obtaining such labels can be expensive, particularly in application domains where experts are highly compensated or difficult to find.

Active learning is a model-driven selection process that helps to make more effective use of a labeling budget. Robert, Mario, and Ali start by building a model on a small dataset, then use that model to select additional examples to label. Using multiple rounds of modeling and selection, you can obtain training sets that lead to much better-performing models than would be expected from training on a randomly selected dataset of similar size (a minimal sketch of such a selection loop appears after this description).

Many state-of-the-art results in natural language processing rely on the ability to use complex models with large datasets to learn rich representations useful for multiple tasks. Robert, Mario, and Ali’s examples use transfer learning from a pretrained language model to generate features that can be effectively used by low-complexity classifier models capable of training on relatively small datasets (see the feature-extraction sketch below).

As you integrate machine learning and AI into business processes, even small improvements in predictive performance can translate into huge ROI, so hyperparameter tuning is now an inherent part of many ML pipelines. Robert, Mario, and Ali explain how to leverage Spark clusters in platforms such as Azure Databricks to perform hyperparameter tuning, and detail the improvements this tuning produces in your classifier (a distributed tuning sketch closes the section).
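To make the selection loop concrete, here is a minimal uncertainty-sampling sketch in Python. The talk does not specify the speakers' exact implementation; `X_pool`, `get_label`, and `seed_idx` are hypothetical stand-ins for an unlabeled feature pool, a human annotator, and an initial labeled seed set.

```python
# Minimal uncertainty-sampling active learning loop (illustrative sketch).
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_rounds(X_pool, get_label, seed_idx, n_rounds=5, batch_size=20):
    """Grow a labeled set by repeatedly labeling the least-certain examples."""
    labeled_idx = list(seed_idx)                  # small randomly chosen seed set
    labels = [get_label(i) for i in labeled_idx]
    unlabeled = set(range(len(X_pool))) - set(labeled_idx)

    model = None
    for _ in range(n_rounds):
        model = LogisticRegression(max_iter=1000)
        model.fit(X_pool[labeled_idx], labels)

        # Uncertainty sampling for a binary task: examples whose predicted
        # probability sits closest to 0.5 are the ones the model is least
        # sure about, so they are the most informative to label next.
        cand = np.fromiter(unlabeled, dtype=int)
        proba = model.predict_proba(X_pool[cand])[:, 1]
        picks = cand[np.argsort(np.abs(proba - 0.5))[:batch_size]]

        for i in picks:                           # send the picks to the labeler
            labeled_idx.append(int(i))
            labels.append(get_label(int(i)))
            unlabeled.discard(int(i))
    return model, labeled_idx
```

Repeating the fit-and-select cycle is what lets a fixed labeling budget concentrate on the boundary cases that a random sample would mostly miss.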
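The feature-extraction step can be sketched the same way. The talk does not name the pretrained language model used; the `sentence-transformers` package and the `all-MiniLM-L6-v2` checkpoint below are assumptions standing in for whichever frozen encoder supplies the document embeddings.

```python
# Transfer learning as feature extraction: a frozen pretrained encoder
# turns raw text into dense vectors, and a low-complexity classifier
# trains on top of them from a small labeled set.
from sentence_transformers import SentenceTransformer  # assumed stand-in encoder
from sklearn.linear_model import LogisticRegression

texts = ["free prize, click now",  "meeting moved to 3 pm",
         "win cash instantly",     "can you review my draft?"]
labels = [1, 0, 1, 0]              # toy spam / not-spam labels

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # weights stay frozen
X = encoder.encode(texts)                           # one embedding per text

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(encoder.encode(["claim your free reward today"])))
```

Because the encoder already carries general language knowledge, the classifier on top has few parameters to fit, which is what makes it trainable on the small, actively selected labeled sets described above.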
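For the tuning step, one way to fan trial evaluations out across a Spark cluster on a platform such as Azure Databricks is hyperopt with `SparkTrials`; the objective and search space below are invented examples, not the speakers' actual configuration.

```python
# Distributed hyperparameter tuning with hyperopt's SparkTrials, which
# evaluates candidate configurations in parallel on a Spark cluster.
# Run inside an environment with an active SparkSession.
from hyperopt import fmin, tpe, hp, SparkTrials
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # toy data

def objective(params):
    clf = LogisticRegression(C=params["C"], max_iter=1000)
    # fmin minimizes its objective, so return the negated CV accuracy
    return -cross_val_score(clf, X, y, cv=5).mean()

search_space = {"C": hp.loguniform("C", -4, 4)}  # invented search space

best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=50, trials=SparkTrials(parallelism=8))
print(best)  # best regularization strength found by the search
```

Note the trade-off in `parallelism`: more workers finish the sweep sooner, but each trial is proposed with less information from already-completed trials, making the TPE search less adaptive.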