January 5, 2020

295 words 2 mins read

The unreasonable effectiveness of transfer learning on NLP



Talk Title The unreasonable effectiveness of transfer learning on NLP
Speakers David Low (Pand.ai)
Conference Strata Data Conference
Conf Tag Making Data Work
Location London, United Kingdom
Date April 30-May 2, 2019
URL Talk Page
Slides Talk Slides
Video

Transfer learning has been proven to be a tremendous success in computer vision—a result of the ImageNet competition. In the past few months, there have been several breakthroughs in natural language processing with transfer learning, namely ELMo, OpenAI Transformer, and ULMFit. Pretrained models derived from these techniques have been proven to achieve state-of-the-art results on a wide range of NLP problems. The use of pretrained models has come a long way since the introduction of word2vec and GloVe, and these two approaches are considered shallow in comparison.

David Low demonstrates how to use transfer learning on an NLP application with SOTA accuracy. David starts with an introduction to transfer learning, followed by explanations of why pretrained models are handy for tackling machine learning problems with limited data and how they can be used as a fixed feature extractor for downstream tasks and applications. David then walks you through fine-tuning a transfer learning model to achieve state-of-the-art accuracy (92%) on a real-world sentiment classification problem using the Amazon Reviews dataset. It takes just 1,000 training samples to produce a model that matches the performance of a FastText-based model trained on the full dataset (3.6 million samples).
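The "fixed feature extractor" pattern David describes can be sketched as follows. Note that `pretrained_encoder` here is a hypothetical stand-in (a tiny bag-of-words over a fixed vocabulary) for a real pretrained model such as ELMo or a ULMFiT language model, and the toy sentences stand in for the small labelled subset of Amazon Reviews mentioned in the talk: the encoder stays frozen, and only a small logistic-regression head is trained.

```python
# Sketch: frozen "pretrained" encoder + small trainable classifier head.
# The encoder is a stand-in for a real pretrained model's sentence embedding;
# VOCAB and the training sentences are illustrative, not from the talk.
import math

VOCAB = ["great", "love", "recommend", "terrible", "waste", "broke", "disappointed"]

def pretrained_encoder(text):
    """Frozen encoder: maps text to a fixed-length, L2-normalized vector."""
    vec = [0.0] * len(VOCAB)
    for token in text.lower().split():
        if token in VOCAB:
            vec[VOCAB.index(token)] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def train_head(examples, epochs=200, lr=0.5):
    """Train a logistic-regression head on top of the frozen features."""
    w, b = [0.0] * len(VOCAB), 0.0
    feats = [(pretrained_encoder(text), y) for text, y in examples]
    for _ in range(epochs):
        for x, y in feats:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, text):
    z = sum(wi * xi for wi, xi in zip(w, pretrained_encoder(text))) + b
    return 1 if z > 0 else 0

# Tiny labelled set (1 = positive, 0 = negative) standing in for the
# 1,000-sample subset of training data.
train = [
    ("great product works perfectly", 1),
    ("absolutely love it highly recommend", 1),
    ("terrible quality broke immediately", 0),
    ("waste of money very disappointed", 0),
]
w, b = train_head(train)
print(predict(w, b, "love this great product"))  # → 1
print(predict(w, b, "terrible waste of money"))  # → 0
```

Fine-tuning, the second technique in the talk, differs in that the encoder's own weights are also updated on the downstream task rather than kept frozen.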
