The unreasonable effectiveness of transfer learning on NLP
Transfer learning has proven to be a tremendous success in computer vision, largely as a result of the ImageNet competition. In the past few months, there have been several breakthroughs in natural language processing with transfer learning, namely ELMo, the OpenAI Transformer, and ULMFiT. David Low demonstrates how to apply transfer learning to an NLP application with SOTA accuracy.
Talk Title | The unreasonable effectiveness of transfer learning on NLP |
Speakers | David Low (Pand.ai) |
Conference | Strata Data Conference |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | April 30-May 2, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
Transfer learning has proven to be a tremendous success in computer vision, largely as a result of the ImageNet competition. In the past few months, there have been several breakthroughs in natural language processing with transfer learning, namely ELMo, the OpenAI Transformer, and ULMFiT. Pretrained models derived from these techniques have achieved state-of-the-art results on a wide range of NLP problems. The use of pretrained models has come a long way since the introduction of word2vec and GloVe, both of which are considered shallow by comparison. David Low demonstrates how to apply transfer learning to an NLP application with SOTA accuracy.

David starts with an introduction to transfer learning, followed by an explanation of why pretrained models are handy for tackling machine learning problems with limited data and how they can be used as fixed feature extractors for downstream tasks and applications. David then walks you through fine-tuning a transfer learning model to achieve state-of-the-art accuracy (92%) on a real-world sentiment classification problem using the Amazon Reviews dataset. Compared with a FastText-based model trained on the full dataset (3.6 million samples), it takes just 1,000 training samples to produce a model with similar performance.
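To make the fine-tuning workflow described above concrete, the sketch below fine-tunes a pretrained AWD-LSTM language model on a small sample of review text and then reuses its encoder for sentiment classification, in the style of ULMFiT. This is a minimal sketch, not the speaker's actual code: it assumes the fastai v1 text API and a hypothetical reviews.csv file with "text" and "label" columns drawn from a small subset of the Amazon Reviews dataset.

```python
# ULMFiT-style transfer learning sketch (assumes fastai v1's text API).
# reviews.csv with "text" and "label" columns is a hypothetical input file.
from fastai.text import (TextLMDataBunch, TextClasDataBunch,
                         language_model_learner, text_classifier_learner,
                         AWD_LSTM)

path = "."  # directory containing reviews.csv (assumption)

# Stage 1: fine-tune the pretrained AWD-LSTM language model on the review text.
data_lm = TextLMDataBunch.from_csv(path, "reviews.csv", text_cols="text")
lm_learner = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
lm_learner.fit_one_cycle(1, 1e-2)    # train the new language-model head
lm_learner.unfreeze()
lm_learner.fit_one_cycle(1, 1e-3)    # fine-tune the full language model
lm_learner.save_encoder("ft_enc")    # keep the fine-tuned encoder

# Stage 2: reuse the fine-tuned encoder for sentiment classification.
data_clas = TextClasDataBunch.from_csv(path, "reviews.csv",
                                       text_cols="text", label_cols="label",
                                       vocab=data_lm.train_ds.vocab)
clf = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
clf.load_encoder("ft_enc")
clf.fit_one_cycle(1, 1e-2)           # train the classifier head
clf.unfreeze()
clf.fit_one_cycle(1, slice(1e-4, 1e-2))  # fine-tune the whole classifier

print(clf.predict("Great product, works exactly as described."))
```

The two-stage recipe, fine-tuning the language model first and the classifier second, is what lets roughly 1,000 labeled samples approach the accuracy of a model trained on millions.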