February 23, 2020


Natural language processing using transformer architectures


Transformer architectures have taken the field of natural language processing (NLP) by storm and pushed recurrent neural networks to the sidelines. Aurélien Géron examines transformers and the remarkable language models based on them (e.g., BERT and GPT-2) and shows how you can use them in your projects.

Talk Title Natural language processing using transformer architectures
Speakers Aurélien Géron (Kiwisoft)
Conference O’Reilly TensorFlow World
Location Santa Clara, California
Date October 28-31, 2019
URL Talk Page
Slides Talk Slides

Whether you need to automatically judge the sentiment of a user review, summarize long documents, translate text, or build a chatbot, you need the best language model available. In 2018, pretty much every NLP benchmark was crushed by novel transformer-based architectures, which replaced long-standing architectures based on recurrent neural networks. In short, if you’re into NLP, you need transformers. But to use them, you need to know what they are, what transformer-based architectures look like, and how you can implement them in your projects.

Aurélien Géron dives into recurrent neural networks and their limits, the invention of the transformer, attention mechanisms, the transformer architecture, subword tokenization using SentencePiece, self-supervised pretraining (learning from huge corpora), one-size-fits-all language models such as BERT and GPT-2, and how to use these language models in your projects with TensorFlow.
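At the heart of the transformer is the attention mechanism the talk covers: scaled dot-product attention. The following is a minimal single-head sketch in NumPy for illustration; it is not Géron's code, and real implementations are batched, multi-headed, and masked.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for one attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted sum of values

# Toy example: 3 query tokens attending over 4 key/value tokens of dimension 8
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # each query gets a d_k-dimensional context vector: (3, 8)
```

Because the softmax weights are a convex combination over the value rows, every output vector lies inside the range spanned by the values, which is what lets each token blend in context from the whole sequence at once rather than step by step as in an RNN.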
