February 23, 2020

Natural language processing using transformer architectures

Transformer architectures have taken the field of natural language processing (NLP) by storm and pushed recurrent neural networks to the sidelines. Aurélien Géron examines transformers and the amazing language models based on them (e.g., BERT and GPT-2) and shows how you can use them in your projects.


Talk Title	Natural language processing using transformer architectures
Speakers	Aurélien Géron (Kiwisoft)
Conference	O’Reilly TensorFlow World
Location	Santa Clara, California
Date	October 28-31, 2019
URL	Talk Page
Slides	Talk Slides

Whether you need to automatically judge the sentiment of a user review, summarize long documents, translate text, or build a chatbot, you need the best language model available. In 2018, novel transformer-based architectures crushed pretty much every NLP benchmark, displacing long-standing architectures based on recurrent neural networks. In short, if you’re into NLP, you need transformers. But to use transformers, you need to know what they are, what transformer-based architectures look like, and how you can implement them in your projects. Aurélien Géron dives into recurrent neural networks and their limits, the invention of the transformer, attention mechanisms, the transformer architecture, subword tokenization using SentencePiece, self-supervised pretraining (learning from huge corpora), one-size-fits-all language models such as BERT and GPT-2, and how to use these language models in your projects using TensorFlow.
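To make the "attention mechanisms" item above a little more concrete, here is a minimal sketch of scaled dot-product attention, the building block commonly used inside transformers. It is written in plain Python for readability and is not taken from the talk; the function names (`softmax`, `attention`) and the toy vectors are assumptions chosen for illustration, and a real project would use a framework such as TensorFlow instead.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for a single head (illustrative sketch).

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k
    values:  list of vectors, one per key
    Returns (outputs, weights), with one output vector and one weight
    list per query.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example (hypothetical data): both keys are identical, so the query
# attends to them equally and the output is the mean of the two values.
toy_queries = [[1.0, 0.0]]
toy_keys = [[1.0, 0.0], [1.0, 0.0]]
toy_values = [[2.0, 0.0], [4.0, 0.0]]
toy_outputs, toy_weights = attention(toy_queries, toy_keys, toy_values)
print(toy_outputs, toy_weights)
```

In self-attention, the queries, keys, and values are all derived from the same sequence of token representations, so every token can draw on every other token in a single step rather than passing information along one position at a time as a recurrent network does.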

Zero to ML hero with TensorFlow 2.0

February 22, 2020

Get a programmer's perspective on machine learning with Laurence Moroney, from the basics all the way up to building complex computer vision scenarios using convolutional neural networks and natural language processing with recurrent neural networks.

Deep learning methods for natural language processing

January 1, 2020

Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include word2vec, recurrent neural networks and variants (LSTM, GRU), and convolutional neural networks.

How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms

February 23, 2020

Criteo's real-time bidding on ad spaces requires its TensorFlow (TF) models to make online predictions in less than 5 ms. Nicolas Kowalski and Axel Antoniotti explain why Criteo moved away from high-level APIs and rewrote its models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize the TF nodes in its model as much as possible.

About Space Invaders and automated scaling

February 22, 2020

Michael Friedrich and Stefanie Grunwald explore how an algorithm capable of playing Space Invaders can also improve your cloud service's automated scaling mechanism.

Anomaly detection using deep learning to measure the quality of large datasets

February 22, 2020

Any business, big or small, depends on analytics, whether the goal is revenue generation, churn reduction, or sales and marketing. No matter the algorithm and techniques used, the result depends on the accuracy and consistency of the data being processed. Sridhar Alla examines some techniques for evaluating the quality of data and for detecting anomalies in it.

Deep learning methods for natural language processing

February 15, 2020

Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include Word2Vec, recurrent neural networks (RNNs) and variants (long short-term memory [LSTM] and gated recurrent unit [GRU]), and convolutional neural networks.