January 2, 2020

216 words 2 mins read

BERT: Pretraining deep bidirectional transformers for language understanding

Talk Title: BERT: Pretraining deep bidirectional transformers for language understanding
Speakers: Ming-Wei Chang (Google)
Conference: O’Reilly Artificial Intelligence Conference
Conf Tag: Put AI to Work
Location: New York, New York
Date: April 16-18, 2019
URL: Talk Page
Slides: Talk Slides

Ming-Wei Chang offers an overview of a new language representation model called BERT (Bidirectional Encoder Representations from Transformers). Unlike recent language representation models, BERT is designed to pretrain deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pretrained BERT representations can be fine-tuned with just one additional output layer to create models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on 11 natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7% (5.6% absolute improvement), and the SQuAD v1.1 question-answering test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.
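
The fine-tuning recipe described above, a pretrained bidirectional encoder with a single new output layer on top, can be sketched in a few lines. The snippet below is a minimal illustration, assuming PyTorch and the Hugging Face transformers library; the "bert-base-uncased" checkpoint, the two-label setup, and the example sentence pair are assumptions for the sketch, not details from the talk.

```python
# Minimal sketch of the fine-tuning setup described above: a pretrained
# bidirectional encoder plus one new task-specific output layer.
# Assumes PyTorch and the Hugging Face `transformers` package; the
# "bert-base-uncased" checkpoint, the two-label setup, and the example
# sentence pair are illustrative, not taken from the talk.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# The classification head (a single linear layer over the pooled [CLS]
# representation) is the only newly initialized component.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Sentence-pair input, as used for language-inference-style tasks.
inputs = tokenizer(
    "A soccer game with multiple males playing.",
    "Some men are playing a sport.",
    return_tensors="pt",
)
labels = torch.tensor([0])  # hypothetical label: 0 = entailment

outputs = model(**inputs, labels=labels)
outputs.loss.backward()      # fine-tuning updates all layers end-to-end
print(outputs.logits.shape)  # torch.Size([1, 2])
```

Because no task-specific architecture is added beyond this output layer, the same pretrained weights can serve question answering, language inference, or classification, with only the head and the fine-tuning objective changed.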
