January 2, 2020

BERT: Pretraining deep bidirectional transformers for language understanding


Talk Title	BERT: Pretraining deep bidirectional transformers for language understanding
Speakers	Ming-Wei Chang (Google)
Conference	O’Reilly Artificial Intelligence Conference
Conf Tag	Put AI to Work
Location	New York, New York
Date	April 16-18, 2019

Ming-Wei Chang offers an overview of a new language representation model called BERT (Bidirectional Encoder Representations from Transformers). Unlike recent language representation models, BERT is designed to pretrain deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pretrained BERT representations can be fine-tuned with just one additional output layer to create models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on 11 natural language processing tasks, including pushing the GLUE benchmark score to 80.4% (a 7.6-point absolute improvement), MultiNLI accuracy to 86.7% (a 5.6-point absolute improvement), and the SQuAD v1.1 question-answering test F1 to 93.2 (a 1.5-point absolute improvement), outperforming human performance by 2.0 points.
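
The difference between left-to-right and bidirectional conditioning can be illustrated with a small sketch in plain Python. This is not code from the talk: it assumes that what each position may condition on is expressed as an attention visibility mask, and the function names and example sentence are hypothetical.

```python
def attention_mask(num_tokens, bidirectional):
    """Return mask[i][j] == 1 if position i may attend to position j."""
    return [
        [1 if bidirectional or j <= i else 0 for j in range(num_tokens)]
        for i in range(num_tokens)
    ]


def visible_context(tokens, position, bidirectional):
    """List the tokens that the given position is allowed to condition on."""
    mask = attention_mask(len(tokens), bidirectional)
    return [tok for tok, allowed in zip(tokens, mask[position]) if allowed]


# Hypothetical example sentence; "bank" sits at position 1.
tokens = ["the", "bank", "of", "the", "river"]

# Left-to-right: "bank" sees only itself and the word to its left.
left_to_right = visible_context(tokens, 1, bidirectional=False)

# Bidirectional: "bank" also sees the words to its right, including "river".
both_sides = visible_context(tokens, 1, bidirectional=True)

print(left_to_right)
print(both_sides)
```

In the left-to-right case the representation of "bank" cannot use "river", which is the kind of right-hand context a bidirectional model can draw on in every layer.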

Industrialized capsule networks for text analytics

December 30, 2019

Vijay Agneeswaran and Abhishek Kumar offer an overview of capsule networks and explain how they help in handling spatial relationships between objects in an image. They also show how to apply them to text analytics. Vijay and Abhishek then explore an implementation of a recurrent capsule network (RCN) and benchmark it against capsule networks with dynamic routing on text analytics tasks.

Understanding and integrating Intel Deep Learning Boost (Intel DL Boost)

December 28, 2019

Banu Nagasundaram offers an overview of Intel's Deep Learning Boost (Intel DL Boost) technology, featuring integer vector neural network instructions targeting future Intel Xeon Scalable processors. Banu walks you through the 8-bit integer convolution implementation in the Intel MKL-DNN library to demonstrate how these new instructions are used in optimized code.
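
As a rough illustration of the arithmetic behind an 8-bit integer convolution, the following plain-Python sketch quantizes floats to 8-bit integers, convolves them with integer multiply-accumulate, and rescales the result. It is not the MKL-DNN implementation and uses no Intel instructions. The scale factors, input values, and function names are hypothetical, and the pairing of unsigned 8-bit activations with signed 8-bit weights is an assumption made for the example.

```python
def quantize(values, scale, lo, hi):
    """Map floats to integers in [lo, hi] using a single scale factor."""
    return [max(lo, min(hi, round(v / scale))) for v in values]


def conv1d_int8(activations_u8, weights_s8):
    """1-D convolution without padding or kernel flipping, on 8-bit integers.

    Each output sums products of 8-bit values in a wider accumulator.
    Python integers are unbounded; hardware would use a fixed-width
    accumulator (assumed here to be 32 bits).
    """
    k = len(weights_s8)
    out = []
    for start in range(len(activations_u8) - k + 1):
        acc = 0
        for a, w in zip(activations_u8[start:start + k], weights_s8):
            acc += a * w
        out.append(acc)
    return out


# Hypothetical inputs and scale factors chosen for the example.
activations = [0.0, 0.5, 1.0, 1.5, 2.0]
weights = [0.5, -0.25, 1.0]
act_scale = 0.01  # activations stored as unsigned 8-bit, range [0, 255]
w_scale = 0.01    # weights stored as signed 8-bit, range [-127, 127]

a_q = quantize(activations, act_scale, 0, 255)
w_q = quantize(weights, w_scale, -127, 127)

acc = conv1d_int8(a_q, w_q)

# Rescale the integer accumulators back to floating point.
result = [v * act_scale * w_scale for v in acc]

# Floating-point reference for comparison.
reference = [
    sum(a * w for a, w in zip(activations[i:i + len(weights)], weights))
    for i in range(len(activations) - len(weights) + 1)
]

print(acc)
print(result)
print(reference)
```

With these particular values the quantization is exact, so the rescaled integer result matches the floating-point reference up to rounding error. In general, quantization introduces a small approximation error that depends on the chosen scales.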

Analytics Zoo: Distributed TensorFlow in production on Apache Spark

December 27, 2019

Yuhao Yang and Jennie Wang demonstrate how to run distributed TensorFlow on Apache Spark with the open source software package Analytics Zoo. Compared to other solutions, Analytics Zoo is built for production environments and aims to help more industry users run deep learning applications within big data ecosystems.

Serverless for data and AI

December 21, 2019

What is serverless, and how can it be utilized for data analysis and AI? Avner Braverman outlines the benefits and limitations of serverless with respect to data transformation (ETL), AI inference and training, and real-time streaming. This is a technical talk, so expect demos and code.

Spark-PMoF: Accelerating big data analytics with Persistent Memory over Fabric

December 21, 2019

Yuan Zhou, Haodong Tang, and Jian Zhang offer an overview of Spark-PMoF and explain how it improves Spark analytics performance.

(Continuous) threat modeling: What works?

December 19, 2019

Threat modeling as a discipline has always enjoyed a special place in development, going from "Why do it?" to "I should do it one of these days" to "We did it and didn't even get a T-shirt." Many competing methodologies, interests, and constraints make the process more difficult than it needs to be and diminish its results. Izar Tarandach shares the approach Autodesk uses for threat modeling.