Named entity recognition at scale with deep learning
Twitter is what’s happening in the world right now. To connect users with the best content, Twitter needs to build a deep understanding of its noisy and temporal text content. Sijun He and Ali Mollahosseini explore the named entity recognition (NER) system at Twitter and the challenges Twitter faces in building and scaling a deep learning system that annotates 500 million tweets per day.
Talk Title | Named entity recognition at scale with deep learning |
Speakers | Sijun He (Twitter), Ali Mollahosseini (Twitter) |
Conference | O’Reilly Artificial Intelligence Conference |
Conf Tag | Put AI to Work |
Location | San Jose, California |
Date | September 10-12, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Twitter is what’s happening in the world right now, and operating at such a global scale brings massive engineering challenges. To connect users with the best content, Twitter needs to build a deep understanding of its text content. That understanding must be scalable enough to annotate more than 500 million tweets per day, real-time to match the live nature of Twitter, and multilingual to cover the many languages Twitter supports. Sijun He and Ali Mollahosseini offer insights into how Twitter Cortex built and productionized a deep learning-based NER system to address those challenges. The talk highlights Twitter’s experiments with state-of-the-art models (e.g., BERT) and learning methods (e.g., semisupervised learning and active learning), as well as how Twitter has balanced keeping pace with recent developments in natural language processing (NLP) against engineering needs.