Unlocking unstructured text data with summarization
Our ability to extract meaning from unstructured text data has not kept pace with our ability to produce and store it, but recent breakthroughs in recurrent neural networks are allowing us to make exciting progress in computer understanding of language. Building on these new ideas, Michael Williams explores three ways to summarize text and presents prototype products for each approach.
Talk Title: Unlocking unstructured text data with summarization
Conference: Strata + Hadoop World
Conf Tag: Make Data Work
Location: New York, New York
Date: September 27-29, 2016
We’ve seen significant progress in infrastructure for using data effectively over the last half-decade, but that progress has not reached all types of data equally. Unstructured text, in particular, has been slower to yield to the kinds of analysis many businesses now take for granted. Rather than being limited by what we can collect, we are constrained by the tools, time, and techniques needed to make good use of it. Yet we are beginning to gain the ability to do remarkable things with unstructured text data.

Michael Williams explores text summarization: taking a document in and returning a shorter one that contains the same information, covering both single-document and multidocument summarization. Michael demonstrates ways to solve the summarization problem that range from extremely simple algorithms dating back to the 1950s to the latest recurrent neural networks, explains how to choose between these approaches, and shows a working prototype product for each.

Summarizing tens or hundreds of thousands of articles at once is an entirely new capability, but it is also more than a solution to one problem: it is a gateway to quantified representations of text. The breakthrough capabilities realized by applying sentence embeddings and recurrent neural networks to the semantic meaning of text are poised to transform all the ways in which computers process language.
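To give a flavor of the "extremely simple algorithms that date back to the 1950s," here is a minimal sketch of frequency-based extractive summarization in the spirit of Luhn's approach: score each sentence by the frequency of its content words in the whole document and keep the top-scoring sentences. The stopword list, regex-based sentence splitter, and `summarize` function are illustrative assumptions, not code from the talk.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real systems use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are", "was",
             "it", "that", "this", "for", "on", "with", "as", "we"}

def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    # Naive sentence splitting on terminal punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Document-wide frequencies of content words.
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z']+", sentence.lower())
                  if w not in STOPWORDS]
        # Average content-word frequency; guard against empty sentences.
        return sum(freq[w] for w in tokens) / (len(tokens) or 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in top)
```

Despite its simplicity, this kind of scoring is a surprisingly strong baseline for single-document summarization, which is why it remains a useful point of comparison for neural approaches.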