January 1, 2020

Chargrid: Understanding 2D documents

Anoop Katti explores the shortcomings of the existing techniques for understanding 2D documents and offers an overview of the Character Grid (Chargrid), a new processing pipeline pioneered by data scientists at SAP.

Talk Title Chargrid: Understanding 2D documents
Speakers Anoop Katti (SAP)
Conference O’Reilly Artificial Intelligence Conference
Conf Tag Put AI to Work
Location New York, New York
Date April 16-18, 2019

Textual information is often represented through structured documents, which have an inherent 2D structure, particularly with the advent of new types of media and communications such as presentations, websites, blogs, and formatted notebooks. In such documents, the layout, positioning, and sizing of the text can be crucial to understanding their semantic content and provide strong guidance for human perception. Natural language processing (NLP) addresses the task of processing and understanding plain text. However, it processes text by serializing it, completely ignoring any 2D structure in the text. Computer vision (CV), on the other hand, can process document images and retain the structure, but it has to learn the document semantics from the image pixels.

Anoop Katti explores the shortcomings of the existing techniques for understanding 2D documents and offers an overview of the Character Grid (Chargrid), a new processing pipeline pioneered by data scientists at SAP that retains the original 2D structure while directly encoding the characters in the text. The Character Grid representation can readily be used as input to deep neural networks, for example. Anoop applies Chargrid to the task of information extraction from invoices to show how it captures the best of both NLP and CV. Chargrid was accepted for presentation at EMNLP 2018 and is also deployed in the production system of SAP Concur, currently processing tens of thousands of invoices every month.
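To make the idea of a grid that keeps the 2D layout while encoding characters directly a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not SAP's implementation: the function names `build_chargrid` and `one_hot`, the box format, the vocabulary, the lowercasing of characters, and the use of index 0 for background and 1 for unknown characters are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (character, left, top, right, bottom),
# given in grid-cell coordinates with right/bottom exclusive.
CharBox = Tuple[str, int, int, int, int]

BACKGROUND = 0  # assumed index for cells not covered by any character
UNKNOWN = 1     # assumed index for characters missing from the vocabulary


def build_chargrid(boxes: List[CharBox], height: int, width: int,
                   vocab: Dict[str, int]) -> List[List[int]]:
    """Fill every cell covered by a character's box with that character's index."""
    grid = [[BACKGROUND] * width for _ in range(height)]
    for ch, left, top, right, bottom in boxes:
        index = vocab.get(ch.lower(), UNKNOWN)
        # Clip the box to the grid so out-of-range boxes cannot raise.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x] = index
    return grid


def one_hot(grid: List[List[int]], num_classes: int) -> List[List[List[int]]]:
    """Turn the index grid into a height x width x num_classes tensor (nested lists)."""
    return [[[1 if cell == c else 0 for c in range(num_classes)]
             for cell in row] for row in grid]


# Tiny example: an "A" occupying a 2x2 block and a "b" one column wide.
vocab = {"a": 2, "b": 3}
boxes = [("A", 0, 0, 2, 2), ("b", 3, 0, 4, 2)]
grid = build_chargrid(boxes, height=2, width=5, vocab=vocab)
tensor = one_hot(grid, num_classes=4)
```

In this sketch the positions and sizes of the characters survive as regions of the grid, and the one-hot tensor has the image-like shape that convolutional networks typically consume. In practice the character boxes would come from an OCR engine or a document parser, which is outside the scope of this example.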
