Industrialized capsule networks for text analytics

Vijay Agneeswaran and Abhishek Kumar offer an overview of capsule networks and explain how they help in handling spatial relationships between objects in an image. They also show how to apply capsule networks to text analytics. Vijay and Abhishek then explore an implementation of a recurrent capsule network (RCN) and benchmark the RCN against capsule networks with dynamic routing on text analytics tasks.
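
For readers unfamiliar with dynamic routing, the following is a minimal pure-Python sketch of routing by agreement between two capsule layers, as it is commonly described in the capsule network literature. It is not the speakers' implementation: the function names (`squash`, `dynamic_routing`), the toy prediction vectors, and the choice of three routing iterations are all assumptions made for illustration.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + eps)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iterations=3):
    """Route lower capsules to upper capsules by agreement.

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j (hypothetical inputs, normally produced by
    learned transformation matrices).
    """
    num_lower, num_upper = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * num_upper for _ in range(num_lower)]  # routing logits
    for _ in range(num_iterations):
        # Coupling coefficients: each lower capsule distributes its vote.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_upper):
            # Weighted sum of predictions, then squash to get the output.
            s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(num_lower))
                   for k in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output.
        for i in range(num_lower):
            for j in range(num_upper):
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c

# Toy input: all three lower capsules agree on upper capsule 0,
# while their predictions for upper capsule 1 conflict.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = dynamic_routing(u_hat)
print("output lengths:", [round(math.hypot(*vec), 3) for vec in v])
print("coupling to capsule 0:", [round(row[0], 3) for row in c])
```

In this toy case the coupling coefficients shift toward upper capsule 0, because that is where the lower capsules' predictions agree. The length of each output vector can be read as the confidence that the corresponding entity is present.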


Talk Title	Industrialized capsule networks for text analytics
Speakers	Vijay Agneeswaran (Walmart Labs), Abhishek Kumar (Publicis Sapient)
Conference	O’Reilly Artificial Intelligence Conference
Conf Tag	Put AI to Work
Location	New York, New York
Date	April 16-18, 2019

Multilabel text classification is an interesting problem in which multiple tags or categories may have to be associated with a given text or document. It occurs in numerous real-world scenarios, for instance, in news categorization and in bioinformatics (such as the gene classification problem; see Zafer Barutcuoglu et al. 2006). The Kaggle dataset is representative of the problem. Several other interesting problems exist in text analytics, such as abstractive summarization, sentiment analysis, search and information retrieval, entity resolution, document categorization, document clustering, and machine translation.
Deep learning has been applied to many of these problems. For instance, “Effective Use of Word Order for Text Categorization with Convolutional Neural Networks” gives an early approach to applying a convolutional network to make effective use of word order in text categorization. Recurrent neural networks (RNNs) have been effective in various text analytics tasks. Significant progress has also been achieved in language translation by modeling machine translation with an encoder-decoder approach in which the encoder is formed by a neural network.
However, as shown in “Capsule Networks for Protein Structure Classification and Prediction,” certain cases require modeling hierarchical relationships in text data, which is difficult to achieve with traditional deep learning networks because linguistic knowledge may have to be incorporated into them to reach high accuracy. Moreover, such networks do not consider hierarchical relationships between local features, because the pooling operations of CNNs discard information about those relationships. Vijay Agneeswaran and Abhishek Kumar share an industrial-scale use case of capsule networks they have implemented for a client in the realm of text analytics for news categorization.
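
The point about pooling can be made concrete with a small sketch. The example below applies global max pooling to per-token features and shows that two sentences with the same words in a different order produce the same pooled vector. The feature values and the name `max_over_time` are hypothetical. The sketch also simplifies by using one feature vector per token, whereas a real CNN uses wider filters that capture local word order before pooling discards the position of each detected feature.

```python
# Toy 2-dimensional token features (hypothetical values, for illustration only).
TOKEN_FEATURES = {
    "dog": [0.9, 0.1],
    "bites": [0.2, 0.8],
    "man": [0.7, 0.3],
}

def max_over_time(tokens):
    """Global max pooling: keep the largest value of each feature across all positions."""
    columns = zip(*(TOKEN_FEATURES[t] for t in tokens))
    return [max(col) for col in columns]

pooled_a = max_over_time("dog bites man".split())
pooled_b = max_over_time("man bites dog".split())

# The pooled vectors are identical although the sentences mean different things.
print(pooled_a, pooled_b, pooled_a == pooled_b)
```

Capsule networks aim to avoid this loss by replacing pooling with routing between capsules, so that the relationship between a part and the whole is retained rather than reduced to a single maximum.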
The speakers demonstrate the performance of capsule networks on the news categorization task using precision, recall, and F1 metrics, benchmark recurrent capsule networks on the same task, and compare the two implementations against a baseline model. They also discuss how to tune key hyperparameters of capsule networks, such as batch size, number and size of filters, initial learning rate, number of capsules, and dimension of capsules. Vijay and Abhishek conclude by detailing some of the key challenges they faced along the way. Topics include:

Industrialized capsule networks for text analytics

Dilated neural networks for time series forecasting

Faster ML over joins of tables

Understanding and integrating Intel Deep Learning Boost (Intel DL Boost)

Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark

Analytics Zoo: Distributed TensorFlow in production on Apache Spark

Data processing at the speed of 100 Gbps using Apache Crail
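
The abstract above names precision, recall, and F1 as the evaluation metrics for news categorization. As a minimal sketch of how these can be computed for a multilabel task, the example below uses micro-averaging over label sets. The abstract does not say which averaging scheme the speakers used, so micro-averaging is an assumption, and the function name `micro_prf` and the sample labels are hypothetical.

```python
def micro_prf(true_labels, predicted_labels):
    """Micro-averaged precision, recall, and F1 over per-document label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical news articles, each with one or more category tags.
truth = [{"politics", "economy"}, {"sports"}, {"technology", "business"}]
preds = [{"politics"}, {"sports", "business"}, {"technology", "business"}]

precision, recall, f1 = micro_prf(truth, preds)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Micro-averaging pools the counts across all documents and labels, so frequent categories dominate the score. Macro-averaging, which averages per-label scores, would weight rare categories equally and may be preferable when the label distribution is skewed.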