Introduction to generalized low-rank models and missing values
The generalized low-rank model is a new machine-learning approach for reconstructing missing values and identifying important features in heterogeneous data. Through a series of examples, Jo-fai Chow demonstrates how to fit low-rank models in a parallelized framework and how to use these models to make better predictions.
Talk Title | Introduction to generalized low-rank models and missing values |
Speakers | Jo-fai Chow (H2O.ai) |
Conference | Strata + Hadoop World |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | June 1-3, 2016 |
Across business and research, analysts seek to understand large collections of data with numeric, Boolean, and categorical values. Many entries in the table may be noisy or even missing altogether. Low-rank models facilitate understanding of tabular data by producing a condensed vector representation for every row and column in the dataset. These representations can then be compared, clustered, plotted, and used in subsequent analysis. Jo-fai Chow offers an overview of low-rank models and demonstrates how to build them in H2O, an open source distributed machine-learning platform. Through examples, Jo-fai explains how to fit low-rank models to numeric and categorical datasets with missing values and how to use these models to identify important features and make better predictions. Topics include: