NLP for healthcare: Feature engineering and model diagnostics
Natural language processing (NLP) is hard, especially for clinical text. Manas Ranjan Kar explains the multiple challenges of NLP for clinical text and why it's so important to invest a fair amount of time in domain-specific feature engineering. It's also crucial to understand how to diagnose an NLP model's performance and identify possible gaps.
Talk Title | NLP for healthcare: Feature engineering and model diagnostics |
Speakers | Manas Ranjan Kar (Episource) |
Conference | O’Reilly Artificial Intelligence Conference |
Conf Tag | Put AI to Work |
Location | London, United Kingdom |
Date | October 15-17, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
NLP for clinical text is an extremely challenging problem. It suffers from nonstandard language, a lack of proper grammar, a wide range of mentions of the same entity, limited disambiguation, and the need for highly specific domain rules. While building NLP solutions with high precision and recall, Episource came to a conclusion: healthcare NLP requires a fair amount of feature engineering, with a special focus on incorporating domain-specific features.

It’s also pertinent to handle negations and disease mentions the way a human reader would interpret them. For example, a state-of-the-art system should be able to distinguish the context of “diabetes” in “patient has diabetes” from that in “patient must follow a healthy diet to avoid diabetes.” Both sentences mention a disease, but only the first is a “real” mention.

Manas Ranjan Kar walks you through the challenges for NLP in a typical clinical text domain and explores the broad techniques that seem to work well for feature engineering in such problems. You’ll also dive into automated model diagnostics for an NLP model, to check whether the domain-specific feature engineering actually improved model skill.
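The “diabetes” contrast in the abstract can be illustrated with a small rule-based context check. This is a minimal sketch and not the method presented in the talk: the trigger phrases, the `WINDOW` size, and the `mention_context` function are all hypothetical choices made for this example, loosely in the spirit of trigger-based negation detection.

```python
import re

# Hypothetical trigger phrases, chosen only for this illustration.
# Each label maps to cues that may appear shortly before a disease mention.
PRE_TRIGGERS = {
    "negated": [r"\bno\b", r"\bdenies\b", r"\bwithout\b", r"\bnegative for\b"],
    "hypothetical": [r"\bto avoid\b", r"\bto prevent\b", r"\brisk of\b"],
}

WINDOW = 5  # assumed number of tokens before the mention to search for a cue


def mention_context(sentence: str, disease: str) -> str:
    """Return 'affirmed', 'negated', 'hypothetical', or 'absent'."""
    text = sentence.lower()
    match = re.search(r"\b" + re.escape(disease.lower()) + r"\b", text)
    if match is None:
        return "absent"
    # Keep only the last WINDOW tokens that precede the mention.
    preceding = " ".join(text[: match.start()].split()[-WINDOW:])
    for label, patterns in PRE_TRIGGERS.items():
        if any(re.search(p, preceding) for p in patterns):
            return label
    return "affirmed"


if __name__ == "__main__":
    for s in (
        "Patient has diabetes",
        "Patient must follow a healthy diet to avoid diabetes",
    ):
        print(s, "->", mention_context(s, "diabetes"))
```

The returned label could be fed to a downstream classifier as one domain-specific feature alongside the raw text. A real clinical system would need a far larger, validated trigger list and handling of cues that follow the mention.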