We enhance privilege with supervised machine learning
Machines are not objective, and big data is not fair. Michael Williams uses sentiment analysis to show that supervised machine learning has the potential to amplify the voices of the most privileged people in society, violate the spirit and letter of civil rights law, and make your product suck.
| Talk Title | We enhance privilege with supervised machine learning |
|------------|--------------------------------------------------------|
| Speakers | Mike Lee Williams (Cloudera Fast Forward Labs) |
| Conference | Strata + Hadoop World |
| Conf Tag | Big Data Expo |
| Location | San Jose, California |
| Date | March 29-31, 2016 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Michael Williams uses sentiment analysis to show that supervised machine learning has the potential to amplify the voices of the most privileged people in society. A sentiment analysis algorithm is considered table stakes for any serious text-analytics platform in social media, finance, or security. Using it as a worked example of supervised machine learning, Michael demonstrates how these systems are trained and shows that they have the unavoidable property of being better at spotting unsubtle expressions of extreme emotion. Such crude expressions are disproportionately the work of a particularly privileged group of authors: men. As a result, brands that depend on sentiment analysis to “learn what people think” inevitably pay more attention to men.

But the problem doesn’t stop with sentiment analysis: Michael explains how, at every step of any model-building process, we make choices that can introduce bias, enhance privilege, break the law, or simply make your product suck. He reviews these pitfalls, discusses how to recognize them in your own work, and touches on new academic work that aims to measure and mitigate these harms.
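To make the training step concrete, here is a minimal, hypothetical sketch of how a supervised sentiment classifier is typically built (the scikit-learn pipeline and the toy training examples below are illustrative assumptions, not material from the talk). It shows how a model trained mostly on blunt, emphatic labeled text ends up most confident on exactly that style of writing:

```python
# Minimal sketch: training a sentiment classifier with supervised learning.
# The labeled examples are invented for illustration; a real system would be
# trained on a large annotated corpus (e.g., labeled reviews or tweets).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "This is the best thing ever!!!",        # blunt, extreme positive
    "Absolutely terrible. Total garbage.",   # blunt, extreme negative
    "It was fine, I suppose.",               # subtle, mildly positive
    "Not quite what I hoped for.",           # subtle, mildly negative
]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# A bag-of-words model learns only from what the labeled data contains.
# If crude, emphatic phrasing dominates the training set, the classifier
# is most confident on exactly that register of writing.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# The model tends to be far more certain about unsubtle expressions of
# emotion than about understated ones -- the asymmetry the talk describes.
for text in ["Worst product ever!!!", "Hmm, not really for me."]:
    print(text, model.predict_proba([text]))
```

In this sketch the design choices (which texts get labeled, how they are vectorized, which model is fit) are exactly the places where, as the abstract notes, bias can enter the pipeline.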