Distinguish pop music from heavy metal using Apache Spark MLlib
Taras Matyashovsky explains how to use Apache Spark MLlib to build a supervised learning NLP pipeline to distinguish pop music from heavy metal and have fun in the process.
Talk Title | Distinguish pop music from heavy metal using Apache Spark MLlib |
Speakers | Taras Matyashovsky (Lohika) |
Conference | O’Reilly Open Source Convention |
Conf Tag | Making Open Work |
Location | Austin, Texas |
Date | May 8-11, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Machine learning may be overhyped nowadays, but there is still a strong belief that the field is reserved for data scientists with a deep mathematical background who work in the Python (scikit-learn, Theano, TensorFlow, etc.) or R ecosystems and use specialized tools like RStudio, MATLAB, or Octave. There is some truth to this, but Java engineers can also take the best of the machine-learning world from an applied perspective, using our native language and familiar frameworks like Apache Spark. Taras Matyashovsky explains how to use Apache Spark MLlib to build a supervised learning NLP pipeline that distinguishes pop music from heavy metal, and how to have fun in the process. Along the way, Taras offers an overview of the simplest machine-learning tasks and algorithms, such as regression, classification, and clustering.
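To make the idea of a supervised NLP pipeline concrete, here is a minimal Java sketch of what such a pipeline might look like with Spark's DataFrame-based ML API. It is not taken from the talk. The class name, the column names, the choice of stages (tokenizer, stop-word removal, hashed term frequencies, logistic regression), the hyperparameters, and the toy lyrics are all assumptions made for illustration. It also assumes the text being classified is song lyrics, and that Spark SQL and MLlib are on the classpath.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.feature.HashingTF;
import org.apache.spark.ml.feature.StopWordsRemover;
import org.apache.spark.ml.feature.Tokenizer;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class LyricsGenreSketch {

    // Hypothetical label encoding chosen for this sketch.
    public static final double POP = 0.0;
    public static final double METAL = 1.0;

    // Assembles the (assumed) stages: text -> words -> filtered words
    // -> hashed term-frequency vector -> binary classifier.
    public static Pipeline buildPipeline() {
        Tokenizer tokenizer = new Tokenizer()
                .setInputCol("lyrics")
                .setOutputCol("words");
        StopWordsRemover remover = new StopWordsRemover()
                .setInputCol("words")
                .setOutputCol("filtered");
        HashingTF hashingTF = new HashingTF()
                .setInputCol("filtered")
                .setOutputCol("features")
                .setNumFeatures(1024);
        // LogisticRegression reads the "features" and "label" columns by default.
        LogisticRegression lr = new LogisticRegression()
                .setMaxIter(20)
                .setRegParam(0.01);
        return new Pipeline().setStages(
                new PipelineStage[] {tokenizer, remover, hashingTF, lr});
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LyricsGenreSketch")
                .master("local[*]")
                .getOrCreate();

        // Made-up training lines; a real model needs a large labeled corpus.
        List<Row> rows = Arrays.asList(
                RowFactory.create("baby tonight we dance and fall in love", POP),
                RowFactory.create("kiss me under the summer lights", POP),
                RowFactory.create("darkness falls as the blood moon rises", METAL),
                RowFactory.create("steel and thunder scream through the night", METAL));
        StructType trainSchema = new StructType()
                .add("lyrics", DataTypes.StringType)
                .add("label", DataTypes.DoubleType);
        Dataset<Row> training = spark.createDataFrame(rows, trainSchema);

        // Supervised step: fit every stage on the labeled examples.
        PipelineModel model = buildPipeline().fit(training);

        // Score an unlabeled line; the fitted model adds a "prediction" column.
        StructType unseenSchema = new StructType()
                .add("lyrics", DataTypes.StringType);
        Dataset<Row> unseen = spark.createDataFrame(
                Arrays.asList(RowFactory.create("thunder and darkness rise")),
                unseenSchema);
        model.transform(unseen).select("lyrics", "prediction").show(false);

        spark.stop();
    }
}
```

With only four training rows the prediction is not meaningful. The point is the shape of the workflow: declare the stages once, call `fit` on labeled data, then call `transform` on new text. Swapping the classifier or the feature extractor means changing a single stage in `buildPipeline`.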