December 8, 2019

219 words 2 mins read

Improving computer vision models at scale

Improving computer vision models at scale

Rigorous improvement of an image recognition model often requires multiple iterations of eyeballing outliers, inspecting statistics of the output labels, then modifying and retraining the model. Marton Balassi, Mirko Kmpf, and Jan Kunigk share a solution that automates the process of running the model on the testing data and populating an index of the labels so they become searchable.

Talk Title Improving computer vision models at scale
Speakers Marton Balassi (Cloudera), Mirko Kämpf (Cloudera), Jan Kunigk (Cloudera)
Conference Strata Data Conference
Conf Tag Making Data Work
Location London, United Kingdom
Date May 22-24, 2018
URL Talk Page
Slides Talk Slides
Video

Rigorous improvement of an image recognition model often requires multiple iterations of eyeballing outliers, inspecting statistics of the output labels, then modifying and retraining the model. When testing data is present at the petabyte scale, the ability to seamlessly access all the images that have been assigned specific labels poses a technical challenge by itself. Marton Balassi, Mirko Kämpf, and Jan Kunigk share a solution that automates the process of running the model on the testing data and populating an index of the labels so they become searchable. Images and labels are stored in HBase. The model is encapsulated in a PySpark program, while the images are indexed with Solr and can be accessed from a Hue dashboard.

comments powered by Disqus