When Tiramisu meets online fashion retail
Large online fashion retailers must efficiently maintain catalogues of millions of items. Due to human error, it's not unusual that some items have duplicate entries. Since manually trawling such a large catalogue is next to impossible, how can you find these entries? Patty Ryan, CY Yam, and Elena Terenzi explain how they applied deep learning for image segmentation and background removal.
Talk Title | When Tiramisu meets online fashion retail |
Speakers | Patty Ryan (Microsoft), CY Yam (Microsoft), Elena Terenzi (Microsoft) |
Conference | Strata Data Conference |
Conf Tag | Make Data Work |
Location | New York, New York |
Date | September 11-13, 2018 |
URL | Talk Page |
Slides | Talk Slides |
Large online fashion retailers must efficiently maintain catalogues of millions of items. Due to human error, it's not unusual for some items to have duplicate entries. Since manually trawling such a large catalogue is next to impossible, how can you find these entries? You might take a snapshot of a newly arrived item with your phone and have an algorithm automatically check, based on its visual appearance, whether the item is already registered.
However, content-based image retrieval is likely to be hindered by differences in visual content between the images, such as the busy background of a mobile photo versus a clean studio image, not to mention inconsistent folding or creases, lighting, scale, and viewing angle. To increase the success rate, it's prudent to remove the background of the query image before applying any retrieval algorithm.
Patty Ryan, CY Yam, and Elena Terenzi explain how they developed a specialized segmentation model for background removal, or garment (foreground) segmentation, using one of the most recent deep learning architectures, Tiramisu. The solution achieved a remarkable segmentation accuracy of 94% with 200 training images and has been shown to significantly improve content-based image retrieval performance.
Patty, CY, and Elena begin by discussing GrabCut, a very successful foreground segmentation method, and explain how it is used to create labeled data. They then offer an overview of their Tiramisu-based specialized segmentation tool and show where the model performs well and where its performance is less satisfactory. They conclude with a demonstration of how this tool can help prevent duplicate entries in a very large online fashion retailer's catalogue.
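To make the background-removal step concrete, here is a minimal sketch of what applying a segmentation mask to a query image and scoring it could look like. It is not the speakers' implementation. The function names, the white fill colour, and the toy data are hypothetical, and it assumes that "segmentation accuracy" means per-pixel agreement with a reference mask, which the abstract does not specify. Plain Python lists stand in for image arrays to keep the example dependency-free.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Mask = List[List[int]]  # 1 = garment (foreground), 0 = background


def remove_background(image: Image, mask: Mask,
                      fill: Pixel = (255, 255, 255)) -> Image:
    """Keep pixels where the mask is 1; replace the rest with a flat fill colour."""
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def pixel_accuracy(predicted: Mask, reference: Mask) -> float:
    """Fraction of pixels whose predicted label matches the reference label.

    Assumption: this is only one possible reading of "segmentation accuracy".
    """
    total = 0
    correct = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            total += 1
            correct += int(p == r)
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return correct / total


# Toy 2x3 "phone photo": a red garment on a grey, cluttered background.
photo = [
    [(90, 90, 90), (200, 0, 0), (200, 0, 0)],
    [(80, 80, 80), (200, 0, 0), (70, 70, 70)],
]
reference_mask = [[0, 1, 1], [0, 1, 0]]  # e.g. a label produced with GrabCut
predicted_mask = [[0, 1, 1], [0, 1, 1]]  # model output with one wrong pixel

cleaned = remove_background(photo, predicted_mask)
accuracy = pixel_accuracy(predicted_mask, reference_mask)
print(f"pixel accuracy: {accuracy:.3f}")
```

In this sketch the one mislabelled pixel survives into the cleaned image as leftover background, which illustrates why mask quality matters before the image is handed to a retrieval algorithm.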