Multimodal Product Identification: Submission to Watch and Buy 2021 Challenge

J Peng, S Feng, Y Wang, H Hou, F Lian, Z Kang
Proceedings of the 1st Workshop on Multimodal Product Identification in …, 2021 - dl.acm.org
This technical report gives an overview of our approach to the "Watch and Buy: Multimodal Product Identification Challenge". We tackle the problem with a three-stage framework: product detection, retrieval, and classification. For product detection, we use Cascade R-CNN and deformable convolution to alleviate the impact of image distortion. For product retrieval, we enhance the Multiple Granularity Network (MGN) with global and local context through IBN, SE, and Non-local blocks. Product classification suffers from fashion variation; to address this, we propose fusing the global feature of the full image with the local feature of the product. Experiments demonstrate that our methods achieve performance competitive with state-of-the-art methods, and that our overall approach achieves an F1 score of 0.648, ranking second in the final challenge.
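
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an F1 score is computed from raw counts as the harmonic mean of precision and recall. It is not the challenge's evaluation script: the function name `f1_score` and the example counts are hypothetical, and the challenge may define a match (and therefore a true positive) in its own way.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives   # everything the system returned
    actual = true_positives + false_negatives      # everything it should have returned
    if predicted == 0 or actual == 0:
        return 0.0                                 # precision or recall is undefined
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0                                 # no correct predictions at all
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the function:
# precision = 6 / 8 = 0.75, recall = 6 / 10 = 0.6.
score = f1_score(true_positives=6, false_positives=2, false_negatives=4)
print(round(score, 3))  # prints 0.667
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both reasonably high, so a system cannot score well by maximizing only one of them.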