Learning Effective Visual Relationship Detector on 1 GPU

Lu, Yichao; Chang, Cheng; Rai, Himanshu; Yu, Guangwei; Volkovs, Maksims

arXiv:1912.06185 (cs)

[Submitted on 12 Dec 2019]

Abstract: We present our winning solution to the Open Images 2019 Visual Relationship challenge. This is the largest challenge of its kind to date, with nearly 9 million training images. The challenge task consists of detecting objects and identifying relationships between them in complex scenes. Our solution has three stages. First, an object detection model is fine-tuned for the challenge classes using a novel weight transfer approach. Then, spatio-semantic and visual relationship models are trained on candidate object pairs. Finally, features and model predictions are combined to generate the final relationship prediction. Throughout the challenge, we focused on minimizing the hardware requirements of our architecture. Specifically, our weight transfer approach enables much faster optimization, allowing the entire architecture to be trained on a single GPU in under two days. In addition to efficient optimization, our approach also achieves superior accuracy, winning first place out of over 200 teams and outperforming the second-place team by over $5\%$ on the held-out private leaderboard.
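
The abstract gives no implementation details, so the following is only a rough sketch of what the second and third stages might look like in outline: enumerating candidate object pairs from detector output, computing simple spatial features per pair, and blending two model scores. The function names (`candidate_pairs`, `spatial_features`, `combine`), the choice of features (normalized center offsets and IoU), and the weighted average are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def candidate_pairs(detections):
    """Every ordered (subject, object) pair of detections (hypothetical helper)."""
    return list(permutations(detections, 2))


def spatial_features(subj_box, obj_box):
    """Assumed features: object-center offset scaled by subject size, plus IoU."""
    sw = max(subj_box[2] - subj_box[0], 1e-9)  # guard against zero width
    sh = max(subj_box[3] - subj_box[1], 1e-9)  # guard against zero height
    scx, scy = (subj_box[0] + subj_box[2]) / 2, (subj_box[1] + subj_box[3]) / 2
    ocx, ocy = (obj_box[0] + obj_box[2]) / 2, (obj_box[1] + obj_box[3]) / 2
    return [(ocx - scx) / sw, (ocy - scy) / sh, iou(subj_box, obj_box)]


def combine(spatial_score, visual_score, w=0.5):
    """Stand-in for the final stage: a plain weighted average of two scores."""
    return w * spatial_score + (1.0 - w) * visual_score


# Toy detections; labels and boxes are made up for the example.
detections = [
    {"label": "man", "box": (0.0, 0.0, 2.0, 2.0)},
    {"label": "guitar", "box": (1.0, 1.0, 3.0, 3.0)},
]
for subj, obj in candidate_pairs(detections):
    print(subj["label"], obj["label"], spatial_features(subj["box"], obj["box"]))
```

In a real system the per-pair features and scores would come from trained models; the weighted average here only marks where such a combination step would sit.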

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1912.06185 [cs.CV]
	(or arXiv:1912.06185v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1912.06185

Submission history

From: Himanshu Rai
[v1] Thu, 12 Dec 2019 19:59:41 UTC (8,831 KB)
