DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video

Rodriguez-Opazo, Cristian; Marrese-Taylor, Edison; Fernando, Basura; Li, Hongdong; Gould, Stephen

arXiv:2010.06260 (cs)

[Submitted on 13 Oct 2020]

Abstract: This paper studies the task of temporal moment localization in a long untrimmed video using a natural language query. Given a query sentence, the goal is to determine the start and end of the relevant segment within the video. Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm suitable for temporal moment localization, which captures the relationships between humans, objects and activities in the video. These relationships are obtained by a spatial sub-graph that contextualizes the scene representation using detected objects and human features conditioned on the language query. Moreover, a temporal sub-graph captures the activities within the video through time. Our method is evaluated on three standard benchmark datasets, and we also introduce YouCookII as a new benchmark for this task. Experiments show our method outperforms state-of-the-art methods on these datasets, confirming the effectiveness of our approach.
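
The idea of language-conditioned message passing can be illustrated with a small sketch. This is not the authors' architecture: the function name `language_conditioned_step`, the dot-product scoring, the softmax weighting, and the 50/50 update rule are all assumptions chosen for illustration. It only shows how messages arriving at a scene node from human and object nodes might be weighted by their relevance to a query vector.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def language_conditioned_step(nodes, edges, query):
    """One round of message passing (hypothetical, illustrative only).

    nodes: dict mapping node name -> feature vector (list of floats)
    edges: list of (source, destination) node-name pairs
    query: feature vector standing in for the encoded language query
    """
    updated = {}
    for dst, feat in nodes.items():
        senders = [src for src, d in edges if d == dst]
        if not senders:
            # Nodes without incoming edges keep their features.
            updated[dst] = list(feat)
            continue
        # Assumption: a sender's relevance is its similarity to the query.
        weights = softmax([dot(nodes[src], query) for src in senders])
        message = [
            sum(w * nodes[src][i] for w, src in zip(weights, senders))
            for i in range(len(feat))
        ]
        # Assumption: equal mix of the old feature and the aggregated message.
        updated[dst] = [0.5 * f + 0.5 * m for f, m in zip(feat, message)]
    return updated


# Toy spatial sub-graph: human and object nodes send messages to the scene.
nodes = {"human": [1.0, 0.0], "object": [0.0, 1.0], "scene": [0.0, 0.0]}
edges = [("human", "scene"), ("object", "scene")]
query = [1.0, 0.0]  # a query that matches the human feature more closely

out = language_conditioned_step(nodes, edges, query)
print(out["scene"])
```

With this toy query, the scene node's updated feature leans toward the human node, because that sender scores higher against the query. A temporal sub-graph could reuse the same step with edges between consecutive time steps, though the abstract does not specify how the two sub-graphs are combined.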

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2010.06260 [cs.CV]
	(or arXiv:2010.06260v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2010.06260

Submission history

From: Cristian Rodriguez
[v1] Tue, 13 Oct 2020 09:50:29 UTC (13,752 KB)
