Exploring global diverse attention via pairwise temporal relation for video summarization

Li, Ping; Ye, Qinghao; Zhang, Luming; Yuan, Li; Xu, Xianghua; Shao, Ling


arXiv:2009.10942 (cs)

[Submitted on 23 Sep 2020]



Abstract: Video summarization is an effective way to facilitate video searching and browsing. Most existing systems employ encoder-decoder recurrent neural networks, which fail to explicitly diversify the generated summary frames and require intensive computation. In this paper, we propose an efficient convolutional neural network architecture for video SUMmarization via Global Diverse Attention, called SUM-GDA, which adapts the attention mechanism from a global perspective to consider pairwise temporal relations of video frames. In particular, the GDA module has two advantages: 1) it models the relations within paired frames as well as the relations among all pairs, thus capturing the global attention across all frames of one video; 2) it reflects the importance of each frame to the whole video, leading to diverse attention on these frames. Thus, SUM-GDA helps select diverse frames that form a satisfactory video summary. Extensive experiments on three data sets, i.e., SumMe, TVSum, and VTW, demonstrate that SUM-GDA and its extension outperform competing state-of-the-art methods with remarkable improvements. In addition, the proposed models can be run in parallel at significantly lower computational cost, which eases deployment in highly demanding applications.
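To make the idea of attention over pairwise frame relations concrete, the following is a minimal sketch in plain Python. It is not the SUM-GDA formulation, which the abstract does not spell out. It assumes generic scaled dot-product attention between every pair of frame feature vectors, and it assumes that a frame's importance can be approximated by the average attention it receives from all frames. The function names and the toy feature vectors are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_attention(frames):
    """Attention weights between every pair of frames.

    Assumption: scaled dot-product scores followed by a row-wise softmax,
    a generic stand-in for the paper's pairwise temporal relation.
    """
    scale = math.sqrt(len(frames[0]))
    scores = [
        [sum(a * b for a, b in zip(fi, fj)) / scale for fj in frames]
        for fi in frames
    ]
    return [softmax(row) for row in scores]


def frame_importance(attn):
    """Average attention each frame receives from all frames (column mean).

    Assumption: used here as a simple proxy for the importance of a frame
    to the whole video.
    """
    n = len(attn)
    return [sum(attn[i][j] for i in range(n)) / n for j in range(n)]


# Toy features: frames 0 and 1 are identical, frame 2 differs.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
attn = pairwise_attention(frames)
importance = frame_importance(attn)
```

Each row of `attn` is computed independently of the others, so the pairwise scores can be evaluated in parallel, unlike the step-by-step updates of a recurrent model. In this sketch the two identical frames receive identical importance, which is the kind of redundancy a diversity-aware summarizer would need to penalize.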

Comments:	12 pages, 8 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
Cite as:	arXiv:2009.10942 [cs.CV]
	(or arXiv:2009.10942v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2009.10942
Journal reference:	Pattern Recognition, 2020

Submission history

From: Ping Li PhD
[v1] Wed, 23 Sep 2020 06:29:09 UTC (2,169 KB)
