Video:: Query-Focused Video Summarization: Dataset, Evaluation, and A Memory Network Based Approach




• Video data is a great asset for information extraction and knowledge discovery.
• Due to its size and variability, it is extremely hard for users to monitor.

Video Summarization:

• Intelligent video summarization algorithms allow us to quickly browse a lengthy video by capturing the essence and removing redundant information.

The applications of video summarization can be divided into three main categories:

1) Consumer video applications
2) Browsing the recorded content
3) Viewing the interesting parts quickly


Challenges:

• Loss of information
• Computationally expensive
• Evaluating the performance of a video summarizer
• No single video summarizer fits all users

Related work:

Paper 1:
Query-Focused Video Summarization: Dataset, Evaluation, and A Memory Network Based Approach

The authors introduce user preferences in the form of text queries and propose a memory network parameterized sequential determinantal point process. They also contend that a good evaluation metric for video summarization should focus on semantic information. To that end, they collect dense per-video-shot concept annotations and propose a new evaluation method.
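To illustrate why a determinantal point process (DPP) favors diverse summaries, here is a minimal sketch. It uses a plain (non-sequential) L-ensemble DPP, not the authors' memory network parameterized sequential model. The kernel construction, the toy shot features, and all function names are assumptions made for illustration only.

```python
import numpy as np

def dpp_kernel(features, quality):
    """Build L = diag(q) @ S @ diag(q), where S is the cosine similarity of the features."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = unit @ unit.T
    return quality[:, None] * similarity * quality[None, :]

def subset_score(L, subset):
    """Unnormalized DPP score of a subset: determinant of the kernel submatrix."""
    idx = np.array(subset)
    return float(np.linalg.det(L[np.ix_(idx, idx)]))

def subset_probability(L, subset):
    """P(S) = det(L_S) / det(L + I) for an L-ensemble DPP."""
    normalizer = float(np.linalg.det(L + np.eye(L.shape[0])))
    return subset_score(L, subset) / normalizer

# Toy shot features (hypothetical): shots 0 and 1 are near-duplicates, shot 2 differs.
features = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
quality = np.ones(3)  # equal per-shot quality, so only diversity matters here
L = dpp_kernel(features, quality)

redundant = subset_score(L, [0, 1])  # two near-duplicate shots
diverse = subset_score(L, [0, 2])    # two dissimilar shots
print(f"redundant pair score: {redundant:.4f}, diverse pair score: {diverse:.4f}")
```

The near-duplicate pair receives a much smaller determinant than the dissimilar pair, so the DPP assigns it a lower probability. In a query-focused setting, the quality term would presumably depend on the user query. Here it is held constant to isolate the diversity effect.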

| Objective | Proposed Method | Dataset | Strength | Limitation |
|---|---|---|---|---|
| The main obstacle to research on video summarization is user subjectivity: users have various preferences over the summaries. | The authors propose a memory network parameterized sequential determinantal point process to attend the user query onto different video frames and ... | 1. UTEgocentric (UTE) dataset | 1) Introduces user preferences in the form of text queries. 2) The authors collect dense per-video-shot concept annotations. | 1) Collecting dense per-video-shot concept annotations. |

Paper 2:
Query-Conditioned Three-Player Adversarial Network for Video Summarization, BMVC 2018

Video summarization plays an important role in video understanding by selecting key frames/shots. Traditionally, it aims to find the most representative and diverse contents in a video as short summaries. In this paper, the authors propose a query-conditioned three-player generative adversarial network to tackle this challenge. The generator learns the joint representation of the user query and the video content, and the discriminator takes three pairs of query-conditioned summaries as the input to discriminate the real summary from a generated one and a random one.
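The three-pair discriminator input described above can be sketched as follows. This is a toy illustration, not the paper's objective: the logistic scorer, the cross-entropy loss, the vector shapes, and every name in the code are assumptions. It only shows one plausible way a discriminator could be trained to score the real query-summary pair high and the generated and random pairs low.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_score(query_vec, summary_vec, weights):
    """Hypothetical discriminator: logistic score on the concatenated (query, summary) pair."""
    pair = np.concatenate([query_vec, summary_vec])
    return sigmoid(pair @ weights)

def three_pair_discriminator_loss(query_vec, real, generated, random_summary, weights):
    """Cross-entropy loss labeling the real pair 1 and the generated and random pairs 0."""
    eps = 1e-12  # guards against log(0)
    p_real = discriminator_score(query_vec, real, weights)
    p_gen = discriminator_score(query_vec, generated, weights)
    p_rand = discriminator_score(query_vec, random_summary, weights)
    loss = -(np.log(p_real + eps) + np.log(1.0 - p_gen + eps) + np.log(1.0 - p_rand + eps))
    return float(loss)

# Toy embeddings (hypothetical): one query and three candidate summaries.
query = np.array([1.0, 0.0])
real = np.array([1.0, 0.0])
generated = np.array([-1.0, 0.0])
random_summary = np.array([-1.0, 0.5])

untrained = three_pair_discriminator_loss(query, real, generated, random_summary, np.zeros(4))
trained = three_pair_discriminator_loss(
    query, real, generated, random_summary, np.array([0.0, 0.0, 2.0, 0.0])
)
print(f"untrained loss: {untrained:.4f}, trained loss: {trained:.4f}")
```

With all-zero weights every pair scores 0.5, so the loss equals 3 ln 2. Weights that separate the real summary from the other two give a lower loss. In an adversarial setup the generator would be updated in the opposite direction, so that its summaries become harder to tell apart from the real ones.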

| Objective | Proposed Method | Dataset | Strength | Limitation |
|---|---|---|---|---|
| The main aim is to find the most representative and diverse contents in a video as short summaries. | The authors propose a query-conditioned three-player generative adversarial network to tackle this challenge. The generator learns the joint representation of the user query and the video content. | 1. UTEgocentric (UTE) dataset | Results are more accurate based on the user query. | 1) Do not randomly generated summary. |

