Development of an Estimation Model for Instantaneous Presence in Audio-Visual Content

Kenji OZAWA; Shota TSUKAHARA; Yuichiro KINOSHITA; Masanori MORISE

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E99-D No.1 pp.120-127
Publication Date: 2016/01/01
Publicized: 2015/10/21
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015MUP0014
Type of Manuscript: Special Section PAPER (Special Section on Enriched Multimedia---Creation of a New Society through Value-added Multimedia Content---)
Keyword: sense of presence, instantaneous presence, audio-visual content, neural networks, time-series data


Summary:
The sense of presence is often used to evaluate the performance of audio-visual (AV) content and systems. However, a presence meter has yet to be realized. We consider that the sense of presence can be divided into two aspects: system presence and content presence. This study focuses on content presence. To estimate the overall presence of a content item, we previously developed estimation models for the sense of presence in audio-only and audio-visual content. In this study, the audio-visual model is expanded to estimate the instantaneous presence in an AV content item. First, we conducted an experiment in which presence was evaluated for 40 content items, to investigate the relationship between the features of the AV content and the instantaneous presence. Based on the experimental data, a neural-network-based model was developed by expanding the previous model. To express the variation in instantaneous presence, 6 audio-related features and 14 visual-related features, extracted from the content items at 500-ms intervals, are used as inputs to the model. The audio-related features are loudness, sharpness, roughness, the dynamic range and standard deviation of sound pressure levels, and the movement of sound images. The visual-related features involve hue, lightness, saturation, and the movement of visual images. After the model was constructed, a generalization test confirmed that it is sufficiently accurate to estimate the instantaneous presence. Hence, the model should contribute to the development of a presence meter.
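As an illustration of the pipeline the summary describes (features taken every 500 ms, then fed to a neural network), the following is a minimal sketch and not the authors' implementation. The helper names `split_frames`, `level_db`, and `TinyMLP` are hypothetical, the RMS level is only a crude stand-in for a psychoacoustic feature such as loudness, the network size is an assumption, and the weights are random rather than trained. Only the 500-ms interval and the 6 + 14 feature counts come from the summary.

```python
import math
import random

FRAME_SEC = 0.5            # 500-ms interval, as stated in the summary
N_AUDIO, N_VISUAL = 6, 14  # feature counts, as stated in the summary


def split_frames(samples, sample_rate, frame_sec=FRAME_SEC):
    """Cut a sample list into consecutive non-overlapping frames; drop any short tail."""
    size = int(round(sample_rate * frame_sec))
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def level_db(frame, eps=1e-12):
    """RMS level in dB re full scale (assumption: a stand-in, not a loudness model)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(rms + eps)


class TinyMLP:
    """One hidden tanh layer and a sigmoid output in (0, 1); weights are untrained."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def predict(self, features):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        z = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-z))


# Two seconds of a synthetic 440 Hz tone at 8 kHz gives four 500-ms frames.
sr = 8000
signal = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(2 * sr)]
frames = split_frames(signal, sr)

model = TinyMLP(N_AUDIO + N_VISUAL, n_hidden=8)
estimates = []
for frame in frames:
    # Only the first slot holds a real measurement; the other 19 are zero placeholders.
    features = [level_db(frame) / 100.0] + [0.0] * (N_AUDIO + N_VISUAL - 1)
    estimates.append(model.predict(features))
```

Each entry of `estimates` corresponds to one 500-ms frame, which is the sense in which such a model produces a time series rather than a single overall score. A real system would fill all 20 feature slots and fit the weights to listener ratings.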
