2009 Volume 4 Issue 4 Pages 903-912
This study addresses the challenge of analyzing affective video content. The affective content of a given video is defined as the intensity and the type of emotion that arise in a viewer while watching that video. In this study, human emotion was monitored by capturing viewers' pupil sizes and gazing points while they were watching the video. On the basis of the measurement values, four features were extracted (namely cumulative pupil response (CPR), frequency component (FC), modified bivariate contour ellipse area (mBVCEA) and Gini coefficient). Using principal component analysis, we have found that two key features, namely the CPR and FC, contribute to the majority of variance in the data. By utilizing the key features, the affective content was identified and could be used in classifying the video shots into their respective scenes. An average classification accuracy of 71.89% was achieved for three basic emotions, with the individual maximum classification accuracy at 89.06%. The development in this study serves as the first step in automating personalized video content analysis on the basis of human emotion.