Authors:
Chaudhary Muhammad Aqdus Ilyas 1,2; Rita Nunes 3; Kamal Nasrollahi 2; Matthias Rehm 1 and Thomas B. Moeslund 2
Affiliations:
1 Human-Robot Interaction Lab, Aalborg University, Aalborg, Denmark
2 Visual Analysis of People Lab, Aalborg University, Aalborg, Denmark
3 Department of Electronic Systems, Aalborg University, Aalborg, Denmark
Keyword(s):
Emotion Recognition, Facial Expressions, Body Movements, Deep Learning, Convolutional Neural Networks.
Abstract:
Despite recent significant advances in human emotion recognition, exploiting upper body movements alongside facial expressions remains a severe challenge in human-robot interaction. This article presents a model that learns emotions from upper body movements and maps them to the corresponding facial expressions. Once this correspondence is established, tasks such as emotion and gesture recognition can be performed using facial features and movement vectors. Our method uses a deep convolutional neural network trained on benchmark datasets exhibiting various emotions and the corresponding body movements. Features obtained from facial movements and body motion are fused to improve emotion recognition performance, and we evaluate several fusion methodologies for integrating multimodal features for non-verbal emotion identification. Our system achieves 76.8% emotion recognition accuracy from upper body movements alone, surpassing the previous 73.1% on the FABO dataset. In addition, employing multimodal compact bilinear pooling with temporal information surpasses the state-of-the-art method with an accuracy of 94.41% on the FABO dataset. This system can lead to better human-machine interaction by enabling robots to recognize users' emotions and body actions and react accordingly, thus enriching the user experience.
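To make the fusion step concrete, below is a minimal sketch in PyTorch of multimodal compact bilinear (MCB) pooling, which approximates the outer product of two feature vectors via Count Sketch projections and FFT (Fukui et al., 2016). The function names, feature dimensions, and batch size are illustrative assumptions, not the authors' implementation, and the temporal component described in the abstract is omitted.

    import torch

    def count_sketch(x, h, s, d):
        # Project x (batch, n) into d dims using the count sketch defined
        # by random hash indices h (n,) and random +/-1 signs s (n,).
        out = torch.zeros(x.size(0), d, device=x.device)
        out.index_add_(1, h, x * s)
        return out

    def mcb_pool(x, y, d=1024, seed=0):
        # Approximate bilinear (outer-product) pooling of two modalities,
        # compressed to d dimensions: sketch each input, then multiply in
        # the frequency domain (circular convolution of the sketches).
        g = torch.Generator().manual_seed(seed)
        hx = torch.randint(0, d, (x.size(1),), generator=g)
        sx = torch.randint(0, 2, (x.size(1),), generator=g).float() * 2 - 1
        hy = torch.randint(0, d, (y.size(1),), generator=g)
        sy = torch.randint(0, 2, (y.size(1),), generator=g).float() * 2 - 1
        px = count_sketch(x, hx, sx, d)
        py = count_sketch(y, hy, sy, d)
        return torch.fft.irfft(torch.fft.rfft(px) * torch.fft.rfft(py), n=d)

    # Hypothetical usage: fuse 512-dim facial and body CNN descriptors.
    face_feat = torch.randn(8, 512)
    body_feat = torch.randn(8, 512)
    fused = mcb_pool(face_feat, body_feat, d=1024)  # (8, 1024) joint feature

Compared with simple concatenation, this kind of compact bilinear fusion captures multiplicative interactions between the facial and body feature channels while keeping the fused dimensionality fixed at d.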