Facial Expression Recognition and Image Description Generation in Vietnamese

Lam, Khang Nhut; Nguyen, Kim-Ngoc Thi; Nguy, Loc Huu; Kalita, Jugal

doi:10.3233/faia210176

arXiv:2208.06117 (cs)

[Submitted on 12 Aug 2022]

Abstract: This paper presents a facial expression recognition model and a description generation model that together build descriptive sentences for images and for the facial expressions of the people in them. On the KDEF dataset, YOLOv5 achieves better results than a traditional CNN for all emotions: the emotion recognition accuracies of the CNN and YOLOv5 models are 0.853 and 0.938, respectively. The proposed image description model is based on a merged architecture that uses VGG16, with the descriptions encoded by an LSTM model. YOLOv5 is also used to recognize the dominant colors of objects in the images and, if necessary, to correct the color words in the generated descriptions. If a description contains words referring to a person, we recognize the emotion of the person in the image. Finally, we combine the results of all models to create sentences that describe the visual content and the human emotions in the images. Experiments on the Flickr8k dataset in Vietnamese achieve BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.628, 0.425, 0.280, and 0.174, respectively.
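
To make the reported BLEU-1 to BLEU-4 figures easier to interpret, the following is a minimal sketch of cumulative sentence-level BLEU with uniform n-gram weights and the standard brevity penalty. It is not the authors' evaluation code: the abstract does not say which BLEU implementation, tokenization, or smoothing was used, and published scores are normally computed over a whole test corpus rather than one sentence. The function names and the toy English sentences are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # A candidate n-gram is credited at most as often as it occurs in the
    # single reference where it is most frequent.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties are broken in favor of the shorter one.
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n):
    """Cumulative BLEU-max_n: geometric mean of 1..max_n precisions times BP."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


if __name__ == "__main__":
    # Toy example (illustrative tokens, not data from the paper).
    candidate = "a dog runs on the grass".split()
    references = ["a dog runs across the grass".split()]
    for n in range(1, 5):
        print(f"BLEU-{n}: {bleu(candidate, references, n):.3f}")
```

Because each higher-order score multiplies in a stricter n-gram precision, the values can only stay equal or fall as n grows, which matches the decreasing pattern from BLEU-1 to BLEU-4 reported in the abstract.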

Comments:	7 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2208.06117 [cs.CV]
	(or arXiv:2208.06117v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2208.06117
Journal reference:	Fuzzy Systems and Data Mining VII: Proceedings of FSDM 2021 340 (2021): 63
Related DOI:	https://doi.org/10.3233/faia210176

Submission history

From: Khang Lam
[v1] Fri, 12 Aug 2022 04:45:10 UTC (436 KB)
