Learning-based Composite Metrics for Improved Caption Evaluation

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah


Abstract
The evaluation of image caption quality is a challenging task that requires the assessment of two main aspects of a caption: adequacy and fluency. These quality aspects can be judged using a combination of several linguistic features. However, most current image captioning metrics focus only on specific linguistic facets, such as the lexical or the semantic, and fail to reach a satisfactory level of correlation with human judgements at the sentence level. We propose a learning-based framework that incorporates the scores of a set of lexical and semantic metrics as features, to capture the adequacy and fluency of captions at different linguistic levels. Our experimental results demonstrate that composite metrics draw upon the strengths of stand-alone measures to yield improved correlation and accuracy.
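The general idea of a learned composite metric can be illustrated with a minimal sketch. The abstract does not say which learner or which feature set is used, so everything here is an assumption made for illustration: logistic regression stands in for the learner, the two feature columns (one lexical score, one semantic score) are hypothetical, and the toy numbers are invented rather than taken from any experiment.

```python
import math

def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_composite(features, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights by batch gradient descent.

    features: list of per-caption metric scores (one list per caption).
    labels:   1 if the caption was judged good, 0 otherwise.
    """
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def composite_score(x, weights, bias):
    """Combine stand-alone metric scores into one learned score."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical toy data: [lexical_score, semantic_score] per caption.
toy_features = [
    [0.8, 0.9], [0.7, 0.8], [0.9, 0.7],  # captions labelled good
    [0.2, 0.3], [0.1, 0.2], [0.3, 0.1],  # captions labelled poor
]
toy_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_composite(toy_features, toy_labels)
print(composite_score([0.85, 0.85], weights, bias))
print(composite_score([0.15, 0.15], weights, bias))
```

In this sketch the learned weights decide how much each stand-alone metric contributes, which is one simple way a composite score could draw on several measures at once.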
Anthology ID:
P18-3003
Volume:
Proceedings of ACL 2018, Student Research Workshop
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Vered Shwartz, Jeniya Tabassum, Rob Voigt, Wanxiang Che, Marie-Catherine de Marneffe, Malvina Nissim
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
14–20
URL:
https://aclanthology.org/P18-3003
DOI:
10.18653/v1/P18-3003
Cite (ACL):
Naeha Sharif, Lyndon White, Mohammed Bennamoun, and Syed Afaq Ali Shah. 2018. Learning-based Composite Metrics for Improved Caption Evaluation. In Proceedings of ACL 2018, Student Research Workshop, pages 14–20, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Learning-based Composite Metrics for Improved Caption Evaluation (Sharif et al., ACL 2018)
PDF:
https://aclanthology.org/P18-3003.pdf