The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics

Ricardo Rei; Nuno M. Guerreiro; Marcos Treviso; Luísa Coheur; Alon Lavie; André F. T. Martins

doi:10.18653/v1/2023.acl-short.94

Abstract

Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments compared to traditional metrics based on lexical overlap, such as BLEU. Yet, neural metrics are, to a great extent, “black boxes” returning a single sentence-level score without transparency about the decision-making process. In this work, we develop and compare several neural explainability methods and demonstrate their effectiveness for interpreting state-of-the-art fine-tuned neural metrics. Our study reveals that these metrics leverage token-level information that can be directly attributed to translation errors, as assessed through comparison of token-level neural saliency maps with Multidimensional Quality Metrics (MQM) annotations and with synthetically generated critical translation errors. To ease future research, we release our code at: https://github.com/Unbabel/COMET/tree/explainable-metrics

Anthology ID: 2023.acl-short.94
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1089–1105
URL: https://aclanthology.org/2023.acl-short.94/
DOI: 10.18653/v1/2023.acl-short.94
Cite (ACL): Ricardo Rei, Nuno M. Guerreiro, Marcos Treviso, Luisa Coheur, Alon Lavie, and André Martins. 2023. The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 1089–1105, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics (Rei et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-short.94.pdf
Video: https://aclanthology.org/2023.acl-short.94.mp4
