Beyond Common Words: Enhancing ASR Cross-Lingual Proper Noun Recognition Using Large Language Models

Rishabh Kumar, Sabyasachi Ghosh, Ganesh Ramakrishnan


Abstract
In this work, we address the challenge of cross-lingual proper noun recognition in automatic speech recognition (ASR), where proper nouns in an utterance may originate from a language other than the one on which the ASR system is trained. We enhance the performance of end-to-end ASR systems by instructing a large language model (LLM) to correct the ASR model’s predictions. The LLM’s context is augmented with a dictionary of cross-lingual words that are phonetically and graphemically similar to the potentially incorrect proper nouns in the ASR predictions. Our dictionary-based method, DiP-ASR (Dictionary-based Prompting for Automatic Speech Recognition), significantly reduces word error rates on cross-lingual proper noun recognition tasks involving three secondary languages, compared with both the end-to-end ASR baseline and instruction-based prompting of the LLM without the dictionary.
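The dictionary-augmented prompting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`phonetic_key`, `similarity`, `retrieve_candidates`, `build_prompt`), the crude vowel-dropping phonetic key, the 0.6 threshold, and the prompt wording are all assumptions made for illustration, and the call to an actual LLM is left out.

```python
from difflib import SequenceMatcher


def phonetic_key(word: str) -> str:
    """Crude phonetic key (an assumption, not a real phonetic encoder):
    keep the first letter, drop later vowels and h/w/y, collapse repeats."""
    word = word.lower()
    if not word:
        return ""
    key = word[0]
    for ch in word[1:]:
        if ch in "aeiouyhw":
            continue
        if ch != key[-1]:
            key += ch
    return key


def similarity(a: str, b: str) -> float:
    """Average of graphemic (spelling) and phonetic-key similarity, in [0, 1]."""
    graphemic = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    phonetic = SequenceMatcher(None, phonetic_key(a), phonetic_key(b)).ratio()
    return (graphemic + phonetic) / 2


def retrieve_candidates(hypothesis: str, lexicon: list[str],
                        threshold: float = 0.6, top_k: int = 3) -> dict[str, list[str]]:
    """Map each ASR hypothesis word to lexicon entries that look or sound similar."""
    candidates = {}
    for word in hypothesis.split():
        scored = [(similarity(word, entry), entry) for entry in lexicon
                  if entry.lower() != word.lower()]
        scored = [pair for pair in scored if pair[0] >= threshold]
        scored.sort(key=lambda pair: -pair[0])
        if scored:
            candidates[word] = [entry for _, entry in scored[:top_k]]
    return candidates


def build_prompt(hypothesis: str, candidates: dict[str, list[str]]) -> str:
    """Assemble an instruction prompt that carries the retrieved dictionary."""
    lines = [
        "Correct misrecognized proper nouns in the ASR transcript below.",
        "Use a dictionary entry only when it fits the context.",
        "",
        "Dictionary of similar cross-lingual words:",
    ]
    for word, entries in candidates.items():
        lines.append(f"- {word}: {', '.join(entries)}")
    lines += ["", f"ASR transcript: {hypothesis}", "Corrected transcript:"]
    return "\n".join(lines)


if __name__ == "__main__":
    lexicon = ["Ramakrishnan", "Ghosh", "Kumar"]  # hypothetical proper-noun list
    hypothesis = "meeting with ramakrishna and gosh"
    print(build_prompt(hypothesis, retrieve_candidates(hypothesis, lexicon)))
```

In this sketch the resulting prompt would be sent to an LLM of the reader's choice; a real system would likely replace the vowel-dropping key with a proper phonetic representation and tune the threshold on held-out data.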
Anthology ID:
2024.findings-emnlp.399
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6821–6828
URL:
https://aclanthology.org/2024.findings-emnlp.399/
DOI:
10.18653/v1/2024.findings-emnlp.399
Cite (ACL):
Rishabh Kumar, Sabyasachi Ghosh, and Ganesh Ramakrishnan. 2024. Beyond Common Words: Enhancing ASR Cross-Lingual Proper Noun Recognition Using Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 6821–6828, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Beyond Common Words: Enhancing ASR Cross-Lingual Proper Noun Recognition Using Large Language Models (Kumar et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.399.pdf