Investigation of Multilingual Neural Machine Translation for Indian Languages

Sahinur Rahman Laskar; Riyanka Manna; Partha Pakray; Sivaji Bandyopadhyay

Abstract

In natural language processing, machine translation is a well-defined task in which text in one natural language is automatically translated into another. The deep learning-based approach to machine translation, known as neural machine translation, attains remarkable translation performance. However, it requires a sufficient amount of training data, which is a critical issue for low-resource language pairs. To handle the data scarcity problem, the multilingual concept has been investigated in neural machine translation in different settings, such as many-to-one and one-to-many translation. WAT2022 (Workshop on Asian Translation 2022), hosted by COLING 2022, organized Indic tasks, namely English-to-Indic and Indic-to-English translation, in which we participated as team CNLP-NITS-PP. Herein, we investigate a transliteration-based approach in which Indic languages are transliterated into English script and share a sub-word-level vocabulary during the training phase. We attained BLEU scores of 2.0 (English-to-Bengali), 1.10 (English-to-Assamese), 4.50 (Bengali-to-English), and 3.50 (Assamese-to-English).
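
The effect of transliterating into a common script can be illustrated with a minimal Python sketch. It is not the system described in the abstract, which does not name its transliteration tool or sub-word method. The table `TOY_TABLE` and the helpers `romanize` and `char_ngrams` are hypothetical, cover only the characters used here, and ignore inherent vowels and conjuncts. Character trigrams stand in for learned sub-word units.

```python
# Toy romanization table (hypothetical; covers only the characters used below).
TOY_TABLE = {
    "\u09ac": "b",   # BENGALI LETTER BA
    "\u09b0": "r",   # BENGALI LETTER RA
    "\u09f0": "r",   # BENGALI LETTER RA WITH MIDDLE DIAGONAL (Assamese ra)
    "\u09a4": "t",   # BENGALI LETTER TA
    "\u09b2": "l",   # BENGALI LETTER LA
    "\u09be": "a",   # BENGALI VOWEL SIGN AA
    "\u09bf": "i",   # BENGALI VOWEL SIGN I
    "\u0982": "ng",  # BENGALI SIGN ANUSVARA
}


def romanize(text, table=TOY_TABLE):
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word (a stand-in for sub-words)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


# Bengali and Assamese words that differ in their first letter in native script.
bengali_word = "\u09b0\u09be\u09a4"          # ra + aa + ta
assamese_word = "\u09f0\u09be\u09a4\u09bf"   # Assamese ra + aa + ta + i

# In native script the two words share no trigram.
native_shared = char_ngrams(bengali_word) & char_ngrams(assamese_word)

# After romanization both words contain the trigram "rat".
roman_shared = char_ngrams(romanize(bengali_word)) & char_ngrams(romanize(assamese_word))

print(romanize(bengali_word), romanize(assamese_word))
print("shared before:", native_shared, "shared after:", roman_shared)
```

Under these assumptions, units that are distinct in the native scripts collapse onto the same Latin-script strings, which is the kind of overlap a shared sub-word vocabulary can exploit.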

Anthology ID: 2022.wat-1.9
Volume: Proceedings of the 9th Workshop on Asian Translation
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Venue: WAT
Publisher: International Conference on Computational Linguistics
Pages: 78–81
URL: https://aclanthology.org/2022.wat-1.9
Cite (ACL): Sahinur Rahman Laskar, Riyanka Manna, Partha Pakray, and Sivaji Bandyopadhyay. 2022. Investigation of Multilingual Neural Machine Translation for Indian Languages. In Proceedings of the 9th Workshop on Asian Translation, pages 78–81, Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
Cite (Informal): Investigation of Multilingual Neural Machine Translation for Indian Languages (Laskar et al., WAT 2022)
PDF: https://aclanthology.org/2022.wat-1.9.pdf
Data: Samanantar