Termpaper
TRANSLATION
* FINAL TERM PAPER
KISEJJERE RASHID
BACHELOR OF SCIENCE IN SOFTWARE ENGINEERING
MAKERERE UNIVERSITY
COURSE INSTRUCTOR: MR. GALIWANGO MARVIN
KAMPALA, UGANDA
[email protected]
IV. RESEARCH GAPS
Below are some of the major research gaps in the field of machine translation.
• Limited Training Data: The quality of AI-powered translations depends heavily on the amount and quality of the data used to train the model. Further research is needed to explore methods for obtaining high-quality training data.
• Lack of Cultural Sensitivity: AI-powered translation systems can produce translations that are grammatically correct but lack the cultural sensitivity of human translations. This can result in translations that are culturally inappropriate or that do not accurately convey the original message.
• Vulnerability to Errors: A machine learning system can only understand what it has been trained on. When the input is not similar to the data it was trained on, the system can easily produce undesired results.

Model Training: Train the NMT model on the pre-processed data.
Model Evaluation: Evaluate the trained model on a held-out set of data to determine its performance.
Deployment: Deploy the trained model for use in a real-world setting.
Continuous Improvement: Continuously evaluate the performance of the model and make improvements as needed.
The AI evaluation framework used in this project consists mainly of accuracy metrics. Accuracy is a measure of how well the model is able to translate a given text correctly.
In conclusion, the proposed AI approach for this project is to develop a neural machine translation model that can accurately translate English text into Luganda text while preserving the meaning and cultural context of the original text.
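As an illustration of the accuracy-based evaluation on a held-out set mentioned above, the following is a minimal sketch and not the project's actual evaluation code. The paper does not say which accuracy metric is used, so position-wise token accuracy is assumed here. The function names `token_accuracy` and `corpus_accuracy` and the placeholder token strings are hypothetical.

```python
def token_accuracy(reference: str, hypothesis: str) -> float:
    """Share of token positions where the hypothesis equals the reference.

    Assumption: whitespace tokenization and position-wise comparison.
    The longer of the two sentences is used as the denominator, so extra
    or missing tokens count as errors.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    longest = max(len(ref), len(hyp))
    if longest == 0:
        # Two empty sentences are treated as a perfect match.
        return 1.0
    matches = sum(r == h for r, h in zip(ref, hyp))
    return matches / longest


def corpus_accuracy(pairs: list[tuple[str, str]]) -> float:
    """Mean token accuracy over (reference, hypothesis) pairs."""
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    return sum(token_accuracy(ref, hyp) for ref, hyp in pairs) / len(pairs)


if __name__ == "__main__":
    # Placeholder tokens stand in for held-out reference translations
    # and model outputs; they are not taken from the dataset.
    held_out = [
        ("a b c d", "a b x d"),
        ("a b", "a b"),
    ]
    print(corpus_accuracy(held_out))
```

Position-wise matching is strict: one inserted or dropped word shifts every later token out of alignment, so an overlap-based metric such as BLEU is often reported alongside it.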
VII. DATASET DESCRIPTION
B. DATA ANALYSIS
Exploratory data analysis is the process of performing initial investigations on data to discover anomalies and patterns. It is mainly carried out through visualization of the data. Below are the visualizations and their meanings:
[Figure: plot panels captioned "For the respective Luganda Sentence"]
3) Sentence Lengths plots: Through these plots, we are able to determine the length to which all sentences in the datasets should be padded, because during the training process they must all be of the same length.
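The padding decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's preprocessing code: whitespace tokenization, a `<pad>` token, and the rule of picking a length that covers 95% of sentences are all assumptions, and the function names and example sentences are hypothetical placeholders rather than rows from the dataset.

```python
import math

PAD = "<pad>"  # hypothetical padding token


def sentence_lengths(sentences: list[str]) -> list[int]:
    """Number of whitespace-separated tokens in each sentence."""
    return [len(s.split()) for s in sentences]


def choose_pad_length(lengths: list[int], coverage: float = 0.95) -> int:
    """Smallest length that is >= the length of `coverage` of the sentences.

    Assumption: a coverage rule is used so that a few very long sentences
    do not force every sequence to be padded to the maximum length.
    """
    if not lengths:
        raise ValueError("no sentence lengths given")
    ordered = sorted(lengths)
    idx = math.ceil(coverage * len(ordered)) - 1
    idx = min(len(ordered) - 1, max(0, idx))
    return ordered[idx]


def pad_or_truncate(tokens: list[str], length: int, pad: str = PAD) -> list[str]:
    """Cut the token list to `length`, or right-pad it up to `length`."""
    return tokens[:length] + [pad] * max(0, length - len(tokens))


if __name__ == "__main__":
    # Made-up English placeholders, not sentences from the dataset.
    sentences = ["I am going home", "Thank you", "Good morning to you all"]
    target = choose_pad_length(sentence_lengths(sentences), coverage=1.0)
    for s in sentences:
        print(pad_or_truncate(s.split(), target))
```

Plotting a histogram of the values returned by `sentence_lengths` gives the kind of sentence-length plot the text refers to; the chosen padding length is then read off that distribution.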
B. ATTENTION PLOT
An attention plot is a figure showing how the model was able to predict the given output: it displays how much weight the model placed on each input token when producing each output token.
XIII. ACKNOWLEDGMENT
Special thanks to Mr. Ggaliwango Marvin for his never-ending support towards my research on this project. I also want to appreciate Dr. Rose Nakibuule for the provision of the foundation knowledge needed for this project. [4]
REFERENCES
[1] M. Singh, R. Kumar, and I. Chana, "Neural-Based Machine Translation System Outperforming Statistical Phrase-Based Machine Translation for Low-Resource Languages," 2019 Twelfth International Conference on Contemporary Computing (IC3), 2019, pp. 1-7, doi: 10.1109/IC3.2019.8844915. V. Bakarola and J. Nasriwala, "Attention based Neural Machine Translation with Sequence to Sequence Learning on Low Resourced Indic Languages," 2021 2nd International Conference on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), 2021, pp. 178-182, doi: 10.1109/ACCESS51619.2021.9563317.
[2] Academy, E. (2022) How to Write a Research Hypothesis – Enago Academy, Enago Academy. Available at: https://www.enago.com/academy/how-to-develop-a-good-research-hypothesis/ (Accessed: 17 November 2022). What is the project scope? (2022). Available at: https://www.techtarget.com/searchcio/definition/project-scope (Accessed: 17 November 2022).
[3] Machine translation – Wikipedia (2022). Available at: https://en.wikipedia.org/wiki/Machine_translation (Accessed: 17 November 2022).
[4] K. Chen et al., "Towards More Diverse Input Representation for Neural Machine Translation," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1586-1597, 2020, doi: 10.1109/TASLP.2020.2996077.
[5] O. Mekpiroon, P. Tammarattananont, N. Apitiwongmanit, N. Buasroung, T. Charoenporn and T. Supnithi, "Integrating Translation Feature Using Machine Translation in Open Source LMS," 2009 Ninth IEEE International Conference on Advanced Learning Technologies, 2009, pp. 403-404, doi: 10.1109/ICALT.2009.136.
[6] J.-W. Hung, J.-R. Lin and L.-Y. Zhuang, "The Evaluation Study of the Deep Learning Model Transformer in Speech Translation," 2021 7th International Conference on Applied System Innovation (ICASI), 2021, pp. 30-33, doi: 10.1109/ICASI52993.2021.9568450.
[7] V. Alves, J. Ribeiro, P. Faria and L. Romero, "Neural Machine Translation Approach in Automatic Translations between Portuguese Language and Portuguese Sign Language Glosses," 2022 17th Iberian Conference on Information Systems and Technologies (CISTI), 2022, pp. 1-7, doi: 10.23919/CISTI54924.2022.9820212.
[8] Machine Translation – Towards Data Science. (2022). Retrieved 24 November 2022, from https://towardsdatascience.com/tagged/machine-translation
[9] H. Sun, R. Wang, K. Chen, M. Utiyama, E. Sumita and T. Zhao, "Unsupervised Neural Machine Translation With Cross-Lingual Language Representation Agreement," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1170-1182, 2020, doi: 10.1109/TASLP.2020.2982282.
[10] Y. Wu, "A Chinese-English Machine Translation Model Based on Deep Neural Network," 2020 International Conference on Intelligent Transportation, Big Data and Smart City (ICITBS), 2020, pp. 828-831, doi: 10.1109/ICITBS49701.2020.00182.
[11] L. Wang, "Adaptability of English Literature Translation from the Perspective of Machine Learning Linguistics," 2020 International Conference on Computers, Information Processing and Advanced Education (CIPAE), 2020, pp. 130-133, doi: 10.1109/CIPAE51077.2020.00042.
[12] S. P. Singh, H. Darbari, A. Kumar, S. Jain and A. Lohan, "Overview of Neural Machine Translation for English-Hindi," 2019 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT), 2019, pp. 1-4, doi: 10.1109/ICICT46931.2019.8977715.
[13] R. F. Gibadullin, M. Y. Perukhin and A. V. Ilin, "Speech Recognition and Machine Translation Using Neural Networks," 2021 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM), 2021, pp. 398-403, doi: 10.1109/ICIEAM51226.2021.9446474.
[14] How to Build Accountability into Your AI. (2021). Retrieved 24 November 2022, from https://hbr.org/2021/08/how-to-build-accountability-into-your-ai
[15] Mukiibi, J., Hussein, A., Meyer, J., Katumba, A., and Nakatumba-Nabende, J. (2022). The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition. Retrieved 24 November 2022, from https://zenodo.org/record/5855017