Authors :
Oluwatoki, Tolani Grace; Adetunmbi, Olusola Adebayo; Boyinbode, Olutayo Kehinde
Volume/Issue :
Volume 9 - 2024, Issue 9 - September
Google Scholar :
https://tinyurl.com/5n8v5dav
Scribd :
https://tinyurl.com/ycx66zdb
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24SEP1562
Abstract :
Automated translation systems for some indigenous Nigerian languages, such as Yoruba, have historically been limited by the lack of large, high-quality bilingual text and of effective modeling approaches. This paper introduces an approach to bi-directional Yoruba-English text-to-text machine translation using deep learning, specifically Transformer models, which use self-attention mechanisms to improve translation quality and efficiency. The system was trained and evaluated on a newly curated Yoruba-English parallel corpus, which significantly augments existing resources. Experimental results show that the Transformer-based model translates accurately and fluently, achieving a ROUGE (Recall-Oriented Understudy for Gisting Evaluation) score improvement of 0.4649. This work advances Yoruba-English machine translation and contributes to the wider field of multilingual Natural Language Processing (NLP) by addressing challenges associated with translating between languages with limited resources. Future work includes expanding the available parallel corpus and exploring hybrid models that combine the strengths of RNN and Transformer architectures.
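For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a ROUGE-N score can be computed for a single candidate translation against one reference. It is illustrative only: the abstract does not state which ROUGE variant or tokenization the authors used, so the choice of ROUGE-N, the whitespace tokenizer, the function names `ngrams` and `rouge_n`, and the example sentences are all assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision, recall and F1 for one sentence pair.

    Tokenization is plain lowercased whitespace splitting, which is an
    assumption of this sketch and not the paper's preprocessing.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sentence pair, not drawn from the paper's corpus.
    reference = "the child is going to the market"
    candidate = "the child goes to the market"
    print(rouge_n(candidate, reference, n=1))
```

In practice, a corpus-level score would typically average such sentence-level values over the whole test set, and an established ROUGE implementation would be preferable to this hand-rolled version.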
Keywords :
Yoruba, English, Transformer, Self-Attention, ROUGE Score, NLP.