Bilingual Neural Machine Translation From English To Yoruba Using A Transformer Model


Authors : Adeboje Olawale Timothy; Adetunmbi Olusola Adebayo; Arome Gabriel Junior; Akinyede Raphael Olufemi

Volume/Issue : Volume 9 - 2024, Issue 7 - July

Google Scholar : https://tinyurl.com/2thcwjs

Scribd : https://tinyurl.com/ykz23ps5

DOI : https://doi.org/10.38124/ijisrt/IJISRT24JUL767

Abstract : Nigeria's linguistic diversity makes language translation necessary for effective communication and understanding across communities. Yoruba, considered a low-resource language, has potential for a greater online presence. This research proposes a neural machine translation model that uses a transformer architecture to convert English text into Yoruba text. Although previous studies have addressed this task, challenges such as vanishing gradients, limited translation accuracy, and poor computational efficiency on longer sequences persist. This research addresses these limitations with a transformer-based model, an architecture that has demonstrated efficacy in overcoming issues associated with Recurrent Neural Networks (RNNs). Unlike RNNs, which process a sequence one step at a time, transformers use attention mechanisms to relate positions in the input and output directly, improving translation quality and computational efficiency.

Keywords : NLP, Text to Text, Neural Machine Translation, Encoder, Decoder, BERT, T5.
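
The attention mechanism that the abstract contrasts with RNNs can be illustrated with a minimal sketch of standard scaled dot-product attention. This is not the authors' implementation: the function names, the toy vectors, and the use of plain Python lists are assumptions made purely for illustration, and a real translation model would add learned projections, multiple heads, and positional encodings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of vectors.

    Each query is compared against every key, so every output position
    can draw on every input position in a single step.
    """
    d_k = len(keys[0])
    width = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(width)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy data: three source-side states and two target-side queries.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_queries = [[1.0, 0.0], [0.0, 2.0]]

outputs, weights = scaled_dot_product_attention(
    decoder_queries, encoder_states, encoder_states
)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of `weights` is a probability distribution over all source positions, which is why the distance between two related words does not limit how strongly they can be connected, unlike the step-by-step state passing of an RNN.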


