Cross-Lingual Language Modeling for Nepali: Enhancing Low-Resource NLP

Authors

  • Daniel Soubam, Department of Computer Science and Engineering, Sharda University, Greater Noida, U.P., India
  • Vivek Gupta, Department of Computer Science and Engineering, Sharda University, Greater Noida, U.P., India
  • Aparna Sivaraj, Department of Computer Science and Engineering, Sharda University, Greater Noida, U.P., India

Keywords:

Cross-lingual, Nepali, XLM-R, Masked Language Modeling, Translation Language Modeling, low-resource NLP

Abstract

This paper addresses improving multilingual language models for Nepali, a low-resource language. We take a two-step approach: first, we continue pre-training the model with Masked Language Modeling (MLM) on Nepali-only text; then we train further with Translation Language Modeling (TLM) on English-Nepali parallel text. Our experiments show that adapting the XLM-R Base model in this way reduces Nepali perplexity and improves performance on downstream tasks, including machine translation, sentiment analysis, and question answering. We also release our methods and results to support future research on underrepresented South Asian languages.
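The MLM objective referenced in the abstract relies on corrupting a fraction of input tokens and asking the model to recover them; TLM applies the same corruption to a concatenated English-Nepali sentence pair so the model can attend across languages. The sketch below illustrates the standard BERT-style masking recipe (15% of positions selected; of those, 80% replaced by the mask token, 10% by a random token, 10% left unchanged). The token ids, `MASK_ID`, and `VOCAB_SIZE` are hypothetical placeholders for illustration, not the paper's actual configuration (XLM-R uses a SentencePiece vocabulary of ~250k tokens).

```python
import random

MASK_ID = 4        # hypothetical <mask> token id, for illustration only
VOCAB_SIZE = 1000  # hypothetical vocabulary size, for illustration only
IGNORE = -100      # label value ignored by the cross-entropy loss

def mask_tokens(token_ids, mask_prob=0.15, seed=0):
    """BERT-style dynamic masking for MLM/TLM.

    Returns (inputs, labels): inputs is the corrupted sequence;
    labels holds the original id at selected positions and IGNORE
    elsewhere, so the loss is computed only on masked positions.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [IGNORE] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok          # predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK_ID              # 80%: replace with <mask>
            elif r < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)  # 10%: random token
            # remaining 10%: keep the original token unchanged
    return inputs, labels

# TLM differs only in the input: mask a concatenated parallel pair,
# e.g. english_ids + nepali_ids, so context from one language can
# help recover masked tokens in the other.
english_ids = [10, 11, 12, 13]   # hypothetical tokenized English sentence
nepali_ids = [20, 21, 22]        # hypothetical tokenized Nepali sentence
tlm_inputs, tlm_labels = mask_tokens(english_ids + nepali_ids)
```

In practice this corruption is applied on the fly each epoch (dynamic masking), so the model sees different masked positions on each pass over the corpus.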

Published

13-03-2026

How to Cite

Soubam, D., Gupta, V., & Sivaraj, A. (2026). Cross-Lingual Language Modeling for Nepali: Enhancing Low-Resource NLP. DMPedia Lecture Notes in Multidisciplinary Research, IMPACT26, 430-436. https://digitalmanuscriptpedia.com/conferences/index.php/DMP-LNMR/article/view/80