Authors :
Mazimpaka Richard; Dr. Nizeyimana Pacifique; Dr. Kundan Kumar; Mukwende Placide; Nshimiyimana Jerome
Volume/Issue :
Volume 10 - 2025, Issue 8 - August
Google Scholar :
https://tinyurl.com/39udnf8m
Scribd :
https://tinyurl.com/4kaddz6k
DOI :
https://doi.org/10.38124/ijisrt/25aug218
Abstract :
Umurenge SACCOs are instrumental in fostering financial inclusion in Rwanda, yet high loan default rates threaten their long-term sustainability. This study develops a machine learning model to predict loan default risk among SACCO borrowers. Using a real, anonymized dataset of 2,000 loan applications from the Rwanda Cooperative Agency (RCA), we compare six algorithms: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, AdaBoost, and XGBoost. Class imbalance is addressed through balanced training approaches, and models are evaluated on accuracy, precision, recall, and F1-score. XGBoost achieved the highest accuracy (89.5%), while Logistic Regression offered the best balance between performance (86.5% accuracy, 85.2% F1-score) and interpretability, making it suitable for real-world deployment in SACCO environments. Key predictors identified include credit score, past loan repayment behavior, and monthly income. These findings give SACCOs a scalable, data-driven way to move from intuition-based to evidence-based credit risk assessment, supporting Rwanda's digital transformation goals in financial services.
Keywords :
Credit Risk Assessment, Logistic Regression, SACCOs, Machine Learning, Loan Default Prediction, Financial Inclusion.
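Illustrative sketch :
The abstract names two technical steps without detailing them: balanced training and evaluation by accuracy, precision, recall, and F1-score. The minimal Python sketch below shows one common way each might be computed. It is not the authors' implementation. The inverse-frequency weighting scheme, the function names `balanced_class_weights` and `classification_metrics`, and the toy labels are all assumptions made for illustration; none of the values come from the RCA dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Assumed scheme: weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the positive (default) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = default, 0 = repaid); not data from the study.
y_true = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

weights = balanced_class_weights(y_true)
metrics = classification_metrics(y_true, y_pred)
print(weights)
print(metrics)
```

In this toy example the minority (default) class receives the larger weight, so misclassified defaulters would count more during training. Reporting precision, recall, and F1 alongside accuracy matters for the same reason: on imbalanced data, accuracy alone can look high even when few defaulters are caught.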