Authors :
Celestin Niyomugabo; Sunday A. Idowu
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
Google Scholar :
https://tinyurl.com/mr2x4xxd
Scribd :
https://tinyurl.com/5n6ff6nh
DOI :
https://doi.org/10.38124/ijisrt/25oct139
Abstract :
Value Added Tax (VAT) non-compliance remains a persistent challenge in Rwanda despite the nationwide rollout
of Electronic Billing Machines (EBMs) and other digital reforms. While retrospective VAT gap studies have been useful in
quantifying the scale of revenue loss, they do not provide the predictive insights needed to prevent non-compliance
proactively. To address this gap, this study developed and validated an industry-aware machine learning model capable of
predicting VAT non-compliance using integrated administrative microdata. The study also benchmarked VAT
non-compliance across taxpayer scales and industries to identify systematic sectoral heterogeneity and generate actionable
evidence for risk-based auditing and more targeted policy design. This study integrated VAT declarations, EBM
transactions, and customs import records from the Rwanda Revenue Authority (RRA) for the period 2020–2024, linking them
at the taxpayer level to build a comprehensive compliance dataset. An Extreme Gradient Boosting (XGBoost) classifier was
applied, with class imbalance addressed through weighting to ensure that the minority class of VAT non-compliant returns
contributed proportionately to model learning. Hyperparameters were optimized through grid search and validation to
ensure robust generalization, while decision thresholds were tuned to prioritize high recall without compromising precision.
Model performance was evaluated using accuracy, precision, recall, F1-score, and both ROC-AUC and PR-AUC, with
additional out-of-time validation to confirm stability. Feature interpretability was ensured through SHAP-based
importance analysis, which highlighted the relative contribution of discrepancies between EBM sales and declared turnover,
penalty history, and trade activity in predicting VAT non-compliance. The model achieved high predictive performance for
the non-compliant class (accuracy 98.9%, precision 0.932, recall 0.887, F1-score 0.909) with robust generalization across tax
years. The overall VAT non-compliance rate is 6.9%, with statistically significant between-industry dispersion (ANOVA
p-value < 0.001). Elevated risk appears in transport and storage; wholesale and retail trade; manufacturing; mining and
quarrying; electricity, gas, steam and air conditioning supply; and activities of households as employers. Non-compliance also
increases with taxpayer scale (large 11.5%, medium 9.4%, small 6.0%). Feature importance confirms the operational
salience of penalty history and of discrepancies between EBM sales and the declared total value of supplies.
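
The reported metrics can be cross-checked with a short calculation. The sketch below uses only figures stated in the abstract and the standard F1 definition (the harmonic mean of precision and recall). It also derives the ratio of compliant to non-compliant cases implied by the 6.9% overall rate. A common heuristic for weighted gradient boosting, such as XGBoost's `scale_pos_weight` parameter, sets the minority-class weight near this ratio, but the abstract does not state the weight actually used, so that link is an assumption. The function and variable names are illustrative.

```python
# Illustrative consistency check. Figures are taken from the abstract;
# the formulas are standard and are not the authors' code.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def negative_to_positive_ratio(positive_rate: float) -> float:
    """Number of majority-class cases per minority-class case."""
    return (1 - positive_rate) / positive_rate


reported_precision = 0.932
reported_recall = 0.887
overall_noncompliance_rate = 0.069  # 6.9% overall, as reported

f1 = f1_score(reported_precision, reported_recall)
# Assumption: the training prevalence matches the reported overall rate.
ratio = negative_to_positive_ratio(overall_noncompliance_rate)

print(f"F1 implied by precision and recall: {f1:.3f}")
print(f"Compliant cases per non-compliant case: {ratio:.1f}")
```

The implied F1 rounds to the reported 0.909, so the three class-level metrics are mutually consistent.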
Conclusion:
Industry-aware predictive analytics can materially strengthen risk-based auditing in Rwanda by targeting higher-risk
sectors and scales, improving audit efficiency and revenue recovery, and providing replicable benchmarks for sector-specific
policy design.
Keywords :
VAT, Non-Compliance, Machine Learning, Rwanda, EBM.
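
The abstract states that decision thresholds were tuned to prioritize recall without compromising precision, but does not give the procedure. One plausible reading, sketched below, is to choose the threshold that maximizes recall among those meeting a minimum precision. The function names, the precision floor, and the toy scores are all hypothetical and are not drawn from the study's data.

```python
from typing import Optional, Sequence, Tuple


def precision_recall(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when scores >= threshold are flagged as non-compliant."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(
    labels: Sequence[int], scores: Sequence[float], min_precision: float
) -> Optional[float]:
    """Return the threshold with the highest recall that meets the precision floor.

    Returns None when no candidate threshold reaches the floor.
    """
    best_threshold: Optional[float] = None
    best_recall = -1.0
    for candidate in sorted(set(scores)):
        precision, recall = precision_recall(labels, scores, candidate)
        if precision >= min_precision and recall > best_recall:
            best_threshold, best_recall = candidate, recall
    return best_threshold


# Toy validation set (hypothetical): 1 = non-compliant, 0 = compliant.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.75, 0.90]

chosen = pick_threshold(toy_labels, toy_scores, min_precision=0.7)
print("Chosen threshold:", chosen)
print("Precision, recall at chosen threshold:",
      precision_recall(toy_labels, toy_scores, chosen))
```

In practice the threshold would be selected on a held-out validation period and then checked on later tax years, in line with the out-of-time validation the abstract describes.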