Predictive VAT Non-Compliance Benchmarking Across Industries in Rwanda: A Machine Learning Approach


Authors : Celestin Niyomugabo; Sunday A. Idowu

Volume/Issue : Volume 10 - 2025, Issue 10 - October


Google Scholar : https://tinyurl.com/mr2x4xxd

Scribd : https://tinyurl.com/5n6ff6nh

DOI : https://doi.org/10.38124/ijisrt/25oct139



Abstract : Value Added Tax (VAT) non-compliance remains a persistent challenge in Rwanda despite the nationwide rollout of Electronic Billing Machines (EBMs) and other digital reforms. While retrospective VAT gap studies have been useful in quantifying the scale of revenue loss, they do not provide predictive insights that could help prevent non-compliance proactively. To address this gap, this study developed and validated an industry-aware machine learning model that predicts VAT non-compliance from integrated administrative microdata. The study also benchmarked VAT non-compliance across taxpayer scales and industries to identify systematic sectoral heterogeneity and generate actionable evidence for risk-based auditing and more targeted policy design. The study integrated VAT declarations, EBM transactions, and customs import records from the Rwanda Revenue Authority (RRA) for the period 2020–2024, linking them at the taxpayer level to build a comprehensive compliance dataset. An Extreme Gradient Boosting (XGBoost) classifier was applied, with class imbalance addressed through weighting so that the minority class of VAT non-compliant returns contributed proportionately to model learning. Hyperparameters were optimized through grid search and validation to support robust generalization, while decision thresholds were tuned to prioritize high recall without compromising precision. Model performance was evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and PR-AUC, with additional out-of-time validation to confirm stability. Feature interpretability was provided through SHAP-based importance analysis, which highlighted the relative contribution of discrepancies between EBM sales and declared turnover, penalty history, and trade activity in predicting VAT non-compliance.
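The weighting and threshold-tuning steps described above can be illustrated with a minimal, library-free sketch. It is not the authors' implementation: the abstract does not state which weighting scheme or precision floor was used, so the negative-to-positive ratio (a common heuristic for the positive-class weight in gradient-boosted classifiers) and the `min_precision` parameter are assumptions, and the labels and scores are toy values, not RRA data. All function names are hypothetical.

```python
def positive_class_weight(labels):
    """Assumed heuristic: weight the positive (non-compliant) class by
    the ratio of negative to positive examples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return negatives / positives


def precision_recall(labels, scores, threshold):
    """Precision and recall when a score >= threshold is flagged as non-compliant."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(labels, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def tune_threshold(labels, scores, min_precision):
    """Among thresholds that meet the precision floor, return the one with
    the highest recall (ties broken by higher precision).
    Returns (threshold, recall, precision), or None if no threshold qualifies."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall(labels, scores, t)
        if p >= min_precision and (best is None or (r, p) > (best[1], best[2])):
            best = (t, r, p)
    return best


# Toy data: 1 = non-compliant return, 0 = compliant return.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.70, 0.60, 0.90]

weight = positive_class_weight(labels)
chosen = tune_threshold(labels, scores, min_precision=0.6)
print(weight, chosen)
```

On this toy data the positive-class weight is 4.0, and with a precision floor of 0.6 the selected threshold is 0.60, which recovers both non-compliant returns at the cost of one false positive. Raising the floor trades recall for precision, which is the trade-off the threshold tuning is meant to manage.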
The model achieved high predictive performance for the non-compliant class (accuracy 98.9%, precision 0.932, recall 0.887, F1-score 0.909), with robust generalization across tax years. The overall VAT non-compliance rate is 6.9%, with statistically significant between-industry dispersion (ANOVA p-value < 0.001). Elevated risk appears in transport and storage; wholesale and retail trade; manufacturing; mining and quarrying; electricity, gas, steam and air conditioning supply; and activities of households as employers. Non-compliance also increases with taxpayer scale (large 11.5%, medium 9.4%, small 6.0%). Feature importance confirms the operational salience of penalty history and of discrepancies between EBM sales and the declared total value of supplies. Conclusion: Industry-aware predictive analytics can materially strengthen risk-based auditing in Rwanda by targeting higher-risk sectors and scales, improving audit efficiency and revenue recovery, and providing replicable benchmarks for sector-specific policy design.
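As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of the stated precision and recall. The sketch below uses only the figures given in the abstract; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for the non-compliant class.
reported_precision = 0.932
reported_recall = 0.887

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.909
```

The recomputed value agrees with the reported F1-score of 0.909.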

Keywords : VAT, Non-Compliance, Machine Learning, Rwanda, EBM.


