Machine Learning-Based Detection of SQL Injection and Data Exfiltration Through Behavioral Profiling of Relational Query Patterns


Authors : Semirat Abidemi Balogun; Onuh Matthew Ijiga; Nonso Okika; Lawrence Anebi Enyejo; Ogboji James Agbo

Volume/Issue : Volume 10 - 2025, Issue 8 - August


Google Scholar : https://tinyurl.com/9uec8jt7

Scribd : https://tinyurl.com/ztfayb65

DOI : https://doi.org/10.38124/ijisrt/25aug324


Abstract : SQL injection and data exfiltration remain among the most severe threats to relational database security, often leading to critical data breaches in enterprise systems. This review explores the application of machine learning techniques for detecting such threats by profiling the behavioral patterns of relational SQL queries. Unlike traditional rule-based approaches, machine learning models enable the dynamic identification of anomalous query structures and access behaviors indicative of malicious intent. The study synthesizes recent advancements in supervised, unsupervised, and deep learning methods tailored for query classification, anomaly detection, and user behavior modeling. Furthermore, it evaluates the efficacy of these techniques in detecting stealthy exfiltration attacks under evolving threat landscapes. Emphasis is placed on data preprocessing strategies, feature extraction from SQL logs, and the use of graph-based and sequence-aware models for enhanced detection accuracy. The review concludes by outlining emerging challenges such as adversarial query generation, concept drift, and the need for explainable models in high-assurance environments.

Keywords : SQL Injection Detection, Data Exfiltration, Machine Learning, Behavioral Profiling, Relational Query Patterns, Anomaly Detection.
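
The behavioral-profiling idea summarized in the abstract can be illustrated with a minimal sketch. It does not reproduce any method from the paper. The feature set, the function names (`extract_features`, `fit_profile`, `anomaly_score`, `is_anomalous`), the z-score rule, and the threshold are all assumptions chosen for illustration. A baseline of known-benign queries is turned into per-feature means and standard deviations, and a new query is flagged when any feature deviates strongly from that baseline.

```python
import re
from statistics import mean, pstdev


def extract_features(query: str) -> list[float]:
    """Map a raw SQL string to a small numeric feature vector.

    The five features are hypothetical examples of lexical signals;
    a real system would likely use richer structural features.
    """
    q = query.lower()
    return [
        float(len(q)),                                         # query length
        float(q.count("'")),                                   # single quotes
        float(len(re.findall(r"--|#|/\*", q))),                # comment markers
        float(len(re.findall(r"\bunion\b", q))),               # UNION keywords
        float(len(re.findall(r"\bor\b\s+\S+\s*=\s*\S+", q))),  # "OR x = y" patterns
    ]


def fit_profile(queries, min_std=0.25):
    """Build a baseline profile (per-feature mean and spread) from benign queries."""
    columns = list(zip(*(extract_features(q) for q in queries)))
    means = [mean(col) for col in columns]
    # Floor the spread so constant features do not cause division by zero.
    stds = [max(pstdev(col), min_std) for col in columns]
    return means, stds


def anomaly_score(query, profile):
    """Return the largest absolute z-score across all features."""
    means, stds = profile
    feats = extract_features(query)
    return max(abs(f - m) / s for f, m, s in zip(feats, means, stds))


def is_anomalous(query, profile, threshold=3.0):
    """Flag a query whose score exceeds the (assumed) threshold."""
    return anomaly_score(query, profile) > threshold


# Hypothetical baseline of routine application queries.
baseline = [
    "SELECT name FROM users WHERE id = 1",
    "SELECT name FROM users WHERE id = 27",
    "SELECT email FROM users WHERE id = 311",
    "SELECT name, email FROM users WHERE id = 9",
]
profile = fit_profile(baseline)

benign = "SELECT name FROM users WHERE id = 58"
injected = (
    "SELECT name FROM users WHERE id = 1 OR 1=1 "
    "UNION SELECT password FROM accounts --"
)
print(is_anomalous(benign, profile), is_anomalous(injected, profile))
```

A profile like this is only as good as its baseline: a legitimate query type missing from the baseline would also be flagged. This is one reason the abstract lists concept drift among the open challenges.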

