⚠ Official Notice: www.ijisrt.com is the official website of the International Journal of Innovative Science and Research Technology (IJISRT) Journal for research paper submission and publication. Please beware of fake or duplicate websites using the IJISRT name.



Fincure AI: Predicting Insurance Charges Using Machine Learning Techniques


Authors : Dr. N. Dhivya; M. Subalakshmi

Volume/Issue : Volume 11 - 2026, Issue 4 - April


Google Scholar : https://tinyurl.com/3nhe9ztv

Scribd : https://tinyurl.com/884mz5fc

DOI : https://doi.org/10.38124/ijisrt/26apr1803



Abstract : The insurance industry increasingly relies on data-driven methods to improve pricing and risk assessment. This study presents a machine learning approach to predicting insurance charges from customer demographic and health-related attributes. The dataset includes age, gender, body mass index (BMI), number of children, smoking status, and region. Preprocessing consisted of data cleaning, categorical encoding, and normalization. A Linear Regression model was trained and evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the R-squared (R²) score. The model achieved an R² of 0.85, meaning it explains about 85% of the variance in insurance charges; R² measures goodness of fit rather than classification accuracy. The results suggest that machine learning techniques can model the relationships in insurance data and provide useful predictions, and that such a system can support insurance companies in making data-driven pricing decisions.

Keywords : Machine Learning, Insurance Prediction, Linear Regression, Data Science, Data Preprocessing, Predictive Analytics.
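
The pipeline summarized in the abstract (categorical encoding, normalization, Linear Regression, and evaluation with MAE, RMSE, and R²) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the records are synthetic, the coefficients that generate the charges are invented, the region labels and all function names are hypothetical, and ordinary least squares is solved through the normal equations using only the Python standard library.

```python
import math
import random

REGIONS = ["northeast", "northwest", "southeast", "southwest"]  # assumed labels


def make_rows(n, seed=0):
    """Generate synthetic records (hypothetical; not the paper's dataset)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {
            "age": rng.randint(18, 64),
            "sex": rng.choice(["female", "male"]),
            "bmi": rng.uniform(18.0, 40.0),
            "children": rng.randint(0, 4),
            "smoker": rng.choice(["yes", "no", "no", "no"]),
            "region": rng.choice(REGIONS),
        }
        # Invented linear relationship plus noise, for illustration only.
        row["charges"] = (250 * row["age"] + 300 * row["bmi"] + 500 * row["children"]
                          + 20000 * (row["smoker"] == "yes") + rng.gauss(0, 3000))
        rows.append(row)
    return rows


def encode(row):
    """Categorical encoding: binary flags plus one-hot region (first level dropped)."""
    return [
        float(row["age"]), row["bmi"], float(row["children"]),
        1.0 if row["smoker"] == "yes" else 0.0,
        1.0 if row["sex"] == "male" else 0.0,
    ] + [1.0 if row["region"] == r else 0.0 for r in REGIONS[1:]]


def fit_scaler(X, cols=(0, 1, 2)):
    """Compute mean and standard deviation of the numeric columns (training data only)."""
    stats = {}
    for c in cols:
        vals = [x[c] for x in X]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
        stats[c] = (mean, std)
    return stats


def transform(X, stats):
    """Standardize the numeric columns and prepend an intercept term."""
    return [[1.0] + [(v - stats[c][0]) / stats[c][1] if c in stats else v
                     for c, v in enumerate(x)] for x in X]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][c] * w[c] for c in range(i + 1, n))) / M[i][i]
    return w


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(X, w):
    """Return the dot product of each feature row with the weights."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in X]


def evaluate(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for the given predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return mae, rmse, r2


rows = make_rows(500)
train, test = rows[:400], rows[400:]  # simple hold-out split
X_train, X_test = [encode(r) for r in train], [encode(r) for r in test]
stats = fit_scaler(X_train)
w = fit_linear_regression(transform(X_train, stats), [r["charges"] for r in train])
mae, rmse, r2 = evaluate([r["charges"] for r in test],
                         predict(transform(X_test, stats), w))
print(f"MAE={mae:.0f}  RMSE={rmse:.0f}  R2={r2:.3f}")
```

The scaler is fitted on the training split only and then reused on the test split, so no information from the held-out records leaks into the model. In practice a library such as scikit-learn would likely replace the hand-written solver; the metrics printed here describe the synthetic data and say nothing about the results reported in the paper.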


