Prediction and Analysis of Diabetes Using Machine Learning


Authors : Edelyn A. Bautista

Volume/Issue : Volume 10 - 2025, Issue 10 - October


Google Scholar : https://tinyurl.com/5b75fvtt

Scribd : https://tinyurl.com/yzfftw4c

DOI : https://doi.org/10.38124/ijisrt/25oct294


Abstract : This study focuses on diabetes prediction and analysis using machine learning techniques. Its goal is to develop accurate and reliable models for early detection and better understanding of diabetes. The Diabetes UCI Dataset, containing variables like gender, polyuria, and polydipsia, is used for model training and evaluation. Data preprocessing ensures feature normalization and consistency, while feature selection identifies the most relevant variables. Several classification algorithms, including the Random Tree algorithm, are tested using WEKA. Model performance is evaluated through metrics such as accuracy, precision, and recall. Results show that Random Tree, when combined with other algorithms, achieves high accuracy and robustness in classifying diabetic and non-diabetic individuals. The study highlights the effectiveness of machine learning in early diabetes detection and decision-making support for healthcare professionals. Overall, it demonstrates how computational approaches can enhance diabetes management, improve patient outcomes, and reduce the impact of this chronic disease.

Keywords : Diabetes Prediction, Dataset, Classification Algorithm, Analysis, Prediction, Healthcare.
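
The abstract names accuracy, precision, and recall as the evaluation metrics. As a rough illustration of how these three figures relate to a binary classifier's predictions, here is a minimal Python sketch. It is not the study's WEKA workflow: the function names (`confusion_counts`, `evaluate`), the class labels "Positive" and "Negative", and the eight example predictions are all hypothetical and chosen only to make the arithmetic visible.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="Positive"):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def evaluate(actual, predicted, positive="Positive"):
    """Return accuracy, precision, and recall as a dict of floats."""
    c = confusion_counts(actual, predicted, positive)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]   # everything the model flagged
    actual_pos = c["tp"] + c["fn"]      # everything that was truly positive
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Hypothetical labels, NOT taken from the paper's dataset or results.
actual = ["Positive", "Positive", "Negative", "Negative",
          "Positive", "Negative", "Positive", "Negative"]
predicted = ["Positive", "Negative", "Negative", "Positive",
             "Positive", "Negative", "Positive", "Negative"]

metrics = evaluate(actual, predicted)
print(metrics)
```

In a screening setting such as early diabetes detection, recall is often the figure to watch, since a false negative corresponds to a missed case; precision indicates how many flagged individuals are actually diabetic.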


