Authors :
Edelyn A. Bautista
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
Google Scholar :
https://tinyurl.com/5b75fvtt
Scribd :
https://tinyurl.com/yzfftw4c
DOI :
https://doi.org/10.38124/ijisrt/25oct294
Abstract :
This study applies machine learning techniques to diabetes prediction and analysis. Its goal is to develop
accurate and reliable models for early detection and a better understanding of diabetes. The Diabetes UCI Dataset, which
contains variables such as gender, polyuria, and polydipsia, is used for model training and evaluation. Data preprocessing
normalizes the features and enforces consistency, while feature selection identifies the most relevant variables. Several
classification algorithms, including the Random Tree algorithm, are tested in WEKA. Model performance is evaluated with
metrics such as accuracy, precision, and recall. Results show that Random Tree, when combined with other algorithms,
achieves high accuracy and robustness in classifying diabetic and non-diabetic individuals. The study highlights the
effectiveness of machine learning for early diabetes detection and for supporting the decisions of healthcare professionals.
Overall, it demonstrates how computational approaches can enhance diabetes management, improve patient outcomes, and
reduce the impact of this chronic disease.
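
The evaluation metrics named in the abstract (accuracy, precision, and recall) can be illustrated with a minimal Python sketch. This is not the study's WEKA workflow: the function names, the class labels "Positive" and "Negative", and the toy predictions are assumptions made only for illustration, and no result from the paper is reproduced here.

```python
def confusion_counts(y_true, y_pred, positive="Positive"):
    """Count true/false positives and negatives for a binary classifier.

    The label "Positive" (diabetic) is an assumed encoding, not taken
    from the paper.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive="Positive"):
    """Return accuracy, precision, and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        # Share of all cases that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted diabetic cases that are truly diabetic.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly diabetic cases that the model detected.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration only (not data from the study).
y_true = ["Positive", "Positive", "Positive", "Negative",
          "Negative", "Negative", "Positive", "Negative"]
y_pred = ["Positive", "Positive", "Negative", "Negative",
          "Negative", "Positive", "Positive", "Negative"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall is often weighted heavily because a missed diabetic case (a false negative) is usually costlier than a false alarm; that trade-off is a general observation, not a finding reported in the abstract.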
Keywords :
Diabetes Prediction, Dataset, Classification Algorithm, Analysis, Prediction, Healthcare.