A Novel Random Forest-Based Algorithm for Diabetes Diagnosis


Authors : Avishek Gupta; Sudeshna Das; Sohini Banerjee

Volume/Issue : Volume 10 - 2025, Issue 11 - November


Google Scholar : https://tinyurl.com/46nvak6x

Scribd : https://tinyurl.com/bdudu7f3

DOI : https://doi.org/10.38124/ijisrt/25nov256



Abstract : This work addresses the need for better diabetes diagnosis, since the techniques currently in use still lack both efficiency and accuracy. For the present investigation, we use the Random Forest classifier as our model. Random Forest is an ensemble learning method that builds many decision trees during training and outputs either the mode of the trees' predicted classes (for classification) or the mean of the individual trees' predictions (for regression). The investigation builds and compares different intelligent systems based on multilayer algorithms using the data set that Kare published. The variables in this data set include blood glucose level, HbA1c level, smoking history, cardiovascular disease, hypertension, age, gender, and body mass index (BMI). In addition to offering insights into trends and patterns in diabetes risk, this analysis lays the groundwork for future studies. In particular, further studies can examine how these factors interact and affect the development and course of diabetes, which is essential information for improving patient care and outcomes in this increasingly important field of medicine.

Keywords : Diabetes; Machine Learning; Random Forest; Early Detection.
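
The aggregation step described in the abstract (mode of the tree outputs for classification, mean for regression) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the one-split "trees", the feature names `blood_glucose`, `hba1c`, and `bmi`, and all thresholds are hypothetical placeholders chosen only to make the example run. A real Random Forest would learn its trees from bootstrap samples of the training data.

```python
from collections import Counter

# Hypothetical decision stumps standing in for trained trees.
# Thresholds are illustrative placeholders, not clinical cut-offs
# and not values taken from the paper.
def stump(feature, threshold):
    """Return a one-split 'tree' that predicts 1 when feature >= threshold."""
    return lambda patient: 1 if patient[feature] >= threshold else 0

forest = [
    stump("blood_glucose", 140.0),
    stump("hba1c", 6.5),
    stump("bmi", 30.0),
]

def predict_class(forest, patient):
    """Classification: return the mode (majority vote) of the tree outputs."""
    votes = [tree(patient) for tree in forest]
    return Counter(votes).most_common(1)[0][0]

def predict_mean(forest, patient):
    """Regression-style aggregation: return the mean of the tree outputs."""
    outputs = [tree(patient) for tree in forest]
    return sum(outputs) / len(outputs)

# Made-up patient record used only to exercise the functions.
patient = {"blood_glucose": 155.0, "hba1c": 6.1, "bmi": 32.4}
label = predict_class(forest, patient)
score = predict_mean(forest, patient)
```

For the made-up record above, two of the three stumps vote for the positive class, so the majority vote returns the positive label while the mean reports the fraction of trees that agreed.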


