Authors :
Diri, Ezekiel Ebere; Diri, Grace Oluchi; Rita Chikeru Owhonda; Nbaakee, Lebari Goodday; Unula Godknows; Kingsley Theophilus Igulu
Volume/Issue :
Volume 10 - 2025, Issue 9 - September
Google Scholar :
https://tinyurl.com/r4hus3p8
Scribd :
https://tinyurl.com/2rbcwxav
DOI :
https://doi.org/10.38124/ijisrt/25sep706
Abstract :
Stroke recurrence remains one of the most devastating challenges in managing cerebrovascular disease, adding to
disability, mortality, and rising healthcare costs worldwide. Being able to predict recurrence early could mean the difference
between timely intervention and irreversible outcomes. In this study, we explored whether three machine learning models
(Logistic Regression, Random Forest, and XGBoost) could predict recurrence risk using only a small set of routine clinical
features. Preprocessing involved managing missing values, scaling variables, and applying SMOTE to balance the classes
without distorting real patient patterns. Models were evaluated across accuracy, precision, recall, F1 Score, and AUC-ROC,
with greater weight placed on recall and F1 given the clinical need to minimize missed recurrences. Random Forest delivered
the strongest results, achieving an accuracy of 92.39%, a recall of 94.05%, an F1 Score of 92.56%, and an AUC-ROC of
97.04%. These findings suggest that even simple, carefully designed predictive models could offer real clinical value,
particularly in healthcare environments where rich data resources are limited and early warnings could make a critical
difference for patient care.
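As a minimal illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, and F1 Score from the four cells of a binary confusion matrix, with recurrence treated as the positive class. The class name, function names, and example counts are hypothetical and are not taken from the study; the sketch only shows why recall is the metric that penalizes missed recurrences (false negatives).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix cells for a binary classifier (positive = recurrence)."""
    tp: int  # recurrences correctly flagged
    fp: int  # non-recurrences wrongly flagged as recurrence
    fn: int  # recurrences the model missed
    tn: int  # non-recurrences correctly cleared


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Share of flagged patients who actually had a recurrence."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of true recurrences that the model caught."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (not the study's results).
example = ConfusionCounts(tp=80, fp=10, fn=5, tn=105)

if __name__ == "__main__":
    print(f"accuracy:  {accuracy(example):.4f}")
    print(f"precision: {precision(example):.4f}")
    print(f"recall:    {recall(example):.4f}")
    print(f"f1:        {f1_score(example):.4f}")
```

In this hypothetical case the five missed recurrences lower recall but barely move accuracy, which is one reason a study of this kind might weight recall and F1 more heavily than accuracy.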
Keywords :
Stroke Recurrence Prediction, Machine Learning Models, Random Forest Classifier, Minimal Clinical Features, Secondary Stroke Prevention.