Optimising Stroke Recurrence Prediction Using Minimal Clinical Features and Machine Learning Models


Authors : Diri, Ezekiel Ebere; Diri, Grace Oluchi; Rita Chikeru Owhonda; Nbaakee, Lebari Goodday; Unula Godknows; Kingsley Theophilus Igulu

Volume/Issue : Volume 10 - 2025, Issue 9 - September


Google Scholar : https://tinyurl.com/r4hus3p8

Scribd : https://tinyurl.com/2rbcwxav

DOI : https://doi.org/10.38124/ijisrt/25sep706



Abstract : Stroke recurrence remains one of the most devastating challenges in managing cerebrovascular disease, adding to disability, mortality, and rising healthcare costs worldwide. Predicting recurrence early could mean the difference between timely intervention and irreversible outcomes. In this study, we explored whether three machine learning models (Logistic Regression, Random Forest, and XGBoost) could predict recurrence risk using only a small set of routine clinical features. Preprocessing involved handling missing values, scaling variables, and applying SMOTE to balance the classes without distorting real patient patterns. Models were evaluated on accuracy, precision, recall, F1 score, and AUC-ROC, with greater weight placed on recall and F1 score given the clinical need to minimize missed recurrences. Random Forest delivered the strongest results, achieving an accuracy of 92.39%, a recall of 94.05%, an F1 score of 92.56%, and an AUC-ROC of 97.04%. These findings suggest that even simple, carefully designed predictive models could offer real clinical value, particularly in healthcare environments where rich data resources are limited and early warnings could make a critical difference for patient care.

Keywords : Stroke Recurrence Prediction, Machine Learning Models, Random Forest Classifier, Minimal Clinical Features, Secondary Stroke Prevention.
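
The abstract places greater weight on recall and F1 score than on accuracy. The sketch below illustrates why that choice matters by computing the four threshold-based metrics from confusion-matrix counts. It is a minimal illustration and not the authors' evaluation code: the `ConfusionCounts` class, the function names, and the counts for the two models are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts with 'recurrence' as the positive class."""
    tp: int  # recurrences correctly flagged
    fn: int  # recurrences missed by the model
    fp: int  # non-recurrences wrongly flagged
    tn: int  # non-recurrences correctly cleared


def accuracy(c: ConfusionCounts) -> float:
    # Share of all patients classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def precision(c: ConfusionCounts) -> float:
    # Share of flagged patients who truly had a recurrence.
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    # Share of true recurrences that the model caught.
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall, written in count form.
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Hypothetical counts for 1000 patients, 100 of whom had a recurrence.
# These numbers are invented for illustration only.
model_a = ConfusionCounts(tp=70, fn=30, fp=5, tn=895)
model_b = ConfusionCounts(tp=90, fn=10, fp=60, tn=840)

for name, counts in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: "
        f"accuracy={accuracy(counts):.3f} "
        f"precision={precision(counts):.3f} "
        f"recall={recall(counts):.3f} "
        f"f1={f1_score(counts):.3f}"
    )
```

In this invented example, model A has the higher accuracy but misses 30 of 100 recurrences, while model B misses only 10 at the cost of more false alarms. Accuracy alone would favour A, and F1 also favours A here because B's false alarms lower its precision, which is why recall and F1 are best read together when missed recurrences are the main clinical concern.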


