Authors :
Divya P.; Ajmal Ahamed R.; Duraiarasi S.; Jaisuriya N.; Jayashree A.
Volume/Issue :
Volume 10 - 2025, Issue 2 - February
Google Scholar :
https://tinyurl.com/zrbvju3e
Scribd :
https://tinyurl.com/58v9h5t5
DOI :
https://doi.org/10.5281/zenodo.14944955
Abstract :
In this work, we aimed to predict the incidence of stroke using machine learning approaches. The dataset includes
demographic and health-related variables such as age, gender, heart disease, hypertension, and smoking status. After
preprocessing the data, which included encoding categorical variables and handling missing values, we trained several
classifiers, including the Random Forest Classifier, AdaBoost Classifier, and Gradient Boosting Classifier. To
evaluate the models' performance, we used accuracy, precision, recall, and F1-score. The Random
Forest Classifier achieved the highest accuracy, 94.72% on the test set. To address class imbalance, we
also applied resampling techniques such as Random Over Sampler and SMOTE (Synthetic Minority Over-sampling Technique), which
improved the models' capacity to predict strokes. Overall, our findings suggest that machine
learning algorithms, with the Random Forest Classifier showing promising accuracy, may be able to predict the
incidence of stroke from demographic and health-related data.
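
The workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the data is a synthetic, purely numeric stand-in generated with scikit-learn's `make_classification` (so the encoding and missing-value steps are skipped), the 5% positive rate and all hyperparameters are assumptions, and `random_oversample` is a hypothetical hand-written helper that mimics the idea behind Random Over Sampler by duplicating minority-class rows. SMOTE is not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def random_oversample(X, y, seed=0):
    """Hypothetical helper: duplicate rows of smaller classes at random
    until every class has as many rows as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing rows with replacement from this class only.
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(parts)
    return X[keep], y[keep]


# Synthetic stand-in for the stroke dataset (assumed: about 5% positive cases).
X, y = make_classification(
    n_samples=2000,
    n_features=8,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Resample the training split only, so the test set keeps its original balance.
X_bal, y_bal = random_oversample(X_train, y_train, seed=42)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "GradientBoosting": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_bal, y_bal)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Reporting precision, recall, and F1-score next to accuracy matters here because, with few positive cases, a model can reach high accuracy while missing most of them. The scores printed by this sketch come from synthetic data and are not comparable to the 94.72% reported above.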
Keywords :
Brain Stroke, Cerebrovascular Accident, Oxygen and Nutrients, Ischemic Stroke.