Authors :
Md. Sharfuddin; Afroja Akter Mim
Volume/Issue :
Volume 11 - 2026, Issue 6 - June
Google Scholar :
https://tinyurl.com/2s3s4k7w
Scribd :
https://tinyurl.com/3tvmpwab
DOI :
https://doi.org/10.38124/ijisrt/26jun414
Abstract :
Heart stroke is a condition in which blood circulation to a segment of the heart is reduced or stopped.
Blood clots form and block the flow of arterial blood, and with it oxygen. If the risk of heart stroke is identified in advance,
the condition can be treated, and early diagnosis can reduce the chance of death. Machine learning now makes such early
prediction possible. In most studies, several machine learning models are trained on the same dataset, but only a single
model is highlighted. Likewise, features are extracted from the dataset, but feature importance is mostly not reported, so
nothing is said about which features matter most. In this work, we aggregate multiple models into an ensemble using soft
voting, and we additionally tune the hyperparameters of a single model because it showed better accuracy. We also apply
the explainable AI methods SHAP and LIME to explain the importance of each feature.
Keywords :
Heart Stroke Prediction, Machine Learning, Explainable AI, SHAP, LIME, Ensemble Learning, XGBoost, LightGBM, Healthcare Analytics.
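The soft-voting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `soft_vote`, the three stand-in models, and all probability values are hypothetical and chosen only to show the mechanism. Soft voting averages the class probabilities predicted by each model and selects the class with the highest average, which can differ from a simple majority of the models' individual labels.

```python
from typing import List, Optional, Sequence


def soft_vote(probas: Sequence[Sequence[Sequence[float]]],
              weights: Optional[Sequence[float]] = None) -> List[int]:
    """Average class probabilities across models and pick the top class.

    probas[m][i][c] is model m's probability that sample i belongs to class c.
    weights optionally gives one non-negative weight per model.
    """
    n_models = len(probas)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])

    labels = []
    for i in range(n_samples):
        # Weighted mean probability of each class for sample i.
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        # Predicted label is the class with the largest mean probability.
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Hypothetical outputs of three models for two patients
# (class 0 = no event, class 1 = event). Values are made up for illustration.
model_a = [[0.90, 0.10], [0.20, 0.80]]
model_b = [[0.45, 0.55], [0.40, 0.60]]
model_c = [[0.45, 0.55], [0.55, 0.45]]

predictions = soft_vote([model_a, model_b, model_c])
print(predictions)
```

For the first patient, two of the three models lean towards class 1, so a majority (hard) vote would return 1, but the confident class 0 estimate from the first model pulls the averaged probability towards class 0. In scikit-learn, `VotingClassifier` with `voting="soft"` applies the same averaging to fitted estimators that expose `predict_proba`.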
Dataset Available :
https://www.kaggle.com/datasets/mirzahasnine/heart-disease-dataset