Machine Learning Based Stroke Detection: A Predictive Approach | International Journal of Innovative Science and Research Technology

Machine Learning Based Stroke Detection: A Predictive Approach

Authors : Akinsola Adeniyi F.; Sokunbi M. A.; Ogundele I. O.; Onadokun I. O.

Volume/Issue : Volume 11 - 2026, Issue 2 - February

Google Scholar : https://tinyurl.com/2n2d39mh

Scribd : https://tinyurl.com/2hvmxcjv

DOI : https://doi.org/10.38124/ijisrt/26feb1439

Abstract : Stroke is one of the greatest challenges facing global public health today; it ranks as the second most common cause of death and the third most common cause of long-term disability worldwide [1]. Early diagnosis is important for lowering the risk of death and long-term disability from stroke. However, traditional diagnostic methods such as CT and MRI scans can be expensive and time-consuming, and they require specialists to interpret them. These challenges can delay treatment decisions, especially in low-resource settings that lack access to advanced imaging technologies or qualified personnel. For this reason, new approaches are being studied to identify stroke rapidly and efficiently. In this study, a stroke detection model was developed by applying machine learning to structured patient data. Four supervised learning methods were used: LightGBM, CatBoost, XGBoost, and Random Forest. Each was trained and tested using a 70/30 train/test split. Accuracy, Precision, Recall, F1-Score, and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) were used as measures of model performance. Of the models compared, Random Forest performed best, with an accuracy of 90% and an F1-Score of 0.95. It also outperformed the gradient boosting methods on each of the most important performance metrics.
These results indicate that machine learning algorithms, particularly ensemble techniques such as Random Forest, can potentially enhance current diagnostic pathways by providing faster, more scalable and more accessible predictions of stroke risk than are currently available. This could create opportunities for earlier clinical intervention and better patient outcomes, specifically in low-resource health care settings. Machine learning tools should not be used to supplant clinical expertise; however, if integrated into routine practice, they could mark an important step toward using data more effectively and equitably when caring for patients with stroke.

Keywords : Stroke Detection, Machine Learning, Random Forest, Predictive Modeling, Healthcare AI
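
The evaluation protocol named in the abstract (a 70/30 train/test split scored with Accuracy, Precision, Recall, F1-Score and AUC-ROC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the random seed, and the sample labels and scores are hypothetical, and the four models from the study are not reproduced here.

```python
import random


def split_70_30(rows, seed=42):
    """Shuffle rows reproducibly and return (train, test) at a 70/30 ratio.

    The seed value is an arbitrary assumption, not taken from the paper.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 7 // 10  # integer arithmetic avoids float rounding
    return rows[:cut], rows[cut:]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = stroke)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and model outputs, for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    y_score = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

    train, test = split_70_30(range(10))
    print(len(train), len(test))
    print(classification_metrics(y_true, y_pred))
    print(auc_roc(y_true, y_score))
```

In practice the same quantities are usually obtained from a library such as scikit-learn; the hand-written versions here only make explicit how each metric is derived from the model's predictions on the held-out 30%.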

References :

[1] Valery L Feigin, Michael Brainin, Bo Norrving, Sheila O Martins, Jeyaraj Pandian, Patrice Lindsay, Maria F Grupper, Ilari Rautalin. World Stroke Organization, WSO Global Stroke Fact Sheet 2025. World Stroke Organization, 2025. DOI: 10.1177/17474930241308142
[2] GBD 2021 Stroke Collaborators, “Global, regional, and national burden of stroke and its risk factors, 1990–2021,” The Lancet Neurology, vol. 23, no. 4, pp. 345–367, 2024.
[3] Binbin Sui, Peiyi Gao (2020), “Imaging evaluation of acute ischemic stroke,” Journal of International Medical Research, https://doi.org/10.1177/0300060518802530
[4] S. Dritsas and M. Trigka (2022), “Stroke risk prediction with machine learning techniques,” Sensors, vol. 22, no. 13, DOI: 10.3390/s22134670
[5] Nojood Alageel, Rahaf Alharbi, Rehab Alharbi, Lubna A. Alharbi, Maryam Alsayil (2023), “Using Machine Learning Algorithm as a Method for Improving Stroke Prediction,” International Journal of Advanced Computer Science and Applications. DOI: 10.14569/IJACSA.2023.0140481
[6] Senjuti Rahman, Mehedi Hasan, Ajay Sarkar, “Prediction of Brain Stroke Using Machine Learning Algorithms and Deep Neural Network Techniques,” European Journal of Electrical Engineering and Computer Science, 7(1):23-30, 2023. DOI: 10.24018/ejece.2023.7.1.483
[7] Mandeep Kaur, Sachin R. Sakhare, Kirti Wanjale, Farzana Akter (2022), “Early Stroke Prediction Methods for Prevention of Strokes,” Behavioural Neurology, https://doi.org/10.1155/2022/7725597
[8] R. Pitchai, Bhasker Dappuri, P. V. Pramila, M. Vidhyalakshmi, S. Shanthi, Wadi B. Alonazi, Khalid M. A. Almutairi, R. S. Sundaram, Ibsa Beyene (2023), “An Artificial Intelligence-Based Bio-Medical Stroke Prediction and Analytical System Using a Machine Learning Approach,” Computational Intelligence and Neuroscience, https://doi.org/10.1155/2022/5489084
[9] Nouf Saeed Alotaibi, Abdullah Shawan Alotaibi, M. Eliazer, Asadi Srinivasulu (2022), “Detection of Ischemic Stroke Tissue Fate from the MRI Images Using a Deep Learning Approach,” Mobile Information Systems, https://doi.org/10.1155/2022/9399876
[10] Soumyabrata Dev, Hewei Wang, Chidozie Shamrock Nwosu, Nishtha Jain, Bharadwaj Veeravalli, Deepu John (2022), “A Predictive Analytics Approach for Stroke Prediction Using Machine Learning and Neural Networks,” Healthcare Analytics, https://doi.org/10.1016/j.health.2022.100032
[11] Eman M Alanazi, Aalaa Abdou, Jake Luo (2021), “Predicting Risk of Stroke from Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models,” JMIR Formative Research, 5(12), DOI: 10.2196/23440
[12] JoonNyung Heo, Jihoon G Yoon, Hyungjong Park, Young Dae Kim, Hyo Suk Nam, Ji Hoe Heo (2019), “Machine Learning–Based Model for Prediction of Outcomes in Acute Stroke,” Stroke, vol. 50, no. 5, DOI: 10.1161/STROKEAHA.118.024293
[13] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001. https://doi.org/10.1023/A:1010933404324
[14] G. Ke et al., “LightGBM: A highly efficient gradient boosting decision tree,” in Advances in Neural Information Processing Systems (NeurIPS), 2017.
[15] L. Prokhorenkova, G. Gusev, A. Vorobev, A. Dorogush, and A. Gulin, “CatBoost: Unbiased boosting with categorical features,” in Advances in Neural Information Processing Systems (NeurIPS), 2018.
[16] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 2016, pp. 785–794. https://doi.org/10.1145/2939672.29397
[17] D. Chicco and G. Jurman, “The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation,” BMC Genomics, vol. 21, no. 1, p. 6, 2020.
