Predicting Mortality in Patients with Stroke Using Data Mining Techniques

2021 
Mortality due to stroke is increasing, and accurate prediction of stroke-related death is important for healthcare. Data mining methods offer a newer way to predict this mortality risk. The aim of this study is to apply popular data mining algorithms to predict the survival of stroke patients and to extract decision rules. Data on stroke patients (n=4149) were collected from paper medical records. Missing data were handled with the multiple imputation (MI) method, and the target variable was balanced with over-sampling, under-sampling and the Synthetic Minority Over-sampling Technique (SMOTE). The support vector machine (SVM), decision tree and logistic regression (LR) algorithms were used to predict survival, and the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithm was used to extract decision rules from the main dataset. LR outperformed the other algorithms in accuracy (76.96%), sensitivity (79.06%) and Kappa (33.34); however, its specificity (65.35%) and AUC (0.77) were lower than those of the other algorithms. An independent dataset of 234 records was then used to challenge LR, the best-performing algorithm on the main dataset. On this external validation dataset, its accuracy (79.91%), sensitivity (83.94%), Kappa (39.26) and AUC (0.8) improved, whereas its specificity fell to 60.98%. The constructed model predicted the survival of stroke patients with good accuracy and sensitivity, and useful decision rules were extracted for clinical use.
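For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity and Cohen's Kappa are derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations, not the study's code or data. The abstract appears to report Kappa on a percentage scale; the sketch returns it as a fraction.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fn: positive-class cases predicted correctly / incorrectly.
    tn/fp: negative-class cases predicted correctly / incorrectly.
    Assumes every row and column of the matrix is non-empty.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Agreement expected by chance, from the marginal totals.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((tn + fp) / total) * ((tn + fn) / total)
    p_expected = p_pos + p_neg

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = binary_metrics(tp=80, fn=20, fp=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Kappa can stay modest even when accuracy is fairly high if the classes are imbalanced, which is one reason studies of this kind tend to report it alongside accuracy.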