Feature Selection in Pre-Diagnosis Heart Coronary Artery Disease Detection: A heuristic approach for feature selection based on Information Gain Ratio and Gini Index

2020 
Cardiovascular disease is one of the most common causes of mortality in the world. Among the different types of this disease, coronary artery disease (CAD) is the most important, and its correct and timely diagnosis is vital. Diagnostic and treatment methods for this disease have many side effects and costs. The best and most accurate diagnostic method is angiography, so researchers seek economical, high-accuracy alternatives. The disease-related features and different data mining techniques are described, with the aim of increasing diagnostic accuracy through a single dataset of essential and useful features. Data were collected from 303 suspected cardiovascular patients at Shahid Rajaee Hospital, Tehran. Among the samples, 87 are healthy and 216 are sick. In the first step, optimal feature subsets are selected with respect to performance, speed of diagnosis, and precision in order to determine the severity of CAD. This feature selection supports prediction and improves the learning model. The optimal machine learning models are then applied to analyze and predict CAD. An accuracy of 99.67% is obtained, the highest accuracy obtained in this field. The left anterior descending (LAD), left circumflex (LCX), and right coronary artery (RCA) features are diagnosed with high accuracy using those models. These three features appear to define CAD and depend on angiography. If they are eliminated for the pre-diagnosis setting, the accuracy of CAD detection is between 83% and 86% for the proposed reduced feature subset, which comes with a noticeable reduction in performance.
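The two criteria named in the title can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the toy feature are assumptions made for illustration, and only the class counts (216 sick, 87 healthy) come from the abstract. The sketch computes the gain ratio (information gain divided by split information) and the weighted Gini index of a categorical feature against the class labels.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gini(items):
    """Gini impurity of a sequence of categorical values."""
    n = len(items)
    return 1.0 - sum((c / n) ** 2 for c in Counter(items).values())


def partition(values, labels):
    """Group the labels by the feature value they co-occur with."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def gain_ratio(values, labels):
    """Information gain of the feature divided by its split information."""
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def gini_index(values, labels):
    """Weighted Gini impurity after splitting on the feature (lower is better)."""
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in partition(values, labels))


# Class distribution reported in the abstract: 216 sick, 87 healthy.
labels = ["CAD"] * 216 + ["Normal"] * 87

# Hypothetical binary feature, invented for illustration only:
# present in 180 of the sick patients and 20 of the healthy ones.
toy_feature = ["yes"] * 180 + ["no"] * 36 + ["yes"] * 20 + ["no"] * 67

if __name__ == "__main__":
    print("Gini impurity of the class labels:", gini(labels))
    print("Majority-class baseline accuracy:", 216 / len(labels))
    print("Gain ratio of toy feature:", gain_ratio(toy_feature, labels))
    print("Gini index of toy feature:", gini_index(toy_feature, labels))
```

Under these assumptions, a heuristic selection step could rank features by descending gain ratio and ascending Gini index and keep the top-ranked subset; how the paper combines the two scores is not stated in the abstract.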