Effective Prediction of Type II Diabetes Mellitus Using Data Mining Classifiers and SMOTE

2020 
Diabetes is a metabolic disorder and is currently one of the most serious diseases facing mankind. In diabetes, the body does not respond properly to insulin, an important hormone that converts sugar into the energy needed for everyday functioning. The disease carries severe complications: it increases the risk of kidney disease, heart disease, retinal disease, nerve damage, and blood vessel damage. According to the World Health Organization, about 8.8% of the world's population was diabetic in 2017, and this figure is projected to reach nearly 10% by 2045. This study develops a model for diabetes prediction based on data mining classification techniques. Classifying imbalanced data is challenging, especially in medical informatics, and this motivated the development of a classifier that uses a rebalancing algorithm. A two-phase classification model is employed: the first phase preprocesses the data with the Synthetic Minority Oversampling Technique (SMOTE), and the second feeds the preprocessed data to five classifiers (Bagging, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Simple Logistic, and Decision Tree) in order to select the best classifier for predicting diabetes on the balanced dataset. The decision tree classifier achieved an accuracy of 94.7013% and a receiver operating characteristic (ROC) curve score of 0.953. Validation used 10-fold cross-validation in an experiment conducted on the clinical records of 734 patients.
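
The rebalancing idea behind the first phase can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the authors' implementation: the function name `smote`, the toy feature values, and the choice to pick base samples at random are all assumptions made for illustration, and the classifiers and 10-fold cross-validation of the second phase are not shown. The sketch only demonstrates the core step, which creates each synthetic minority record by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Simplified sketch: base samples are drawn at random rather than
    visited in turn as in the original SMOTE formulation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data with two numeric features per record.
majority = [[5.0 + 0.1 * i, 6.0 - 0.1 * i] for i in range(10)]
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]

# Generate enough synthetic minority records to match the majority class.
new_points = smote(minority, len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because every synthetic record lies on a line segment between two real minority records, the oversampled class stays within the region the minority data already occupies instead of duplicating existing rows.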