CatBoost Ensemble Approach for Diabetes Risk Prediction at Early Stages

2021 
Predicting diabetes at an early stage is an important problem in healthcare, because it lets an individual begin treatment before dangerous complications develop. Many machine learning and ensemble learning techniques have been used for early-stage diabetes prediction. In this paper, we propose using CatBoost, an ensemble technique based on Gradient Boosting Decision Trees (GBDT), for diabetes prediction at early stages. We compare the performance of CatBoost with other machine learning methods, namely K-Nearest Neighbor, Multi-Layer Perceptron, Logistic Regression, Gaussian Naive Bayes, and Stochastic Gradient Descent, and evaluate the results using accuracy, precision, recall, F1-score, and the AUC-ROC curve. The experiments use the “Early stage diabetes risk prediction” dataset available in the UCI Machine Learning Repository. The results show that CatBoost outperforms the other machine learning methods.
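The evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) are hypothetical, and the labels and scores in the usage lines are made-up toy values, not results from the paper. It assumes binary labels where 1 marks the positive (diabetic) class and a 0.5 decision threshold.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1-score from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, y_score):
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive example receives the higher score; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (assumed, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # hypothetical predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in y_score]     # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, y_score))
```

In a comparison like the one described, each classifier's predictions on the same held-out split would be passed through the same functions, so the models are ranked on identical terms. AUC-ROC is computed from scores rather than thresholded labels, which is why it takes `y_score` instead of `y_pred`.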