Nonlaboratory-based risk assessment model for type 2 diabetes mellitus screening in Chinese rural population: a joint bagging-boosting model.

2021 
Diabetes mellitus is one of the world's major public health problems because of its high prevalence and medical costs. Prevention requires reliable risk assessment models that can identify high-risk individuals and enable healthcare practitioners to initiate appropriate preventive interventions. However, data-driven diabetes risk assessment models face multiple challenges, such as class imbalance and low identification rates. To address these challenges, this paper proposes a data-driven analytical framework using large-scale population data from the Henan Rural Cohort Study. A joint bagging-boosting model (JBM) was developed and validated. To make the model convenient for large-scale population screening, we excluded laboratory variables and collinear variables, using the maximum likelihood ratio method, to retain only readily accessible variables. We then compared methods for handling the class imbalance in the available data, including over-sampling and under-sampling. Finally, to improve overall model performance, we constructed a joint model that combines the bagging and boosting algorithms through stacking. The resulting model demonstrated good discrimination, with an area under the curve (AUC) of 0.885, and acceptable calibration (Brier score = 0.072). Compared with the benchmark model, the proposed framework improved the AUC by 13.5% and increased recall from 0.744 to 0.847. The proposed model contributes to the personalized management of diabetes, especially in settings with limited medical resources.
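The three evaluation metrics reported above (AUC for discrimination, Brier score for calibration, and recall) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are assumptions chosen only to show how each quantity is defined.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def recall(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Share of true cases flagged as high risk at the given threshold (assumed 0.5 here)."""
    flagged = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    cases = sum(1 for y in y_true if y == 1)
    return flagged / cases


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random case is scored above a random non-case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diabetes case, 0 = non-case (imbalanced on purpose).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

toy_auc = auc(y_true, y_prob)
toy_recall = recall(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
print(f"AUC={toy_auc:.3f} recall={toy_recall:.3f} Brier={toy_brier:.3f}")
```

The pairwise AUC shown here is quadratic in sample size, which is fine for illustration; for cohort-scale data a rank-based implementation from an established library would be the more practical choice.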