Fasting Glucose Identification Using Machine Learning: A Fundamental Step in Automated Diabetes Phenotyping in Electronic Medical Records

2021 
Background: Fasting blood glucose (FBG) values extracted from electronic medical records (EMR) are typically assumed to be valid in existing research, which may cause diagnostic bias. We proposed a machine learning (ML) algorithm to predict the fasting status of blood samples.
Methods: We evaluated the distribution of BG values in the China Medical University Hospital Clinical Research Data Repository and defined a sample as theoretically fasting when its ontologically labeled FBG was lower than the estimated average glucose calculated from HbA1c. In addition to fitting a logistic regression model, we extracted 67 features to predict the fasting status with eXtreme Gradient Boosting (XGBoost). The discrimination and calibration of the prediction models were also assessed.
Findings: Of the 784,340 ontologically labeled fasting samples, 77·1% were considered theoretical FBGs. In patients without diabetes mellitus (DM), the median glucose and HbA1c levels were 110 mg/dL and 5·6% for ontological fasting samples and 92 mg/dL and 5·6% for theoretical fasting samples. In multiple logistic regression, 14 selected variables were significantly associated with the fasting status, but the area under the receiver operating characteristic curve (AUROC) was only 0·868. XGBoost showed better calibration and achieved an AUROC of 0·892; its top five contributing predictors were glucose level, distance from home to hospital, age, height, and a concomitantly prescribed serum creatinine test.
Interpretation: Our results demonstrate an innovative approach to cleaning EMR data and detecting true FBG, which could support more accurate estimation of the global and local epidemiology of DM.
Funding: This study was supported by the Ministry of Science and Technology of Taiwan (grant numbers: 108-2314-B-039-038-MY3 and 109-2321-B-468-001).
Declaration of Interest: None to declare.
Ethical Approval: The study was approved by the Research Ethical Committee/Institutional Review Board of China Medical University Hospital (CMUH105-REC3-068).
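The labeling rule in the Methods (a sample counts as theoretically fasting when its glucose is below the estimated average glucose derived from HbA1c) can be illustrated with a short sketch. The abstract does not state which conversion the authors used; the example below assumes the widely cited ADAG linear relation, eAG (mg/dL) = 28.7 × HbA1c (%) − 46.7, and the function names are hypothetical.

```python
# Minimal sketch of the "theoretical fasting" rule described in the Methods.
# Assumption: eAG is computed with the ADAG formula; the abstract does not
# confirm which conversion the authors actually applied.

def estimated_average_glucose(hba1c_percent: float) -> float:
    """Return estimated average glucose in mg/dL from HbA1c in percent."""
    return 28.7 * hba1c_percent - 46.7


def is_theoretical_fasting(glucose_mg_dl: float, hba1c_percent: float) -> bool:
    """Label a sample as theoretically fasting when its glucose is below eAG."""
    return glucose_mg_dl < estimated_average_glucose(hba1c_percent)


if __name__ == "__main__":
    # Hypothetical samples: (glucose in mg/dL, HbA1c in %)
    for glucose, hba1c in [(92, 5.6), (120, 5.6)]:
        print(glucose, hba1c, is_theoretical_fasting(glucose, hba1c))
```

Under this assumed formula, an HbA1c of 5·6% corresponds to an eAG of roughly 114 mg/dL, so a glucose of 92 mg/dL would be labeled theoretically fasting and a glucose of 120 mg/dL would not.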