Electronic, personalized clinical decision support tools to optimize glycated hemoglobin (HbA1c) screening are lacking. Current screening guidelines are based on simple, categorical rules developed for populations of patients. Although personalized diabetes risk calculators have been created, none are designed to predict current glycemic status using structured data commonly available in electronic health records (EHRs). The goal of this project was to create a mathematical equation for predicting the probability of current elevations in HbA1c (≥5.7%) among patients with no history of hyperglycemia using readily available variables that will allow integration with EHR systems. The reduced model was compared head-to-head with calculators created by Baan and Griffin. Ten-fold cross-validation was used to calculate the bias-adjusted prediction accuracy of the new model. Statistical analyses were performed in R version 3.2.5 (The R Foundation for Statistical Computing) using the rms (Regression Modeling Strategies) package. The final model to predict an elevated HbA1c, based on 22,635 patient records, contained the following variables in order from most to least important according to their impact on the discriminating accuracy of the model: age, body mass index, random glucose, race, serum non-high-density lipoprotein, serum total cholesterol, estimated glomerular filtration rate, and smoking status. The new model achieved a concordance statistic of 0.77, which was statistically significantly better than prior models. The model appeared to be well calibrated according to a plot of the predicted probabilities versus the prevalence of the outcome at different probabilities. The calculator created for predicting the probability of having an elevated HbA1c significantly outperformed the existing calculators. The personalized prediction model presented in this paper could improve the efficiency of HbA1c screening initiatives.
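For readers unfamiliar with the concordance statistic reported above, the following is a minimal Python sketch of how it can be computed for a binary outcome. It is not the authors' code (their analysis used R and the rms package); the function name `concordance_statistic` and the six example records are hypothetical and chosen only for illustration.

```python
from itertools import product


def concordance_statistic(probs, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; tied pairs count as one half."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical predicted probabilities of HbA1c >= 5.7% and observed status
# (1 = elevated HbA1c, 0 = not elevated). These are made-up values.
predicted = [0.10, 0.35, 0.40, 0.80, 0.65, 0.20]
observed = [0, 1, 0, 1, 1, 0]

c_stat = concordance_statistic(predicted, observed)
print(round(c_stat, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect discrimination, so the reported 0.77 means the model ranks a randomly chosen patient with elevated HbA1c above a randomly chosen patient without it about 77% of the time.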
OBJECTIVE Diabetes surveillance often requires manual medical chart reviews to confirm status and type. This project aimed to create an electronic health record (EHR)-based procedure for improving surveillance efficiency through automation of case identification. RESEARCH DESIGN AND METHODS Youth (<20 years old) with potential evidence of diabetes (N = 8,682) were identified from EHRs at three children’s hospitals participating in the SEARCH for Diabetes in Youth Study. True diabetes status/type was determined by manual chart reviews. Multinomial regression was compared with an ICD-10 rule-based algorithm in the ability to correctly identify diabetes status and type. Subsequently, the investigators evaluated a scenario of combining the rule-based algorithm with targeted chart reviews where the algorithm performed poorly. RESULTS The sample included 5,308 true cases (89.2% type 1 diabetes). The rule-based algorithm outperformed regression for overall accuracy (0.955 vs. 0.936). Type 1 diabetes was classified well by both methods: sensitivity (Se) (>0.95), specificity (Sp) (>0.96), and positive predictive value (PPV) (>0.97). In contrast, the PPVs for type 2 diabetes were 0.642 and 0.778 for the rule-based algorithm and the multinomial regression, respectively. Combination of the rule-based method with chart reviews (n = 695, 7.9%) of persons predicted to have non–type 1 diabetes resulted in perfect PPV for the cases reviewed while increasing overall accuracy (0.983). The Se, Sp, and PPV for type 2 diabetes using the combined method were ≥0.91. CONCLUSIONS An ICD-10 algorithm combined with targeted chart reviews accurately identified diabetes status/type and could be an attractive option for diabetes surveillance in youth.
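The sensitivity, specificity, and PPV figures above are one-versus-rest quantities for each diabetes type. As a hedged illustration of the arithmetic only, the Python sketch below computes them from paired lists of chart-review labels and algorithm predictions. The labels, the eight example records, and the function name `one_vs_rest_metrics` are hypothetical and are not taken from the study.

```python
def one_vs_rest_metrics(truth, predicted, target):
    """Sensitivity, specificity, and PPV for one class against all others."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    tn = sum(t != target and p != target for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical chart-review truth and algorithm output for eight patients.
truth = ["T1", "T1", "T1", "T2", "T2", "none", "none", "none"]
predicted = ["T1", "T1", "T2", "T2", "T2", "T2", "none", "none"]

t2 = one_vs_rest_metrics(truth, predicted, "T2")
accuracy = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
print(t2, accuracy)
```

In this toy data the algorithm finds every type 2 case but also labels two other patients as type 2, which is the pattern of high sensitivity with low PPV that motivates reviewing charts for predicted non-type 1 cases.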
Surveillance of diabetes mellitus (DM) status, type, and date of diagnosis requires chart review. We previously showed that the presence of ≥2 International Classification of Diseases (ICD) DM codes predicts DM status well. This project aimed to derive an EHR-based algorithm to predict diagnosis date. Youth (<20 years) with potential DM evidence in 2017 (ICD DM code, elevated glucose or HbA1c, or DM medication) were identified from the inpatient and outpatient EHR data of 3 children’s hospitals participating in the SEARCH for Diabetes in Youth Study. Potential cases were chart reviewed to determine true DM status, DM type, and diagnosis date. Cases were restricted to those with diagnosis date and data post-2008 due to EHR limitations. We compared 2 algorithms for predicting diagnosis date: (1) first occurrence of an ICD DM code, and (2) first occurrence of any of the following: ICD DM code, elevated glucose or HbA1c, or DM medication. Among cases identified by the ICD status algorithm (n=3,678), the ICD code and multi-criteria algorithms classified diagnosis year correctly 88.9% and 88.4% of the time, respectively. Classification accuracy improved over time (Figure). Performance was poorer for type 2 than for type 1. An ICD code model can accurately identify date of diagnosis within these pediatric hospital systems. Improvement over time is likely due to increases in the amount of EHR data. Manual review of type 2 cases may be necessary. Disclosure K.M. Lenoir: None. L.E. Wagenknecht: None. J. Divers: None. R. Casanova: None. J.M. Lawrence: None. D. Dabelea: None. C. Pihoker: None. S. Saydah: None. A.D. Liese: None. D. Standiford: None. B.J. Wells: None. Funding Centers for Disease Control and Prevention (5U18DP006131-05-00)
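The two diagnosis-date algorithms compared above both reduce to "take the earliest qualifying EHR event." A minimal Python sketch of that logic follows, assuming each patient's record is a list of (date, event kind) tuples. The event-kind strings, the example patient, and the function name `first_occurrence` are hypothetical and do not reflect the study's actual data model.

```python
from datetime import date

# Hypothetical event-kind labels; the study's real coding is not shown here.
ICD_ONLY = {"icd_dm"}
MULTI_CRITERIA = {"icd_dm", "elevated_glucose", "elevated_hba1c", "dm_medication"}


def first_occurrence(events, kinds):
    """Earliest date among events whose kind is in `kinds`, or None."""
    dates = [when for when, kind in events if kind in kinds]
    return min(dates) if dates else None


# One made-up patient: a high glucose late in 2015, coded and treated in 2016.
events = [
    (date(2015, 11, 20), "elevated_glucose"),
    (date(2016, 1, 5), "icd_dm"),
    (date(2016, 1, 6), "dm_medication"),
]
chart_review_year = 2016  # hypothetical reference standard

icd_date = first_occurrence(events, ICD_ONLY)
multi_date = first_occurrence(events, MULTI_CRITERIA)
icd_correct = icd_date.year == chart_review_year
multi_correct = multi_date.year == chart_review_year
print(icd_date, icd_correct, multi_date, multi_correct)
```

The example shows one way the multi-criteria rule can assign an earlier year than the ICD-only rule: an isolated elevated laboratory value that precedes the coded diagnosis pulls the predicted date backward.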
The utilization of Annual Wellness Visits (AWVs), preventive healthcare visits covered by Medicare Part B, has grown steadily since their inception in 2011. However, longitudinal patterns and variations in use across enrollees, providers, and clinics remain poorly understood.
Frailty and social determinants of health (SDOH) have been associated with mortality for older adults. Given time limitations, passive electronic tools such as the electronic Frailty Index (eFI) and publicly available data such as the Area Deprivation Index (ADI) hold appeal for targeting limited resources to support at-risk older adults. The literature conflicts regarding the relationship between frailty and SDOH. A retrospective, observational cohort of adults 65+ (n=44,548) identified as part of the Wake Forest Baptist Health (WFBH) accountable care organization was used to evaluate the association between ADI, eFI, and mortality between 1/1/2019 and 1/1/2020. A Cox proportional hazards model was fit, adjusting for age, sex, race, and weighted Charlson Comorbidity Index. Sources of mortality data included claims data, the EHR at WFBH, and NC Vital Statistics. Block-level geographic identifiers (GEOID) were extracted and used to merge ADI national percentiles (Neighborhood Atlas), which are derived from U.S. Census 5-year American Community Survey estimates and incorporate 17 SDOH measures (e.g., income, education, housing, employment). Frailty was calculated by the WFBH eFI. A total of 9,216 (20.7%) were frail by eFI (eFI>0.21) and 235 (0.5%) died. The interaction between ADI tertile and eFI category was not significant (p=0.78). Being frail was associated with poorer survival when compared to the fit group, HR=1.94 (95% CI=1.23-3.08, p<0.01). Survival did not differ between the maximum deprivation tertile and the minimum tertile, HR=1.21 (95% CI=0.88-1.68, p=0.25). Frailty and SDOH may represent independent constructs in risk stratification for older adults. Future work will explore associations within healthcare utilization.
The oblique random survival forest (RSF) is an ensemble supervised learning method for right-censored outcomes. Trees in the oblique RSF are grown using linear combinations of predictors, whereas in the standard RSF, a single predictor is used. Oblique RSF ensembles have high prediction accuracy, but assessing many linear combinations of predictors induces high computational overhead. In addition, few methods have been developed for estimation of variable importance (VI) with oblique RSFs. We introduce a method to increase computational efficiency of the oblique RSF and a method to estimate VI with the oblique RSF. Our computational approach uses Newton-Raphson scoring in each non-leaf node. We estimate VI by negating each coefficient used for a given predictor in linear combinations, and then computing the reduction in out-of-bag accuracy. In benchmarking experiments, we find our implementation of the oblique RSF is hundreds of times faster, with equivalent prediction accuracy, compared to existing software for oblique RSFs. We find in simulation studies that "negation VI" discriminates between relevant and irrelevant numeric predictors more accurately than permutation VI, Shapley VI, and a technique to measure VI using analysis of variance. All oblique RSF methods in the current study are available in the aorsf R package, and additional supplemental materials are available online.
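The idea behind negation VI can be illustrated with a deliberately simplified Python sketch. It is not the aorsf implementation: it assumes a single fixed linear combination of predictors instead of a forest of oblique trees, it scores accuracy with Harrell's C-statistic on the same toy data instead of out-of-bag samples, and all names (`harrell_c`, `linear_risk`, `negation_importance`) and data values are hypothetical.

```python
def harrell_c(times, events, risks):
    """Harrell's C for right-censored data: among comparable pairs (the
    earlier time is an observed event), the share in which the earlier
    subject has the higher risk score; risk ties count as one half."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def linear_risk(rows, coefs):
    """Risk score as a linear combination of predictors."""
    return [sum(c * x for c, x in zip(coefs, row)) for row in rows]


def negation_importance(rows, times, events, coefs):
    """Drop in C-statistic after flipping the sign of each coefficient."""
    baseline = harrell_c(times, events, linear_risk(rows, coefs))
    importance = []
    for k in range(len(coefs)):
        flipped = list(coefs)
        flipped[k] = -flipped[k]  # negate only predictor k's coefficient
        negated = harrell_c(times, events, linear_risk(rows, flipped))
        importance.append(baseline - negated)
    return importance


# Toy data (made up): predictor 0 drives risk, predictor 1 is near-noise.
rows = [(2.0, 0.1), (1.5, -0.2), (1.0, 0.3), (0.5, -0.1), (0.0, 0.2)]
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 1]  # 0 marks a censored observation
coefs = [1.0, 0.1]

importance = negation_importance(rows, times, events, coefs)
print(importance)
```

Flipping the coefficient of the influential predictor reverses the risk ordering and collapses the C-statistic, while flipping the small coefficient leaves the ordering unchanged, so the first predictor receives a much larger importance value.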