rECHOmmend: an ECG-based machine-learning approach for identifying patients at high-risk of undiagnosed structural heart disease detectable by echocardiography

2021 
Background: Early diagnosis of structural heart disease improves patient outcomes, yet many patients remain undiagnosed. While population screening with echocardiography is impractical, electrocardiogram (ECG)-based prediction models can help target high-risk patients. We developed a novel ECG-based machine learning approach to predict multiple structural heart conditions, hypothesizing that a composite model would yield higher prevalence and positive predictive values (PPVs) to facilitate meaningful recommendations for echocardiography.

Methods: Using 2,232,130 ECGs linked to electronic health records and echocardiography reports from 484,765 adults between 1984 and 2021, we trained machine learning models to predict the presence of any of seven echocardiography-confirmed diseases within one year. This composite label included moderate or severe valvular disease (aortic/mitral stenosis or regurgitation, tricuspid regurgitation), reduced ejection fraction <50%, or interventricular septal thickness >15 mm. We tested various combinations of input features (demographics, labs, structured ECG data, ECG traces) and evaluated model performance using 5-fold cross-validation, multi-site validation (training on one clinical site and testing on 11 other independent sites), and simulated retrospective deployment (training on pre-2010 data and deploying in 2010).

Findings: Our composite rECHOmmend model using age, sex and ECG traces had an area under the receiver operating characteristic curve (AUROC) of 0.91 and a PPV of 42% at 90% sensitivity, at a prevalence of 17.9% for our composite label. Individual disease models had AUROCs ranging from 0.86 to 0.93 and lower PPVs, from 1% to 31%. The AUROC for models using different input features ranged from 0.80 to 0.93, increasing with additional features. Multi-site validation showed results similar to cross-validation, with an aggregate AUROC of 0.91 across our independent test set of 11 clinical sites after training on a separate site.
Our simulated retrospective deployment showed that, among ECGs acquired in a single year (2010) from patients without pre-existing known structural heart disease, 11% were classified as high-risk, of which 41% developed true, echocardiography-confirmed disease within one year.

Interpretation: An ECG-based machine learning model using a composite endpoint can predict previously undiagnosed, clinically significant structural heart disease while outperforming single-disease models and improving practical utility with higher PPVs. This approach can facilitate targeted screening with echocardiography to reduce under-diagnosis of structural heart disease.
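
The reported operating point (PPV of 42% at 90% sensitivity and 17.9% prevalence) and the deployment figures (11% flagged, 41% confirmed) can be related with some simple arithmetic. The sketch below is illustrative only and is not the authors' code. The function names are hypothetical. The specificity it computes is not reported in the abstract: it is back-calculated from the three stated figures using Bayes' rule, assuming they all describe the same operating point, and comes out at roughly 0.73. The cohort size of 1,000 ECGs is likewise an arbitrary assumption chosen for readability.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_specificity(ppv_value, sensitivity, prevalence):
    """Specificity implied by a reported PPV, sensitivity and prevalence.

    This inverts the PPV formula above; the result is a derived quantity,
    not a value stated in the abstract.
    """
    true_pos = sensitivity * prevalence
    false_pos = true_pos * (1.0 - ppv_value) / ppv_value
    return 1.0 - false_pos / (1.0 - prevalence)


def flagged_yield(n_ecgs, flagged_fraction, confirmed_fraction):
    """Return (ECGs flagged high-risk, flagged ECGs with confirmed disease)."""
    flagged = n_ecgs * flagged_fraction
    return flagged, flagged * confirmed_fraction


if __name__ == "__main__":
    # Figures stated in the abstract for the composite model.
    spec = implied_specificity(ppv_value=0.42, sensitivity=0.90, prevalence=0.179)
    print(f"implied specificity: {spec:.3f}")
    print(f"round-trip PPV: {ppv(0.90, spec, 0.179):.2f}")

    # Simulated 2010 deployment: 11% flagged, 41% of those confirmed.
    # The cohort size of 1000 is an assumption for illustration.
    flagged, confirmed = flagged_yield(1000, 0.11, 0.41)
    print(f"per 1000 ECGs: {flagged:.0f} flagged, about {confirmed:.0f} confirmed")
```

The same `ppv` function shows why a composite label helps: holding sensitivity and specificity fixed, PPV rises with prevalence, so a single low-prevalence disease yields a much lower PPV than the composite label. Note that the deployment cohort excludes patients with known disease, so its 41% is an observed rate and is not expected to match the cross-validation PPV exactly.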