Machine Learning to Predict Mortality and Critical Events in COVID-19 Positive New York City Patients: A Cohort Study

Akhil Vaid, Sulaiman Somani, Adam J. Russak, J. K. De Freitas, Fayzan Chaudhry, Ishan Paranjpe, Kipp W. Johnson, Samuel J. Lee, Riccardo Miotto, Felix Richter, Shan Zhao, Noam D. Beckmann, Nidhi Naik, Arash Kia, Prem Timsina, Anuradha Lala, Manish Paranjpe, Eddye Golden, Matteo Danieletto, Manbir Singh, Dara Meyer, Paul F. O'Reilly, Laura M. Huckins, Patricia Kovatch, Joseph Finkelstein, Robert Freeman, Edgar Argulian, Andrew Kasarskis, Bethany Percha, Judith A. Aberg, Emilia Bagiella, Carol R. Horowitz, Barbara Murphy, Eric J. Nestler, Eric E. Schadt, Judy H. Cho, Carlos Cordon-Cardo, Valentin Fuster, Dennis S. Charney, David Reich, Erwin P. Bottinger, Matthew A. Levin, Jagat Narula, Zahi A. Fayad, Allan C. Just, Alexander W. Charney, Girish N. Nadkarni, Benjamin S. Glicksberg

2020

BACKGROUND: Coronavirus disease 2019 (COVID-19) has infected millions of patients worldwide and has been responsible for several hundred thousand fatalities. This has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods for achieving this are lacking.

OBJECTIVE: We analyze electronic health records (EHR) from COVID-19 positive hospitalized patients admitted to the Mount Sinai Health System in New York City (NYC). We present machine learning models for making predictions about the hospital course over clinically meaningful time horizons based on patient characteristics at admission. We assess the performance of these models at multiple hospitals and time points.

METHODS: We utilized XGBoost and baseline comparator models for predicting in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in NYC for 4,098 COVID-19+ patients admitted from March 15, 2020 to May 22, 2020. Models were first trained on patients from a single hospital (N=1514) admitted on or before May 1, externally validated on patients from four other hospitals (N=2201) admitted on or before May 1, and prospectively validated on all patients admitted after May 1 (N=383). Finally, we establish model interpretability to identify and rank variables that drive model predictions.

RESULTS: On cross-validation, the XGBoost classifier outperformed baseline models, with area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days; XGBoost also performed well for critical event prediction, with AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, XGBoost achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days for critical event prediction. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers for mortality prediction.

CONCLUSIONS: We trained and validated (both externally and prospectively) machine learning models for mortality and critical events at different time horizons. These models identify at-risk patients and uncover underlying relationships that predict outcomes.
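The validation design and the reported metric can be illustrated with a minimal sketch. This is not the authors' code: the function names, the hospital identifier "A", and the choice to count an event on the horizon day itself as inside the window are all assumptions made for illustration. The only details taken from the abstract are the May 1, 2020 cutoff, the three cohorts, and the 3, 5, 7, and 10 day horizons.

```python
from datetime import date

# From the abstract: training and external cohorts were admitted on or before May 1, 2020.
CUTOFF = date(2020, 5, 1)
HORIZONS = (3, 5, 7, 10)  # prediction windows in days from admission


def assign_split(hospital, admitted, training_hospital="A"):
    """Return 'train', 'external', or 'prospective' for one admission.

    'A' is a hypothetical identifier for the single training hospital.
    """
    if admitted > CUTOFF:
        return "prospective"
    return "train" if hospital == training_hospital else "external"


def event_within(days_to_event, horizon_days):
    """Binary label: 1 if the event occurred within the horizon, else 0.

    days_to_event is None when no event was observed. Treating the
    horizon day as inclusive is an assumption, not stated in the abstract.
    """
    return int(days_to_event is not None and days_to_event <= horizon_days)


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, one model per outcome and horizon would be fit on the 'train' rows and scored with `auc_roc` separately on the 'external' and 'prospective' rows. The pairwise AUC computation is quadratic in cohort size and is meant only to make the definition concrete.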
