Objectives: We validate a machine learning-based sepsis prediction algorithm (InSight) for detection and prediction of three sepsis-related gold standards, using only six vital signs. We evaluate robustness to missing data, customization to site-specific data using transfer learning, and generalizability to new settings. Design: A machine learning algorithm with gradient tree boosting. Features for prediction were created from combinations of only six vital sign measurements and their changes over time. Setting: A mixed-ward retrospective data set from the University of California, San Francisco (UCSF) Medical Center (San Francisco, CA) as the primary source, an intensive care unit data set from the Beth Israel Deaconess Medical Center (Boston, MA) as a transfer learning source, and four additional institutions' datasets to evaluate generalizability. Participants: 684,443 total encounters, with 90,353 encounters from June 2011 to March 2016 at UCSF. Interventions: None. Primary and secondary outcome measures: Area under the receiver operating characteristic curve (AUROC) for detection and prediction of sepsis, severe sepsis, and septic shock. Results: For detection of sepsis and severe sepsis, InSight achieves an AUROC of 0.92 (95% CI 0.90 - 0.93) and 0.87 (95% CI 0.86 - 0.88), respectively. Four hours before onset, InSight predicts septic shock with an AUROC of 0.96 (95% CI 0.94 - 0.98), and severe sepsis with an AUROC of 0.85 (95% CI 0.79 - 0.91). Conclusions: InSight outperforms existing sepsis scoring systems in identifying and predicting sepsis, severe sepsis, and septic shock. This is the first sepsis screening system to exceed an AUROC of 0.90 using only vital sign inputs. InSight is robust to missing data, can be customized to novel hospital data using a small fraction of site data, and retains strong discrimination across all institutions.
Sepsis is one of the leading causes of mortality in hospitalized patients. Despite this fact, a reliable means of predicting sepsis onset remains elusive. Early and accurate sepsis onset predictions could allow more aggressive and targeted therapy while maintaining antimicrobial stewardship. Existing detection methods suffer from low performance and often require time-consuming laboratory test results. Our objective was to study and validate a sepsis prediction method, InSight, for the new Sepsis-3 definitions in retrospective data, make predictions using a minimal set of variables from within the electronic health record data, compare the performance of this approach with existing scoring systems, and investigate the effects of data sparsity on InSight performance. We apply InSight, a machine learning classification system that uses multivariable combinations of easily obtained patient data (vitals, peripheral capillary oxygen saturation, Glasgow Coma Score, and age), to predict sepsis using the retrospective Multiparameter Intelligent Monitoring in Intensive Care (MIMIC)-III dataset, restricted to intensive care unit (ICU) patients aged 15 years or more. Following the Sepsis-3 definitions of the sepsis syndrome, we compare the classification performance of InSight versus quick sequential organ failure assessment (qSOFA), modified early warning score (MEWS), systemic inflammatory response syndrome (SIRS), simplified acute physiology score (SAPS) II, and sequential organ failure assessment (SOFA) to determine whether or not patients will become septic at a fixed period of time before onset. We also test the robustness of the InSight system to random deletion of individual input observations. In a test dataset with 11.3% sepsis prevalence, InSight produced superior classification performance compared with the alternative scores as measured by area under the receiver operating characteristic curves (AUROC) and area under precision-recall curves (APR).
In detection of sepsis onset, InSight attains AUROC = 0.880 (SD 0.006) at onset time and APR = 0.595 (SD 0.016), both of which are superior to the performance attained by SIRS (AUROC: 0.609; APR: 0.160), qSOFA (AUROC: 0.772; APR: 0.277), and MEWS (AUROC: 0.803; APR: 0.327) computed concurrently, as well as SAPS II (AUROC: 0.700; APR: 0.225) and SOFA (AUROC: 0.725; APR: 0.284) computed at admission (P<.001 for all comparisons). Similar results are observed for 1-4 hours preceding sepsis onset. In experiments where approximately 60% of input data are deleted at random, InSight attains an AUROC of 0.781 (SD 0.013) and APR of 0.401 (SD 0.015) at sepsis onset time. Even with 60% of data missing, InSight remains superior to the corresponding SIRS scores (AUROC and APR, P<.001), qSOFA scores (P=.0095; P<.001) and superior to SOFA and SAPS II computed at admission (AUROC and APR, P<.001), where all of these comparison scores (except InSight) are computed without data deletion. Despite using little more than vitals, InSight is an effective tool for predicting sepsis onset and performs well even with randomly missing data.
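The two evaluation ideas used above, AUROC as a ranking metric and random deletion of individual input observations, can be sketched in a few lines. This is a minimal illustration only, not the study's evaluation code: the function names `auroc` and `delete_at_random`, the toy labels and scores, and the choice to mark deleted values as `None` are all assumptions made for the example.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def delete_at_random(observations, fraction, rng):
    """Independently replace each observation with None with
    probability `fraction` (hypothetical stand-in for data deletion)."""
    return [None if rng.random() < fraction else x for x in observations]


# Toy data (invented for illustration): 1 = septic, 0 = not septic.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 0.75

# Delete roughly 60% of a toy observation stream.
rng = random.Random(0)
sparse = delete_at_random(list(range(1000)), 0.6, rng)
```

The pairwise formulation is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be preferable for datasets of realistic size.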
We present a convolutional-recurrent neural network architecture with long short-term memory for real-time processing and classification of digital sensor data. The network implicitly performs typical signal processing tasks such as filtering and peak detection, and learns time-resolved embeddings of the input signal. We use a prototype multi-sensor wearable device to collect over 180h of photoplethysmography (PPG) data sampled at 20Hz, of which 36h are during atrial fibrillation (AFib). We use end-to-end learning to achieve state-of-the-art results in detecting AFib from raw PPG data. For classification labels output every 0.8s, we demonstrate an area under ROC curve of 0.9999, with false positive and false negative rates both below $2\times 10^{-3}$. This constitutes a significant improvement on previous results utilising domain-specific feature engineering, such as heart rate extraction, and brings large-scale atrial fibrillation screenings within imminent reach.
Severity-of-illness scoring systems have primarily been developed for, and validated in, younger trauma patients. We sought to determine the accuracy of the injury severity score (ISS) and the revised trauma score (RTS) in predicting mortality and hospital length of stay (LOS) in trauma patients over the age of 65 treated in our emergency department (ED). Using the Illinois Trauma Registry, we identified all patients 65 years and older treated in our level I trauma facility from January 2004 to November 2007. The primary outcome was death; the secondary outcome was overall hospital LOS. We measured associations between scores and outcomes with binary logistic and linear regression. A total of 347 patients 65 years of age and older were treated in our hospital during the study period. Median age was 76 years (IQR 69-82), with median ISS 13 (IQR 8-17), and median RTS 7.8 (IQR 7.1-7.8). Overall mortality was 24%. A higher value for ISS showed a positive correlation with likelihood of death, which, although statistically significant, was numerically small (OR=1.10, 95% CI 1.06 to 1.13, P<0.001). An elevated RTS had an inverse correlation to likelihood of death that was also statistically significant (OR=0.48, 95% CI 0.39 to 0.58, P<0.001). Total hospital LOS increased with increasing ISS, although statistical significance decreased at the highest levels of ISS; an increase in RTS, however, did not consistently correspond to the predicted decrease in total hospital LOS across all ranges of RTS. The ISS and the RTS were better predictors of mortality than hypothesized, but had limited correlation with hospital LOS in elderly trauma patients. Although there may be some utility in these scores when applied to the elderly population, caution is warranted if attempting to predict the prognosis of patients.
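As a worked illustration of why a per-point odds ratio that looks numerically small can still matter, the reported ratios can be compounded over several score points. This is a sketch that assumes the reported ORs are per one-point increase in each score and that the logistic model is linear in the score on the log-odds scale; the abstract does not state the unit explicitly, and $\Delta$ is a symbol introduced here for the size of the score difference.

$$
\mathrm{OR}(\Delta) = \mathrm{OR}_{1}^{\,\Delta}, \qquad
\mathrm{OR}_{\mathrm{ISS}}(10) = 1.10^{10} \approx 2.59, \qquad
\mathrm{OR}_{\mathrm{RTS}}(2) = 0.48^{2} \approx 0.23
$$

Under these assumptions, a 10-point higher ISS corresponds to roughly 2.6 times the odds of death, and a 2-point higher RTS to roughly a quarter of the odds.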
Objectives We validate a machine learning-based sepsis-prediction algorithm (InSight) for the detection and prediction of three sepsis-related gold standards, using only six vital signs. We evaluate robustness to missing data, customisation to site-specific data using transfer learning and generalisability to new settings. Design A machine-learning algorithm with gradient tree boosting. Features for prediction were created from combinations of six vital sign measurements and their changes over time. Setting A mixed-ward retrospective dataset from the University of California, San Francisco (UCSF) Medical Center (San Francisco, California, USA) as the primary source, an intensive care unit dataset from the Beth Israel Deaconess Medical Center (Boston, Massachusetts, USA) as a transfer-learning source and four additional institutions’ datasets to evaluate generalisability. Participants 684 443 total encounters, with 90 353 encounters from June 2011 to March 2016 at UCSF. Interventions None. Primary and secondary outcome measures Area under the receiver operating characteristic (AUROC) curve for detection and prediction of sepsis, severe sepsis and septic shock. Results For detection of sepsis and severe sepsis, InSight achieves an AUROC curve of 0.92 (95% CI 0.90 to 0.93) and 0.87 (95% CI 0.86 to 0.88), respectively. Four hours before onset, InSight predicts septic shock with an AUROC of 0.96 (95% CI 0.94 to 0.98) and severe sepsis with an AUROC of 0.85 (95% CI 0.79 to 0.91). Conclusions InSight outperforms existing sepsis scoring systems in identifying and predicting sepsis, severe sepsis and septic shock. This is the first sepsis screening system to exceed an AUROC of 0.90 using only vital sign inputs. InSight is robust to missing data, can be customised to novel hospital data using a small fraction of site data and retains strong discrimination across all institutions.
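The Design statement above describes features built from vital sign measurements and their changes over time. A minimal sketch of that idea follows; it is not the InSight feature pipeline, whose details the abstract does not give. The function name `vital_features`, the `delta_` prefix, the hourly spacing, and the two vital sign names in the toy data are all hypothetical.

```python
def vital_features(series, lag=1):
    """Build a flat feature dict from a time-ordered list of measurements.

    `series` is a list of dicts, one per time step, each mapping a vital
    sign name to its value. For every vital sign the output holds the
    latest value and its change over the last `lag` steps.
    """
    if len(series) <= lag:
        raise ValueError("need more than `lag` time steps")
    latest, earlier = series[-1], series[-1 - lag]
    features = {}
    for name in sorted(latest):
        features[name] = latest[name]
        features["delta_" + name] = latest[name] - earlier[name]
    return features


# Toy hourly readings (invented values, placeholder vital sign names).
hours = [
    {"heart_rate": 88, "resp_rate": 18},
    {"heart_rate": 104, "resp_rate": 24},
]
features = vital_features(hours)
print(features)
# {'heart_rate': 104, 'delta_heart_rate': 16, 'resp_rate': 24, 'delta_resp_rate': 6}
```

A feature dict of this shape could then be passed to any gradient tree boosting implementation; the sketch stops short of model fitting because the abstract does not specify the library or hyperparameters used.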
Introduction: Emergency department (ED) crowding has been shown to negatively impact patient outcomes. Few studies have addressed the effect of ED crowding on patient satisfaction. Our objective was to evaluate the impact of ED crowding on patient satisfaction in patients discharged from the ED. Methods: We measured patient satisfaction using Press-Ganey surveys returned by patients who visited our ED between August 1, 2007 and March 31, 2008. We recorded all mean satisfaction scores and obtained mean ED occupancy rate, mean EDWIN score and hospital diversion status over each 8-hour shift from data archived in our electronic tracking board. Univariate and multivariate logistic regression analyses were performed to determine the effect of ED crowding and hospital diversion status on the odds of achieving a mean satisfaction score ≥85, which was the patient satisfaction goal set forth by our ED administration. Results: A total of 1591 surveys were returned over the study period. Mean satisfaction score was 77.6 (SD±16) and mean occupancy rate was 1.23 (SD±0.31). An increase in average ED occupancy rate was associated with lower odds of meeting the patient satisfaction goal (OR 0.32, 95% CI 0.17 to 0.59, P<0.001), as was an increase in EDWIN score (OR 0.05, 95% CI 0.004 to 0.55, P=0.015). Hospital diversion resulted in lower mean satisfaction scores, but this was not statistically significant (OR 0.62, 95% CI 0.36 to 1.05). In multivariable analysis controlling for hospital diversion status and time of shift, ED occupancy rate remained a significant predictor of failure to meet patient satisfaction goals (OR 0.34, 95% CI 0.18 to 0.66, P=0.001). Conclusions: Increased crowding, as measured by ED occupancy rate and EDWIN score, was significantly associated with reduced patient satisfaction. Although causative attribution was limited, our study suggested yet another negative impact resulting from ED crowding. [West J Emerg Med. 2013;14(1):11-15.]