Our healthcare information is trapped. It is trapped in the proprietary data models of the electronic medical record (EMR) and in our healthcare systems' data warehouses. This reality has become strikingly clear as the coronavirus disease 2019 (COVID-19) pandemic has swept across the globe, killing >80 000 people in the United States alone. We need answers but struggle to address even the simplest questions. How many individuals are infected? Who is at the highest risk for developing severe infection? What therapies are being used to treat hospitalized patients? This crisis is testing the limits of our public health and healthcare systems in many ways, including a quarantined health information system. This perspective reviews several deficiencies in healthcare information technology that currently limit our ability to deal with the pandemic and suggests current solutions moving forward.
In an ideal world, healthcare systems would speak the same language, communicate with public health agencies, and engage directly with the community. This type of system would allow us to track, learn, and innovate during the current crisis. The COVID-19 pandemic lays bare just how far we are from this vision, and, sadly, the deficiency will have dire consequences.
For example, pooling data across multiple institutions is critical for scientific discovery and community surveillance. No single healthcare system reflects the status of a community or region, and individually each lacks the sample size and diversity for robust, generalizable results. Yet bringing data together for pooled analyses is currently too difficult. Different healthcare systems essentially speak different languages. Even if 2 systems use the same EMR software, each build is individualized such that the same concept may be hidden in different places in the data. For example, hydroxychloroquine has emerged as a potential treatment option for COVID-19 but also has known QT-prolonging effects and can cause ventricular arrhythmias. As these are rare complications, healthcare systems and researchers must pool data to identify these events among treated patients. When a clinician researcher says, "Find me the patients treated with hydroxychloroquine," a data scientist hears, "Find me the patients with one of the drug codes that represents hydroxychloroquine." Yet one healthcare system may use the National Drug Code directory to represent medications, and another might use Medi-Span. The systems speak different languages, which is a barrier to pooling data for rapid analyses.
Common data models (CDMs) address the interoperability issue to some degree, but this is an imperfect solution. Using our example above, the 2 different hydroxychloroquine representations could be mapped to a single data format to facilitate pooling. But systems must still map their data accurately to the CDM, and the upfront cost is generally steep for this labor-intensive process. The CDM is essentially a middleman, and, currently, the transformation process is neither automated nor performed in real time. After data are mapped to the CDM, the tables in
Background: Learning healthcare systems need techniques that can accurately and automatically identify health outcomes in large populations. Outcomes are often described in clinical narration in the electronic medical record. Objective: To develop and compare two natural language processing (NLP) approaches, rules-based (RB) and machine-learning (ML), for identifying bleeding events in clinical notes. Methods: We used de-identified notes from the Medical Information Mart for Intensive Care. We randomly selected 990 notes for a training set and 660 notes for a test set. Physicians classified each note as present or absent for a clinically relevant bleeding event during the hospitalization. We developed a dictionary of target and modifier words for the RB approach. In RB, the computer “reads” the text and tags bleeding targets as present or absent based on the modifier words; the mentions are aggregated to arrive at a classification for the note. For the ML approach, each note was represented as a high-dimensional vector in which each dimension corresponds to the frequency of a certain word. Similar notes (e.g., bleeding-present notes) have similar vectors; the computer learns these patterns to predict the class for an unseen note. One RB and three ML models (support vector machine (SVM), extra trees (ET), convolutional neural network (CNN)) were trained using the full 990-note training set. Another instance of each ML model was also trained on a down-sampled (DS) set of 450 notes, with equal numbers of positive and negative notes. We ran the trained models on the 660-note test set and compared classification performance using McNemar’s test. Results: The 660-note test set represented 527 unique patients, 40% female. Bleeding events were present in 21% of the notes. The ET-DS model was the most sensitive, followed by the RB approach (93.8% versus 91.1%, p=0.44). The positive predictive value (PPV) for the ET-DS model, however, was <50%. The RB approach had the best overall performance, with 84.6% specificity, 62.7% PPV, and 97.1% negative predictive value (NPV) for identifying clinically relevant bleeding. Discussion: An RB NLP approach had better overall performance than the ML approaches in independently identifying bleeding events among critically ill patients. The current models have high NPV, so they could be used to reduce the chart review burden.
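The rules-based idea described in the Methods (tag each bleeding target as present or absent from nearby modifier words, then aggregate mentions into a note-level class) can be illustrated with a minimal sketch. The word lists, the three-token context window, and the "any present mention makes the note positive" aggregation rule are assumptions chosen for illustration; the abstract does not specify the study's dictionary or aggregation logic. The helper `diagnostic_metrics` shows how the four reported measures follow from a confusion matrix.

```python
import re

# Hypothetical dictionaries: the study's actual target and modifier lists are not given.
TARGETS = {"bleeding", "hemorrhage", "hematoma", "melena"}
NEGATION_MODIFIERS = {"no", "denies", "without", "negative"}
WINDOW = 3  # assumed number of preceding tokens searched for a modifier


def tag_mentions(note):
    """Return one boolean per target mention: True if present, False if negated."""
    tokens = re.findall(r"[a-z]+", note.lower())
    mentions = []
    for i, token in enumerate(tokens):
        if token in TARGETS:
            context = tokens[max(0, i - WINDOW):i]
            negated = any(word in NEGATION_MODIFIERS for word in context)
            mentions.append(not negated)
    return mentions


def classify_note(note):
    """Assumed aggregation rule: positive if any mention is tagged present."""
    return any(tag_mentions(note))


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    print(classify_note("Patient denies any bleeding."))
    print(classify_note("No melena. Developed GI bleeding overnight."))
```

Because this sketch ignores sentence boundaries, a modifier in one sentence can negate a target in the next; a production rule set would need sentence segmentation and a richer modifier vocabulary.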
Background: Oral anticoagulants reduce stroke risk among atrial fibrillation (AF) patients, yet treatment rates remain low. A technology known as SMART on FHIR allows third-party apps to integrate with electronic medical records (EMRs) and provide decision support tools. University of Utah Health implemented an alpha version of an integrated MDCalc app, MDCalc on FHIR (MoF), in the Epic EHR using a SMART on FHIR interface to enable multiple calculations, one of which was the CHA2DS2-VASc calculation. While MDCalc on FHIR is designed to automate inputs that are then reviewed by the practicing clinician with an opportunity to override with clinical judgment, we prospectively compared the automated app score (without physician oversight) with the clinician score, as an early measurement to evaluate accuracy and identify areas for improvement. Methods: We identified outpatient AF patients who were seen in the University of Utah’s cardiovascular clinics between 11/5/2018 and 12/7/2018. We used the MoF app to automatically calculate the CHA2DS2-VASc score within 24 hours of the identified clinic visit (without the MoF feature that allows clinician interaction) and compared these values to the score documented by the clinician. We also categorized patients as either low risk of stroke or high risk of stroke and calculated the net reclassification index (NRI) using the app compared to documentation. Patients were considered high risk by the app if the app score was ≥2, and high risk by the clinician if the documented score was ≥2 or an oral anticoagulant was prescribed. Results: We identified 200 AF patients, of whom 111 had a documented clinician score. The mean MoF app score was 3.79 (SD 1.86) compared to a mean clinician score of 3.25 (SD 1.63; p=0.02). The NRI was 13.5% (27 of 200) using the app compared to documentation. Ten percent (19 of 200) of patients were “up-classified” by the app, meaning they were high risk by the app and low risk by the clinician. Four percent (8 of 200) of patients were “down-classified” by the app, meaning they were low risk by the app and high risk by the clinician. Upon review of these cases, and after accounting for patients who were or were not anticoagulated for a clinically relevant reason (history of bleeding, recent cardioversion, or patient preference), we found that three percent (5 of 200) of patients were “up-classified” by the app and two percent (3 of 200) were “down-classified” by the app, making the adjusted NRI 4% (8 of 200). Conclusion: SMART on FHIR enables third-party vendors to create EMR-based apps, which could provide decision support and improve care. We found that our FHIR app-based approach tended to identify more comorbidities by using medication data to assign conditions, resulting in a higher CHA2DS2-VASc score compared to clinicians. These differences would have had a 4% effect on the actual decision to anticoagulate.
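For readers unfamiliar with the score, the following sketch computes CHA2DS2-VASc using the standard published point assignments and then tallies up- and down-classification at the ≥2 threshold named in the Methods. The `Patient` class, its field names, and the `reclassification` helper are hypothetical and do not reflect how the MoF app or the FHIR interface represents data. The reclassification rate here is simply up-classified plus down-classified patients over the total, which is consistent with the figures reported above (19 + 8 = 27 of 200); the abstract does not otherwise spell out the NRI formula.

```python
from dataclasses import dataclass

HIGH_RISK_THRESHOLD = 2  # threshold stated in the abstract


@dataclass
class Patient:
    """Hypothetical container for CHA2DS2-VASc inputs."""
    age: int
    female: bool
    chf: bool = False
    hypertension: bool = False
    diabetes: bool = False
    stroke_tia: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p):
    """Standard CHA2DS2-VASc point assignments."""
    score = 0
    score += 1 if p.chf else 0
    score += 1 if p.hypertension else 0
    if p.age >= 75:
        score += 2
    elif p.age >= 65:
        score += 1
    score += 1 if p.diabetes else 0
    score += 2 if p.stroke_tia else 0
    score += 1 if p.vascular_disease else 0
    score += 1 if p.female else 0
    return score


def reclassification(app_scores, clinician_high_risk):
    """Count patients up- and down-classified by the app relative to the clinician.

    Returns (up, down, rate), where rate = (up + down) / total.
    """
    pairs = list(zip(app_scores, clinician_high_risk))
    up = sum(1 for s, c in pairs if s >= HIGH_RISK_THRESHOLD and not c)
    down = sum(1 for s, c in pairs if s < HIGH_RISK_THRESHOLD and c)
    return up, down, (up + down) / len(pairs)


if __name__ == "__main__":
    print(cha2ds2_vasc(Patient(age=78, female=True, hypertension=True, diabetes=True)))
```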
Abstract: The United States (US), which is currently the epicenter for the COVID-19 pandemic, is a country whose demographic composition differs from that of other highly impacted countries. US-based descriptions of SARS-CoV-2 infections have, for the most part, focused on patient populations with severe disease, captured in areas with limited testing capacity. The objective of this study is to compare characteristics of patients who tested positive and negative for SARS-CoV-2, in a population composed primarily of mild and moderate infections, identified from comprehensive population-level testing. Here, we extracted demographics, comorbidities, and vital signs from 20,088 patients who were tested for SARS-CoV-2 at University of Utah Health clinics in Salt Lake County, Utah; for a subset of tested patients, we performed manual chart review to examine symptoms and exposure risks. To determine risk factors for testing positive, we used logistic regression to calculate the odds of testing positive, adjusting for symptoms and prior exposure. Of the 20,088 individuals, 1,229 (6.1%) tested positive for SARS-CoV-2. We found that non-White persons were more likely to test positive compared to non-Hispanic Whites (adjOR=1.1, 95% CI: 0.8, 1.6), and that this increased risk is more pronounced among Hispanic or Latino persons (adjOR=2.0, 95% CI: 1.3, 3.1). However, we did not find differences in either the duration of symptoms or the type of symptom presentation between non-Hispanic White and non-White individuals. We found that risk of hospitalization increases with age (adjOR=6.9, 95% CI: 2.1, 22.5 for age 60+ compared to 0-19), and additionally show that younger individuals (aged 0-19) were underrepresented in both overall rates of testing and rates of testing positive. We did not find major race/ethnic differences in hospitalization rates. In this analysis of predominantly non-hospitalized individuals tested for SARS-CoV-2, enabled by expansive testing capacity, we found disparities in both testing and SARS-CoV-2 infection status by race/ethnicity and by age. Further work on addressing racial and ethnic disparities, particularly among Hispanic/Latino communities (where SARS-CoV-2 may be spreading more rapidly due to increased exposure and comparatively reduced testing), will be needed to effectively combat COVID-19 in the US.
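To make the odds ratios and confidence intervals above easier to read, here is a minimal sketch of an unadjusted odds ratio with a Wald-type 95% confidence interval computed from a 2x2 table. This is a simplification: the study reports adjusted odds ratios from logistic regression, which this sketch does not reproduce. The function name and the counts in the usage example are hypothetical and are not study data.

```python
import math


def odds_ratio_ci(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale.

    All four cell counts must be non-zero. z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)
    se_log_or = math.sqrt(
        1 / exposed_pos + 1 / exposed_neg + 1 / unexposed_pos + 1 / unexposed_neg
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(odds_ratio_ci(exposed_pos=20, exposed_neg=80, unexposed_pos=10, unexposed_neg=90))
```

An interval that includes 1, such as the one reported above for non-White persons overall (0.8, 1.6), means the data are compatible with no difference in odds, whereas the interval for Hispanic or Latino persons (1.3, 3.1) excludes 1.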