Automated Detection of Periprosthetic Joint Infections and Data Elements Using Natural Language Processing

2020 
Abstract

Introduction: Periprosthetic joint infection (PJI) data elements are contained in both structured and unstructured documents in electronic health records and require manual data collection. The goal of this study was to develop a natural language processing (NLP) algorithm to replicate manual chart review for PJI data elements.
Methods: PJI cases were identified among all total joint arthroplasty (TJA) procedures performed at a single academic institution between 2000 and 2017. Data elements that comprise the Musculoskeletal Infection Society (MSIS) criteria were manually extracted and used as the gold standard for validation. A training sample of 1197 TJA surgeries (170 PJI cases) was randomly selected to develop the prototype NLP algorithms, and an additional 1179 surgeries (150 PJI cases) were randomly selected as the test sample. The algorithms were applied to all consultation notes, operative notes, pathology reports, and microbiology reports to predict the correct status of PJI based on MSIS criteria.
Results: The algorithm, which identified patients with PJI based on MSIS criteria, achieved an F1-score (harmonic mean of precision and recall) of 0.911. For extracting the presence of sinus tract, purulence, pathological documentation of inflammation, and growth of cultured organisms from the involved TJA, F1-scores ranged from 0.771 to 0.982, sensitivity ranged from 0.730 to 1.000, and specificity ranged from 0.947 to 1.000.
Conclusion: NLP-enabled algorithms have the potential to automate data collection for PJI diagnostic elements, which could directly improve patient care and augment cohort surveillance and research efforts. Further validation is needed in other hospital settings.
Level of Evidence: Level III, Diagnostic.
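
The metrics reported in the Results (F1-score, sensitivity, specificity) can all be derived from a confusion matrix that compares algorithm output against the manually extracted gold standard. The following is a minimal sketch of those definitions, not the study's evaluation code. The `Confusion` class, the function names, and the counts in `example` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from comparing algorithm output with the gold standard."""
    tp: int  # algorithm positive, gold standard positive
    fp: int  # algorithm positive, gold standard negative
    fn: int  # algorithm negative, gold standard positive
    tn: int  # algorithm negative, gold standard negative


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 when the denominator is zero instead of raising.
    return numerator / denominator if denominator else 0.0


def precision(c: Confusion) -> float:
    # Fraction of algorithm-positive cases that are truly positive.
    return _ratio(c.tp, c.tp + c.fp)


def sensitivity(c: Confusion) -> float:
    # Also called recall: fraction of true positives that were found.
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Fraction of true negatives that were correctly labeled negative.
    return _ratio(c.tn, c.tn + c.fp)


def f1_score(c: Confusion) -> float:
    # Harmonic mean of precision and recall (sensitivity).
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (not study data).
example = Confusion(tp=90, fp=10, fn=10, tn=890)

print(f"precision   = {precision(example):.3f}")
print(f"sensitivity = {sensitivity(example):.3f}")
print(f"specificity = {specificity(example):.3f}")
print(f"F1-score    = {f1_score(example):.3f}")
```

Because PJI cases are a minority of all TJA surgeries, specificity can stay high even when a noticeable share of true cases is missed, which is one reason to report the F1-score and sensitivity alongside it.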