Performance of a Computational Phenotyping Algorithm for Sarcoidosis Using Diagnostic Codes in Electronic Medical Records: A Pilot Study from Two Veterans Affairs Medical Centers

2021 
Background: The accuracy of identifying sarcoidosis cases in electronic medical records (EMR) using diagnostic codes is unknown.

Methods: To estimate the statistical performance of ICD-9 and ICD-10 diagnostic codes in identifying sarcoidosis cases in EMR, we searched the EMR of the San Francisco and Palo Alto Veterans Affairs (VA) medical centers and randomly selected 200 patients coded as having sarcoidosis. To further improve diagnostic accuracy, we developed an "index of suspicion" algorithm to identify probable sarcoidosis cases based on clinical and radiographic features. We then determined the positive predictive value (PPV) of two computational methods for diagnosing sarcoidosis, ICD codes only and ICD codes plus the "index of suspicion", against a gold standard developed through manual chart review based on the American Thoracic Society (ATS) practice guideline. Finally, we determined healthcare providers' adherence to the guideline using a new scoring system.

Results: The PPV of identifying sarcoidosis cases in the VA EMR using ICD codes only was 71% (95% CI, 64.7%-77.3%). Including our "index of suspicion" construct along with the ICD codes significantly increased the PPV to 90% (95% CI, 85.2%-94.6%). The care of sarcoidosis patients was more likely to be classified as "Fully" or "Substantially" adherent to the ATS practice guideline if the managing provider was a specialist (45% of primary care providers vs. 74% of specialists; P=0.008).

Conclusions: Although ICD codes can be used as reasonable classifiers to identify sarcoidosis cases within EMR, using computational algorithms to extract clinical and radiographic information (the "index of suspicion") from unstructured data could significantly improve case identification accuracy.

Highlights:
• Identifying sarcoidosis cases using diagnostic codes in EMR has low accuracy.
• "Unstructured data" contain information useful in identifying cases of sarcoidosis.
• Computational algorithms could improve the accuracy and efficiency of case identification in EMR.
• We introduce a new scoring system for assessing healthcare providers' compliance with the American Thoracic Society (ATS) practice guideline.
• Compliance scoring could help automatically assess care delivery for sarcoidosis patients.
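
As a rough check on the reported ICD-only result, the sketch below computes a PPV with a normal-approximation (Wald) 95% confidence interval. Two assumptions are made here: the count of 142 chart-confirmed cases is not stated in the abstract and is inferred from 71% of the 200 sampled patients, and the Wald method is a guess at how the interval was derived. The function name `ppv_wald_ci` is hypothetical.

```python
import math


def ppv_wald_ci(true_positives, flagged, z=1.96):
    """Return (ppv, lower, upper) using a Wald (normal-approximation) interval.

    true_positives: flagged cases confirmed by the gold standard (chart review)
    flagged:        all cases flagged by the classifier (e.g. by ICD code)
    z:              normal quantile; 1.96 corresponds to a 95% interval
    """
    if flagged <= 0:
        raise ValueError("flagged must be a positive count")
    if not 0 <= true_positives <= flagged:
        raise ValueError("true_positives must lie between 0 and flagged")
    ppv = true_positives / flagged
    half_width = z * math.sqrt(ppv * (1 - ppv) / flagged)
    # Clip to [0, 1] because the Wald interval can overshoot near the bounds.
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


# Assumed counts: 142 confirmed of 200 ICD-coded patients (inferred, not reported).
ppv, lower, upper = ppv_wald_ci(142, 200)
print(f"PPV = {ppv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
# prints: PPV = 71.0% (95% CI 64.7% to 77.3%)
```

Under these assumptions the output matches the interval reported for ICD codes alone. The denominator behind the 90% PPV is not given in the abstract, so that interval is not reproduced here.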