Multi-Radiologist User Study for Artificial Intelligence-Guided Grading of COVID-19 Lung Disease Severity on Chest Radiographs.

2021 
ABSTRACT Rationale and Objectives Radiographic findings of COVID-19 pneumonia can be used for patient risk stratification; however, radiologist reporting of disease severity is inconsistent on chest radiographs (CXRs). We aimed to see if an artificial intelligence (AI) system could help improve radiologist interrater agreement. Materials and Methods We performed a retrospective multi-radiologist user study to evaluate the impact of an AI system, the PXS score model, on the grading of categorical COVID-19 lung disease severity on 154 chest radiographs into four ordinal grades (normal/minimal, mild, moderate, and severe). Four radiologists (two thoracic and two emergency radiologists) independently interpreted 154 CXRs from 154 unique patients with COVID-19 hospitalized at large academic center, before and after using the AI system (median washout time interval was 16 days). Three different thoracic radiologists assessed the same 154 CXRs using an updated version of the AI system trained on more imaging data. Radiologist interrater agreement was evaluated using Cohen and Fleiss kappa where appropriate. The lung disease severity categories were associated with clinical outcomes in a previously published outcomes data using Fisher's exact test and Chi-square test for trend. Results Use of the AI system improved radiologist interrater agreement (Fleiss κ=0.40 to 0.66, before and after use of the system). The Fleiss κ for three radiologists using the updated AI system was 0.74. Severity categories were significantly associated with subsequent intubation or death within 3 days. Conclusion An AI system used at the time of CXR study interpretation can improve the interrater agreement of radiologists.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    2
    Citations
    NaN
    KQI
    []