Diagnostic accuracy and inter-observer reliability of the O-RADS scoring system among staff radiologists in a North American academic clinical setting.

2021 
The objective of this study was to evaluate the diagnostic accuracy, interobserver variability, and common lexicon pitfalls of the ACR O-RADS scoring system among staff radiologists without prior experience with O-RADS. After independent review of the ACR O-RADS publications and 30 training cases, three fellowship-trained, board-certified staff radiologists scored 50 pelvic ultrasound exams using the O-RADS system. Diagnostic accuracy and area under the receiver operating characteristic curve (AUC) were analyzed for each reader. Overall agreement and pair-wise agreement between readers were also analyzed. Specificities (92 to 100%) and NPVs (92 to 100%) were excellent, while sensitivities (72 to 100%) and PPVs (66 to 100%) were more variable. Considering O-RADS 4 and O-RADS 5 as predictors of malignancy, individual reader AUC values ranged from 0.94 to 0.98 (p < 0.001). Overall inter-reader agreement for all 3 readers was “very good,” k = 0.82 (95% CI 0.73 to 0.90, p < 0.001). Pair-wise agreement between readers was also “very good,” k = 0.86–0.92. Fourteen of 150 lesions were misclassified, with the most common error being down-scoring of a solid lesion with irregular outer contours. Even without specific training, experienced ultrasound readers can achieve excellent diagnostic performance and high inter-reader reliability with self-directed review of guidelines and cases. The study highlights the effectiveness of ACR O-RADS as a stratification tool for radiologists and supports its continued use in practice.
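As a rough illustration of the per-reader metrics named above, the following Python sketch dichotomizes O-RADS scores (4 or 5 treated as positive for malignancy, as in the abstract) and computes sensitivity, specificity, PPV, NPV, and a pair-wise kappa. The scores and outcomes are invented toy values, not study data, and the function names are hypothetical. Unweighted Cohen's kappa is an assumption here; the abstract does not state which kappa variant the authors used.

```python
from collections import Counter


def dichotomize(scores, threshold=4):
    """Treat O-RADS scores at or above the threshold as positive for malignancy."""
    return [s >= threshold for s in scores]


def accuracy_metrics(predicted, truth):
    """Sensitivity, specificity, PPV, and NPV from paired boolean lists.

    Assumes every denominator is nonzero (at least one case in each cell group).
    """
    tp = sum(p and t for p, t in zip(predicted, truth))
    tn = sum((not p) and (not t) for p, t in zip(predicted, truth))
    fp = sum(p and (not t) for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two readers' categorical scores."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each reader's marginal proportions per category.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy data for illustration only (NOT from the study).
reader_a = [2, 3, 4, 5, 2, 3, 5, 4]
reader_b = [2, 3, 4, 5, 3, 3, 5, 3]
malignant = [False, False, True, True, False, False, True, True]

metrics_b = accuracy_metrics(dichotomize(reader_b), malignant)
kappa_ab = cohen_kappa(reader_a, reader_b)
print(metrics_b)
print(round(kappa_ab, 3))
```

In this toy example, reader B down-scores one malignant lesion to O-RADS 3, which lowers sensitivity and NPV while leaving specificity and PPV unchanged. Agreement across all three readers at once would need a multi-rater statistic such as Fleiss' kappa, which this sketch does not cover.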