Assessing reader performance in radiology, an imperfect science: lessons from breast screening.

2012 
The purpose of this article is to review the limitations associated with current methods of assessing reader accuracy in mammography screening programmes. Clinical audit is commonly used as a quality-assurance tool to monitor the performance of screen readers; however, a number of the metrics employed, such as recall rate as a surrogate for specificity, do not always accurately measure the intended clinical feature. Alternatively, standardized screening test sets, which benefit from ease of application, immediacy of results, and quicker assessment of quality improvement plans, suffer from experimental confounders, which calls into question the relevance of these laboratory-type test sets to clinical performance. Four key factors that affect the external validity of screening test sets were identified: the nature and extent of scrutiny of one's actions, the artificiality of the environment, the over-simplification of responses, and the prevalence of abnormality. The impact of these factors in radiological and other contexts is discussed. Although it is important to acknowledge the benefits of standardized screening test sets, issues relating to their relevance to clinical activities remain. The degree of correlation between performance in real-life clinical audit and performance on screening test sets must be better understood, and specific causal agents for any lack of correlation identified.