ROC Curve Analysis in the Presence of Imperfect Reference Standards

2017 
The receiver operating characteristic (ROC) curve is an important tool for evaluating and comparing predictive models when the outcome is binary. If the class membership of every outcome is known, an ROC curve can be constructed for each model, and the curve with the greater area under it indicates better performance. In practice, however, reference standards are often imperfect: the class membership of some data points is not fully determined. This situation is especially prevalent in high-throughput biomedical data, because obtaining a perfect reference standard for every data point is either too costly or technically impractical. To construct ROC curves for such data, the common practice is either to ignore the uncertainty in the reference or to remove data points with high uncertainty. Both approaches may bias the ROC curves and produce misleading results in method evaluation. Here we present a framework that incorporates membership uncertainty into the construction of the ROC curve, termed the expected ROC or “eROC” curve. We develop an efficient procedure for estimating the eROC curve. The advantages of the eROC curve are demonstrated using simulated and real data.
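To make the idea of folding membership uncertainty into an ROC curve concrete, the following is a minimal sketch and not the authors' estimator, which the abstract does not specify. It assumes each data point carries a probability of belonging to the positive class (the hypothetical argument `pos_prob`) and replaces the usual true-positive and false-positive counts with their expectations under those probabilities. The function names `expected_roc` and `trapezoid_auc` are illustrative only.

```python
def expected_roc(scores, pos_prob):
    """Probability-weighted ROC curve (illustrative assumption, not the paper's method).

    scores   : model scores, higher means "more likely positive"
    pos_prob : assumed probability that each point is truly positive
    Returns (fpr, tpr) lists starting at (0, 0) and ending at (1, 1).
    """
    if len(scores) != len(pos_prob):
        raise ValueError("scores and pos_prob must have the same length")
    total_pos = sum(pos_prob)                    # expected number of positives
    total_neg = sum(1.0 - p for p in pos_prob)   # expected number of negatives
    if total_pos <= 0 or total_neg <= 0:
        raise ValueError("both classes need non-zero expected size")

    # Sweep the threshold from the highest score down to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    exp_tp = exp_fp = 0.0
    for rank, i in enumerate(order):
        exp_tp += pos_prob[i]          # expected true positives gained
        exp_fp += 1.0 - pos_prob[i]    # expected false positives gained
        # Emit one point per distinct score so ties share a threshold.
        last = rank == len(order) - 1
        if last or scores[order[rank + 1]] != scores[i]:
            tpr.append(exp_tp / total_pos)
            fpr.append(exp_fp / total_neg)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )
```

When every entry of `pos_prob` is 0 or 1, this sketch reduces to the ordinary empirical ROC curve. When every entry is 0.5, the reference carries no information and the curve falls on the diagonal. Note that normalizing expected counts by expected class sizes is one possible choice; the paper's definition of the expected curve may differ.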