Comparing agreement measures

Wei-Min Liu, Brandon D. Gallas

2008

Agreement between correlated (paired) scores, e.g. the scores from two doctors reading the same set of images, is estimated with measures such as the correlation coefficient and measures of concordance. Some variance estimation techniques for these measures are also available in the literature. In this work, we compare four agreement measures: the widely used Pearson's product-moment correlation coefficient, Kendall's tau, and two measures that are generalizations of AUC, the area under the receiver operating characteristic (ROC) curve. The generalizations allow the truth to be ordinal, either polytomous (multi-state) or even continuous, instead of just binary, so the conventional AUC is a special case. We investigate how these measures behave in a multi-reader multi-case (MRMC) simulation experiment as we change the intrinsic correlation and the number of rating levels. We also investigate a few variance estimation techniques for these measures that are available in the literature. These agreement measures will help investigators developing model observers to compare their models against a human on a case-by-case basis instead of with a summary figure of merit, like AUC, that requires and is limited by binary truth. A model observer's AUC can equal the human observer's AUC even when the two make very different decisions on a case-by-case basis.
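
The last point can be illustrated with a small sketch. The abstract does not define the two AUC generalizations, so the `generalized_auc` function below is an assumption: it uses one common form, the probability that a pair of cases with different reference values is ordered the same way by the scores, with score ties counted as one half. The six-case data set is made up for illustration and is not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    conc = disc = tied_x = tied_y = total = 0
    for i, j in combinations(range(len(x)), 2):
        total += 1
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx * dy > 0:
            conc += 1
        elif dx * dy < 0:
            disc += 1
    return (conc - disc) / sqrt((total - tied_x) * (total - tied_y))


def generalized_auc(reference, scores):
    """Assumed generalization of AUC: over all pairs of cases whose reference
    values differ, the fraction ordered the same way by the scores (score
    ties count 1/2). With a binary reference this is the usual empirical AUC."""
    agree = 0.0
    usable = 0
    for i, j in combinations(range(len(reference)), 2):
        dr = reference[i] - reference[j]
        if dr == 0:
            continue  # pair carries no ordering information
        usable += 1
        ds = scores[i] - scores[j]
        if ds == 0:
            agree += 0.5
        elif (dr > 0) == (ds > 0):
            agree += 1.0
    return agree / usable


# Illustrative (made-up) data: binary truth and two observers' ratings.
truth = [0, 0, 0, 1, 1, 1]
human = [1, 2, 4, 3, 5, 6]
model = [4, 2, 1, 6, 5, 3]

auc_human = generalized_auc(truth, human)  # binary truth: ordinary AUC
auc_model = generalized_auc(truth, model)
print("AUC human vs truth:", auc_human)
print("AUC model vs truth:", auc_model)

# Case-by-case agreement between the two observers, no binary truth needed.
print("Pearson r (human, model):", pearson(human, model))
print("Kendall tau-b (human, model):", kendall_tau_b(human, model))
print("Generalized AUC (human as reference):", generalized_auc(human, model))
```

In this toy example both observers reach an AUC of 8/9 against the binary truth, yet their case-by-case agreement is near chance: Kendall's tau is -1/15 and the generalized AUC with the human ratings as the ordinal reference is 7/15. A summary figure of merit alone would not reveal that difference.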
