Interrater Reliability for Multiple Raters in Clinical Trials of Ordinal Scale

2007 
This article discusses a method for evaluating the reliability of overall ratings on ordinal scales given by multiple raters (k ≥ 3). It is shown that when the sample size is large relative to the number of raters (n >> k), both the simple mean of the Fleiss-Cohen-type weighted kappa statistics averaged over all pairs of raters and the Davies-Fleiss-Schouten-type weighted kappa statistic for multiple raters are approximately equivalent to the intraclass correlation coefficient (ICC) obtained by assigning integer (natural number) scores to the ordinal categories. These kappa statistics and the corresponding ICCs are illustrated with overall ratings given independently by three raters in studies of diagnostic agents for magnetic resonance imaging. Both fixed and random effects for raters are discussed, and comparative methods between treatment groups (test and reference) are proposed, together with an interpretation of the reliability of overall ratings on an ordinal scale.
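The approximate equivalence stated above can be checked numerically. The sketch below (not the authors' code; the 5-point scale, simulation parameters, and the choice of a two-way random-effects ICC(2,1) are illustrative assumptions) averages the Fleiss-Cohen quadratic-weighted kappa over all rater pairs and compares it with the ICC computed from integer scores:

```python
# Illustrative sketch, assuming a 5-point ordinal scale and k = 3 raters:
# for n >> k, the mean pairwise quadratic-weighted kappa should be close
# to the ICC on integer scores, as the abstract claims.
import numpy as np
from itertools import combinations

def quadratic_kappa(x, y):
    """Fleiss-Cohen (quadratic) weighted kappa via the moment shortcut."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    observed = np.mean((x - y) ** 2)                # weighted disagreement
    expected = np.mean(x ** 2) + np.mean(y ** 2) - 2 * x.mean() * y.mean()
    return 1.0 - observed / expected

def icc_2_1(data):
    """Two-way random-effects ICC(2,1) from the ANOVA mean squares."""
    n, k = data.shape
    grand = data.mean()
    msr = k * np.sum((data.mean(axis=1) - grand) ** 2) / (n - 1)  # subjects
    msc = n * np.sum((data.mean(axis=0) - grand) ** 2) / (k - 1)  # raters
    sse = np.sum((data - grand) ** 2) - (n - 1) * msr - (k - 1) * msc
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

rng = np.random.default_rng(0)
n, k = 500, 3                        # n >> k, as the result requires
true = rng.normal(0.0, 1.0, n)       # latent severity per subject
# Each rater adds noise; scores are rounded and clipped to the 1..5 scale.
ratings = np.clip(np.rint(3 + true[:, None] + rng.normal(0, 0.6, (n, k))), 1, 5)

kappa_bar = np.mean([quadratic_kappa(ratings[:, a], ratings[:, b])
                     for a, b in combinations(range(k), 2)])
icc = icc_2_1(ratings)
```

With a large n the two quantities typically agree to within a few hundredths; the gap shrinks as n grows, consistent with the asymptotic equivalence described in the abstract.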