Reliability in evaluator-based tests: using simulation-constructed models to determine contextually relevant agreement thresholds

Dylan T. Beckler, Zachary C. Thumser, Jonathon S. Schofield, Paul D. Marasco

2018

Background: Indices of inter-evaluator reliability are used in many fields, such as computational linguistics, psychology, and medical science; however, the interpretation of the resulting values and the determination of appropriate thresholds lack context and are often guided only by arbitrary “rules of thumb” or not addressed at all. Our goal in this work was to develop a method for determining the relationship between inter-evaluator agreement and error, to facilitate meaningful interpretation of values, thresholds, and reliability.

Keywords:

Econometrics
Rule of thumb
Computational linguistics
Inter-rater reliability
Psychology
Population
Approximation error
Cutoff
Medical science
Statistics
Systematic error
