Inter-rater reliability in systematic review methodology: exploring variation in coder decision-making

2019 
A methodologically sound systematic review is characterized by transparency, replicability and clear inclusion criteria. However, little attention has been paid to reporting the details of inter-rater reliability (IRR) when multiple coders make decisions at various points in the screening and data extraction stages of a study. Prior research has noted the paucity of information on IRR, including the number of coders involved, the stages at which IRR tests were conducted and how they were conducted, and how disagreements were resolved. This paper examines and reflects on the human factors that affect decision-making in systematic reviews by reporting on three IRR tests, conducted at three different points in the screening process, for two distinct reviews. Results of the two studies are discussed in the context of inter-rater and intra-rater reliability, in terms of the accuracy, precision and reliability of the coding behaviour of multiple coders. Findings indicated that coding behaviour changes both between and within individuals over time, emphasising the importance of conducting regular and systematic inter- and intra-rater reliability tests, especially when multiple coders are involved, to ensure consistency and clarity at the screening and coding stages. Implications for good practice while screening/coding for systematic reviews are discussed.