Ignored evident multiplicity harms replicability -- adjusting for it offers a remedy.

2021 
It is a central dogma in science that the results of a study should be replicable, yet only 90 of 190 replication attempts were successful. We attribute a substantial part of the problem to selective inference evident in the papers: the practice of selecting and reporting some results from among the many examined. Analyzing the 100 papers in the Reproducibility Project: Psychology, we found that reporting many results is common (77.7 per paper on average) and that the selection from these multiple results is not adjusted for. We propose to account for selection using the hierarchical false discovery rate (FDR) controlling procedure TreeBH of Bogomolov et al. (2020), which exploits hierarchical structure to gain power. Of the replicable results, 97% (31 of 32) were statistically significant after adjustment, while only 1 of the 21 results that were non-significant after adjustment was replicated. Given the easy deployment of adjustment tools and the minor loss of power involved, we argue that addressing multiplicity is an essential missing component in experimental psychology and should become a required part of the arsenal of replicability-enhancing methodologies in the field.
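Hierarchical procedures such as TreeBH apply an FDR-controlling step at each level of a tree of hypotheses (e.g., papers, then experiments within papers, then results within experiments). As a rough illustration of the building block such procedures rest on, the following is a minimal sketch of the classical Benjamini-Hochberg step applied to a flat list of p-values; the function name and the example p-values are hypothetical, and this is not the TreeBH algorithm itself, which additionally propagates adjusted levels down the hierarchy.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a boolean list, True where the corresponding hypothesis is
    rejected (declared a discovery) at FDR level q.
    """
    m = len(pvalues)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Example: at q = 0.05, BH rejects the three smallest p-values here,
# whereas a Bonferroni cutoff of 0.05 / 5 = 0.01 would reject only one.
print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5, 0.9], q=0.05))
# -> [True, True, True, False, False]
```

The step-up nature of the procedure (scanning for the largest qualifying rank rather than stopping at the first failure) is what gives it more power than per-comparison corrections while still controlling the expected proportion of false discoveries.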