Abstract
Empirical evaluations of replication have become increasingly common, but there has been no unified approach to conducting them. Some evaluations conduct only a single replication study while others run several, usually across multiple laboratories. Designers of such programs have largely contended with difficult questions about which experimental components are necessary for a set of studies to be considered replications. However, another important consideration is that replication studies be designed to support sufficiently sensitive analyses. For instance, if hypothesis tests about replication are to be conducted, studies should be designed to ensure these tests are well-powered; if not, it can be difficult to determine conclusively whether replication attempts succeeded or failed. This paper describes methods for designing ensembles of replication studies to ensure that they are both adequately sensitive and cost-efficient. It describes two potential analyses of replication studies—hypothesis tests and variance component estimation—and approaches to obtaining optimal designs for them. Using these results, it assesses the statistical power, precision of point estimators, and optimality of the design used by the Many Labs Project, and finds that while that design may have been sufficiently powered to detect some larger differences between studies, other designs would have been less costly and/or produced more precise estimates or higher-powered hypothesis tests.
Field | Value
---|---
Original language | English (US)
Pages (from-to) | 868-886
Number of pages | 19
Journal | Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume | 184
Issue number | 3
DOIs |
State | Published - Jul 2021
Keywords
- experimental design
- meta-analysis
- power
- replication
ASJC Scopus subject areas
- Statistics and Probability
- Social Sciences (miscellaneous)
- Economics and Econometrics
- Statistics, Probability and Uncertainty