Abandon Statistical Significance

Blakeley McShane, David Gal, Andrew Gelman, Christian Robert, Jennifer Tackett

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to “ban” p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.

Original language: English (US)
Pages (from-to): 235-245
Number of pages: 11
Journal: American Statistician
Volume: 73
Issue number: sup1
DOIs: 10.1080/00031305.2018.1527253
State: Published - Mar 29 2019

Fingerprint

Statistical Significance
p-Value
Social Sciences
Null hypothesis
Paradigm
Bayes Factor
Testing
Data Quality
Replication
Screening
Confidence interval
Recommendations
Decision Making
Vary
Costs
Evidence
Factors

Keywords

  • Null hypothesis significance testing
  • Replication
  • Sociology of science
  • Statistical significance
  • p-Value

ASJC Scopus subject areas

  • Statistics and Probability
  • Mathematics (all)
  • Statistics, Probability and Uncertainty

Cite this

McShane, Blakeley ; Gal, David ; Gelman, Andrew ; Robert, Christian ; Tackett, Jennifer. / Abandon Statistical Significance. In: American Statistician. 2019 ; Vol. 73, No. sup1. pp. 235-245.
@article{f8f6fd8283d1477f9cf98ba9f4b52511,
title = "Abandon Statistical Significance",
keywords = "Null hypothesis significance testing, Replication, Sociology of science, Statistical significance, p-Value",
author = "Blakeley McShane and David Gal and Andrew Gelman and Christian Robert and Jennifer Tackett",
year = "2019",
month = "3",
day = "29",
doi = "10.1080/00031305.2018.1527253",
language = "English (US)",
volume = "73",
pages = "235--245",
journal = "American Statistician",
issn = "0003-1305",
publisher = "American Statistical Association",
number = "sup1",

}

Abandon Statistical Significance. / McShane, Blakeley; Gal, David; Gelman, Andrew; Robert, Christian; Tackett, Jennifer.

In: American Statistician, Vol. 73, No. sup1, 29.03.2019, p. 235-245.

TY - JOUR
T1 - Abandon Statistical Significance
AU - McShane, Blakeley
AU - Gal, David
AU - Gelman, Andrew
AU - Robert, Christian
AU - Tackett, Jennifer
PY - 2019/3/29
Y1 - 2019/3/29
KW - Null hypothesis significance testing
KW - Replication
KW - Sociology of science
KW - Statistical significance
KW - p-Value
UR - http://www.scopus.com/inward/record.url?scp=85063197815&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063197815&partnerID=8YFLogxK
U2 - 10.1080/00031305.2018.1527253
DO - 10.1080/00031305.2018.1527253
M3 - Article
VL - 73
SP - 235
EP - 245
JO - American Statistician
JF - American Statistician
SN - 0003-1305
IS - sup1
ER -