A Practical Solution to the Pervasive Problems of p Values

Wagenmakers, Eric-Jan • 2007 Modern Era • methodology

Plain English Summary

This paper lays out a devastating case against the p-value, the workhorse statistic that scientists use to decide whether a result is 'real.' Wagenmakers shows three damning problems: p-values depend on imaginary data you never collected, they change based on what the researcher intended to do (the same data can be 'significant' or not depending on when you planned to stop collecting), and identical p-values can mean wildly different things at different sample sizes. Here's the kicker: at a commonly used threshold, the probability that the boring null hypothesis is actually true can range from 69% to a whopping 92% as your sample grows. As a fix, Wagenmakers champions the Bayesian information criterion (BIC), a straightforward alternative that approximates Bayesian reasoning without the heavy mathematical machinery. This paper later became a loaded weapon in debates over claimed evidence for psychic phenomena.

Actual Paper Abstract

In the field of psychology, the practice of p value null-hypothesis testing is as widespread as ever. Despite this popularity, or perhaps because of it, most psychologists are not aware of the statistical peculiarities of the p value procedure. In particular, p values are based on data that were never observed, and these hypothetical data are themselves influenced by subjective intentions. Moreover, p values do not quantify statistical evidence. This article reviews these p value problems and illustrates each problem with concrete examples. The three problems are familiar to statisticians but may be new to psychologists. A practical solution to these p value problems is to adopt a model selection perspective and use the Bayesian information criterion (BIC) for statistical inference (Raftery, 1995). The BIC provides an approximation to a Bayesian hypothesis test, does not require the specification of priors, and can be easily calculated from SPSS output.
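As a rough illustration of the BIC-based test the abstract describes, the sketch below turns two BIC values into an approximate Bayes factor and a posterior probability for H0, using BF01 ≈ exp(ΔBIC10 / 2) with ΔBIC10 = BIC(H1) - BIC(H0). It assumes equal prior odds for the two hypotheses and a normal-error model whose BIC can be computed from the error sum of squares. The function names and the example sums of squares are hypothetical and are not taken from the paper.

```python
import math


def bic_from_sse(sse: float, n: int, k: int) -> float:
    """BIC for a normal-error model with k free parameters and n observations.

    Computed as n*ln(SSE/n) + k*ln(n), i.e. up to an additive constant that is
    identical for all models fitted to the same data and cancels in differences.
    """
    return n * math.log(sse / n) + k * math.log(n)


def posterior_h0(bic_h0: float, bic_h1: float) -> float:
    """Approximate Pr(H0 | data), assuming equal prior odds for H0 and H1."""
    delta_bic_10 = bic_h1 - bic_h0          # positive values favor H0
    bf_01 = math.exp(delta_bic_10 / 2.0)    # approximate Bayes factor for H0 over H1
    return bf_01 / (1.0 + bf_01)


# Hypothetical example: H1 adds one parameter and lowers the error sum of
# squares from 520 to 500 with n = 100 observations (made-up numbers).
n = 100
bic_h0 = bic_from_sse(sse=520.0, n=n, k=1)
bic_h1 = bic_from_sse(sse=500.0, n=n, k=2)
print(f"Approximate Pr(H0 | data) = {posterior_h0(bic_h0, bic_h1):.3f}")
```

In this made-up case the small reduction in error does not outweigh the ln(n) penalty for the extra parameter, so H0 remains slightly favored.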

Research Notes

Foundational paper for the Bayesian statistics movement in psychology. Wagenmakers later applied these exact BIC and Bayes factor arguments to critique Bem's (2011) 'Feeling the Future' precognition experiments, making this a direct precursor to the most prominent statistical controversy in this library.

Three fundamental problems with p value null-hypothesis significance testing (NHST) are reviewed with concrete examples. First, p values depend on data never observed, violating the conditionality principle. Second, p values depend on the researcher's subjective sampling intentions: identical data yield p = 0.146 under binomial but p = 0.033 under negative binomial sampling. Third, the 'p postulate' (equal p values = equal evidence) is false: Bayesian analysis shows that for data with p = .05, posterior probability of H₀ rises from ~0.69 at n = 400 to ~0.92 at n = 10,000. The Bayesian information criterion (BIC) is proposed as a practical alternative, approximating Bayesian hypothesis testing without requiring prior specification.
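The dependence on sampling intentions can be checked with a few lines of arithmetic. The sketch below assumes a data set of 9 successes and 3 failures in 12 trials, tested against a success probability of 0.5. That data set is an assumption made here because it reproduces the two quoted p values; the notes above do not state it. The binomial p value is taken as two-sided and the negative binomial one as the probability of needing at least 12 trials to reach the third failure. The function names are hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes: int, n: int) -> float:
    """Two-sided p value when n was fixed in advance (theta = 0.5 under H0).

    Doubles the upper tail P(X >= successes), which is valid here because the
    null distribution is symmetric and successes > n / 2.
    """
    upper_tail = sum(comb(n, k) for k in range(successes, n + 1)) / 2**n
    return min(1.0, 2.0 * upper_tail)


def negative_binomial_p(failures: int, n_observed: int) -> float:
    """p value when the plan was to sample until the `failures`-th failure.

    P(N >= n_observed) equals the probability that the first n_observed - 1
    trials contain at most failures - 1 failures (theta = 0.5 under H0).
    """
    m = n_observed - 1
    return sum(comb(m, j) for j in range(failures)) / 2**m


# Assumed data: 9 successes, 3 failures, 12 trials in total.
print(round(binomial_two_sided_p(9, 12), 3))   # 0.146
print(round(negative_binomial_p(3, 12), 3))    # 0.033
```

The observed sequence is the same in both calls; only the stopping rule used to define "more extreme" outcomes differs.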

Cite this paper
APA
Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review. https://doi.org/10.3758/BF03194105
BibTeX
@article{wagenmakers_2007_practical_solution,
  title = {A Practical Solution to the Pervasive Problems of p Values},
  author = {Wagenmakers, Eric-Jan},
  year = {2007},
  journal = {Psychonomic Bulletin \& Review},
  doi = {10.3758/BF03194105},
}