
“The Statistical Crisis in Science”: My talk this Thurs at the Harvard psychology department

Noon Thursday, January 29, 2015, in William James Hall 765 room 1:

The Statistical Crisis in Science

Andrew Gelman, Dept of Statistics and Dept of Political Science, Columbia University

Top journals in psychology routinely publish ridiculous, scientifically implausible claims, justified based on “p < 0.05.” And this in turn calls into question all sorts of more plausible, but not necessarily true, claims that are supported by this same sort of evidence. To put it another way: we can all laugh at studies of ESP, or ovulation and voting, but what about MRI studies of political attitudes, or embodied cognition, or stereotype threat, or, for that matter, the latest potential cancer cure? If we can’t trust p-values, does experimental science involving human variation just have to start over? And what do we do in fields such as political science and economics, where preregistered replication can be difficult or impossible? Can Bayesian inference supply a solution? Maybe. These are not easy problems, but they’re important problems.

Here are the slides from the last time I gave this talk, and here are some relevant articles:

[2014] Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors. Perspectives on Psychological Science 9, 641–651. (Andrew Gelman and John Carlin)

[2014] The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective. Journal of Management. (Andrew Gelman)

[2013] It’s too hard to publish criticisms and obtain data for replication. Chance 26 (3), 49–52. (Andrew Gelman)

[2012] P-values and statistical practice. Epidemiology. (Andrew Gelman)
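
For readers who have not met the terms in the first article's title, here is a rough simulation sketch of what Type S and Type M errors mean. It is not code from the paper. The function name, the true effect of 0.1, the standard error of 0.5, and the 1.96 cutoff are all illustrative assumptions, and the estimate is assumed to be normally distributed around the true effect. The idea is to imagine many replications of a noisy study, keep only those that reach “p < 0.05,” and ask how often the kept estimates have the wrong sign (Type S) and by how much they overstate the true effect on average (Type M).

```python
import random
import statistics


def type_s_and_m(true_effect, std_error, n_sims=100_000, z_crit=1.96, seed=0):
    """Simulate hypothetical replications and summarize the significant ones.

    Assumes each estimate is drawn from Normal(true_effect, std_error) and
    that "significant" means |estimate| > z_crit * std_error.
    """
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > z_crit * std_error:
            significant.append(estimate)

    # Share of replications that reach significance.
    power = len(significant) / n_sims
    # Type S: share of significant estimates with the wrong sign.
    type_s = sum(1 for e in significant if e * true_effect < 0) / len(significant)
    # Type M: average factor by which significant estimates overstate the effect.
    type_m = statistics.mean(abs(e) for e in significant) / abs(true_effect)
    return power, type_s, type_m


# Illustrative numbers only: a small true effect measured with a lot of noise.
power, type_s, type_m = type_s_and_m(true_effect=0.1, std_error=0.5)
print(f"power = {power:.3f}, Type S rate = {type_s:.2f}, exaggeration ratio = {type_m:.1f}")
```

With settings like these, where the true effect is small relative to the standard error, the estimates that pass the significance filter tend to be large overestimates and have the wrong sign a substantial fraction of the time.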


  1. Rahul says:

    The “ridiculous” results are a blessing.

    An apparently reasonable approach that produces plausible results gets no critical examination. It is only when “ridiculous” conclusions are posited that the flaws in the methods get exposed.

    To me Daryl Bem was one of the best things that happened to Statistics in recent years.

    • Andrew says:


      To be fair, Ioannidis, Simonsohn, and others were talking about the flaws in standard statistical methods before that Bem paper came out. But I agree that various silly examples (including those of Bem, Kanazawa, and those ovulation researchers) helped things along. Certainly my own perspective on these sorts of statistically significant studies has changed over the past ten years. I’m much more aware of the possibility that the observed comparisons do not generalize to any larger population of interest.

  2. Keith O'Rourke says:

    An update from clinical research.

    Tom Jefferson, et al (of The Cochrane Collaboration). Risk of bias in industry-funded oseltamivir trials: comparison of core reports versus full clinical study reports 

    The conclusion:
    “This approach is not possible when assessing trials reported in journal publications, in which articles necessarily reflect post hoc reporting with a far more sparse level of detail. We suggest that when bias is so limiting as to make meta-analysis results unreliable, either it should not be carried out or a prominent explanation of its clear limitations should be included alongside the meta-analysis.”

    What is interesting here is that this is the first time anyone in the Cochrane Collaboration had access to the data that usually only regulators have.

  3. Anonymous says:

    There is no “statistical crisis in science”. There is a science crisis. Science does the following. It takes

    A: Physical fact A
    B: Physical fact B

    and combines them with:

    Theory C: whenever A then B

    If B turns out false, or isn’t reproducible, then there are only two possible reasons. Either A wasn’t true, or C wasn’t true. If you fix those problems, you’ll get good, predictive, reproducible science. If you don’t, then you’ll be stuck with crap.

    The problem is that Frequentist philosophy, after a century-long supremacy and a monopoly on teaching, regularly fools an enormous number of smart people into thinking A and C are both true, and then they’re shocked when B turns out to be false. Hence the claim that it’s a statistical crisis.

    Their solution is to double down on Frequentist philosophy. Pre-registration is a prime example. Instead of working to get A’s and C’s which are true, they claim everything will work out if only we pay special attention to what day the question “is C true?” was asked. Somehow, according to them, if you asked it a long time ago everything’s better. No word yet on what happens when two researchers work together, and one happens to ask that question early, while one asks it late.

    That’s the rabbit hole of insanity that Frequentist philosophy leads to for anyone who takes it seriously enough for long enough. All of which can be fixed easily by using probabilities to model the real uncertainty of physical facts, rather than their imaginary frequency of occurrence.

    • Martha says:

      “Science does the following. It takes

      A: Physical fact A
      B: Physical fact B

      and combines them with:

      Theory C: whenever A then B”

      This is an oversimplification. There are lots of situations in science where the theory (scientific hypothesis) in question has the form “A increases the probability of B” (e.g., “Having gene variant X increases the probability of developing disease Y” or “Exposure to X increases the probability of developing Y”).

  4. Anonymous says:

    Can you share the slides of the latest talk?
