
We’re gonna have a discussion of Deborah Mayo’s new book!

That’s Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars. She’ll send us some pages that we can post here, we’ll get some people to share their thoughts, and there will be lots of opportunity for comments.


  1. Very good. Should be interesting.

  2. Thanatos Savehn says:

    I’m beginning to turn blue. Anyway, looking forward to adding to my library.

  3. I hope there is diverse participation. Sometimes a great topic doesn’t stick on blogs. Could be people’s schedules.

  4. Thanks! I’m very much looking forward to it!

  5. James Lee MD PhD says:

    I find this new book to be dynamite–both its carefully worded content and particularly the quality of the paper, the fonts, the order of chapters, etc. This is certainly a book that needs to be in the personal library of every person using statistical hypothesis testing in his/her research. I think that in time it will eclipse her classic Error and the Growth of Experimental Knowledge, which was a definite winner.

    In sum, another masterpiece of scholarship from Deborah Mayo. She deserves our thanks!

  6. I’d be interested to see discussion of her criticism (on pp. 361-370) of the Ioannidis (2005) paper “Why Most Published Research Findings Are False.”
    Surely, if checks on papers published in a journal in earlier years have judged that only 20% are replicable, and little has changed in the papers that are accepted, this should temper the trust placed in more recently published work? Mayo appears to think not. I disagree. It may reasonably temper the judgment of a PhD student who hopes to build on published work, unless she can find a clear reason why a paper of interest stands out from the crowd. Exactly such a query to Begley, as lead author of the 2012 paper Begley et al. “Drug Development: Raise Standards for Preclinical Cancer Research” [Nature 483 (7391): 531–33], prompted the 2013 Begley paper “Reproducibility: Six Red Flags for Suspect Work” (Nature 497 (7450): 433–34).

    Suppose that the paper of interest appears, after checking for red flags, to be in a category where 3 out of 4 of the main results are not replicable, i.e., the odds ratio is R=3:1 against replicability. Ignoring possible bias, the Ioannidis argument is that the false finding rate (FFV; Mayo’s terminology), or probability that a positive is a false positive, is R*alpha/(1-beta), where alpha is the Type I error rate and beta is the Type II error rate.

    On p. 361, with normal distribution assumptions for testing mu=0 against mu>0, she seems to be making the comparison:
    1: alpha=0.05, 1-beta = 0.8; R*alpha/(1-beta) = R/16 [reject H0 if standardized test statistic z > 1.64]
    vs 2: 1-beta = 0.9; for this one needs alpha ~0.115; R*alpha/(1-beta) ~ R/8 [reject H0 if standardized test statistic z > 1.20]

    [Mayo appears to say ‘almost 0.9’ where I have 0.115. I’d assume she’d switched alpha and 1-alpha, except that she says “denial of the alternative H1 does not yield the same null . . . used to obtain the Type I error . . . 0.05. Instead it would be high, nearly as high as 0.9.”]

    She goes on to say: “Thus, the identification of ‘effect’ and ‘no effect’ with the hypotheses used to compute the Type I error probability and power are inconsistent with one another.”

    I fail to see the inconsistency. In moving from 1) to 2) the number of positives increases, but more of them are false positives and the FFV increases accordingly.

    The model that underpins the formula may or may not be fit for purpose, at least in giving ballpark indications. That is another discussion. I’d prefer to work with densities, but that makes the mathematics more complicated.
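The arithmetic in comment 6 can be checked numerically. What follows is a minimal sketch, not anything taken from the book or from the comment: it assumes a one-sided z-test of mu=0 against mu>0, assumes the standardized shift under the alternative is held fixed when moving from case 1 to case 2, and uses function names invented here for illustration. One point worth keeping in view is that R*alpha/(1-beta) is, strictly, the ratio of false positives to true positives; the share of positives that are false is that ratio divided by one plus the ratio.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def critical_value(alpha: float) -> float:
    """One-sided cutoff: reject H0 when the z statistic exceeds this value."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


def shift_for_power(alpha: float, power: float) -> float:
    """Standardized shift under H1 that yields `power` at level `alpha`."""
    return critical_value(alpha) + STD_NORMAL.inv_cdf(power)


def alpha_for_power(shift: float, power: float) -> float:
    """Type I error rate needed to reach `power` with the shift held fixed."""
    cutoff = shift - STD_NORMAL.inv_cdf(power)
    return 1.0 - STD_NORMAL.cdf(cutoff)


def false_to_true_ratio(odds_against: float, alpha: float, power: float) -> float:
    """R*alpha/(1-beta): expected false positives per true positive."""
    return odds_against * alpha / power


def false_share(odds_against: float, alpha: float, power: float) -> float:
    """Proportion of positives that are false, given prior odds R:1 against."""
    ratio = false_to_true_ratio(odds_against, alpha, power)
    return ratio / (1.0 + ratio)


# Assumed inputs, taken from the comment: R = 3, alpha = 0.05, power = 0.8.
R = 3.0
shift = shift_for_power(alpha=0.05, power=0.8)   # case 1 fixes the shift
alpha_2 = alpha_for_power(shift, power=0.9)      # case 2: same shift, power 0.9

ratio_1 = false_to_true_ratio(R, 0.05, 0.8)
ratio_2 = false_to_true_ratio(R, alpha_2, 0.9)

print("case 1: cutoff", critical_value(0.05), "ratio", ratio_1,
      "false share", false_share(R, 0.05, 0.8))
print("case 2: cutoff", critical_value(alpha_2), "alpha", alpha_2,
      "ratio", ratio_2, "false share", false_share(R, alpha_2, 0.9))
```

Under these assumptions, case 1 gives a ratio of R/16, and raising the power to 0.9 by relaxing the cutoff raises both the ratio and the false share, which is the direction the comment describes. Whether the fixed-shift reading matches what Mayo intends on p. 361 is a separate question that the sketch does not settle.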
