In the third paragraph of my comment, the reference should be to page 361.

John (or anyone else):

Yes, please save these longer comments for the future discussions; thanks.

Should further comments/questions then be left until the dates are settled?

NB that FFV in my comment should be FFR — the ‘V’ belongs of course in

PPV = positive predictive value.

Sameera:

Yes, we’re going to have a discussion of Mayo’s book. Just waiting on her to pick the dates.

I’m not yet at those pages of Deborah Mayo’s book; I will jump to them. Perhaps Deborah can clarify it here for us.

I thought her book was going to be discussed here after publication.

Surely, if checks on papers published in a journal in earlier years have judged that only 20% are replicable, and little has changed in the papers that are accepted, should this temper the trust placed in more recently published work? Mayo appears to think not. I disagree. It may reasonably temper the judgment of a PhD student who hopes to build on published work, unless she can find a clear reason why a paper of interest stands out from the crowd. Exactly such a query to Begley, as lead author of the 2012 paper “Drug Development: Raise Standards for Preclinical Cancer Research” (Nature 483 (7391): 531–33), prompted the 2013 Begley paper “Reproducibility: Six Red Flags for Suspect Work” (Nature 497 (7450): 433–34).

Suppose that the paper of interest appears, after checking for red flags, to be in a category where 3 out of 4 of the main results are not replicable, i.e., the odds are R = 3:1 against replicability. Ignoring possible bias, the Ioannidis argument is that the false finding rate (FFR, in Mayo’s terminology), or probability that a positive is a false positive, is R*alpha/(1-beta), where alpha is the Type I error rate and beta is the Type II error rate. (Strictly, R*alpha/(1-beta) is the posterior odds that a positive is false; the corresponding probability is R*alpha/(R*alpha + 1 - beta), and the two are close when the odds are small.)
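As a quick numeric sketch (my own illustration, not part of the original comment), the formula and its exact counterpart can be computed directly:

```python
# Sketch of the Ioannidis-style false finding rate arithmetic.
# R = prior odds against replicability, alpha = Type I error rate,
# beta = Type II error rate (so 1 - beta is the power).

def ffr_odds(R, alpha, beta):
    """Posterior odds that a positive finding is false: R * alpha / (1 - beta)."""
    return R * alpha / (1 - beta)

def ffr_exact(R, alpha, beta):
    """Exact probability that a positive is false: odds / (1 + odds)."""
    odds = ffr_odds(R, alpha, beta)
    return odds / (1 + odds)

# The comment's example: R = 3:1 against, alpha = 0.05, power = 0.8.
print(ffr_odds(3, 0.05, 0.2))   # 3/16 = 0.1875
print(ffr_exact(3, 0.05, 0.2))  # 3/19, about 0.158
```

For these values the odds form and the exact probability differ only in the second decimal place, which is why the shorthand serves for ballpark purposes.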

On p.361, with normal distribution assumptions for testing mu=0 against mu>0, she seems to be making the comparison:

1: alpha=0.05, 1-beta = 0.8; R*alpha/(1-beta) = R/16 [reject H0 if standardized test statistic z > 1.64]

vs 2: 1-beta = 0.9 against the same alternative; for this one needs alpha ~0.115; R*alpha/(1-beta) ~ 0.13R, about 2R/16 [reject H0 if standardized test statistic z > 1.20]

[Mayo appears to say ‘almost 0.9’ where I have 0.115. I’d assume she’d switched alpha and 1-alpha, except that she says “denial of the alternative H1 does not yield the same null . . . used to obtain the Type I error . . . 0.05. Instead it would be high, nearly as high as 0.9”.]

She goes on to say: “Thus, the identification of ‘effect’ and ‘no effect’ with the hypotheses used to compute the Type I error probability and power are inconsistent with one another.”

I fail to see the inconsistency. In moving from 1) to 2) the number of positives increases, but more of them are false positives, and the FFR increases accordingly.
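The comparison can be checked numerically. The sketch below is my reconstruction under the stated normal-model assumptions (one-sided test of mu=0 against mu>0, with the standardized alternative held fixed between the two scenarios), using the standard library's normal distribution:

```python
from statistics import NormalDist

N = NormalDist()   # standard normal
R = 3.0            # prior odds against replicability

# Standardized effect size mu1*sqrt(n)/sigma implied by scenario 1
# (alpha = 0.05 with power 0.8): about 1.645 + 0.842 = 2.49.
shift = N.inv_cdf(0.95) + N.inv_cdf(0.8)

# Scenario 1: reject H0 when z > 1.645.
c1 = N.inv_cdf(0.95)
alpha1 = 1 - N.cdf(c1)           # 0.05
power1 = 1 - N.cdf(c1 - shift)   # 0.8

# Scenario 2: demand power 0.9 against the same alternative;
# the cutoff drops to about 1.20 and alpha rises to about 0.115.
c2 = shift - N.inv_cdf(0.9)
alpha2 = 1 - N.cdf(c2)
power2 = 1 - N.cdf(c2 - shift)   # 0.9 by construction

print(R * alpha1 / power1)   # about 0.19 (= R/16)
print(R * alpha2 / power2)   # about 0.38, roughly twice as large
```

On this reading, raising the power to 0.9 at a fixed alternative loosens the cutoff, roughly doubling R*alpha/(1-beta): more positives, and a larger share of them false.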

The model that underpins the formula may or may not be fit for purpose, at least in giving ballpark indications. That is another discussion. I’d prefer to work with densities, but that makes the mathematics more complicated.

In sum, another masterpiece of scholarship from Deborah Mayo. She deserves our thanks!
