Following up on our recent discussion of p-values, let me link to this recent news article by Tom Siegfried, who interviewed me a bit over half a year ago on the topic. Some of my suggestions may have made their way into his article.
The main reason why I’m linking to this is that four different people emailed me about it! When I get four emails on the same topic, I’ll blog it. (With one exception, of course: as you know, there’s one topic I’m never blogging on again.)
I agree with most of what Siegfried wrote. But to keep my correspondents happy, I’ll mention the few places where I’d amend his article:
1. Siegfried describes prior probability as “an informed guess about the expected probability of something in advance of the study.” He immediately qualifies this: “Often this prior probability is more than a mere guess — it could be based, for instance, on previous studies.” Still, I disagree with his first sentence. I agree that sometimes–often!–a prior distribution is not constructed using previous studies. But when it’s not, I’d call it a model or an assumption, not a guess.
Why does this matter? Mere semantics? Not quite. I put the prior distribution on the same philosophical dimension as the likelihood. I have no problem with you calling my prior distribution an “informed guess” if you’ll also describe your normal distribution or your logistic regression as “informed guesses.” My point: the prior distribution, and also the likelihood (in most cases) are assumptions, they’re mathematical models, not really “guesses” at the truth so much as useful approximations to the truth. Or, more to the point, approximations to the truth that give useful inferences for quantities of interest.
2. Siegfried writes: “Standard or ‘frequentist’ statistics treat probabilities as objective realities; Bayesians treat probabilities as ‘degrees of belief’ based in part on a personal assessment or subjective decision about what to include in the calculation.”
Ummm . . . I completely disagree with this. Bayesians (at least, followers of Bayesian Data Analysis) think of probabilities as objective realities too–or, at least as much as any other statisticians do. Some probabilities are more objective than others. The probability that the die sitting in front of me now will come up “6” if I roll it . . . that’s about 1/6. But not exactly, because it’s not a perfectly symmetric die. The probability that I’ll be stopped by exactly three traffic lights on the way to school tomorrow morning: that’s . . . well, I don’t know exactly, but it is what it is. Some probabilities are more objective and real than others, but I don’t see this as having anything to do with Bayes.
See chapter 1 of BDA for more on where probabilities come from. To put it another way, Bayesian statistics (as I practice it) is no more subjective than any other approach to statistics.
3. Siegfried quotes my former Berkeley colleague Juliet Shaffer as saying:
“Replication is vital. . . . But in the social sciences and behavioral sciences, replication is not common. This is a sad situation.”
Really? Maybe, maybe not. Replication is costly. Is it worth the effort? Depends on the setting. Shaffer, like me, has worked extensively in the social and behavioral sciences. How often has she followed her own advice and replicated somebody else’s study, or even her own? I haven’t done this very often myself.
It’s easy to say that something is “vital” and that other people should do it. It’s not so easy to devote the time and effort to doing it oneself. Which suggests that the benefits of said activity do not necessarily exceed its costs.
I agree with Siegfried’s larger point, though, which is that statistical methods can often be used to give a misleading sense of scientific-ness to messy collections of data. And he makes some other good points along the way, for example that estimates of average treatment effects–even if based on perfect randomized experiments–can obscure potentially important variation.
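To make that last point concrete, here is a minimal sketch in Python. The numbers are made up for illustration (they are not from Siegfried's article or from any study), and the group names and the `average_effect` helper are hypothetical. The idea is simply that a treatment can help one subgroup and hurt another, so that the overall average effect is zero even though nobody's individual effect is zero.

```python
from statistics import mean

# Hypothetical, made-up per-person treatment effects (not from any study):
# the treatment helps everyone in group_a and harms everyone in group_b
# by the same amount.
effects_by_group = {
    "group_a": [2.0, 2.0, 2.0, 2.0],
    "group_b": [-2.0, -2.0, -2.0, -2.0],
}


def average_effect(effects):
    """Average treatment effect for a list of individual effects."""
    return mean(effects)


# Pool everyone together, as a single overall estimate would.
all_effects = [e for group in effects_by_group.values() for e in group]
overall = average_effect(all_effects)

# Compute the same summary separately within each subgroup.
by_group = {name: average_effect(effs) for name, effs in effects_by_group.items()}

print("overall average effect:", overall)
print("average effect by group:", by_group)
```

In this toy setup the overall average is zero while the two subgroup averages are +2 and -2. Real data would of course be noisier, and in practice the individual effects are not observed directly, so take this as an illustration of the arithmetic and not as a recipe for analysis.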
P.S. If there were a stat blogosphere like there’s an econ blogosphere, Siegfried’s article would’ve spurred a ping-ponging discussion, bouncing from blog to blog. Unfortunately, there’s just me out here and a few others (see blogroll), not enough of a critical mass to keep discussion in the air.