The above remark, which came in the midst of my discussion of an analysis of Iranian voting data, illustrates a gap, nay, a gulf, in understanding between statisticians and (many) nonstatisticians, one of whom commented that my quote “makes it sound that [I] have not a shred of a clue what a p-value is.”
Perhaps it’s worth a few sentences of explanation.
It’s a commonplace among statisticians that a chi-squared test (and, really, any p-value) can be viewed as a crude measure of sample size: When sample size is small, it’s very difficult to get a rejection (that is, a p-value below 0.05), whereas when sample size is huge, just about anything will bag you a rejection. With large n, a smaller signal can be found amid the noise.
In general: small n, unlikely to get small p-values. Large n, likely to find something. Huge n, almost certain to find lots of small p-values.
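To make the sample-size point concrete, here is a minimal sketch in Python. It assumes a hypothetical setting that is not from the Iranian data: a binary outcome whose true probability is 0.51, tested against a null of 0.50 with a chi-squared goodness-of-fit test on 1 degree of freedom. To keep it deterministic, the observed counts are set equal to their expected values under the true probability rather than simulated. The function names are mine, for illustration only.

```python
import math

def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom.

    With 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))

def pvalue_for_fixed_discrepancy(n, true_p=0.51, null_p=0.50):
    """p-value when the observed counts sit exactly at their expectation under true_p.

    true_p and null_p are hypothetical values chosen for illustration.
    """
    observed = [n * true_p, n * (1 - true_p)]
    expected = [n * null_p, n * (1 - null_p)]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_pvalue_1df(stat)

# Same one-percentage-point discrepancy, three sample sizes.
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: p = {pvalue_for_fixed_discrepancy(n):.3g}")
```

The discrepancy from the null is identical in all three rows; only n changes. The p-value goes from about 0.84 at n = 100, to just under 0.05 at n = 10,000, to effectively zero at n = 1,000,000.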
The other piece of the story is that our models are just about always wrong. Rejection via a low p-value (with no other information) tells us that the model is wrong, which we already knew. The real question is how large the discrepancies are. P-values can be useful in giving a sense of uncertainty.
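One way to look at the size of a discrepancy, rather than just whether it exists, is to report the estimated difference with an interval around it. The sketch below assumes hypothetical counts (510,000 successes out of 1,000,000 trials, against a null proportion of 0.50) and uses the usual normal-approximation interval; the function name and numbers are illustrative, not taken from any real analysis.

```python
import math

def discrepancy_interval(successes, n, null_p=0.50, z=1.96):
    """Estimated departure from null_p with an approximate 95% interval.

    Uses the normal approximation: estimate +/- z * standard error.
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    diff = p_hat - null_p
    return diff, diff - z * se, diff + z * se

# Hypothetical data: a huge sample with a small departure from the null.
diff, lower, upper = discrepancy_interval(510_000, 1_000_000)
print(f"estimated discrepancy: {diff:.4f} (95% interval {lower:.4f} to {upper:.4f})")
```

With these made-up counts the model is rejected decisively, yet the estimated discrepancy is only about one percentage point, with an interval of roughly 0.009 to 0.011. Whether that matters is a substantive question that the p-value alone cannot answer.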
The only situation I can think of where the model holds up even when sample sizes are huge is the sex ratio. Under normal conditions, the sexes of births really are statistically independent with pretty much constant probabilities; that is, the binomial model holds, even for n in the millions. See here for some calculations showing the good model fit, for n ranging from 4,000 to 400,000 to 3 million. And it’s news when the data don’t fit the model. But the sex ratio is an unusual example in that way. Usually when you have large n, you’ll reject the model as a matter of course.
P.S. In the actual example, as Eduardo Leoni pointed out, I made a mistake, saying the sample size was “huge” when it was only 366 (more of a “large” or a “moderate,” I’d say). So my argument about the p-value doesn’t apply so perfectly to the Iranian election data. But this mistake doesn’t really affect my general point above.