An unexpected benefit of Arrow’s other theorem

In my remarks on Arrow’s theorem (the weak form of Arrow’s Theorem is that any result can be published no more than five times; the strong form is that every result will be published five times), I meant no criticism of Bruno Frey, the author of the articles in question: I agree that it can be a contribution to publish in multiple places. As for evaluating contributions, it should be possible to credit research and communication separately. One problem is that communication is both under- and over-counted: undercounted in that we mostly get credit for original ideas, not for exposition; overcounted in that we need communication skills to publish in the top journals. But I don’t think these two biases cancel out.

The real reason I’m bringing this up, though, is that Arrow’s theorem happened to me recently, and in an interesting way. Here’s the story.

Two years ago I was contacted by Harold Kincaid to write a chapter on Bayesian statistics for the Oxford Handbook of the Philosophy of the Social Sciences. I typically decline such requests because I don’t know that people often read handbooks anymore, but in this case I said yes, because for about 15 years I’d been wanting to write something on the philosophy of Bayesian inference but had never gotten around to collecting my thoughts on the topic. While writing the article for Kincaid, I realized I’d like to reach a statistical audience also, so I enlisted the collaboration of Cosma Shalizi. After quite a bit of effort, we wrote an article that was promptly rejected by a statistics journal. We’re now revising and I’m sure it will appear somewhere. (I liked the original a lot but the revision will be much better.)

In the meantime, though, we completed the chapter for the handbook. It overlaps with our journal article but we’re aiming for different audiences.

Then came opportunity #3: I was asked if I wanted to contribute something to an online symposium on the philosophy of statistics. I took this as an opportunity to express my views as clearly and succinctly as possible. Again, there’s overlap with the two previous papers but I felt that for some reason I was able to make my point more directly on this third try.

The symposium article is still under revision and I’ll post it when it’s done, but here’s how the first draft begins:


The frequentist approach to statistics is associated with a deductivist philosophy of science that follows Popper’s doctrine of falsification. In contrast, Bayesian inference is associated with inductive reasoning and the idea that a model can be dethroned by a competing mode but can never be falsified on its own.

The purpose of this article is to break these associations, which I think are incorrect and have been detrimental to statistical practice, in that they have steered falsificationists away from the very useful tools of Bayesian inference and have discouraged Bayesians from checking the fit of their models. From my experience using and developing Bayesian methods in social and environmental science, I have found model checking and falsification to be central in the modeling process.

1. The standard view of the philosophy of statistics, and its malign influence on statistical practice

Statisticians can be roughly divided into two camps, each with a clear alignment of practice and philosophy. I will divide some of the relevant adjectives into two columns:

Frequentist     Bayesian
Objective       Subjective
Procedures      Models
P-values        Bayes factors
Deduction       Induction
Falsification   Pr(model is true)

I shall call this the standard view of the philosophy of statistics and abbreviate it as S. The point of this article is that S is a bad idea and that one can be a better statistician, and a better philosopher, by picking and choosing between the two columns rather than simply choosing one.
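To make "checking the fit" of a Bayesian model concrete, here is a minimal sketch of a posterior predictive check, the kind of falsificationist move the article argues Bayesians can and should make. Everything in it (the normal model, the exponential data, the choice of test statistic) is my own illustrative assumption, not anything from the article itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: the fitted model assumes i.i.d. normal draws,
# but these data are actually skewed (exponential).
y = rng.exponential(scale=1.0, size=100)
n = len(y)

# Crude posterior for the mean under a flat prior, treating the
# variance as known and equal to the sample variance:
# mu | y ~ Normal(ybar, sigma^2 / n).
sigma = y.std(ddof=1)
mu_draws = rng.normal(y.mean(), sigma / np.sqrt(n), size=1000)

# Test statistic: the sample maximum, which is sensitive to the
# heavy right tail that the normal model cannot capture.
t_obs = y.max()
t_rep = np.array([rng.normal(mu, sigma, size=n).max() for mu in mu_draws])

# Posterior predictive p-value: values near 0 or 1 flag misfit,
# i.e. the model is (practically speaking) falsified by the data.
p_value = (t_rep >= t_obs).mean()
print(round(p_value, 3))
```

The point is that this mixes the two columns: a Bayesian model and posterior simulation on one side, a p-value-style check that can reject the model on the other.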

8 thoughts on “An unexpected benefit of Arrow’s other theorem”

  1. Some trichotomies might suit you better than all these dichotomies.

    Abduction, deduction, induction.

    Might, must, should.

    Possible model, model checked against brute reality, least wrong (for now) model.

    Look forward to reading it, thanks


  2. I like the start of the draft, except for the first sentence (although I think you mean model rather than mode, right?). It reads like the start of one of those widely parodied jargon-filled social science articles. I realize your audience is explicitly interested in the philosophy of statistics and thus presumably happy with, or at least tolerant of, discussing Popperian this and deductivist that, but still, why limit your audience that much?

  3. Phil:

    I'd like the article to be as readable as possible. Maybe I'm just too much of an insider to write this clearly. I think the article as a whole is pretty clear, but I agree that the first paragraph or two seem pretty jargony. If you could send me a revision of these two paragraphs, translated from jargon to English, I'd much appreciate it.

  4. Some have found the paragraph below makes some sense, but it is risky reading. I think, though, it's more of that challenge to "induce someone into telling the story to themselves rather than telling it to them," especially given that they have missed most or even all of the first 2000 years of philosophy.

    As for Peirce, at… – only RA Fisher had as many lines as Peirce. More scholarly work by Stephen Stigler will likely confirm many of these Wikipedia claims, but to me, Peirce’s sense of science and scholarly method is far more important than any of his particular (and perhaps mostly lost) contributions to statistics. To him, science was an ongoing communal process of inquiry, with those engaged both realizing and agreeing that they would always be wrong in some sense but could jointly and continuously get less and less wrong. Scholarly method should force this process of getting less wrong along as much as possible, and efficient (pragmatic) methods would accelerate it as much as possible. Statistical methods, as they are primarily meant to deal with uncertainty pragmatically, should accelerate the process of becoming less and less wrong about important uncertainties. Paraphrasing John Dewey, my intent in this work is to bring the best of (my understanding of) Peirce’s philosophy to bear on challenges in applying statistics, rather than to turn challenges in applying statistics into philosophical issues (fallibly) addressed by Peirce.


  5. Andrew, my comment was perhaps ambiguous: the FIRST SENTENCE is the only one I don't like!

    Just as you may be too close to the subject to recognize what is and isn't jargon, I may be too far away from it to write knowledgeably about it. I'm not even sure I know what "frequentist" means. I thought it referred to the doctrine that statistics can only properly be calculated from repeated realizations of "the same" mechanism, so that really nobody is a "frequentist." I think of the term as often being used more loosely to refer to people who are interested in "statistical significance," and perhaps people who reject models based on numerical thresholds related to the probability of getting the observed data under the assumed model. But I suspect that my view of "frequentists" is something of a caricature.

    Similarly, I think I know what "Bayesian" means, but I don't know _anybody_ (whether or not they self-identify as "Bayesian") who doesn't check the fit of their models! So your view of "Bayesians" also seems like something of a caricature.

    In short, I don't know how to fix your first sentence, but I encourage you to make it less jargony. Maybe that will require splitting it up into two or three sentences.

  6. I shall call this the standard view of the philosophy of statistics and abbreviate it as S

    Won't this make Type S errors rather confusing? Sounds like a great article, though.

  7. Phil: How's this for a revised first paragraph?

    The classical or frequentist approach to statistics (in which inference is centered on significance testing) is associated with a philosophy in which science is deductive and follows Popper’s doctrine of falsification. In contrast, Bayesian inference is associated with inductive reasoning and the idea that a model can be dethroned by a competing model but can never be falsified on its own.

  8. Andrew, if you say it's so then I might believe you but… I dunno, I do Bayesian statistics but I think of my models as being falsifiable, in the sense that if they fit really badly I realize I probably need to look for other ones. That seems distinct from your implication that, as a Bayesian, I shouldn't reject a model unless I test it against another one that is found to perform better. And I think even most frequentists would agree that "All models are false; some models are useful," so…

    I dunno, as far as the substance goes I just feel like you are working too hard to create the distinction that you are then going to try to erase!

    However, stylistically it's better than it was, so I guess you have addressed my main criticism.
