After reading the reviews of Kris Burdzy’s book “The Search for
Certainty” that appeared on the blogs of Andrew Gelman and Christian
Robert, I was tempted to dismiss the book without reading it. However,
curiosity got the best of me and I ordered the book and read it. I am
glad I did. I think this is an interesting and important book.
Both Gelman and Robert were disappointed that Burdzy’s criticism of
philosophical work on the foundations of probability did not seem to
have any bearing on their work as statisticians. But that was
precisely the author’s point. In practice, statisticians completely
ignore (or misrepresent) the philosophical foundations espoused by de
Finetti (subjectivism) and von Mises (frequentism). This is itself a
damning criticism of the supposed foundational edifice of statistics.
Burdzy makes a convincing case that the philosophy of probability is a
failure.
He criticizes von Mises because his theory, based on defining limits
of sequences (or collectives), does not assign a probability to a given
event. (There are also technical issues with the mathematical
definition of a collective that von Mises was unable to resolve but
these can be fixed rigorously using modern computational complexity
theory. But that doesn’t blunt the force of Burdzy’s main criticism.)
His criticism of de Finetti is more thorough. There is the usual
criticism, namely, that subjective probability is unscientific as it
is not falsifiable. Moreover, there is no guidance on how to actually
set probabilities. Nor is there anything in de Finetti to suggest that
probabilities should be based on informed prior opinion, as many
Bayesians would argue. More surprising is Burdzy’s claim that
subjective probability has the same problem as von Mises’ frequency
theory: it does not provide probability for an individual event. This
claim will raise the hackles of die-hard Bayesians. But he is right:
de Finetti’s coherence argument requires that you bet on several
events. The rules of probability arise from the demand that you avoid
a sure losing bet (a Dutch book) on the collection of bets. The
argument does not work if we supply a probability only on a single
event. The criticisms of de Finetti’s subjectivism go beyond this and
I will not attempt to summarize them.
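The point about coherence requiring several bets can be made concrete with a small sketch (my own illustration, not from the book; the prices are hypothetical). A bettor whose prices for an event A and its complement sum to more than 1 can be sold both tickets and loses for certain; a single priced event in [0, 1] admits no such sure-loss book.

```python
def sell_ticket(price, occurs):
    """Opponent sells the bettor a $1-payout ticket at the bettor's
    own price: collect the price now, pay $1 if the event occurs."""
    return price - (1.0 if occurs else 0.0)

# Incoherent prices: P(A) + P(not A) = 1.2 > 1.
p_A, p_notA = 0.6, 0.6

# Exactly one of A, not-A occurs; the opponent collects 1.2 and
# pays out 1.0, so the bettor surely loses 0.2 in every state.
for a_occurs in (True, False):
    profit = sell_ticket(p_A, a_occurs) + sell_ticket(p_notA, not a_occurs)
    print(profit)  # ≈ 0.2 in both states: a Dutch book
```

With only one event priced, no combination of states forces a loss, which is why the coherence argument says nothing about a probability supplied for a single event.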
Burdzy provides his own foundation for probability. His idea is that
probability should be a science, not a philosophy, and that, as such,
it should be falsifiable. Allow me to make an analogy. Open any
elementary book on quantum mechanics and you will find a set of
axioms. These axioms can be used to make very specific predictions.
If the predictions are wrong (and they never have been), then the
axioms would be rejected. But to use the axioms, one must inject some
specifics. In particular, one must supply the Hamiltonian for the
problem. If the resulting predictions fail to agree with reality, we
can reject that Hamiltonian.
To make probability scientific, Burdzy proposes laws that lead to
certain predictions that are vulnerable to falsification. More
importantly, the specific probability assignments we make are open to
being falsified. Before stating his laws, let me emphasize a
crucial aspect of Burdzy’s approach. Probability, he claims, is the
search for certainty; hence the title of the book. That might seem
counter to how we think of probability but I think his idea is
correct. In frequentist theory, we make deterministic predictions
about limits of sequences. In subjectivist theory, we make the
deterministic claim that if we assign probabilities consistent with
the rules of probability then we are certain to be immune to a Dutch
book. A philosophy of probability, according to Burdzy, is the search
for what claims we can make for certain.
Burdzy’s proposal is to have laws — not axioms — of probability.
Axioms, he points out, merely encode facts we regard as
uncontroversial. Laws, by contrast, are proposals for a scientific
theory that are open to falsification. Here are his five proposed
laws (paraphrased):
(L1) Probabilities are numbers between 0 and 1.
(L2) If A and B are disjoint then P(A or B) = P(A) + P(B).
(L3) If A and B are physically independent then they are
mathematically independent, meaning that P(A and B) = P(A)P(B).
(L4) If there exists a symmetry on the space of possible outcomes
which maps an event A onto an event B then P(A)=P(B).
(L5) P(A)=0 if and only if A cannot occur. P(A)=1 if and only if it must occur.
Some comments are in order. (L1) and (L2) are standard, of course.
(L4) refers to ideas like independent and identically distributed
sequences, or exchangeability. It is not an appeal to the principle of
indifference. Quite the opposite. Burdzy argues that introducing
symmetry requires information, not lack of information.
(L3) and (L4) are taught in every probability course as add-ons. But
in fact they are central to how we actually construct probabilities in
practice. The author asks: Why treat them as follow-up ideas? They
are so central to how we use probability that we should elevate them
to the status of fundamental laws.
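How the laws combine to construct concrete probability assignments can be sketched in a few lines (my own toy illustration, not Burdzy’s): symmetry (L4) fixes the probabilities for one die, physical independence (L3) gives the joint probabilities for two dice, and additivity (L2) gives the probability of a compound event.

```python
from fractions import Fraction
from itertools import product

# (L4): the symmetry permuting the six faces forces P(face) = 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# (L3): two physically independent rolls get product probabilities.
two_dice = {(a, b): die[a] * die[b] for a, b in product(die, die)}

# (L2): additivity over disjoint outcomes, e.g. P(sum of dice == 7).
p_seven = sum(p for (a, b), p in two_dice.items() if a + b == 7)
print(p_seven)  # 1/6
```

Note that nothing here appeals to ignorance: the symmetry and the physical independence of the rolls are substantive assumptions about the dice.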
(L5) is what makes the theory testable. Here is how it works. Based
on our probability assignments, we can construct events A that have
probability very close to 0 or 1. For example, A could be the event
that the proportion of heads in many tosses is within .00001 of 1/2.
If this doesn’t happen, then we have falsified the probability
assignment. Of course, P(A) will rarely be exactly 0 or 1; rather, it
will be close to 0 or 1. But this is precisely what happens in all
sciences. We can test predictions of general relativity or quantum
mechanics to a level of essential certainty, but never exact
certainty. Thus Burdzy’s approach puts probability on the same
footing as other scientific theories.
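The testing recipe can be illustrated with a small sketch (my own, not from the book; the sample size, tolerance, and bias are assumed values). Under the fair-coin assignment, Hoeffding’s inequality makes the event “observed frequency within a tolerance of 1/2” nearly certain, and data from a coin that is actually biased falsifies the assignment.

```python
import math
import random

def deviation_bound(n, eps):
    """Hoeffding bound: under P(heads) = 1/2, the probability that the
    observed frequency deviates from 1/2 by more than eps in n tosses
    is at most 2*exp(-2*n*eps**2)."""
    return 2 * math.exp(-2 * n * eps**2)

n, eps = 100_000, 0.01
print(deviation_bound(n, eps))  # ≈ 4e-9: the band is near-certain

# Simulate a coin that is actually biased (p = 0.55) and test the
# fair-coin assignment against its near-certain prediction.
rng = random.Random(0)
freq = sum(rng.random() < 0.55 for _ in range(n)) / n
falsified = abs(freq - 0.5) > eps
print(falsified)  # the fair-coin assignment fails the test
```

The probability assignment is never refuted with logical certainty, only to the “essential certainty” the review describes, exactly as with any other scientific theory.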
To summarize, Burdzy’s approach is to treat probability as a
scientific theory. It has rules for making probability assignments
and the resulting probabilities can be falsified. Not only is this
simple, it is devoid of the murkiness of subjectivism and the weakness
of von Mises’ frequentism. And, perhaps most importantly, it reflects
how we use probability. It also happens to be easy to teach. My only
criticism is that I think the implications of (L1)-(L5) could be
fleshed out in more detail. It seems to me that they work well for
providing a foundation for testable frequency probability. That is,
they provide a convincing link between probability and frequency. But
that could reflect my own bias towards frequency probability. More
detail would have been nice.
My short summary of this book does not do justice to the author’s
arguments. In particular, there is much more to his critique of
subjective probability than I have presented in this review. The best
thing about this book is that it will offend and annoy both
frequentists and subjectivists. I implore my friends on both sides of
the philosophical divide to read the book with an open mind.
1. Whatever von Mises’s merits (or lack thereof) in general, I can’t take him seriously as a philosopher of statistical practice (see pages 3-4 of this article).
2. As I wrote earlier, Burdzy’s comments about subjectivism may or may not be accurate, but they have nothing to do with the Bayesian data analysis that I do. In that sense, I don’t think that Larry’s comment about “both sides of the philosophical divide” is particularly helpful. I see no reason to choose between two discredited philosophies, and in fact in chapter 1 of BDA we are very clear about the position we take, which indeed is completely consistent with Popper’s ideas of refutation and falsifiability.
As I wrote before, “My guess is that Burdzy would differ very little from Christian Robert or myself when it comes to statistical practice. . . . but I suppose that different styles of presentation will be effective with different audiences.” Larry’s review suggests that there are such audiences out there.