A few months ago I was asked to review Do Dice Play God?, the latest book by mathematician and mathematics writer Ian Stewart.

Here are some excerpts from my review:

My favorite aspect of the book is the connections it makes in a sweeping voyage from familiar (to me) paradoxes, through modeling in human affairs, up to modern ideas in coding and much more. We get a sense of the different “ages of uncertainty”, as Stewart puts it.

But not all the examples work so well. The book’s main weakness, from my perspective, is its assumption that mathematical models apply directly to real life, without recognition of how messy real data are. That is something I’m particularly aware of, because it is the business of my field — applied statistics.

For example, after a discussion of uncertainty, surveys and random sampling, Stewart writes, “Exit polls, where people are asked who they voted for soon after they cast their vote, are often very accurate, giving the correct result long before the official vote count reveals it.” This is incorrect. Raw exit polls are not directly useful. Before they are shared with the public, the data need to be adjusted for non-response, to match voter demographics and election outcomes. The raw results are never even reported. The true value of the exit poll is not that it can provide an accurate early vote tally, but that it gives a sense of who voted for which parties once the election is over.
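The raw-to-adjusted step is the crux here. Below is a minimal sketch of the kind of reweighting involved (post-stratification of respondents to known demographic shares); all the numbers, group labels, and candidate names are invented for illustration and are not from any real exit poll:

```python
# Hypothetical sketch of a non-response adjustment of the sort applied to
# raw exit-poll data: reweight respondents so the sample's demographic mix
# matches known population shares (post-stratification).
# All numbers here are made up for illustration.

# Assumed population shares by age group (e.g., from voter files)
population_share = {"18-44": 0.45, "45+": 0.55}

# Raw exit-poll respondents: (age_group, candidate voted for)
respondents = [
    ("18-44", "A"), ("18-44", "A"), ("18-44", "B"),
    ("45+", "A"), ("45+", "B"), ("45+", "B"), ("45+", "B"),
]

# Demographic shares actually observed in the raw sample
n = len(respondents)
sample_share = {}
for group, _ in respondents:
    sample_share[group] = sample_share.get(group, 0) + 1 / n

def weighted_vote_share(candidate):
    # Each respondent gets weight population_share / sample_share,
    # so over-represented groups are down-weighted and vice versa.
    total = sum(population_share[g] / sample_share[g] for g, c in respondents)
    cand = sum(population_share[g] / sample_share[g]
               for g, c in respondents if c == candidate)
    return cand / total

raw_share_A = sum(1 for _, c in respondents if c == "A") / n
adj_share_A = weighted_vote_share("A")
print(f"raw A share: {raw_share_A:.3f}, adjusted A share: {adj_share_A:.3f}")
```

Real exit-poll adjustment works on the same principle but with many more cells (sex, education, region, past vote) and, eventually, calibration to the official count itself, which is exactly why the raw numbers alone are not useful.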

It is also disappointing to see Stewart trotting out familiar misconceptions of hypothesis testing . . . Here’s how Stewart puts it in the context of an otherwise characteristically clearly described example of counts of births of boys and girls: “The upshot here is that p = 0.05, so there’s only a 5% probability that such extreme values arise by chance”; thus, “we’re 95% confident that the null hypothesis is wrong, and we accept the alternative hypothesis”. . . .
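The error here, transposing P(extreme data | null) into P(null | data), is easy to demonstrate by simulation. The following is my own toy setup, not anything from the book: when most tested hypotheses are truly null, far more than 5% of the results that clear p < 0.05 come from true nulls, so "p = 0.05" cannot mean "95% confident the null is wrong".

```python
# Simulation (an illustration with assumed numbers, not from the book or
# review) of why p < 0.05 does not mean a 95% probability the null is wrong.
# The p-value is P(data at least this extreme | null), not P(null | data).
import random

random.seed(1)

def simulate(n_tests=200_000, frac_null=0.9, n=100, effect=0.2):
    # Each "study" runs a crude two-sided z-test of a sample mean against 0.
    significant_null = 0
    significant_total = 0
    for _ in range(n_tests):
        is_null = random.random() < frac_null
        mu = 0.0 if is_null else effect
        # Sample mean of n standard-normal draws centered at mu
        mean = mu + random.gauss(0, 1) / n ** 0.5
        z = mean * n ** 0.5
        if abs(z) > 1.96:  # two-sided p < 0.05
            significant_total += 1
            significant_null += is_null
    # Share of "significant" results where the null was actually true
    return significant_null / significant_total

print(simulate())
```

With these assumed numbers (90% of hypotheses truly null, reasonable power otherwise), roughly half of the significant results come from true nulls, nowhere near 5%. The point is only that the two conditional probabilities are different quantities; how different depends on the base rate and the power.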

As I recall the baseball analyst Bill James writing somewhere, the alternative to good statistics is not no statistics: it’s bad statistics. We must design our surveys, our clinical trials and our meteorological studies with an eye to eliminating potential biases, and we must adjust the resulting data to make up for the biases that remain. . . . One thing I like about Stewart’s book is that he faces some of these challenges directly. . . .

I believe that a key future development in the science of uncertainty will be tools to ensure that the adjustments we need to make to data are more transparent and easily understood. And we will develop this understanding, in part, through mathematical and historical examples of the sort discussed in this stimulating book.

As you can see from the above excerpts, my review is negative in some of the specifics but positive in general. Stewart had some interesting things to say but, when he moved away from physics and pure mathematics to applied statistics, he got some details wrong.

A month or so after my review appeared, Stewart replied in the same journal. His reply is short so I’ll just quote the whole thing:

In his review of my book Do Dice Play God?, Andrew Gelman focuses on sections covering his own field of applied statistics (Nature 569, 628–629; 2019). However, those sections form parts of just two of 18 chapters. Readers might have been better served had he described the book’s central topics — such as quantum uncertainty, to which the title of the book alludes.

Gelman accuses me of “transposing the probabilities” when discussing P values and of erroneously stating that a confidence interval indicates “the level of confidence in the results”. The phrase ‘95% confident’, to which the reviewer objects, should be read in context. The first mention (page 166) follows a discussion that ends “there’s only a 5% probability that such extreme values arise by chance. We therefore … reject the null hypothesis at the 95% level”. The offending sentence is a simplified summary of something that has already been explained correctly. My discussion of confidence intervals has a reference to endnote 57 on page 274, which gives a more technical description and makes essentially the same point as the reviewer.

I also disagree with Gelman’s claim that I overlook the messiness of real data. I describe a typical medical study and explain how logistic and Cox regression address issues with real data (see pages 169–173). An endnote mentions the Kaplan-Meier estimator. The same passage deals with practical and ethical issues in medical studies.

Here’s my summary of what Stewart said:

1. My review focuses on my own areas of expertise, which only represent a small subset of what the book is about.

2. His technically erroneous statements about hypothesis testing should be understood in context.

3. He doesn’t mention the bit about polling. Maybe he agrees he made a mistake there but doesn’t want to talk about it, or maybe he didn’t want to look into polling too deeply, or maybe he thinks the details of exit polls don’t really matter.

In reply, I’ll just say:

1a. I like Stewart’s book, and my review was largely positive!

1b. I think my review is more valuable when I can engage with my areas of expertise. Had I focused my review on Stewart’s treatment of quantum mechanics, I wouldn’t have had much of anything useful to say.

2. I recognize that it’s a challenge to convey technical concepts in words. It’s easy to write something that vaguely seems correct but actually is not. Here’s an embarrassing example from one of my own textbooks! So I have sympathy for Stewart here. Still, he got it wrong.

3. I think polling is important! If you’re gonna include something on exit polls in your book, you should try your best to get it right.

By writing a book with many examples, you leave many hostages to fortune. That’s ok—a book can have mistakes and still be valuable.

This is a good case study in how debate often magnifies small differences.

> The offending sentence is a simplified summary of something that has already been explained correctly.

Perhaps people don’t understand the damage this can cause, given that the simplified summary might be what most people take away with them. This temptation to simplify until the result is wrong and misleading needs to be guarded against.

That was the main point of this post https://statmodeling.stat.columbia.edu/2019/12/18/attempts-at-providing-helpful-explanations-of-statistics-must-avoid-instilling-misleading-or-harmful-notions-statistical-significance-just-tells-us-whether-or-not-something-definitely-does-or-defin/

The response actually makes me more worried than what Andrew says haha.

> I also disagree with Gelman’s claim that I overlook the messiness of real data. I describe a typical medical study and explain how logistic and Cox regression address issues with real data (see pages 169–173). An endnote mentions the Kaplan-Meier estimator. The same passage deals with practical and ethical issues in medical studies.

Like, I’m sure logistic and Cox regression do something, but this doesn’t seem like a convincing argument to me as a non-reader that he really covers the messiness of real data. If the cookie-cutter models worked, then we probably wouldn’t call the data messy! I assume the aside that the same passage deals with ethical issues would annoy someone who works on that. One section to rule them all!

It’s a common thing with social science types: point out that their data don’t obey the assumptions of their statistical model, and there’s always a standard *correction* they make. There’s not usually any willingness to drill down into the issue any further than that.

This is an interesting exchange. I have sympathy for both sides.

As a non-statistician (Ph.D. in Mechanics, work in medical research) I must confess to frequently not having the slightest idea what is being discussed on this blog.

Don’t get me wrong, I **LOVE** this blog (my review is largely positive) – but sometimes the terminology and assumed expertise/background leaves me thinking I’d be better off trying to decipher the Rosetta Stone. That’s fine. I’m sure most of the readers would be bored senseless if you always explained exactly what you meant in super-simplified terms that a non-statistician could understand.

But, I assert that while there is some harm in simplifying things with the introduction of errors, there is also some good that comes of it.

I often think about how undergraduate mechanics courses are taught. It’s all a lie (beam theory, statics, inviscid fluid flow, aerodynamics, etc…). But, by introducing people to concepts through these simplified frameworks, we seem to gain something (I can’t prove this). For those who end with their BS, they have decent tools for getting a job and working as engineers. For those going on to graduate school, they get the enjoyable experience of finding out “what’s really going on”.

I have NOT read the book in question, and I DO understand your arguments about the errors made in the book, and it WAS, after all, a book review (and largely positive), and the author DID seem to get a bit overly bent-out-of-shape (hasn’t he ever had a paper rejected? scientists usually get used to this kind of impersonal, reasonable criticism). But, at the same time, I get how hard it is to explain complex ideas to an audience lacking the background to understand “the truth” – I work with medical doctors every day!

I’m technically a mechanical engineer too!

> But, by introducing people to concepts through these simplified frameworks, we seem to gain something

Well maybe this is an audience thing? Check out Keith’s post. If you were teaching non-mechanical engineers about mechanical engineering, instead of telling them about equations of aerodynamics for <200mph airplanes or whatever, you might tell them, "well, pencil and paper will get you to X hundred miles per hour, but then turbulence kicks in, or whatever, and if you wanna go reaaaally fast, like 1000mph, then yada yada".

Like can't you actually talk about more stuff honestly if you don't get yourself caught in the weeds (Cox models, and whatnot)?

And really I think we introduce engineering ideas at this level too, but then it takes whole semesters to even get through little bitty bits of the details!

Jim says:

Keith said: “Perhaps people don’t understand the damage this can cause given it might be what most people take away with them.”

Michael said: “by introducing people to concepts through these simplified frameworks”

Ben said: “Well maybe this is an audience thing?”

Three sound points. With respect to Keith’s (Andrew’s) point, there is a serious problem with oversimplification of some statistical methods, which is causing major errors in research results, which is why Andrew pointed it out.

OTOH, yes, lots of simplifications are really useful because they communicate a concept clearly, while still allowing us to build a more complex and accurate description on top of the simplification. The Ideal Gas Law, for example, communicates all the fundamentals of how gases behave even though it hardly applies exactly to anything, and yet it can be extended to apply to almost anything.

I guess we have to use our judgement to decide which is which and, as Ben said, also take the audience into consideration.

Simplifying physics and simplifying social science models are very different things though.

Does anyone who’s read the book know if Stewart’s treatment of this is kosher – meaning informed by QPT etc. – and not just more floundering from people who aren’t even aware that there’s more to probability than classical (Kolmogorovian, commutative) probability?

I love the example of exit polls. It’s yet another example where using the “raw data” is more harmful than using adjusted data.

Stewart’s reaction to “transposing the probabilities” is worrying. It indicates that he doesn’t understand the issue even now.

Kaiser:

Unfortunately there’s a long tradition of mathematicians writing about statistics in a naive way and then berating applied people for not following mathematical rules. I had to deal with this when I taught at Berkeley: the stat department was full of people who I suspect wouldn’t be qualified to clean the erasers in a real math department, but they’d write textbooks and indoctrinate students into stupid math-purist attitudes about random samples and the like.