Who rotated my cookie? The all-too-common mistake when presenting Bayesian inference to strain at the gnat of the prior while swallowing the camel of the likelihood

Sander Greenland pointed me to this amusing little book that introduces Bayesian inference using a simple example of a kid taking a bite out of a cookie:

Literal-minded statistician that I am, I noticed a problem here: that 1/3 probability in the likelihood seems too high. Given the picture of the cookie with the candies, I’d say the probability of getting a bite with no candies is more like 1/10.

Then there’s all this with the prior distribution:

This demonstrates a general point that we see in Bayesian analysis: lots of obsessing over the prior, but the likelihood is set up without much thought at all! Obv the numbers in the cookie example are arbitrary: for the purpose of teaching, it doesn’t matter if it’s 1/3 or 1/10 or whatever. The larger problem comes both in teaching and in practice, when people use likelihoods that are way off, and they never even think to check.
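To see how much the likelihood number can matter, here is a minimal sketch of Bayes' rule for a two-cookie version of the problem. Everything in it is an assumption made for illustration, not taken from the book: there is one cookie with candies and one plain cookie, the prior is 50/50, and the plain cookie always gives a candy-less bite. The function name `posterior_candy_cookie` is hypothetical.

```python
from fractions import Fraction


def posterior_candy_cookie(prior_candy, p_no_candy_bite):
    """P(candy cookie | bite had no candy) for a two-cookie setup.

    Assumption: the other cookie has no candies at all, so its
    probability of a candy-less bite is 1.
    """
    numerator = prior_candy * p_no_candy_bite
    denominator = numerator + (1 - prior_candy) * 1
    return numerator / denominator


prior = Fraction(1, 2)  # assumed 50/50 prior, not the book's number

# Compare the likelihood printed in the book (1/3) with the eyeballed 1/10.
for likelihood in (Fraction(1, 3), Fraction(1, 10)):
    print(likelihood, "->", posterior_candy_cookie(prior, likelihood))
```

Under these assumed numbers the posterior probability of the candy cookie is 1/4 with a likelihood of 1/3 and 1/11 with a likelihood of 1/10, so with the prior held fixed, swapping the likelihood alone moves the answer by nearly a factor of three.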

13 thoughts on “Who rotated my cookie? The all-too-common mistake when presenting Bayesian inference to strain at the gnat of the prior while swallowing the camel of the likelihood”

  1. “1/3 probability in the likelihood [of not getting a candy] seems too high. ”

    Interesting thought. I saw this illustration a while back somewhere (here?) and that didn’t occur to me. If you were to move the position of the bite 10° around the perimeter of the cookie each time, very roughly after 3-4 moves or 30-40° the bite would have candy, so the probability of a bite without candy is more like 1/10.

    What’s great is how convincing that graphic with the three bites is! It seems to make it obvious that the probability of a bite without candy is 1/3, but as you pointed out it’s not an accurate depiction of the probability. But it may have been just as convincing to the authors as it is to the public!

    • Anon:

      Yeah, that’s the point. For the prior they went into obsessive detail about the concept of a prior distribution, what could each of the cookies look like, etc. But for the likelihood they just took a quick look and made up a number without checking whether it was at all consistent with their picture. I’ve seen this a lot, where people trust their model more than their data. Consider all those ridiculous regression discontinuity analyses where the authors publish the outrageous-looking graph but they don’t even see the problem, as they’re so conditioned to think that their model must be correct (or, to be more precise, they think their statistical method must be appropriate).

  2. Since EJ has a book called “Bayesian Thinking for Toddlers” (https://psyarxiv.com/w5vbp/), does this mean we need a Bayes book for every stage of development? Certainly the examples would have to be appropriate for each age group.

    Babies get cookies. Toddlers get dinosaurs. What would be best for tweens? For teens? For retirees? For newborns?

    Kruschke’s book for adults has cute puppies on the cover, but sadly no puppies in the examples.

  3. Ehh, do note that the candies change position in each of those first images — in the one shown in the Pr(no_candy_bite | cookie) page, only one third to one quadrant of the cookie has any candies at all, vs the larger candy sectors of the preceding images (one of the cookies in the prior even has a likelihood of 1/1, since it lacks candies altogether). If we treat the sector as being completely covered by candy (and having central angle ≈120°), and the minor segment of the bite as also having a central angle of around 120°, then you can indeed rotate the bite through 120° of cookie before the segment and sector overlap, yielding a conditional probability of 1/3 for the featured cookie (assuming its orientation when picked up & nommed on is uniform).

    Now, maybe that cookie’s sector is a bit >120°, but from the first images, that the bite is also 1/3 seems reasonable to me. But then, one would think a child biting from a candy cookie would work to ensure a bit of candy in every bite; if it were me biting that specific cookie, a candy-less bite would have probability << 1/10.
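The rotation argument in this comment can be checked with a short sketch. It assumes, as the comment does, that the candies fill one contiguous sector of the cookie, that the bite removes one contiguous arc of the rim, and that the bite position is uniform around the cookie. The function names `p_no_candy` and `simulate` are hypothetical.

```python
import random


def p_no_candy(candy_deg, bite_deg):
    """Exact probability that a uniformly placed bite arc misses the candy arc."""
    # The bite can start anywhere in the stretch of rim left over once the
    # candy sector and the bite's own width are accounted for.
    free_deg = 360.0 - candy_deg - bite_deg
    return max(0.0, free_deg) / 360.0


def simulate(candy_deg, bite_deg, n=200_000, seed=0):
    """Monte Carlo check: candy covers [0, candy_deg), bite starts uniformly."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n):
        start = rng.uniform(0.0, 360.0)
        # The bite covers [start, start + bite_deg), wrapping past 360.
        # It misses the candy only if it starts after the candy sector
        # and ends before wrapping back around into it.
        if start >= candy_deg and start + bite_deg <= 360.0:
            misses += 1
    return misses / n


print(p_no_candy(120, 120))  # sector and bite both about 120 degrees
print(simulate(120, 120))    # simulation should land close to the exact value
print(p_no_candy(204, 120))  # a wider candy sector with the same bite
```

With a 120° candy sector and a 120° bite the exact value is 1/3, matching the comment. Holding the bite at 120°, the candy sector would need to span about 204° for the probability to drop to 1/10, which leaves only 36° of free rotation.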

  4. Andrew:

    Like most sweet confections, the delicious cookie example delivers a high-calorie sugar rush with little lasting nutritional value. The pedagogical objectives of the exercise would have been better served by acknowledging the idealized assumptions and the questions they beg. Suppose, for example, that different people take different-sized bites from the cookie? Or that some of the biters are migraine-prone, and deliberately avoid eating anything that looks like a potentially triggering M&M candy (which is highly prevalent in the population of candy-bearing cookies)? Nothing wrong with idealized assumptions. It’s difficult to imagine negotiating the discourse and commerce of everyday life without them. In the rarefied contexts of science and education, though, they always warrant acknowledging and examining.

    That’s my idealized assumption, at least!

    John

  5. That’s exactly why I have a problem understanding variational inference. Why do they just make all these assumptions about using an exponential family to make the calculation possible without thinking about why their model is … right? And I still don’t understand.

    • There’s no need to make assumptions about exponential families to do variational inference. For example, you can use normalizing flows and do it all non-parametrically.

      The reason people choose exponential families is the same reason they choose conjugate priors for Gibbs sampling—it makes conditional distributions analytic and simplifies computation. The literature does sometimes make it seem like this is the way to do variational inference. Systems like BUGS/JAGS do adaptive rejection sampling/slice sampling when the conditionals are not conjugate.

      Also, +1 to Yuling’s paper with Aki, Dan, and Andrew—it’s really beautiful conceptually, theoretically, and practically (the trifecta!). We have it coded for variational inference in Stan.
