Columbo does posterior predictive checks

I’m already on record as saying that Ronald Reagan was a statistician so I think this is ok too . . .

Here’s what Columbo does. He hears the killer’s story and he takes it very seriously (it’s murder, and Columbo never jokes about murder), examines all its implications, and finds where it doesn’t fit the data. Then Columbo carefully examines the discrepancies, tries some model expansion, and eventually concludes that he’s proved there’s a problem.

OK, now you’re saying: Yeah, yeah, sure, but how does that differ from any other fictional detective? The difference, I think, is that the tradition is for the detective to find clues and use these to come up with hypotheses, or to trap the killer via internal contradictions in his or her statement. I see Columbo as different, and more in keeping with chapter 6 of Bayesian Data Analysis, in that he is taking the killer’s story seriously and exploring all its implications. That’s the essence of predictive model checking: you take advantage of the fact that you’re working with a generative model, and you generate anything and everything you can.
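To make the analogy concrete, here is a minimal sketch of a posterior predictive check in Python. Everything in it is an assumption made for illustration: the "story" is a normal model with known standard deviation 1 and a flat prior on the mean, the data are made up, and the function name posterior_predictive_pvalue is hypothetical. The idea is simply to generate replicated datasets from the fitted model and see whether some feature of the real data looks out of place among them.

```python
import random
import statistics


def posterior_predictive_pvalue(y, stat, n_rep=2000, seed=0):
    """Share of replicated datasets whose statistic is at least the observed one.

    Assumed model (hypothetical, for illustration only):
    y_i ~ Normal(mu, 1) with a flat prior on mu, so that
    mu | y ~ Normal(mean(y), sd = 1 / sqrt(n)).
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    observed = stat(y)
    exceed = 0
    for _ in range(n_rep):
        mu = rng.gauss(ybar, 1 / n ** 0.5)            # draw mu from its posterior
        y_rep = [rng.gauss(mu, 1) for _ in range(n)]  # generate a replicated dataset
        if stat(y_rep) >= observed:
            exceed += 1
    return exceed / n_rep


# Made-up data: mostly unremarkable, plus one value the story cannot explain.
y = [0.3, -0.5, 1.1, 0.2, -0.9, 0.6, -0.1, 0.8, -0.4, 6.0]

p_max = posterior_predictive_pvalue(y, max)                # checks the largest value
p_mean = posterior_predictive_pvalue(y, statistics.fmean)  # checks the average
print(p_max, p_mean)
```

Under these assumptions, the check on the mean should come out unremarkable (a p-value near 0.5, since the model is fit to the mean), while the check on the maximum should come out close to 0: the story accounts for the average but not for the one observation that doesn't fit. Which test statistics to look at is a choice, and that choice is where the detective work comes in.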

13 thoughts on “Columbo does posterior predictive checks”

  1. He’s also blessed with very very very good priors. You never see him exploring the implications of anyone’s story other than the killer…

  2. paul: an explanation for being so good at guessing at the start is given by another fictional detective here http://www.visual-memory.co.uk/b_resources/abduction.html (warning: it reflects the rampant racism and disregard for due process of law at the time.)

    I have started to use a detective metaphor for Bayesian analysis (to replace “nearest-neighbors”), building on (Rubin’s) two-stage conceptual description of Bayes: first sample from the prior distribution (parameter suspects), then sample from the data model given the sampled parameter value (suspect), and keep as the posterior (current suspects) only those parameters (suspects) that generated data very close to that in the study. Prior parameters (suspects) that generated data values different from the study (crime) have confirmed alibis and can be confidently excluded from further suspicion of having generated the data in the study. (Analytical Bayes actually down-weights parameters in the prior by p(observed|parameter)/p(observed), aka the relative belief ratio.)
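A minimal sketch of the two-stage description above, in Python. The setup is an assumption chosen for illustration, not anything from the original comment: a Uniform(0, 1) prior on a success probability, binomial data with 7 successes in 10 trials, and a hypothetical function name two_stage_posterior. Because the data are discrete, "very close" can be taken as an exact match.

```python
import random
import statistics


def two_stage_posterior(observed_successes, n_trials, n_draws=50_000, seed=1):
    """Keep only the prior draws ("suspects") that reproduce the observed data."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        # Stage 1: draw a parameter suspect from the prior, Uniform(0, 1).
        theta = rng.random()
        # Stage 2: simulate data from the data model given that suspect.
        simulated = sum(rng.random() < theta for _ in range(n_trials))
        # Suspects whose simulated data differ from the study have an alibi.
        if simulated == observed_successes:
            kept.append(theta)
    return kept


# Hypothetical study: 7 successes in 10 trials.
suspects = two_stage_posterior(observed_successes=7, n_trials=10)
posterior_mean = statistics.fmean(suspects)
print(len(suspects), posterior_mean)
```

With this prior and data model the analytical posterior is Beta(8, 4), with mean 8/12, so the mean of the surviving suspects should land near 0.67. Most prior draws are discarded, which is why this works as a conceptual description more than as a practical algorithm.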

    I believe metaphors can facilitate and increase understanding (some just find them annoying.)

    By the way, in the Reagan post, both Reagan and Roosevelt missed the counterfactual point: “Are you better off now than you were four years ago? || compared to where you would be if I had lost.”
    Seems topical – given last night’s TV debate.

    • I’ve been known to use a metaphor of searching for a lost wallet as a way to introduce ideas about statistical assumptions and how they affect power. If you make a strong assumption about where you lost your wallet (e.g., “I lost it somewhere in this room”), and base your search on that assumption, you’ve improved your chances of finding it–if your assumption is true or nearly true. But if your assumption is very false, you’d be better off making a weaker assumption and searching accordingly (e.g., “I lost it somewhere on campus”). Analogously, making strong assumptions (e.g., assuming NIID data) can improve statistical power, but only when those assumptions are met sufficiently closely. I do think this sort of thing is helpful when teaching beginners.

  3. This brings to mind a broader issue. A couple of years ago I realized that, over the course of about 40 years of reading science fiction, I have never encountered a statistician practicing statistics in fiction. In principle this has all the appeal of detective fiction, but with the added element of contemplating the nature of data and appropriate methodology in an analysis associated with some plot-relevant problem, as well as dealing with fraud and misuse of methodologies. Occasionally I’ve seen a bit of math and probabilities in stories, but nothing like statistics as practiced by statisticians. I could envision a whole detective-like genre, with an incidental educational component.

      • I really wanted to like this show, since they usually used real terms and concepts that they usually explained reasonably accurately. It was also exciting to have an interesting show that might inspire kids to pursue science/math, and that provided a popular introduction to statistics/AI concepts. Plus, who isn’t tired of (cheap?) science fiction throwing in made-up technical terms instead of using the real thing?

        Unfortunately, the concepts could never have accomplished what they claimed they could accomplish, and they couldn’t resist throwing in a few real phrases with incorrect meanings. In the end, it was harder to suspend my disbelief regarding things I was actually familiar with. And the show started taking turns that I didn’t like.

    • In Asimov’s Foundation cycle, the whole plot is based on psychohistory, which comes close to sociological statistics in my opinion. (With the turns in the plot being due to the method being unable to predict outliers!) Similarly, in Brandon Sanderson’s Mistborn trilogy, there are several instances of statistical reasoning, such as “These numbers are just too regular to be natural. Nature works in organized chaos—randomness on the small scale, with trends on the large scale.” or “It’s like the chaos of normal random statistics has broken down (…) A population should never react this precisely—there should be a curve of probability, with smaller populations reflecting the expected percentages less accurately.” So I think that, while statisticians are not exposed as such, you can find statistical illustrations in science fiction and fantasy as well as detective stories…

      • I was also going to mention the Foundation series! As I remember, the statistics of people/history was explained with a physics analogy: you can predict the behavior of a gas much more accurately than an individual molecule, and you can predict the actions of a group of people much more accurately than that of an individual. (Which, as you say, shows up in the plot.)

        The series is full of interesting detail, but it also has an incredible arc which is ultimately tied back to the use of this statistics.

      • Certainly statistics is alluded to in science fiction, but it just has not delved much deeper than that. As a real-world example, consider the well-intended researchers (we all have encountered them) who do their own analyses, generally jumping straight to something simple like linear ANOVA without checking any of the assumptions, and ignoring or unaware of issues with heterogeneity or correlation among the predictors, leading to inflated Type I/II error rates. I could easily envision many a story based upon a statistical “sleuth” dealing with issues like these, and incidentally educating readers about them, and hopefully inspiring some researchers to recognize when they should leave analyses to the professionals (or to become one).

        I have seen many topics in science or mathematics dealt with rather nicely in fiction, but not so much statistics (statistics uses mathematics as a tool, but is no more mathematics than are physics or engineering). Perhaps statisticians are not writing much fiction, or perhaps those that do fear rejection if they delve too deeply.

  4. Pingback: “Just one more thing” on LARPing
