Implicit assumptions in the Tversky/Kahneman example of the blue and green taxicabs

Juan de Oyarbide writes:

In Chapter 16 of the book “Thinking, Fast and Slow,” titled “Causes Trump Statistics,” Daniel Kahneman distinguishes between the use of statistical base rates and causal base rates in Bayes’ rule. Using a simple example, Kahneman claims that our intuitive reasoning often fails to arrive at the correct Bayesian model, and that whether it does depends on how the problem is presented to us. He says that under some circumstances omitting the prior leads to overestimating the posterior probability.

I wonder whether the problem in question really has the same mathematical representation under both presentations, or whether there might be some model misidentification. I think the way the information is presented could condition our understanding of the priors, and therefore the associated uncertainty (e.g., knowing the population shares while being uncertain about risk is not the same as having those same probabilities as the risks associated with each population, together with equal population weights).

Oyarbide provides further details:

I found the problem online, I will share it below.

A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data: 85% of the cabs in the city are Green and 15% are Blue. A witness identified the cab as Blue. The court tested the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80% of the time and failed 20% of the time.

What is the probability that the cab involved in the accident was Blue rather than Green?

Now consider a variation of the same story, in which only the presentation of the base rate has been altered. You are given the following data: The two companies operate the same number of cabs, but Green cabs are involved in 85% of accidents. The information about the witness is as in the previous version.

Kahneman writes:

The two versions of the problem are mathematically indistinguishable, but they are psychologically quite different. People who read the first version do not know how to use the base rate and often ignore it. In contrast, people who see the second version give considerable weight to the base rate, and their average judgment is not too far from the Bayesian solution. Why?

In the first version, the base rate of Blue cabs is a statistical fact about the cabs in the city. A mind that is hungry for causal stories finds nothing to chew on: How does the number of Green and Blue cabs in the city cause this cab driver to hit and run? In the second version, in contrast, the drivers of Green cabs cause more than 5 times as many accidents as the Blue cabs do. The conclusion is immediate: the Green drivers must be a collection of reckless madmen! You have now formed a stereotype of Green recklessness, which you apply to unknown individual drivers in the company.

The stereotype is easily fitted into a causal story, because recklessness is a causally relevant fact about individual cabdrivers. In this version, there are two causal stories that need to be combined or reconciled. The first is the hit and run, which naturally evokes the idea that a reckless Green driver was responsible. The second is the witness’s testimony, which strongly suggests the cab was Blue. The inferences from the two stories about the color of the car are contradictory and approximately cancel each other. The chances for the two colors are about equal (the Bayesian estimate is 41%, reflecting the fact that the base rate of Green cabs is a little more extreme than the reliability of the witness who reported a Blue cab). The cab example illustrates two types of base rates.

Statistical base rates are facts about a population to which a case belongs, but they are not relevant to the individual case. Causal base rates change your view of how the individual case came to be. The two types of base-rate information are treated differently: Statistical base rates are generally underweighted, and sometimes neglected altogether, when specific information about the case at hand is available. Causal base rates are treated as information about the individual case and are easily combined with other case-specific information.

Oyarbide writes:

My question is, are the problems mathematically indistinguishable? In the first case we don’t have information about risk, so some prior on risk should be incorporated before including the population facts. My second question is, is there such a thing as a statistical base rate and a causal base rate? Shouldn’t we always write a problem based on causality and incorporate population information in the priors?

My reply is that neither problem is fully mathematically specified; they both rely on implicit assumptions of independence or random sampling. So you can think of the problems as different to the extent that the different scenarios might bring to mind different models of departures from this unstated assumption.
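To make the unstated assumption concrete, here is a minimal Python sketch of the calculation behind the 41% figure. The function names and the `risk_ratio_blue_to_green` parameter are illustrative and do not come from Kahneman's text. The sketch assumes that the witness's 80% accuracy applies equally to both colors and, for the first version, that per-cab accident risk is the same in both companies unless a different ratio is supplied.

```python
def accident_share_blue(fleet_share_blue, risk_ratio_blue_to_green=1.0):
    """P(Blue | accident) from the fleet share and the relative per-cab risk.

    risk_ratio_blue_to_green is p(accident | Blue) / p(accident | Green).
    The default of 1.0 is the unstated equal-risk assumption.
    """
    blue = fleet_share_blue * risk_ratio_blue_to_green
    green = 1.0 - fleet_share_blue
    return blue / (blue + green)


def posterior_blue(prior_blue, accuracy=0.8):
    """P(Blue | witness says Blue) by Bayes' rule, with symmetric accuracy."""
    true_blue = accuracy * prior_blue
    false_blue = (1.0 - accuracy) * (1.0 - prior_blue)
    return true_blue / (true_blue + false_blue)


# Version 1: 15% of cabs are Blue; equal per-cab risk is ASSUMED.
version_1 = posterior_blue(accident_share_blue(0.15))

# Version 2: Blue cabs are stated to be involved in 15% of accidents.
version_2 = posterior_blue(0.15)

# Version 1 again, but ASSUMING Blue cabs are twice as risky per cab.
version_1_blue_riskier = posterior_blue(accident_share_blue(0.15, 2.0))

print(round(version_1, 3), round(version_2, 3), round(version_1_blue_riskier, 3))
```

Under the equal-risk assumption the two versions give the same answer of about 0.41. Once the risk ratio is allowed to differ from 1, the first version's answer moves, which is one way to see that the problem as stated leaves a modeling choice open.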

10 thoughts on “Implicit assumptions in the Tversky/Kahneman example of the blue and green taxicabs”

  1. Drives me crazy that so many people’s understanding of probability comes largely from this kind of mathy presentation in which there are ALWAYS implicit and unfounded assumptions about independence and random sampling.

    Suppose Green cab has a very high level of stringency on its hiring process and has installed GPS trackers and docks people’s pay for reckless driving, while Blue cab doesn’t do any of that. Even though Green cab has 85% of the cabs in the city, they could easily be involved in only 5-10% of accidents.

    So NO Mr Kahneman the problems are not in any way mathematically equivalent.

    It bothers me that writers of textbooks are always presenting these probability problems with deeply flawed assumptions as if they were the correct and only way the problem could be solved. This tends to lead people to think that they know how to solve such stuff and make the textbook-type assumptions, and then do some truly abysmal science.

    Sally Clark is a real-world example: she was wrongfully convicted of murdering her children and served major jail time, then became an alcoholic and died of alcohol poisoning, because of this kind of bullshit on the part of a pediatrician named Roy Meadow.

    https://en.wikipedia.org/wiki/Sally_Clark

    According to the article a review of many cases after that showed there were two other women similarly wronged.

    • let’s suppose “green is involved in 10% of accidents” while also “green is 85% of cabs”

      p(green | accident) = 0.1

      p(green) = 0.85

      now we’d like to see how much more “reckless” each is. That is we’d like to compare p(accident | blue) to p(accident | green)

      (1) p(accident | green) p(green) = p(green | accident) p(accident)

      (2) p(accident) = p(accident | blue) p(blue) + p(accident | green) p(green)

      p(blue) = 0.15
      p(blue | accident) = 1-p(green|accident) = .9 (because only 2 colors)

      let’s plug in known values to equation (1)

      p(accident | green) * 0.85 = 0.1 * ( p(accident | blue) * 0.15 + p(accident | green) * 0.85)

      p(accident | green) * (0.85 - 0.085) = 0.015 * p(accident | blue)

      p(accident | green) / p(accident | blue) = 0.015/0.765 = 0.0196

      or green has about 2% of the risk of blue per trip/time.

      Kahneman’s assumption was p(accident | green) / p(accident | blue) = 1, that is, they have the same risk and the only difference is the prevalence. But this is clearly a STRONG assumption, in fact way too strong: it’s clearly possible to be off by a lot. Unless you know both the prevalence of cabs overall and their prevalence in accidents, or you assume that the pool of drivers and the kinds of trips they do (i.e., the regions through which they drive) are the same, you can’t get anything meaningful.
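The algebra above can be checked with a short Python sketch. The function name is illustrative. It relies on the identity p(accident | green) / p(accident | blue) = [p(green | accident) / p(green)] / [p(blue | accident) / p(blue)], which follows from applying equation (1) to each color.

```python
def risk_ratio_green_to_blue(fleet_share_green, accident_share_green):
    """Return p(accident | green) / p(accident | blue) for a two-company city.

    Each per-cab risk is proportional to (share of accidents) / (share of cabs);
    the common factor p(accident) cancels in the ratio.
    """
    green = accident_share_green / fleet_share_green
    blue = (1.0 - accident_share_green) / (1.0 - fleet_share_green)
    return green / blue


# The hypothetical above: green is 85% of cabs but 10% of accidents.
print(round(risk_ratio_green_to_blue(0.85, 0.10), 4))  # 0.0196

# Equal risk: accident share matches fleet share, so the ratio is 1.
print(round(risk_ratio_green_to_blue(0.85, 0.85), 4))  # 1.0

# Kahneman's second version: equal fleets, green in 85% of accidents.
print(round(risk_ratio_green_to_blue(0.50, 0.85), 2))  # 5.67
```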

    • I hope people do not consider themselves experts on probability if they understand the above example, but it’s a bit silly to say that such and such mathematical technicalities need to be spelled out explicitly in a popular science book for it to be doing useful exposition. Imagine the average reader interested in behavioral economics picking up this book and the exposition begins with “Let the probability that any randomly sampled cab is blue be X, let the probability that any cab is in an accident be constant…” etc. No one would read it, because the extra information just obscures the insight the author is trying to provide even while making that insight more technically true. The first time you learn what a pdf is you don’t know measure theory; does that mean your stats101 professor is a big liar because your knowledge is corrupted by these underlying unspecified assumptions? I would hope you think not.

      • There are two problems.

        1) People often never see an example more sophisticated than these.
        2) People often see *multiple* examples like this one where some base assumption of equality is made, over and over again.

        The impression they get is that these kinds of solutions are *the way* you do it.

        All Kahneman has to say is “assume that blue cab drivers are slightly more reckless than green cab drivers so that they are involved in 1.2 times as many accidents per mile driven”

        or something like that, so that you get the idea that you need to consider this question.

        Otherwise people just over and over see “I am not given any information about question X therefore I MUST assume A and B have the same X”

      • In teaching people about probability and statistics, we could do one of two things… only one of which is actually common

        1) We can hide an implicit assumption in the problem that the two cab companies have equally dangerous employment practices, so that each company’s share of cabs in the town equals the probability that a cab in an accident belongs to it **and this is what we always do, and what was done in the example**

        or we could tell the truth:

        2) The information given is not enough to solve the problem, not even approximately, we must put a prior over the relative risk taking in the two companies and if we don’t the problem simply can’t be solved.

        I prefer that we NOT LIE to students so I prefer 2…

        Universally “mathy” books prefer 1 and similar assumptions like it (such as independence) and this is almost always the only experience a non-specialist would have with probability and statistics, and is so widespread that literal MDs will confidently testify that women should wrongly be locked in a cage on the basis of this kind of stuff.

        Doing those kinds of textbook problems over and over in probability and stats textbooks for applied use (ie. engineers, biologists, physicists, chemists, social sciences, etc) is wrong and we should feel ashamed as a society that we lie to students.
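As a rough illustration of option 2, the following Python sketch puts a prior on the relative risk and looks at what it implies for the cab problem. Everything specific here is an assumption made for illustration: the lognormal prior centered on equal risk, its spread `sigma`, the number of draws, the symmetric 80% witness accuracy, and the simplification that learning an accident occurred tells us nothing about the risk ratio.

```python
import random
import statistics


def posterior_blue(prior_blue, accuracy=0.8):
    """P(blue | witness says blue) by Bayes' rule, with symmetric accuracy."""
    true_blue = accuracy * prior_blue
    false_blue = (1.0 - accuracy) * (1.0 - prior_blue)
    return true_blue / (true_blue + false_blue)


def blue_accident_share(fleet_share_blue, risk_ratio):
    """P(blue | accident) when blue's per-cab risk is risk_ratio times green's."""
    blue = fleet_share_blue * risk_ratio
    return blue / (blue + (1.0 - fleet_share_blue))


def analyze(fleet_share_blue=0.15, sigma=1.0, draws=20_000, seed=0):
    """Propagate an ASSUMED lognormal prior on the risk ratio (median 1)."""
    rng = random.Random(seed)
    shares = [
        blue_accident_share(fleet_share_blue, rng.lognormvariate(0.0, sigma))
        for _ in range(draws)
    ]
    # Marginal answer: average the accident share over the prior, then update.
    marginal = posterior_blue(statistics.fmean(shares))
    # Spread of the answer if the risk ratio were known (5% and 95% points).
    conditional = sorted(posterior_blue(s) for s in shares)
    low = conditional[int(0.05 * draws)]
    high = conditional[int(0.95 * draws)]
    return marginal, low, high


marginal, low, high = analyze()
print(f"marginal {marginal:.2f}, 90% range {low:.2f} to {high:.2f}")
```

With a prior this wide the answer spans a broad range, and shrinking `sigma` toward zero recovers the textbook value. The point of the sketch is only that the textbook number corresponds to one particular, very confident prior.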

  2. The court tested the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80% of the time and failed 20% of the time.

    Isn’t the base rate already baked in? Sounds like they had a witness watch the street with 85/15% green/blue cabs driving by (at night, in the rain, etc). The witness correctly identified a blue cab 80% of the time.

    So the right answer should be 80%.

    Likewise maybe the witness observed actual/staged accidents, where 85% of the time the offending cab was green. In that case the right answer would also be 80%. But that “test” seems less plausible given what we know a priori about such court cases.

    So what did this test actually entail?

    The cheapest thing would be equal numbers of green/blue cabs driving by during the test. In that case, yes, we need to account for the base rates and perhaps use the principle of indifference (or assume some other prior) for unknown variables.

    The meaning of the number “80%” depends on unmentioned methodological details of the “test”. The question requires us to make assumptions about how the test was performed. In a complete analysis, the various scenarios should also be given different probabilities based on prior knowledge about how courts perform such tests.

    • > The cheapest thing would be equal numbers of green/blue cabs driving by during the test.

      I take this back, the cheapest thing is to have the witness simply observe the street under normal (85% green cabs), rather than staged/controlled, circumstances.
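The dependence on test design can be made concrete with a small Python sketch. It contrasts two readings of the court's 80%: as the chance that the witness names a color given that the cab has that color, or as the fraction of the witness's blue calls that were right during the test. The function names are illustrative, and symmetric accuracy for both colors is an assumption.

```python
def fraction_of_blue_calls_correct(share_blue, accuracy):
    """P(cab is blue | witness says blue) for a given traffic mix."""
    right = accuracy * share_blue
    wrong = (1.0 - accuracy) * (1.0 - share_blue)
    return right / (right + wrong)


def accuracy_implied(share_blue, fraction_correct):
    """Symmetric accuracy that yields the given fraction of correct blue calls."""
    num = fraction_correct * (1.0 - share_blue)
    return num / (num + (1.0 - fraction_correct) * share_blue)


# Reading A: 80% is per-color accuracy. What the witness's blue calls are
# worth then depends on the traffic mix.
print(round(fraction_of_blue_calls_correct(0.50, 0.8), 3))  # 0.8
print(round(fraction_of_blue_calls_correct(0.15, 0.8), 3))  # 0.414

# Reading B: 80% of blue calls were right on a street with 15% blue cabs.
# The base rate is already baked in, and the implied accuracy is higher.
print(round(accuracy_implied(0.15, 0.8), 3))  # 0.958
```

Under reading A with a 50/50 staged test, the base rate still has to be applied afterward. Under reading B the 80% is already the answer, but only if traffic on the night of the accident matched the test mix.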

  3. I keep pondering this post and am not sure what to think. Andrew is suggesting that the way the question is worded may impact the unstated assumptions that people make when answering. Here is a different example (not necessarily the best, but one I easily found): A recent Pew Research poll asks the question: “How satisfied are you with the way democracy is working in (survey country) – very satisfied, somewhat satisfied, not too satisfied, or not at all satisfied?” This question is asked in a number of countries over a number of years as a gauge of people’s satisfaction with democracy.

    Clearly there are many unstated assumptions in this question – awareness of recent political events, where in the country the respondent lives, what other countries they might have knowledge of when asked the question, etc. The question could have been worded differently: for example, “Democracy is working well in my country: Do you strongly agree, agree, disagree, strongly disagree with that statement?”

    It is certainly possible that the stories people implicitly invent about the unstated assumptions will differ depending on how the question is asked. Similarly, it is entirely possible that the green/blue cab question wording will lead to different implicit stories about how the cab companies discipline their employees, what the weather was like, etc. I think it is almost impossible to avoid this ambiguity. Kahneman has a nice clean story about why people respond differently to the two ways of stating the information. Andrew suggests that the story could be wrong – the different responses could be due to different unstated assumptions triggered by the different wording. I guess resolving this would require some additional experimentation aimed at seeing whether the wording indeed triggers different assumptions rather than triggering just the different emotional response that Kahneman suggests.

    While I accept the validity of Andrew’s point about the unstated assumptions, I’m not sure I see a good reason to think that Kahneman’s results were due to such assumptions rather than the reason he postulates. If much is to be made about this possibility, then how are we to interpret any survey, where I think the same ambiguities inherently are present?
