Joe Stover points to this op-ed by lawyer and political activist Ted Frank, who writes:
Even Supreme Court justices are known to be gullible. In a dissent from last week’s ruling against racial preferences in college admissions, Justice Ketanji Brown Jackson enumerated purported benefits of “diversity” in education. “It saves lives,” she asserts. “For high-risk Black newborns, having a Black physician more than doubles the likelihood that the baby will live.”
A moment’s thought should be enough to realize that this claim is wildly implausible. . . . the actual survival rate is over 99%.
Indeed, there’s no treatment that will take the survival rate up to 198%.
Frank continues:
How could Justice Jackson make such an innumerate mistake? A footnote cites a friend-of-the-court brief by the Association of American Medical Colleges, which makes the same claim in almost identical language. It, in turn, refers to a 2020 study . . . [which] makes no such claims. It examines mortality rates in Florida newborns between 1992 and 2015 and shows a 0.13% to 0.2% improvement in survival rates for black newborns with black pediatricians (though no statistically significant improvement for black obstetricians).
The AAMC brief either misunderstood the paper or invented the statistic. (It isn’t saved by the adjective “high-risk,” which doesn’t appear and isn’t measured in Greenwood’s paper.)
Here’s the quote from the brief by the Association of American Medical Colleges:
And for high-risk Black newborns, having a Black physician is tantamount to a miracle drug: it more than doubles the likelihood that the baby will live.
Here’s the relevant passage from the cited article, “Physician–patient racial concordance and disparities in birthing mortality for newborns”:
And here’s the relevant table:
Stover summarizes:
As far as I can tell, the justification for the quote is probably in Table 1, col. 1. Baseline mortality rate (white newborn + white dr) is 290 (per 100k). Black newborn is +604 above that giving 894/100k. Then -494 from that when it is black newborn + black dr giving 400/100k. So the black newborn mortality rate is more than cut in half when the doctor is also black.
So while the amicus brief did seem to misunderstand or misrepresent the study, the qualitative finding still holds.
Of course, maybe there are other statistical problems. I figure these basic stats don’t need a model though and could have been pulled out of the raw dataset easily.
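Stover's arithmetic is easy to check. Here is a minimal sketch that uses only the three numbers he quotes from Table 1, column 1 (deaths per 100,000 births). The variable names are mine, not the paper's, and this just adds coefficients; it does not refit any model.

```python
# Numbers as quoted by Stover from Table 1, col. 1 (deaths per 100,000 births).
# Variable names are illustrative, not taken from the paper.
PER = 100_000
baseline = 290               # white newborn, white doctor
black_newborn_penalty = 604  # added for a black newborn
concordance_offset = -494    # added for a black newborn with a black doctor

black_white_dr = baseline + black_newborn_penalty     # 894 per 100k
black_black_dr = black_white_dr + concordance_offset  # 400 per 100k

# Compare the two framings: ratio of mortality rates vs. ratio of survival rates.
mortality_ratio = black_black_dr / black_white_dr
survival_white_dr = 1 - black_white_dr / PER
survival_black_dr = 1 - black_black_dr / PER
survival_ratio = survival_black_dr / survival_white_dr

print(f"mortality: {black_white_dr} -> {black_black_dr} per 100k "
      f"(ratio {mortality_ratio:.3f})")
print(f"survival:  {survival_white_dr:.3%} -> {survival_black_dr:.3%} "
      f"(ratio {survival_ratio:.4f})")
```

On these numbers the mortality rate falls by more than half, while the survival rate rises by about half a percentage point. That is nowhere near a doubling.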
There’s also Table 2 of the article, which presents data on babies with and without comorbidities. I’m guessing that’s what the amicus brief was talking about when referring to “high-risk” newborns.
In any case, the judge’s key mistake was to trust the amicus brief. I guess this shows a general problem when judges rely on empirical evidence. On one hand, a judge and a judge’s staff are a bunch of lawyers and have no particular expertise in evaluating scientific claims—it’s not like they’re gonna go read journal articles and try to untangle what’s in Table 1 of the Results section. On the other hand, evidently the Association of American Medical Colleges has no such expertise either. I can see why some judges would prefer to rely entirely on legal reasoning and leave empirical findings aside entirely. On the other hand, sometimes they need to rule based on the facts of a case, and in that case empirical results can matter . . . so I’m not sure what they’re supposed to do! I guess I’m overthinking this somehow, but I’m not quite sure where.
How could the judge’s opinion have been changed to accurately summarize this research? Instead of “For high-risk Black newborns, having a Black physician more than doubles the likelihood that the baby will live,” she could’ve written, “A study from Florida found Black infant mortality rates to be half as high with Black physicians than with White physicians.” OK, this could probably be phrased better, but here are the key improvements:
– Instead of just saying the statement as a general truth, localize it to “a study from Florida.”
– Instead of saying “more than doubles the likelihood that the baby will live,” say that the mortality rate halved.
That last bit is kind of funny . . . but I can see that if you’re writing an amicus brief in a hurry, you can, without reflection, think that “reducing risk of death by half” is the same as “doubling the survival rate.” I mean, sure, once you think about it, it’s obviously wrong, but it almost sounds right if you’re just letting the words flow. This is not an excuse!—I’m sure that whoever wrote that brief is really embarrassed right now—just an attempt at understanding.
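To see how far apart “reducing risk of death by half” and “doubling the survival rate” are, here is a small sketch. The function name and the list of baseline mortality values are my own choices for illustration; only the 0.894% figure comes from the numbers quoted above.

```python
def survival_ratio_when_mortality_halves(mortality):
    """Return new survival / old survival if the death risk is cut in half."""
    if not 0 <= mortality < 1:
        raise ValueError("mortality must be in [0, 1)")
    return (1 - mortality / 2) / (1 - mortality)

# Illustrative baselines; 0.00894 is the 894-per-100k figure quoted above.
for m in (0.00894, 0.10, 0.50, 2 / 3, 0.90):
    ratio = survival_ratio_when_mortality_halves(m)
    print(f"baseline mortality {m:7.2%}: survival multiplied by {ratio:.3f}")
```

Setting (1 - m/2) = 2(1 - m) gives m = 2/3, so halving mortality doubles survival only when two-thirds of the group would otherwise die. At a baseline mortality under 1%, the same halving moves survival by a fraction of a percent.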
Evaluating quantitative evidence is hard! A couple posts from the archive brought up errors from Potter Stewart and Antonin Scalia. I’ll do my small part in all of this by referring to these people as judges, not “Justices.”


My experience is limited to regulatory proceedings (administrative law judges and public utility commissioners – no Supreme Court experience). Too often, the decision that is reached contains zero explanation and rarely refers to any of the expert testimony. So, to see a judge misinterpret or repeat erroneous claims almost looks like an improvement over the usual silence that I’ve seen. At least they are pretending to care about the evidence. It’s a start.
Also, the judge conflates association and causation.
This strikes me as a general issue that people are really bad at using plain language to summarize differences in rates/proportions.
To put things in a less contentious (if more boring) context, imagine a “memory drug” that is tested by having people read a list of 10 words and then, afterwards, people are asked to recall as many of those 10 words as they can. Say that people in the control group recall, on average, 3 words while people in the treatment group recall 6 words. I think it would be reasonable to verbally summarize those results by saying, “people in the treatment group on average remembered twice as many words from the list as people in the control group.”
But results like this have often been verbally summarized using phrases like, “the memory drug doubles your memory capacity” or “we observed a 100% improvement in memory from taking the drug” or “you are twice as likely to remember things after taking the drug”. Even worse, I could imagine someone saying something like, “we observed a 200% improvement in memory!” (conflating the idea of doubling with a lack of understanding of the reference point). I think most of those statements are not only misleading, but are ultimately nonsense that doesn’t really hold up when you think about it. I guess “twice as likely” is okay as kind of a shorthand for an odds ratio, though it still suffers from conflating “memory” with the way it was measured (number of words recalled from a list).
So I guess that leaves me with a question of why these innumerate verbal descriptions are so common? Personally, I thought your more correct rephrasing of the study results still conveyed the core idea in a clear and direct manner. Do phrases like “twice as much” or “doubles” just sound better to most people?
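For what it's worth, the memory-drug numbers above can be summarized in several ways, and writing them out shows where the “200%” slip comes from. This is just a sketch using the hypothetical 3-versus-6-words example; the variable names are mine.

```python
# Hypothetical memory-drug example from the comment above.
list_length = 10
control_mean = 3     # words recalled on average, control group
treatment_mean = 6   # words recalled on average, treatment group

ratio = treatment_mean / control_mean                                    # 2.0
percent_increase = 100 * (treatment_mean - control_mean) / control_mean  # 100.0
absolute_gain = treatment_mean - control_mean                            # 3 words
share_control = control_mean / list_length                               # 0.3
share_treatment = treatment_mean / list_length                           # 0.6

print(f"treatment recalled {ratio:.1f} times as many words as control")
print(f"that is a {percent_increase:.0f}% increase, not a 200% increase")
print(f"absolute gain: {absolute_gain} words "
      f"({share_control:.0%} -> {share_treatment:.0%} of the list)")
```

“Twice as many” is the ratio (2.0) and “100% more” is the relative increase. The “200%” figure treats the treatment value as a percentage of the control value and then mislabels it as an improvement.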
Your explanation got me thinking more clearly, so thanks for that. I think the confusion here is that if people claim doubling of some metric, they are not explicitly stating (as in the Jackson and AAMC cases) what that doubling is relative to. That is the question I would ask these justices and physicians. I feel as though with a bit of dialogue they would understand their error.
“I guess “twice as likely” is okay as kind of a shorthand for an odds ratio…”
No, it’s not. You see things like this all the time, but it’s dead wrong. If the baseline probability is very low, then this is a good approximation. But if the baseline probability is high, it can be very misleading. If we are starting from a baseline probability of, say, 0.5, then the odds are 1. If some other condition provides an odds ratio of 2, then the odds are 2*1 = 2, so the corresponding probability is 0.67, to two decimal places. This is nowhere near twice 0.5. The larger the baseline probability the worse this discrepancy gets.
So, no, odds ratios should not be described as “# times as likely” unless the baseline probability is very close to zero.
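The calculation in the comment above is easy to reproduce. Here is a minimal sketch; the function name and the set of baseline probabilities other than 0.5 are my own choices for illustration.

```python
def apply_odds_ratio(p_baseline, odds_ratio):
    """Convert a baseline probability to the probability implied by an odds ratio."""
    if not 0 < p_baseline < 1:
        raise ValueError("p_baseline must be strictly between 0 and 1")
    odds = p_baseline / (1 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

# Same odds ratio of 2 applied at different baselines.
for p in (0.001, 0.1, 0.5, 0.9):
    p_new = apply_odds_ratio(p, 2)
    print(f"baseline {p:.3f} -> {p_new:.3f} (probability ratio {p_new / p:.2f})")
```

With a baseline of 0.5 the new probability is 0.67, as stated above. The probability ratio only approaches 2 when the baseline is close to zero.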
Diversity hire.
Bob:
The cited statistical errors came from three different judges—Ketanji Brown Jackson, Potter Stewart, and Antonin Scalia—a diverse set indeed.
“A study from Florida found Black infant mortality rates to be half as high with Black physicians than with White physicians.”
But this would also be false. That’s not what the study found.
Ted:
I was getting that claim from the quote given above from Stover:
But I didn’t look at this calculation carefully. Are we missing something?
The study purports to measure the variables associated with the *difference* between the black infant mortality rate and the white infant mortality rate. The authors contend that one of their models shows that having a black doctor halves the difference between black mortality and white mortality. But that’s not halving the mortality rate, even if the “race of the doctor” variable was a causal factor rather than something confounded by the fact that a high-risk delivery doesn’t have a single attending doctor, but is handled by a large team.
The study data explicitly shows the mortality rate is cut in half. See the summary statistics in the supplemental materials.
Your comment seems to focus on the difference between the “mortality penalty” and “mortality rate”, and fair enough, the model is actually about the former, but it is easy enough to convert to the latter.
Again, I could be misinterpreting something though. If that is the case, I’d like to know.
The study shows the raw-count number is half as high, but ceteris paribus, the reduction attributable to the race of the assigned doctor is lower because of confounding effects (patients assigned doctors who are white have more comorbidities) and the most the paper authors will claim is that the putative race penalty for infant mortality is reduced by half. See “the relevant table.” The model used was column 4. https://x.com/tedfrank/status/1677673115348770817?s=61&t=
Your overall point seems to be about the fact that the model coefficients represent “mortality penalty” and not “mortality rate”. Can mortality rates not be computed from the model in column 4? Or can they be computed, but they aren’t meaningful? Or something else?
I’m not arguing about the ultimate cause of the difference in mortality (e.g. whether it is purely about health status, or economics, or whether there is really something about race or racism involved). My original question was simply about model interpretation and how it was portrayed in legal proceedings.
Your point about there being other individuals involved in the delivery is a good one. There are presumably many other confounding factors not considered in the paper.
Here’s the relevant quote from the study: “Concordance … more than halves the penalty experienced by Black newborns.” p 21195 about halfway down last paragraph.
This quote is about halving the mortality penalty instead of the actual mortality rate, but I think the statement about halving the mortality rate is still indicated by the model coefficients and how they are interpreted in the study. Please let me know if I am misunderstanding something.
Again, I’d be curious to just see the statistics from the raw dataset. It feels strange to use model coefficients here.
There could be a lot of reasons for Black infant mortality being higher. I would be surprised if the race of the obstetrician is even in the top ten. I have not looked at this study, but it just compares Blacks and Whites without looking at other ethnic groups. I am extremely skeptical about using a study like this to inform public policy.
> A moment’s thought should be enough to realize that this claim is wildly implausible. . . . the actual survival rate is over 99%
I disagree that it’s so obvious that this is wrong. (I understand that it turned out to be wrong, just disagreeing that it should have been obvious.) The study was on “high-risk” babies. I’m sure that if you graph risk level over survival rate, you will find a cohort of very high-risk babies for whom the survival rate is (tragically) <50%, or even <10%, such that its doubling could be possible or even plausible. Unless there is some standard, common obstetric definition of "high-risk babies".
The UK’s Royal Statistical Society has desperately been trying to get UK courts to take errors in statistical interpretation more seriously, with not much success. They wrote a report about medical misconduct cases a couple of years ago: https://rss.org.uk/RSS/media/File-library/News/2022/Report_Healthcare_serial_killer_or_coincidence_statistical_issues_in_investigation_of_suspected_medical_misconduct_Sept_2022_FINAL.pdf
Perhaps a bit out of order, but I think this is part of this theme. Yesterday, another Supreme Court decision involved statistical reasoning, but without any data (https://www.supremecourt.gov/opinions/24pdf/25a169_5h25.pdf, especially beginning in the dissent at page 10). Some of the argument appears to depend on the likelihood that a Hispanic person stopped at a low-wage employment site might be here illegally. The Kavanaugh concurring opinion suggests that these factors are likely to involve illegal aliens, while the dissent talks about the potentially large number of innocent people that can be targeted by such stops. I think there is a difference of opinion regarding base rates, yet neither side uses any data regarding base rates. I believe it is undeniable that a significant number of illegal aliens work at such jobs and are of Hispanic heritage and I also believe that a large number of Hispanics working at such occupations are innocent. Given that both groups exist, I don’t see how this issue can be decided on grounds of constitutional rights alone – base rates should matter. But I see no evidence regarding these rates.