If a value is “less than 10%”, you can bet it’s not 0.1%. Usually.

Posted on May 11, 2021 1:39 PM by Phil

This post is by Phil Price, not Andrew.

Many years ago I saw an ad for a running shoe (maybe it was Reebok?) that said something like “At the New York Marathon, three of the five fastest runners were wearing our shoes.” I’m sure I’m not the first or last person to have realized that there’s more information there than it seems at first. For one thing, you can be sure that one of those three runners finished fifth: otherwise the ad would have said “three of the four fastest.” Also, it seems almost certain that the two fastest runners were not wearing the shoes, and indeed it probably wasn’t 1-3 or 2-3 either: “The two fastest” and “two of the three fastest” both seem better than “three of the top five.” The principle here is that if you’re trying to make the result sound as impressive as possible, an unintended consequence is that you’re revealing the upper limit. Maybe Andrew can give this principle a clever name and add it to the lexicon. (If it isn’t already in there: I didn’t have the patience to read through them all. I’m a busy man!)
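
The inference in the shoe ad can be checked by brute force. The sketch below is only an illustration: it assumes the advertiser considers claims of the form "k of the n fastest" with n no larger than 5, and that a claim with a higher ratio k/n sounds more impressive (ties going to the larger k). Neither assumption comes from the ad itself, and the function names are made up for this example.

```python
from fractions import Fraction
from itertools import combinations


def best_claim(wearers, max_n=5):
    """Return the (k, n) claim 'k of the n fastest' that sounds most impressive.

    Assumption: impressiveness is the ratio k/n, with ties broken by larger k.
    """
    best_key, best = None, None
    for n in range(1, max_n + 1):
        k = sum(1 for place in wearers if place <= n)
        if k == 0:
            continue  # nothing to brag about among the top n
        key = (Fraction(k, n), k)
        if best_key is None or key > best_key:
            best_key, best = key, (k, n)
    return best


def consistent_finishes(claim, max_n=5):
    """List every set of finishing places for which `claim` is the best claim."""
    places = range(1, max_n + 1)
    return [
        wearers
        for size in places
        for wearers in combinations(places, size)
        if best_claim(wearers, max_n) == claim
    ]


print(consistent_finishes((3, 5)))
```

Under these assumptions, only two sets of finishing places are consistent with the ad: 2nd, 4th, and 5th, or 3rd, 4th, and 5th. A different notion of "impressive" would give a different list.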

This came to mind recently because this usually-reliable principle has been violated in spectacular fashion by the Centers for Disease Control (CDC), as pointed out in a New York Times article by David Leonhardt. The key quote from the CDC press conference is “DR. WALENSKY: … There’s increasing data that suggests that most of transmission is happening indoors rather than outdoors; less than 10 percent of documented transmission, in many studies, have occurred outdoors.” Less than 10%…as Leonhardt points out, that is true but extremely misleading. Leonhardt says, “That benchmark ‘seems to be a huge exaggeration,’ as Dr. Muge Cevik, a virologist at the University of St. Andrews, said. In truth, the share of transmission that has occurred outdoors seems to be below 1 percent and may be below 0.1 percent, multiple epidemiologists told me. The rare outdoor transmission that has happened almost all seems to have involved crowded places or close conversation.”

This doesn’t necessarily violate the Reebok principle because it’s not clear what the CDC was trying to achieve. With the running shoes, the ad was trying to make Reeboks seem as performance-boosting as possible, but what was the CDC trying to do? Once they decided to give a number that is almost completely divorced from the data, why not go all the way? They could say “less than 30% of the documented transmissions have occurred outdoors”, or “less than 50%”, or anything they want…it’s all true!

79 thoughts on “If a value is “less than 10%”, you can bet it’s not 0.1%. Usually.”

Andrew on May 11, 2021 1:59 PM at 1:59 pm said:

Phil:

When he’s not dissing Netflix and Tesla, Mark Palko at his blog has been writing a lot about public health authorities having a bias toward alarmism. This seems to fall into that category.

I’m reminded of our work on radon. As we’ve discussed, radon-remediation companies had an incentive to hype the radon threat, but, in the other direction, radon is a natural pollutant and so there’s no natural pro-radon lobby. The EPA has a bias toward overstating environmental threats, which generally makes sense as it has to do battle with polluters, but for the case of radon they were pushing at an open door. My point is just that different organizations play different roles in the conversation. Sometimes it’s kinda the CDC’s job to scare people, but then it can be hard for them to recalibrate when everybody’s scared already. I remember last year when the cops took away the basketball hoops, and then of course there are the people on the street that were screaming at people for not wearing masks.

- Joshua on May 11, 2021 6:24 PM at 6:24 pm said:
  
  There’s a good argument to be made that (politically appointed) public health officials actually have a bias towards inaction. Michael Lewis makes that argument here:
  
  https://www.nytimes.com/2021/05/11/opinion/ezra-klein-podcast-michael-lewis.html
  
- Ben on May 11, 2021 7:12 PM at 7:12 pm said:
  
  > has been writing a lot about public health authorities having a bias toward alarmism
  
  I thought that blog was more about news outlets being alarmist? Like this: https://observationalepidemiology.blogspot.com/2021/05/perhaps-cultivating-learned.html .
  
  The data in the link you included is from CDC/WHO, but the conclusion from that is that more infectious stuff is contained so I don’t think that’s alarmist.
  
  > The EPA has a bias toward overstating environmental threats
  
  Was this the case for the radon stuff or are you thinking of something else specifically?
  
  - Andrew on May 11, 2021 7:18 PM at 7:18 pm said:
    
    Ben:
    
    I was thinking of the radon in particular, but more generally I think it’s part of the EPA’s “job” to emphasize environmental risks, as it’s recognized that they operate in a world in which there are lots of polluters trying to downplay these risks.
    
Jeff on May 11, 2021 2:40 PM at 2:40 pm said:

If your principle needs a name, I propose “7 out of 13 dentists agree.”

- Adede on May 11, 2021 2:47 PM at 2:47 pm said:
  
  I feel like Reebok Principle works just as well, and is shorter.
  
  - Martha (Smith) on May 11, 2021 6:03 PM at 6:03 pm said:
    
    Which raises the question: What proportion of dentists wear Reeboks? ;~)
    
- Joshua on May 11, 2021 6:21 PM at 6:21 pm said:
  
  Yeah, that’s what I thought of also – I still remember my parents laughing about this:
  
  > “Four out of five dentists surveyed recommend sugarless gum for their patients who chew gum.”
  
  Could it be the “Fifth Dentist” fallacy?
  
Mathijs Janssen on May 11, 2021 2:49 PM at 2:49 pm said:

The principle is known in economic theory as “unraveling”. It goes back to Milgrom (the recent Nobel prize winner) and Grossman. The type of game where you have to tell the truth (no false advertising), but not necessarily the whole truth, is called a disclosure game. The unraveling principle says that, under certain quite natural conditions, you will have the most negative belief possible consistent with a disclosed statement. E.g., if I say “I get it right at least 80% of the time” you should believe I get it right exactly 80% of the time (since lying is not allowed). It is known as unraveling, because the person making the statement cannot effectively exploit the “not the whole truth” part of the restriction and in the end it is as if he just told you everything.
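
A toy version of the disclosure game shows why the sender gains nothing from vagueness. This is a minimal sketch, not Milgrom's or Grossman's formal model: the integer percentages, the "at least m" message format, and the function names are all assumptions made for illustration.

```python
def skeptical_belief(message):
    """Receiver assumes the worst value consistent with the claim 'at least message'."""
    return message


def best_message(true_value, belief=skeptical_belief):
    """Pick the truthful claim 'at least m' (m <= true_value) the receiver rates highest."""
    truthful_claims = range(0, true_value + 1)  # lying (m > true_value) is not allowed
    return max(truthful_claims, key=belief)


for true_value in (50, 80, 95):
    print(true_value, best_message(true_value))
```

Against a skeptical receiver, each sender's best truthful claim is its exact value, so the claim ends up revealing everything.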

- Min on May 11, 2021 6:48 PM at 6:48 pm said:
  
  But if 80% of the time is based upon getting it right in 4 out of 5 total cases, it is misleading.
  
  - Michael Nelson on May 12, 2021 1:47 AM at 1:47 am said:
    
    “70% of the time, it works every time” (from the movie Anchorman, describing the effectiveness of an obnoxious cologne for attracting women). It was obviously intended as a joke, but I remember watching that scene and thinking, “There actually is a meaningful difference between something succeeding in 7 cases out of 10 but failing in the other 3, and something being partially successful 10 cases out of 10, averaging 70% effectiveness.” (Yes, I’m that much of a nerd.) Anyway, I think that’s what you’re saying. But to Janssen’s point, you’re allowed to be misleading in the game, just not dishonest.
    
paul alper on May 11, 2021 3:22 PM at 3:22 pm said:

These appear in Leonhardt’s article:

“Saying that less than 10 percent of Covid transmission occurs outdoors is akin to saying that sharks attack fewer than 20,000 swimmers a year. (The actual worldwide number is around 150.) It’s both true and deceiving.”

“In one study, 95 of 10,926 worldwide instances of transmission are classified as outdoors; all 95 are from Singapore construction sites. In another study, four of 103 instances are classified as outdoors; again, all four are from Singapore construction sites.”

“All the while, the scientific evidence points to a conclusion that is much simpler than the C.D.C.’s message: Masks make a huge difference indoors and rarely matter outdoors.”

Of course, this article leads to total confusion because we were told that Trump outdoor rallies and Rose Garden gatherings did cause a lot of infections. Are we to conclude that in the overall scheme of things, not so much? A trivial bump up?

- Daniel Lakeland on May 11, 2021 3:38 PM at 3:38 pm said:
  
  Maybe most of the cases at the outdoor Trump events were caused by people standing crowded in line for the bathroom etc.
  
- Michael J on May 11, 2021 3:41 PM at 3:41 pm said:
  
  Idk about the Trump outdoor rallies but the Rose Garden thing started outdoors but ended indoors. See e.g. here https://www.nytimes.com/interactive/2020/10/03/us/rose-garden-event-covid.html
  
  And that’s part of the point of the article I think – that many events that are called outdoors also have an indoor component.
  
- JDK on May 11, 2021 4:02 PM at 4:02 pm said:
  
  “The rare outdoor transmission that has happened almost all seems to have involved crowded places or close conversation.” from the NYT article.
  It makes sense that crowded rallies and lots of huggy outdoors activities would not be as safe as more solitary outdoor activities. Nuance counts and placing risk in context would have been helpful as would have been an overall strategy that all this advice would fit into.
  Oh well. Next time maybe.
  
  - rm bloom on May 11, 2021 6:13 PM at 6:13 pm said:
    
    Yes. We’re still waiting to learn what might be found for the asking if the right questions are asked … of the mountains of case-histories which are accumulated. The county records showed clusters of employees at some large retail chains (and never at others … ). Were there clusters of *customers* correlated with these (or not?). Is “big-data” merely a slogan of the white-board-guru/pitchman racket; or can something concrete about differential risks be learned from the data on-hand? Maybe next time?
    
NJO on May 11, 2021 3:38 PM at 3:38 pm said:

Sometimes having accurate statistics doesn’t matter. If you can make a Fermi estimate that 10% or less of infections occur outside, and do a cost/benefit comparison of outdoor activity restriction, you may find that even under very generous estimates, the outdoor activity restriction doesn’t make sense. And if the restrictions don’t make sense at 10%, they won’t make sense under 1% or 0.1% either.

rm bloom on May 11, 2021 3:49 PM at 3:49 pm said:

Andrew, I don’t think it is a bias toward alarmism; not this “usage”. This rhetorical tic comes under the head of the weasel-word “may” which has become ubiquitous I have noticed in medical communication intended for the public. As in “You may wish to talk to your health-care provider”. The thing is a species of euphemism, or baby-talk. I think it is propagated by so-called psychologists working in the communications departments of various organs. The controlling assumptions are: [1] Do *not* alarm your precious audience; [2] Treat them with the trappings of respect, as you would when you try to coax a child to eat his supper; [3] Never, ever, no matter how doubtful you may be at heart, never let slip that the judgement follows from limited understanding: the rhetorical trick of “you may wish” then is the coin of an interesting transaction — the discomfort felt by the authority when he must speak from a position of diffidence is exchanged, is fobbed-off upon the “client” who is flattered, who is “empowered”, since he “may” “choose” to “consult” his “health-care provider” (if he sees fit).

- rm bloom on May 11, 2021 3:54 PM at 3:54 pm said:
  
  In other words, the “10%” is the way they cover their diffidence in respect to the matter; they are constitutionally unable to say “we do not know”. So they say, “you, dear and honored public, we present you with the upper limit; you are empowered to draw your own conclusions; and we can pretend we speak ex-cathedra, as we dress up our diffidence in the language of (fake) precision”.
  
  - Phil on May 11, 2021 5:17 PM at 5:17 pm said:
    
    But 10% isn’t the ‘upper limit’.
    
    - Daniel Lakeland on May 11, 2021 5:54 PM at 5:54 pm said:
      
      In the language of Bayesian stats, 10% is probably like a 99.9999 percentile point.
      
      From a Bayesian decision theory standpoint, people in public relations roles in public health don’t want to be contradicted, so by using a super-conservative estimate they don’t have to say “we previously said it was about 0.1% but the latest studies show it was probably about 0.37%” which will then be used politically by opponents of restrictions to say things like “The CDC still can’t figure out the rate of outdoor spreading to within a factor of 3, why should you listen to them about anything?”
    - Anonymous on May 11, 2021 6:02 PM at 6:02 pm said:
      
      “The CDC still can’t figure out the rate of outdoor spreading to within a factor of 3, why should you listen to them about anything?”
      
      Which actually seems to be true! :) They also apparently don’t want to admit to the primary mode of transmission, even though many epidemiologists seem to think it’s primarily aerosol.
      
      Overall it’s hard to point to anything the CDC did during this pandemic that should make them more respectable, although it’s not really clear just yet how much of that can be attributed to the previous admin rather than the CDC itself.
    - rm bloom on May 11, 2021 6:32 PM at 6:32 pm said:
      
      I recall CDC/Gerberding’s daily briefings in spring 2003, during the first SARS affair. Very respectable and constructive of morale; neither underplaying nor overplaying the threat.
    - jim on May 11, 2021 8:30 PM at 8:30 pm said:
      
      “I recall CDC/Gerberding’s daily briefings in spring 2003, during the first SARS affair. Very respectable…”
      
      I don’t recall that but during the zika thing they handled it really well, highly respectable. But today is a different time.
    - Navigator on May 12, 2021 9:57 PM at 9:57 pm said:
      
      Yes, I still can’t believe CDC are sticking to droplet/aerosol false dichotomy. Even worse, virologists like Racaniello are parroting it.
      
      Osterholm is of course correct and mentions the right people who decide on those matters: air biologists, occupational hazard specialists, fluid dynamic physicists, etc.
      
      Do they all really think this disease spread all over the world by people spitting and sneezing on each other?
      
      https://jamanetwork.com/journals/jama/fullarticle/2763852
    - jim on May 14, 2021 9:01 AM at 9:01 am said:
      
      Great paper thanks! Exactly how I imagined it.
      
      Incidentally, I wanted to see the relative size of a virus compared to other stuff, this image is pretty cool:
      
      https://www.visualcapitalist.com/visualizing-relative-size-of-particles/
    - rm bloom on May 11, 2021 6:06 PM at 6:06 pm said:
      
      I presume that “10%” is — in their view — *an* upper-limit. The gist of what I observe (and like many things the more I try to clarify what is meant, the murkier it seems to become) is that this is a phenomenon of diffidence being dressed up in clothes of false-precision.
- Martha (Smith) on May 11, 2021 6:10 PM at 6:10 pm said:
  
  rm bloom said, “I think it is propagated by so-called psychologists working in the communications departments of various organs.”
  
  Like the liver, pancreas, spleen, … ?
  
  - rm bloom on May 11, 2021 6:16 PM at 6:16 pm said:
    
    Especially the spleen!
    
    - Martha (Smith) on May 11, 2021 10:02 PM at 10:02 pm said:
      
      But the real question is: How well is the communications department of the brain working? ;~)
paul alper on May 11, 2021 3:58 PM at 3:58 pm said:

Back in the Bronx in the 1940s, a true but misleading statistic that fooled no one: “Me and John D. Rockefeller have a lot of money.”

- Phil on May 12, 2021 2:13 PM at 2:13 pm said:
  
  A quick internet search didn’t turn up the anecdote I was looking for, so I’m going to repeat it here with a guess at the name (and score) of one of the players:
  
  After Wilt Chamberlain scored 100 points in a single pro basketball game, back in 1962, he and the rest of the team were driving back to Philadelphia in private cars. Wilt was in the back seat with reserve York Larese. At some point Larese turned to Wilt and said “What a night, Wilt, what a night! 109 points between us!”
  
  - Howard Edwards on May 12, 2021 5:08 PM at 5:08 pm said:
    
    “Let that be a lesson to you all. Nobody beats Vitas Gerulaitis 17 times in a row.”
    
    https://www.atptour.com/en/news/vitas-gerulaitis-famous-quote-feature-january-2020
    
  - John Richters on May 13, 2021 9:25 AM at 9:25 am said:
    
    Phil:
    
    Maybe this (from https://kuc.org/sermon-archive/rockys-little-brother/)?
    
    “Anyway, on November 15, 1960, Elgin Baylor scored 71 points against the Knicks. Elgin Baylor’s teammate Rod Hundley scored two. As Elgin Baylor and Rod Hundley get into a taxi outside Madison Square Garden after the game, Rod Hundley turns to Elgin Baylor and says, What a night we had, buddy! Seventy-three points between us!”
    
    - Phil on May 13, 2021 12:20 PM at 12:20 pm said:
      
      Ha! That’s it! I didn’t have the team or the players right. Or the scores or the year. And yet, I’m going to claim I had the anecdote right.
      
      Thanks.
    - John Richters on May 13, 2021 1:10 PM at 1:10 pm said:
      
      Phil:
      
      You did indeed. And it’s a great anecdote!
      
      John
Dzhaughn on May 11, 2021 5:03 PM at 5:03 pm said:

Pres. Joe had a variation when he said:

“Most people don’t know it: you walk into a store to buy a gun, you have a background check, but if you go to a gun show you can buy whatever you want, no background check.”

It’s true! One can’t really “know” something that is false.

- Phil on May 11, 2021 5:15 PM at 5:15 pm said:
  
  Sorry, I don’t get it.
  
  - Daniel Lakeland on May 11, 2021 6:47 PM at 6:47 pm said:
    
    It depends on what state you’re in and who you’re buying from. Here in CA for example you can’t buy a gun at a gun show without a background check. Federal dealers also must do a background check.
    
    This article kinda explains the issue:
    
    https://www.statesman.com/story/news/politics/politifact/2021/04/09/gun-policy-address-joe-biden-exaggerates-background-check-gun-show/7156770002/
    
    We could break the logic down with a less politicized topic:
    
    A lot of people don’t know it: all fish can climb trees
    
    Well, it turns out that there are a few fish that can climb trees, and they lay their eggs out of the water. It’s a very small number of species. So for the most part “all fish can climb trees” is a false statement.
    
    However, the overall statement is that “a lot of people don’t know that all fish can climb trees”. That’s a true statement, since a lot of people instead “know” that the statement **isn’t true**, and therefore they “don’t know” that it **is true**.
    
    Fairly tortured interpretation.
    
    - Phil on May 11, 2021 9:18 PM at 9:18 pm said:
      
      Ah, I knew little about the “gun show loophole” except that there supposedly is such a thing. Wikipedia says that in 22 states (and DC) background checks are required, but that in the rest private sales, including those at gun shows, do not require a background check. I didn’t look at the list of states and don’t know which ones are and aren’t included, but even if the 28 least-populous states are the ones with no checks, that’s still a whole lot of people who can go to a gun show and buy “whatever they want” (as long as it is for sale by someone who is not a registered firearm dealer) without a background check.
      
      So, yeah, it’s wrong, but it seems like a funny one to bring up. Biden has probably said 20 things that are totally wrong. This one is kinda half right.
- Joshua on May 11, 2021 6:33 PM at 6:33 pm said:
  
  Pres. Donny had a slew of ’em. On the order of:
  
  “A lot of people are saying Covid will magically go away in April.”
  
Fred on May 11, 2021 5:28 PM at 5:28 pm said:

Let me play a bit of devil’s advocate here.

CDC understands that there is a massive difference between
A. “X% of documented transmission, in many studies, have occurred outdoors”
and
B. “X% of transmission have occurred outdoors”
but knows that the majority of people, even some people on this blog, who read A will think B.

I hope I do not need to lay out all the reasons why A and B are not equivalent.
(As a matter of fact, if A is true, then B is extremely unlikely to be true.)

- Phil on May 11, 2021 5:52 PM at 5:52 pm said:
  
  If you think the fraction of people who have gotten COVID through outdoor transmission is around 0.1%, but you want to make sure you’re being ‘conservative’ so you decide to tell people ‘outdoor transmission is responsible for fewer than 1% of cases’, that’s kind of OK with me. I’d say they could probably have gone with 0.5%, but I wouldn’t squawk about 1%.
  
  But to give a number that is more like a HUNDRED times the actual value, or, to put it another way, is at least 20x the highest value that could possibly be defended, that is going too far.
  
  - Fred on May 11, 2021 11:44 PM at 11:44 pm said:
    
    >If you think the fraction of people who have gotten COVID through outdoor transmission is around 0.1%, but you want to make sure you’re being ‘conservative’ so you decide to tell people ‘outdoor transmission is responsible for fewer than 1% of cases’,
    
    The fundamental disagreement we have is that you seem to think
    “0.1% of documented transmission have occurred outdoors” means it is safe to assume a 1% (or 0.5%) upper bound
    and I think that is being overconfident from biased data.
    
    I’m sure we can agree
    P(traced to the source | indoor transmission) > P(traced to the source | outdoor transmission)
    but of course, the question is by how much?
    And this is where I cannot help but shrug my shoulders.
    To be clear, I would not be surprised if the difference is 10-fold or larger in some places.
    
    Just to clarify, I do agree with the conclusion that outdoor is much safer and outdoor transmissions are relatively rare, but that belief mostly comes from the mode of transmission, not through incomplete tracking data from different countries with different disease patterns.
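    
    The gap between documented and actual shares can be written out. This is a sketch under assumed notation (none of these symbols appear in the studies): let q be the true share of transmissions that occur outdoors, d the documented share, and r ≥ 1 the ratio P(traced to the source | indoor transmission) / P(traced to the source | outdoor transmission).
    
    ```latex
    d = \frac{q}{q + (1 - q)\,r}
    \quad\Longrightarrow\quad
    q = \frac{d\,r}{1 - d + d\,r}
    ```
    
    With d = 0.1%, a 10-fold tracing difference (r = 10) gives q of about 1%, and a 100-fold difference (r = 100) gives q of about 9%.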
    
    - Phil on May 12, 2021 2:15 PM at 2:15 pm said:
      
      Yes, if you think it’s plausible that more than a few percent of transmissions have occurred outdoors then we disagree.
    - fogpine on May 12, 2021 3:10 PM at 3:10 pm said:
      
      +1 to Fred’s comments.
      
      Here’s some evidence for Fred’s points, from a study of 1 million SARS-CoV-2-positive persons and tracing of 2.4 million of their contacts (1).
      
      * Of contacts who got infected, only 0.1% had their interaction with the index case outdoors. Without careful thought, this would seem to show that outdoor risks are negligible, consistent with what many are arguing.
      
      * Yet, only 1.8% of all the traced interactions happened outdoors, which suggests that outdoor interactions were greatly underascertained and/or much rarer than indoor interactions.
      
      * For the outdoor interactions that did get traced, 3.8% of the contacts were SARS-CoV-2-positive in the next 10 days. To me, 3.8% does not indicate negligible risk! For comparison, for the “household visitor” interactions that were traced, 8.0% were associated with infection.
      
      The above numbers are from Table 1 of the study and the very nice Figure 2.
      
      What is the actual risk ratio for outdoor vs. indoor contact of the same durations? Presumably outdoor is substantially safer than indoor, but I haven’t seen any good estimate of how much. There are all sorts of biases that affect the available numbers, such as that sick people tend not to go out and that outdoor contacts are much more difficult to trace than, say, household contacts.
      
      (1) Lee et al. SARS-CoV-2 infectivity by viral load, S gene variants and demographic factors and the utility of lateral flow devices to prevent transmission. medRxiv preprint [not peer reviewed]. doi:10.1101/2021.03.31.21254687
    - jim on May 12, 2021 8:11 PM at 8:11 pm said:
      
      Fogpine said:
      
      “There are all sorts of biases that affect the available numbers, such as that sick people tend not to go out and that outdoor contacts are much more difficult to trace than, say, household contacts.”
      
      I don’t know why you think either of these things would be true. First it’s firmly established that you don’t have to have symptoms to transmit COVID, and there are several documented “superspreader” events where the vector wasn’t ill or wasn’t seriously ill. Second, why would outdoor contacts be harder to trace? Can’t see any reason that should be so. I can pass strangers in the supermarket, at Home Depot, or any other indoor venue just as easily as on the sidewalk or in the park.
      
      If you read the article:
      
      “There is not a single documented Covid infection anywhere in the world from casual outdoor interactions, such as walking past someone on a street or eating at a nearby table.”
      
      In the cited study, researchers “deemed almost any setting that was a mix of outdoors and indoors to be outdoors.”
      
      All construction sites are deemed outdoors. Many of the reported outdoor infections occurred in a tower under construction in Singapore – where the tower structure itself was completed before the pandemic.
      
      In studies where the definition of “outdoor” was carefully controlled, the infection rate was extremely low. Additionally, quoting
    - fogpine on May 13, 2021 12:15 AM at 12:15 am said:
      
      jim:
      
      Thanks for your response.
      
      First, I should have said “people who are feeling sick tend not to go out”, rather than “sick people tend not to go out.” Second, it’s more difficult to trace outdoor contacts than household contacts because people do not know many of those they have contact with outside, while they do know those who they live with or who visit their homes. The supermarket example you give is in-between in terms of tracing ease because coworkers will know each other, but customers generally not.
      
      The “casual transmission almost doesn’t exist” arguments are kind of surprising to me. My base expectation is that actual transmission risk is approximately linear in duration of contact. So, all else equal, the average risk of transmission will be similar from a contact with 1 person of unknown infection status for 30 minutes or 300 people of unknown infection status for 6 seconds each. However, it will be much easier to trace-to-source any infection that occurs in the former case than in the latter case.
    - Daniel Lakeland on May 13, 2021 4:12 PM at 4:12 pm said:
      
      I don’t think this linearity assumption makes sense at all…
      
      First off let’s demonstrate that it can’t make sense in the limit of large exposure. The probability of infection can not increase above 1, so if p(t) is a function it must have an asymptote for large t.
      
      Next, from fluid mechanics, if a person exhales some virus laden air, it will take time to billow out away from their mouth and reach the other person. Suppose we’re talking about 1 second to travel 6 feet, then interactions below 1 second will add essentially zero risk.
      
      This suggests that the curve p(t) is a function which is zero at t=0, and has zero derivative, and at t=infinity it’s 1 and has a zero derivative. The question is what’s the shape in the range say t=1 to t=600 (in seconds, or in other words 0 to 10 mins)
      
      I’d guess that outdoors in still air at a distance of 6 feet, the function is pretty near zero out to on the order of 60 seconds, and goes up slowly after that. For outdoors in gentle breeze it could stay very near zero for the full 10 mins.
    - fogpine on May 13, 2021 5:33 PM at 5:33 pm said:
      
      Daniel:
      Thanks for your points. I see where you are coming from, but don’t really agree.
      
      I agree transmission risk can’t be approximately linear in contact duration for the limit nearing certain transmission, but we are far from that limit in all realistic scenarios. For example, even for spouses of index cases, total infection probability is only about 40% (ref Madewell et al’s JAMA household transmission meta-analysis). So for the large majority of contacts, we are in the domain where, if the true transmission function followed an exponential CDF in duration of contact, then it would be well approximated by a function that’s linear in duration of contact. (The assumption that an exponential CDF is approximately linear for p much less than 1 shows up in a lot of fields, so maybe you are familiar with it.)
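      
      The linear approximation mentioned here can be made explicit. As a sketch, assume the probability of transmission after contact time t is an exponential CDF with a constant hazard rate λ (an assumed symbol, not taken from any of the cited studies):
      
      ```latex
      p(t) = 1 - e^{-\lambda t}
           = \lambda t - \frac{(\lambda t)^2}{2} + \cdots
           \approx \lambda t
      \qquad \text{for } \lambda t \ll 1
      ```
      
      The linear form overstates p by a relative amount of roughly λt/2: about 3% when p = 0.05, but closer to 28% when p = 0.4.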
      
      In the limit of 0 duration contact, is the derivative 0 (S-curve, like you say) or a constant (more like exponential CDF)? In actual casual/short-duration exposures, people are moving into and then out of airspace the other person has already been breathing into. So there isn’t a delay before viral exposure in the way you suggest, meaning the derivative is above 0.
      
      In the contact tracing studies I’ve seen, the usual approach is to try to trace all contacts that exceed a minimum duration and are within a minimum distance. As far as I can tell, this tracing turns up far fewer COVID-19 transmissions than actually occur. For example, it turns up so few transmissions that, if many others were not happening, the disease would quickly die out on its own.
      
      The question is, “How do the untraced transmissions happen?” I suspect many are from very unlucky short-duration contacts (indoors or outdoors), since those quick contacts are infeasible to trace. I’m not confident here though, and there could be other factors that are more responsible, like people just not wanting to admit who they had contact with, restricting tracing.
    - Daniel Lakeland on May 13, 2021 6:50 PM at 6:50 pm said:
      
      For an indoor environment I could see how the derivative at t=0 might be positive as you move into a “fog” of small droplets. For realistic outdoor conditions, the wind velocity must be assumed to be above zero, and doing something like passing someone on a hiking trail or even a sidewalk must have a derivative much closer to zero.
      
      If we assume indoors might be 100x more dangerous than outdoors, this suggests someone who walks for 100 minutes outdoors then enters an elevator to go up to parking deck 3 for 1 minute might easily get more risk from the short elevator ride. Yet it will not loom large in their memory of the day being less than 1% of the time.
      
      I think lots of people do stupid stuff. I’ve personally heard stories of people getting a positive test and then going out on a bus to the store to stock up on food in case they start to feel sick etc.
      
      Outdoors is very likely vastly less risk. However that doesn’t mean going to an outdoor birthday party and standing 2 ft in front of a friend and stuffing chips in your mouth while discussing the latest political outrages is risk free. Proximity, duration, and masking will all affect risk.
      
      Still it seems more likely people get sick from going indoors to use the restroom than at outdoor parties with masks and 6ft+ distancing
    - Phil on May 12, 2021 9:30 PM at 9:30 pm said:
      
      fogpine,
      
      (1) You say “For the outdoor interactions that did get traced, 3.8% of the contacts were SARS-CoV-2-positive in the next 10 days. To me, 3.8% does not indicate negligible risk!” But the paper says the actual number was 2.9%, not 3.8%; 3.8% is the top of the 95% confidence interval.
      
      (2) I’m not sure where you get “negligible risk.” I’m unhappy with someone misleadingly saying a number is “less than 10%” when it is really more like 0.1%. Whether it is negligible has little or nothing to do with it.
      
      (3) The paper tells us 0.3% of contacts with a known case occurred outdoors, out of 3 million contacts. So that’s 0.003*3,000,000 = 9000 contacts. If 2.9% of those contacted people later tested positive, that’s 261 cases, let’s call it 300. So, 300 people who tested positive after outdoor contact with a case. The paper says that, in the entire population covered by the paper, there were 231,000 positives among the contacts. 300/231,000 = 0.0013 = 0.13%. (Actually this is considerably too high, due to the handling of contacts who never got tested; the actual number would be well under 0.1%). So, in the paper you are citing, the fraction of outdoor transmissions was at most 0.1%. The CDC guy says that number is “less than 10%.” While I agree that 0.1% is less than 10%, I also agree with Leonhardt that the “less than 10%” statement is extremely misleading.
      
      (4) A minor point but since I’m thinking about it: the paper doesn’t seem to give us the base rate, i.e. the infection rate among people not in the study. If x% of people in that region at that time were infected, then you need to subtract x percentage points from all of those infection rates if you want an estimate of the transmissions from the initial cases to the contacts. Maybe more, since there are network effects: Persons A, B, C, D, and E all interact with each other. Person A gets COVID, and persons D and E get tested, with person E testing positive. Did E get it from A? Did A get it from E? Did they both get it from C, who was never tested? The effect of this sort of thing is that if you look at the fraction of contacts who test positive, you get an overestimate (or at least upper limit) on the transmissions from case to contact.
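
      The arithmetic in point (3) can be checked with a short sketch. The constants are the rounded figures quoted in the comment, not values taken directly from the paper's tables, and the variable names are made up for illustration. The 3.8% variant uses the unconditional rate discussed elsewhere in this thread.

      ```python
      TRACED_CONTACTS = 3_000_000        # "3 million contacts" (rounded)
      OUTDOOR_FRACTION = 0.003           # 0.3% of traced contacts were outdoors
      TOTAL_POSITIVE_CONTACTS = 231_000  # positives among all traced contacts (rounded)

      outdoor_contacts = TRACED_CONTACTS * OUTDOOR_FRACTION

      def outdoor_share(positive_rate: float) -> float:
          """Share of all traced positives that followed an outdoor contact."""
          return outdoor_contacts * positive_rate / TOTAL_POSITIVE_CONTACTS

      share_rounded_up = 300 / TOTAL_POSITIVE_CONTACTS  # 261 rounded up to 300, as in the comment

      print(f"outdoor contacts:        {outdoor_contacts:.0f}")
      print(f"positives at 2.9%:       {outdoor_contacts * 0.029:.0f}")
      print(f"share, 300 cases:        {share_rounded_up:.2%}")
      print(f"share at 2.9%:           {outdoor_share(0.029):.2%}")
      print(f"share at 3.8%:           {outdoor_share(0.038):.2%}")
      ```

      Whichever positive rate is used, the outdoor share of traced positives stays between roughly 0.1% and 0.15%, about two orders of magnitude below 10%.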
    - fogpine on May 13, 2021 12:06 AM at 12:06 am said:
      
      Phil:
      
      Thanks, I appreciate the responses and the helpful numbering too.
      
      (1) If you add up the numbers in Table 1, the value is 3.8%. I agree it’s confusing to see a different estimate of 2.9%[2.3-3.8%] quoted in the article, but I think (?) 2.9% is the estimate conditional on the median Ct value of 20 while 3.8% is the unconditional estimate. That or there’s an error in the paper somewhere. (I didn’t selectively choose the upper 95% conf limit and pretend that was the point estimate, though I understand why you ended up suspecting that.)
      
      (2) Fair enough. Could close to 10% of transmissions have been outdoors in the US? It seems unlikely, but I don’t feel I can rule it out, and in that I disagree with you (and many other people). In particular, I don’t think I can rule it out because I suspect outdoor contact events are severely underascertained in studies.
      
      (3) Please note that it’s 0.3%* of –traced– contacts with a known case that occurred outdoors. When I see such a low proportion of contact events classified as happening outside, I suspect severe underascertainment. I don’t think it’s possible that only 0.3% of all contact events were outside, do you? It’s such a low percentage for a situation where people are avoiding household visits, restaurants are closed or largely closed to indoor dining, clubs and venues have similar restrictions, weddings happen outside, … (PS, The numbers you list are close to those from Table 1 in the paper, but not quite the same.)
      
      (4) Yeah, there are lots of gaps. However, I am confident the gaps mainly lead to transmission events being missed, rather than overestimated or misattributed through the transmission chains you suggest. My reasoning here is simple: There are 1,064,004 index cases, but only 231,489 contacts who tested positive. So on average, there are only 0.2 traced positives for each index case… yet clearly the index cases were infecting several-fold more people because a disease in which each case infects an average of only 0.2 others (R effective = 0.2) would rapidly die out. Again, this ties back into my belief that outdoor contacts are especially underascertained, which prevents me from ruling out the idea that close to 10% of transmissions could have happened outside.
      
      If you know of large contact studies where the ratio of “traced infected contacts” to “index cases” is close to the regional R effective, I’d be eager to see them. Usually when I look at contact studies, the number of traced infected contacts is much less than the number of index cases, even, suggesting bad underascertainment.
      
      *I had a math typo in my earlier post and wrote 1.8%, but it’s 0.3%. Sorry about that.
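
      The reasoning in point (4) can be sketched numerically. This assumes a very simple model in which every case infects the same average number of others in each generation; the function name and the starting count of 1,000 cases are hypothetical, while the two totals are the ones quoted above.

      ```python
      INDEX_CASES = 1_064_004
      TRACED_POSITIVE_CONTACTS = 231_489

      # Traced positives per index case: what R effective would be if tracing were complete.
      traced_per_index = TRACED_POSITIVE_CONTACTS / INDEX_CASES

      def expected_cases_by_generation(r_eff: float, initial_cases: float, generations: int) -> list:
          """Expected new cases in each generation when every case infects r_eff others."""
          return [initial_cases * r_eff ** g for g in range(generations + 1)]

      trajectory = expected_cases_by_generation(traced_per_index, 1000, 4)
      print(f"traced positives per index case: {traced_per_index:.2f}")
      print("expected cases per generation:", [round(c, 1) for c in trajectory])
      ```

      With about 0.2 new cases per case, 1,000 cases shrink to a handful within four generations. An epidemic that kept growing therefore implies that most transmissions were never traced.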
    - Phil on May 13, 2021 1:34 AM at 1:34 am said:
      
      fogpine,
      
      (1) I’ll take your word for the math, I didn’t notice “3.8%” when I skimmed through, so I searched for it and found it as the upper end of a confidence interval and assumed that’s where you got it. Anyway the difference between 2.9% and 3.8% isn’t enough to change this conversation much.
      
      (2) “Could close to 10% of transmissions have been outdoors in the US? It seems unlikely, but I don’t feel I can rule it out, and in that I disagree with you (and many other people). In particular, I don’t think I can rule it out because I suspect outdoor contact events are severely underascertained in studies.”
      
      I agree, I’m sure that outdoor contact events are severely under-ascertained, but then, indoor ones surely are too. (It must be much more common to be unwilling or unable to list everyone you’ve been in contact with, than to name people you weren’t.) I might agree that outdoor contacts are more likely to be missed than indoor, but the difference would have to be huge in order for 10% of the transmissions to have taken place outdoors, when most of the studies find numbers far below 1%.
      
      But even without the studies I would have been willing to bet on a very low outdoor transmission rate. (a) the average American spends more than 90% of his/her time indoors, so the risk per minute would have to be about equal indoors and outdoors in order for 10% of transmissions to have occurred outdoors. (b) There’s no way the risk per minute is the same outdoors as indoors for the average person, or anywhere near, because of much faster dilution outdoors and because it’s rare to spend several minutes outdoors in close proximity to a specific other person. (This might not be the case if stadiums had continued to operate as normal, since that’s an outdoor venue where transmission to nearby people could easily occur. But most or all of that got shut down). I used to work in the Airflow and Pollutant Transport Group at Lawrence Berkeley National Laboratory, where we studied tracer gas and aerosol transport indoors and outdoors, so this is one of the few pandemic-related issues at which I have some subject matter expertise.
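
      Points (a) and (b) amount to a time-weighted average, which can be sketched as below. The function name is made up, the 10% outdoor time share is the upper bound implied by "more than 90% indoors", and the 100x indoor-to-outdoor risk ratio is only a hypothetical figure used elsewhere in this thread, not a measured value.

      ```python
      def outdoor_share(outdoor_time_fraction: float, indoor_to_outdoor_risk_ratio: float) -> float:
          """Fraction of transmissions occurring outdoors, weighting time by per-minute risk."""
          outdoor = outdoor_time_fraction * 1.0  # outdoor risk per minute taken as the unit
          indoor = (1.0 - outdoor_time_fraction) * indoor_to_outdoor_risk_ratio
          return outdoor / (outdoor + indoor)

      for ratio in (1, 10, 100):
          print(f"indoor risk {ratio}x outdoor: {outdoor_share(0.10, ratio):.2%} of transmissions outdoors")
      ```

      Equal per-minute risk gives exactly 10% outdoors, and a 100x ratio gives about 0.1%. Any per-minute advantage for outdoor air pushes the outdoor share well below the 10% figure.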
    - fogpine on May 13, 2021 12:15 PM at 12:15 pm said:
      
      Phil,
      
      Thanks, your point (a) convinced me for the reasons you list in (b). So I now agree the proportion of US transmissions that happened outside is certainly below 10% by a substantial degree. I’ll add the minor point that a lot of transmissions still may have happened at outdoor seating restaurants and bars.
      
      As someone who works in epidemiology, it’s …irksome… that back-of-the-envelope reasoning from physics and engineering is often more convincing to me than large numbers of studies from my own fields. Your points here are an example. Value of masks and handwashing are other examples.
      
      If I try to put myself in your shoes, I think you may wonder why many epidemiologists and doctors don’t pay more attention to evidence from airflow studies. Maybe I have some insight here. I think a lot of credibility damage was done by prominent airflow studies that claimed risks of COVID-19 transmission while flying were very low. For those used to the epidemiological literature, these studies were difficult to believe because their claims contrast with both research and personal experience. For example, reports of in-flight transmissions during the original SARS pandemic, a nice “natural experiment” study by Zitter et al. (1), and simple anecdotal experience that colds are common after flying.
      
      (1) Zitter et al. Aircraft Cabin Air Recirculation and Symptoms of the Common Cold. JAMA 2002;288(4):483-6. doi:10.1001/jama.288.4.483
    - Daniel Lakeland on May 13, 2021 7:06 PM at 7:06 pm said:
      
      Back of the envelope calcs are such a good tool, so it should be the case that they are maybe more convincing than most people might expect.
      
      As for airplanes, it seems to me anyone suggesting they are safe due to the air handling is implicitly saying that 100% of transmission is airborne at a distance. But we all know that sitting next to a sick person you’ll get their direct exhalations and spittle particles on you. So it’s like a self-serving flip-flop from the usual “most transmission via larger particles.”
      
      It’d be a fun project to do a cross disciplinary study with epidemiology and engineering types.
    - jim on May 12, 2021 7:44 PM at 7:44 pm said:
      
      “P(traced to the source | indoor transmission) > P(traced to the source | outdoor transmission)”
      
      I have no idea why that would be so. What’s your rationale for that view?
    - Phil on May 12, 2021 11:51 PM at 11:51 pm said:
      
      I don’t think that’s right at all, but I don’t think anyone is claiming it, either.
jim on May 11, 2021 5:30 PM at 5:30 pm said:

Phil!

I love your mode of reasoning. Excellent. *Much* more information is always available than what is directly communicated.

It’s interesting that the CDC now claims “less than 10%” of infections occur outdoors. Even that number is embarrassing for the CDC. The CDC still claims the primary mode of transmission is droplets from sneezing and spittle that travel less than six feet in their entire existence. They did recently acknowledge that aerosol transmission can occur but is “rare”. But now apparently we know that the primary mode of transmission – droplets – is at least 90% less effective outdoors, and possibly >>99.9% less effective outdoors. Surprising to say the least.

I suppose there are places in the outdoors subjected to 20mph updrafts that could carry spittle and sneeze drops away almost instantly.

Anonymous on May 11, 2021 5:30 PM at 5:30 pm said:

> give this principle a clever name

Whatever it is, it looks like a corollary of Grice’s maxims.

- [email protected] on May 12, 2021 5:59 PM at 5:59 pm said:
  
  Exactly! I was inspired to write a post about it on my moral accounting blog.
  
  “Three of the five fastest runners were wearing our shoes: using the rules of speech to squeeze more information out of fewer words.”
  
  The clever term would be “implicature.”
  
  - Robert Bloomfield on May 12, 2021 6:00 PM at 6:00 pm said:
    
    Somehow the link didn’t come through: https://blogs.cornell.edu/moralaccounting/2021/05/12/three-of-the-five-fastest-runners-were-wearing-our-shoes-using-the-rules-of-speech-to-squeeze-more-information-out-of-fewer-words/.
    
  - Phil on May 14, 2021 5:08 PM at 5:08 pm said:
    
    Even in the blogosphere, it’s considered good form to give the name of the person you are quoting, especially if you print a long excerpt!
    
    - Andrew on May 14, 2021 5:21 PM at 5:21 pm said:
      
      Phil:
      
      This attribution thing leads to some complicated issues. Bloomfield quoted you without using your name, just attributing it to “Andrew Gelman’s blog.” That’s annoying but it’s not that different from when someone quotes a New York Times article and says, “A NYT article says . . .” without mentioning the author’s name. That happens all the time! Or, “A paper in JAMA says . . .” Or “A Harvard study finds . . .”
      
      I’ve always been annoyed by this pattern of attribution to the institution rather than to the author, and so on this blog for many years I made it a point to always name the author. Instead of “The NYT says . . .” it would be “Susie Smith says . . .” or whatever. But then a few years ago I got slammed for criticizing authors by name. It always seemed fine to me to criticize authors by name—this is published work I’m writing about!—but enough people were bothered by this that it seemed like a distraction. So now I’ll sometimes purposely not give the name of someone I’m quoting, just to avoid this issue. I’ll link to it, and maybe name the publication or institution, but not name the author, although anyone can follow the link and it’s no secret.
    - Phil on May 14, 2021 7:33 PM at 7:33 pm said:
      
      Sure, I’m not all bent out of shape about it. But I did make the point of giving the name of the NYT writer whose work I was quoting, and I agree with you that I think that should be standard practice.
      
      Also, if someone says “A New York Times article says such-and-such”, the reader realizes they don’t know who actually wrote the quote; they know there is no person named “New York Times.” But when someone says “Andrew Gelman’s blog says…”, most readers will assume the quote is from you. You’ll get either credit or blame, could be good or bad, but either way it’s not fair.
      
      This is not a big deal to me but I would like to move (back?) to a world where the person who is credited with a quote is the one who actually said it!
    - Andrew on May 14, 2021 9:29 PM at 9:29 pm said:
      
      Phil:
      
      I agree.
- Venkat on May 18, 2021 10:55 AM at 10:55 am said:
  
  That was the first thing that occurred to me as well! – I wrote a [blog post](https://venkatasg.me/posts/reebok) about it as well. But there’s something curious about this implicature – Reebok would rather we don’t make it right? The form of the sentence in some way makes the implicature harder than say ‘I ate some (but not all) of the cake’.
  
  - Andrew on May 18, 2021 12:12 PM at 12:12 pm said:
    
    Venkat:
    
    Perhaps the best analogy is to prices such as $49.99. Everyone knows why prices are set in that way, but people still do it, I guess because it actually works.
    
Peter Dorman on May 11, 2021 5:58 PM at 5:58 pm said:

I thought a useful aspect of the Leonhardt article was his discussion of coding error in studies purporting to show substantial outdoor transmission. Cases were assigned to “outdoors” in a way that greatly inflated their number. On the other hand, I suspect some measurement error going in the opposite direction, insofar as indoor exposures may be easier to trace than outdoor ones other than mass events. Still, the coding error is probably of much greater weight.

Joshua on May 11, 2021 6:28 PM at 6:28 pm said:

Don’t know if this has been raised…

But the thought occurs to me that an important question would have to be how much of contact with potentially infected people takes place outdoors.

Just reverse engineering from saying only 1% of transmissions occurred outdoors doesn’t really fully explain the risk of interacting outdoors.

- anon on May 11, 2021 10:28 PM at 10:28 pm said:
  
  Yes, these indoor/outdoor comparisons are treated as the conditional risk but really are not. It’s analogous to saying “most shark attacks occur near the beach” so don’t worry about swimming in the deep ocean. Even worse, the data from contact tracing backing these studies has an obvious, extreme, sampling bias towards (non-anonymous) indoor interactions. Similarly, I’ve long thought an analogous sampling bias fuels claims of “super-spreading” as well.
  
  This is just yet more bad statistics and messaging from the CDC.
  
  - Phil on May 11, 2021 11:13 PM at 11:13 pm said:
    
    Joshua,
    You’re right that most people spend most of their time indoors so even if outdoors were as dangerous as indoors (per minute) only a small fraction of transmissions would be outdoors. But there’s lots of other evidence for outdoor transmission risk being low.
    
    anon, I agree it’s bad messaging, but I think in the opposite direction of what you think. When the Black Lives Matter protests last year failed to turn into major ‘spreader’ events, I took that as convincing empirical evidence that outdoor transmission was unlikely. There was already plenty of reason to believe that is the case, of course, since dilution outdoors is typically orders of magnitude faster than indoors: share an elevator with an infected person and within a couple of minutes you’re breathing air that is heavily contaminated and growing more so, but stand three feet from someone in even a gentle cross-wind and you breathe nearly nothing that they exhaled.
    
    - anon on May 12, 2021 1:48 AM at 1:48 am said:
      
      I actually completely agree the absolute risk outdoors is de minimis and the end advice to enjoy the outdoors unmasked is great. So pointing out the error in the CDC’s methods is an own goal of sorts. But bad math to support a good message still deserves criticism.
      
      One other obvious issue with the conditional risk claim: there’s a clear confounder that people outdoors are on average far healthier than those indoors. And overall health is strongly inversely correlated with transmission outcomes.
    - jim on May 12, 2021 11:26 AM at 11:26 am said:
      
      “people outdoors are on average far healthier than those indoors. ”
      
      I’d change “far” for “slightly”
      
      People of all colors, shapes, sizes and health conditions go to beaches, go camping, go hiking, to tourist sites, and in general party outdoors.
      
      At last year’s BLM protests, there were thousands of people gathered in massive tightly packed crowds that were hardly joggers and tennis players. It’s hard to imagine a more friendly environment for disease transmission – sans the fact that it was outdoors. Moreover, subgroups that have otherwise been impacted by COVID at much higher rates were disproportionately represented.
      
      We already know influenza is seasonal. Check out the dramatic swings in flu seasonality. The explanations for flu seasonality without aerosol transmission are amusingly contorted. But if one just accepts that aerosol transmission is the most common form of transmission for respiratory viruses in general, then flu seasonality is easily explained.
    - jim on May 12, 2021 11:27 AM at 11:27 am said:
      
      Let’s try that link like this:
      
      https://www.cdc.gov/flu/weekly/weeklyarchives2007-2008/07-08summary.htm
    - Joshua on May 12, 2021 11:49 AM at 11:49 am said:
      
      jim –
      
      > tennis players. It’s hard to imagine a more friendly environment for disease transmission – sans the fact that it was outdoors. Moreover, subgroups that have otherwise been impacted by COVID at much higher rates were disproportionately represented.
      
      You seem to be dismissing the effect of wearing masks and being generally aware that a pandemic was going on, thus affecting behavior.
      
      Not dismissing the likelihood of low rates of transmission outdoors…
      
      > But if one just accepts that aerosol transmission is the most common form of transmission for respiratory viruses in general, then flu seasonality is easily explained.
      
      Also, how carefully have you looked at the associations between “season” and COVID spread, as distinct from flu spread? Not all aerosol transmission of all respiratory viruses is identical. At the very least, behavior (which might or might not be a confounding or mediating or moderating variable that might or might not be associated with season) should be considered alongside “season.”
  - Joshua on May 11, 2021 11:29 PM at 11:29 pm said:
    
    anon –
    
    > “most shark attacks occur near the beach” so don’t worry about swimming in the deep ocean.
    
    That’s a good one. Reminds me of another great example I ran across earlier in the pandemic when that Santa Clara study came out – related to convenience sampling and survivor bias:
    
    During World War II, researchers at the Center for Naval Analysis faced a critical problem. Many bombers were getting shot down on runs over Germany. The naval researchers knew they needed hard data to solve this problem and went to work. After each mission, the bullet holes and damage from each bomber were painstakingly reviewed and recorded. The researchers pored over the data looking for vulnerabilities.
    
    The data began to show a clear pattern (see picture). Most damage was to the wings and body of the plane.
    
    [Image: bullet-hole locations on returning bombers]
    The solution to their problem was clear. Increase the armor on the plane’s wings and body.
    
    But there was a problem. The analysis was completely wrong.
    
    Before the planes were modified, a Hungarian-Jewish statistician named Abraham Wald reviewed the data. Wald had fled Nazi-occupied Austria and worked in New York with other academics to help the war effort.
    
    Wald’s review pointed out a critical flaw in the analysis. The researchers had only looked at bombers who’d returned to base.
    
    Missing from the data? Every plane that had been shot down.
    
    But the research wasn’t a wasted effort. These surviving bombers rarely had damage in the cockpit, engine, and parts of the tail. This wasn’t because of superior protection to those areas. In fact, these were the most vulnerable areas on the entire plane.
    
    The researchers’ bullet hole data had created a map of the exact places that the bomber could be shot and still survive.
    
    With the new analysis in hand, crews reinforced the bombers’ cockpit, engines, and tail armor. The result was fewer fatalities and greater success of bombing missions. This analysis proved to be so useful that it continued to influence military plane design up through the Vietnam war.
    
    https://www.trevorbragdon.com/blog/when-data-gives-the-wrong-solution
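
    The selection effect in that story can be sketched with a small deterministic calculation. All numbers below are hypothetical, chosen only to illustrate the mechanism, and the model assumes each plane takes a single hit whose location determines its chance of being lost.

    ```python
    # Hypothetical share of hits landing on each section (sums to 1).
    HIT_SHARE = {"wings": 0.35, "body": 0.35, "engine": 0.15, "cockpit": 0.10, "tail": 0.05}
    # Hypothetical probability that a plane hit in that section fails to return.
    LOSS_PROB_IF_HIT = {"wings": 0.05, "body": 0.05, "engine": 0.60, "cockpit": 0.60, "tail": 0.40}

    def hits_seen_on_survivors(hit_share: dict, loss_prob: dict) -> dict:
        """Distribution of hit locations among planes that made it back to base."""
        surviving = {s: hit_share[s] * (1.0 - loss_prob[s]) for s in hit_share}
        total = sum(surviving.values())
        return {s: v / total for s, v in surviving.items()}

    observed = hits_seen_on_survivors(HIT_SHARE, LOSS_PROB_IF_HIT)
    for section in HIT_SHARE:
        print(f"{section:8s} true share {HIT_SHARE[section]:.0%}, seen on survivors {observed[section]:.0%}")
    ```

    In this sketch the sections where a hit is most often fatal are under-represented on the returning planes, which is the pattern the original analysis misread. The same logic applies to contact tracing if some kinds of contact are much less likely to be recorded.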
    
    Phil –
    
    > But there’s lots of other evidence for outdoor transmission risk being low.
    
    Sure. I remember a convo I had in March 2020 with someone on a town council in Brookline, they were discussing whether to block traffic and open major roads for people to exercise. At the time, I argued it was maybe ill-advised because people would be in relatively close proximity and breathing hard. We had heard about aerosolized particles remaining in an area for some time after someone had moved on.
    
    I think my logic was sound, I looked as hard as I could to find useful information… but it’s pretty obvious now my opinion was wrong. Better to have people outside and exercising.
    
    But this is all about decision-making in the face of uncertainty and in the end you have to accept that while being conservative about risk is often the best choice in the absence of information, it can well turn out to be counter-productive.
    
    - Phil on May 12, 2021 12:25 AM at 12:25 am said:
      
      As long ago as last March, there was already reason to believe outdoor transmission risk was low. For instance, China had done a lot of contact tracing and had found only one confirmed outdoor transmission and a very small number of possibles. But this wasn’t completely convincing about the risk-per-minute just because most close contact is indoors, as previously noted. It wasn’t until after the BLM protests that I thought it was clear that outdoor transmission risk is indeed really low: before that I was pretty sure but not absolutely sure.
      
      If someone from the CDC had said “fewer than 10% of transmissions have been outdoors” a year ago, that would have been OK with me. But at this point…well, I’m repeating myself, I guess. The number is thought to be more like 0.1% and really couldn’t be much above 2%. There’s no reason to say “10%” at this point, and plenty of reason not to.
Bob Carpenter on May 11, 2021 7:25 PM at 7:25 pm said:

@Anonymous is spot on. It’s a corollary of Grice’s maxim of quantity. It’s called “scalar implicature.”


Statistical Modeling, Causal Inference, and Social Science
