How did the international public health establishment fail us on covid? By “explicitly privileging the bricks of RCT evidence over the odd-shaped dry stones of mechanistic evidence”

Peter Dorman points us to this brilliant article, “Miasmas, mental models and preventive public health: some philosophical reflections on science in the COVID-19 pandemic,” by health research scholar Trisha Greenhalgh, explaining what went wrong in the response to the coronavirus by British and American public health authorities.

Greenhalgh starts with the familiar (and true) statement that science proceeds through the interplay of theory and empirical evidence. Theory can’t stand alone, and empirical evidence in the human sciences is rarely enough on its own either. Indeed, if you combine experimental data with the standard rules of evidence (that is, acting as if statistically-significant comparisons represent real and persistent effects and as if non-statistically-significant comparisons represent zero effects), you can be worse off than had you never done your damn randomized trial in the first place.

Greenhalgh writes that some of our key covid policy disasters were characterized by “ideological movements in the West [that] drew—eclectically—on statements made by scientists, especially the confident rejection by some members of the EBM movement of the hypothesis that facemasks reduce transmission.”

Her story with regard to covid and masks has four parts. First, the establishment happened to start with “an exclusively contact-and-droplet model” of transmission. That’s unfortunate, but mental models are unavoidable, and you have to start somewhere. The real problem came in the second step, which was to take a lack of relevant randomized studies on mask efficacy as implicit support to continue to downplay the threat of aerosol transmission. This was an avoidable error. (Not that I noticed it at the time! I was trusting the experts, just like so many other people were.) The error was compounded in the third step, which was to take the non-statistically-significant result from a single study, the Danmask trial (which according to Greenhalgh was “too small by an order of magnitude to test its main hypothesis” and also had various measurement problems), as evidence that masks do not work. Fourth, this (purportedly) evidence-based masks-don’t-work conclusion was buttressed by evidence-free speculation of reasons why masks might make things worse.

Greenhalgh’s message is not that we need theory without evidence, or evidence without theory. Her message, informed by what seems to me is a very reasonable reading of the history and philosophy of science, is that theories (“mental models”) are in most cases necessary, and we should recognize them as such. We should use evidence where it is available, without acting as if our evidence, positive or negative, is stronger than it is.

All this sounds unobjectionable, but when you look at what happened—and is still happening—in the covid discourse of the past year and a half, you’ll see lots of contravention of these reasonable principles, with the errors coming not just from Hoover Institution hacks but also from the Centers for Disease Control and other respected government agencies. It might sound silly to say that people are making major decisions based on binary summaries of statistical significance from seriously flawed randomized studies, but that seems to be what’s happening. But, as Greenhalgh emphasizes, the problem is not just with the misunderstanding of what to do with statistical evidence; it’s also with the flawed mental model of droplet transmission that these people really really didn’t want to let go of.

And check out her killer conclusion:

While I [Greenhalgh] disagree with the scientists who reject the airborne theory of SARS-CoV-2 transmission and the evidence for the efficacy of facemasks, they should not be dismissed as ideologically motivated cranks. On the contrary, I believe their views are—for the most part—sincerely held and based on adherence to a particular set of principles and quality standards which make sense within a narrow but by no means discredited scientific paradigm. That acknowledged, scientists of all creeds and tribes should beware, in these fast-moving and troubled times, of the intellectual vices that tempt us to elide ideology with scientific hypothesis.

Well put. Remember how we said that honesty and transparency are not enuf? Bad statistical methods are a problem in part because they can empower frauds and cheaters, but they also can degrade the work of researchers who would like to do well. Slaves to some long-defunct etc etc. And it’s not just a concern for this particular example; my colleagues and I have argued that these problems arise with so-called evidence-based practice more generally. As I put it a few years ago, evidence-based medicine eats itself.

P.S. The problems with the public health establishment should not be taken to imply that we should trust anti-establishment sources. For all its flaws, the public health establishment is subject to democratic control and has the motivation to improve public health. They make mistakes and we can try to help them do better. There’s some anti-establishment stuff that’s apparently well funded and just horrible.

130 thoughts on “How did the international public health establishment fail us on covid? By “explicitly privileging the bricks of RCT evidence over the odd-shaped dry stones of mechanistic evidence””

  1. Masks are very common in some Asian countries. These countries should have easily served as counterpoints to the argument that masks could make things worse. But it’s always “show me the RCT,” right?

    And there’s sentiment among scientists from outside medicine that their knowledge and warnings were discounted. Why would you not listen to people who specifically study aerosols? See this: https://www.youtube.com/watch?v=i7X81gN650Q

  2. A very interesting article (not least, in my view, because much of it is very similar to arguments I have been making) 😉.

    However, I think it’s important to note that much of the opposition to mask use found in the public discourse is based on mechanistic thinking, not on RCT analysis that’s untethered to mechanistic explanations. Specifically, the argument that COVID is spread by aerosol particles that are too small to be filtered by masks.

    Of course, that mechanistic explanation is highly problematic – as people making that argument mostly ignore (1) large uncertainties as to the actual size of infectious particles, (2) the arbitrariness of distinguishing between droplets and aerosols, (3) large gaps in our knowledge about the behavior of particles transmitting COVID in the real world, let alone in lab-based settings (and in our ability to measure those mechanics), (4) large gaps in our understanding of how the number of particles a person is exposed to interact with an individual’s immune system with respect to whether they become infected or how seriously they become infected, etc.

  3. While the article is fascinating, I think it confuses two somewhat different things. The conflict between RCTs and mechanistic (i.e., observational, if I read that correctly) studies is one thing. And the paper is correct to break down the rigid distinction between the two – I have long thought there to be a continuum from RCTs to observational studies, due to the leap required to go from the sampled population in the RCT to the more general population you are interested in. Observational studies lack the purity of the RCT, but are likely to pick up myriad other influences on outcomes that an RCT is unlikely to capture (unless it was an extremely large RCT, which I think is almost a contradiction in terms).

    The other theme I see in the article is the tendency of scientists to hold to initial beliefs even in the face of conflicting evidence – even to the point of not looking for such evidence or rejecting it out of hand (perhaps because it is not an RCT, or for many other reasons). That issue seems to me to have many dimensions, only one of which might be the RCT vs mechanistic one. I’m not even sure that the type of evidence and the tendency to cling to theories in spite of contrary evidence need be related at all. I think that people often exhibit a tendency to hold on to their initial beliefs (as, for instance, in the many cases cited on this blog of researchers that refuse to change their conclusions even after mistakes are discovered), regardless of whether the evidence comes from RCTs or observational data. So, I’m not sure why these two themes are intertwined in the study, nor can I see the necessary connection.

    • I’m on my phone and haven’t read the article yet, but I hope the article makes a distinction between mechanistic and observational. For example mechanistic evidence for mask effectiveness could easily come from laboratory experiments such as the YouTube videos using a laser light sheet that show the reduction in particles output by a person wearing a mask, and other experiments using people sneezing or talking into particle capture apparatus etc. From those mechanistic experimental results it is easy to see that masks are going to reduce exposure at least during brief encounters. From mechanistic observational results you may also see for example fewer viral particles captured by HEPA filters operating on hospital floors after mandating mask usage, etc. These are experimental results but ones where the outcomes measured are the mechanistic causes of disease transmission, but not the outcomes of “sick” vs “not sick” that would presumably be the RCT type results.

      • Now I’ll have to await someone giving a clearer explanation. Apparently, mechanistic is not the same as observational, and the article does not seem to address observational data at all. I take it that mechanistic is akin to laboratory type experiments – in the case of masks, it is easy to envision how such tests could be conducted. Those seem to me to be close to RCTs, with the primary difference being that RCTs use different people randomly assigned, while the mechanistic studies may or may not use different people but clearly use designed experiments according to a theoretically derived relationship, such as testing masks with different designs in a lab where droplets are sprayed at the mask designs. Of course, observational studies are a completely different type of study than either of these – and it seems that the article is not addressing observational studies at all. But in my mind, RCTs and mechanistic studies are more alike than different, compared with observational studies. So, at this point I am confused by exactly what the article is describing. I still think that the discussion of why researchers cling to their initial beliefs (absent evidence, or in spite of contradictory evidence) is a different issue, and exists regardless of what kind of study is conducted.

        • I believe that the article is talking about a religious-like insistence that an RCT of a treatment in patients is somehow vastly superior to knowing how a thing works physically and predicting from the physical model how it will work in a patient context. Because of this dogmatic insistence that every question be settled by randomizing people into several groups and testing to see what happens to each group, health care has been led FAR astray from what is actually KNOWN about the world. People studying aerosols in, say, the context of air pollution have known for a long time how fast they settle out of the air in different environments and whether or not they can be inhaled to various parts of the lung. In fact the CDC has extensive slides for talks on this topic… and yet people refused to believe in “aerosols” as a transmission mechanism of COVID for essentially all of 2020.

          An RCT is in essence an inferior design type for when detailed experiments are not possible and we must accept a lot of uncontrolled variation. But healthcare has gotten it backwards and made it a “primary” type of evidence. RCTs have their place, but it’s not as the primary way of understanding the world.

        • I don’t think that either should be viewed as primary or secondary; they should be seen as complementary.

          Here’s an analysis that found little difference when comparing results.

          https://pubmed.ncbi.nlm.nih.gov/24782322/

          My take from the article that this post is about is a reinforcement of the message that any suggestion of causality resulting from an RCT should be supported by a strong mechanistic theory of causation. An important benefit of an RCT in that regard is that it can effectively be a test of a theory of causation – particularly importantly, within a longitudinal framework.

        • One of the things I most liked about the article was its explanation of why mechanistic studies should play an important role in overall assessment. By mechanistic, I take it that Greenhalgh means studies, usually but not necessarily lab-based, that seek to identify or observe the process by which some initial event or factor is thought to lead to an outcome of interest. You can observe this process with a sample of 1 if your mode of observation can capture it. This is different from larger sample studies that try to identify the outcomes of such a process, where the process itself is invisible.

          A process may not show up in outcomes if it is confounded by other processes or operates only under narrow circumstances. Outcomes may not implicate a process if there are other processes that could result in the same set of effects. Of course, there are ways of modifying mechanistic studies to link them to potential outcomes in a more real-world fashion, and outcome studies to include particular patterns that would need to arise if the process of interest is operative.

          Obviously, “outcome” and “process” in practical terms are endpoints of a spectrum, especially since processes are often not directly observable, so even a mechanistic study may be looking at intermediate outcomes.

          Greenhalgh says that the public health establishment (a) ignored the large mechanistic literature on aerosols relevant to coronaviruses and (b) misinterpreted the weak results of a single study as evidence against face masks. I would add that NPI’s against viral transmission are numerous and confounding (even mediating, not just a bunch of separate effects), so it’s not going to be easy to pin down the incremental effect of masking in large outcome studies, with or without randomization. (And how do you randomize social interaction, indoors/outdoors, the use of testing, etc.?) Given this, shouldn’t mechanistic studies have played a larger role?

    • Great insights, Dale! I have been viewing orthopedic surgeon Ian Harris’s YouTube videos about the effectiveness of surgeries. They are truly remarkable. He has a few books out as well. Worth everyone’s viewing. What is remarkable is that some surgeons will continue to conduct surgeries when not doing the surgery may be as or more effective.

    • On theme two (holding to initial beliefs): There are times when experts will evaluate a body of evidence and come to a belief quasi-objectively. Once that happens, they will have an easier time remembering that belief than remembering the strength of evidence supporting it. Science takes useful shortcuts: at some point you go from “maybe it’s right, maybe it’s wrong” to “okay, let’s assume it’s right and go from there”. As time passes, the working assumption becomes common wisdom.

      In my view, COVID-19 was a near-perfect storm in that it came mostly out of the blue, in the sense that most experts weren’t in the active evaluating-evidence mode when it started. They already had common wisdom, and everything was seen from that starting point.

  4. From the paper:

    “During the COVID-19 pandemic, libertarian groups drew heavily on what they took to be objective empirical data (especially the Danmask trial) and rejected mechanistic explanations based on indirect—and, they felt, low-quality—evidence. To a greater or lesser extent, people who aligned with the libertarian movement took the view that recommendations to stay home, maintain 2 m distance, wear facemasks and even get vaccinated were unwelcome intrusions of the state. They believed that segmentation should be practised instead of lockdown (that is, the old and vulnerable should stay at home in order that the young and less vulnerable could enjoy their freedoms and remain economically productive), that facemasks were harmful and an unacceptable infringement of personal freedom, and that this essentially mild disease should be allowed to wash over the population to achieve what was termed ‘herd immunity.'”

    After Phillippe Lemoine got some attention here for what turned out to be a bucket of spitballs, I went to his blog and read a bit. It is ironic to me how perfectly this short description of libertarian views nails Lemoine, a guy who styles himself as an innovative thinker. (Of course, Lemoine would tell you that he was for masks before he was against masks, yada, yada, yada.)

    • Matt,

      Are there groups that can be characterized as anti-libertarian? In my experience of people in DC, there are subsets [generally older] that will accept any directive or opinion rendered by government. All in all, much of the public seems to be blamed for what has been conflicting messaging. Older subsets are quite comfortable with staying put in their homes, wearing masks, and ordering in if necessary. They can afford to do so, as these subsets rely on retirement pensions, investments, and so on.

      More broadly, many don’t rush off to delve into the studies. They might not have the expertise to evaluate these studies either.

      I guess what I’m suggesting is that the experts have been all over the place in their recommendations about masks. The question is whether the right questions are being asked by the public.

      I do wear masks when required to do so; as in grocery stores and shops, even though I haven’t really researched what studies are out there about mask effectiveness.

  5. Obviously masks work. If you think masks don’t work, you’re an idiot who doesn’t realize that the possible upside of masking far outweighs the (negligible) downsides.

    Now that we’ve gotten past the signalling part of this, can anyone tell me with any confidence whatsoever what P(I infect you | I am masked) divided by P(I infect you | I am not masked) is? Let’s call this value k. It’s certainly 1 – epsilon for some value of epsilon (i.e., “masks work”), but is k < 1/2? Is k < 1/4? Is k < 1/100, or even k < 1/1000? The RCTs we have are useful because they provide reasonable bounds for values like k. In the beginning of the pandemic, the kind of people who don’t care for RCTs did some observational studies, and the values for quantities like k that came out were, iirc, far more biased towards masks than what the RCTs later found. People like to claim the Danish study wasn’t statistically significant because it was underpowered. This is not true. The Danish study had enough power to detect a sufficiently strong effect. Before it, you could (and people did) claim that based on “science,” masks were ridiculously effective. It is to me not obvious at all whether the failure was listening to RCTs as opposed to listening to the observational studies.

    An idiot can tell you that k < 1, but the real question on which policy should be based isn't whether k < 1, but how much smaller than 1 k is. Yes, I know about exponential growth, but exponential growth is exactly the kind of mental model people hung on to even though the evidence was all against it.
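
    A quick sketch of how an RCT bounds a quantity like k. The counts below are made up for illustration (they are not the Danmask numbers); the point is just that the two trial arms give a point estimate and a confidence interval for k, not a yes/no answer:

```python
import math

# Hypothetical trial counts, for illustration only (not the Danmask data):
# infections out of participants in the masked and unmasked arms.
inf_masked, n_masked = 42, 2400
inf_unmasked, n_unmasked = 53, 2400

# Point estimate of k = P(I infect you | masked) / P(I infect you | unmasked),
# proxied here by the relative risk of infection across arms.
k = (inf_masked / n_masked) / (inf_unmasked / n_unmasked)

# 95% CI via the usual normal approximation on log(k)
se_log_k = math.sqrt(1/inf_masked - 1/n_masked + 1/inf_unmasked - 1/n_unmasked)
ci_low = math.exp(math.log(k) - 1.96 * se_log_k)
ci_high = math.exp(math.log(k) + 1.96 * se_log_k)
print(f"k ≈ {k:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")  # → k ≈ 0.79, 95% CI (0.53, 1.18)
```

    With these made-up counts the interval straddles 1: “not statistically significant,” yet consistent with anything from nearly a 50% reduction to a slight increase, which is exactly the difference between “masks don’t work” and “this trial can’t tell you how much smaller than 1 k is.”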

    • Obviously masks work. If you think masks don’t work, you’re an idiot who doesn’t realize that the possible upside of masking far outweighs the (negligible) downsides.

      There are plenty of plausible reasons masks could contribute to transmission:

      1) Touching your face more
      2) A false sense of security
      3) Concentrating the aerosol into a cloud around the wearer’s head rather than spraying it down and out where it gets diluted and drops to the ground faster
      4) The particles that get pushed through the mask get aerosolized so take longer to drop to the ground

      How it all works out in the real world, no one really knows. They maybe reduce the average probability of transmission by 10% max, I would say. And that only “flattens the curve,” which means dragging things out longer.

      • Anoneuoid –

        > 1) Touching your face more.

        Citation? For all you know it could mean touching your face less. And what does that mean, exactly? That someone already infected is touching their face and then transmitting the virus by touching a surface that another person then touches? So what is the prevalence of this fomite transmission, according to your calculations?

        > 2) A false sense of security

        Citation? What is the differential between increased carelessness due to a false sense of security and increased caution due to the reminder to be careful?

        > Concentrating the aerosol into a cloud around the wearers head rather than spraying it down and out where it gets diluted and drops to the ground faster

        So is this with someone who is already infected and exhaling infectious particles? And what is your calculation of the associated costs versus (1) increased humidity behind the mask which causes infectious particles to drop to the ground or take longer to dry out and, (2) the advantage of altering airflow so that infectious particles don’t travel as far, thus improving the benefit of distancing?

        > And that only “flattens the curve”, which means dragging things out longer.

        While vaccines get developed and administered, therapeutics get developed and administered, and treatments are improved. Not to mention the hours of life lived illness-free (and, obviously, alive).

        In other words, you have no idea regarding net cost or benefit.

        This is about decision-making in the face of uncertainty, and about looking at low-probability but potentially high-damage risks.

        Selectively attributing costs or benefits in one direction only, with shallow treatment of ad hoc speculation about the mechanics of transmission, might have some value, but that value is limited.

        • I have to agree with Anoneuoid on some of these. He said “there are plenty of plausible reasons masks could contribute to transmission” and then he listed some. He doesn’t need to provide citations to prove that they happen; he’s just arguing that, a priori, they are plausible, and I agree with him.

          But I agree with you that he’s wrong that “all they do is flatten the curve,” if that is intended to imply that in the long run everyone gets infected anyway (or at least that the same number of people ultimately get infected): if people avoid getting COVID until most people they encounter have been vaccinated, they have a decent chance of avoiding infection altogether. If you push infection off long enough then you’ll eventually die of something else without ever having gotten it.

        • Phil –

          > I have to agree with Anoneuoid on some of these. He said “there are plenty of plausible reasons masks could contribute to transmission” and then he listed some. He doesn’t need to provide citations to prove that they happen,

          I’m not suggesting that citations would provide proof, but a way to assess plausibility.

          You didn’t specify which of his list you agree with him on, but say you and he think it’s plausible that masks might increase risk of infection or seriousness of infection, because of increased face-touching (one of those I said should be supported with a citation).

          I think it’s plausible that masks might well decrease face touching. Of course, people are going to be touching their masks to adjust them. But do we have evidence that that causes someone who isn’t already infected to become infected? Is there a plausible mechanism by which that might happen, given what we’ve recently heard about the lack of fomite transmission?

          So what we’re left with is some people thinking one likelihood is plausible and someone else thinking the opposite is plausible. How do we go from that to a basis for policy? Seems to me the first place to turn is the assemblage of relevant evidence.

          > … they have a decent chance of avoiding infection altogether.

          Except that Anoneuoid is skeptical that vaccines reduce transmission.

          > If you push infection off long enough then you’ll eventually die of something else without ever having gotten it.

          Well, yes. That was part of my argument.

    • Exponential growth is a fact about pandemics in the early stages (first couple years?). You can see the log(cases) vs time plots are fairly straight lines across every country for the first few weeks of the pandemic (before “flatten the curve” became a thing), and then there have been periodic exponential growths ever since… https://ourworldindata.org/explorers/coronavirus-data-explorer?yScale=log&facet=none&Metric=Confirmed+deaths&Interval=7-day+rolling+average&Relative+to+Population=true&Align+outbreaks=false&country=USA~ITA~DEU~GBR~FRA

      You can look at Jun–Oct 2021 in the US, or Aug–Nov 2020 in the UK, etc. There are periods where, if you lay a straight line near the data, it will track very well for months (either up or down).

      Changing the base of the exponent is equivalent to changing the time coefficient in exp(t/c) since a^b = exp(log(a)*b).

      Suppose we have growth like (1+R)^(t/10), for R = 2, this is like exp(log(3)*t/10) = exp(.1099*t), now suppose you put in a mask mandate and it cuts expected infections by 10%, so we’re talking exp(log(1+.9*2)*t/10) = exp(.1030*t), so with say 1000 infected at the start in some region, in 60 days you’re talking

      1000*exp(.1099*60) ≈ 730,700
      vs 1000*exp(.1030*60) ≈ 483,000

      483/731 ≈ .66, so by putting on “marginally effective” masks during a run-up period, you cut the total number of people infected over a 60-day run by about a third. That’s not insignificant.
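
      As a sanity check, the arithmetic above can be reproduced in a few lines (same assumptions as in this comment: growth like (1+R)^(t/10) with R = 2, masks cutting expected infections per contact by 10%; any small discrepancy with the figures above comes from rounding the daily rates):

```python
import math

R = 2.0           # growth like (1+R)^(t/10), per the comment above
reduction = 0.10  # assume masks cut expected infections per contact by 10%
t = 60            # days
n0 = 1000         # infected at the start

rate_no_mask = math.log(1 + R) / 10                  # ≈ 0.1099 per day
rate_mask = math.log(1 + (1 - reduction) * R) / 10   # ≈ 0.1030 per day

no_mask = n0 * math.exp(rate_no_mask * t)
mask = n0 * math.exp(rate_mask * t)
print(round(no_mask), round(mask), round(mask / no_mask, 2))  # → 729000 481890 0.66
```

      So a 10% per-contact effect compounds into roughly a one-third cut in cumulative infections over a 60-day run.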

      The point of “flatten the curve” has always been to buy time to get vaccines and treatments out. Of course if people refuse the vaccines… well, at some point you just need to let those guys get sick. That point should be after approving and rolling out vaccines for all age groups. We’re coming up on the FDA’s snail pace for enabling 5-11 year olds. Some time early in 2022 will be time to “let er rip” through the vaccine refusers, if it hasn’t already in the latest Delta boom. Unfortunately if there are enough who refuse, we can still have very bad shortages of healthcare for people with things like gallbladder obstructions or car accidents or whatever. But with people so adamantly against vaccines, you’re going to get that anyway, might as well make it ASAP and all at a time when those who “believe in the virus” can restrict their exposure to those people in a coordinated way (over say a month).

        > You can see the log(cases) vs time plots are fairly straight lines across every country for the first few weeks of the pandemic (before “flatten the curve” became a thing)

        Well, testing was ramping up in that initial period, so taking the early numbers as meaningful makes no sense.

        > (first couple years?)

        Which is it, straight lines for the first few weeks or years?

        >There are periods where if you lay a straight line near the data it will track very well for months. (either up, or down).

        Well, yeah, but anything looks linear on a log plot with a fat enough marker. More generally, pick any smooth curve. Plot it on a log plot. Pick a point on the curve. Congratulations: since your curve is differentiable, it is locally almost a straight line. Even better, assume it has local maxima and minima. Congratulations: your curve has an inflection point, near which it looks even more like a straight line. I see your link. It has a lot of straight-line-ish looking bits. It does not look like a straight line. Observe how plotting the graph from your link has plenty of “straight line” regions even without using log scaling. It would (probably) still have straight-line regions using log-log scaling.

        I have no trouble whatsoever following the mathematics of exponential growth. But if your process is not really exponential, you won’t really have exponential growth.

        > That point should be

        It’s cute that you think governments have the power to decide when certain groups of people get sick. I’m sorry for being snide, but that point was March 2020.

        >We’re coming up on the FDA’s snail pace for enabling 5-11 year olds

        I’m just worried for the 0-4 year olds. Forget that the 5-11 year olds are already safer than triply vaccinated 40 year olds, it is important that everyone gets to feel like they have protected themselves.

        • Until 70-90% of people have been infected (so perhaps up to a couple years), you will have *periods* of exponential growth extending for weeks and weeks. We’re not talking about periods of duration epsilon = 3 hours for a function representing the smoothed average over 7 days, where all smooth functions have a tangent line, we’re talking about weeks and weeks, where an exponential model gives accurate predictions of cumulative cases within a percent or two for every day over a period of 30, 60, or 90 days. All you need to know is the cases per day on day 0 and a rate, and you can predict the cases per day for maybe 90 days. But you knew that, and you’ve got an agenda, so discrediting true things because they don’t fit your agenda is important to you. Whatever. It doesn’t change the fact that small improvements in rates lead to large improvements in cumulative infected over real world timescales on which people make decisions (weeks).

        • >you will have *periods* of exponential growth extending for weeks and weeks

          This seems to be a kind of religion in some circles, based on a high-school-level model of disease spread. Yes, if everyone infects n people, and those all infect another n people, then you get n^i infections after i steps. Yes, reduce n by a factor of a, and you save yourself a factor of a^i infections. Unfortunately the real world is far more complicated. For one, the whole process is stochastic, driven by feedback loops, runs on an extremely heterogeneous social graph, etc. Looking at your “our world in data” plots, I could just as well fit a locally linear model.

          You take it on faith that masks have an exponential effect on transmission. I do not share that faith. For one, the real process “stops” being exponential fairly quickly, but you assume that the effects of masks are exponential throughout. Yes, I understand the exponential nature is a direct consequence of your model of disease spread. The kind of effects you claim masks must have on transmission are just not borne out by the evidence – you forget that there are places without mask mandates or with mask mandates that were not complied with.

        • Matty –

          You and I have discussed this before, and I”m not speaking for Daniel.

          But in my view, if there’s a chance that masks reduce transmission to some extent, then there is potentially a compounding aspect to that effect. If one person is not infected, it may well be the case that that person won’t infect X number of others, who then each won’t infect X number of others.

          Of course that calculus is complicated. Of course there are many relevant factors that likewise would affect behaviors and other correlates of transmission.

          Compliance with mandates is a related issue – of course.

          But the point is that we should consider the possibility of that mechanism. The absolute certainty you assign to Daniel may or may not be valid – but it isn’t so for everyone.

          The problem, IMO, is that I’ve encountered many people who say something like “People in that public space got infected even though they were wearing masks” or “Masks don’t ‘prevent’ transmission” without considering how risk reduction at the individual level compounds at the population level.

        • >But in my view, if there’s a chance that masks reduce transmission to some extent, then there is potentially a compounding aspect of that effect.

          I fully agree with this. But likewise, the fact that it isn’t completely obvious that masks have a large population-level effect suggests that masks may have very low individual effects. But I have very low certainty associated with any knowledge I have about masks.

          >The problem, IMO, is that I’ve encountered many people who say something […]

          Yes, what they are saying is wrong. But you seem to be reading my comments discerningly enough to realize I am not saying this, which I appreciate.

        • The assumption isn’t that “masks have an exponential effect”; it’s that there exists an average effect of masks across encounters. This is uncontroversial because the “effect” is a number bounded between 0 and 1 by your own admission (i.e., the reduction factor for the transmission probability at each encounter), encounters are a finite discrete number, transmissions are a finite integer, and the average of a finite number of integers exists.

          If F people are infected and have N encounters during their infectious period, for F, N large:

          without masks, there would be T transmissions on average; with masks, by your own admission, there would be (1-epsilon)*T transmissions on average for some epsilon in [0,1]. Therefore at the “next time tick” the cumulative total will be F*T without masks and F*(1-epsilon)*T with masks.

          Now to get exponential growth, all we need is that T/F is more or less constant for a period of time, that is averaged over all people who are currently actively infected, the number of transmissions per infected person doesn’t change dramatically.

          The existence of months of data that fit this model is sufficient to say the model is useful, which is all we can ask of models. The existence of periods where this isn’t true isn’t a problem, because the model doesn’t say “at all times and under all conditions exponential growth with a constant rate occurs”; it just says “there exist weeks and weeks during which exponential growth at an approximately constant rate occurs,” and that is simply true in the data.
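          The compounding claim above can be sketched numerically. Every number below (the starting count, the growth factor g, the reduction epsilon) is invented purely for illustration, not estimated from any data:

          ```python
          # Toy sketch: a constant per-transmission reduction compounds over ticks.
          F0 = 1000        # infections per "tick" at time 0 (invented)
          g = 1.3          # average new transmissions per infection per tick (invented)
          eps = 0.10       # assumed per-transmission reduction from masks (invented)

          ticks = 10
          no_masks = F0 * g ** ticks
          with_masks = F0 * (g * (1 - eps)) ** ticks

          # A constant 10% per-tick reduction compounds multiplicatively:
          ratio = with_masks / no_masks    # equals (1 - eps)**ticks, about a third
          print(no_masks, with_masks, ratio)
          ```

          The point of the sketch is only the structure: a modest per-encounter reduction, if it really holds tick after tick, compounds to a large cumulative difference.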

        • Good, some maths I can criticize.

          >by your own admission

          There is nothing to “admit” here, are we trying to figure out the truth or trying to score points?

          The “effectiveness” of a mask is certainly a random variable that is mostly in [0,1]. Could it take on some values > 1? In real life, probably yes. But to get away from this discussion, let’s not go there.

          >without masks, there would be T transmissions on average; with masks, by your own admission, there would be (1-epsilon)*T transmissions on average for some epsilon in [0,1].

          No! This only works if the number of contacts is independent of masks. But more crucially, you implicitly assume that there is only a single path to infection for a person, i.e. if a transmission never happens then that person is safe forever. This is somewhat paradoxical – on one hand, the assumption is that getting corona is very rare and can be stopped by blocking a single transmission. On the other hand, the exponential growth hypothesis implies corona is pretty common.

          >The existence of months of data that fit this model is sufficient to say the model is useful which is all we can ask of models.

          No, we can also ask them to predict things instead of retrospectively “fitting” 5 parameters to an elephant.

        • No, I’m not assuming a person is “safe forever”, only that in that instance of potential transmission they turned out to be safe. If that’s the case then over a short time period, let’s call it 10 days, people who were infected on day 0 ultimately produced T new infections without masks or (1-epsilon)T new infections with them. The growth rate of infections is a local-in-time phenomenon and has nothing to do with “safe forever”.

          we seem to be talking past each other on the meaning of effectiveness. You are acknowledging that per contact the probability of transmission most likely decreases, but you believe that contacts may increase enough that the net transmission from our F people is N * [(T/N * (1+epsilon_act))] * (1-epsilon_mask) for both positive epsilons… you’re saying that (1+epsilon_act) * (1-epsilon_mask) could be bigger than 1. Yes, it could. So far there is little way to estimate these things.

          I was assuming that we agreed the net effect was (1-epsilon) for positive epsilon.

          The question of “how much does putting on a mask increase people’s willingness to be in contact, and is it enough to overall increase spread” is an open question and probably varies from place to place. Nevertheless the effectiveness of masks in places like Japanese trains seems to indicate they’re potentially rather effective.

        • Replying to Daniel Lakeland:

          >The growth rate of infections is a local in time phenomenon and has nothing to do with “safe forever”

          Yes it does. You implicitly assume that there is only one path to being infected, at least locally. For example, imagine one person who interacts with 5 others. If all 5 people have corona, and it takes only one of them to infect him, then crucially preventing any one of the 5 transmissions does not prevent the other 4.
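          The multiple-paths point can be made concrete. The per-contact transmission probability p and reduction eps below are hypothetical values chosen only to illustrate the mechanism:

          ```python
          # One susceptible person meets k infectious contacts; each contact
          # transmits independently with probability p. Masks cut p by (1 - eps).
          def infection_prob(p, k):
              return 1 - (1 - p) ** k

          p, k, eps = 0.5, 5, 0.2   # hypothetical values for illustration only
          base = infection_prob(p, k)                  # 1 - 0.5**5 = 0.969
          masked = infection_prob((1 - eps) * p, k)    # 1 - 0.6**5 = 0.922

          # With many redundant infection paths, a 20% per-contact reduction
          # lowers the overall infection probability by far less than 20%:
          print(base, masked, 1 - masked / base)
          ```

          Here the relative reduction in overall infection probability is only about 5%, far below the 20% per-contact reduction, which is exactly the redundant-paths objection.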

        • >We’re talking about weeks and weeks, where an exponential model gives accurate predictions of cumulative cases within a percent or two for every day over a period of 30, 60, or 90 days.

          Finally there’s something I can test. Obviously you’re being hyperbolic here, because 1-2% deviation is built in by the randomness; you won’t find a single country where case rates don’t fluctuate by this amount. Now show me a single country where cumulative cases matched an exponential model within even 5 percent for 90 days. Or even 60 days. Or even 30 days, where a linear model doesn’t do about as well.

          >But you knew that, and you’ve got an agenda, so discrediting true things because they don’t fit your agenda is important to you

          No, I did not “know” this. I don’t even know what “you have agenda” means. I presume it means “you disagree with what I find obvious”.

        • The United States as a whole, cumulative cases relative to population May 1 2020 to January 18 2021

          Here’s the data: https://covid.ourworldindata.org/data/owid-covid-data.csv

          Here’s the Julia code to do the fit:

          using CSV, DataFrames, DataFramesMeta, StatsPlots, Dates, GLM

          plot(1:10) # to open a plot window

          df = CSV.read("owid-covid-data.csv", DataFrame)

          # comma-separated conditions in @subset are ANDed, which avoids the
          # .&& syntax that requires Julia 1.7+
          dfus = @subset(df, :location .== "United States",
                             :date .> Date("2020-05-01"),
                             :date .< Date("2021-01-18"))

          dfus.deltadays = Dates.value.(dfus.date) .- Dates.value(Date("2020-05-01"))

          @df dfus plot(:deltadays,:total_cases)

          linfit = lm(@formula(total_cases ~ deltadays),dfus)
          dfus.linpred = predict(linfit,dfus)

          @df dfus plot!(:deltadays,:linpred)

          expfit = lm(@formula(log(total_cases) ~ deltadays),dfus)

          dfus.exppred = exp.(predict(expfit,dfus))
          @df dfus plot!(:deltadays,:exppred)

          @df dfus plot(:deltadays,(:exppred .- :total_cases)./:total_cases)
          @df dfus plot!(:deltadays,(:linpred .- :total_cases)./:total_cases)

          @df dfus plot(:deltadays,:total_cases)

          # fit to after 150 days,

          expfit2 = lm(@formula(log(total_cases) ~ deltadays),@subset(dfus,:deltadays .> 150))

          exp2pred = predict(expfit2,dfus)
          @df dfus plot!(:deltadays,exp.(exp2pred))

          @df dfus plot(:deltadays,exp.(exp2pred)./ :total_cases .- 1.0)

          The linear fit is absolutely atrocious (the absolute error is about 2.5 times the average value at t=0), while the exponential fit stays within about 15% of the true value continuously for about 9 months!

          Certainly there are periods of 60 days in there where, if you fit to just that data, the error would be under 5%. I did that toward the end of the code: after 150 days the error stays under 5% or so for ~100 days.

        • I am unable to run your code, it seems to be missing the DataFramesMeta package (for @subset macro) but once I add that it complains about invalid syntax. This is julia 1.6.2 on linux.

        • Yes, it would need DataFramesMeta. And maybe some specific version. I can’t really work on this now, have too many other things on my plate. But the gist of the code should be obvious, so you could run the same thing in R without debugging installation issues. Basically just take the log of total cases, and fit a line to it, then predict the line values and exponentiate them… it fits all 9 months adequately, way better than any line ever could.
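          For what it’s worth, the same take-the-log-and-fit-a-line procedure can be sketched in Python (numpy only). This runs on synthetic exponential-plus-noise data rather than the OWID file, so the numbers are purely illustrative of the method:

          ```python
          import numpy as np

          rng = np.random.default_rng(0)
          days = np.arange(0, 260)

          # Synthetic "cumulative cases": exponential with multiplicative noise
          true_rate = 0.02
          total = 1e6 * np.exp(true_rate * days) * np.exp(rng.normal(0, 0.01, days.size))

          # Fit a line to log(total) -- same trick as lm(@formula(log(total_cases) ~ deltadays))
          b, a = np.polyfit(days, np.log(total), 1)   # slope first, then intercept
          pred = np.exp(a + b * days)

          rel_err = np.abs(pred / total - 1)
          print(b, rel_err.max())   # recovered rate near 0.02, errors of a few percent
          ```

          With real case data the residuals are of course larger and structured, but the mechanics are identical: log, fit a line, exponentiate the predictions.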

        • I tried DataFramesMeta, it didn’t work, gave up. I really do appreciate the code, but I have a job and can’t debug code in a language I barely know.

          That said, the choice of the US and this time interval is a bit arbitrary. Take Spain’s data for roughly the same time period, it looks far more linear than the US data. Or the second US wave. My point isn’t that the data is particularly well explained by a linear model – it isn’t! But an exponential model doesn’t explain it well either. You don’t understand what causes exponential growth to break down regularly, but you assume non-breaking-down-exponential growth when reasoning about masks.

        • Thanks for giving it a try. I am sure I could probably tweak it a bit and/or offer up a proper Project.toml and Manifest.toml (which says which versions of the packages I used) and we could get it working, but again, it’s not worth it for either of us.

          I *do* believe that I understand why exponential growth is “breaking down”, and it has to do with changes in infectious contact rates. You’re right to worry that telling people to wear masks will induce them to have more contacts than they otherwise would. I definitely believe, for example, that’s what’s going on in schools near me. The schools are open and kids are in person; they’re wearing masks, have HEPA filters in classes, etc. But they had some outbreaks, and the real thing keeping them from exploding is that they’re quarantining everyone after the outbreaks, not that they’re wearing masks.

          I’m not a huge believer that the masks are as effective as everyone thinks they are, but *holding all else constant* having a mask will be better than not having one I think.

        • > and it has to do with changes in infectious contact rates.

          Maybe, but if it were that simple I feel like this should be more obvious, and predictable. I mean it’s kind of weird we can’t predict case numbers accurately from mobility data or surveys asking “how many people did you meet this week”. I freaked out about exponential growth in February/March 2020, but began to notice more and more that the dynamics of the pandemic are *not* easy to understand. There does seem to be a seasonal effect, but then hot places have pandemic waves. There seem to be “herd-immunity” like effects (e.g. pandemic receding in Manaus), or everywhere else in the third world. But then we get new waves in those places too. This is not some fringe position, btw, there were a number of early papers on “epidemiological dark matter” early on in the pandemic that were saying roughly that all of this makes no sense. I suspect that a part of the problem is heterogeneity of social networks, but I’m not sure this is the only answer either. But it’s certainly not as simple as “exponential growth”.

          >but *holding all else constant* having a mask will be better than not having one I think.

          Eh probably. I mean I’m starting to get pretty sick of the things, but if I were worried about catching covid I’d probably want to wear one. I’d be surprised if the cloth ones did much good, and I’m pretty sure there is some age n > 0 below which their risks far outweigh the benefits.

      • >1000*exp(.1099*60) = 730700
        >vs 1000*exp(.1030*60) = 483000
        >483/731 = .66 so by putting on “marginally effective” masks during a run-up period, you cut the total number of people infected over a 60 day run by 33%. That’s not insignificant.

        Using this model, after ~150 days the entire population of the Earth (~7 billion) has been infected either way. The masks buy you about 10 days. It takes 154 vs 144 days. Or you can look at it like it takes 64 rather than 60 days to get to 730k infections.

        That does not seem very significant.
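        Both computations can be checked directly; the rates 0.1099 and 0.1030 are the illustrative ones used above, not estimates from data:

        ```python
        import math

        r1, r2, n0 = 0.1099, 0.1030, 1000   # illustrative rates and starting count

        after_60_fast = n0 * math.exp(r1 * 60)   # ~730,700
        after_60_slow = n0 * math.exp(r2 * 60)   # ~483,000
        print(after_60_slow / after_60_fast)     # ~0.66: the 60-day total drops by a third

        # ...but the time for either curve to reach 7 billion differs by only ~10 days:
        t_fast = math.log(7e9 / n0) / r1
        t_slow = math.log(7e9 / n0) / r2
        print(t_fast, t_slow)                    # ~143 vs ~153 days
        ```

        Both readings of the same arithmetic are correct; the disagreement is over which horizon (a fixed 60 days, or the time to saturation) is the relevant one.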

        • Exponential growth never goes on forever. People see things like corpses stacked in mobile freezers and get scared, they stay home, they don’t get sick. The point is that if it takes them 60 days to get scared enough to do something, and the growth rate is exponential with one time scale vs another, you’ll get vastly different numbers of sick people. If our model is that people are looking at what they see in the last say 5 days and making decisions, then it makes a big difference how quickly the function grows in 5 days.

          And that’s not even discussing the idea that near the bifurcation point, being 5% above that point leads to exponential growth and 5% below leads to exponential decay (cases per day).

        • >The point is that if it takes them 60 days to get scared enough to do something, and the growth rate is exponential with one time scale vs another, you’ll get vastly different numbers of sick people.

          They would just get “scared enough to do something” after 64 days rather than 60 days though. Whenever the cases reach the “scary” threshold (which is 730k in this example).

        • They get scared enough to do something 20-30 days before peak mortality (time between transmission and death). If your exponential growth is slower, you attain peak mortality with fewer deaths.

        • > if it takes them 60 days to get scared enough to do something, and the growth rate is exponential with one time scale vs another, you’ll get vastly different numbers of sick people

          Do tell me: do people take slightly longer to get scared when they feel safer e.g. because they are wearing a mask? By exponential growth, couldn’t this delay cost literally millions of lives? Or are we only allowed to compound tiny things ad absurdum if the effect size is in the direction we like?

      • Daniel, you say “The point of “flatten the curve” has always been to buy time to get vaccines and treatments out.” In my experience that didn’t become true until about five months into the pandemic, when it was clear that we were indeed going to have good vaccines in a few months rather than several years. “Flatten the curve” was initially billed as a way to avoid overwhelming the health care system: sure, we might all end up getting COVID eventually but if we do it gradually enough, well, that’s a lot better than what we saw in northern Italy and in New York when masses of people were getting sick at the same time.

        Speaking for myself, I think I switched from “Flatten the curve so we don’t overwhelm the system” to “try to hold on a few more months because the vaccine is coming” around June 2020.

  6. “the risk of serious harm from facemasks is extremely low, and the potential for benefit at population level could be high.”

    Uhhh, citation needed.

    “Hence, it could be argued that the usual reasons for advocating caution in clinical trial research do not hold. Indeed, because of the very different balance of probabilities, there are strong arguments for reversing the usual assumption that avoiding harm is more important than striving for benefit. We should, perhaps, adopt the precautionary principle and recommend this intervention ‘just in case’ [37].”

    How about just running bigger and better RCTs on masks? And on social distancing. And on the vaccines. The money and opportunity were there.

    It would be nice if the author at least steelmanned this potential instead of lazily falling back to, “Oh well, RCTs are too complicated, let me just assert my bias toward the precautionary principle.” (I know this is an unfair summary, but I think the general public are all a bit tired of all of this lazy and strained reasoning masquerading as something profound as long as it’s styled in a nice looking PDF.) [I’m grumpy today.]

    • ACF –

      > How about just running bigger and better RCTs on masks?

      Given how difficult it is to run a real-world RCT in this context (one that controls for confounding variables and the myriad potential mediators and moderators, and runs long enough to gain the advantage of a longitudinal analysis, i.e., long enough to account for the variability in transmission associated with different variants), just suggesting bigger RCTs would likely only compound the problems. Sometimes a bigger sample size gets you no advantage.

      As for “better” RCTs, what would you suggest? And who should fund it? Big government and (I’m guessing you think) politically corrupted public health officials?

        • Sure there are. Say you have a perfect chemotherapy RCT that shows it doubles survival time. The trial is based on the idea the drug is killing cancer cells by targeting their fast rate of division. Some major “side effects” are nausea, reduced intestinal absorption, etc.

          Then it turns out the mechanism via which survival time was doubled is actually caloric restriction starving the cancer cells, rather than directly killing the cancer. Why would you choose to take an expensive poison rather than just limit your calorie intake?

          The issue is that the research hypothesis != statistical hypothesis. There is a whole world of things that can go wrong when trying to draw a connection between the two and these are even more important for making rational decisions than testing the statistical hypothesis.

        • Matty –

          > A good enough RCT does not need to control for confounders because there are none.

          First, I don’t understand how you determine what’s good enough.

          Second, I don’t understand your point anyway.

          Say you think there is a mechanism of causality between dietary fat intake in children and their cardiometabolic risk, and so you want to run a trial to see if reducing their dietary fat intake correlates with reduced risk. So you get a randomized sample and run that trial. But there’s a strong literature on other correlates, such as SES or the number of Adverse Childhood Experiences, or levels of physical activity and sedentary time, etc. How are you supposed to assess your mechanism of causality of interest if you don’t control for those variables? How could you possibly recruit a sample where you didn’t need to control for those variables?

        • > First, I don’t understand how you determine what’s good enough.

          Whether or not you randomized the treatment groups.

          > Second, I don’t understand your point anyway.

          You randomized the treatment groups with some random variable that did not depend on confounders (like flipping a coin). Hence you don’t need to control for confounders.

          > So you get a randomized sample and run that trial.

          In a RCT you do not “get” a randomized sample, you *create* a randomized sample where you *randomly* assign people to a treatment group and a control group. For example, you get some people who are willing to wear masks or not wear masks, then you randomly get them to wear masks or not wear masks. In this case you don’t need to control for whether or not the people are rich or not because whether or not they wore a mask does not depend on whether or not they were rich.

          You need to, of course, precisely state your result because you may introduce confounders there if you don’t do so.
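          A toy simulation illustrates why coin-flip assignment removes the need to control for confounders. Every number in it (the wealth effect, the 10% baseline risk) is invented, and the mask is deliberately given *zero* true effect:

          ```python
          import numpy as np

          rng = np.random.default_rng(1)
          n = 100_000

          wealth = rng.normal(0, 1, n)   # a confounder in the observational setting
          # invented risk model: wealthier people have lower baseline risk;
          # mask wearing itself does nothing in this toy world
          risk = np.clip(0.10 - 0.03 * wealth, 0, 1)

          # Observational: wealthier people also wear masks more often
          obs_mask = (wealth + rng.normal(0, 1, n)) > 0
          obs_infected = rng.random(n) < risk
          obs_diff = obs_infected[obs_mask].mean() - obs_infected[~obs_mask].mean()

          # RCT: assignment by coin flip, independent of wealth
          rct_mask = rng.random(n) < 0.5
          rct_infected = rng.random(n) < risk
          rct_diff = rct_infected[rct_mask].mean() - rct_infected[~rct_mask].mean()

          print(obs_diff, rct_diff)  # observational gap is spurious; RCT gap ~0
          ```

          The observational comparison shows masks “working” purely through wealth, while the randomized comparison correctly finds nothing, with no covariate adjustment needed.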

        • Matty –

          > In a RCT you do not “get” a randomized sample, you *create* a randomized sample where you *randomly* assign people to a treatment group and a control group.

          Is this a semantic point? In the scenario I described above, you recruit (“get” or “create”) a sample of participants for the study and then randomize assignment into the control and intervention arm, yes?

          Of course, therein lies a problem, as there are always potential problems with the recruitment process.

          > For example, you get some people who are willing to wear masks or not wear masks,

          In other words, you “get” or “create” your sample….

          > In this case you don’t need to control for whether or not the people are rich or not because whether or not they wore a mask does not depend on whether or not they were rich.

          No – but this isn’t a parallel example. I’m talking about controlling for correlates that might be relevant to your proposed causal mechanism for the outcome of interest.

          In your case you would need to control for SES, if that were a potential correlate for the outcome of wearing masks. In fact it could well be a proxy correlate in your example if, say, income were a mediator between the person-density of a household and transmission outcomes. Or, you could control for that factor more directly.

        • The issue is whether the result is transportable to a different context. It’s entirely possible to “get” a group of people who you randomize, and discover that *in this group of people* assigning them to wear a mask *causes* them to have lower risk for infection… and yet, because the people you “get” are different from the population at large, “assigning” the population at large to wear a mask would have perhaps no causal effect, or even a reverse causal effect.

          For example if you recruit from healthcare workers you might have a different causal effect than if you recruit from say school teachers. For example perhaps making health care workers wear masks reduces their risk because they have realistic views of how effective masks are, but when you assign school teachers to wear masks they tend to act as if they’re invulnerable and do things like comforting children who are upset by getting up close to them, and therefore get more sick.

          Randomization only controls for confounding **within the population randomized**.
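          A sketch of this transportability problem, with entirely invented subgroup effects (the 5-point benefit for “careful” people and the 8-point behavioral backfire for “reckless” people are made up):

          ```python
          import numpy as np

          rng = np.random.default_rng(2)

          # Two hypothetical subgroups whose causal response to masking differs:
          # "careful" people: masks cut infection risk from 10% to 5%;
          # "reckless" people: masking emboldens them, raising risk from 10% to 18%.
          def rct_effect(frac_careful, n=200_000):
              careful = rng.random(n) < frac_careful
              masked = rng.random(n) < 0.5                 # coin-flip assignment
              p = np.where(masked,
                           np.where(careful, 0.05, 0.18),  # risk if masked
                           0.10)                           # risk if unmasked
              infected = rng.random(n) < p
              return infected[masked].mean() - infected[~masked].mean()

          effect_hcw = rct_effect(1.0)   # trial recruited only "careful" workers: ~ -0.05
          effect_pop = rct_effect(0.3)   # rollout to a 30%-careful population: ~ +0.04
          print(effect_hcw, effect_pop)
          ```

          Randomization is flawless inside each trial; the sign flip comes entirely from *who was recruited*, which is the point about confounding being controlled only within the population randomized.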

        • >Of course, therein lies a problem, as there are always potential problems with the recruitment process.

          Well yes, but to paraphrase Wittgenstein: about that which we cannot study, we must remain ignorant.

          There are some things we cannot hope for scientific knowledge about, you cannot really “fix” systematic issues in a satisfactory way.

        • Matty –

          I didn’t understand your comment at 3:19.

          Could you explain? It looks to me like you’re saying that there wouldn’t be any reason to study something like whether diet can affect cardiometabolic health in children because of the difficulty of recruiting a sample that’s completely free of any potential confound variables.

          But I have a hard time believing that’s what you really meant.

    • “How about just running bigger and better RCTs on masks? And on social distancing. And on the vaccines. The money and opportunity were there.” So, how long would that take? What do you do in the meantime?

  7. I generally like the article – but I think there’s an important caveat to its focus.

    I think that what scientists and researchers have and have not done (properly) is far from the dominant factor when considering counterfactual scenarios in which fewer people might have become ill or fewer lives might have been lost.

    I see a similar playing field in many areas of controversy on the public policy/science interface.

    With the advent of new technologies, large segments of the public believe strongly that they can Google or check Twitter and develop opinions on very complex topics that are sufficiently well-informed that they can just dismiss what “experts” in the fields have to say.

    Very obvious in this regard is climate change, where polling shows that large swaths of the public think they are well-informed enough to assess the science even while displaying little mastery of any of its technical aspects.

    Reinforcing that confidence is the dynamic where people can confirm their preexisting biases (which almost always align with ideological identity-orientation) by selectively culling through the views of “experts” who are similarly ideologically aligned.

    Reinforcing that dynamic is the echo chamber effect whereby many people are only likely to see “expert” opinions that are ideologically aligned.

    Finding fault with the work of researchers and scientists certainly has value. But I think people need to be extremely careful about reverse engineering from outcomes like the prevalence of illness and death from COVID to some counterfactual view that things would have been different had scientists and researchers conducted their work differently.

  8. I think people have been having problems weighing theory vs. evidence for a lot more than just face masks during the pandemic. Off the top of my head:

    1) The debate over the prevalence of asymptomatic and presymptomatic spread (Personally, I think the evidence remains all over the place here).
    2) Length of immunity post infection. (A lot of scientists started with a null of no immunity!).
    3) Efficacy/Safety of the vaccines for younger people. (it’s illegal for kids to get them, until one day soon it’ll become mandatory…)
    4) Efficacy/Safety of boosters.
    5) Efficacy/Safety of vaccine dosage sizes or mixing and matching.
    6) Face mask efficacy.
    7) Relative safety/danger of different activities

    A lot of frequentist studies assuming nulls completely contradictory to established theory.

    • Yep, the vaccine situation for the younger kids is a total disaster and a good example of this issue. Everyone knew back in March 2021 that it would be approved and eventually mandated for kids; let’s be realistic, no one thought “hey, this is super effective for people over 12, but people who are 11 or 10 are kind of like lizards or birds or jellyfish, they could be really entirely different and need a totally different kind of vaccine, or maybe they can’t get the virus, or they explode when needles enter their arm”… no, the entire question was: what dose? That question could have been answered within a month using… yep, you guessed it… mechanistic models. For example, scaling the dose by something as simple as bodyweight (or, a little more sophisticated, by long-bone mass) to get an approximate range; antibody concentration after 2 weeks was already shown to be a good (mechanistic) proxy for immunogenicity and protection, so you inject various doses into a random sample of 200 kids, wait a few weeks, measure antibody levels, and do a regression. By the end of April we could have approved the vaccine for 5-11 year olds. There’s no way to do that using an RCT of infection-rate outcomes. But doing the RCT leads to vastly worse outcomes overall (as Delta spread through schools in the South, for example).
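      The proposed design (dose a couple hundred kids, measure antibody titers, regress) can be sketched on synthetic data. The dose-response curve, noise level, dose arms, and “adult-equivalent” target below are all made up for illustration:

      ```python
      import numpy as np

      rng = np.random.default_rng(3)

      # Entirely synthetic: give a range of doses (micrograms) to ~200 kids,
      # measure log antibody titer after a few weeks, regress titer on dose.
      n = 200
      dose = rng.choice([5, 10, 20, 30], size=n)        # hypothetical dose arms
      # assumed log-linear dose-response plus noise (invented parameters):
      log_titer = 2.0 + 1.5 * np.log(dose) + rng.normal(0, 0.5, n)

      # ordinary least squares of log_titer on log(dose)
      X = np.column_stack([np.ones(n), np.log(dose)])
      beta, *_ = np.linalg.lstsq(X, log_titer, rcond=None)

      # pick the dose whose predicted titer matches a (made-up) adult target
      target = 2.0 + 1.5 * np.log(20)
      implied_dose = np.exp((target - beta[0]) / beta[1])
      print(beta, implied_dose)   # slope near 1.5, implied dose near 20ug
      ```

      The sketch recovers the assumed curve and the target dose from 200 subjects, which is the scale of trial the comment argues would have sufficed for dose-finding via a mechanistic proxy.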

        • Because the main effects of COVID are as bad as they are (in kids this typically isn’t death, but long covid, hospitalization, weeks of illness etc. are all possible; long covid itself affects something like 10% of children with COVID). All you need is that the side effects are not worse than the COVID main effects. You can already rule out the vaccine being worse than the disease in a couple-hundred-kid sample. An EUA with a staged rollout (say, start with 10-11 year olds, look at real-world reports, lower the age bar…) would have been perfectly fine. To decide that it’s dangerous, you have to somehow argue that the function “rate of serious complications” as a function of age has a nearly vertical shape near age 12 (going from say 5/100k for 12 year olds to say 5000/100k for 11 year olds as you decrease age). You’d notice that right away in a trial of 300-400 kids. And what makes 12 years old so special? Why did we not consider 15 year olds as dramatically different from 16 year olds? Why isn’t each year of life its own separate group? Why not each day of life? The fact is all of these functions are continuous and smooth with age, and extremely sharp “boundary layers” have never been observed in physiological measurements across 1-year ranges, at least for kids over 5 (kids between age 1 day and 2 years could be a different story perhaps).

          Often we think of Frequentist estimates as something like a Bayesian estimate with a flat prior. Often this is done in the context of say estimating a mean, and the shrinking of the posterior distribution by using an informed prior is often small and seen as maybe not very important. But when we design trials on the basis of different groups binned into ages being essentially 100% different from each other (a uniform prior from [0,1] for adverse effects for each group for example) what we wind up with is the nonsensical idea that as a function of age it’s totally plausible that say 12 year olds have an adverse effect rate of say .00005 while 11 year olds have .93 and 10 year olds have .04 and 9 year olds have .77 and 8 year olds have .042 and …

          that’s just not how things work, but it’s the logical consequence of treating each group as completely separate. The idea is so absurd of course, that they design the trial of 5 to 11 year olds as divided into 3 groups and being given 10, 20, or 30 micrograms each, depending only on group membership not age. Of course if you “really believe” the “pseudo prior” that each age group needs an entirely different study, then you also need to believe that doing this trial where you’d give a 5 year old the same 20ug dose as an 11 year old is SUPER DANGEROUS. No one believes that, as evidenced by the fact that they did in fact run the trial.
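          The “you’d notice that right away” claim is easy to quantify; the 5% rate below is the hypothetical from the argument above, not an estimate of any real adverse-event rate:

          ```python
          # If the true serious-adverse-event rate in under-12s really were 5%,
          # the chance that a 400-child trial observes *zero* such events is:
          p, n = 0.05, 400
          p_zero = (1 - p) ** n
          print(p_zero)    # ~1.2e-9: a rate that high is essentially impossible to miss

          # Expected number of events such a trial would actually see:
          print(p * n)     # 20
          ```

          So a few hundred subjects genuinely suffice to rule out the kind of order-of-magnitude cliff in adverse-event rates that the age-group-as-separate-species framing implicitly entertains.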

        • Daniel, just curious: are you familiar with what happened with the respiratory syncytial vaccine in the ’60’s?

          It’s been a while, but I remember from immunology classes that children’s immune systems are indeed functionally different from those of adolescents. Your slippery-slope argument with regard to age seems sort of silly to me. For policies, a line needs to be drawn somewhere. From a regulatory standpoint, the FDA treats children differently than adolescents.

          There are plenty of sharp “boundary layers” that take place during development and/or progressively change past the age of 5, from ovulation and production of estrogen/progesterone, to baby teeth falling out, to TLR1 receptors in the skin, to the ratio of lymphocytes to neutrophils in WBC, to raw neutrophil count, to basophil count, to IL-12 production, to hair falling out to menopause.

          Unrelated: why do you use such absurdly specific numbers and language when you try to provide illustrative examples? Is it really necessary to write “an adverse effect rate of .00005 while 11 year olds have .93 and 10 year olds have .04…”? It’s about as distracting as listing specific US cities, roller coasters, and plummeting from the sky when describing Bayesian modeling methods. I think it’s sometimes a disservice to your writing because it’s distracting! Sometimes what you write is *so specific* it’s easy to get caught up in trying to figure out if the specifics are relevant to the point being made.

        • In individuals, yes, suddenly you’ll have onset of menopause or puberty or acquisition of an allergy. Individuals will change wildly from one time to another. But averages across entire populations? Changing over orders of magnitude consistently within a year? Puberty onset is spread in the population over about 4 or 5 years (from say 10 to 15), so that kind of consistent boundary layer in an average across populations seems incredibly unlikely. I’m always open to data, but I’d be shocked to see the average rate of anything occurring in a population changing by 2 or 3 orders of magnitude within 1 year of age for anyone older than 5. OK, I’ll make an exception for rates of death in the relatively quite old (people over say 80). That really does have a boundary layer out in the 80-90 year range.

          Yes, I’m familiar with the RSV vaccine. But that vaccine wasn’t shown to be enormously successful in people over 12 and then suddenly unsuccessful in people under 12, it was just unsuccessful (as I understand it).

          Look, I’m certainly NOT arguing that we should have shortcut all study of the vaccine. Just confirming the manufacturing was consistent and successful would take months, so we had half a year to study the vaccine, from say May 2020 to Nov 2020… And by the Dec-Jan rollout to older groups we knew this vaccine was safe and effective in people over 18, thanks to trials and real-world data. What we *didn’t* know is what doses were needed in younger people, and whether there were *large* swings in the rates of adverse effects in children. The size of those swings needed to be quite large to disqualify the whole thing from kids. In the group 12-16, the rate of adverse effects was around 0.00005, which is why I mention that number. It would have to go to around .05 before it had a chance of being worse than COVID, or 1000x higher than it was in 12 year olds. Does anyone really believe that if you draw a curve of “rate of serious adverse effects” vs “age” that it has a shape where it is up around 0.05 for under 12 but plummets down to .00005 above 12? If so, you know what? It’s easy to test. Place a prior on the curve that incorporates that kind of wild swing, then do a 200 person trial, and we would have detected the problem if it existed. If the posterior curve stayed below .005 for all ages, it’d have been better than COVID by far, so approval would be necessary under any reasonable risk/benefit analysis. That analysis could have been done in March-April 2021. Instead the FDA demanded studies of tens of thousands of people over something like 6 months’ duration, while thousands of children were hospitalized in the US alone during the Delta boom.
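          A back-of-envelope version of that 200-person argument can be sketched in a few lines (the 0.05 and 0.00005 rates are the comment’s illustrative figures, not trial data):

```python
# Back-of-envelope check (illustrative numbers from the comment above, not
# trial data): if serious adverse effects in young kids really occurred at
# ~0.05 -- the rate the comment argues would make vaccination worse than
# COVID -- how likely is a 200-child trial to see *zero* such events?
n = 200
p_bad = 0.05       # hypothesized "dangerous" rate for under-12s
p_seen = 0.00005   # rate reported for the 12-16 group, per the comment

p_zero_if_bad = (1 - p_bad) ** n   # chance the trial misses it entirely
expected_events = n * p_bad        # events expected if the rate were 0.05

print(f"rate ratio bad/seen: {p_bad / p_seen:.0f}x")
print(f"P(zero events | rate={p_bad}, n={n}) = {p_zero_if_bad:.1e}")
print(f"expected events at rate {p_bad}: {expected_events:.0f}")
```

          So an effect of that size would almost certainly surface in a couple hundred children (roughly 10 expected events, and well under a 1-in-10,000 chance of seeing none); what a trial that small cannot do is bound rates down at the 1-per-10,000 level.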

      • @ Daniel, what do you make of this paper that just came out? The vaccines also appear about 60-70% effective at preventing non-covid deaths:

        The lower mortality risk after COVID-19 vaccination suggests substantial healthy vaccinee effects (i.e., vaccinated persons tend to be healthier than unvaccinated persons) (7,8), which will be explored in future analyses.

        https://www.cdc.gov/mmwr/volumes/70/wr/mm7043e2.htm

        There was no difference in all cause mortality during the RCTs, and death rates in the unvaccinated were similar to those found in the standard life tables (https://www.ssa.gov/oact/STATS/table4c6.html). The rate in the vaccinated group is the one that’s abnormally (~65%) low.

        • I feel like you answered it yourself. More risk-averse people got the shots, and that leads to lower mortality rates over that period. Just selection bias.

          It also falls into the same idea we’ve been talking about, where without a mechanism for lower non-covid mortality rates—they’re vaccines, not miracle drugs—I’m gonna discount the results of an observational study like this.

          Side note: I’d also be a little skeptical when comparing to standard life tables. I would be unsurprised if they shifted during the pandemic.

        • @kj

          I have been waiting for this type of data for a while. It is still not exactly what I wanted, which is simply all cause mortality in vaccinated vs not (by age, etc), but it is getting closer. There are some strange exclusion criteria they used as well, but it is the best we have right now.

          To me this says looking only at covid mortality without also comparing to all cause mortality can be very misleading when it comes to vaccine effectiveness against death. Probably hospitalizations too.

        • “I feel like you answered it yourself. More risk-averse people got the shots, and that leads to lower mortality rates over that period. Just selection bias.”

          This is likely to be the answer but this isn’t what I would have expected. If you had asked me, I would have said that people at greatest risk of a very negative COVID experience would be more likely to get the shots; that would be the frail elderly and people with obesity or diabetes or immunosuppression. In short, I would have expected the vaccinated population to be _less_ healthy than the unvaccinated, and that we would therefore see _higher_ all-cause mortality among the vaccinated. Indeed, I would have been willing to wager on that!

        • Well they adjusted for age, sex, ethnicity and location. Unsure how well they adjusted, I didn’t look too deeply at it. There are a lot of things correlated with vaccination.

        • I didn’t look too deeply at it.

          If vaccines appear ~90% effective at preventing death but this indicates that ~65 of those 90 points are due to selection bias, then they are really only ~25% effective.

          Seems worth looking into to me…

        • Yes, the adjusted for some things but not for obesity, diabetes, general frailty, etc. I would have expected that within any (age, sex, ethnicity, location) the people at higher risk would have been more likely to get the vaccine. Perhaps I would have been wrong.

        • Anon, the vaccines are obviously waaaay better than 25% at preventing hospitalization or worse. According to https://www.cdc.gov/mmwr/volumes/70/wr/mm7037e1.htm “Averaged weekly, age-standardized incidence rate ratios (IRRs)…among persons who were not fully vaccinated compared with those among fully vaccinated persons…for hospitalizations and deaths decreased between the same two periods [April 4–June 19, and June 20–July 17], from 13.3 (95% CI = 11.3–15.6) to 10.4 (95% CI = 8.1–13.3) and from 16.6 (95% CI = 13.5–20.4) to 11.3 (95% CI = 9.1–13.9). Findings were consistent with a potential decline in vaccine protection against confirmed SARS-CoV-2 infection and continued strong protection against COVID-19–associated hospitalization and death.”

          I know you don’t need help interpreting those numbers, but for people who do need help: For people of a given age, in spring an unvaccinated person was about 13 times as likely as a fully vaccinated person to be hospitalized for COVID. By summer that factor had fallen from about 13 to about 10. Some of this is likely due to riskier COVID behavior by people who haven’t been vaccinated — you’d think if you were eschewing the vaccine you’d be more careful but that is evidently not the way people’s minds work — but I don’t see how that could possibly account for a 10x difference or even a 5x difference. Well, OK, it _possibly_ could, but I think that’s extremely unlikely.

        • @Phil

          How do you reconcile essentially no difference in all cause mortality during the RCTs* with this observational study that shows 18-64 year old unvaccinated are 17x more likely to die of covid?

          * IIRC, for Pfizer it was 18 vaccinated vs 16 unvaccinated out of ~20k subjects each, with median age of ~50 years

          Could it be that vaccinated are 50+% less likely to get tested (the CDC even told vaccinated people they did not need to get tested) and there is further selection bias that makes that group 35% as likely to die regardless?

          We don’t know, because no one has been collecting data on that.

        • Anon,
          I remember looking at results from one of the RCTs, months ago, but can’t find it now. Maybe you can post a link.

          I note that Table 21 in https://www.fda.gov/media/146217/download shows results from the J&J trial, and with about 22k people in each arm of the trial (vaccine and placebo) they had a total of 8 (all-cause) deaths in the vaccine group and 39 total deaths in the placebo group.

          Hmm, interesting, they only attribute 7 of those 39 deaths to COVID (and zero in the vaccinated group), suggesting 8 non-COVID deaths in the vaccinated group and 32 in the placebo group.

          Short answer is: I want to look at the trials data again before I speculate. But it’s hard for me to believe anyone looking at the real-world situation could doubt that the vaccines are quite effective at overall risk reduction. All-cause mortality is way up, almost entirely due to COVID mortality. Areas are not finding that their hospital ICUs are clogged with vaccinated people; it’s the unvaccinated who are doing the extra dying.

        • During the blinded, placebo-controlled period, 15 participants in the BNT162b2 group and 14 in the placebo group died; during the open-label period, 3 participants in the BNT162b2 group and 2 in the original placebo group who received BNT162b2 after unblinding died. None of these deaths were considered to be related to BNT162b2 by the investigators. Causes of death were balanced between BNT162b2 and placebo groups (Table S4).

          https://www.nejm.org/doi/full/10.1056/NEJMoa2110345

        • Also, I think you misread that table; there were only 19 total deaths:

          As of January 22, 2021, 19 deaths were reported (3 vaccine, 16 placebo). Two deaths in the vaccine group were secondary to respiratory infections not due to COVID-19. A 61-year-old participant died of pneumonia on Day 24 following onset of symptoms on Day 13. A 42-year-old participant with HIV died on Day 59 following diagnosis of a lung abscess on Day 33. A 66-year-old participant died of unknown causes after waking up with shortness of breath on Day 45. The placebo recipients died of pneumonia (n=2), suicide (n=1), accidental overdose (n=1), myocardial infarction (n=1), malaise (n=1), unknown cause (n=3) and confirmed COVID-19 (n=6). An update on deaths reported from the time period of January 22 to February 5 included an additional 6 deaths. Of these 6 deaths, 2 occurred in the vaccine group and 4, including 1 due to COVID-19, occurred in the placebo group. None were related to the study product.

          So the placebo group died at the same rate as both the vaccinated and placebo arms in the Pfizer trial, but the J&J vaccine group apparently died much less. J&J was supposedly much less effective against symptomatic covid, but it was much better vs all-cause mortality? I’d guess something else is going on here, related to follow-up or something.

        • Anoneuoid –

          Remember this?

          Anoneuoid
          on August 18, 2021 at 1:13 pm said:

          […]

          You should take issue with claims from those with a bad track record that mislead you. Not me. If you ever want to put your money where your mouth is let me know.

          Given that you have a “bad track record” and lost money putting it where your mouth is, I think you should provide a clear explanation of what you got wrong and why you got it wrong.

          In addition to significantly overestimating the decline in cases in the US, you also significantly overestimated the rate of deaths in Florida:

          https://statmodeling.stat.columbia.edu/2021/10/22/how-did-the-international-public-health-establishment-fail-us-on-covid-by-explicitly-privileging-the-bricks-of-rct-evidence-over-the-odd-shaped-dry-stones-of-mechanistic-evidence/#comment-2026919

          Given that a regular theme on this blog is the problem of people not confronting their errors, it seems to me it would be appropriate for you to confront yours.

          I think that some of your comments are interesting and informative. But it helps me to assess the value of someone’s contributions if they explain the reasons for their errors.

          Why were your forecasts so significantly off?

        • Why were your forecasts so significantly off?

          Wait a couple weeks. It will all be explained. I am just waiting for some lagged data to come out; then I’ll send Andrew an email hoping he can make it a post.

          But essentially the 2021 summer heat lasted an extra month (or maybe a couple weeks, since it is monthly data). Weekly cases are at ~500k right now, so I was off by 2.5 weeks. They should start rising soon.

          The correlation between total heating + cooling degree-days and covid rates is pretty unbelievable going back all the way to early 2020. Looking at energy consumption for heating/cooling is even better.

          I personally don’t believe it so am waiting for more data.

        • Anoneuoid –

          > Weekly cases are at ~500k right now,

          Accepting that the numbers are not precise, and with a big caveat that I could easily have made simple mathematical errors, it looks to me that according to the CDC, for the 7 days ending on Oct. 22 – if you figure a 6% rise for the lag (which is about the amount of increase comparing the number now for the week leading up to Oct. 5 with the number that it was on Oct. 6, as documented in the post on the bet coming to a close) – the number of cases for Oct. 22 will be about 531,000. So I’d say it’s going to be significantly higher than 500k for today, and it will be a while longer before it reaches 500k.

          > so I was off by 2.5 weeks.

          That’s one way to look at the size of your error. Here’s another way:

          Let’s go back to the original statement, when you said (on September 3) the number would drop to 500k in 28 days. It’s now 51 days, and it’s not there yet.

          So it looks like it will wind up taking close to TWICE as long as you thought it would take to reach that number.

          If you had made a prediction of that drop to take place in one day and it actually took 2.5 weeks, would you have said “Well, I was only 2.5 weeks off!”

          Further, on September 24th, you underestimated the per-day deaths in Florida by somewhere in the range of ~50%-100%.

          That would have been 3 weeks into your unexpected hot spell of around 3 weeks or so? So it seems to me that an unexpected hot spell of around 3 weeks wouldn’t explain that error.

          But the main problem I have is how you used one state, Hawaii, as evidence for interpretation of the effect of “seasonality” (above any of the myriad other influences), essentially (in my view) weakly handwaving at the use of air conditioning (and ignoring many complications with that mechanism of causality) but haven’t addressed whether, perhaps, in many other states the change in average temperature (plus perhaps humidity factored in somehow) would have been even greater but the drop was less. Or other states where there were very significant drops (or maybe increases) in cases without changes in weather of a similar scale.

          Hopefully you’ll address some of that when you present your explanation for how you were wrong, but only because you were actually even more right (about the impact of “seasonality” as a function of temps and humidity) than even you thought you were.

        • @Joshua

          I will bet you $200 that weekly covid cases are back over 1 million by last week of Feb 2022. We can do the same where winner donates to Stan dev.

          There is no need to write lots of paragraphs in response.

        • Anoneuoid –

          > I will bet you $200 that weekly covid cases are back over 1 million by last week of Feb 2022. We can do the same where winner donates to Stan dev.

          I have no dog in that fight.

          My interest is in your analysis of the effect of seasonality, and the extent to which it is basically the single driver of COVID rates as you seem to be arguing, and even there whether the effect of “seasonality” is simply a function of temps and humidity, as you seem to be arguing.

          That’s the point I’m raising with you, and the case rate in the last week in Feb. may very well tell us not very much about the issue I’m interested in.

          Your bet proposal seems to indicate that you still don’t even understand that what I’m saying is that the trends are only partially explained by “seasonality” and that “seasonality” is a more complex phenomenon anyway.

          BTW, I thought of perhaps another way of looking at the magnitude of your error. You predicted a drop of about 56% (from Sept. 14 to Oct 1). But in reality it was more like 35%. I’m basing that on the 7-day moving average rates on the CDC website.

          Again, the caveat about simple math errors applies.

          That looks like a pretty significant error to me. Especially since we’re talking about that error over a pretty short time frame.

        • > I will bet you $200 that weekly covid cases are back over 1 million by last week of Feb 2022. We can do the same where winner donates to Stan dev.

          You have to be more precise than that. Indeed you should be, if you believe in your seasonality model. Given your failed forecast was “closely bracketed”, you should give a range of plausible values, and replace the “by the last week” with “on the last week”. And also what odds do you hold on this wager?

          For instance, do you believe that, with your seasonality model, the cases will be, say, 1-1.2 mil in the last week of Feb, and will you pay up 200 dollars if it is not, while I’ll have to pay up, say, 50 dollars if it is?

          Because I do not agree with your seasonality model, but I also do not believe that COVID-19 numbers will necessarily fall either; rather, I think they are dominated by factors that are difficult to predict.

        • Guys, he has proposed a bet. You can take it or leave it, but telling him he’s proposing it wrong seems kinda silly. If you don’t like his wager you can suggest one of your own.

          Me, I’m mulling this over. We are going to see cases go up due to seasonality. But how high? A lot of people have been vaccinated now, which means (a) fewer infections, but also (b) a higher percentage of asymptomatic infections, presumably leading to fewer tests per infection and thus fewer cases. On the other hand, as people relax about COVID restrictions — easily visible in my city, and I think just about everywhere — things could easily get out of hand again.

          Anon, would you be willing to bet on hospitalizations rather than cases?

        • Phil –

          > but telling him he’s proposing it wrong seems kinda silly.

          I didn’t say he proposed the wrong bet. I said he proposed a bet that I’m not interested in. If you’re interested in a bet with him, then propose that bet. It seems kinda silly for you to mischaracterize what I said rather than just make a bet with him if you want to.

          I am interested in his arguments about the effect of seasonality. And rather than deal with problems with his theory that I presented to him – such as the way he framed the size of his error in a way that suggests he has an “agenda” – he proposed a bet as if doing so (or the outcome of the bet) answers my comments related to his theory of seasonality.

          Actually, it’s somewhat orthogonal.

          I have more questions, also.

          If this recent drop was due to temperature plus humidity, and those phenomena singularly drive infection rates, then why were there much steeper drops in cases in the US than this recent drop (if indeed it turns around pretty much tomorrow as he says it will)?

          Why was there an increase from mid-June to mid-July of 2020?

          Why was there practically no drop at all from early Sept to early Oct in 2020, but this year there was a large drop?

          Why was there a large increase from mid-June to mid-July of 2020 while cases were flat from mid-June to mid-July of this year?

          There are more problems as well, but I’ll let him avoid answering these questions (by, among other things, proposing a bet as if it answers those questions when it doesn’t) before I ask more.

        • Anon, would you be willing to bet on hospitalizations rather than cases?

          I haven’t been looking at hospitalizations, and am more interested in cases, since all the day-to-day stuff about wearing a mask and getting vaccinated is about stopping the spread. I don’t believe these interventions can be more than 20% effective combined, since the vaccines do not give you mucosal immunity (also, mucosal immunity due to infection is only going to last 0.5-2 years anyway) and the masks being worn are not going to be meaningfully effective vs aerosols.

          I do believe that the vaccines should protect against severe illness, until the virus mutates in the right way and/or immunity wanes sufficiently. So I would rather bet on the cases.

        • Anoneuoid: “all the day-to-day stuff about wearing a mask and getting vaccinated is about stopping the spread. I don’t believe these interventions can be more than 20% effective combined”

          breakthrough infections occur much less frequently among vaccinated people, this study says:

          The 338 reported VBT cases represent 0.14% of Washoe’s vaccinated population across the study period, compared to a 2.54% rate in those unvaccinated.

          In this case series analysis from Washoe County, Nevada, the rate of breakthrough infections was more than 18-fold lower than the rate of infections among unvaccinated individuals. Most importantly, rates of severe illness were low in vaccinated individuals. These findings add to a growing body of data that can reassure the public on the real-world effectiveness of the COVID-19 vaccines.

          Source: https://www.medrxiv.org/content/10.1101/2021.09.09.21262448v1

          If you cut a replication rate of R=6 by a factor of 18, you’re down to R≈0.33, which easily stops the pandemic.
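          A minimal sketch of that replication-rate arithmetic (R0=6 and the 18-fold ratio are the figures assumed in the comment, not measured constants):

```python
# Illustrative: cut an assumed unmitigated replication rate R0 by the
# ~18-fold infection rate ratio reported for unvaccinated vs vaccinated.
R0 = 6.0            # assumed replication rate with no vaccination
irr = 18.0          # infection rate ratio from the Washoe County study
R_vax = R0 / irr    # effective replication rate if everyone were vaccinated

print(f"R under full vaccination: {R_vax:.2f}")  # below 1 -> epidemic shrinks
```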

        • > breakthrough infections occur much less frequently among vaccinated people, this study says:

          There also seems to be evidence that vaccinated people are infectious for a shorter period.

        • > Guys, he has proposed a bet. You can take it or leave it, but telling him he’s proposing it wrong seems kinda silly.

          But the point of the bet proposal is to serve a rhetorical function on his certainty about his hypothesis. If I say “I’m certain that covid-19 will be over soon, I bet you $500 that it’ll be below 10 mil cases a day in the US next week”, I’m not making much sense.

        • But the point of the bet proposal is to serve a rhetorical function on his certainty about his hypothesis

          The point is to be proven wrong, which motivates me to get better.

          I’ve mentioned something similar to the Joshua poster before; some people just don’t “get it,” apparently. They project their own motivations onto others.

          I *like* being proven wrong.

          Your bet still needs to relate to your claim. There’s any number of reasons why case rates could go high at some point that have nothing to do with your claims, some of which even directly contradict your claims. Further, given your previous wager didn’t prove you wrong, maybe you should design this wager so that it is actually capable of proving you wrong.

        • > *like* being proven wrong.

          Anoneuoid –

          Since that’s the case, then why do you misstate facts in a way that obscures your being wrong, and skew numbers in such a way as to minimize the degree to which you were wrong?

        • Anoneuoid –

          I tried multiple times to get you to remember to account for the lag in reporting before you think you know the number of cases of COVID. Remember when you said this?

          > Anoneuoid
          on October 24, 2021 at 9:27 pm said:

          […]

          >> Weekly cases are at ~500k right now, so I was off by 2.5 weeks.

          Actually, according to the CDC as of today, the number of cases for the week prior to Oct. 22 was ~542K.

          Incidentally, we are now at right about 500k. So instead of being off by two weeks, it’s more like you were off by 5 weeks. And even that’s misleading, because you said it was a virtual certainty we’d hit that number in 28 days; it’s going to have taken more than 70 days (assuming we will get there soon).

        • Anoneuoid –

          Regarding your confidence that your analyses aren’t agenda-related. And your certainty that your view of “seasonality” is so spot on…

          Initially you thought that you understood the factors affecting case rates in Florida, when actually you didn’t even know that there was a very large lag in the official counting of deaths.

          But even after you were given that information you were mistaken in your confidence of your ability to understand the factors contributing to the death rate:

          Anoneuoid
          on September 24, 2021 at 3:37 pm said:

          […]
          It looks like, in August, about 10% of deaths were reported with 1 day delay, 50% were reported with 1 week delay, and 70% within 2 weeks. So I would guess the final number for today is somewhere around 100-150 deaths.

          As of today, the 7-day moving average for deaths in Florida for September 24th is 211 (Worldometers has it at 220). I would say that the lag is pretty much closed at this point (sometimes the lag in recording does extend even longer than a month, but I don’t think it does very often).

          I wonder if you’d consider reevaluating your confidence in how you assess COVID. When dealing with the plain fact that the number you said was a “near certainty” for the country as a whole was in fact off by some 50%, you offered a rather contrived explanation where you actually went on to assert that if anything, you underestimated the effect of “seasonality” in states like Florida. But even weeks later you were off by close to 100%, overestimating the effects of seasonality in Florida.

          Why were you so wrong? Being off repeatedly, with systematic errors in the same direction, implies to me some kind of conceptual or fundamental analytical error.

          https://statmodeling.stat.columbia.edu/2021/09/24/how-to-think-about-evil-thoughts-expressed-by-entertainers-on-news-shows/#comment-2024379

        • Anoneuoid –

          Regarding your confidence about the ineffectiveness of vaccines:

          Among 179 COVID-19 case-patients, six (3%) were vaccinated and 173 (97%) were unvaccinated (Table 2). Overall, 77 (43%) case-patients were admitted to an intensive care unit, and 29 (16%) critically ill case-patients received life support during hospitalization, including invasive mechanical ventilation, vasoactive infusions, or extracorporeal membrane oxygenation; two of these 29 critically ill patients (7%) died. All 77 case-patients admitted to the intensive care unit, all 29 critically ill case-patients, and both deaths occurred among unvaccinated case-patients. Among 169 case-patients with available hospital discharge data, the median length of hospital stay was 5 days (interquartile range [IQR] = 2–9 days) for unvaccinated case-patients and 3 days (IQR = 2–4 days) for vaccinated case-patients.

          https://www.cdc.gov/mmwr/volumes/70/wr/mm7042e1.htm?s_cid=mm7042e1_w

      • Caveat first – there are many moving parts (some of which conflict) and confounds and mediators and moderators and interaction effects – so I put no particular stock in anything this article says.

        On the other hand, the mention of the delay in vaccinating kids is worth noting:

        https://www.nbcnews.com/news/world/new-variant-no-masks-driving-uks-latest-covid-surge-rcna3392

        Also,

        https://twitter.com/JusDayDa/status/1451419090782810115?s=20

        • Here’s a nice data based article from USA Today on what happened with kids over the last few months:

          https://www.usatoday.com/in-depth/graphics/2021/10/08/covid-19-kids-cases-hospitalizations-deaths/8361479002/

          It’s not clear to me if the graph “admitted to the hospital each week” means “number in hospital in that week” or “number newly admitted during that week”…

          Taking the least bad interpretation, it looks like maybe 3500 kids were in hospital at the peak. Hard to say, but perhaps 5000 total hospitalized over the last few months? Now, credible rates for the heart inflammation are something like 5 to 10 per 100k for the 30ug dose given to 12-18 year olds; most need a couple days of ibuprofen and resolve on their own, and an even smaller percentage need some hospital care. Let’s say we vaxxed 20M kids and had 10/100k: that would have been 2000 kids with some mild inflammation, and maybe 200-500 who needed hospitalization, probably at least a factor of 7 fewer than what actually happened due to COVID.

          The dose of 10ug which was eventually decided on undoubtedly will produce dramatically less cardiac inflammation than the 30ug dose would have.

          Suppose you start with Beta(2,10e3) as a prior on the rate of heart inflammation: you’re 99% sure it’s less than 66/100k, which allows for a factor of 10 or more higher than what’s seen in 12 year olds with 3x the dose…

          Suppose it really is 100/100k and you add 20000 kids… you’ve still got an 82% chance of getting zero cases, and if you get 0, your 99%tile on the rate is still going to be 22/100k. If you’re “lucky” enough to get 1 case your 99tile will be 28/100k.

          So the study is just wanking anyway: you’ll only detect the 10-100 fold higher risk in the rollout to millions of kids, even after all that 6 months of delay you imposed on kids.

          Of course, it’s not wanking to do a safety and immunogenicity (antibody titer) study on 200 or 500 or 1000 kids, because if the bad effects are at a rate of .01 or higher you’ll detect them, and with a couple hundred kids you can do a regression and get a dosage recommendation. But requiring 20k or 30k kids and 6 months to establish effectiveness from scratch, as if we didn’t have 1 billion doses worldwide already given and already well established as effective, really is utterly unreasonable on any kind of cost/benefit basis.
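          The Beta-prior tail bounds quoted above can be checked directly. A minimal sketch, assuming scipy is available; the Beta(2, 10e3) prior and the 20,000-child update are the comment’s illustrative figures, not trial data:

```python
# Check the comment's Beta-prior tail bounds: starting from a Beta(2, 10e3)
# prior on the myocarditis rate, compute the posterior 99th percentile after
# observing `events` cases among `n` vaccinated children.
from scipy.stats import beta

def upper99_per100k(events, n, a0=2, b0=10_000):
    """99th percentile of the Beta(a0+events, b0+n-events) posterior,
    expressed per 100,000 (standard Beta-binomial conjugate update)."""
    return beta.ppf(0.99, a0 + events, b0 + n - events) * 100_000

print(round(upper99_per100k(0, 0)))       # prior bound, ~66 per 100k
print(round(upper99_per100k(0, 20_000)))  # zero cases in 20k kids, ~22 per 100k
print(round(upper99_per100k(1, 20_000)))  # one case in 20k kids, ~28 per 100k
```

          The three printed bounds reproduce the comment’s 66/100k, 22/100k, and 28/100k figures, so the quantile arithmetic checks out under a 20,000-child update.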

        • Daniel –

          Did you see this?:

          https://www.ft.com/__origami/service/image/v2/images/raw/https%3A%2F%2Fd6c748xw2pzm8.cloudfront.net%2Fprod%2F254c3620-334e-11ec-83ee-ddc8940a9a22-fullwidth.png?dpr=1&fit=scale-down&quality=highest&source=next&width=1260

          From this article…

          https://www.ft.com/content/1f57838a-24d2-40d5-b314-2d8345a6e001

          Personally, I have reservations about the conclusion of causality for the trends being attributed to delays in vaccination – because many people have been wrong thinking they understand causality with COVID trends. But the case is certainly a strong one.

        • It seems like a pretty obvious thing for that mechanism to predict, and then it came true, so it’s certainly what I’d call good evidence for the hypothesis that if you don’t protect kids, not only do the kids suffer but their parents suffer and their grandparents suffer and their teachers suffer and so on. Anyone who has ever had kids in daycare will tell you this! Kids in daycare nearly killed me, no joke.

  9. Theory can’t stand alone, and empirical evidence in the human sciences is rarely enough on its own either.

    Evidence is of or for something (e.g., a theory) – it doesn’t make sense to talk about evidence “on its own”. Maybe “evidence” is being used as a synonym for “data” here? If so, I would argue that it’s important to maintain a distinction between the two concepts.

  10. There are some assertions in the article regarding droplet transmission that I disagree with.

    > airborne transmission is strongly suggested by well-documented super-spreader events (such as singing performances) and nosocomial outbreaks (within healthcare facilities)

    Singing, yes, but healthcare involves prolonged close personal contact, and thus facilitates droplet transmission.

    > Just as in the 1850s, policymakers assumed the mode of transmission rather than seeking empirical demonstration of it. They dismissed claims from people who argued that the virus was—or could be—significantly spread through the air.

    There was empirical demonstration, see below.

    > an exclusively droplet mode of transmission

    Nobody suggested that. From the beginning, the WHO advised special precautions in situations where aerosols were more likely to occur, specifically when Covid patients get intubated.

    The first outbreak in Germany happened at a car parts manufacturer near Munich late in January, and German virologists took the opportunity to study the disease.
    It was quickly confirmed that the virus initially replicates in the throat, and that patients start being infectious before they are symptomatic. This mechanistic knowledge (of the virus replicating in the mucous membranes of the throat) suggested droplet transmission, because mucus doesn’t aerosolize easily (heavy breathing from singing or bodily exertion being exceptions, as well as intubation).

    This was confirmed by contact tracing: basically, the whole car parts company was investigated, and if the virus had transmitted well via aerosols, sharing hallways or the common cafeteria should have posed an infection risk to everyone. This was not observed. (I always felt the fact that a large number of people on the Diamond Princess cruise ship were never infected supports this take.)

    https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30314-5/fulltext?fbclid=IwAR1zMrcF9K2BIPre77PrHBmiHAOuhv3vR83vkr5mMN_hRVkYdAlJuhiXg7E

    Contacts were classified as high risk if they had cumulative face-to-face contact with a patient with laboratory-confirmed SARS-CoV-2 infection for at least 15 min, had direct contact with secretions or body fluids of a patient with confirmed COVID-19, or, in the case of health-care workers, had worked within 2 m of a patient with confirmed COVID-19 without personal protective equipment. All other contacts were classified as low-risk contacts.

    [..]

    No cases occurred among the 108 identified low-risk contacts.

    This evidence confirmed the notion that transmission was mainly via droplets.

    From this evidence, it was correct to advise the population to significantly reduce the number of high-risk contacts; i.e., it was to be expected that reducing the number of close contacts and increasing physical distancing would significantly reduce transmission of the virus, which it did.

    It was also clear very early on (lesson learned from the Washington choir) that communal singing or wind instruments had to be regulated beyond regular distancing; the same would be true for gyms.

    I believe that early in 2020, CDC advice differed; but I believe most of Europe and the WHO followed this German observational evidence.

    —-

    At the time (March/April 2020), medical face masks were in short supply; since they had to be assigned to exposed health care workers, it was not possible to enact a public medical mask policy, even if the benefits had been clear. What happened instead was that people started making ad-hoc masks at home from whatever cloth they had; and initially, there was zero evidence that these non-medical masks would have a discernible effect on transmission at all.

    There is no legal basis to mandate mask wearing if it isn’t clear that they help fight the epidemic.

    • > This evidence confirmed the notion that transmission was mainly via droplets.

      Did it though? I’m not sure that evidence is inconsistent with airborne transmission. I had gotten the impression that the binary cutoff between particle sizes for droplets/aerosols, as well as the 2m safe distance, were kinda arbitrary, and that covid was exhibiting transmission characteristics in between the two.

      Do you feel that covid is still primarily droplets? Perhaps by taking precautions we have a selection bias and mainly observe the remaining airborne transmission? Or maybe subsequent variants have moved to be more airborne? Or do you think the German data turned out to be too noisy?

      I’ve felt the evidence has been kinda all over the place sometimes, and am curious to hear your thoughts on reconciling it all.

      • the binary cutoff was not only arbitrary but just wrong. It was set at 5 microns because that was the size of particle that could carry tuberculosis deep enough into the lungs to allow that bacterium to infect the tissue that was vulnerable to its infection method. COVID will happily infect your nasal passages or throat, so a 100um particle will get there no question, and plenty of particles between 5um and 100um float around in the air. We discussed an article about this a few months back here on the blog https://statmodeling.stat.columbia.edu/2021/08/06/the-60-year-old-scientific-screwup-that-helped-covid-kill/
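The physics behind the "cutoff was just wrong" point can be sketched with Stokes-law settling times. This is a rough sketch only: it assumes still air and spherical water droplets, and Stokes drag is only approximate for the largest sizes.

```python
# Stokes-law settling time for respiratory droplets: a rough sketch of why
# the historical 5-micron "aerosol" cutoff is physically arbitrary.
# Assumes still air and spherical water droplets; Stokes drag is only
# approximate for the largest sizes.

RHO_WATER = 1000.0   # kg/m^3, droplet density
MU_AIR = 1.8e-5      # Pa*s, dynamic viscosity of air
G = 9.81             # m/s^2

def settling_time(diameter_um: float, height_m: float = 1.5) -> float:
    """Seconds for a droplet to fall from mouth height, per Stokes' law."""
    d = diameter_um * 1e-6
    v = RHO_WATER * G * d * d / (18.0 * MU_AIR)  # terminal velocity, m/s
    return height_m / v

for d_um in (5, 20, 100):
    print(f"{d_um:>3} um droplet: ~{settling_time(d_um):,.0f} s to fall 1.5 m")
# A 5 um droplet stays aloft for ~half an hour; 100 um falls in ~5 seconds,
# with a whole continuum of intermediate behavior in between.
```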

      • It’s not really that relevant where the cutoff is.

        The observation was that high-risk contacts had a substantial infection risk and low-risk contacts had very little; so no matter what the mechanism, the advice to avoid high-risk contacts (unprotected close personal contact > 15 min) was epidemiologically sound and covered by evidence, and that was the basis of legislation that successfully fought back the first wave.

        It led to splash shields at supermarket checkouts, school closures, and much working at home.

    • ” and if the virus had transmitted well via aerosols, sharing hallways or the common cafeteria should have posed an infection risk to everyone. ”

      This leap of logic is mistaken. Under the circumstances of that study it was **kind of** understandable. However, in the later study of the call center in S Korea, it was shown that the people whose cubicles were air-system-downwind of the infected individual on the same floor became infected, while no people who rode elevators and shared hallways with that individual were infected. So brief contact in well-ventilated areas isn’t enough to transmit the infection – which explains why it’s safe, and your mask is pretty much irrelevant, in big box stores – but extended exposure to modest concentrations of the virus is enough to transmit the infection, as is modest time of exposure in poorly ventilated areas where the concentration can grow over time.
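The point about concentration growing over time in a poorly ventilated room corresponds to the standard well-mixed-room model. A minimal sketch, with all parameter values illustrative rather than measured:

```python
import math

# Well-mixed-room sketch: with one infectious emitter, airborne
# concentration rises toward emission/(ACH * volume), so both the
# ventilation rate (ACH = air changes per hour) and the *duration* of
# exposure drive the inhaled dose. All parameter values are illustrative.

def concentration(t_h: float, emission: float, volume_m3: float,
                  ach: float) -> float:
    """Concentration (arbitrary units per m^3) after t_h hours in a
    single well-mixed zone with first-order air exchange."""
    steady = emission / (ach * volume_m3)
    return steady * (1.0 - math.exp(-ach * t_h))

def inhaled_dose(t_h: float, emission: float, volume_m3: float, ach: float,
                 breathing_m3_per_h: float = 0.5, steps: int = 1000) -> float:
    """Trapezoidal integral of breathing rate times concentration."""
    dt = t_h / steps
    total = 0.0
    for i in range(steps):
        c0 = concentration(i * dt, emission, volume_m3, ach)
        c1 = concentration((i + 1) * dt, emission, volume_m3, ach)
        total += 0.5 * (c0 + c1) * dt
    return breathing_m3_per_h * total

# Same emitter, same 50 m^3 room: a 2-hour stay at 1 air change per hour
# vs. a 5-minute pass-through at 6 ACH differ in dose by a factor of
# several hundred.
long_stay = inhaled_dose(2.0, emission=100, volume_m3=50, ach=1)
pass_through = inhaled_dose(5 / 60, emission=100, volume_m3=50, ach=6)
print(f"dose ratio: {long_stay / pass_through:.0f}x")
```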

      I say it’s only “**kind of** understandable” because the idea of accumulating exposure to a pollutant of variable concentration, and the idea that ventilation would affect the concentration of an airborne pollutant, are hardly new. Beyond that, there is widespread evidence of mass transmission during 1918, and that was ignored despite the obvious parallels. Slide 8 of this pdf notes the progress of the flu at Fort Meade: Sept 17, a “handful” of soldiers have influenza; only seven days later:

      “800 new flu patients hospitalized; six companies of the 71st Infantry ordered out of barracks to tents four miles away; all of Camp Meade
      placed under quarantine.”

      Such rapid transmission is simply impossible by droplets or touch. If you think this problem through the only way such rapid transmission is possible is via aerosol.

      On top of **all of that**, the German study is just one study. While in the closed context of the single study it’s *kind of understandable* that they reached the conclusions they did, they missed a HUGE factor: the effects of ventilation. This is why we don’t rely on single studies in science. We require multiple studies – not one or two but dozens and even hundreds. Science is hard. It’s easy to miss things.

      • The problem is that you don’t have “multiple studies” when the epidemic is starting. You work with the evidence you have, not with the evidence you want.

        My point is that Greenhalgh writes that the droplet idea was all prejudice and no evidence, and that’s simply not true.

        The Korean call center outbreak is documented at https://wwwnc.cdc.gov/eid/article/26/8/20-1274_article and that does not mention ventilation at all; also, the outbreak was recognized on March 9th, and the analysis came too late to inform countermeasures for the first wave in Western Europe and America.

        Figure 1 looks like one person infected 10 others, who became symptomatic after a few days, and then these 10 infected a much bigger wave that followed.
        If I assume that close seating and personal contact facilitated this transmission pattern, it looks like that explains most of these infections.

        I would also like to know how loudly people are speaking in that call center.

        • “The problem is that you don’t have “multiple studies” when the epidemic is starting”
          But there were multiple studies, because “superspreader” events had occurred during the previous SARS outbreaks as well, in fact the Korean study you link to states: “Severe acute respiratory syndrome coronavirus, the predecessor of SARS-CoV-2, exhibited multiple superspreading events in 2002 and 2003”. Just as influenza had done in 1918. So in fact “superspreading” was a well known behavior of respiratory viruses, and people that had worked in this particular subfield of epidemiology know all of this quite well.

          “that does not mention ventilation at all”
          Yes, you are correct, my apologies. I believe I confused the call center study with this study from China which shows how the airflow at a restaurant could have been responsible for transmission. Apologies.

          Just the same while people are in close contact Figure 2 shows the distribution of the cases on the 11th floor. Yes, you can construe a complex sequence of events in which transmission is in a long chain of 1:1 expectoration events with face to face contact, but the hypothesis of least astonishment is that the few infected people fill the room with a high concentration of aerosol virus which allows many people to become infected at the same time, and also explains why people in hallways, elevators and most adjacent rooms don’t become infected, even though they have occasional contact with infected people. This is what the author sensibly concludes: “spread of COVID-19 was limited almost exclusively to the 11th floor, which indicates that the duration of interaction (or contact) was likely the main facilitator for further spreading of SARS-CoV-2.”

        • Quotes from the restaurant study you linked: “From our examination of the potential routes of transmission, we concluded that the most likely cause of this outbreak was droplet transmission.” “This finding is less consistent with aerosol transmission.” Your own evidence speaks against you.

          > you can construe a complex sequence of events in which transmission is in a long chain of 1:1 expectoration events with face to face contact

          Exaggeration does not serve the truth.

          There isn’t a “long chain”; the call center data clearly shows a 3-step sequence of index patient, pause, 10-person group, pause, big outbreak. It’s clear that there was not one person infecting everyone at once, and from what I remember of the super-spreader research I saw last year, this staged infection route is the norm rather than the exception: the virus spreads through personal contacts in a close-knit community, and suddenly it seems that a single person infected everyone.

          Also, nobody talks about “expectoration events”, which means coughing or deliberate spitting. It’s clear (also by the German study) that unremarkable contact can transmit the virus via droplets produced in normal face-to-face conversation.

          Re: your reference to the SARS (2003) outbreaks, they’re not comparable because the old SARS does not replicate in the throat. Another study from the same group of German virologists proved this early on.
          https://www.nature.com/articles/s41586-020-2196-x

          Successful isolation of live virus from throat swabs is another notable difference between COVID-19 and SARS, for which such isolation was rarely successful. This suggests active virus replication in tissues of the upper respiratory tract, where SARS-CoV is not thought to replicate in spite of detectable ACE2 expression.

        • Your response duly noted. The weight of the evidence available by late spring 2020 showed:

          1) transmission across distances far greater than six feet is common
          2) face to face or close interaction is *not* a key factor – infected people in short close contact with others (elevators) rarely transmit.
          3) likelihood of transmission increases with duration of exposure
          4) mass transmission from few people to many people is common and can occur in less than a few hours

          The grain size distinction of “droplet” vs “aerosol” isn’t critical to my – or any – argument. The point is that the virus remains suspended in the air long enough to be transmitted by air currents over distances substantially larger than six feet.

          Thanks, have a good one!

        • Ok, you’re grasping at straws and moving goalposts now. The main difference between aerosols and droplets is that aerosols remain suspended in the air for hours, while droplets fall down quickly. There’s obviously a continuum.

          Aerosol transmission means you need to filter better, physical distancing doesn’t make a difference, and a badly ventilated room remains infectious after the spreader has left. I’ve not seen evidence for any of that.

          > The weight of the evidence available by late spring 2020 showed:

          Please cite it. Your two previous citations are insufficient.

          > 1) transmission across distances far greater than six feet is common

          It’s the exception, not the norm. It should be the norm if it was mainly aerosols.

          > 2) face to face or close interaction is *not* a key factor – infected people in short close contact with others (elevators) rarely transmit.

          Not logical. The German study identified two key factors (I quoted this here!): close personal contact, >15 minutes. You are arguing that because duration matters, distance does not, and that is not logical.

          > 3) likelihood of transmission increases with duration of exposure

          See above; droplets do that.

          Key difference is, with droplets, you look at exposure to a person; with aerosols, you ought to be looking at exposure to a place. Show me a study where people got infected who entered a place after the infected person had left, and I’ll change my mind.

          > 4) mass transmission from few people to many people is common and can occur in less than a few hours

          “Clusters of Coronavirus Disease in Communities, Japan, January–April 2020”, Yuki Furuse et al.

          We noted many COVID-19 clusters were associated with heavy breathing in close proximity, such as singing at karaoke parties, cheering at clubs, having conversations in bars, and exercising in gymnasiums. 

          Most (39/61; 64%) clusters involved 5–10 cases.

          https://www.researchgate.net/publication/341562087_Clustering_and_superspreading_potential_of_severe_acute_respiratory_syndrome_coronavirus_2_SARS-CoV-2_infections_in_Hong_Kong

          the largest number of individual secondary cases was 11.

          I’d expect more than 10 more often if it was mainly aerosols.
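The "most clusters are 5–10 cases, largest secondary-case count 11" pattern is what heavily overdispersed offspring distributions produce. A toy negative-binomial simulation (the R and k values below are illustrative numbers in the range often quoted for SARS-CoV-2, not estimates from the studies cited above):

```python
import math
import random

# Toy offspring simulation of "superspreading": secondary cases per index
# case drawn from a negative binomial (a Poisson whose rate is itself
# Gamma-distributed). R and k below are illustrative values, not estimates
# from the studies cited above.

def poisson(rng: random.Random, lam: float) -> int:
    """Knuth's multiplication algorithm; fine for the small rates here."""
    threshold = math.exp(-lam)
    count, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return count
        count += 1

def secondary_cases(rng: random.Random, R: float, k: float) -> int:
    lam = rng.gammavariate(k, R / k)  # mean R, dispersion parameter k
    return poisson(rng, lam)

rng = random.Random(42)
R, k, n = 2.5, 0.1, 10_000
draws = [secondary_cases(rng, R, k) for _ in range(n)]
share_zero = sum(d == 0 for d in draws) / n
# With k this small, most index cases infect nobody, yet the sample max
# can be large: a handful of people account for most transmission.
print(f"mean ~{sum(draws) / n:.2f}, {share_zero:.0%} infect nobody, "
      f"largest count in sample: {max(draws)}")
```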

          > The point is that the virus remains suspended in the air long enough to be transmitted by air currents over distances substantially larger than six feet.

          The point is that if the main mode of transmission was aerosols, this would not require a strong, directed air current.

          If you prepare for aerosols instead of droplets as the main mode of transmission, you’ll be overly restrictive, and also underestimate the efficiency of proven NPIs. This would be bad.

        • I haven’t gone through these comments with a fine-tooth comb, and maybe you’ve covered the information in the following article, but I thought the article might be relevant (bold added):

          “‘A’ had to get a large dose in just five minutes, provided by larger aerosols probably about 50 microns,” she said. “Large aerosols or small droplets overlapping in that gray area can transmit disease further than one or two meters [3.3 to 6.6 feet] if you have strong airflow.”

          […]

          Lee and his team re-created the conditions in the restaurant — researchers sat at tables as stand-ins — and measured the airflow. The high school student and a third diner who was infected had been sitting directly along the flow of air from an air conditioner; other diners who had their back to the airflow were not infected. Through genome sequencing, the team confirmed the three patients’ virus genomic types matched.

          […]

          “Incredibly, despite sitting a far distance away, the airflow came down the wall and created a valley of wind. People who were along that line were infected,” Lee said. “We concluded this was a droplet transmission, and beyond” 6.6 feet.

          […]

          The pattern of infection in the restaurant showed it was transmission through small droplets or larger aerosols either landing on the face or being breathed in, said Marr, the Virginia Tech professor who was not involved in the study. The measured air velocity in the restaurant, which did not have windows or a ventilation system, was about 3.3 feet per second, the equivalent of a blowing fan.

          https://www.latimes.com/world-nation/story/2020-12-09/five-minutes-from-20-feet-away-south-korean-study-shows-perils-of-indoor-dining-for-covid-19

          jim –

          > So brief contact in well-ventilated areas isn’t enough to transmit the infection – which explains why it’s safe, and your mask is pretty much irrelevant in big box stores – but extended exposure to modest concentrations of the virus is enough to transmit the infection, or modest time of exposure in poorly ventilated areas where concentration can grow over time.

          I think your characterization of masks being useless in a big box store is broadly correct (it was a generalization), but it’s useful to think about the limitations of your generalization. It depends on what “well-ventilated” means. It does seem possible that you could be somewhere with a high rate of air exchange for a relatively short period of time and still get infected if you’re directly in a stream of airflow and facing the wrong way!

        • Thanks J.

          “if you’re directly in a stream of airflow and facing the wrong way!”

          Yes, for sure! My entire discussion is about generalizations. From a societal perspective, the frequency of transmission in any given situation is a critical question. But from the individual perspective, one transmission is all that matters, so it still makes sense to wear a mask even in big box stores if it generates a marginal decrease in the chance of infection.

          “It depends on what “well-ventilated” means.”

          Yes, understood. The critical factor would be the airflow at “head” level, between 5-7 ft (or 3-5 ft when seated). It seems likely that just the high ceilings of big-box stores are enough to ensure significant upward airflow at that level, because the air people exhale is warmer than the room air and will rise. Interestingly, restaurants usually have low ceilings, so artificial airflow needs to be cranked up to compensate, making it a more important factor in whether transmission occurs and who gets hit by it. Also, people are stationary for much longer in restaurants, showing again that duration of exposure is critical.

          cheers man

  11. A related issue is the complete lack of nuance. Even this thoughtful article operates at the granularity of “do masks work,” which is so imprecise as to be almost meaningless. Indeed, many of the discussed “mechanistic” studies and RCTs focus on wildly different treatments and endpoints!

    What type of masks? Worn where and with what don/doff-ing procedure? How do we even define “works”?

    I’m willing to believe that an N95 is beneficial w.r.t. outflow for a symptomatic patient hospitalized in close quarters. I find it implausible that forcing kindergarteners to take recess in cloth bandanas is net beneficial by any total measure of health.

    • “A related issue is the complete lack of nuance. ”

      I agree but as always I would rephrase this: people make claims without stating – or, it seems, often without knowing or recognizing – the assumptions and conditions under which those claims are purportedly true.

  12. IMO there’s a real problem in this discussions about understanding the relevance of different types of “evidence”. For my money, there are only three types of “evidence” in science:

    1) Controlled tests and experiments
    2) Natural observations
    3) Known relationships (a.k.a. “theory”)

    I don’t agree with the idea of a category of evidence called “mechanistic” (applied here to observations of the air-flow around masks). This amounts to indirect natural observations or controlled tests. Science has to use **all three** forms of evidence to reach valid conclusions. It can’t just rely on one kind of evidence. Each type of evidence has strengths and weaknesses.

    Furthermore, people seem stuck on how to balance conflicting evidence. Even in the strongest theories, there are observations that don’t fit – because we can’t know everything. That’s why we (a) require many studies and lines of evidence; and (b) must weigh the strongest evidence first.

    The problem in the medical community is that, apparently because their biggest concern is drug testing, many people are claiming that RCTs are the only valid form of evidence. RCTs, a subset of (1) above, are valid controlled tests, but they’re just one kind of test, and one that has restricted applications and uses. So the idea that we can only use RCTs to study the effects of masks is badly wrong. In fact it would be impossible to conduct a properly controlled RCT on infected patients, and it would be stupid to throw out all the other equally valid forms of evidence regarding mask use.

    More generally, the problem of understanding evidence is also ***rampant*** in the social sciences, where common observations are ignored in favor of significance testing and implicit assumptions aren’t even acknowledged much less carefully considered, despite the fact that many more of them are required in social science than in physical sciences. Reliance on single or just a few studies is **wildly** out of control despite the far lower level of experimental control available to social scientists.

    IMO this all results from the high motivation of social /medical scientists to get some type of actionable result.

    • Early this century there were two movements, science-based medicine and evidence-based medicine, which wanted to avoid the flaws in human judgement by basing decisions exclusively on RCTs and erasing natural observations as a valid source of knowledge. After all, doctors are good at convincing themselves that their pet treatment works, because homo sapiens is the rationalizing animal. But RCTs have their own different epistemological problems, practical limits, and vulnerability to crime and wishful thinking. And commanding that people ignore the evidence of their own senses and experience is a bigger ask than many of the believers appreciated.

      • People are unable to “sense” or “experience” masks protecting against viral particles. What they can sense is enormous consensus-driving groupthink and social pressure. We don’t need RCTs to decide if parachutes work, but we do need them to evaluate claims that cloth masks reduce spread by ~1%.

        This entire discussion reads like rejecting rigor purely because its results are politically inconvenient.

        • In the case of masks, known relationships suggest that they reduce the spread of respiratory infections (“someone with a mask over their nose and mouth will emit fewer and smaller droplets than someone without a mask”), and natural observations suggest the same (“most countries in East Asia are controlling the pandemic better than Euro and settler countries, so their policies are probably more effective” and “people who think they have respiratory infections often wear masks in Japan and it has not caused a public health crisis”). That was pretty compelling evidence in early 2020! Since then a bit of contradictory evidence has appeared from controlled observations, such as a study in Bangladesh, but science is about making decisions based on the state of the evidence and then updating.

        • The “known relationships”, i.e. mechanistic, evidence is mixed. Mannequins with spray cans suggest one thing; a guy with a vape pen, and almost the entirety of mechanistic research pre-2020, suggest another. Given that this evidence is mainly premised on the wrong heuristic of infection, I find the entire corpus to be of overall low value in either direction.

          The observational evidence is similarly not at all clear. Certainly you can cherry-pick single examples with high mask compliance that, if you exclude the last few months, had low case rates. I’ve oddly never seen a retraction of these claims based on (e.g.) Singapore’s current case explosion. One can also find low-compliance, low-case examples (e.g. Africa), or high-compliance, high-case examples (the US northeast). Conclusions based on pointing to single examples are basically motivated thinking.

          Actual natural experiments strongly suggest mask mandates or reported usage are uncorrelated with case rates: CA vs AZ, OC vs LA, ND vs SD, etc. https://twitter.com/ianmSC does an excellent job cataloging this. I’m willing to concede a priori that perhaps mask usages does help but is offset by hidden confounders with negative effect and/or is too small to see at the population level. This of course is exactly the setting where RCTs are most necessary.

      • Vinay Prasad suggests the following experiment for the effectiveness of masks.

        “Unfortunately, scientists have failed to conduct the kind of randomized trials that can provide more reliable answers. Here schools, counties, or districts would be assigned a mandatory or optional masking policy, and researchers could simply track their experience to determine which schools had more coronavirus spread. Kids wouldn’t be banned or prohibited from wearing masks, but rather the policy of making all kids wear masks would be rigorously tested.”

        https://www.theatlantic.com/ideas/archive/2021/09/school-mask-mandates-downside/619952/

        What do you think?

  13. This article is great.
    It reminds me of this unfortunate line of reasoning from a popular article by Ioannidis
    (https://www.statnews.com/2020/03/17/a-fiasco-in-the-making-as-the-coronavirus-pandemic-takes-hold-we-are-making-decisions-without-reliable-data/)

    > The one situation where an entire, closed population was tested was the Diamond Princess cruise ship and its quarantine passengers. The case fatality rate there was 1.0%, but this was a largely elderly population, in which the death rate from Covid-19 is much higher.

    very heavy reliance on a “natural experiment” with potentially good internal validity, while disregarding mountains of evidence from around the world that did not have such nice statistical properties. The “conservative” attitude of “Why Most Published Research Findings Are False” (a classic, to be sure) led to disastrously poor reasoning during the pandemic.

    • “good internal validity” — not really, Ioannidis ignored those patients on ventilators at the time; eventually, there were 14 deaths instead of 7, and the case fatality rates (not infection fatality rates!) per age group for symptomatic cases eventually matched the CFRs reported from Wuhan within the confidence intervals.

      Plus a score of other misleading information in that article.

      • > eventually, there were 14 deaths instead of 7, and the case fatality rates (not infection fatality rates!) per age group for symptomatic cases eventually matched the CFRs reported from Wuhan within the confidence intervals.

        Wow I did not know that. That turns things around a bit, doesn’t it? If he had analyzed his natural experiment properly, his focus on internal validity wouldn’t necessarily have misled him so. I guess maybe the way to think about it is along the lines of the general critique that as soon as someone finds a clever natural experiment they stop worrying so much about the details… but really who knows, I guess.

        > Plus a score of other misleading information in that article.

        Oh yeah, it was a hot mess all around.

  14. > apparently 3/4ths of the people who reported COVID symptoms didn’t test positive for COVID

    that doesn’t really matter if the agents that cause these symptoms are similar to covid

    The main caveat with the Bangladesh experiment is that it tests the effect of an intervention (mask encouragement) in a specific setting. It did not test the effectiveness of the mask itself. The commentator missed this distinction as he went on about condoms and antibiotics.
    For condoms, the proper analogy would be to measure the effect of an AIDS advertising campaign: if this lowered AIDS cases by 11%, you’d say the campaign did well, but you wouldn’t conclude that condoms don’t work well to prevent AIDS transmission.
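The distinction between the intervention (encouragement) and the tool itself can be made concrete with a dilution calculation. A minimal sketch, with all numbers hypothetical:

```python
# Crude illustration (all numbers hypothetical) of why a modest effect of
# an *encouragement campaign* doesn't bound the effect of the *tool*: if
# the campaign only moves usage from 10% to 40%, even a tool that halves
# individual risk shows up as a small population-level reduction.

def observed_reduction(usage_before: float, usage_after: float,
                       tool_efficacy: float) -> float:
    """Relative reduction in cases attributable to the campaign, assuming
    users' risk is cut by tool_efficacy and non-users' risk is unchanged."""
    risk_before = 1.0 - usage_before * tool_efficacy
    risk_after = 1.0 - usage_after * tool_efficacy
    return 1.0 - risk_after / risk_before

# Tool cuts individual risk by 50%, usage moves 10% -> 40%:
print(f"{observed_reduction(0.10, 0.40, 0.50):.1%}")  # 15.8%
```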
