If the outcome is that rare, then nothing much can be learned from pure statistics.

Alain Fourmigue writes:

You may have heard of this recent controversial study on the efficacy of colchicine to reduce the number of hospitalisations/deaths due to covid.

It seems to be the opposite of the pattern usually reported on your blog.

Here, we have a researcher making a bold claim despite the lack of statistical significance,
and the scientific community expressing skepticism after the manuscript is released.

This study raises an interesting issue: how to analyse very rare outcomes (prevalence < 1%)? The sample is big (n>4400), but the outcome (death) is rare (y=14).
The SE of the log OR is ~ sqrt(1/5+1/9+1/2230+1/2244).
Because of the small number of deaths, there will inevitably be a lot of uncertainty.
Very frustrating…

Is there nothing we could do?
Is there nothing better than logistic regression / odds ratios for this situation?
I’m not sure the researcher could have afforded a (credible) informative prior.

I replied that, yes, if the outcome is that rare then nothing much can be learned from pure statistics. You’d need a model that connects more directly to the mechanism of the treatment.
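To make the uncertainty in the letter concrete, here is a minimal sketch that evaluates the quoted standard error and the Wald interval it implies. It assumes that 5 and 9 are the deaths in the treated and control arms and that 2230 and 2244 are the corresponding survivors; the letter gives only the four numbers, so that labeling (and the function name) is an assumption made for illustration.

```python
import math

# Counts from the letter. Assumption: 5 and 9 are deaths in the treated and
# control arms, and 2230 and 2244 are the survivors in those arms.
deaths_treated, deaths_control = 5, 9
alive_treated, alive_control = 2230, 2244


def log_odds_ratio_summary(a, b, c, d, z=1.96):
    """Return (odds ratio, SE of log OR, lower, upper) for a 2x2 table.

    a, c: events and non-events in the treated arm
    b, d: events and non-events in the control arm
    """
    log_or = math.log((a / c) / (b / d))
    # Standard large-sample SE of the log odds ratio: the two small death
    # counts dominate the sum, the large survivor counts barely matter.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), se, lower, upper


odds_ratio, se, lower, upper = log_odds_ratio_summary(
    deaths_treated, deaths_control, alive_treated, alive_control
)
print(f"OR = {odds_ratio:.2f}, SE(log OR) = {se:.2f}, "
      f"95% interval = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval runs from a large reduction in the odds of death to a sizable increase, which is the "lot of uncertainty" the letter describes.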

24 thoughts on “If the outcome is that rare, then nothing much can be learned from pure statistics.”

  1. On December 11, 2020, the steering committee chairman informed the data safety monitoring board that the investigators had decided to terminate the study once 75% of the planned patients were recruited and had completed the 30-day follow-up. This decision was made due to logistical issues related to maintaining the central study call center active 24 hours per day for a prolonged period of time, as well as the need to provide healthcare systems with study results in a timely fashion given the state of the COVID-19 pandemic. To account for the interim analyses, the statistical significance level was set to 0.0490 for the final analysis of the primary endpoint.

    With a large enough sample size you will get significance eventually. Here it looks like it would have been favoring the drug had the study been allowed to continue.

    • Yes, stat. sig. is a function of sample size and with a large enough sample anything and everything becomes stat. sig., but that shouldn’t be the goal.
      The real question is what the distribution of a given rare phenomenon is.

      • The problem here *is* small sample size. According to their original power analysis this study was too small to detect a meaningful difference. The inconclusive results were by choice.

        It is goofy to ask how to fix a problem that was literally chosen to occur.

        I would rather see a much smaller but highly detailed longitudinal study to begin with, but that’s not the design they chose. Then they chose inconclusive results by ending it early.

  2. > I replied that, yes, if the outcome is that rare then nothing much can be learned from pure statistics. You’d need a model that connects more directly to the mechanism

    Not to contradict your point but personally, I think the importance of linking to models of mechanism of causality applies more widely also to where outcomes aren’t that rare.

    • “the importance of linking to models of mechanism of causality applies more widely also to where outcomes aren’t that rare.”

      I strongly agree!

      So in the end if you already understand the mechanism / causal relationship, statistics provide a measure of the degree of effectiveness. But if the causal relationship isn’t established through other means, they’re not very useful.

  3. Not too long ago there was a study on effectiveness of Pfizer vaccine, based on which the entire planet started injections.
    N=just over 36K in two cohorts
    Infections in 170 people only (8 in the intervention arm and 162 in the control arm).

    Nobody complained at that time that the whole thing was based on N=170, not 36K, simply because there was no way around it.

      • While 170 is a lot more than 14 a more important distinction between these cases is that 162/8 is very different from 9/5. If the vaccine trial had been much smaller and there had been 0 or 1 or 2 infections out of 14 in the vaccinated group a naive calculation would still show efficacy (maybe not good enough for approval though).

        Of course the approval was not based on statistics alone – they never are. But if the outcome had been 60 vs 110 rather than 8 vs 162 (for the same 5/9 ratio) the vaccine would probably have been rejected based on statistics alone. (In the colchicine trial if N had been ten times as large getting 50/90 – or even 57/83 – instead of 5/9 gets you statistical significance. So does a 60/110 outcome but the threshold for vaccine approval is higher than that.)
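The comparison of these splits can be checked with a rough sketch. It assumes the two arms are the same size, so that under no effect each event is equally likely to land in either arm, and it conditions on the total number of events; the function name is hypothetical and this is a simplification, not the analysis used in either trial.

```python
from math import comb


def two_sided_split_p(events_a, events_b):
    """Exact two-sided binomial p-value for an events split between two arms.

    Assumption: equal-sized arms, so under no effect each event falls in
    either arm with probability 0.5.
    """
    n = events_a + events_b
    k = min(events_a, events_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    # The null distribution is symmetric, so double the tail (capped at 1).
    return min(1.0, 2 * tail)


for a, b in [(5, 9), (50, 90), (57, 83), (60, 110), (8, 162)]:
    print(f"{a}/{b}: p = {two_sided_split_p(a, b):.3g}")
```

With these assumptions, 5/9 is nowhere near the 0.05 level while 50/90, 57/83, and 60/110 all fall below it, and 8/162 is many orders of magnitude more extreme than any of them.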

        • It has been repeated twice now so I’ll point out this detail. The pfizer vaccine trial did not measure infections. They measured people with certain symptoms and a positive PCR test within +/-4 days of those symptoms.

          Confirmed COVID-19: presence of at least 1 of the following symptoms and
          SARS-CoV-2 NAAT-positive during, or within 4 days before or after, the
          symptomatic period, either at the central laboratory or at a local testing facility (using
          an acceptable test):
          • Fever;
          • New or increased cough;
          • New or increased shortness of breath;
          • Chills;
          • New or increased muscle pain;
          • New loss of taste or smell;
          • Sore throat;
          • Diarrhea;
          • Vomiting.

          https://cdn.pfizer.com/pfizercom/2020-11/C4591001_Clinical_Protocol_Nov2020.pdf

          They also excluded those who tested positive after first dose to a week after second dose (~28 days since first dose). From the same link above:

          Ratio of confirmed COVID-19 illness from 7 days after the second dose per 1000 person-years of follow-up in participants without evidence of infection (prior to 7 days after receipt of the second dose) for the active vaccine group to the placebo group

          There are other issues as well. E.g., there was no exit poll, and it is unlikely the blinding was very effective given the side effect profile. About as many people met the symptom criteria (for any reason) in both groups, while there were more suspected cases within 28 days of the first dose in the vaccine group. All-cause mortality was also about the same. They also only checked within a few months of the vaccination.

          Anyway, in the year after we saw that infections went up and all cause mortality stayed about the same. This is all what you would expect due to the prior information available.

          First, we saw the lymphocytopenia after first dose (apparently all the immune cells migrated to the lymph nodes), leaving the body open to other infections/issues for 3-7 days.

          Also, IM vaccinations were not expected to induce the mucosal immunity required to meaningfully reduce infection/transmission (which was verified with the animal studies for this vaccine), and even after infection mucosal immunity towards respiratory viruses (in contrast to measles, polio, etc) only lasts a few months to years anyway.

          We also saw heavy selection for spike variants (the target of the vaccine), and then later reduction in the effectiveness towards severe illness as well (due to mutations and waning).

          Going forward, once sufficient selection pressure exists there is likely to be a new variant where very strong omicron immunity is counterproductive and vice versa. The first hints of this have already been seen in omicron vs delta antibodies. But those are still only ~5% different in the spike sequence. We should expect this issue to become prominent once it gets to a 10-20% difference.

          This is why what you actually want is immunity robust to variants, not hyper-focused on a single sequence for only 1/29 of the viral proteins. It is really natural selection 101.

        • > This is why what you actually want is immunity robust to variants, not hyper-focused on a single sequence for only 1/29 of the viral proteins. It is really natural selection 101.

          Using “immunity” in such a generalized way doesn’t seem to me to be particularly useful.

          A vaccine that targets a variant-specific component would seem sub-optimal. But a vaccine so designed that stimulates memory cell immunity could save hundreds? of millions of lives, perhaps until a non-variant specific vaccine can be developed.

          We live in a sub-optimal world. Operating as if an optimal world might exist doesn’t seem to me to particularly improve quality of life in the sub-optimal world.

        • > Using “immunity” in such a generalized way doesn’t seem to me to be particularly useful.

          This is the definition your body uses for immunity based on millions (billions?) of years’ worth of natural selection. Can you name one instance where the body generates immunity like that at issue here?

          That is rhetorical, the answer is “no” because that strategy is easily defeated by selecting for resistance. You see the same in monoclonal antibody treatments for cancer, many enzyme inhibitors, etc. The problem gets delayed for a short time at best.

        • Seems to me there’s reasonably strong evidence of efficacy against serious disease and death for meaningful lengths of time, if not for very long against infection, conferred by these vaccines across a variety of covid variants.

        • @Joshua

          Yes, an IM vaccine was expected to protect against viremia (virus in the blood), which is a decent proxy for severe illness. Until waning and mutations eventually reverse this and immunity towards one variant means negative immunity towards the other. Natural selection causes this to happen.

          This entire time all I have done is figure whatever happened in similar past situations will happen again and everything has been playing out almost exactly as one would expect based on that.

          Anyway, too many comments here now.

        • Carlos,
          It would be all fine and dandy, if the virus was uniformly spread throughout air, like oxygen, but it’s not. 95/5 ratio of 170 infected participants could very easily be a function of their exposure level (they didn’t adjust for essential workers or many other important variables) or could be attributed to natural variability among individuals when it comes to getting infected. Not everyone can get infected and not everyone passes it on (only 20% of infected actually do, if we rely on very flimsy evidence of transmission based on GM mice, as the true transmission knowledge is on par with ESP for all dangerous respiratory illnesses).
          Anyway, if we look at the fact that the survival before vaccination is still around 99% (priors, for those of Bayesian persuasion), it’s really impossible to say how much vax contributed to the current improvement. I mean, people are still being injected with the original strain vax which fell out of vogue a few strains ago. Ok, we rely on cellular immunity, but still…
          Not to mention that vax uptake flattened out and has been there for quite some time. US is only 66% fully vaxed (per Bloomberg clock) with around 30% of nervous Nellies boosted (which was never needed, BTW).

          Long story short, if I take something that wasn’t too deadly to begin with and give it a nudge (vax) I get great results on paper. However, why would I attribute all of the ‘success’ to the nudge only?

          Sorry for a meandering response

        • > It would be all fine and dandy, if the virus was uniformly spread throughout air, like oxygen, but it’s not. 95/5 ratio of 170 infected participants could very easily be a function of their exposure level (they didn’t adjust for essential workers or many other important variables) or could be attributed to natural variability among individuals when it comes to getting infected

          So you don’t believe in randomization at all? I wasn’t going to comment on this thread any more but this one seems too strange to ignore.

    • >… because there was no way around it.

      You go to war with the sample size ya’ got?

      Meantime, 36k seems not trivial as a sample for detecting harm. And there are other ways to measure efficacy (such as antibody levels).

        • Good call and fair enough. As I understand it, the relationship between antibody levels and immunity is as of yet not clarified.

      • Joshua

        1. 36K is a great sample size if I am allowed to infect them all with COVID and see the true ratio. Otherwise, the only numbers you can go by are 162/8
        2. Antibody levels are good for about 6 weeks and are relevant for humoral (intercellular) immunity only. Once the cell gets infected, they are as good as an expired coupon.

    • Anon:

      Thanks for sending. This correction happened because I alerted the journal editor to this problem. The editor was going to tell me when the correction appeared but I guess he forgot. The earlier blog discussion was here.

      • I wonder if they charged a submission fee for the Erratum

        “All manuscripts submitted to the journal must be accompanied by a $175 submission fee for nonsubscribers or a $100 submission fee for individual JPE subscribers.”

  4. Now in retrospect, we all got very lucky. The vaccines that were promulgated to a few hundred million people after very abbreviated, inconclusive and rushed “trials” turned out to massively decrease the number of people dying or going to a critical care unit. I’m thankful for that.

    But the fact it all worked out in the end does not mean the trials were anything other than seriously underpowered and limited in scope. They were “show trials” quite literally. Absent any serious, immediately obvious safety issues they were going to be used worldwide whether there was sufficient evidence of their efficacy from those trials or not.

    It’s like the guy who buys a lottery ticket because he has a “system” for picking lottery numbers and then wins a million bucks. The fact he won does not prove his system valid.

    Those of us who might have been dead or permanently damaged in a counterfactual world where Pfizer/Moderna never came onto the scene can be thankful for our good fortune while still acknowledging the “successful” trials were really just hopeful extrapolations from a ridiculously small number of cases.
