Estimating efficacy of the vaccine from 95 true infections

Gaurav writes:

The 94.5% efficacy announcement is based on comparing 5 of 15k to 90 of 15k:

On Sunday, an independent monitoring board broke the code to examine 95 infections that were recorded starting two weeks after volunteers’ second dose — and discovered all but five illnesses occurred in participants who got the placebo.

Similar stuff from Pfizer etc., of course.

Unlikely to happen by chance but low baselines.

My [Gaurav’s] guess is that the final numbers will be a lot lower than 95%.

He expands:

The data: the control group is 90 out of 15k and the treatment group is 5 out of 15k. The base rate (control group) is 0.6%. When the base rate is so low, it is generally hard to be confident about the ratio (1 – (5/95)). But noise is not the same as bias. One reason to think that 94.5% is an overestimate is simply that 94.5% is pretty close to the maximum point on the scale.

The other reason to worry about 94.5% is that the efficacy of a flu vaccine is dramatically lower. (There is a difference in the time horizons over which effectiveness is measured for flu and for Covid, with Covid being much shorter, but it is useful to take that as a caveat when trying to project the effectiveness of a Covid vaccine.)
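For reference, the two candidate point estimates can be computed directly from the counts. This is a minimal sketch that assumes equal arm sizes of 15,000 (the rounded figure quoted above); the variable names are illustrative only.

```r
# Counts quoted in the post; the arm size is the rounded 15k figure (an assumption).
cases_placebo <- 90
cases_vaccine <- 5
n_per_arm <- 15000

# Attack rate in each arm.
risk_placebo <- cases_placebo / n_per_arm   # the 0.6% "base rate"
risk_vaccine <- cases_vaccine / n_per_arm

# Efficacy as one minus the risk ratio; with equal arms this reduces to 1 - 5/90.
ve_risk_ratio <- 1 - risk_vaccine / risk_placebo

# The expression written in the post, 1 - 5/95, is one minus the vaccine arm's
# share of all cases, which is a slightly different quantity.
ve_share <- 1 - cases_vaccine / (cases_placebo + cases_vaccine)

print(round(100 * c(risk_ratio = ve_risk_ratio, share = ve_share), 1))
```

The two versions differ only slightly (about 94.4% versus 94.7%), so the choice of formula matters much less than the uncertainty from having only 95 cases.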

79 thoughts on “Estimating efficacy of the vaccine from 95 true infections”

  1. I saw the 90 out of 95 cases too, and began wondering: are these typical sample sizes for vaccine trials? For example, when flu vaccines are trialed, do they get 100 cases in the sample or a lot more?

    • Most of the people in the trial didn’t get infected, and therefore don’t tell us anything about how effective the vaccine is. This is the advantage of a Human Challenge Trial – deliberately infecting people creates more statistical power. The disadvantage is… well the disadvantage is obvious.

        • From my reading of the interim results, it seems like the vaccine companies really do only care about the infected participants. What could you learn from uninfected people other than the proportion of uninfected in each group, which is just another way of getting the proportion of infected people?

        • It’s even less useful than that, since the proportion of infected people in a vaccine trial is unlikely to be comparable to the entire population — people who sign up for vaccine trials will be those who take COVID seriously.

        • Not if people in both the treatment and placebo group fail to get it. When you look at the trial results, for example, only one American Indian/Alaskan native in the placebo got the virus and none in the vaccine group did. Would you be comfortable saying it has 100% efficacy for that group? Certainly not. We need variation to estimate an effect.
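The point about needing variation can be made concrete with an exact interval based on the case split alone. The sketch below assumes equal-sized arms with equal follow-up, so that the share of cases falling in the vaccine arm is theta = (1 - VE) / (2 - VE); `ve_interval` is a hypothetical helper, not anything used by the trial sponsors.

```r
# Hypothetical helper: exact interval for efficacy from the case split alone,
# assuming equal-sized arms with equal follow-up.
ve_interval <- function(cases_vaccine, cases_placebo, level = 0.95) {
  total <- cases_vaccine + cases_placebo
  # Clopper-Pearson interval for theta, the share of cases in the vaccine arm.
  theta <- binom.test(cases_vaccine, total, conf.level = level)$conf.int
  # theta = (1 - VE) / (2 - VE), so VE = 1 - theta / (1 - theta).
  # A larger theta means lower efficacy, so the two bounds swap places.
  ve <- 1 - theta / (1 - theta)
  c(lower = ve[2], upper = ve[1])
}

print(ve_interval(0, 1))    # a subgroup with one placebo case and no vaccine cases
print(ve_interval(5, 90))   # all 95 cases
```

With a single case the lower bound falls far below zero, so a point estimate of 100% efficacy for that subgroup carries essentially no information.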

  2. I don’t think it is meaningful to compare the COVID vaccine to flu vaccines. Flu vaccines are developed for what are projected to be the most prominent strains among many circulating in the upcoming season. If this projection is wrong, or the most prominent strains are not as prominent as competing strains, the efficacy of the vaccine drops substantially for that year. To the best of my knowledge that is not currently a problem with COVID.

  3. I don’t buy that as a statistical argument. One of the things about conditioning on the total number of cases in the analysis is that it removes the base rate parameter. Also, I believe that they may have meant 1 – 5/90 for the ratio. I have an updated post on that which may be of interest.

    http://blog.fellstat.com/?p=468

    Perhaps you can make the case from a biological point of view that the rate will likely come down, though COVID is fundamentally different from the flu and mutates much more slowly.
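To illustrate how conditioning on the total number of cases removes the base rate, here is a minimal sketch. It assumes equal-sized arms and a uniform Beta(1, 1) prior on theta, the share of cases in the vaccine arm; both are assumptions made for illustration and not necessarily what the linked post or the trial protocols use.

```r
cases_vaccine <- 5
cases_placebo <- 90

# With equal arms, each case lands in the vaccine arm with probability
# theta = (1 - VE) / (2 - VE). The base rate cancels out of theta entirely.
# Assumed uniform Beta(1, 1) prior, so the posterior for theta is
# Beta(1 + vaccine cases, 1 + placebo cases).
shape1 <- 1 + cases_vaccine
shape2 <- 1 + cases_placebo

# VE = 1 - theta / (1 - theta) is decreasing in theta, so upper quantiles of
# theta map to lower quantiles of VE.
theta_q <- qbeta(c(0.975, 0.5, 0.025), shape1, shape2)
ve_q <- 1 - theta_q / (1 - theta_q)
names(ve_q) <- c("2.5%", "50%", "97.5%")
print(round(ve_q, 3))

# Posterior probability that efficacy is below 90%, i.e. theta > 0.1 / 1.1.
theta_at_90 <- (1 - 0.90) / (2 - 0.90)
p_below_90 <- 1 - pbeta(theta_at_90, shape1, shape2)
print(round(p_below_90, 3))
```

Nothing in this calculation uses the 0.6% attack rate; only the 5 versus 90 split enters.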

    • It will come down. No one under 65 with comorbidities reported symptoms, no one over 65 with comorbidities was included in the study, and only 10-15 healthy 65+ year olds reported symptoms vs 75-80 younger than 65.

      https://statmodeling.stat.columbia.edu/2020/11/11/the-pfizer-biontech-vaccine-may-be-a-lot-more-effective-than-you-think/#comment-1584416

      In the animal studies of SARS vaccines the young healthy animals were protected while aged were not.

      • Again, I think you can make a biological argument, which you do. I don’t think there is any statistical reason to believe this based on the study alone. I would note that only 13% of the population is over 65, so the subgroup of those with co-morbidities can’t move the needle of overall efficacy that much.

        Also, just because they didn’t specify the co-morbid counts doesn’t mean that none of the subjects with comorbidities were cases.

        • Yes, but ~40% of the population is obese alone. Apparently none of them included in the study got infected in the placebo group.

          I think they would be bragging about it if the stats for comorbid young subjects looked good for the vaccine… Could be all 5 in the vaccine group were comorbid/young.

        • “Apparently none of them included in the study got infected in the placebo group.”

          *Citation needed.* You can’t just say that this is true because they didn’t explicitly say that it wasn’t true. We have no idea how many of the cases in either arm have any of the relevant comorbidities.

        • The 95 COVID-19 cases included 15 older adults (ages 65+) and 20 participants identifying as being from diverse communities (including 12 Hispanic or LatinX, 4 Black or African Americans, 3 Asian Americans and 1 multiracial).

          They mention every other subgroup included in the study, why not the young/comorbid subjects? It is quite a glaring omission to me, especially since that is the only place where I expected to see an issue beforehand.

        • “Every other subgroup”?

          They did report 11 severe cases in the placebo arm. Given what we know about severe covid infections, it’s very unlikely, almost impossible, that all of them were free of comorbidities.

        • I really do not think there is any reason to expect a risk in young/with comorbidities greater than old/without comorbidities.

          In fact, there is pretty strong evidence against it: the observed age disparity in COVID deaths in the US is simply far too great, given how common comorbidities – especially obesity and asthma – are in the younger US population.

          So even if *you* specifically were concerned with that group, there’s no reason for the vaccine developers to focus on it or even mention it.

        • “Every other subgroup”?

          Yes, follow the links in my post. They included well defined subgroups, and talked about all except the subjects with comorbidities in their press release.

          A peer reviewer who failed to ask about that would be incompetent as can be.

      • 10-15 elderly people infected out of 90 actually doesn’t seem that low: a vaccine trial population would be expected to exclude people who think the virus is a hoax or “just a cold/flu”, so one would expect the elderly (at much greater risk) to be more cautious overall.

        And the young are also more numerous (US median age is 38). Even in a trial of just adults, wouldn’t one expect a majority to be under 65, especially as minimum health standards might also exclude more of the elderly?

        Also not sure how much we can compare SARS vaccine studies 16-17 years ago to this; I believe the Moderna and Pfizer mRNA vaccines are a rather new technology. Genetic stuff has come a really long way since the early 2000s.

        • “Also not sure how much we can compare SARS vaccine studies 16-17 years ago to this; I believe the Moderna and Pfizer mRNA vaccines are a rather new technology. Genetic stuff has come a really long way since the early 2000s.”

          What matters is having few/weak antibodies to the spike protein. Whatever triggered them doesn’t matter, other than perhaps for the strength of the immune response and the rate of waning. Exposure to SARS3 in a few years is another big risk factor here.

        • Eh… maybe? But I don’t think there is anything like the certainty you are suggesting that ADE will be a thing for SARS-COV-2, much less what the risk factors for it would be.

          Was ADE shown for SARS-1 in vivo, or only in vitro?

          And future viruses that haven’t even evolved yet are *by definition* unpredictable!

        • Am I reading something wrong? The first and third of those *do* seem to be in vitro (cell line) studies.

          The second one is in mice, granted, but I am not sure “the vaccine failed to protect aged animals in which augmented immune pathology was also observed, indicating the possibility of the animals being harmed because of the vaccination” is equivalent to “antibody-dependent enhancement did in fact happen”, much less that it would happen in humans.

        • Look, I’m not claiming any special expertise (which I don’t have). But none of those studies that you are quoting really seem to answer the question I was asking: two of them are not in vivo, and the in vivo one may not really demonstrate ADE.

        • And I’m not sure that vaccine type is that irrelevant. The RSV vaccine issues Daniel Lakeland mentioned on another thread may have been related to that (the paper I saw on it didn’t seem terribly clear, but that may be because it happened in the 60s and the knowledge of the time was not entirely up to par in terms of understanding what happened).

          But that might have had some white-blood-cell involvement rather than being “purely” antibody-caused.

          I don’t nearly have the expertise to judge this — but I really don’t think this is nearly as certain/solid as you suggest.

        • It seems to me to be pretty likely that that’s not been done because it in fact would not be relevant/useful.

          Otherwise one would have to assume that many research groups in many different countries are all making the exact same errors.

          IE – if this is obvious to you, why isn’t it obvious to them?

          I complain a lot about US drug development/approval issues, but those are fairly specific to the way the FDA does things – a single nation with a specific regulatory structure that creates incentives (not always positive ones). In this case many nations with different structures are involved.

        • “It seems to me to be pretty likely that that’s not been done because it in fact would not be relevant/useful.”

          It was always considered relevant/useful before covid. And it doesn’t cost much to do the study given the money being thrown around.

        • Then what is your explanation for why it hasn’t been done? COVID vaccine efforts are too widespread/decentralized for it to be plausible that everyone is making the same “obvious” mistake.

          As for the polio vaccine, I really don’t think problems that happened in *the 1950s* have any relevance. Biological understanding in the 50s was pitifully limited, DNA was just being figured out. That would be like comparing safety of modern aircraft to World War I-era ones.

          If ADE was likely to be a real problem with COVID, we’d see a lot more trouble with natural reinfection than we do.

        • >>What matters is having few/weak antibodies to the spike protein.

          For what it’s worth, this may be true in mice for SARS, but not carry over to COVID-19

          https://blogs.sciencemag.org/pipeline/archives/2020/11/18/vaccine-possibilities

          “one figure to take home is that 90% of the subjects were still seropositive for neutralizing antibodies at the 6 to 8 month time points. The authors point out that in primate studies, even low titers (>1:20) of such neutralizing antibodies were still largely protective, so if humans work similarly, that’s a good sign. An even better sign, though, are the numbers for memory B cells”

          If low titers are still protective, the problem may not exist for this disease.

    • One thing that’s clear is that the baseline case rate assumed when designing the trial is way too low. In the Moderna and AstraZeneca protocols, the base rate is assumed to be ~0.7% over six months. It’s pretty clear they are seeing that level over just a few weeks, so the base rate is off by a huge margin. If they had assumed, say, a 5% base rate in the design, wouldn’t the interim analysis require more cases?

      • Why? What you need is a sufficient number of cases. If incidence is higher than expected and you can get there in six weeks rather than six months, you are happy to have your results earlier. If they had assumed a higher base rate maybe they would have enrolled fewer people in the trial (on the other hand you need lots of people for the safety endpoints anyway, whatever the incidence).

        • In classical design (no interim analysis), the closer the base rate is to 50%, the higher the required sample size – so when multiplied by a higher base rate, the # of cases would have been higher, not lower.

        • Isn’t the design just “go until we get N total cases across both arms”… in which case the base rate is just to estimate how many people are needed to get N cases in a reasonable time?

          I think this is the same thing Carlos said, so obviously I’m not following. If you could elaborate a little I’d appreciate it.
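The role of the base rate in an event-driven design can be sketched with simple arithmetic. The example assumes 1:1 randomization with 15,000 per arm, the ~0.7% six-month attack rate mentioned above, a design-stage efficacy of 60%, and a hypothetical target of 150 total cases; the function names are illustrative.

```r
# Expected total cases across both arms for a given placebo attack rate.
expected_cases <- function(attack_rate, n_per_arm, ve) {
  placebo <- n_per_arm * attack_rate
  vaccine <- n_per_arm * attack_rate * (1 - ve)
  placebo + vaccine
}

# Participants per arm needed to expect a target number of total cases.
n_per_arm_needed <- function(target_cases, attack_rate, ve) {
  ceiling(target_cases / (attack_rate * (2 - ve)))
}

ve_design <- 0.60      # efficacy assumed at the design stage
target_cases <- 150    # hypothetical target for total cases

print(expected_cases(0.007, 15000, ve_design))            # design-stage base rate
print(n_per_arm_needed(target_cases, 0.007, ve_design))
print(n_per_arm_needed(target_cases, 0.05, ve_design))    # hypothetical 5% base rate
```

Under these assumptions a higher base rate changes how many participants (or how much time) it takes to reach the target, not the target itself.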

        • I don’t follow you. Forget the interim analysis. If you decide you need 200 cases to look at the split vaccine/placebo and be happy with the inference you make about the vaccine efficacy, why does it matter whether you get those 200 cases in six weeks or six months [1]? Why would you require more cases if 200 are enough? It’s also possible that I have misunderstood your previous comment entirely.

          [1] Apart from the insight you may get about duration.

        • Carlos: “If you decide you need 200 cases to look at the split vaccine/placebo and be happy with the inference you make…”

          But how do they decide they need 200 [or whatever the real number is] cases?

        • Moderna: “Under the assumption of proportional hazards over time and with 1:1 randomization of mRNA-1273 and placebo, a total of 151 COVID-19 cases will provide 90% power to detect a 60% reduction in hazard rate (60% VE), rejecting the null hypothesis H0: VE ≤ 30%, with 2 IAs at 35% and 70% of the target total number of cases using a 1-sided O’Brien-Fleming boundary for efficacy and a log-rank test statistic with a 1-sided false positive error rate of 0.025.”

          Pfizer: “Under the assumption of a true VE rate of ≥60%, after the second dose of investigational product, a target of 164 primary-endpoint cases of confirmed COVID-19 due to SARS-CoV-2 occurring at least 7 days following the second dose of the primary series of the candidate vaccine will be sufficient to provide 90% power to conclude true VE >30% with high probability.”

          Janssen: “The study TNE is determined using the following assumptions: a VE for molecularly confirmed, moderate to severe/critical SARS-CoV-2 infection of 60%, approximately 90% power to reject a null hypothesis of H0: VE≤30%, type 1 error rate α = 2.5% to evaluate VE of the vaccine regimen (employing the sequential probability ratio test [SPRT] to perform a fully sequential design analysis; detailed in Section 9.5.1), a randomization ratio of 1:1 for active versus placebo. (…) Under the assumptions above, the total TNE to compare the active vaccine versus placebo equals 154, based on events in the active vaccination and placebo group, according to the primary endpoint case definition of moderate to severe/critical COVID-19 (Section 8.1.3.1).”

          AstraZeneca: “Approximately 33 000 participants will be screened such that approximately 30 000 participants will be randomized in a 2:1 ratio to receive 2 IM doses of either 5 × 10^10 vp (nominal, ± 1.5 × 10^10 vp) AZD1222 (the active group, n = approximately 20 000) or saline placebo (the control group, n = approximately 10 000) 4 weeks apart, on Days 1 and 29. The sample size calculations are based on the primary efficacy endpoint and were derived following a modified Poisson regression approach (Zou 2004). (…) For the primary efficacy analysis, approximately 150 events meeting the primary efficacy endpoint definition within the population of participants who are not seropositive at baseline are required across the active and control groups to detect a VE of 60% with > 90% power. These calculations assume an observed attack rate of approximately 0.8% and are based on a 2-sided test, where the lower bound of the 2-sided 95.10% CI for VE is required to be greater than 30% with an observed point estimate of at least 50%.”

          AstraZeneca is the only one who mentions the attack rate (percentage of an at-risk population that contracts the disease during a specified time interval). It’s not really used to determine that 150 cases are required, it provides the link between the 150 cases to the 30000 participants.
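A rough way to see where a number near 150 can come from is a normal-approximation sample size for a one-sample binomial test on the case split. This sketch assumes 1:1 randomization, ignores interim analyses, and is not the log-rank, SPRT, or Poisson-regression calculation the sponsors describe; `cases_needed` is a hypothetical helper.

```r
# Hypothetical helper: approximate number of total cases needed to reject
# H0: VE <= ve_null when the true efficacy is ve_alt (one-sided test).
cases_needed <- function(ve_alt = 0.60, ve_null = 0.30,
                         alpha = 0.025, power = 0.90) {
  # Share of cases expected in the vaccine arm under 1:1 randomization.
  share <- function(ve) (1 - ve) / (2 - ve)
  t0 <- share(ve_null)
  t1 <- share(ve_alt)
  z_a <- qnorm(1 - alpha)
  z_b <- qnorm(power)
  # Standard normal-approximation formula for a one-sample proportion test.
  n <- (z_a * sqrt(t0 * (1 - t0)) + z_b * sqrt(t1 * (1 - t1)))^2 / (t0 - t1)^2
  ceiling(n)
}

print(cases_needed())
```

The result lands in the same range as the 150 to 164 cases quoted above, and the function takes no attack-rate argument at all: the base rate only determines how many participants and how much time it takes to accumulate that many cases.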

        • If we got close to a base rate of 50% we would be in a very different situation of emergency. Right now NYC is closing schools on a positivity rate of 3% *among those tested*. Unless the bias for testing is that people with infections are less likely to get tested, 50% is very far away. Of course I keep thinking about the fact that the plague killed 25% of the population of Europe over a number of years. The death rate from COVID in North Dakota is 1/1000 and still going up, and that’s just a few months.

  4. The reason flu vaccine efficacy is so much lower is that there are multiple circulating strains of flu, not all of which are targeted by a given vaccine, and any given year’s vaccine is based on epidemiologists’ best guess about which strains will be circulating that year. The molecular targets for flu mutate/adapt more than the spike protein for coronavirus. So it is not an apples to apples comparison, for multiple reasons. Many other vaccines have higher efficacy rates.

    • Yeah, I think the efficacy stated is probably “assuming a population of similar composition to the sample group”.

      It would make sense to be less effective in the very oldest/least healthy/most immunocompromised, I’d think, as a vaccine requires a certain degree of functionality of the person’s own immune system.

      But this still seems extremely good — people were talking about 50-70% efficacy, not 90%+, and the US population isn’t so old as to shift it *that* much!

  5. I read that in the Pfizer trial they only tested people who were symptomatic. The PR for Moderna suggests the same – “prevented virtually all symptomatic cases of Covid-19”. This suggests that what is being measured is not whether it stops people getting it, but whether it stops symptoms. It seems to me this has statistical implications. Now, is the number of infections the same in each group, with symptoms only manifesting in a smaller number, or does the number of infections drop overall? It seems we don’t really know this yet. And I worry that the protocols don’t involve the right sort of testing to find this out.

    • Is there any reason to believe that asymptomatic cases would be more prevalent in one arm of the trial than the other? Is there any reason to believe that something that prevents symptomatic cases would not also prevent asymptomatic cases?

      • Sure, your immune system *could* suppress the virus to the point you don’t really notice but you can still transmit it. That is the whole idea behind the asymptomatic people needing to wear a mask.

        • You’re forgetting that masks work for people when they’re pre-symptomatic, i.e. they have no or mild symptoms and haven’t been tested yet.

          A lot here depends on access to testing, i.e. at what level of symptoms do people get tested?

          But in the end, does it matter?
          Even if p(did not get sick from infection) > p(did not transmit to others), as long as the reduced transmissivity pushes the reproduction rate firmly below 1, herd immunity will stop the virus.

        • Definitely… but the issue brought up by Brian suggests a scenario in which the vaccine would reduce only symptomatic cases by turning them into asymptomatic cases without reducing the asymptomatic cases that would have occurred as well. So let’s say there would have been 100 asymptomatic cases and 100 symptomatic cases in each group without vaccine. Would the symptomatic cases that the vaccine prevented just be added to the asymptomatic cases such that in the vaccine group we now have 5 symptomatic cases and 195 asymptomatic cases with no reduction in actual infections? If that’s the case, then there certainly would be statistical issues. My question is whether we have any reason to believe that is actually the case (or anywhere close to it).
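The hypothetical scenario above can be written out as arithmetic. The counts are the made-up ones from the comment, and equal-sized arms are assumed, so ratios of counts stand in for ratios of risks.

```r
# Hypothetical counts from the scenario above, not trial data.
placebo <- c(symptomatic = 100, asymptomatic = 100)
vaccine <- c(symptomatic = 5, asymptomatic = 195)

# Efficacy against symptomatic disease, which is what the trials measure.
ve_symptomatic <- 1 - vaccine[["symptomatic"]] / placebo[["symptomatic"]]

# Efficacy against any infection, which would require testing everyone.
ve_infection <- 1 - sum(vaccine) / sum(placebo)

print(c(symptomatic = ve_symptomatic, infection = ve_infection))
```

In this extreme scenario the measured efficacy is 95% while efficacy against infection is zero, which is why the distinction matters for transmission even if it does not matter for the primary endpoint.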

        • It doesn’t sound terribly plausible, though I’m not an immunologist — immunity isn’t binary, it would be very strange if it prevented 95% of symptomatic infections without reducing infections or contagiousness at all!

        • I think the issue is if asymptomatic are multiples of symptomatic. If all we do is reduce the number of symptomatic without reducing asymptomatic, then we may not move R very much. The truth is I don’t know the answer and I have no priors on this (the closest I got to medicine was a course on experimental design). But I would have hoped the trials were set up to answer this question.
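How much R moves depends on the reduction in transmission, not on the reduction in symptoms. A minimal sketch under a simple homogeneous-mixing model, where a vaccinated fraction of the population transmits at a reduced rate; the values of `r0` and `coverage` are hypothetical placeholders, not estimates.

```r
# Effective reproduction number when a fraction `coverage` of the population
# is vaccinated and the vaccine cuts onward transmission by `ve_transmission`.
# Assumes homogeneous mixing and no other interventions.
r_effective <- function(r0, coverage, ve_transmission) {
  r0 * (1 - coverage * ve_transmission)
}

r0 <- 2.5         # hypothetical basic reproduction number
coverage <- 0.7   # hypothetical vaccinated fraction

reductions <- c(0, 0.25, 0.5, 0.75, 0.95)
print(data.frame(
  ve_transmission = reductions,
  r_eff = r_effective(r0, coverage, reductions)
))
```

If the vaccine only converted symptomatic cases into equally infectious asymptomatic ones, `ve_transmission` would be near zero and R would barely change, however high the efficacy against symptoms.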

  6. I always have the desire to replace the 2×2 table with logistic regression (no need to remember those smart hypothesis tests, the model is generative, immune under retrospective design, easy to specify a prior, etc.).

    For this dataset, we can run the following regression to estimate the “effectiveness”:

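    library(rstanarm)
    # Placebo arm (x = 0) has the 90 cases; vaccine arm (x = 1) has the 5 cases.
    # 1e4 non-cases per arm is a round number; the trial had roughly 15k per arm.
    y1=c(rep(1,90),rep(0,1e4)) # control (placebo)
    x1=y1*0
    y2=c(rep(1,5),rep(0,1e4)) # treatment (vaccine)
    x2=y2*0+1
    y=c(y1,y2)
    x=c(x1,x2)
    fit=stan_glm(y~x,family = binomial(link = "logit"),data = data.frame(y=y,x=x))
    print(fit)
    # Posterior draws as a matrix with columns "(Intercept)" and "x".
    draws=as.matrix(fit)
    alpha=draws[,"(Intercept)"]
    beta=draws[,"x"]
    # Effectiveness = 1 - (risk in vaccine arm)/(risk in placebo arm); plogis is the inverse logit.
    ve=1-plogis(alpha+beta)/plogis(alpha)
    mean(ve)
    quantile(ve, c(0.025,0.975))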
    

    I am using the default prior here but presumably we can do better. The outcome is 94.0%, with a 95% interval of (87%, 98%). So I guess the final number would not “be a lot lower than 95%” based on such evidence.
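For anyone without Stan installed, a maximum-likelihood version of the same logistic regression runs in base R. This sketch assumes 15,000 participants per arm (the rounded figure from the post) and uses a Wald interval on the odds ratio, which is close to the risk ratio here because cases are rare; it is a cross-check, not a replacement for the Bayesian fit above.

```r
# Aggregated data: one row per arm. Arm sizes are the rounded 15k figure.
arms <- data.frame(
  vaccine = c(0, 1),
  cases = c(90, 5),
  n = c(15000, 15000)
)

# Logistic regression on (cases, non-cases) per arm.
fit_ml <- glm(cbind(cases, n - cases) ~ vaccine,
              family = binomial(link = "logit"), data = arms)

alpha_hat <- coef(fit_ml)[["(Intercept)"]]
beta_hat <- coef(fit_ml)[["vaccine"]]

# Point estimate: one minus the ratio of fitted risks.
ve_hat <- 1 - plogis(alpha_hat + beta_hat) / plogis(alpha_hat)

# Wald interval for the log odds ratio, mapped to 1 - odds ratio.
ci_log_or <- unname(confint.default(fit_ml)["vaccine", ])
ve_ci <- c(lower = 1 - exp(ci_log_or[2]), upper = 1 - exp(ci_log_or[1]))

print(round(ve_hat, 3))
print(round(ve_ci, 3))
```

Because the model is saturated, the point estimate reproduces 1 – 5/90 exactly; only the interval depends on the modelling choices.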

  7. There is no reason to believe COVID and flu are analogous. SARS-CoV-2 is a recent spillover from an animal reservoir and has very different biology. For example, coronaviruses have a proofreading exonuclease that dramatically reduces their mutation rate. As a result, SARS-CoV-2 has limited standing genetic diversity and a much lower mutation rate than typical influenza viruses. Moreover, the most prominent vaccines against influenza viruses are generated by passaging live virus in eggs, which generates myriad evolutionarily-adaptive substitutions and thus antigenic divergence from the selected vaccine strain. The two announced SARS-CoV-2 vaccines are both mRNA vaccines. They are not passaged and therefore retain the exact spike protein sequence that was originally selected for vaccine production.

    Basically the only thing they have in common is they are both respiratory viruses. Otherwise, the biology and vaccine are completely different.

  8. It’s instructive to look at the Tamiflu history. Basically all data NOT reviewed by any country; glowing Cochrane reviews not based on full data; before stockpiling and selling $8 billion worth. The full data was 130,000 pages reviewed by British Medical Journal concluding, essentially, it was not very effective (details in the BMJ 2014 review) with significant neurological adverse effects not reported by Roche. It took 4 years to wangle the full data out of Roche. That’s why the call for “full data” is so important.
    BMJ 2014;348:g2545, doi: https://doi.org/10.1136/bmj.g2545 (published 09 April 2014).
    For a shorter summary, see Indian J Pharmacol. 2015 Jan-Feb; 47(1): 11–16, doi: 10.4103/0253-7613.150308, PMCID: PMC4375804.

    Peter Doshi of the BMJ, one of the reviewers of the full Tamiflu data, has comments worth reading on the vaccine data available. https://blogs.bmj.com/bmj/2020/11/26/peter-doshi-pfizer-and-modernas-95-effective-vaccines-lets-be-cautious-and-first-see-the-full-data/
