“We’ve got to look at the analyses, the real granular data. It’s always tough when you’re looking at a press release to figure out what’s going on.”

Chris Arderne writes:

Surprised to see you hadn’t yet discussed the Oxford/AstraZeneca 60%/90% story on the blog.

They accidentally changed the dose for some patients without an hypothesis, saw that it worked out better and are now (sort of) claiming 90% as a result…

Sounds like your kind of investigation?

I hadn’t heard about this so I googled *Oxford/AstraZeneca 60%/90%* and found this news article from Helen Branswell and Adam Feuerstein:

AstraZeneca said Monday that its coronavirus vaccine reduced the risk of symptomatic Covid-19 by an average of 70.4%, according to an interim analysis of large Phase 3 trials conducted in the United Kingdom and Brazil. . . .

The preliminary results on the AstraZeneca vaccine were based on a total of 131 Covid-19 cases in a study involving 11,363 participants. The findings were perplexing. Two full doses of the vaccine appeared to be only 62% effective at preventing disease, while a half dose, followed by a full dose, was about 90% effective. That latter analysis was conducted on a small subset of the study participants, only 2,741.

A U.S.-based trial, being supported by Operation Warp Speed, is testing the two-full-dose regimen. That may soon change. AstraZeneca plans to explore adding the half dose-full dose regimen to its ongoing clinical trials in discussions with regulatory agencies . . .

Fauci cautioned that full datasets — which the Oxford researchers said they intend to publish in a scientific journal — need to be pored over before conclusions can be drawn.

“We’ve got to look at the analyses, the real granular data. It’s always tough when you’re looking at a press release to figure out what’s going on,” Fauci said. . . .

Indeed, it’s hard to deconstruct a press release. In this case, the relevant N is not the number of people in the study; it’s the number of coronavirus cases. If this was proportional in the subset, that’s (2741/11363)*131 ≈ 32 cases . . . ok, if there are equal numbers in the placebo and treatment groups, and the risk is reduced by 90%, then that would be something like 30 cases in the placebo group and 3 in the treatment group, if I’m thinking about this right. A 70% reduction would be 9 cases in the treatment group. If you expect to see 9, then it would be unlikely to see only 3 . . . I guess I’d do the Bayes thing and estimate the efficacy to be somewhere between 70% and 90% . . . of course that’s making the usual assumption that the vaccine is as effective in the real world as in this trial.
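A quick sketch of that arithmetic. All inputs come from the press coverage quoted above; the 30-placebo / 3-vaccine split is the rough guess in this paragraph, and the flat prior over a 50–99% efficacy grid is an arbitrary choice for illustration, not anything from the trial:

```python
import math

# Expected number of cases in the half-dose subset, assuming cases are
# proportional to subset size (numbers from the quoted press coverage).
total_cases, n_total, n_subset = 131, 11363, 2741
subset_cases = n_subset / n_total * total_cases
print(round(subset_cases))  # ≈ 32

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

# "The Bayes thing", crudely: likelihood of seeing 3 vaccine cases when the
# placebo arm has ~30, evaluated on a grid of candidate efficacies with a
# flat prior over 50-99% (an arbitrary illustrative choice).
placebo_cases = 30
grid = [e / 100 for e in range(50, 100)]
weights = [poisson_pmf(3, placebo_cases * (1 - e)) for e in grid]
mean_eff = sum(e * w for e, w in zip(grid, weights)) / sum(weights)
print(round(mean_eff, 2))  # lands between the 70% and 90% figures
```

With these made-up inputs the posterior mean comes out around 0.87, consistent with "somewhere between 70% and 90%."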

OK, so I guess I didn’t have that much to say on this one. Ultimately I hope they can learn not just from these topline numbers but by looking at more direct measurements of antibodies or whatever on the individual patients.

Full disclosure: I have done some work with AstraZeneca.

P.S. I think it’s useful to sometimes post on statistical issues like this where I have no special insight, if for no other reason than to remind people that nobody, myself included, has the answer all the time.

36 thoughts on ““We’ve got to look at the analyses, the real granular data. It’s always tough when you’re looking at a press release to figure out what’s going on.””

  1. Rumor (said to be based on phone calls with AZ) has it that the numbers in the UK arm were indeed 30/3.
    Here’s one likely contributor (besides random chance and different population characteristics) to the difference between the UK (1/2+1) arm and the Brazil (1+1) arm. The dosing error in the UK was discovered because they noticed a peculiar absence of the usual unpleasant reactions (flu-ish feeling for a day,…). Those same common reactions effectively unblind the study. If you get a shot and then feel sick for 24 hours, you know you got the vaccine. You’re likely to be less careful about distancing. So in the (1+1) arm, the direct biological effect of the vaccine would be underestimated by more than in the (1/2+1) arm. To the extent that this behavioral effect plays a role, the actual net individual effect of a real vaccine will always be overestimated.

    But then we get to the herd effect issue.

    Despite the frustratingly incomplete reporting on the results from AZ, they should soon release an important piece of data not available for the mRNA vaccines. In the UK, they did regular PCR tests, so they should have a good idea not only of how many symptomatic cases were avoided but also of how many contagious asymptomatic cases were avoided. That’s very important to know for deciding whether to focus on vaccinating the most vulnerable or the most likely links in the transmission tree. The mRNA trials just didn’t measure asymptomatic cases, so those vaccines should probably go more to the vulnerable, for whom the benefits are clear. (Disclosure: I have a personal interest in that.)

    • > If you get a shot and then feel sick for 24 hours, you know you got the vaccine.

I think the UK/Brazil trials used a meningitis vaccine as comparator. While I’m not sure how bad its side effects are, the risk of unblinding may be lower than in the PFE/MRNA trials.

  2. I understand skepticism when reporting is opaque, but a lot of this just comes down to whether you are a pre-registration-phile or a learn-what-you-can-from-the-data-phile. Subgroup analyses are noisy, but the same principles of inference apply, and supplying some plausible causal mechanism helps. The dosing error is sort of unfortunate, but because it was done systematically, you can learn from it! It seems prima facie weird that a half-full regimen would work better than a full-full regimen. I don’t know enough about this to argue one way or the other though. But the data are the data (hopefully!).

    Michael’s comment offers that plausible causal mechanism for the higher effectiveness in the subgroup, so perhaps dosing of at least a half dose doesn’t make a difference and the vaccine is really only 60% effective at reducing risk immunologically (or it could be some combo where a half-full regimen is better but there is also the behavioral component…). That’s the value that matters since people will be ‘unblinded’ when they choose to get it. It then makes me wonder if the other vaccines have this feature and their “true” effectiveness is overestimated (and to restate the caveat: conditional on Michael being correct that such a behavioral difference between placebo and vaccine exists). Again, I don’t know enough about vaccines to say whether they are similar enough that we can somewhat generalize across brands.

    • The other explanation is that antibodies were raised against the other components of the vaccine, then the booster dose was neutralized by those antibodies before it could stimulate the better anti-spike response.

      For the Russian vaccine they used two different versions of the vaccine for this reason.

      • Thanks! Further illustrating the importance of subject matter expertise when thinking about what inferences can be drawn from statistical results. Carlos Ungil also notes the control might not have been saline injections, helping cut against the behavioral hypothesis.

        Again, and I think Andrew would agree with this, the main thing here isn’t whether the run or analysis was “good” or “bad” but what can be learned from it. Hence, Andrew’s less-than-frenzied take.

Many complaints are coming from pre-registration-philes. But, in their defense—and this could just be how the media reports things—AstraZeneca sounded too credulous of their results rather than emphasizing sources of uncertainty. From https://www.wired.com/story/the-astrazeneca-covid-vaccine-data-isnt-up-to-snuff:

      1) “[I]t has since been revealed that the people who received an initial half-dose—and for whom the vaccine was said to have 90-percent efficacy—included no one over the age of 55. That was not the case for the standard-dosing group, however, where the reported efficacy was 62 percent.”

      2) This was a meta-analysis combining different experiments with different protocols. “A month later, a second Phase 3 trial for the vaccine started in Brazil. That one was for healthcare workers, for whom the risk of being exposed to Covid was far higher than it was for the people in the UK trial. But the two trials had other substantive differences. In the UK, for example, the volunteers who did not get the experimental Covid vaccine were injected with meningococcal vaccine; in Brazil, those in the comparison group were given a saline injection as a placebo.”

      Since each arm was an RCT and I have little reason to think these guys somehow abused the data, I actually think a lot of complaints are overblown. I also don’t have a prior on how likely it is to find a smaller first dose enhances efficacy. Still, I remain a little skeptical.

      • Anonymous —
You are to be congratulated for mentioning the selective age distribution in the half-dose sample for which AstraZeneca claimed 90% efficacy. This situation strongly suggests the possibility of selection bias affecting this efficacy finding. Why are most contributors to this blog topic not even mentioning this issue?

I’m not claiming that the unblinding effect is big, because I really don’t know enough about this field to say. Just that it’s a plausible contributor to the difference along with chance, different test populations, and of course subtle but direct biological effects (see e.g. Anoneuoid below) that might make (1/2+1) better.
      Just as an unscientific guess, maybe the huge conscious differences in Covid-avoiding behavior might make the behavioral reduction of vaccine effects bigger than for e.g. flu, where people don’t take as much of a range of protective steps.

The problem with arguing that somehow unblinding would make more people in the placebo group get the virus: why would that be the case? Wouldn’t you expect the unblinding effect to be that people in the treatment group took more risks and then became more likely to get the virus despite the treatment? (Or that the controls assumed they were controls and thus remained more cautious?)

        • One could just as easily say, the unblinding effect on the treatment group reduced their exposure during the study period, if side-effects made some substantial number of them feel poorly and hence *reduce* their activity. I cannot imagine any of these questions being settled by armchair psychology — it seems if they’re addressable at all, it’d have to be by recourse to post-hoc analysis of “longitudinal” activity logs of both groups of experimental subjects.

  3. ” the relevant N is not the number of people in the study; it’s the number of coronavirus cases. ”
    _____

    … but the number of “Cases” is also irrelevant without a medically valid definition of “Cases” denoting an active COVID-19 infection.
    The accuracy of current COVID-19 “tests” is also very suspect.

    Both U.S. vaccine trials have a treatment group that received the vaccine and a control group that did not.
    All the trial subjects were “tested” as COVID-19 negative before start of the trial.

    Analysis for both trials was performed when an arbitrary target number of “cases” were reached. “Cases” were defined by positive polymerase chain reaction (PCR) testing.

There was no information released publicly about the cycle number for the PCR tests (the number of sample amplifications).
    There was no information about whether the “cases” had actual symptoms or not.
    There was no information about hospitalizations or deaths.

The Pfizer study had 43,538 participants and was analyzed after 164 cases. So, roughly 150 out of 21,750 participants (less than 0.7 percent) became “PCR positive” in the control group, and about one-tenth that number became PCR positive in the vaccine treatment group.

    The Moderna trial had 30,000 participants. There were 95 “cases” in the 15,000 control participants (about 0.6 percent) and 5 “cases” in the 15,000 vaccine participants (about one-twentieth of 0.6 percent).
The “efficacy” figures quoted in these press-release announcements are relative risk reductions (with outcomes this rare, nearly identical to odds ratios).
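Checking the arithmetic with the comment’s own rough numbers (Pfizer: ~150 control cases vs. about one-tenth as many vaccine cases, in arms of ~21,750 each; Moderna: 95 vs. 5 in arms of 15,000). These are this comment’s approximations, not the official trial figures:

```python
def efficacy(cases_vax, n_vax, cases_placebo, n_placebo):
    """1 minus the risk ratio: the usual vaccine-efficacy definition."""
    return 1 - (cases_vax / n_vax) / (cases_placebo / n_placebo)

# Pfizer, with the comment's round numbers: one-tenth of ~150 cases.
print(round(efficacy(15, 21750, 150, 21750), 3))  # 0.9
# Moderna, with the comment's 95-vs-5 split in equal arms.
print(round(efficacy(5, 15000, 95, 15000), 3))    # 0.947
```

With equal-sized arms, the group sizes cancel and efficacy reduces to one minus the case ratio, which is why the round numbers give such clean percentages.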

    • > “Cases” were defined by positive polymerase chain reaction (PCR) testing.

      Cases were _confirmed_ by PCR testing.

      > There was no information about whether the “cases” had actual symptoms or not.

      What do you mean by “there was no information”? Looking at the trial protocols one can tell that every case had actual symptoms.

I’m 100% certain that the FDA required them to define what a case was before starting the trials. In the Pfizer trial, from what I read, participants only got a PCR test if they reported symptoms. So the number of infections in both arms was likely higher than the number discussed in this analysis.

I believe the original power analyses assumed something like a 0.7 or 0.8 percent base rate of infections. Of course they deliberately chose study groups with a much higher baseline risk than the general world population (healthcare workers, people in Brazil). But that is why they announced the results when they did: they had passed the number of cases required by the power analysis. Between the time they reached that number and the time the FDA approved the announcements, they had accumulated even more cases.

  4. There’s been some confusion about the numbers, and there’s a typo in the paper you quote, but from what I can tell the figures we know are as follows.

The results combine two trials. One in the UK, with 12,390 participants, one in Brazil with 10,300 participants, so 22,690 participants all up. In the non-placebo group in Brazil, all participants received two full doses. In the group in the UK, 2,741 participants received a half dose and a full dose, and some number received two full doses. The total number of participants receiving two full doses was 8,895, but it is not stated how many of these were in Brazil and how many in the UK, giving 11,636 receiving the vaccine in total – there is a transposition error in the paper. Of the 11,636, 30 became infected. Of the total placebo group, 101 became infected, giving rise to the 70% overall figure, although I can’t get exactly 70.4%.

The 90% claim is for the 2,741 who received the half dose – a lot of speculation along the lines of the above as to how many of these got sick, but most people seem to think either 2 or 3.

There has been some criticism of combining the results from two different studies in different countries, but we’ll just have to wait for the actual data to be published to see what the exact numbers were.
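Recomputing the overall figure from the numbers assembled in this comment. Note the 11,054 placebo count is simply 22,690 total minus 11,636 vaccinated, an inference from the comment rather than an official figure:

```python
# Case counts and group sizes as reconstructed in the comment above.
n_vax, cases_vax = 11636, 30
n_placebo, cases_placebo = 22690 - 11636, 101  # placebo size inferred

# Efficacy as 1 minus the risk ratio.
eff = 1 - (cases_vax / n_vax) / (cases_placebo / n_placebo)
print(round(eff, 3))  # ≈ 0.718, near but not exactly the announced 70.4%
```

So these reconstructed numbers land close to, but not exactly at, the 70.4% in the press release, consistent with the comment’s remark.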

  5. Why, though, “If you expect to see 9, then it would be unlikely to see only 3 ” … I think that when they chose the sample size they probably used conservative estimates of efficacy and of base rate. People I think don’t necessarily realize how huge these sample sizes are. Even the half dose 2790 sample is very large compared to most phase 3 studies. They didn’t want to wait a year to get enough cases to move on.

    Also, just based on talking to people in the pharma industry, it’s not like these studies are one and done, they continue to test different formulations, populations, dosages even after the initial phase three as they move from the early work to market. But in this case the time frame is so compressed that some of that will be done even as they move to emergency use. They also need much more follow up time to understand subsequent safety issues and long term efficacy.

    • On the “If you expect to see 9, then it would be unlikely to see only 3”, no idea what the real Andrew’s thinking is, but if you’re a back of the envelope statistician, and you’re thinking Poisson, what’s the chance of seeing 3 or below in P(9)?

• And Excel (so sue me) tells me if the mean of a Poisson distribution is 9 the chance of seeing 3 or below is around 2%. And if the mean is 6, the chance is around 15%.
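The same back-of-the-envelope Poisson calculation can be replicated with the Python standard library, no Excel required:

```python
import math

def poisson_cdf(k, lam):
    # P(X <= k) for X ~ Poisson(lam), summed term by term.
    return sum(math.exp(-lam) * lam ** i / math.factorial(i)
               for i in range(k + 1))

print(round(poisson_cdf(3, 9), 3))  # 0.021, i.e. about 2%
print(round(poisson_cdf(3, 6), 3))  # 0.151, i.e. about 15%
```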

      • Yes, exactly. sqrt(9) is 3, so [9 +/- 2 sd] is [3, 15], hence it’s unlikely to see as few as 3 counts if the expectation is 9. But in any case we’d like to get more information on the people who were in the treatment and control groups in each study.

  6. My guess is that this vaccine will get US regulatory approval for the half dose regimen for those under 55. My understanding is that the US based sites that are still enrolling have switched to the low dose. I don’t think the efficacy will be quite as good as the Moderna and Pfizer vaccines. But it doesn’t need to stored at very low temps. It will be interesting how it is decided who gets which vaccine.

And here, I focused on the definition of cases, which is that people in the study were only tested for Covid if they showed symptoms. If both dosage regimens generate asymptomatic cases, those would slip through undetected, so maybe the difference is just that the dosage shifts a few cases between symptomatic and asymptomatic. That is from this in an NYT article: ‘The clinical trials run by Pfizer and other companies were specifically designed to see whether vaccines protect people from getting sick from Covid-19. If volunteers developed symptoms like a fever or cough, they were then tested for the coronavirus. But there’s abundant evidence that people can get infected with the coronavirus without ever showing symptoms. And so it’s possible that a number of people who got vaccinated in the clinical trials got infected, too, without ever realizing it. If those cases indeed exist, none of them are reflected in the 95 percent efficacy rate.’

As a question, wouldn’t the first thought be that you may have displaced some cases from symptomatic to asymptomatic based on dosage? That is, you have two tails and a small difference, so the dosage may simply lower the stimulation that determines whether you’re symptomatic, which is plausible mainly because we’re guessing and this guess fills the gap.

  8. Greetings!

    Could someone please clarify the end-point data collection issues (infections) in any of the new vaccine studies?

    I understand both arms of the trial were injected with either a full or partial dose of vaccine/placebo about a month apart, and then allowed to ‘roam freely’ and otherwise go about their lives as they did prior to the treatment.

    Later, a number of positive cases was collected in each arm.

    Are we expected to assume all of them had equal chance of contracting COVID? Was there any experimental control as to how susceptible any of the participants are, in the first place?

How anyone gets infected is random, and without a proper experimental control they may only be documenting noise. Actually, the higher ‘success’ in the half-dose arm, which makes no sense, leads me to believe that is exactly what is happening.

    I hope I’m wrong.
    Thanks

• Presumably the experimental and placebo groups were chosen by a random number generator. The sizes of these groups were thousands to tens of thousands of people. Under these conditions we expect the two groups to be within epsilon of each other in terms of the distributions of any important variables. So, no, the groups are not just different in susceptibility or occupational exposure or the like.
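A toy simulation of that point: with large randomized arms, even an unmeasured risk factor ends up nearly identically distributed in both groups. The 20% "high-risk" share and the sample size here are made-up numbers for illustration only:

```python
import random

random.seed(1)
n = 20000  # roughly the scale of these trials

# An unmeasured covariate: 20% of participants are "high risk" (made up).
high_risk = [random.random() < 0.2 for _ in range(n)]
# Coin-flip assignment to treatment (True) or placebo (False).
arm = [random.random() < 0.5 for _ in range(n)]

n_treat = sum(arm)
rate_t = sum(1 for h, a in zip(high_risk, arm) if a and h) / n_treat
rate_c = sum(1 for h, a in zip(high_risk, arm) if not a and h) / (n - n_treat)
print(abs(rate_t - rate_c))  # a fraction of a percentage point
```

The gap between the arms’ high-risk shares shrinks on the order of 1/sqrt(n), which is the "within epsilon" intuition; what randomization cannot control, as the replies note, is behavior that changes after unblinding.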

      • Thank you for your response, Daniel.

        I’m aware of the sample sizes and I’m sure it was randomized properly, but I am still puzzled about the end-point lack of control.

With something this big at stake, it would be nice to have access to the raw data and all other variables for participants in both arms (infected or not).

        If COVID prevalence was huge, it wouldn’t matter, but the chance to get infected is small to begin with, and a handful (or a few thousands) of infected vs. not in each randomized arm, could very well be a function of their susceptibility due to other factors (essential worker, hot-spot zipcode, public transport user, risk-taking level, etc.)

        I understand that the sample size would take care of some of those issues, simply because of probability laws, but if so, why not have all those variables available to the public?

        A lot of well-controlled studies of smaller importance have (rightfully) been scrutinized, questioned, and dissected in great detail. Why not this one?

      • I presume that pre and post injection activity logs of both arms were compared and were found to be comparable. Right? We would not want the result to depend upon some combination of exogenous and adventitious factors; like, for example, suppose a substantial number in the arm that got the real dose felt like crap for a week or two and stayed in? Suppose many in the arm that got the fake dose learned that the case-rate in their locality was dropping (because of successful mitigation measures) and decided it was o.k. to take a ‘holiday’ from enforced isolation?

• It’s a good point that the vaccine itself could cause changes in behavior. But if they randomized correctly it isn’t credible that the pre-vaccine distributions of behavior or susceptibility were dramatically different.

My understanding was that no one got fake doses: they got either the COVID vaccine or some other vaccine, which also would have had side effects. I think I remember maybe a meningitis one? I haven’t been following carefully. In any case I don’t think the unblinding was as simple and obvious as a saline-vs-COVID comparison.

        • If lingering adverse reactions were indeed sufficiently rare.

          Activity logs sound like a good idea though. Both to make sure that the treated group was not hiding out at home; or that their environment happened to experience a drop in prevalence; and to make sure that the non-treated group was not more brave or devil-may-care than the other group; or that the local environment of the treated group happened to experience a jump in prevalence. It seems elementary, that one would want post-hoc check on the assumption of randomization of all “exogenous” influences which might conceivably spoil the interpretation of outcome.
