
Last post on hydroxychloroquine (perhaps)

James “not this guy” Watson writes:

The Lancet study has already been consequential, for example, the WHO have decided to remove the hydroxychloroquine arm from their flagship SOLIDARITY trial.

Thanks in part to the crowdsourcing of data sleuthing on your blog, I have an updated version of doubts concerning the data reliability/veracity.

1/ Ozzy numbers:
This Australian government report (Table 5) says that as of 10th May, only 866 patients in total had been hospitalized in Australia, of whom 7.9% died (68 patients)… whereas 73 Australian patients in the Lancet paper were reported as having died. The mean age reported in the Lancet paper for Australian patients is 55.8 years. The median age for all Australian patients in the attached is 47 years, and for those hospitalized it’s 61 years. (Note the Lancet paper only included hospitalized people, up to April 14th).

2/ A very large Japanese hospital:
The Mehra et al. paper in the NEJM (Cardiovascular disease, drug therapy, and mortality in Covid-19, same data provenance, time period: Dec 20th to March 15th) gave the number of hospitals broken down by country. They had 9 hospitals in Asia (7 in China, 1 in Japan and 1 in South Korea) and 1,507 patients. Their follow-up paper in The Lancet presumably used the same data plus extra data up until April the 14th. The Lancet paper had 7,555 participants in Asia and also 9 hospitals. The assumption would be that these hospitals are the same (why would you exclude the hospitals from the first analysis in the second analysis?). Therefore, we assume that they had an extra 6048 patients in that time period.
Cases in China went from 80,860 on March the 15th to 82,295 by April the 14th (difference is 1435). South Korea: increase from 8,192 to 10,564 (difference is 2372); Japan: from 833 to 7,885 in this time (7052). This is a total increase of 10,859. If all cases in China and South Korea in the intervening period were seen in these 8 hospitals, then it would imply that 2241 patients were seen in 1 hospital in Japan in the space of a month!
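A minimal sketch of the arithmetic above (all figures from the two papers and the public case counts quoted in the post; the key assumption, as stated, is that the 9 Asian hospitals are the same in both analyses):

```python
# Back-of-envelope check of point 2, using the numbers quoted in the post.
nejm_asia_patients = 1507      # NEJM paper, 9 Asian hospitals, through March 15
lancet_asia_patients = 7555    # Lancet paper, 9 Asian hospitals (assumed same), through April 14
extra_patients = lancet_asia_patients - nejm_asia_patients  # 6048

# National cumulative-case increases between March 15 and April 14
china = 82_295 - 80_860        # 1435
south_korea = 10_564 - 8_192   # 2372
japan = 7_885 - 833            # 7052

# Even if every new case in China and South Korea passed through the
# 8 hospitals there, the single Japanese hospital must account for the rest:
japan_hospital = extra_patients - (china + south_korea)
print(japan_hospital)  # 2241
```

That is 2,241 patients in one Japanese hospital in a month, against a national increase of 7,052 cases.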

3/ High dosing:
Almost two-thirds of the data come from North America (66%, 559 hospitals). In the previous NEJM publication, the majority of the hospitals were in the USA (121 versus 4 in Canada). Assuming that the same pattern holds for the extra 434 hospitals in this Lancet paper, the majority of the patients will have received doses of HCQ according to FDA recommendations: 800mg on day 1, followed by 400mg (salt weights) for 4-7 days. This is not a weight-based dosing recommendation.
The mean daily doses and durations of dosing for HCQ are given as: 596 mg (SD: 126) for an average of 4.2 days (SD: 1.9); HCQ with a macrolide: 597 mg (SD: 128) and 4.3 days (SD: 2). The FDA dosing for 4 days would give an average of 500mg daily, i.e. (800 + 3×400) / 4. No guideline anywhere in the world recommends higher doses than this, with the exception of the RECOVERY trial in the UK.
So are these average daily doses possible?
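As a quick sanity check, the mean daily dose implied by the FDA-style regimen described above can be computed directly (assuming, for illustration, no dose adjustments or interruptions):

```python
# Sketch: mean daily dose under the regimen described in the post
# (800 mg on day 1, then 400 mg/day thereafter; salt weights).
def mean_daily_dose(days):
    doses = [800] + [400] * (days - 1)
    return sum(doses) / days

for d in (4, 5, 6, 7):
    print(d, mean_daily_dose(d))
# A 4-day course averages 500.0 mg/day; longer courses only pull the
# average closer to 400 mg/day, further below the ~596 mg (SD 126)
# reported in the Lancet paper.
```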

4/ Disclaimer/background
It may be worth mentioning that I (or the research unit for which I work) could be seen as having a “vested interest” in chloroquine because we are running the COPCOV study (I am not an investigator on that trial). COPCOV is a COVID19 prevention trial in health workers. Participants will take low dose chloroquine as prophylaxis for 3 months (they are not sick and the doses are about 3x lower than given for treatment – so a different population and dose than the Lancet study). The Lancet study will inevitably damage this trial due to the media attention. Understanding whether the underlying data are reliable or not is of extreme importance to our research group. Because our unit has been thinking/reading about (hydroxy)chloroquine a lot recently (and some people in the group have been studying chloroquine pharmacology for 40 years) we rapidly picked up on the “oddness” of this recent paper.

My conclusion from this is that post-publication review is a vital component of science. Medical journals need to embrace it and stop pretending that peer/editorial review will solve all problems.

Perhaps the authors of that Lancet study will respond in the comments here? They haven’t yet responded on pubpeer.

P.S. The authors have this followup post which has some general discussion of their data sources but no engagement with the criticisms of the paper. On the ladder of responses to criticism, I’d put them at #4 (“Avoid looking into the question”). The good news is that they’re nowhere near #6 (“Attempt to patch the error by misrepresenting what you’ve written, introducing additional errors in an attempt to protect your original claim”) or #7 (“Attack the messenger”). As I’ve said before, I have an open mind on this, and it’s possible the paper has no mistakes at all: maybe the criticisms are misguided. I’d feel better if the authors acknowledged the criticisms and responded in some way.


  1. samuel says:

    Regarding the SOLIDARITY trial: How typical is it for a trial to stop early due to safety concerns based on results reported in an external data source/study? The story at the link mentions that the trials have been suspended “temporarily”. Is it possible that they are being suspended so that their own interim analysis can be done?

    • Luca Beltrame says:

      Trials can be paused and resumed for interim analysis or additional checks, or even external factors (like lack of a specific drug).

      Source: a trial in my institution was stopped due to similar reasons, then restarted after a while.

  2. Anon says:

    It is not the case that HCQ has been removed from the trial. The WHO has paused that arm of the trial to allow for a review of the interim data to see if they are seeing the same impacts on mortality as have been seen in some other studies.

  3. James Watson says:

    There is another paper by some of the same authors (Surgisphere data) on the use of ivermectin to treat COVID. The authors had posted it as a preprint but appear to have since removed it.

    It reports another wonderfully large effect size, with a p-value with too many zeros to count.

    The pdf can still be found here:

    A question for doctors working in the US: is it plausible that 451 patients in the US received ivermectin before the 31st March 2020? This may be completely fine, but my understanding was that it’s very difficult to get this drug for off-label use and that most hospitals would follow FDA guidelines (which do not recommend ivermectin).

  4. Carlos Ungil says:

    The calculations in 2) are wrong because they don’t take into account that non-resolved cases are not included.

    “Cases in China went from 80,860 on March the 15th to 82,295 by April the 14th (difference is 1435).”

    Cases on March 15 that were resolved by March 28 are between 78169 and 78748.

    Cases on April 14 that were resolved by April 21 are between 81292 and 81755.

    The difference is between 2500 and 3600, not 1400.

    “South Korea: increase from 8,192 to 10,564 (difference is 2372)”

    Cases on March 15 that were resolved by March 28 are between 3909 and 5225.

    Cases on April 14 that were resolved by April 21 are between 8331 and 8450.

    The difference is between 3100 and 4600, not 2400.

    (Active cases data from worldometer. By the way, there may be a transcription error in the 8192 above, which should be 8162.)
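    A small sketch of the interval arithmetic behind these ranges (bounds as quoted above; `resolved_case_increase` is a hypothetical helper name, not from the comment):

```python
# If X lies in [x_lo, x_hi] and Y in [y_lo, y_hi], then Y - X lies in
# [y_lo - x_hi, y_hi - x_lo]. Applied to resolved-case counts:
def resolved_case_increase(x_lo, x_hi, y_lo, y_hi):
    return (y_lo - x_hi, y_hi - x_lo)

print(resolved_case_increase(78_169, 78_748, 81_292, 81_755))  # China: (2544, 3586)
print(resolved_case_increase(3_909, 5_225, 8_331, 8_450))      # South Korea: (3106, 4541)
```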

  5. Al says:

    Alongside the issues with dichotomising oxygen saturation and qSOFA score, the relationship between BMI and mortality appears to be linear in Figure 2 (independent predictors of in-hospital mortality).

    Lancet and the authors really buried the lede here; a better title may have received even more media attention – “Starvation: an affordable, widely accessible preventative treatment for COVID-19”

  6. Anoop says:

    “As I’ve said before, I have an open mind on this, and it’s possible the paper has no mistakes at all: maybe the criticisms are misguided. I’d feel better if the authors acknowledged the criticisms and responded in some way”

    To be honest, they could have had some reporting errors, which is expected when you are reporting such large numbers within a short time. And these could be better clarified in the paper. But none of the criticisms point to any major errors in the paper or falsification. For example, controlling for disease severity: they controlled for what is available, which is exactly how 90% of retrospective analyses are done. They didn’t show dose-response, which they mentioned in the limitations. Showing a dose-response would strengthen the argument, but the lack of a dose-response doesn’t make the effect go away either. And he can always write a letter to the editor, and if his concerns are so grave, they will publish it.

    Also, when you allow posts here, it does make an impact, especially with provocative titles. I have seen people sharing this all over twitter questioning the Lancet study and science as a whole.

    • Andrew says:


      I think we’re in agreement about the substance. The data and analysis of that paper might be ok; we just don’t know. The main difference is that I’m sharing people’s concerns and I keep saying I don’t know, whereas you seem to want to give the paper the benefit of the doubt. But why give it the benefit of the doubt? Because the first author of the paper is from Harvard? Because Lancet agreed to publish it? Regarding your statement that their analysis “is exactly how 90% of the retrospective analysis is done”: it all depends on how the adjustments are done, and it also depends on what is being studied. Adjusting for confounders is hard.

      You write that Watson “can always write a letter to editor and if his concerns are so grave, they will publish it.” Yes, and he can also write a letter to me, and I published it here! I also posted on pubpeer. Given the importance of this topic, I don’t think Lancet should be a gatekeeper of the discussion. I mean, why not just ask Watson to write his criticisms on gold tablets and bury them inside a pyramid? That will be there for posterity, right? In the meantime, people are dying. That’s the reason why the published study got all this attention, and that’s the reason the criticisms of the study got this attention.

      • Anoop says:

        Thank you for the reply.

        My point is that the titles of the posts and the article lean more towards “this paper HAS some serious issues” than towards “we don’t really know yet; these criticisms may be misguided and the paper could have no mistakes at all”. And people assume that unless these are serious mistakes, Dr. Gelman won’t make a blog post about it.

        I hope you see my point and I do agree with you about open science.

        • Anon says:

          Andrew and Anoop, I am in agreement with you two about this. There are a lot of really silly comments and conspiracy theories. Anti-vaxxers and other people who just want to see the world burn. None of that is science and I want no part of that discourse.

          Andrew: I think reputations do matter. If someone came out of left field, I think this would be a totally different story. I’m trying to separate the crazy people reacting to “yet another chloroquine-bashing paper” from just trying to understand the science.

        • Anon says:

          @Andrew @James Watson – has anyone bothered to reach out to the authors to ask them any questions?

    • Anoneuoid says:

      “But none of the criticisms point to any major errors in the paper or falsification.”

      They do not describe what they did well enough for anyone to replicate it, this is a fatal flaw for a scientific report. It is worthless.

  7. BenK says:

    The gov’t of India has chosen to pursue HCQ and has some sharp things to say about the study.

  8. Sophie says:

    One very strange thing in the appendix is the hazard ratio for diabetes by region. For America and Europe the hazard ratio is bigger than one (1.305 and 1.151), but for South America, Africa and Australia it is under one (0.744, 0.769 and 0.897). This does not make any sense.

    Is BMI used as confounding factor (Type 2 Diabetes — the most common — is highly correlated to BMI)?
    Is the model over-fitted in some way (we have got not that many diabetics for some regions)?

    • Carlos Ungil says:

      Thanks for the link!

      “In a statement, Surgisphere founder Dr Sapan Desai, also an author on the Lancet paper, said a hospital from Asia had accidentally been included in the Australian data.

      “We have reviewed our Surgisphere database and discovered that a new hospital that joined the registry on April 1 self-designated as belonging to the Australasia continental designation,” the spokesman said. “In reviewing the data from each of the hospitals in the registry, we noted that this hospital had a nearly 100% composition of Asian race and a relatively high use of chloroquine compared to non-use in Australia. This hospital should have more appropriately been assigned to the Asian continental designation.”

      “He said the error did not change the overall study findings. It did mean that the Australian data in the paper would be revised to four hospitals and 63 deaths.”

    • Andrew says:


      Here’s what it says in the linked news article:

      Questions about the paper’s statistical modelling have also come from other universities, including Columbia University in the US, prompting Surgisphere to issue a public statement.

      The link to the public statement goes here, but this public statement does not acknowledge any criticisms at all. Criticisms are addressed only obliquely. For example, the statement says, “We also clearly outlined the limitations of an observational study that cannot fully control for unobservable confounding measures and concluded that off label use of the drug regimens outside of the context of a clinical trial should not be recommended.” But that doesn’t make sense: if their observational study is limited, then how can it be used to make such a strong conclusion? And the criticism is not just that they don’t fully adjust (“control”) for unobservable measures; it’s that they don’t fully adjust for observed measures.

  9. JC says:

    Can someone confirm that the data are all coming from hospitals’ cardiology divisions?

    I came to wonder about it since the release of the Surgisphere announcement that they studied a very specific group of hospitalized patients with COVID-19, and now I am hearing about it elsewhere too.

    So I would like to know if anyone here has some info about that.

  10. Tom Parke says:

    I have a wild guess at the mistake that the authors of the Lancet paper have made.

    I think, looking at the issues in the data that James Watson lists, that the authors may have inadvertently propensity sampled from the treated populations to match control (instead of the other way about), AND used the propensity sampled data when they reported their baseline population characteristics. This would explain the unusually low variance in the baseline characteristics and how they may have ended up reporting more deaths in their Australian data than has actually occurred in the country (continent?) and how they’ve ended up with suspiciously small confidence intervals.

    This would not be the only thing wrong though, there are still problems with the statistical analysis (e.g. no site effect, disease severity is possibly inadequately captured, …).

  11. Adam Brufsky says:

    We have a decent theory of the disease, we believe.

    This trial does not fit it at all, and we were curious why they simply didn’t stratify by a simple inflammatory marker like LDH. LDH >365 predicts for death 10-18 days in advance.

    We wrote about this in a BMJ rapid response.

    We are all awaiting the randomized HCQ trial.

  12. Hello All,

    Others have discussed the limited HCQ/CQ dosing information. For the HCQ/CQ groups, simply a mean daily dose and mean days of therapy are given.
    The authors present the rates of intubation per group. The intubation rate in each treatment group was 3× higher than that in the control group. For this second outcome, intubation, there is simply no dosing information available. It was a very large mistake to allow the authors to state that intubation rates were much higher in all 4 treatment groups, yet not show how much HCQ or CQ each pt received before being intubated. Those data need to be shown, or those statements regarding intubation rates need to be retracted. Do any of you know what percentage of intubated pts received only 200 mg of HCQ before intubation? I certainly can’t determine that from these data; unfortunately, the authors did not present any data on HCQ/CQ dosing prior to intubation. It is unsound to associate drug use with an outcome and not present data on drug exposure before that outcome occurred. It is simply nonsensical to associate the total mean dose of HCQ/CQ with intubation when it is unknown how much of that total dose was given before and how much after each pt was intubated.

    Unlike most antimicrobial agents, HCQ/CQ have long half-lives, meaning that when given daily, the drug level accrues for several months. If HCQ/CQ worsen disease severity, the increase in severity should be dose related; in fact, the disease severity should be linked to the total dose divided by the weight of the pt. Therefore, the total dose of HCQ/CQ must be evaluated in the context of weight. The total dose of HCQ/CQ (weight-based) at the time of intubation of each pt is crucial to understanding these data. If HCQ/CQ worsen severity, this should be seen in those data.

    To be explicit, those that weigh more should do better after the same milligram total dose than those that weigh appreciably less. In our cohort of over 200 Covid-19 pts, the weight range is 40-210 kg. So, after the same total dose of HCQ, a 40 kg pt had about 5× more HCQ per kg than a 210 kg pt. Accordingly, lower-weight pts on the same dosing regimen as severely obese and much heavier pts should do worse, and do worse more quickly. If this pattern is not seen, something is wrong with the interpretation of the data.

    Even if one leaves out weight-based dosing, one must still determine the mean, range, and S.D. of the HCQ/CQ total dose at the time of intubation. While the authors applied a lot of statistical analyses to their data, they left out these most important ones. For instance, was a 200 mg total dose of HCQ at the time of intubation associated with an increased intubation rate compared with controls, or was more HCQ needed? Analysing the data in this way would significantly affect the credibility of their interpretation.
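    The per-kilogram point can be illustrated with a toy calculation (the 2,000 mg cumulative dose is a hypothetical value chosen only for illustration; the 40-210 kg range is from the cohort described above):

```python
# Illustration: identical total HCQ dose, very different per-kg exposure.
def dose_per_kg(total_dose_mg, weight_kg):
    return total_dose_mg / weight_kg

total = 2_000  # hypothetical cumulative dose in mg (illustrative only)
light = dose_per_kg(total, 40)    # 50.0 mg/kg
heavy = dose_per_kg(total, 210)   # ~9.5 mg/kg
print(light / heavy)  # 5.25: a 40 kg pt gets ~5x the exposure of a 210 kg pt
```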

  13. Hello Andrew,

    Regarding the NEJM paper from Columbia/NYP, do you have any idea why 21% of pts had missing BMI data? All EHRs require that the height and weight of each pt be entered before the pt can be registered in the EHR system.
    So why were they missing 21%?
    That’s a very high percentage for a single-center study.
    Stephen Smith

  14. Pete says:

    I’m no HCQ defender, but Surgisphere is obviously a fraudulent company. It completely fabricated the data. See this post for a great explainer on this.

  15. Sabbir Rahman says:

    Given the obvious problems with this study, it is hard to see how it passed peer review unless the journal itself was complicit, together with the lead author, in promoting the conclusion that HCQ is harmful. The fact that a competent body such as the WHO would immediately publicise the results so widely despite its obvious flaws also strongly suggests complicity.

    Of all the drugs undergoing trials it would be strange that there would be such a concerted effort to prevent HCQ from being investigated properly unless it was somehow considered a threat. That could only be the case if (i) HCQ genuinely constituted a significant danger to the public – which seems highly unlikely as it is a well-known drug that has been administered safely for decades, or (ii) it is actually likely to be efficacious – and a number of smaller-scale studies have shown this to be the case when it is administered in combination with zinc supplements in an appropriate way.

    Now, the second of these two possibilities would clearly be to the considerable benefit of the general public in terms of saving lives, and the only reason that it could possibly be considered a threat by the parties concerned is if they were acting on behalf of entities who would be affected negatively should drugs such as HCQ be found to be effective and come into widespread use.

    It is fairly clear that the only parties that would be negatively affected by widespread use of HCQ as an effective treatment for COVID-19 would be the profit-driven pharmaceutical companies, organisations and high-net-worth individuals who stand to gain significantly from sale of newer drugs or vaccines which are still under patent. Effective treatments for COVID-19 through repurposing of low cost generics would pose a considerable financial threat to this promising source of income.

    The lead author of the article, Professor Mandeep Mehra, whose research is funded by pharmaceutical companies, certainly does not shy away from making clear on his LinkedIn page his bias against the use of HCQ, or that Bill Gates is one of his key influencers. In addition, besides contributions from individual governments (no longer the US), the WHO receives its greatest funding contributions from the two largest pro-vaccine organisations, namely the Bill & Melinda Gates Foundation (which is now its largest single source of funding) and the GAVI Alliance – whose founding partners include the Bill & Melinda Gates Foundation, UNICEF, the WHO itself and the World Bank, and whose broader alliance includes the pharmaceutical industry.

    It does not require a great deal of analytical thought to recognise the corruption that must be going on here. The pharmaceutical industry and the vaccine lobby appear to have leading academics, top journals and even global health organisations in their back pockets, and corporate and individual greed is being given priority over the preservation of human life.
