This controversial hydroxychloroquine paper: What’s Lancet gonna do about it?

Peer review is not a form of quality control

In the past month there’s been a lot of discussion of the flawed Stanford study of coronavirus prevalence—it’s even hit the news—and one thing that came up was that the article under discussion was just a preprint—it wasn’t even peer reviewed!

For example, in a NYT op-ed:

This paper, and thousands more like it, are the result of a publishing phenomenon called the “preprint” — articles published long before the traditional form of academic quality control, peer review, takes place. . . . They generally carry a warning label: “This research has yet to be peer reviewed.” To a scientist, this means it’s provisional knowledge — maybe true, maybe not. . . .

That’s fine, as long as you recognize that “peer-reviewed research” is also “provisional knowledge — maybe true, maybe not.” As we’ve learned in recent years, lots of peer-reviewed research is really bad. Not just wrong, as in, hey, the data looked good but it was just one of those things, but wrong, as in, we could’ve or should’ve realized the problems with this paper before anyone even tried to replicate it.

The beauty-and-sex-ratio research, the ovulation-and-voting research, embodied cognition, himmicanes, ESP, air rage, Bible Code, the celebrated work of Andrew Wakefield, the Evilicious guy, the gremlins dude—all peer-reviewed.

I’m not saying that all peer-reviewed work is bad—I’ve published a few hundred peer-reviewed papers myself, and I’ve only had to issue major corrections for 4 of them—but to consider peer review as “academic quality control” . . . no, that’s not right. The quality of the paper has been, and remains, the responsibility of the author, not the journal.

Lancet

So, a new one came in. A recent paper published in the famous/notorious medical journal Lancet reports that hydroxychloroquine and chloroquine increased the risk of in-hospital death by 30% to 40% and increased arrhythmia by a factor of 2 to 5. The study hit the news with the headline, “Antimalarial drug touted by President Trump is linked to increased risk of death in coronavirus patients, study says.” (Meanwhile, Trump says that Columbia is “a liberal, disgraceful institution.” Good thing we still employ Dr. Oz!)

All this politics . . . in the meantime, this Lancet study has been criticized; see here and here. I have not read the article in detail so I’m not quite sure what to make of the criticisms; I linked to them on Pubpeer in the hope that some experts can join in.

Now we have open review. That’s much better than peer review.

What’s gonna happen next?

I can see three possible outcomes:

1. The criticisms are mistaken. Actually the research in question adjusted just fine for pre-treatment covariates, and the apparent data anomalies are just misunderstandings. Or maybe there are some minor errors requiring minor corrections.

2. The criticisms are valid and the authors and journal publicly acknowledge their mistakes. I doubt this will happen. Retractions and corrections are rare. Even the most extreme cases are difficult to retract or correct. Consider the most notorious Lancet paper of all, the vaccines paper by Andrew Wakefield, which appeared in 1998, and was finally retracted . . . in 2010. If the worst paper ever took 12 years to be retracted, what can we expect for just run-of-the-mill bad papers?

3. The criticisms are valid, the authors dodge and do not fully grapple with the criticism, and the journal stays clear of the fray, content to rack up the citations and the publicity.

That last outcome seems very possible. Consider what happened a few years ago when Lancet published a ridiculous article purporting to explain variation in state-level gun deaths using 25 state-level predictors representing different gun control policies. A regression with 50 data points and 25 predictors and no regularization . . . wait! This was a paper that was so fishy that, even though it was published in a top journal and even though its conclusions were simpatico with the views of gun-control experts, those experts still blasted the paper with “I don’t believe that . . . this is not a credible study and no cause and effect inferences should be made from it . . . very flawed piece of research.” A couple of researchers at Rand (full disclosure: I’ve worked with these two people) followed up with a report concluding:

We identified a number of serious analytical errors that we suspected could undermine the article’s conclusions. . . . appeared likely to support bad gun policies and to hurt future research efforts . . . overfitting . . . clear evidence that its substantive conclusions were invalid . . . factual errors and inconsistencies in the text and tables of the article.

They published a letter in Lancet with their criticisms, and the authors responded with a bunch of words, not giving an inch on any of their conclusions or reflecting on the problems of using multiple regression the way they did. And, as far as Lancet is concerned . . . that’s it! Indeed, if you go to the original paper on the Lancet website, you’ll see no link to this correspondence. Meanwhile, according to Google, the article has been cited 74 times. OK, sure, 74 is not a lot of citations, but still. It’s included in a meta-analysis published in JAMA—and one of the authors of that meta-analysis is the person who said he did not believe the Lancet paper when it came out! The point is, it’s in the literature now and it’s not going away.
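To see the overfitting problem concretely, here is a minimal simulation sketch (made-up noise data, not the paper’s actual variables): fit 25 pure-noise predictors to 50 observations by unregularized least squares and look at how much variance gets “explained” by chance alone.

```python
# Minimal sketch, assuming nothing about the real gun-law data: regress a
# pure-noise outcome for 50 "states" on 25 pure-noise "policy" predictors
# using ordinary least squares with no regularization.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_predictors = 50, 25

X = rng.normal(size=(n_states, n_predictors))  # hypothetical state-level predictors (noise)
y = rng.normal(size=n_states)                  # hypothetical outcome, unrelated to X

X1 = np.column_stack([np.ones(n_states), X])   # add intercept
beta, _, _, _ = np.linalg.lstsq(X1, y, rcond=None)
resid = y - X1 @ beta
r2 = 1 - resid.var() / y.var()
print(f"R^2 from pure noise, 25 predictors, 50 points: {r2:.2f}")
```

With 25 predictors and only 50 points you should expect an R-squared around 0.5 even when nothing real is going on, which is why an unregularized regression like this can produce strong-looking but meaningless estimates.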

A few years ago I wrote, in response to a different controversy regarding Lancet, that journal reputation is a two-way street:

Lancet (and other high-profile journals such as PPNAS) play a role in science publishing that is similar to that of the Ivy League among universities: It’s hard to get in, but once you’re in, you have that Ivy League credential, and you have to really screw up to lose that badge of distinction.

Or, to bring up another analogy I’ve used in the past, the current system of science publication and publicity is like someone who has a high fence around his property but then keeps the doors of his house unlocked. Any burglar who manages to get inside the estate then has free run of the house. . . .

As Dan Kahan might say, what do you call a flawed paper that was published in a journal with impact factor 50 after endless rounds of peer review? A flawed paper. . . .

My concern is that Lancet papers are inappropriately taken more seriously than they should be. Publishing a paper in Lancet is fine. But then if the paper has problems, it has problems. At that point it shouldn’t try to hide behind the Lancet reputation, which seems to be what is happening. And, yes, if that happens enough, it should degrade the journal’s reputation. If a journal is not willing to rectify errors, that’s a problem no matter what the journal is.

Remember Newton’s third law? It works with reputations too. The Lancet editor is using his journal’s reputation to defend the controversial study. But, as the study becomes more and more disparaged, the sharing of reputation goes the other way.

I can imagine the conversations that will occur:

Scientist A: My new paper was published in the Lancet!

Scientist B: The Lancet, eh? Isn’t that the journal that published the discredited Iraq survey, the Andrew Wakefield paper, and that weird PACE study?

A: Ummm, yeah, but my article isn’t one of those Lancet papers. It’s published in the serious, non-politicized section of the magazine.

B: Oh, I get it: The Lancet is like the Wall Street Journal—trust the articles, not the opinion pages?

A: Not quite like that, but, yeah: If you read between the lines, you can figure out which Lancet papers are worth reading.

B: Ahhh, I get it.

Now we just have to explain this to journalists and policymakers and we’ll be in great shape. Maybe the Lancet could use some sort of tagging system, so that outsiders can know which of its articles can be trusted and which are just, y’know, there?

Long run, reputation should catch up to reality. . . .

I don’t think the long run has arrived yet. Almost all the press coverage of this study seemed to be taking the Lancet label as a sign of quality.

Speaking of reputations . . . the first author of the Lancet paper is from Harvard Medical School, which sounds pretty impressive, but then again we saw that seriously flawed paper that came out of Stanford Medical School, and a few months ago we heard about a bungled job from the University of California medical school. These major institutions are big places, and you can’t necessarily trust a paper just because it comes from a generally respected medical center.

Again, I haven’t looked at the article in detail, nor am I any kind of expert on hydro-oxy-chloro-whatever-it-is, so let me say one more time that outcome 1 above is still a real possibility to me. Just cos someone sends me some convincing-looking criticisms, and there are data availability problems, that doesn’t mean the paper is no good. There could be reasonable explanations for all of this.

83 thoughts on “This controversial hydroxychloroquine paper: What’s Lancet gonna do about it?”

  1. One additional important development: The WHO announced today that they would be temporarily pausing their trial of hydroxychloroquine, citing the results of The Lancet paper.

  2. Excellent post. Reputations adjust. If something is published in Lancet, we now know that conveys little information. It might actually be a good paper!

    This is now how I view the fields of climate science and epidemiology. If an expert in one of these fields says something that is policy-relevant, it might actually be true, but it has no more credibility than if a journalist said it, and in the case of an assertion about some issue in climate science, less credibility than if, say, a statistician or a mechanical engineer said it.

    Maybe I should clarify that, just in case. If someone with a climate science PhD says, “The temperature will be X in 2050,” that has as much credibility as if a journalist makes the same statement, but less than if someone with stats or mechanical engineering PhD says it. None of them has any expertise worth trusting, but the stats or mechanical engineering PhD is a smart guy who works in a field with intellectual integrity, so our prior on his reliability and honesty should be higher.

    That is not to say individual people who publish in Lancet, climate science, or epidemiology are not very very good in morals and intellect, just that their apparently prestigious connections should not be relied upon.

    • “Maybe I should clarify that, just in case. If someone with a climate science PhD says, “The temperature will be X in 2050,” that has as much credibility as if a journalist makes the same statement, but less than if someone with stats or mechanical engineering PhD says it. None of them has any expertise worth trusting, but the stats or mechanical engineering PhD is a smart guy who works in a field with intellectual integrity, so our prior on his reliability and honesty should be higher.”

      As someone with a climate science PhD (Geography, but my thesis work got published in the Journal of Applied Meteorology), I suppose I should just slink off into a corner, but instead I will point out that climate model predictions have held up pretty well, even older ones: https://www.carbonbrief.org/analysis-how-well-have-climate-models-projected-global-warming

      • John,
        I assumed Eric was trolling or joking or something. Indeed, if you search on YouTube you can find way more engineers saying wacky things about climate science than climate scientists saying wacky things about climate science…not that YouTube is necessarily a reliable way to judge the bigger picture. Still, I have long noticed the phenomenon of engineers expounding naive notions of climate.

        Climate scientists as a group have nothing to be ashamed of. No need to slink off anywhere.

        • The very characterization that climate scientists say “The temperature will be X in 2050” displays a baseline ignorance of what “climate scientists say.”

          Prolly a tic of engineers that they can’t figure out what people actually say?

    • > but the stats or mechanical engineering PhD is a smart guy who works in a field with intellectual integrity

      Woah woah calm your horses there. I make no claims to intellectual integrity. Also switch guy -> person.

      Maybe the moral to the story here is that these high level credentials are just kinda useless. You gotta zoom in more if you’re gonna get anywhere. The solution is not to just give credence to other high level credentials (stats and mechE PhDs).

  3. Even the most extreme cases are difficult to retract or correct. Consider the most notorious Lancet paper of all, the vaccines paper by Andrew Wakefield, which appeared in 1998, and was finally retracted . . . in 2010. If the worst paper ever took 12 years to be retracted, what can we expect for just run-of-the-mill bad papers?

    But that most notorious Lancet paper of all was shot down immediately! Maybe they took too long to find the fraud in it and “piss on its grave” but it’s not a good example of simple bad science going unrecognised as such and uncorrected for too long.

    • Paul:

      Exactly. It got shot down right away but it took Lancet 12 years to retract. Your terminology helps make my point: retraction is “pissing on its grave,” hence except in extreme cases the bad papers just sit there in the literature forever. I assume that JPSP hasn’t retracted that horrible ESP paper either.

      • Andrew:

        I don’t think this is right. It took years to discover that Wakefield committed actual fraud, and that only came out as a result of extensive litigation. The original Wakefield paper was just a case series. So, it should never have been interpreted as anything other than a disturbing cluster of cases that should have been looked into. Of course, it was not a cluster of cases. Wakefield directed patients who had been recruited by a plaintiff’s law firm to the same hospital, which made it look like there was a strange pattern emerging. Case series are important because when doctors notice a strange new group of patients that present in the same way, it could be an emerging medical emergency that everyone needs to know about immediately. I’m not exactly defending Lancet, but in the Wakefield case, Lancet was the victim of a fraud. The pharmaceutical companies’ lawyers were saying Wakefield was a fraud. I think that Lancet was in a difficult spot. They needed someone independently to say, “Yes, the Pharma companies are right, Wakefield is a fraud.” I am not really sure that they could have done that earlier. The fault really lies with Wakefield.

  4. > If you read between the lines, you can figure out which Lancet papers are worth reading.

    I see this line of thought come up a lot with colleagues and at conferences, the idea that sure, there’s a lot of crud floating around, but the smart people know how to avoid it.

    But to (ab)use Andrew’s analogy, all that response does is say that you need a secret handshake to get into the safe room of the house. Only one’s secret insider knowledge has the power to detect shoddy* work. And meanwhile, one is still able to reap the benefits of being in the house as it is being ransacked!

    To be clear, I think that most people who talk about how “everyone knows X does bad work” or “everyone knows this method is flawed” are making honest assessments based on their understanding, and are not trying to be malicious or hoard power. I think the problem is that the discussion seems to end there, instead of treating that realization as the first step.

    * To use a term Andrew does not use.

    • gec said,
      “To be clear, I think that most people who talk about how “everyone knows X does bad work” or “everyone knows this method is flawed” are making honest assessments based on their understanding, and are not trying to be malicious or hoard power. I think the problem is that the discussion seems to end there, instead of treating that realization as the first step.”

      Maybe you and I live in different subcultures — I rarely hear or read anyone saying or writing things like “everyone knows X does bad work” or “everyone knows this method is flawed”. I think these are inherently ill-thought out — like saying “It’s obvious” when it’s not obvious to a lot of people.

      • First, I’m happy to hear that you don’t encounter this line of thinking too often!

        > it’s not obvious to a lot of people

        This is exactly my rub. If we were in an art salon, doing this for ourselves, maybe this attitude would be fine. But our work gets used in health, engineering, design, and policy by people who do not have the time to become experts in everything and need to trust us. And, arguably more importantly, our work is the foundation for research not just next week, but over the next decades and (maybe) centuries. How would it be “obvious” to someone fifty or a hundred years from now that there’s a widely-acknowledged problem if we don’t document it in the contemporaneous literature?

        I’m in agreement with Andrew that there’s nothing inherently more “correct” of something published in, say, Nature versus a blog post—it’s a question of the quality of the underlying work and we’ve seen that the prestige of the venue is at best weakly correlated with quality. So when I say “literature”, I mean just something that will persist and that is easily connected to the rest of ongoing scientific discourse.

  5. Remember the Lancet study of about 10 or 12 years ago about how many civilians were killed in Iraq during the US war in Iraq?

    https://statmodeling.stat.columbia.edu/2014/01/29/questioning-lancet-plos-surveys-iraqi-deaths-interview-univ-london-professor-michael-spagat/#comment-153714

    The researchers claimed they hired locals who knocked on random doors in Iraq in c. 2006 and interviewed the residents about how many family members had died since 2003. This was very politically controversial at the time.

    I, however, was less interested in the political controversy than in the purported methodology: would hired researchers in Iraq in 2006, a year of vicious ethnic slaughter, really knock on random doors? What was the likelihood of knocking on the wrong door and winding up with a hole drilled in your head?

  6. Here are some of Entsophy’s comments on that Lancet Iraq death toll study:

    (3) …There is no way someone was randomly sampling Anbar’s population during that time. I don’t even believe they were driving anywhere. This was a time when the Iraqi government was putting out public service announcements telling everyone to never stop at a Police checkpoint unless Americans were there, because doing so was liable to be a death sentence otherwise.

    (4) For a time in Anbar, foreign Arabs were being regularly killed by the locals because of suspected Al-Qaida ties. When polling agencies hire locals, they’re typically not actual “locals”, they’re usually Arabs who at best speak the Iraqi dialect and often not even that. So they did not have anything like the freedom of movement that a Westerner would imagine.

    (5) I think it VERY likely the locals hired fudged the results. Everything from never leaving their house and making up data, to asking only friends and family. …

    (7) Some of these interviews had to be cut short because the person being questioned would get violently angry at the interviewer for no reason. Basically, the subject matter would get them wound up and they’d take their anger out on pollster because they were the nearest person (and a free target since they weren’t really local)

    (8) Iraqis believe in an order of magnitude more conspiracy theories than Americans… If a stranger came up to them and started asking questions it would often set off God-only-knows-what conspiracy theories in their heads. I seriously wouldn’t be surprised if the median interviewee thought they were talking to the CIA. Actually, I’d be shocked if that weren’t true.

    (9) I don’t think the average Iraqi “gets” polls the way the average Westerner does.

    https://statmodeling.stat.columbia.edu/2014/01/29/questioning-lancet-plos-surveys-iraqi-deaths-interview-univ-london-professor-michael-spagat/#comment-153723

  7. If peer review isn’t QC, then what is it? Bad papers signal to me that peer review is imperfect QC. But are you saying peer review shouldn’t even try to be QC?

    • Adede:

      For a journal like Lancet, the #1 concern of peer review is that the paper be important or newsworthy. The #2 concern is that it respects the literature of whatever subfield is represented by the editor or associate editor of the journal. Quality control is way down on the list.

  8. Andrew: We need something like this for our peer-reviewed journals:
    https://www.adfontesmedia.com/?v=402f03a963ba

    I’d say every peer-reviewed periodical would rank higher (on reliability) than every periodical in this chart.

    I believe a naive reader could interpret your musings as saying that the Lancet, PNAS, etc., should rank at the same level as the Enquirer or Infowars.

    I honestly don’t know where you’d rank the Lancet, or JASA for that matter, on this chart.

    • Joshua:

      That chart seems pretty ridiculous to me.

      Regarding your question, I don’t really know how you’d put, say, Lancet and NPR on a common scale. They’re trying to do different things. To put it another way, when Lancet and NPR articles are wrong, they’re wrong in different ways. The Enquirer and Infowars are different in that they are pushing a particular political line. Lancet has its political biases but it’s more of a loose confederation; they can publish all sorts of things. To put it another way, Lancet, PNAS, etc., are similar to traditional news organizations in that they have some aspects of an internal marketplace, and their goal is to make money, get scoops, and inform their readers.

  9. Whether or not hydroxychloroquine kills people with Covid-19 is one question, but it is a drug prescribed for other life-threatening illnesses such as malaria. Unless there is an infinite supply, I would assume that lots of people suddenly taking it for Covid-19 would affect its availability for other illnesses. I realise that there is a high level of malarial resistance to chloroquine-alikes, but still, I would be interested to know if anyone has examined this knock-on effect.

  10. I think there is a fourth possibility. We know that studies of drugs for their primary indication are fraught with risk. This was clearly articulated in the literature by 1983 (https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.4780020222) by Olli Miettinen but was built on a long tradition of problematic cases. More recently we have Tobias Kurth’s paper showing implausible results in cases with medical desperation (https://academic.oup.com/aje/article/163/3/262/59818). Treatment is not random when physicians, patients, and families are desperate.

    We also saw this with hormone replacement therapy and CVD; the theory that it would help with CVD ended up channeling the drug to high SES individuals. This is why drug people insist on trials — too many results reverse direction when channeling effects are removed.

    There is no possibility that these medications were being given for an indication other than covid-19. There is just not that much co-infection with malaria.

    So option #4 is the study is fine. The data (despite some alarming patterns) turns out to be okay. But the result is simply wrong because they are studying an intended effect. Or they could be correct. Sometimes you end up accounting for channeling (people do, in fact, occasionally win the lottery) and the results are consistent with the causal association posited by trials. My favorite paper was looking at tricks to try and do this with observational data.

    The real issue is the strength of the conclusions. Of course, the data issues are a separate and concerning factor as well. But I wish we’d show more humility with observational drug research. I do a LOT of it and it annoys me when these basics of interpretation are glossed over with bullet text about observational research having limitations (unknown confounders — grrrrr) and not a recognition that this is why we do trials — because these estimates are inherently unreliable.

    #EndofRant

    • Joseph, thanks for posting those comments.

      I’m perplexed that the phrase “confounding by indication” appears nowhere in the Lancet article. I guess it didn’t get reviewed by any pharmacoepidemiologist worthy of the name.

      With that in mind, the secondary endpoint of arrhythmia is more likely to have yielded useful results, assuming ascertainment bias is not a problem. Would this patient population have their ECG monitored routinely?

      So while I take the headline result with a grain of salt, I’d say there’s still a safety signal of some concern here.
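      To make the confounding-by-indication point concrete, here is a minimal simulation sketch (made-up numbers, not the Lancet data): a drug with exactly zero effect looks harmful in a naive comparison simply because sicker patients are more likely to receive it.

      ```python
      # Minimal sketch, assuming a null-effect drug: sicker patients are more
      # likely to be treated, and mortality depends only on severity, yet the
      # naive treated-vs-untreated comparison shows apparent harm.
      import numpy as np

      rng = np.random.default_rng(1)
      n = 100_000

      severity = rng.normal(size=n)                    # latent illness severity
      p_treat = 1 / (1 + np.exp(-(severity - 1.5)))    # sicker -> more likely to be treated
      treated = rng.random(n) < p_treat

      p_death = 1 / (1 + np.exp(-(severity - 2.0)))    # mortality depends on severity only
      died = rng.random(n) < p_death

      naive_rr = died[treated].mean() / died[~treated].mean()
      print(f"Fraction treated: {treated.mean():.2f}")
      print(f"Naive mortality risk ratio (true effect = 1.0): {naive_rr:.2f}")
      ```

      The risk ratio comes out well above 1 even though the simulated drug does nothing, which is the same pattern one would expect if sicker Covid-19 patients were preferentially given hydroxychloroquine.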

  11. I hope someone can soon figure out what the reality is around this Surgical Outcomes Collaborative (Surgisphere) database in the IBM cloud with a claimed 247,148,306 patient interactions. The websites associated with this company and its projects are glaringly unprofessional to the point where they seem to have never been used by a real, operational company with Dr. Desai as CEO. The research / diagnostic tool they released to the public is also unimpressive and doesn’t have any of the features you would expect from a data analytics company that touts its machine learning and AI software. There’s more that could be said, but I hope someone is able to directly challenge the company or has a way to show whether this company is meeting industry standards and doing what they claim or not. (https://surgisphere.com/research-tools/changelog.php) (https://surgisphere.com/2020/04/07/severity-scoring-api/)

  12. The authors published a blog post on it:

    Ultimately, medicine is a science, and science should be driven by data.

    https://surgisphere.com/2020/05/25/lancet-paper/

    I’m still not sure what was actual data and what was model in those numbers they published. Honestly, without seeing the process to get from the messy raw data to final values they are pretty worthless to me. If people would stop paying for these numbers they would go away and be replaced by something better.

    • Totally useless blog statement – ‘trust us, we are using sophisticated models from EHR’. From my limited reading of the criticisms, they are being accused of data fraud – the data look ‘too good to be true’. Until someone is able to inspect their raw data, I’m going to ignore their findings.

      • The data look too good to be true because they are showing you the propensity weighted data. They haven’t shown us the raw data as far as I can tell, and they may not be allowed to, depends on each of the hospitals’ DUA’s.
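        As a small illustration of why propensity-weighted summaries can look suspiciously balanced, here is a minimal sketch (simulated data, not Surgisphere’s): after inverse-propensity weighting, treated and control groups match almost exactly on an observed covariate even though the raw groups differ a lot.

        ```python
        # Minimal sketch of inverse-propensity weighting (IPW) on simulated data:
        # older patients are more likely to be treated, so raw group means differ,
        # but the IPW-weighted means look nearly identical.
        import numpy as np

        rng = np.random.default_rng(2)
        n = 50_000

        age = rng.normal(60, 10, n)                      # hypothetical covariate
        p_treat = 1 / (1 + np.exp(-(age - 60) / 5))      # older -> more likely treated
        treated = rng.random(n) < p_treat

        ps = p_treat                                     # true propensity score, for simplicity
        w = np.where(treated, 1 / ps, 1 / (1 - ps))      # IPW weights

        def wmean(x, wt):
            return np.sum(wt * x) / np.sum(wt)

        print(f"Raw mean age:      treated {age[treated].mean():.1f}, "
              f"control {age[~treated].mean():.1f}")
        print(f"Weighted mean age: treated {wmean(age[treated], w[treated]):.1f}, "
              f"control {wmean(age[~treated], w[~treated]):.1f}")
        ```

        The weighted table looks clean by construction; it says nothing about whether the underlying raw records are real.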

  13. Usually, I tend to think about this issue in an optimistic way:

    if an article is good, it can be published in a highly influential journal
    if an article is published in a highly influential journal, it’s not always true it’s good

    My optimism is in the first implication, of course :)

  14. Just to be clear, the firearm law paper in the Lancet used a model that included 39 df (not just 25) to explain 50 observations. They only *interpreted* 25 of the predictors, but they also tossed in many covariates. There were also factual errors and inconsistencies identified in the paper; however, Lancet was unwilling to consider a retraction… so it will be cited forevermore.

  15. The whole big-database thing behind Surgisphere and The Lancet paper looks strange. A little bit of digging suggests the name to look for is Quartzclinical, which is supposedly their clinical data arm.

    Quartzclinical’s hospital market intelligence tool is about 18 months old. Way too young to have hundreds of hospitals signed up to their system (which are extremely vaguely described). It’s tiny and mostly sales and marketing droids.

    Quartz’s PR bumpf on Crunchbase draws on some old work under Sapan Desai’s name that uses statistical analyses of the National Inpatient Sample to try to make recommendations for clinical best practice. He does this a few times, so I think he has experience with the type of analysis done for The Lancet study.

    However, where it gets slightly more interesting is that this Desai uses this type of analysis on the Vascular Quality Initiative database (VQI) – he is after all a vascular surgeon. VQI (www.vqi.org) happens to have 676 participating centers – not a long way off the 671 referred to in The Lancet paper.

    VQI’s data is blinded, with the ability to pull matched datasets or use propensity score matching. The US system is run by m2s.org, which actually has a cloud platform with data and tools behind it, and seems to be a model Quartz is following. My feeling is the data behind The Lancet study is connected to VQI, or its international umbrella body: The International Consortium of Vascular Registries – Europe Vascunet. The data collected seems to be in the same area. (More details: https://www.ejves.com/article/S1078-5884(19)31084-6/fulltext)

    I’m not sure if any of these breadcrumbs are real, but a hypothesis is that Quartzclinical are hotwiring into some pre-existing datasets of which Vascular Data has obvious links, and then re-purposing them to look for COVID cases.

      • Interesting. Aside from the issues with the data sources, this paper appears to suffer from the same issue with selection bias, despite the abundant matching which was done. Patients were matched on a number of dimensions, including severity of illness, yet three times as many in the control group were on mechanical ventilators as in the ivermectin group. Doesn’t this suggest that the severity of illness was different in the 2 groups?

        • I apparently read it wrong – they show the results for both patients on mechanical ventilation and those not. In both cases, ivermectin was associated with far lower death rates. The overall death rates for those on mechanical ventilation were about double those not on mechanical ventilation (2% vs 1%) – I thought the differences were larger than that in other studies. Can anyone shed some light on that?

        • As I read this paper, I can’t make any sense out of their numbers. Here is one example: they report that 7.3% of those on mechanical ventilation and receiving ivermectin died. But, their table 1 shows 22 people on mechanical ventilation and receiving ivermectin. If 1 out of 22 died, that is a death rate of 4.5 % and if 0 out of 22 died, that is a death rate of 0%. And, their Figure 1 shows the death rate as 0.25%. I can’t figure out what I am looking at. Can anyone help?

        • FWIW, I don’t get Fig 1. 13 deaths out of 704 patients on ivermectin is 1.8%, but mortality in the chart is much lower.

        • The very likelihood of 704 patients being treated with a drug without any credible IN VIVO antiviral activity in this period is extremely questionable.

        • The Australian study describing the in vitro effect of ivermectin on SARS-CoV-2 was published on Apr 3rd, and on Apr.19 a preprint appears with 704 treated Covid-19 deidentified patients. I don’t want to use strong words but this is quite doubtful at best.

        • It could be that the treatment results in fewer people being put on mechanical ventilators. It could be that they were already on them before the treatment was chosen. It probably will be a mix of both. We don’t know to what extent it should be considered an outcome or a baseline characteristic, and my guess is that the data is not detailed enough to tell.

        • ” …my guess is that the data is not detailed enough to tell”
          .
          The data or the reporting/description of the data?

        • I was thinking also about the underlying data (that infamous database), in particular about cases where both the treatment and the mechanical ventilation might happen concurrently at admission.

          The language in the “outcomes” section is ambiguous: “The principal outcome assessed was the proportion of patients that died in the Ivermectin group compared with the propensity matched control cohort. If mechanical ventilation was required, we evaluated the death rates in this group separately” and there was a reference to ivermectin being a de novo medication started after diagnosis for COVID-19 without any mention of ventilation.

          But in the Lancet paper they say that “Patients for whom one of the treatments of interest was initiated more than 48 h after diagnosis or while they were on mechanical ventilation, as well as patients who received remdesivir, were excluded.” so it seems the data may allow for some cleaning (even though I’m not sure this was considered in the ivermectin preprint).

        • The inclusion criteria described in the paper specifically EXCLUDE patients who were on ventilation when treatment was begun. So, they consider ventilation as an outcome, not an input.

        • Are you referring to the paper being discussed in this subthread? (Usefulness of Ivermectin in COVID-19 Illness)

      • It’s also got data anomaly problems. For example, the abstract says “3 continents”, the results say the patients are from hospitals in North America, Asia, and Europe, but Table 1 lists South America, Africa, and Australia as well.

        Also, the fact that they don’t say anything about doses is very weird given that the usual anti-worm dose of ivermectin is orders of magnitude lower than the in-vitro anti-viral dose.

        • There was a previous preprint (SSRN 3570270 which is no longer available but I copy the abstract below), which I assume is based in the same data, where the dose is mentioned (150mcg/kg) but it’s only about 52 cases where the treatment was started after mechanical ventilation.

          See also https://twitter.com/MRMehraMD/status/1250862259162750981

          Peer-review may not be a form of quality control, but the fact that this study has gone nowhere after 6 weeks maybe tells us something (but it doesn’t mean that they have fabricated the data).

          =====

          Patel, Amit and Desai, Sapan, Ivermectin in COVID-19 Related Critical Illness (April 6, 2020)
          As the quest to define an anti-viral therapy for treatment of COVID-19 illness continues with little success, a new potential candidate has emerged.
          A pre-clinical study, demonstrated that ivermectin, FDA approved as an anti-parasitic agent with an established safety profile, was able to reduce SARS-CoV-2 viral RNA by 5000-fold within 48 hours. Importin (IMP) α/β1 is a heterodimer that binds to the SARS-CoV-2 cargo protein and moves it into the nucleus which reduces the host cell antiviral response. Ivermectin destabilizes the Impα/β1 heterodimer, prevents it from binding to the viral protein and thus, entering the nucleus.
          Based on these promising in-vitro findings, we sought to evaluate the clinical usefulness of ivermectin in critically ill patients with COVID-19.
          In an observational registry-based study from 169 hospitals across Asia (AS), Europe (EU), Africa (AF), North (NA) and South America (SA), we evaluated critically ill hospitalized patients diagnosed with COVID-19 with lung injury requiring mechanical ventilation, between January 1st 2020 and March 1st 2020.
          In this series of 1,970 patients, 1,609 survived hospitalization to discharge and 361 died (18.3%). We recorded 52 patients (AS-7, EU-21, AF-3, NA-14, SA-7) who received Ivermectin (150 mcg/Kg) once after mechanical ventilation was instituted.
          The indications for use of the drug were related to clinician preference and based on prior data on the broad antimicrobial and specifically antiviral effects of this agent.
          Compared to 1,918 conventionally treated patients we observed a survival benefit for ivermectin (mortality rate 18.6% vs 7.7%; HR 0.18, 95% CI (0.07-0.48), log rank (Mantel-Cox) p<0.001).
          The hospital length of stay was 15.7 +/- 8.1 days vs 10.9 +/- 6.1 days, p<0.001 and intensive care unit length of stay 8.2 +/- 6.2 days vs 6.0 +/- 3.9 days, p<0.001 respectively.
          In COVID-19 illness, critically ill patients with lung injury requiring mechanical ventilation may benefit from administration of Ivermectin.
          We noted a lower mortality and reduced healthcare resource use in those treated with ivermectin. These observations should not be considered definitive and allow for translation of a hypothesis from bench to bedside which will require confirmation in a controlled clinical trial setting.

        • “The indications for use of the drug were related to clinician preference and based on prior data on the broad antimicrobial and specifically antiviral effects of this agent.”
          This rationale sounds absurd. I simply cannot believe that anybody was thinking about ivermectin for Covid-19 before the publication of Caly et al. in Antiviral Research at the beginning of April. The antiviral effects are only registered in vitro at huge concentrations.
          Moreover, the removed preprint contained the most impossible Kaplan-Meier survival analysis I have ever seen, with less than 20% survivors in the non-treated group.

        • Apparently some people were thinking about ivermectin for COVID-19 already in March. This is from a newspaper in the Dominican Republic (I omit the link because I want to include another one later, but it’s easy to find): “The first case in which they applied the medication was on March 22. It was a man who was ‘practically dying,’ as he recounted, ready to be put on a ventilator. Eight hours after being given the first dose of ivermectin, his severe symptoms disappeared.”

          Anyway, the statistical analysis can be very bad even if the treated patients did exist. More comments about (and a link to) the original, withdrawn preprint:

          https://www.isglobal.org/en/healthisglobal/-/custom-blog-portlet/ivermectin-and-covid-19-how-a-flawed-database-shaped-the-covid-19-response-of-several-latin-american-countries/2877257/0

    • This is an intriguing idea, particularly given Desai’s vascular background. Of course, it would still be breathtaking research fraud, as clearly these databases are not from QuartzClinical as claimed.

  16. The removed preprint SSRN 3570270, whose abstract you posted, contained an impossible Kaplan-Meier analysis plot showing less than 20% survival in the cohort not treated with ivermectin. Moreover, the stated rationale “The indications for use of the drug were related to clinician preference and based on prior data on the broad antimicrobial and specifically antiviral effects of this agent” is ridiculous. Ivermectin has never been authorized for use as an antiviral drug and has antiviral activity only in vitro. The preprint suggests that patients were treated with ivermectin for Covid-19 before the Monash University publication on its anti-SARS-CoV-2 activity.
    In most European countries this drug has limited utility in human medicine and the idea that infectious disease and intensive care specialists were aware of the few published in vitro studies of ivermectin in Dengue viruses etc. is extremely doubtful.

      • The text “Compared to 1,918 conventionally treated patients we observed a survival benefit for ivermectin (mortality rate 18.6% vs 7.7%; HR 0.18, 95% CI (0.07-0.48), log rank (Mantel-Cox) p<0.001)" and the survival analysis plot are apparently not compatible. It is not surprising that they removed the preprint, but it is a real mystery how a Harvard professor could voluntarily co-author such a manuscript.

        • Thank you, I misunderstood your remark and thought that the issue was the high mortality on its own. I also find it surprising that the second preprint is still there, because there are inconsistencies in the little info it contains as well.

  17. Because there have been no true scientific studies of hydroxychloroquine treatment, no one knows for sure whether hydroxychloroquine is safe and effective or not. That would require randomly separating a large group of coronavirus patients into two groups, one of which receives the hydroxychloroquine treatment and the other of which doesn’t, and then following the outcomes of both groups for comparison. As the authors of this study state in their paper:

    “Randomized clinical trials will be required before any conclusion can be reached regarding benefit or harm of these agents in COVID-19 patients.” And then they reached a conclusion.

    This study looked at hospitalized patients, comparing the outcomes of those that had been treated with chloroquine/hydroxychloroquine to those that had not. In the study, only 15% of patients were in the hydroxychloroquine group. The other 85% did not receive the medication. It would be reasonable to assume that there was a strong bias toward placing patients with more significant clinical symptoms into the hydroxychloroquine treatment group, and therefore the study has an inherent strong bias against the use of hydroxychloroquine. In my opinion the study is next to worthless.

    • Georgi:

      Better than nothing, but . . . they write, “several concerns were raised with respect to the veracity of the data and analyses.”

      I have two problems here.

      1. Even had the data been clean, there were many questions raised about the analysis, not just veracity but also statistical appropriateness. Even if the first author of this paper trusted Surgisphere’s data, why should he have trusted their statistical analysis?

      2. What’s with the passive voice? “Several concerns were raised . . .” They should give credit to James Watson and Peter Ellis, along with the (mostly anonymous) commenters on pubpeer. Watson and Ellis did a lot of work, and they stuck their necks out. They should be thanked by the author of the paper and the editors of the journal.

      It’s frustrating to me how this sort of retraction is considered to be an embarrassment to be swept under the rug.

      But, yeah, good that they retracted after just a couple weeks instead of 12 years this time.

      • I understand very well your disappointment Andrew, but no one is fooled. It was their only way to avoid a lawsuit if the data were fraudulent.

        I’m very touched by all the analytical work you’ve all done here and on the social networks. Many scientists have joined you, supported you, found the courage to express their opinions and analysis thanks to you, that’s what you have to look at. Scientists, together, have an incredible strength, and this is something new in our world, a great experience. If we can do it for this, we can do it for other important issues. Something’s changed, definitely. We’re on the road of becoming responsible and active.

        And most importantly, clinical trials can go on.

        So, just… THANK YOU and BRAVO

        (and again sorry for my terrible english)

      • This is a very sad story… and such a powerful blow on science credibility, impact factors, peer review practices etc.
        Such influential journals should have the best peer review practices since even WHO doesn’t dare to question their data…

        https://www.google.com/amp/s/amp.ft.com/content/9ac02bc4-465b-4734-a741-d714a04b477e

        The same WHO, whose expert opinion paper on antimalarials states inter alia “Hundreds of metric tonnes of chloroquine have been dispensed annually since the 1950s, making chloroquine one of the most widely used drugs in humans. Despite this extensive use, the lethality of the 4-aminoquinoline chloroquine in overdose has caused concern over the use of chloroquine for the treatment of malaria.”
        The role of this drug for Covid-19 is probably marginal but without rigorous testing we will never know. Such fraudulent studies are so detrimental especially during such unprecedented times…

    • Wow. I guess the NEJM paper will go the same way. Clearly Desai’s intention with these papers was to push whatever he was trying to sell, but how could he consider that it was a good idea to use stolen/counterfeit goods (now we may never find out) to showcase his product? It’s not like chloroquine studies go unnoticed!

  18. Does anybody else find this strange? The retraction notice says “Our independent peer reviewers informed us that Surgisphere would not transfer the full dataset, client contacts, and the full ISO audit report to their servers for analysis, as such transfer would violate client agreements and confidentiality requirements.” But of course Surgisphere is Dr. Desai who is also an author on the papers. I realize that Surgisphere is a separate legal entity, and that the client agreements also presumably involve the various hospitals, but it seems strange to me for the authors to make it sound like Surgisphere is not cooperating and that “We always aspire to perform our research in accordance with the highest ethical and professional guidelines.”

    Can someone translate for me?

    • The retraction letter on the HCQ study isn’t signed by Desai, so it looks like his co-authors requested the retraction because Desai wouldn’t let them actually look at the data (which as you and I know, has next to zero chance of even existing).

      • Maybe you’re right (at this point I don’t know what to think) but if the data is completely made up why not go for a less controversial result?

        Publishing strong results about hydroxychloroquine is risky (especially if they are negative, people have very strong opinions on the subject) and the paper would have been published just the same if it had got a milder conclusion (like previous papers that reported confidence interval wide enough to cover pretty huge effects in both directions).

        Maybe he found the NEJM paper didn’t have enough impact? (and looking for more exposure he got completely burned) Maybe he was advancing multiple agendas at the same time? (big-pharma money or whatever)

        Now that the papers are retracted there is no “pressure” but I hope we eventually learn more about this.

        • If the data exists, then it’s exfiltrated data. There seem to have been multiple attempts to find a hospital that would acknowledge having given Surgisphere any data of any kind, and no one could come up with *any* instances. The Scientist was one example… I haven’t been keeping track enough to remember what else, but it seems like there have been efforts to find any way in which this data could have been voluntarily given to Surgisphere… and it doesn’t seem likely.

          https://www.the-scientist.com/news-opinion/concerns-build-about-surgisphere-corporations-dataset-67605

          So why do this? Well, why do STAP cells or Marc Hauser’s studies or whatever… I don’t know.

        • Since the shallowness of Surgisphere became evident, I assumed they were piggybacking on some existing database. That would explain why hospitals didn’t know of Surgisphere, but it left a lot of questions open about whether the data was obtained in compliance with the various contracts between the parties.

      • Good catch – I hadn’t noticed that Desai (I’m dropping the Dr title now) did not sign the retraction letter. I’m not aware of seeing other retractions that were prompted by only a subset of the authors. Very strange behaviors. I wonder if any of the reputations (including the journals of course) will be tarnished by this episode.

    • I don’t know if it has any deep meaning, but the NEJM paper was retracted by all the authors, including Desai, and the language is different (without reference to Surgisphere): “Because all the authors were not granted access to the raw data and the raw data could not be made available to a third-party auditor, we are unable to validate the primary data sources underlying our article”.

      The NEJM expression of concern said simply that “We have asked the authors to provide evidence that the data are reliable”.

      The one in the Lancet referenced the ongoing audit requested by the other authors: “an independent audit of the provenance and validity of the data has been commissioned by the authors not affiliated with Surgisphere and is ongoing, with results expected very shortly,”

  19. Desai obviously used the Harvard professor as a Trojan horse to overcome the ‘rigorous’ peer review of these extremely influential journals.
    As already noted, a team of 2 surgeons and 2 cardiologists, without statisticians, epidemiologists, etc., is as capable of performing the analysis this study is based on as a statistician is of treating a dissection of the aorta. Desai should have included his pornstar and science fiction writer employees on the author team…
