IEEE’s Refusal to Issue Corrections

This is Jessica. The following was written by a colleague, Steve Haroz, on his attempt to correct a paper he wrote that was published by IEEE (which, according to Wikipedia, publishes “over 30% of the world’s literature in the electrical and electronics engineering and computer science fields”).

One of the basic Mertonian norms of science is that it is self-correcting. And one of the basic norms of being an adult is acknowledging when you make a mistake. As an author, I would like to abide by those norms. Sadly, IEEE conference proceedings do not abide by the standards of science… or of being an adult.

Two years ago, Robert Kosara and I published a position paper titled “Skipping the Replication Crisis in Visualization: Threats to Study Validity and How to Address Them” in the proceedings of “Evaluation and Beyond – Methodological Approaches for Visualization”, which goes by “BELIV”. It describes a collection of problems with studies, how they may arise, and measures to mitigate them. It breaks down threats to validity from data collection, analysis mistakes, poorly formed research questions, and a lack of opportunities to publish replications. There was another validity threat that we clearly missed… a publisher that doesn’t make corrections.

Requesting to fix a mistake

A few months after the paper was published, a colleague, Pierre Dragicevic, noticed a couple problems. We immediately corrected and annotated them on the OSF postprint, added an acknowledgment to Pierre, and then sent an email to the paper chairs summarizing the issues and asking for a correction to be issued.

Dear organizers of Evaluation and Beyond – Methodological Approaches for Visualization (BELIV),

This past year, we published a paper titled “Skipping the Replication Crisis in Visualization: Threats to Study Validity and How to Address Them”. Since then, we have been made aware of two mistakes in the paper:

  1. The implications of a false positive rate

In section 3.1, we wrote:

…a 5% false positive rate means that one out of every 20 studies in visualization (potentially several each year!) reports on an effect that does not exist.

But a more accurate statement would be:

…a 5% false positive rate means that one out of every 20 non-existent effects studied in visualization (potentially several each year!) is incorrectly reported as being a likely effect.

  2. The magnitude of p-values

In section 3.2, we wrote:

…p-values between 0.1 and 0.5 are actually much less likely than ones below 0.1 when the effect is in fact present…

But the intended statement was:

…p-values between 0.01 and 0.05 are actually much less likely than ones below 0.01 when the effect is in fact present…

As the main topic of the paper is the validity of research publications, we feel that it is important to correct these mistakes, even if seemingly minor. We have uploaded a new version to OSF with interactive comments highlighting the original errors (https://osf.io/f8qey/). We would also like to update the IEEE DL with the version attached. Please let us know how we can help accomplish that.

Thank you,

Steve Haroz and Robert Kosara

Summary of what we wanted to fix

  1. We should have noted that the false positive rate applies to non-existent effects. (A sloppy intro-to-stats level mistake; see the simulation sketch below.)
  2. We put some decimals in the wrong place. (It probably happened when hurriedly moving from a Google doc to LaTeX right before the deadline.)
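To make both corrections concrete, here is a minimal simulation sketch. It is mine, not from the paper, and the specifics (two-sample t-tests, 50 participants per group, a hypothetical true effect of 0.6 standard deviations) are assumptions chosen only for illustration. It shows (a) that about 1 in 20 studies of non-existent effects comes out “significant” by chance, and (b) that when an effect is present, p-values below 0.01 are far more common than p-values between 0.01 and 0.05.

# A quick simulation (hypothetical numbers, not from the paper): two-sample
# t-tests with n = 50 per group, alpha = 0.05, and a true effect of 0.6 SD.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n = 20_000, 50

def simulated_p(effect):
    # p-value of one two-sample t-test with the given true effect size
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect, 1.0, n)
    return stats.ttest_ind(a, b).pvalue

# (a) The 5% false positive rate applies to studies of NON-EXISTENT effects:
# about 1 in 20 of them is "significant" by chance.
p_null = np.array([simulated_p(0.0) for _ in range(n_sims)])
print("significant among null effects:", np.mean(p_null < 0.05))  # ~0.05

# (b) When the effect IS present, p-values below 0.01 are much more likely
# than p-values between 0.01 and 0.05.
p_true = np.array([simulated_p(0.6) for _ in range(n_sims)])
print("p < 0.01:        ", np.mean(p_true < 0.01))                        # roughly 0.65
print("0.01 <= p < 0.05:", np.mean((p_true >= 0.01) & (p_true < 0.05)))   # roughly 0.2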

We knew better than this, but we made a couple mistakes. They’re minor mistakes that don’t impact conclusions, but mistakes nonetheless. Especially in a paper that is about the validity of scientific publications, we should correct them. And for a scientific publication, the process for making corrections should be in place.

Redirected to IEEE

The paper chairs acknowledged receiving the email but took some time to get back to us. Besides arriving during everyone’s summer vacation, there was apparently no precedent for requesting a corrigendum (a correction for mistakes made by the authors) at this publication venue, so they needed a couple months to figure out how to go about it. Here is what IEEE eventually told them:

Generally updates to the final PDF files are not allowed once they are posted in Xplore. However, the author may be able to add an addendum to address the issue. They should contact [email protected] to make the request. 

So we contacted that email address and after a month and a half got the following reply:

We have received your request to correct an error in your work published in the IEEE Xplore digital library. IEEE does not allow for corrections within the full-text publication document (e.g., PDF) within IEEE Xplore, and the IEEE Xplore metadata must match the PDF exactly.  Unfortunately, we are unable to change the information on your paper at this time.  We do apologize for any inconveniences this may cause.

This response is absurd. For any publisher of scientific research, there is always some mechanism for corrigenda. But IEEE has a policy against it.

Trying a different approach

I emailed IEEE again asking how this complies with the IEEE code of ethics:

I am surprised by this response, as it does not appear consistent with the IEEE code of ethics (https://www.ieee.org/about/corporate/governance/p7-8.html), which states that IEEE members agree:

“7 … to acknowledge and correct errors…”

I would appreciate advice on how we can comply with an ethical code that requires correcting errors when IEEE does not allow for it. 

And one of the BELIV organizers, to their credit, backed us up by replying as well:

As the organizer of the scientific event for which the error is meant to be reported, […] I am concerned about the IEEE support response that there are NO mechanisms in place to correct errors in published articles. I have put the IEEE ethics board in the cc to this response and hope for an answer on how to acknowledge and correct errors as an author of an IEEE published paper.

The IEEE ethics board was CCed, but we never heard from them. However, we did hear from someone involved in “Board Governance & Intellectual Property Operations”:

IEEE conference papers are published as received. The papers are submitted by the conference organizers after the event has been held, and are not edited by IEEE. Each author assumes complete responsibility for the accuracy of the paper at the time of publication. Each conference is considered a stand-alone publication and thus there is no mechanism for publishing corrections (e.g., in a later issue of a journal). The conference proceedings serves as a ‘snapshot’ of what was distributed at the conference at the time of presentation and must remain as is. IEEE will make metadata corrections (misspelled author name, affiliation, etc) in our database, but per IEEE Publications policy, we do not edit a published PDF unless the PDF is unreadable. 

That said, any conference author who identifies an error in their work is free to build upon and correct a previously published work by submitting to a subsequent conference or journal. We apologize for any inconvenience this may cause.

The problem with IEEE’s suggestion

Rather than follow the norm of scientific publishing and even its own ethics policies, IEEE suggests that we submit an updated version of the paper to another conference or journal. This approach is unworkable for multiple reasons:

1) It doesn’t solve the problem that the incorrect statements are available and citable.

Keeping the paper available potentially spreads misinformation. In our paper, these issues are minor and can be checked via other sources. But what if they substantially impacted the conclusions? This year, IEEE published a number of papers about COVID-19 and pandemics. Are they saying that one of these papers should not be corrected even if the authors and paper chairs acknowledge it contains a mistake?

2) A new version would be rejected for being too similar to the old version.

According to IEEE’s policies, if you update a paper and submit a new version, it must include “substantial additional technical material with respect to the … articles of which they represent an evolution” (see IEEE PSPB 8.1.7 F(2)). Informally, this policy is often described as meaning that papers need 30% new content to be publishable. But some authors have added entire additional experiments to their papers and gotten negative reviews about the lack of major improvements over previous publications. In other words, minor updates would get rejected. And I don’t see any need to artificially inflate the paper with 30% more content just for the heck of it.

It could even be rejected for self-plagiarism unless we specifically cite the original paper somehow. What a great way to bump up your h-index! “And in conclusion, as we already said in last year’s paper…”

3) An obnoxious amount of work for everyone involved.

The new version would need to be handled by a paper chair (conference) or editor (journal), assigned to a program committee member (conference) or action editor (journal), have reviewers recruited, be reviewed, have a meta-review compiled, and be discussed by the paper chairs or editors. What a blatant disregard for other people’s time.

The sledgehammer option

I keep cringing every time I get a Google Scholar alert for the paper. That’s not a good place to be. I looked into options for retracting it, but IEEE doesn’t seem very interested in retracting papers that make demonstrably incorrect statements or that incorrectly convey the authors’ intent:

Under an extraordinary situation, it may be desirable to remove access to the content in IEEE Xplore for a specific article, standard, or press book. Removal of access shall only be considered in rare instances, and examples include, but are not limited to, a fraudulent article, a duplicate copy of the same article, a draft version conference article, a direct threat of legal action, and an article published without copyright transfers. Requests for removal may be submitted to the Director, IEEE Publications. Such requests shall identify the publication and provide a detailed justification for removing access.  -IEEE PSPB 8.1.11-A

So attempting to retract is unlikely to succeed. Also, there’s no guarantee that we would not get accused of self-plagiarism if we retracted it and then submitted the updated version. And really, it’d be such a stupid way to fix a minor problem. I don’t have a better word to describe this situation. Just stupid.

Next steps

  1. Robert and I ask any authors who would cite our paper to cite the updated OSF version. Please do not cite the IEEE version. You can find multiple reference formats on the bottom right of the OSF page.
  2. This policy degrades the trustworthiness and citability of papers in IEEE conference proceedings. And any authors who have published with IEEE would be understandably disturbed by IEEE denigrating the reliability of their work. What if a paper contained substantial errors? And what if it misinformed and endangered the public? It is difficult to see these proceedings as any more trustworthy than a preprint. At least preprints have a chance of authors updating them. So use caution when reading or citing IEEE conference proceedings, as the authors may be aware of errors but unable to correct them.
  3. IEEE needs to make up its mind. It could decide to label conference proceedings as in-progress work and allow them to be republished elsewhere. However, if updated versions of conference papers cannot be resubmitted due to lack of novelty or “self-plagiarism”, IEEE needs to treat these conference papers the way that scientific journals treat their articles. In other words, if IEEE is to be a credible publisher of scientific content, it needs to abide by the basic Mertonian norm of enabling correction and the basic adult norm of acknowledging and correcting mistakes.

What about this idea of rapid antigen testing?

So, there’s this idea going around that seems to make sense, but then again if it makes so much sense I wonder why they’re not doing it already.

Here’s the background. A blog commenter pointed me to this op-ed from mid-November by Michael Mina, an epidemiologist and immunologist who wrote:

Widespread and frequent rapid antigen testing (public health screening to suppress outbreaks) is the best possible tool we have at our disposal today—and we are not using it.

It would significantly reduce the spread of the virus without having to shut down the country again—and if we act today, could allow us to see our loved ones, go back to school and work, and travel—all before Christmas.

Antigen tests are “contagiousness” tests. They are extremely effective (>98% sensitive compared to the typically used PCR test) in detecting COVID-19 when individuals are most contagious. Paper-strip antigen tests are inexpensive, simple to manufacture, give results within minutes, and can be used within the privacy of our own home . . .

If only 50% of the population tested themselves in this way every 4 days, we can achieve vaccine-like “herd effects” . . . Unlike vaccines, which stop onward transmission through immunity, testing can do this by giving people the tools to know, in real-time, that they are contagious and thus stop themselves from unknowingly spreading to others.

Mina continues:

The U.S. government can produce and pay for a full nation-wide rapid antigen testing program at a minute fraction (0.05% – 0.2%) of the cost that this virus is wreaking on our economy.

The return on investment would be massive, in lives saved, health preserved, and of course, in dollars. The cost is so low ($5 billion) that not trying should not even be an option for a program that could turn the tables on the virus in weeks, as we are now seeing in Slovakia—where massive screening has, in two weeks, completely turned the epidemic around.

The government would ship the tests to participating households and make them available in schools or workplaces. . . . Even if half of the community disregards their results or chooses to not participate altogether, outbreaks would still be turned around in weeks. . . .

The sensitivity and specificity of these tests has been a central debate – but that debate is settled. . . . These tests are incredibly sensitive in catching nearly all who are currently transmitting virus. . . .

But wait—if this is such a great idea, why isn’t it already happening here? Mina writes:

The antigen test technology exists and some companies overseas have already produced exactly what would work for this program. However, in the U.S., the FDA hasn’t figured out a way to authorize the at-home rapid antigen tests . . . We need to create a new authorization pathway within the FDA (or the CDC) that can review and approve the use of at-home antigen testing . . . Unlike vaccines, these tests exist today—the U.S. government simply needs to allocate the funding and manufacture them. We need an upfront investment of $5 billion to build the manufacturing capacity and an additional $10 billion to achieve production of 10-20 million tests per day for a full year. This is a drop in the bucket compared to the money spent already and lives lost due to COVID-19. . . .

I read all this and wasn’t sure what to think. On one hand, it sounds so persuasive. On the other hand, lots of tests are being done around here and I haven’t heard of these rapid paper tests. Mina talks about at-home use, but I haven’t heard about these tests being given at schools either. Also, Mina talks about the low false-positive rate of these tests, but I’d think the big concern would be false negatives. Also, it’s hard to believe that there’s this great solution and it’s only being done by two countries in the world (Britain and Slovakia). You can’t blame the FDA bureaucracy for things not happening in other countries, right?

Anyway, I wasn’t sure what to think so I contacted my epidemiologist colleague Julien Riou, who wrote:

I think the idea does make sense from a purely epi side, even though the author appears extremely confident in something that has basically never been done (but maybe that’s what you need to do to be published in Time magazine). In principle, rapid antigen testing every 4 days (followed by isolation of all positive cases) would probably reduce transmissibility enough if people are relatively compliant and if the sensitivity is high. The author is quick to dismiss the issue of sensitivity, saying:

People have said these tests aren’t sensitive enough compared to PCR. This simply is not true. It is a misunderstanding. These tests are incredibly sensitive in catching nearly all who are currently transmitting virus. People have said these tests aren’t specific enough and there will be too many false positives. However, in most recent Abbott BinaxNOW rapid test studies, the false positive rate has been ~1/200.

Looking at the paper the author himself links (link), the sensitivity of the Abbott BinaxNOW is “93.3% (14/15), 95% CI: 68.1-99.8%”. I find it a bit dishonest not to present the actual number (he even writes “>98%” somewhere else, without a source so I couldn’t check) and deflect on specificity which is not the issue here (especially if there is a confirmation with RT-PCR). The authors of the linked paper even conclude that “this inherent lower sensitivity may be offset by faster turn-around, the ability to test more frequently, and overall lower cost, relative to traditional RT-PCR methods”. Fair enough, but far from “these tests are incredibly sensitive” in the Time piece.

Two more points on the sensitivity of rapid antigen tests. First, it is measured with RT-PCR as the reference, and we know that the sensitivity of RT-PCR itself is not excellent. There are a lot of papers on that; I randomly picked this one where the sensitivity is measured at 82.2% (95% CI 79.0-85.1%) for RT-PCR in hospitalised people. This should be combined with that of rapid antigen testing if you assume both tests are independent. Of course there is a lot more to say about this: sensitivity probably depends on who is tested, when, and whether there are symptoms, and both tests are probably not independent. Still I think it’s worth mentioning, and again far from “these tests are incredibly sensitive”. Second, the sensitivity is measured in lab conditions, and while I don’t have a lot of experience with this, I doubt that you can expect everyone to use the test perfectly. And on top of that, people might not comply with isolation (especially if they have to work) and logistics problems are likely to occur.

Even with all these caveats, I think that this mass testing strategy might be sufficient to curb cases if we can pull it off. Combined with contact tracing, social distancing, masks, and all the other control measures in place in most of the world, being able to identify and isolate even a small proportion of infectious cases that you wouldn’t see otherwise can be very helpful. We’ll soon be able to observe the impact empirically in Slovakia and Liverpool.
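To put rough numbers on Riou’s point about combining sensitivities, here is a back-of-the-envelope sketch. It just multiplies the two figures he cites under his (admittedly crude) assumption of independence between the two tests; nothing here comes from Mina or the linked studies beyond those two numbers.

# Rough arithmetic for Riou's caveat (crude independence assumption, his numbers):
# the antigen test's sensitivity is measured against RT-PCR, which itself misses
# some true infections, so sensitivity against true infection is lower.
sens_antigen_vs_pcr = 0.933  # BinaxNOW vs RT-PCR (14/15, study linked in the op-ed)
sens_pcr_vs_truth = 0.822    # RT-PCR vs true infection (paper Riou cites)

sens_antigen_vs_truth = sens_antigen_vs_pcr * sens_pcr_vs_truth
print(f"implied sensitivity vs true infection: {sens_antigen_vs_truth:.1%}")  # ~76.7%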

So, again, I’m not sure what to think. I’d think that even a crappy test, if applied widely enough, would be better than the current setting in which people use more accurate tests but then have to wait many days for the results. Especially if the alternative is some mix of lots of people not going to work or school, and other people, who do have to go to work, being at risk. On the other hand, some of the specifics in that above-linked article seem fishy. But maybe Riou is right that this is just how things go in the mass media.