The ERROR project: “We pay experts to examine important and influential scientific publications for errors . . . We expect most published research to contain at least some errors . . . our reward system pays bonuses to both authors and reviewers even when minor errors are found. We believe that our field would be strengthened by a culture of checking, accepting, and communicating errors.”

Malte Elson writes:

I read your article with Andy King [non-paywalled version is here] in the Chronicle of Higher Education with great interest. I totally agree that pre-publication peer review as a “quality management device” is not enough, and that particularly highly cited or otherwise influential scientific publications should undergo re-review as errors often proliferate in the literature for years before (and, regrettably, even after) critical errors or flaws are discovered.

I’m not sure if you’re aware of this, but we run a project that does exactly that out of my lab at the University of Bern in Switzerland: https://error.reviews/

The very short description of ERROR is that we pay experts to examine important and influential scientific publications for errors in order to strengthen the culture of error checking, error acceptance, and error correction in our field. As in other bug bounty programs, the payout scales with the magnitude of errors found. Less important errors pay a smaller fee, whereas more important errors that affect core conclusions yield a larger payout.

We expect most published research to contain at least some errors, and we believe that the presence of minor errors does not negate an article’s scientific contribution. Indeed, our reward system pays bonuses to both authors and reviewers even when minor errors are found. We believe that our field would be strengthened by a culture of checking, accepting, and communicating errors.

The project’s overarching goals are to understand (a) what kinds of errors actually occur, (b) the rate at which they occur, and (c) what methods allow us to detect errors effectively and efficiently. Finally, this project will be used to determine how expensive a dedicated error detection system is compared to the follow-up costs of undetected errors.

For pragmatic reasons (because data and code often are not readily available) and overall tolerability of the project by the research community, ERROR currently requires active consent by the authors, meaning that a lot of highly cited papers will not be looked at. Again, I don’t think it’s possible to do so in fields where sharing and documentation of research materials is overall poor.

So far, we have published 4 full review cycles, with one of the papers having a major error that affects a core conclusion: https://error.reviews/reviews/hehman-et-al-2018/

We intend to publish ~100 in total over the course of 4 years.

This sounds great.

22 thoughts on “The ERROR project: “We pay experts to examine important and influential scientific publications for errors . . . We expect most published research to contain at least some errors . . . our reward system pays bonuses to both authors and reviewers even when minor errors are found. We believe that our field would be strengthened by a culture of checking, accepting, and communicating errors.”

  1. Can’t wait for the first schemes to come out of this, just like schemes concerning citation cartels, or honorary authorship, or reviewing friends’ papers, etc.

    I mean, is it possible to try and earn some money by purposely making mistakes and alerting a friend to note this mistake and split the money?

    And, is this money possibly eventually (indirectly) coming from the tax-payers?

    And if so, is this a case of digging ditches and then filling them up again (see Binswanger, 2014)?

    Sigh…

      • Step 1: we’re asking “experts” to check for ERRORS because they cost so much money for “science” and “society”

        Step 2: see, we’re finding all sorts of ERRORS

        Step 3: wait, wouldn’t it be better to prevent all these ERRORS by having “experts” help with the writing instead of checking afterwards

        Step 4: wait, some ERRORS could have been prevented if “experts” were involved with designing the study and not just involved with writing the paper

        Step 5: wait, we have some friends over at the Controlling of Science (COS) institute who just happened to have developed some protocol that also involves “experts” at the design phase of research. We can all just “collaborate” and have our “experts” help everyone out.

        Just because we’re all friends, and very nice, and very helpful of course.

        Sigh…

        • We would prefer for error checking to become and remain a separate institution. Yes, authors may well change their practices away from the amateur software development standards they follow now if they expect their errors to be found and corrected.
          We do think it would be great if error checking were institutionalised in areas other than medicine. The FDA has done a lot of good, e.g. preventing thalidomide in the US while it was given to pregnant women in Germany. I guess that error cost a lot for “science” and “society”, causing a lot of “death”, my “boys” and, in some cases, “gals”.

        • Quote from above: “We would prefer for error checking to become and remain a separate institution”

          I checked an ERROR review via your website which led me to PsyArXiv (how come I’m not surprised by that….) which is affiliated with the Center for Open Science (COS) if I am not mistaken (?) Anyway, this COS should of course not be confused with the Controlling of Science (COS) institute in my reply above by the way, just to make things very clear!

          Anyway, it’s the Fernbach et al. 2019 paper about genetically modified foods or something like that. In the ERROR review posted on PsyArXiv I can read the following two things:

          1) “In general these issues can be avoided in future work by (i) having a researcher independently attempt to computationally reproduce the results prior to the submission of the manuscript,(…)”

          2) “This small number of issues amounted to what are likely to have been typos, but can be avoided in principle by (i) having an independent researcher specifically check for such typos prior to manuscript submission (…)”

          So, these two examples for me point to what I am afraid of and might happen: more control, more involvement of “experts”, more “collaboration”, etc. which seems nice but may end up to be(-come) something very not good for science.

          Also, from my (possibly incomplete) understanding of ERROR you are somehow acting as some sort of meta-reviewers who somehow advise that “independent researchers” should verify and check things (what have the reviewers been doing then?) and state that they think more critical error checking should be normalized or something like that while somehow not thinking this should be done by all the scientists in the normal procedures of science but by “experts” and “independent researchers”.

          It’s absolute madness in my view, and is becoming almost comical…

          Can you tell me who’s paying for all this stuff? Is it the COS themselves, or some of their sponsors?

        • Anon:

          I can understand your skepticism, but it seems nuts for you to label their program as “madness.” It’s a demonstration project, not going to reform science or even a subfield of science on its own, but it seems like it’s worth a try, and if people find it to be useful they can try to scale it up.

        • Quote from above: “(…) but it seems nuts for you to label their program as “madness.” ”

          I did not directly label their program madness but severely implied that due to me wondering about the following. How is it possible to state that one intends to make critical error checking normal or something like that while somehow not thinking this should be done by all the scientists in the normal procedures of science but by “experts” and “independent researchers”? That’s what triggered the “madness” reply, that’s what’s madness in my view. But maybe that’s not what their program is (implicitly) communicating.

          Regarding your comment about worth a try, scale up, etc. I think that view might be part of many problematic issues. I reason many proposals in the past, which may have led to extremely damaging results, may have started just like that. As a side note: when framed and phrased optimally the gist of this attitude and view can almost give people a carte blanche to make irreversible changes in things under the guise of “well, we’ll just see how it turns out. We’ll check and verify after our implementation and test. That’s science, right!”

          Sigh…

        • Anon:

          You write, “I did not directly label their program madness but severely implied that…”

          But in your above quote, you literally wrote, “It’s absolute madness in my view.” That sounds pretty direct to me!

          In any case, now we’re reaching what Phil calls “garbage time” in the thread so time for us to let others participate.

        • I don’t understand why “experts” is in quotes in the stuff Anonymous writes. Anonymous, can you clarify: do you think there’s no such thing as experts, nobody is more knowledgeable than anyone else about experimental design or data analysis or whatever? Or do you think yeah, ok, some people are experts at some things but it’s “madness” to think that if you’re doing a survey you should have an expert on writing survey questions on the team (to give an example). What’s the “madness” part, can you explain it clearly?

        • Just for the record: Other than Brian Nosek being on ERROR’s advisory board, the Center for Open Science is not involved in this project (financially or otherwise). We host the reports (on psychology papers) on PsyArXiv, and the associated files on OSF.

      • I’m one of the Co-PIs of ERROR.
        We had not thought about the scenario that someone will intentionally put errors in a career-defining paper, wait for it to garner citations, then ask a stooge to collect a bounty years later and “lose” the paper. But it does not seem likely to occur at all. As it stands, we do not allow self-nominations of papers or authors to nominate reviewers, which would even further complicate this hypothetical cartel’s operations.

        • Quote from above: “We had not thought about the scenario (…)”

          That’s not the scenario I had in mind either.

          I thought more about very recent papers being checked because they are “influential” because they are mentioned in the news or something like that. Or maybe this project ERROR idea will be transferred to pre-publication review in the future (via Registered Reports for instance, to name just one possible option that one might think of regarding this whole thing) and will be “incentivized” in the beginning because somehow everything in social science needs to be “incentivized” and “rewarded” for some reason (do you ever wonder why that is, and how that came to be, and what that might lead to?).

          Sigh…

        • Yeah, you’re not being very clear. I guess you think all the scare quotes and sighs carry significant meaning, but it is not getting across and sounds very conspiratorial. You cannot decide not to have incentives in a competitive field; you can decide to ignore the reward structure of the current way of doing science.

        • Quote from above: “Yeah, you’re not being very clear.”

          Maybe I only need to be clear for those who hear and see and understand what I am communicating. Perhaps you might call them experts in a certain way of communicating in this sense. I just try and make some things clear, and wonder about things, and take part in a discussion, and I don’t know how to make things more clear than I have (to experts, “experts”, and/or other readers).

          One more attempt perhaps. Let’s say I wrote a lot about pre-registration and talked a lot about that to the media about that topic and such things. Would that make me an expert on pre-registration in your view? And, what if I make a mistake concerning my own pre-registration in my own paper, does this negate being viewed as an expert. Am I still an expert on pre-registration then? Can I still talk to the media as an expert? Can I be hired by ERROR to check pre-registrations because you might (still?) view me as an expert concerning pre-registration? What even is the point of using words like expert in science?

          Do you, and/or Phil, know how to determine who is or isn’t an “expert” or an expert or (merely?) a reader in this light? Do you have to be an expert to decide who is an expert? How does ERROR decide who is an expert to then ask for their verifying, or whatever ERROR is doing? Why aren’t the reviewers and editor(s) from the original assessment that (likely) made the publication possible which is now under scrutiny again if I am understanding things correctly seen as experts? I thought that was part of the whole point of peer-review, to have experts decide whether or not something is worthy of being published or some nonsense like that. So many questions regarding ERROR, but I think that’s pointless to further try and make clear. I don’t know what else to say, or how to make things clear, but maybe you will understand one day.

        • Do we teach anything if we only ask rhetorical questions? Is Anonymous even trying to explain what he or she means? Is Anonymous an “expert” in being passive-aggressive? Is an expert the same as an “expert”? Do I wish Anonymous had chosen a different monicker so that I could safely ignore any comment by someone called “Anonymous”? Does Anonymous know how to write a simple declarative sentence? Maybe someday we will know the answer to these questions.

        • “Do we teach anything if we only ask rhetorical questions?” – I don’t know, and I don’t care at this moment. I am not a teacher, nor trying or willing to be a teacher so I don’t know what to do with this remark by you.

          “Is Anonymous even trying to explain what he or she means?” – Yes, this is my way of explaining what I meant with the expert thing in quotation marks. I ignored your initial comment because of Mr. Gelman’s comment which I tried to adhere to (not saying stuff for at least a while), but given the comment by Mr. Arslan I attempted to include your remark in my comment here.

          I do think some people are more knowledgeable than others on certain things. I also think it may be hard to determine who is more or less knowledgeable, especially for people not knowledgeable. I think the term expert may, for these reasons, be less than optimal to use in many cases, for instance in the context of the project discussed here.

          “Is Anonymous an “expert” in being passive-aggressive?” – If I am passive-aggressive it is not my intention. I don’t even know what that means exactly, I looked it up and it’s still vague to me. I think I am pretty straightforward, although my way of communicating and my choice of words and actions might be seen as unusual by some.

          “Is an expert the same as an “expert”?” – Nope, the quotation marks are regularly used by me to sort of indicate any, or a combination of, the following: undefined, vague, false, fake, unclear, ambiguous, nonsensical. In the context of the discussion here, I used the quotation marks as a way to make clear that I think what an expert is might be unclear and/or nonsensical in the context of the project discussed here.

          “Do I wish Anonymous had chosen a different monicker so that I could safely ignore any comment by someone called “Anonymous”?” – You tell me. Wait, are you providing an example of passive-aggressiveness?

          “Does Anonymous know how to write a simple declarative sentence?” – I think I do.

          Again, I’ve tried to keep the remark by Mr. Gelman in mind and not possibly contribute to “garbage time”. That’s why I did not say anything yesterday, did not reply at first to your remarks, and only replied when a second remark appeared. It is not my intention to contribute to possible “garbage time”, but I did want to try and answer your questions.

          It’s all good, no worries. I thank everyone for their participation, including those more directly involved with the project under discussion here. I can understand that it’s perhaps not easy or annoying or whatever to read certain things. I can only say that I have tried to contribute in my way, for my reasons.

      • Andrew,

        We only review impactful papers (e.g. by their citations or other metrics of importance). Purposely hiding errors and betting on that paper becoming impactful in the future for the exchange of a few thousand Swiss Francs seems like a bad trade at this time.

        Cheers
        Malte

  2. “For pragmatic reasons (because data and code often are not readily available) and overall tolerability of the project by the research community, ERROR currently requires active consent by the authors, meaning that a lot of highly cited papers will not be looked at.”
    Wait- Are they saying that only critiques approved by the original authors will be accepted, or only ones in which the critique uses only the data and codes provided by the original author?

  3. Interesting idea. The website for this project is a bit strange however- finding the links to the completed reviews reminded me of the recent post on hidden instructions for LLMs – the color of the links made them almost disappear! The website also is hard to navigate. I checked two of the completed reviews. For one, the link to the data said it was forbidden. For the second, I couldn’t find the author response document. I’d say there are too many links and the navigation is not intuitive. None of this should distract from a potentially valuable endeavor. If we’re going to worry about moral hazard, then I guess we should also consider the possibility that this might provide a positive incentive to provide the data used in research.

    • Apologies for the technical issue with the link colors (this is a hiccup, not on purpose).

      I checked and all the links are working for me. Can you point me to the ones not working on your end?

      The author responses are in the full reports, unless the authors chose not to reply.

  4. Nice. This is another reason for people

    1. to sign up for the Unjournal’s Evaluator Pool (https://bit.ly/joinujteam): we are currently targeting $450 average compensation for evaluators (reviewers). If you find bugs, you can report them to this initiative and get further bounties.

    2. And to read our evaluations at unjournal.pubpub.org: these may already highlight “bugs” eligible for bounties, or give hints at where to look.
