Preregistration is a floor, not a ceiling.

This comes up from time to time. For example, someone sent me an email expressing a concern that preregistration stifles innovation: if Fleming had preregistered his study, he never would’ve noticed the penicillin mold, etc.

My response is that preregistration is a floor, not a ceiling. Preregistration is a list of things you plan to do, that’s all. Preregistration does not stop you from doing more. If Fleming had followed a pre-analysis protocol, that would’ve been fine: there would have been nothing stopping him from continuing to look at his bacterial cultures.

As I wrote in comments to my 2022 post, “What’s the difference between Derek Jeter and preregistration?” (which I just added to the lexicon), you don’t preregister “the” exact model specification; you preregister “an” exact model specification, and you’re always free to fit other models once you’ve seen the data.

It can be really valuable to preregister, to formulate hypotheses and simulate fake data before gathering any real data. To do this requires assumptions—it takes work!—and I think it’s work that’s well spent. And then, when the data arrive, do everything you’d planned to do, along with whatever else you want to do.
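Here is a minimal sketch of what "simulate fake data before gathering any real data" could look like. It assumes a simple two-group comparison with normally distributed outcomes. The function names, the assumed effect size, the standard deviation, and the sample size are all hypothetical choices for illustration, not anything taken from a real study.

```python
import random
import statistics


def simulate_study(n_per_group, true_effect, sd, rng):
    """Simulate one two-group study; return (estimate, standard error)."""
    control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
    treated = [rng.gauss(true_effect, sd) for _ in range(n_per_group)]
    # The planned analysis: a difference in means with its standard error.
    estimate = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / n_per_group
          + statistics.variance(control) / n_per_group) ** 0.5
    return estimate, se


def design_summary(n_per_group, true_effect, sd, n_sims=2000, seed=1):
    """Run the planned analysis on many fake datasets and summarize the design."""
    rng = random.Random(seed)
    results = [simulate_study(n_per_group, true_effect, sd, rng)
               for _ in range(n_sims)]
    mean_se = statistics.mean(se for _, se in results)
    # Share of fake datasets where the estimate is more than 2 SEs from zero.
    frac_clear = sum(abs(est) > 2 * se for est, se in results) / n_sims
    return {"mean_se": mean_se, "frac_clear": frac_clear}


if __name__ == "__main__":
    # Assumed inputs: 50 people per group, effect of 0.5 sd units.
    print(design_summary(n_per_group=50, true_effect=0.5, sd=1.0))
```

The point of the exercise is that you have to write down an effect size and a noise level before the code will run at all, and that is exactly the work preregistration asks for. Once the real data arrive, the same analysis function can be run unchanged, alongside whatever else you want to try.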

Planning ahead should not get in the way of creativity. It should enhance creativity because you can focus your data-analytic efforts on new ideas rather than having to first figure out what defensible default thing you’re supposed to do.

Aaaand, pixels are free, so here’s that 2022 post in full:

What’s the difference between Derek Jeter and preregistration?

There are probably lots of clever answers to this one, but I’ll go with: One of them was hyped in the media as a clean-cut fresh face that would restore fan confidence in a tired, scandal-plagued entertainment cartel—and the other is a retired baseball player.

Let me put it another way. Derek Jeter had three salient attributes:

1. He was an excellent baseball player, rated by one source at the time of his retirement as the 58th best position player of all time.

2. He was famously overrated.

3. He was a symbol of integrity.

The challenge is to hold 1 and 2 together in your mind.

I was thinking about this after Palko pointed me to a recent article by Rose McDermott that begins:

Pre-registration has become an increasingly popular proposal to address concerns regarding questionable research practices. Yet preregistration does not necessarily solve these problems. It also causes additional problems, including raising costs for more junior and less resourced scholars. In addition, pre-registration restricts creativity and diminishes the broader scientific enterprise. In this way, pre-registration neither solves the problems it is intended to address, nor does it come without costs. Pre-registration is neither necessary nor sufficient for producing novel or ethical work. In short, pre-registration represents a form of virtue signaling that is more performative than actual.

I think this is like saying, “Derek Jeter is no Cal Ripken, he’s overrated, gets too much credit for being in the right place at the right time, he made the Yankees worse, his fans don’t understand how the game of baseball really works, and it was a bad idea to promote him as the ethical savior of the sport.”

Here’s what I think of preregistration: It’s a great idea. It’s also not the solution to problems of science. I have found preregistration to be useful in my own work. I’ve seen lots of great work that is not preregistered.

I disagree with the claim in the above-linked paper that “Under the guidelines of preregistration, scholars are expected to know what they will find before they run the study; if they get findings they do not expect, they cannot publish them because the study will not be considered legitimate if it was not preregistered.” I disagree with that statement in part for the straight-up empirical reason that it’s false; there are counterexamples; indeed a couple years ago we discussed a political science study that was preregistered and yielded unexpected findings which were published and were considered legitimate by the journal and the political science profession.

More generally, I think of preregistration as a floor, not a ceiling. The preregistered data collection and analysis is what you need to do. In addition, you can do whatever else you want.

Preregistration remains overrated if you think it’s gonna fix science. Preregistration facilitates the conditions for better science, but if you preregister a bad design, it’s still a bad design. Suppose you could go back in time and preregister the collected work of the beauty-and-sex-ratio guy, the ESP guy, and the Cornell Food and Brand Lab guy, and then do all those studies. The result wouldn’t be a spate of scientific discoveries; it would just be a bunch of inconclusive results, pretty much no different than the inconclusive results we actually got from that crowd but with the improvement that the inconclusiveness would have been more apparent. As we’ve discussed before, the benefits of procedural reforms such as preregistration are indirect—making it harder for scientists to fool themselves and others with bad designs—but not direct. Are these indirect benefits greater than the costs? I don’t know; maybe McDermott is correct that they’re not. I guess it depends on the context.

I think preregistration can be valuable, and I say that while recognizing that it’s been overrated and inappropriately sold as a miracle cure for scientific corruption. As I wrote a few years ago:

In the long term, I believe we as social scientists need to move beyond the paradigm in which a single study can establish a definitive result. In addition to the procedural innovations [of preregistration and mock reports], I think we have to more seriously consider the integration of new studies with the existing literature, going beyond the simple (and wrong) dichotomy in which statistically significant findings are considered as true and nonsignificant results are taken to be zero. But registration of studies seems like a useful step in any case.

Derek Jeter was overrated. He was at times a drag on the Yankees’ performance. He was still an excellent player and overall was very much a net positive.

18 thoughts on “Preregistration is a floor, not a ceiling.”

  1. I agree wholeheartedly. Pre-registration is great unless you think it is handcuffs. It is the start of honest and open reporting and nothing more. It does not stop you from doing things outside the pre-registration at all.

    I think something missed in the Fleming story is that he admitted it was a chance (exploratory) finding that was then pursued further. The problem is that people aren’t doing that. And, as you have pointed out many times, they often don’t realize it. The biggest benefit is that pre-registration makes it apparent to them when that happens.

    • John:

      Ahhhh, good point regarding the penicillin study. I’m reminded of the classic 50 shades of gray story, where Nosek et al. found something interesting and statistically significant in an experiment they had conducted—but the study had not been preregistered. To check, they ran their own preregistered replication, and the finding did not replicate. As they wrote:

      Surely ours was not a case to worry about. We had hypothesized it, the effect was reliable. But, we had been discussing reproducibility, and we had declared to our lab mates the importance of replication for increasing certainty of research results.

      But they checked it anyway. Had it replicated, good for them, they could’ve gone forward with the result.

  2. I recently attended a Royal Society workshop on pre-registration (online). I gave up on the very first day when they gave a guy from Munich a platform for arguing that pre-registration is outright harmful. I’ve seen the associated articles and debates online around this from this group of people, and it just amazes me that they get to air their views at all. When I read the associated articles, I got the feeling that the people who oppose pre-registration don’t actually understand what it is and what it’s for (which Andrew explains in this post). Is that so hard to understand, or are these people just being contrarian? One reason why this group of authors gets a platform at the Royal Society is that at least one of the authors (not the speaker at the conference, though) is really famous; I can’t think of any other reason why anyone would want to hear the case against pre-registration. It has many limitations, sure; but anyone who has done pre-registrations knows its limitations. And anyone who does any scientific work understands the value of exploration and chance discovery. They just completely miss the whole point. I feel this is just contrarianism. Scientists just like to disagree with each other, which is of course usually good. But the limitations of pre-registration could have been discussed without taking such an extreme position, that they are outright harmful.

    • Shravan:

      Do you have a link to the paper in question, or at least the abstract of the talk? The workshop that you mention sounds like this recent meeting where I spoke, and I didn’t see most of the other talks. My talk was all about how preregistration has been useful to me, not for the purpose of getting calibrated p-values or whatever but just as a way of improving my scientific practice.

      • OK, I think I see it here, the abstract for the talk to which you referred:

        Preregistration will not improve our theories

        Professor Chris Donkin, LMU Munich, Germany

        Proponents of preregistration argue that, among other benefits, it improves the diagnosticity of statistical tests. In the strong version of this argument, preregistration does this by solving statistical problems, such as family-wise error rates. In the weak version, it nudges people to think more deeply about their theories, methods, and analyses. We argue against both: the diagnosticity of statistical tests depend entirely on how well statistical models map onto underlying theories, and so improving statistical techniques does little to improve theories when the mapping is weak. There is also little reason to expect that preregistration will spontaneously help researchers to develop better theories (and, hence, better methods and analyses).

        I agree with his last sentence: There is also little reason to expect that preregistration will spontaneously help researchers to develop better theories. Preregistration provides indirect support for better science, but taking bad science and preregistering it doesn’t make it into good science.

        But I disagree with his other claims. In particular, I do think that preregistered studies can be useful in creating clean evidence to evaluate a hypothesis, and I do think that preregistration nudges people to think more deeply about their theories, methods, and analyses—at least, it does that for me!

        I agree with Donkin’s argument that preregistration has been oversold, but I think he goes too far in the other direction by denying preregistration’s real benefit. It’s the Derek Jeter thing all over again.

        It seems that he spoke in the same session where I spoke! Had I been there in person I would’ve done some screaming during the discussion period.

        • I blogged about a different talk Donkin gave in 2022, but from the abstract it sounds similar.
          https://statmodeling.stat.columbia.edu/2022/09/15/the-distinction-between-exploratory-and-confirmatory-research-cannot-be-important-per-se-because-it-implies-that-the-time-at-which-things-are-said-is-important/

          My impression from what I saw back then was that he was not denying that it can be useful to researchers, he was arguing that there is no single strong argument for the problem it solves. His work strikes me as an attempt to shed light on some of the philosophical commitments underpinning strong arguments for pre-registration. So I’m curious if he’s now arguing that it should not be adopted at all.

        • Hi Andrew,

          Like you, I also agree with this last sentence:

          “There is also little reason to expect that preregistration will spontaneously help researchers to develop better theories (and, hence, better methods and analyses).”

          I would have continued with another sentence:

          Indeed, there is little reason to expect that anything at all will spontaneously help researchers to develop better theories (and, hence, better methods and analyses).

          In that sense, the last sentence is trivially true. There is no magic bullet, period.

          Also agree that pre-reg has been oversold, but then isn’t that always the case for pretty much everything? Bayesian methods are sometimes presented as the solution to all our problems. Noise is noise, whether Bayesian or frequentist. Even regularized noise is just noise.

          PS I did hear your talk. I’ll try to find the paper I read once, it was led by Shiffrin I think (a well-known scientist in memory research). My memory is that he was attributing to pre-reg imagined advantages that at least I never thought it had; I think he also implied at one point (at least in some draft floating around in the internet) that people were advocating that every expt must be pre-registered (but I might be mis-remembering this). I also find pre-reg useful but it has never stopped me from exploring (and then checking if the patterns found pan out in pre-reg replications; the answer is so far almost always no).

        • For example, look at this paper:

          https://www.tandfonline.com/doi/full/10.1080/09515089.2022.2113771

          I just read the title and abstract (like a good scientist) but the claims there seem overblown. I have been doing experiments for some 25 years and I have found that most exploratory analyses that yield surprising or intriguing patterns (unless they are super obvious things like word length and/or n-gram frequency affects reading time) rarely come out in a confirmatory analysis. But this may be just the field I work in. Maybe he is doing studies with gigantic effect sizes that don’t even need an experiment to confirm.
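One way to see why exploratory patterns so often fail in a confirmatory analysis is a small simulation. This is a rough sketch under strong assumptions: every estimate is drawn directly from a normal distribution with a known standard error, the outcomes are independent, and "replicates" means the new estimate is more than 2 standard errors from zero in the same direction. The function name and all parameter values are hypothetical.

```python
import random


def explore_then_replicate(n_outcomes=20, n=50, true_effect=0.0,
                           n_sims=2000, seed=1):
    """Pick the most striking of many noisy estimates, then replicate it once."""
    rng = random.Random(seed)
    se = 1.0 / n ** 0.5  # assumed SE of a mean of n unit-variance observations
    exploratory_hits = 0
    replication_hits = 0
    for _ in range(n_sims):
        # Exploratory phase: look at many outcomes, keep the largest in size.
        estimates = [rng.gauss(true_effect, se) for _ in range(n_outcomes)]
        best = max(estimates, key=abs)
        if abs(best) > 2 * se:
            exploratory_hits += 1
        # Confirmatory phase: one fresh estimate of that same outcome.
        fresh = rng.gauss(true_effect, se)
        sign = 1.0 if best >= 0 else -1.0
        if sign * fresh > 2 * se:
            replication_hits += 1
    return {
        "exploratory_rate": exploratory_hits / n_sims,
        "replication_rate": replication_hits / n_sims,
    }


if __name__ == "__main__":
    print(explore_then_replicate())
```

With 20 independent outcomes and no true effect, the chance that at least one estimate lands more than 2 standard errors from zero is about 60% (1 - 0.9545^20), while the preregistered follow-up of that one outcome clears the same bar only a few percent of the time. Raising `true_effect` in the sketch shows how the gap narrows when there is a real, sizable effect.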

  3. As an editor, I find that pre-registration does in fact cause problems sometimes. In some cases, the pre-registered plan for data analysis is, well, incorrect, but the authors want to include it in the paper anyway because it was what they said they would do. In other cases, authors simply resist suggestions/requests to report a different analysis. It can even happen that the pre-reg analysis shows an effect of interest but a more justifiable analysis does not, although I cannot recall a case where this happened.

    • Surely all of these cases would increase the information content of a paper relative to not including/admitting them? If the argument against that as an editor is space, then I would raise you unlimited space in online SI. Also, not sure I understand the case where authors resist editorial suggestions — isn’t that then a case of rejection? I don’t see that authors resisting editorial suggestions is especially specific to pre-reg.

      • From a reviewer’s perspective, I’ve had the problem Jonathon refers to. At the least, the authors’ resistance to changing/adding to their pre-registered analyses can lead to an additional round of reviews following an initial “but we pre-registered it this way!” response. It’s a fairly minor thing, but it does make the process slightly slower and more annoying for me (as it requires an extra “yes, but…” re-review).

        As you say though, author reluctance to respond to reviews isn’t unique to pre-registration.

  4. I find that these preregistration conversations never seem to find any common ground every time they come up, as people tend to occupy extreme positions online like (a) prereg should be mandatory for every study vs. (b) prereg actively holds back scientific progress.

    I’ve used prereg a number of times, but not every time. I see a lot of real-life situations where students have to develop plans before they’ve learned enough, so a prereg would be so flawed as to be kind of useless. Like, say a grad student writes their plan before ever taking their stats course. They prereg using a median split on a continuous variable, but by the time they sit down to analyze their data they realize that’s a terrible idea so they keep it as continuous. Their paper will either be:

    a) Report that they had a pre-registered plan, but that their original plan is so flawed they’re not doing it (so, the prereg was ultimately a waste of time, but is a net neutral on the paper’s quality)
    b) Report the analysis with the median split anyway, then the better analysis after (which actively makes the paper worse because the first method is not appropriate)

    I’m not entirely sure how much I’d want to read about how much students didn’t understand about statistics at the prereg stage, and how much they learned in the interim in every paper (since so many papers are written by grad students!). I guess it’s transparent, but it feels a bit like asking people to submit their incomplete drafts to publish alongside the full paper, which is transparent but maybe embarrassingly so.

    I guess I feel like a well-done prereg is usually a net positive. But a shoddily done prereg could make the study worse, so I’d rather no prereg than a poorly done prereg.
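To illustrate why the median split in the example above is considered a poor choice, here is a minimal simulation sketch comparing a continuous predictor with its median-split version. It assumes a linear relationship with normally distributed predictor and noise; the function names, slope, and sample size are hypothetical and chosen only for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def compare_median_split(n=5000, slope=0.5, seed=1):
    """Return (correlation with continuous x, correlation with median-split x)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [slope * xi + rng.gauss(0.0, 1.0) for xi in x]
    cut = statistics.median(x)
    # The preregistered (flawed) plan: dichotomize x at its median.
    x_split = [1.0 if xi > cut else 0.0 for xi in x]
    return pearson(x, y), pearson(x_split, y)


if __name__ == "__main__":
    r_continuous, r_split = compare_median_split()
    print(r_continuous, r_split)
```

Under these assumptions the split version recovers a noticeably weaker correlation than the continuous one, because dichotomizing throws away the information about how far each observation sits from the median. Reporting both analyses, as in option (b), would make that loss visible to readers.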

    • You can just report in a footnote or open reporting/data section that you did a prereg and that, because the author was inexperienced, the prereg was so far off of what needed to be done that it’s not worth discussing further. You could even specify in a sentence or two critical deviations. Experts can read it for what it’s worth.

      The problem with that really comes down to the review process. I can see some reviewers reacting to that poorly when it’s just honest reporting. It’s like one I recently had who asked me to report a power analysis when the honest N justification was that it was a student project and that was the largest one we could collect in the time permitted. We also didn’t do any significance testing in the paper so I’m not sure it even means much. Post hoc margins of error were available but I also wasn’t going to make up a story on how we decided that the N was going to be one that generated ones of a certain size. The reviewer was bent on me lying for blind allegiance to a procedure.

  5. I attended the Royal Society event in person, and also the Critical Perspectives on the Metascience Reform Movement Symposium online. I echo Shravan’s comment that generally opponents of preregistration seem not to understand what it is and what it’s for – frustrating!

    Specifically (and this extends beyond those critics, to others in the research and publishing communities) there seems to be a lack of knowledge re: the distinctions between standard preregistration and the Registered Reports format (and in general the latter’s advantages over the former).

    That said, at least one speaker at the Royal Society who was critical of preregistration did seem to slightly change their perspective when discovering the benefits of Registered Reports from another speaker (if memory serves, relating to how it combats publication bias).

  6. I am sharing two posts I put up in the past few days on my blog errorstatistics.com, sketching some remarks I would have made at the Royal Society conference on Preregistration, had I been able to attend in-person. I’m sorry that I didn’t realize I could have presented online. (I was invited a year ago and accepted the invite.) Anyway, in my comments I start out distinguishing pejorative and nonpejorative data dredging to scotch some popular dismissals.

    https://errorstatistics.com/2024/03/06/promises-and-pitfalls-of-preregistration-an-rss-conference-i-was-to-speak-at/

    https://errorstatistics.com/2024/03/17/preregistration-promises-and-pitfalls-continued/
