Why we say that honesty and transparency are not enough:

Someone recently asked me some questions about my article from a few years ago, “Honesty and transparency are not enough.” I thought it might be helpful to summarize why I’ve been promoting this idea.

The central message in that paper is that reproducibility is great, but if a study is too noisy (with the bias and variance of measurements being large compared to any persistent underlying effects), then making it reproducible won’t solve those problems. I wrote it for three reasons:

(a) I felt that reproducibility (or, more generally, “honesty and transparency”) was being oversold, and I didn’t want researchers to think that just cos they drink the reproducibility elixir, their studies will then be good. Reproducibility makes it harder to fool yourself and others, but it does not turn a hopelessly noisy study into good science.

(b) Lots of people are honest and transparent in their work but still do bad research. I wanted to be able to say that the research is bad without that implying that I think they are being dishonest.

(c) Conversely, I was concerned that, when researchers heard about problems with bad research by others, they would think that the people who are doing that bad research are cheating in some way. This leads to the problem of researchers saying to themselves, “I’m honest, I don’t ‘p-hack,’ so my research can’t be bad.” Actually, though, lots of people do research that’s honest, transparent, and useless! That’s one reason I prefer to speak of “forking paths” rather than “p-hacking”: it’s less of an accusation and more of a description.
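To make the point about noisy studies concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not an analysis from the original paper: the function name, the true effect of 0.1, and the standard error of 1.0 are all hypothetical choices. Every simulated study is honest (an unbiased estimate, no selective analysis) and the fixed seed makes the whole thing exactly reproducible, yet the estimates that reach statistical significance still greatly exaggerate the effect and often have the wrong sign.

```python
import random
import statistics


def simulate_noisy_studies(true_effect=0.1, standard_error=1.0,
                           n_studies=100_000, seed=0):
    """Simulate many honest, reproducible studies of a small effect.

    All parameter values are hypothetical, chosen so that the noise is
    large compared to the underlying effect.
    """
    rng = random.Random(seed)  # fixed seed: anyone can rerun this exactly

    # Each study reports an unbiased estimate: true effect plus sampling noise.
    estimates = [rng.gauss(true_effect, standard_error)
                 for _ in range(n_studies)]

    # Keep only the estimates that reach conventional significance (|z| > 1.96).
    significant = [e for e in estimates if abs(e) > 1.96 * standard_error]

    return {
        # Fraction of studies that come out "statistically significant".
        "share_significant": len(significant) / n_studies,
        # How much larger the significant estimates are than the true effect.
        "exaggeration_ratio":
            statistics.mean(abs(e) for e in significant) / true_effect,
        # Fraction of significant estimates pointing in the wrong direction.
        "wrong_sign_share":
            sum(e < 0 for e in significant) / len(significant),
    }


if __name__ == "__main__":
    for name, value in simulate_noisy_studies().items():
        print(f"{name}: {value:.3f}")
```

Rerunning the script gives identical numbers every time, which is the sense in which it is reproducible. That does nothing to make any individual significant estimate informative about the true effect.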

5 thoughts on “Why we say that honesty and transparency are not enough:”

  1. That’s why it’s great to explore and contrast critical appraisal models across disciplines and methodologies. Recently, I found the Total Quality Framework from qualitative research, which focuses on Credibility (scope, data gathering), Analyzability (data processing, triangulation), Transparency (reporting), and Usefulness.

    A more comprehensive list from my own thinking would include Fairness, Openness, Relevance, Ethics, Clarity, Applicability, (Self)-skepticism, Transparency, Efficiency, and Rigor. Of course, that’s a lot harder to scale up with advocacy and educational initiatives.

    • Jay:

      Yes, I think that the normalization and expectation of criticism is important in science, indeed in problem-solving more generally. One way to interpret my statement, “honesty and transparency are not enough,” is that it is a terrible mistake if people think that, just because they are honest and transparent, their work will not benefit from outside scrutiny.

      • I agree, that was my understanding from your original post “Honesty and transparency are not enough.” I just wanted to note that there are qualitative and mixed methods researchers (typically absent or less visible in these open science debates and activities) who would probably align with your views on research quality. It was interesting for me to learn about this through readings and coursework even after years of following these open science and metascience issues.

  2. I think you are conflating reproducibility with the ability to distinguish between different explanations.

    A method could be perfectly reproducible but useless for telling which explanation fits the data best.*

    E.g., observations of the planetary orbits were reproducible but could not distinguish between Newtonian mechanics and relativity, except for Mercury, which has the most irregular orbit. Those irregularities rose above the noise floor set by observational error.

    This issue is orthogonal to reproducibility.

    * Typically those are something like “chance vs not chance”, or “exactly zero correlation vs some correlation”, which is its own problem not directly relevant here.

  3. Is “prompt engineering” (in generative AI) a form of p-hacking? If you change even one word in the prompt, you may get a different result. You keep trying different prompts until you find one that generates an output that looks amazing… then you send a tweet saying the AI did this amazing thing. The best tweets are those that give the impression that they happened upon that prompt on their first try.
