“For a research assistant, do you think there is an ethical responsibility to inform your supervisor/principal investigator if they change their analysis plan multiple times during the research project in a manner that verges on p-hacking?”

A reader who works as a research assistant and who wishes to remain anonymous writes:

I have a hypothetical question about ethics in statistics. For a research assistant, do you think there is an ethical responsibility to inform your supervisor/principal investigator if they change their analysis plan multiple times during the research project in a manner that verges on p-hacking? Or do you think that the hierarchy within this relationship places the burden on the supervisor/principal investigator and not the research assistant?

My reply:

Let me separate this into two issues:

1. The ethics of the supervisor’s behavior.

2. The ethics of the research assistant’s behavior.

1. Regarding the ethics of the supervisor’s behavior, I guess this depends a bit on the social relevance of the application area. If the supervisor is p-hacking on the way to a JPSP paper on extra-sensory perception, then I guess all that’s at stake is the integrity of science, some research funding, and the reputation of Ivy League universities, so no big deal. But if he’s p-hacking on the way to a claim about the effectiveness of some social intervention, whether it be early-childhood intervention or food labeling in schools, then there are some policy implications to exaggerating your results. Indeed, even if there’s no p-hacking at all, we can expect published estimates to be overestimates, and that seems like an ethical problem to me. Somewhere in between are claims with no direct policy implications that still have ideological implications. For example, a p-hacked claim that beautiful parents are more likely to have daughters does not directly do any damage—except to the extent that it is used as ammunition in a sexist political agenda. The most immediately dangerous cases would be manipulating an analysis to make a drug appear safer or more effective than it really is. I guess this happens all the time, and yes, it’s unethical!

2. Regarding the research assistant: I’d say, yeah, the burden is on the supervisor. I admire whistleblowers, but it’s awkward to say there’s an ethical responsibility to blow the whistle, given the possibility of retaliation by the supervisor.

Rather than saying there’s an ethical responsibility of the research assistant to blow the whistle, I’d say that the supervisor has an ethical responsibility to set up a working environment where it’s clear that subordinates can express their concerns without fear of retaliation; that the institution where everyone is working has an ethical responsibility to enforce this; and that society has an ethical responsibility to ensure that institutions allow safe complaints.

Two questions then remain:

3. Is the supervisor’s behavior actually unethical here? Is it even bad statistics or bad science?

4. What should the research assistant do?

3. Is it unethical to “change an analysis plan multiple times during the research project in a manner that verges on p-hacking”? It depends. It’s unethical to change the plan and hide those changes. It’s not unethical to make these changes openly. Then comes the analysis. What’s important in the analysis is not whether it accounts for all the different plans that were not done. Rather, what’s important is that the analysis reflects the theoretical perspectives embodied in these analysis plans. For example, if the original plan says that variable A should be important, and the later plan says that variable B should be important, then the final analysis should include both variables A and B. Or if the original plan says that the effect of variable A should be positive, and the final plan says the effect should be negative, then the final analysis should respect these contradicting theoretical perspectives rather than just going with whatever noisy pattern appeared in the data. My point here is that the ethics depends not just on the data and the steps of the analysis; it also depends on the substantive theory that motivates the data collection and analysis choices.
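
To make this concrete, here is a minimal sketch in Python (my illustration, not anything from the question above; the toy data and the variables y, A, and B are hypothetical) of fitting the model that embodies both theoretical perspectives rather than reporting only whichever single-predictor model happened to “work”:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data standing in for the real study (purely illustrative).
rng = np.random.default_rng(1)
df = pd.DataFrame({"A": rng.normal(size=200), "B": rng.normal(size=200)})
df["y"] = 0.3 * df["A"] + rng.normal(size=200)

plan1 = smf.ols("y ~ A", data=df).fit()      # original plan: A matters
plan2 = smf.ols("y ~ B", data=df).fit()      # revised plan: B matters
final = smf.ols("y ~ A + B", data=df).fit()  # include both perspectives

for name, fit in [("plan 1", plan1), ("plan 2", plan2), ("combined", final)]:
    print(name, fit.params.to_dict())
```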

4. I don’t know what I’d recommend the research assistant do. In similar situations I’ve suggested an indirect approach: instead of directly confronting the supervisor, make a positive suggestion that the analysis would be improved by a clearer link to the underlying substantive theory. You can also express concerns by invoking a hypothetical third party: say something like, “A reviewer might be concerned about possible forking paths in the data coding and analysis, and maybe a multiverse analysis would allay such a reviewer’s concerns.”
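
For readers who haven’t seen the term: a multiverse analysis reruns the analysis under every defensible combination of data-coding and modeling choices and reports the whole grid of results rather than a single path. Here is a hypothetical sketch (the particular choice sets, an outlier rule crossed with a covariate set, are invented for illustration):

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data again; in practice df would be the study's real dataset.
rng = np.random.default_rng(2)
df = pd.DataFrame({"A": rng.normal(size=200), "B": rng.normal(size=200)})
df["y"] = 0.2 * df["A"] + rng.normal(size=200)

# Each forking path: a data-coding choice crossed with a modeling choice.
outlier_rules = {
    "keep_all": lambda d: d,
    "trim_3sd": lambda d: d[abs(d["y"] - d["y"].mean()) < 3 * d["y"].std()],
}
covariate_sets = ["A", "A + B"]

rows = []
for (name, rule), covs in itertools.product(outlier_rules.items(), covariate_sets):
    fit = smf.ols(f"y ~ {covs}", data=rule(df)).fit()
    rows.append({"outliers": name, "model": covs,
                 "coef_A": round(fit.params["A"], 3),
                 "p_A": round(fit.pvalues["A"], 3)})

# Report the full grid, not just the prettiest cell.
print(pd.DataFrame(rows))
```

The point is that readers see how much the estimate moves across reasonable choices, which is exactly what the hypothetical reviewer would want to know.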

13 thoughts on ““For a research assistant, do you think there is an ethical responsibility to inform your supervisor/principal investigator if they change their analysis plan multiple times during the research project in a manner that verges on p-hacking?””

  1. Just sent this to my class (many of whom work in labs), with the addition that they may not know what the supervisor plans to do (e.g., will they report everything, or multiverse it?) and that the supervisor may not know that what they are doing is wrong, so come prepared with the forking-paths paper. I think assistants often hear their methods instructors talk about p-hacking and preregistration, and so assume that their profs, who were trained in the “p is everything” era, also know about these issues (they should, but that’s a different issue).
    And I echo your point that, depending on the circumstance, the path of least resistance may be prudent.

  2. I couldn’t disagree more with your point 1. If a research technique is unethical, it’s unethical regardless of what subject matter it is applied to. If you’re talking about harms to society, you’re getting into morals, not ethics.

    Honestly as a lawyer I found your answer kind of shocking. Are there no formal ethics rules in this field?

    • Andrew:

      I was being ironic when I wrote, “I guess all that’s at stake is the integrity of science, some research funding, and the reputation of Ivy League universities, so no big deal.”

      It’s hard to convey intonation in typed speech.

    • >If you’re talking about harms to society, you’re getting into morals, not ethics.
      You’re being rather pedantic. Point 1 in the blog post is clearly about morals, but those are assessed by _ethics_ committees. Take a look at how “ethical” is used in informal speech.

      >Are there no formal ethics rules in this field?

      If there were, that would usually be unfortunate: ethics committees encourage immoral research practices of a certain institutional type.

  3. I’ve never worked in a lab, so my question is (very) basic:

    How does one distinguish p-hacking (a false positive) from a genuinely statistically significant p-value if there aren’t enough trials to tell the two apart? That is, is p-hacking always evident during the experiment, or is some process needed afterwards, such as running more experiments, to determine whether the experiment was p-hacked?

    This is related to the question of ethics because it assumes that one can tell the difference between a legitimate series of experiments and one that’s searching for a false positive. If there’s uncertainty in this process, then there’s uncertainty in detecting unethical behavior.

    • Not quite sure what you are asking about the number of trials. Most p-hacking amounts to using a statistical model that assumes your data are IID and then proceeding to blatantly violate that assumption by introducing dependence between the datapoints.

      E.g., you run a batch of 5 rats; if the results look “promising,” you run another batch, and so on, until you either get significance and stop or run out of money. Or you throw out datapoints just for being outliers relative to the majority of the points. And there is *always* a plausible excuse to throw out any given datapoint.

      Also, of course, there’s the classic: looking at 20 different outcomes, then making a big deal about the one that is significant while downplaying the rest.
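
      For scale, a one-line check (my own arithmetic, not part of the comment above): with 20 independent null outcomes each tested at alpha = 0.05, the chance that at least one comes up “significant” is about 64%.

      ```python
      # Probability that at least one of 20 independent null tests hits p < 0.05
      print(1 - 0.95**20)  # ~0.64
      ```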

      Essentially, the model you are using should actually reflect the process used to generate the data (along with whatever theoretical assumptions you want to test), not some idealized stats-101 example just because it is easy to calculate. Remember, if you make the deduction:

      If (A and B) then C.

      and you then fail to observe C (i.e., !C), modus tollens says:

      !(A and B) = !A or !B or (!A and !B)

      I.e., all you know is that at least one of your assumptions is false. So if you already know there is some false methodological assumption being made (a violation of IID), then you have rendered the whole process of testing your model pointless. You can’t conclude anything.
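
      Here is a small simulation of the batch-of-5-rats example above (a sketch of my own; the batch size, stopping rule, and repetition counts are just illustrative), showing how optional stopping under a true null inflates the nominal 5% false positive rate:

      ```python
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def optional_stopping(batch=5, max_batches=10, alpha=0.05):
          """Add batches until p < alpha or the money runs out.
          There is no true effect, so any "significance" is a false positive."""
          treat, ctrl = [], []
          for _ in range(max_batches):
              treat.extend(rng.normal(0, 1, batch))
              ctrl.extend(rng.normal(0, 1, batch))
              if stats.ttest_ind(treat, ctrl).pvalue < alpha:
                  return True   # stop and "publish"
          return False          # ran out of money

      hits = sum(optional_stopping() for _ in range(2000))
      print(f"false positive rate: {hits / 2000:.2f}")  # well above 0.05
      ```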

  4. Your answer to 3 makes me think of an easy general ethics rule: nothing is unethical if the researcher is clear and open about what they are doing. It wouldn’t be unethical to write a paper saying “here are the types of results you can get by p-hacking,” although then it would be clear that such results shouldn’t be taken seriously. Even in the extreme case where researchers straight-up fabricate data, it wouldn’t be unethical if what they did was clear and open; rather, it would just be simulated data!

    • Sam:

      I disagree with your principle, “nothing is unethical if the researcher is clear and open about what they are doing.” If someone publicly states that he’s gonna rip people off, and then he goes out and does it, I still think it’s unethical. I agree with you about the fabricated data, though. If that Pizzagate dude had just said he was designing a bunch of thought experiments, that would’ve been ok. The trouble is that then this work never would’ve been published, or, if it had, it wouldn’t have gotten all that attention. Similarly with plagiarism: if you put in the quotation marks, it’s not plagiarism. But plagiarists (or, more generally, people such as authors of chess books who copy material from other sources without attribution) don’t put in the quotation marks, because if they did, they wouldn’t get the credit they crave, and also they wouldn’t be able to manipulate and misrepresent their sources.

  5. But here’s the tough follow-on real-world question. This supervisor is gonna publish, without clarifying the hacking, without Bonferroni or anything. In most fields the assistant’s name would routinely go on the paper. The assistant knows that the results are pretty bogus. What does the assistant do at that point?
    I say she has to ask that her name not be included as an author, and state why. Coauthorship explicitly means that you agree with at least the factual points of the paper.

    • Michael:

      Yes, I agree that at some point you just have to say, “I don’t feel comfortable being a coauthor of this paper.” I’ve done that sometimes; other times I wish I had; other times there were disagreements between the authors after the paper had already been written and the work never got published, as there was no clean way to divide the baby.

  6. If it’s an “interesting” result, would an ethical option be to get the study replicated? The assistant could maybe do it later in their own career, or maybe find someone else in the field to do it. If the replication confirms the result, all is fine; if it doesn’t, you say “guess our first result wasn’t as solid as we thought”.

    The problem with this is that there may not be as much credit (or funding!) for this replication as there is for original work.

  7. My advice in these situations is to talk to a senior member of your organization, saying “I am not sure about this, but it seems concerning,” and let them handle it.

    That might mean clarifying that your advisor is not being unethical, or it might mean having them fired.

    But if they don’t address your concern, find somewhere else to be.

    And don’t talk openly about your perception of the problem, at least not until you are secure somewhere else.
