Addressing legitimate counterarguments in a scientific review: The challenge of being an insider

Review articles can be written by outsiders or insiders.

From the outside, it’s easier. You assess the evidence and draw your conclusions. For example, this is what I did when summarizing the research on ballot-order effects and addressing the question of whether Donald Trump won the 2016 election because his name came first on the ballot in key states; see pages 240-242 of Active Statistics for the full story. Or when my colleagues and I wrote about “nudge” interventions. We could be right, we could be wrong, but in any case the job is clear enough.

Writing a review is more difficult from the inside. For example, consider the meta-analysis of nudge that was written by some nudge researchers. It had big problems: garbage in, garbage out. It’s hard to step back and examine the evidence, if you’re part of the story. For another example, we recently discussed a review of the controversial Implicit Association Test that was flawed in only presenting part of the story, not even acknowledging that there was a controversy.

What to do if you’re an insider and you want to write a review that includes some of your own work?

I don’t think you should just give up. As an insider, you have a special perspective, and it makes sense that you’ll want to review the evidence. At the same time, you have to avoid the natural inclination to present too coherent a story. In real life, the evidence doesn’t always all go in the same direction!

My recommendation, for the insider writing a review article, is not to try to debate or shoot down legitimate counterarguments to your own work but rather to acknowledge the disagreement and fit it into your larger story.

Here’s an example where we did exactly that. Our article is called Reconciling Evaluations of the Millennium Villages Project, and it begins:

The Millennium Villages Project was an integrated rural development program carried out for a decade in 10 clusters of villages in sub-Saharan Africa starting in 2005, and in a few other sites for shorter durations. An evaluation of the 10 main sites compared to retrospectively chosen control sites estimated positive effects on a range of economic, social, and health outcomes (Mitchell et al. 2018). More recently, an outside group performed a prospective controlled (but also nonrandomized) evaluation of one of the shorter-duration sites and reported smaller or null results (Masset et al. 2020). Although these two conclusions seem contradictory, the differences can be explained by the fact that Mitchell et al. studied 10 sites where the project was implemented for 10 years, and Masset et al. studied one site with a program lasting less than 5 years, as well as differences in inference and framing. Insights from both evaluations should be valuable in considering future development efforts of this sort. Both studies are consistent with a larger picture of positive average impacts (compared to untreated villages) across a broad range of outcomes, but with effects varying across sites or requiring an adequate duration for impacts to be manifested.

“Mitchell et al. 2018” was us! I worked with the Millennium Villages Project to help conduct a retrospective evaluation, which yielded positive estimated effects. My take was that the project worked well. That’s what I believe, but I’m an insider. Masset et al. 2020 was an outside team that did a different analysis on different data and reported null results. The purpose of our review article was to understand how these two studies of the same program came to such different conclusions. In writing this paper we had to walk a fine line: we were trying our best to assess the past work objectively, but one of the papers was ours. The key here is that we presented the disagreement: we did not pretend the dissenting article did not exist, nor did we dismiss it; rather, we incorporated it into a larger understanding. I’m not saying that this new paper of ours was perfect, nor that it was influential (indeed, according to Google Scholar it has exactly zero citations, really a low payoff given all the work we put into it), but I still think of it as a model for how to present and assess conflicting evidence from the inside.

4 thoughts on “Addressing legitimate counterarguments in a scientific review: The challenge of being an insider”

  1. 1. Almost always regular articles contain literature reviews, however brief. The challenge applies to these as well. Thus it comes up often.
    2. One way to handle the challenge: when self-citing, write, e.g., “Gelman shows such & such.” When citing somebody else, write, e.g., “Huberman reports such and such.” When arguing with somebody else replace “reports” with “claims.”

    • Gur:

      Another complexity is that there are multiple publication outlets, and things are set up so it’s not so easy for different articles to respond to each other.

      In this case, there were four major articles:

      Pronyk et al. (2012)–this was the one that led things off, that had to be retracted because it misrepresented the results.

      Mitchell et al. (2018)–this article, which included me as a coauthor, was a kind of re-do of Pronyk et al., in that both articles were written by people connected to the Millennium Villages Project. But neither Mitchell nor I had been involved in the bad Pronyk et al. study.

      Masset et al. (2020)–these people were pretty harsh on the Millennium Villages Project, which is fine, but I don’t think they made any kind of serious attempt to reconcile their findings with those of Mitchell et al. (2018).

      Gelman et al. (2023)–this was the paper that I wrote (with the collaboration of Mitchell, Sachs, and Sachs) to attempt to reconcile Mitchell et al. (2018) and Masset et al. (2020), as discussed in the above post.

      Now, here’s the point. Those four articles were published in four different journals. It’s hard to get an article published, so you publish it where you can. The trouble is that it’s hard to move forward. There’s no point in talking any more about Pronyk et al. (2012), except as a cautionary tale about conflicts of interest, but the other three papers have valuable things to say. If, for example, Masset et al. wanted to follow up on our 2023 paper, they could, but they’d need to find yet another journal to publish their response.

      And now, for the punch line, the number of citations of each of these papers in Google Scholar:

      Pronyk et al. (2012): 113 citations

      Mitchell et al. (2018): 58 citations

      Masset et al. (2020): 16 citations

      Gelman et al. (2023): 1 citation

      So . . . the only unambiguously bad paper in the bunch gets more citations than all the others put together. And our reconciliation, which in some ways should be the last word (so far) on the topic, has only 1 citation. You can see why Masset et al. might not want to waste any time responding to it. Indeed, looking at these counts makes me wonder, first, why I spent so many hours writing and revising that eventually-published-in-2023 paper, and, second, why Shira Mitchell and the rest of us spent several years on the evaluation study leading to our 2018 paper. Success is not measured in citations; still, it’s dispiriting to see such a small impact. In retrospect, maybe we should’ve figured out how to get our paper published in a medical journal. That’s how you get those big citation numbers, and maybe actual policy influence?

  2. This interests me because I nearly always write and submit as an outsider and am reviewed by insiders. Predictably, the insiders describe my contribution as uninformed and therefore misguided, as if the only reason I don’t share their perspective is that I lack the background. Of course, my reading of the prior literature sometimes omits important stuff, but that’s generally not the reason I take the positions I do.

    Stepping outside my frustration, I try to imagine what a journal editor should do. The obvious solution would be to pull together a mixed team of reviewers, including both insiders and outsiders. The problem I think an editor would have, however, is figuring out which outsiders are qualified to do this. Also, non-insiders are much more likely to refuse the assignment. So there may be no practical solution. If so, this is another argument for lighter vetting prepub and much heavier evaluation postpub.
