Do research articles have to be so one-sided?

It’s standard practice in research articles as well as editorials in scholarly journals to present just one side of an issue. That’s how it’s done! A typical research article looks like this:

“We found X. Yes, we really found X. Here are some alternative explanations for our findings that don’t work. So, yeah, it’s really X, it can’t reasonably be anything else. Also, here’s why all the thickheaded previous researchers didn’t already find X. They were wrong, though, we’re right. It’s X. Indeed, it had to be X all along. X is the only possibility that makes sense. But it’s a discovery, it’s absolutely new. As was said of the music of Beethoven, each note is prospectively unexpected but retrospectively absolutely right. In conclusion: X.”

There also are methods articles, which go like this:

“Method X works. Here’s a real problem where method X works better than anything else out there. Other methods are less accurate or more expensive than X, or both. There are good theoretical reasons why X is better. It might even be optimal under some not-too-unreasonable conditions. Also, here’s why nobody tried X before. They missed it! X is, in retrospect, obviously the right thing to do. Also, though, X is super-clever: it had to be discovered. Here are some more examples where X wins. In conclusion: X.”

Or the template for a review article:

“Here’s a super-important problem which has been studied in many different ways. The way we have studied it is the best. In this article, we also discuss some other approaches which are worse. Our approach looks even better in this contrast. In short, our correct approach both flows naturally from and is a bold departure from everything that came before.”

OK, sometimes we try to do better. We give tentative conclusions, we accept uncertainty, we compare our approach to others on a level playing field, we write a review that doesn’t center on our own work. It happens. But, unless you’re Bob Carpenter, such an even-handed approach doesn’t come naturally, and, as always with this kind of adjustment, there’s always the concern of going too far (“bending over backward”) in the other direction. Recall my criticism of the popular but I think bogus concept of “steelmanning.”

So, yes, we should try to be more balanced, especially when presenting our own results. But the incentives don’t go in that direction, especially when your contributions are out there fighting with lots of ideas that other people are promoting unreservedly. Realistically, often the best we can do is to include Limitations sections in otherwise-positive papers.

One might think that a New England Journal of Medicine editorial could do better, but editorials have the same problem as review articles, which is that the authors will still have an agenda.

Dale Lehman writes in, discussing such an example:

A recent article in the New England Journal of Medicine caught my interest. The authors – a Harvard economist and a McKinsey consultant (who properly disclosed their ties) – describe a variety of ways that AI can contribute to health care delivery. I can hardly argue with the potential benefits, and some areas of application are certainly ripe for improvements from AI. However, the review article seems unduly one-sided. Almost all of the impediments to adoption that they discuss lay the “blame” on health care providers and organizations. No mention is made of the potential errors made by AI algorithms applied in health care. This I found particularly striking since they repeatedly appeal to AI use in business (generally) as a comparison to the relatively slow adoption of AI in health care. When I think of business applications, a common error might be a product recommendation or promotion that was not relevant to a consumer. The costs of such a mistake are generally small – wasted resources, unhappy customers, etc. A mistake made by an AI recommendation system in medicine strikes me as quite a bit more serious (lost customers are not the same thing as lost patients).

To that point, the article cites several AI applications to the prediction of sepsis (references 24-27). That is a particular area of application where several AI sepsis-detection algorithms have been developed, tested, and reported on. But the references strike me as cherry-picked. A recent controversy has concerned the Epic model (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8218233/?report=classic), where the company-reported results were much better than the attempted replication. Also, there was a major international challenge (PhysioNet: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6964870/) in which data were provided from 3 hospital systems: 2 of them provided the training data for the competition, and the remaining system was used as the test data. Notably, the algorithms performed much better on the systems that provided the training data than on the test data.

My question really concerns the role of the NEJM here. Presumably this article was peer reviewed – or at least reviewed by the editors. Shouldn’t the NEJM be demanding more balanced and comprehensive review articles? It isn’t that the authors of this article say anything that is wrong, but it seems deficient in its coverage of the issues. It would not have been hard to acknowledge that these algorithms may not be ready for use (admittedly, they may outperform existing human models, but that is an area of ongoing research and it should be noted in the article). Nor would it be difficult to point out that algorithmic errors and biases in health care may be a more serious matter than in other sectors of the economy.
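
To make the cross-system evaluation Dale describes a bit more concrete, here’s a minimal simulated sketch of that kind of setup. This is not the PhysioNet challenge’s actual code: the data, features, model, and numbers are all made up for illustration. The point is just that a model fit to records from two hospital systems can look much better on those systems than on a third, held-out system.

```python
# Purely illustrative sketch: fit a "sepsis-risk" classifier on simulated
# records from two hospital systems (A, B) and test it on a third, held-out
# system (C). Hospital C's outcome model is deliberately different, standing
# in for site-specific patients and practice patterns, so the cross-site
# generalization gap shows up.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def make_hospital(n, coefs, seed):
    """Simulate one hospital system's records with its own outcome model."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))                      # five generic vital-sign-like features
    p = 1 / (1 + np.exp(-(X @ np.asarray(coefs))))   # site-specific risk model
    y = rng.binomial(1, p)                           # 1 = sepsis event
    return X, y

# Hospitals A and B supply the training data; hospital C is the held-out test site.
X_a, y_a = make_hospital(2000, [0.9, -0.6, 0.4, 0.0, 0.2], seed=1)
X_b, y_b = make_hospital(2000, [0.8, -0.5, 0.5, 0.1, 0.2], seed=2)
X_c, y_c = make_hospital(2000, [0.1, 0.3, -0.6, 0.9, -0.4], seed=3)

model = GradientBoostingClassifier().fit(
    np.vstack([X_a, X_b]), np.concatenate([y_a, y_b])
)

for name, X, y in [("A (train site)", X_a, y_a),
                   ("B (train site)", X_b, y_b),
                   ("C (held-out site)", X_c, y_c)]:
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    print(f"hospital {name}: AUC = {auc:.3f}")
```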

Interesting. I’m guessing that the authors of the article were coming from the opposite direction, with a feeling that there’s too much conservatism regarding health-care innovation and they wanted to push back against that. (Full disclosure: I’m currently working with a cardiologist to evaluate a machine-learning approach for ECG diagnosis.)

In any case, yes, this is part of a general problem. One thing I like about blogging, as opposed to scholarly writing or journalism, is that in a blog post there’s no expectation or demand or requirement that we come to a strong conclusion. We can let our uncertainty hang out, without some need to try to make “the best possible case” for some point. We may be expected to entertain, but that’s not so horrible!

21 thoughts on “Do research articles have to be so one-sided?”

  1. In my experience this is strongly enforced by editors and peer review. If you point out the flaws in your paper or alternative explanations, reviewers attach themselves to those flaws and use them to propose rejecting the paper; editors use them to label the work as low-impact and not suitable for their journal.

  2. Yes, the incentives provided by editorial policies, peer review, and employer vested interests are a large part of the problem. I think there is also a very human incentive to want to make strong recommendations or even decisive statements. I think this is especially true among academics. Policy makers have all the control and often view academic research skeptically. Making strong declaratory conclusions provides a satisfying counterbalance to the futility that many academics feel about their own ability to influence things. The roles of decision makers and analysts are not clearly defined – and I think there is a tendency for each to encroach on the turf of the other. Decision makers cherry-pick analyses and/or declare truth in the absence of any evidence, and analysts seem to pretend that they are, or know better than, the decision makers.

  3. I think this complaint misses the fact that there are, roughly, a billion articles published every year. My bioRxiv feed for yesterday alone, a typical day, has 35 articles in the four categories I get updates on. No one will ever read an article of the form: “here’s some stuff; the data themselves aren’t intrinsically interesting, and studying these data doesn’t push us to some particular conclusion,” not because that’s bad, but because it can’t compete with all the other articles that promise to take us to some destination. (And yes, most of the other articles are awful.)

    But review articles: yes, these should give a broad perspective!

  4. In my experience, computer science does better than average at crediting prior art (w/ useful references!) and making fair comparisons. I do not know if this is related to the CS tradition that most papers are conference submissions.

    • Really? I think of some CS as being among the worst because novelty is so prized! We often don’t even mention the related work until the end of the paper, which says something about prioritization.

      There are areas, though, where it’s harder to hide, due to standardization of data sets and baselines, AI/ML being one obvious one.

      • But it’s easier to hide with AI in healthcare, because datasets are rarely open and are easily hidden behind appeals to privacy.

        I’m all for privacy, but most healthcare models are safe-harbor by the time they are algorithm-ready matrices. Journals should push back and insist that some data release is necessary even when the full data set is not releasable.

  5. “But, unless you’re Bob Carpenter, such an even-handed approach doesn’t come naturally, and, as always with this kind of adjustment, there’s always the concern of going too far (“bending over backward”) in the other direction”

    I don’t know, I feel like I’ve blogged multiple times in the last few years about how I can hardly bring myself to give talks because the sales mode that audiences expect has become so unnatural. I’m getting better at designing talks I’m comfortable giving that don’t make the audience uncomfortable either, but it’s a constant challenge. I had to learn the hard way that people really don’t want to see you question your own ideas in your talks unless it’s a set-up for an even better solution!

    • Jessica:

      A lot depends on the venue. Sometimes the audience is willing to follow me wherever I go, and I can give a talk raising all sorts of difficulties with my methods, and they seem to get the point. Other times the audience is unfamiliar with the topic, and then it’s hard to bring up subtleties. I remember once many years ago speaking to a group of applied mathematicians and talking about logistic regression . . . but they’d never heard of logistic regression! Had I realized this would’ve been an issue, I could’ve structured the talk differently and framed it as a math problem; the problem was that I didn’t catch this communication gap until the talk was already over.

      • Yeah, I think the place where questioning one’s results goes the most wrong is when the entire idea of what is being presented is new to the audience. E.g., when I’ve been asked to give a talk on uncertainty visualization to people who never realized that visualization was something people studied seriously, much less that there are behavioral experiments on uncertainty visualizations. So then when I say things about how I doubt how much we’re really learning from these studies, for reasons X, Y, Z, unless I have some answers about what exactly is going to be better, they wonder why they invited me!

        Things got easier for this style of talk once I came up with some stuff I think is better than the work I’ve done that gets me invited, though I still prefer for the message of my talks to be more “this is really hard to get right” than “you should do this.” I’m getting better at finding ways to give talks around bigger questions that my work can say something about, even if no resolution is reached, where the question is important enough that the audience seems to appreciate thinking through it with me.

        When it comes to writing papers, the same basically applies.

        • Jessica:

          Writing books is a lot easier than writing papers because there’s no publication bottleneck. You can write whatever you want in a book and get it published; with a paper, there’s always this background level of stress on what will the referees think, whether the paper will get accepted, etc.

        • Andrew, Jessica:

          The answer to this puzzling phenomenon lies in the media, the textbook sidebars, and the Saturday-morning-cartoon portrayals of scientists known to readers of this blog as “the hero meme.” Nowhere among these is there a scientist plugging away in obscurity who claims to not quite get what’s going on in the world – unless, of course, the scientist makes a great discovery, in which case their slaving away in obscurity becomes an immoral penance which had been foisted upon the misunderstood genius by the ruthlessly mundane establishment (usually vaguely represented as corporate-like).

          Science is an evangelical business. When you make your great discovery that saves humanity, it’s not enough to present it rationally. If you don’t believe deeply enough in the refilling soup bowl to chain yourself to the corporate offices of Campbell’s Soup and demand its evil inhabitants stop killing people with soup, why are you even a scientist? Perhaps you’re part of the evil! Tune in next week, kids, when Scooby and company seek the answer!!

        • Andrew, I know what you’re thinking: What is Chipmunk talking about? “Soup” can’t be rationally defined! No one really knows the difference between soup and porridge! So – the existential question is, which headquarters should a Moral Hero Scientist chain themselves to? Campbell’s? Or Quaker?

  6. Besides the already rare case of articles saying “we found X, but it may be wrong,” it’s entirely possible to conceive of articles saying “we found that the explanation can be X or Y, and current data support both.” Two mutually exclusive theories, both supported, presented in the same work. Makes total sense, and I’m always looking for such a thing, but so far I’ve never come across one. If anyone knows of one, please tell, since it would be useful in teaching how science should be done, without championing pet explanations.

    • There is a popular paper called ‘A Fine is a Price’ (Gneezy and Rustichini, 2000; open access at https://www.jstor.org/stable/10.1086/468061), which essentially offers two interpretations of its result. In brief: after a small fine was introduced for picking up a child late from nursery school, the number of late pick-ups increased, and after the fine was withdrawn, the level of late pick-ups did not decrease but remained high. The statistical evidence is not particularly strong, in my opinion; the paper’s nice graphs (p. 9) lead me to that conclusion. I think the ‘facts’ the authors report are less certain than they purport, but their paper is very helpful for anyone who wants to make up their own mind about it.
      In the section ‘Interpretation of the Results’ the authors propose two different approaches to understanding the mechanism they observed. One is a game-theoretic model (explained in subsections A to C). The idea is essentially that a fine is a signal that the nursery school management tolerates a certain amount of parental misbehaviour. The model says that parents have lost their fear that the nursery school will expel the child if they are occasionally late, so they start to exploit this.
      In subsection D, the authors write about social norms. I would summarise it as follows: Initially, the parents understood tardiness as requiring an act of grace by the nursery teacher – it had a high moral value. By attaching a small fine to it, tardiness became a minor infraction of a faceless policy, a victimless crime in the eyes of the parents. Only a minor rule had been broken, so what?
      The authors clearly like the second explanation better, because they even named the paper after it. (I think the name is one of the catchiest research-paper titles ever.) The paper has two conclusions, just as you wanted. But I would definitely not use this paper in class without pointing out that the statistical evidence is probably not as strong as the authors purport it to be.

      • Thank you, it certainly helps to illustrate the point. And by the way, as a complete ignoramus (but as a father who used to pick up a child), I strongly feel that the first explanation is closer to the truth!

  7. This doesn’t quite belong here, but speaking of one-sided articles…

    A while ago there was what appeared to be an invited article on the problems of urbanization in Science. Since the lead photo was a pic of downtown Tokyo, and I’m a fan of cities, I read said article.

    https://www.science.org/doi/10.1126/science.adi6636

    The authors were mad about something, and I couldn’t quite figure out what. As best I could tell, there’s some sort of internal fight going on in the urban-studies field, and the authors thought the mainstream in the field was barking up the wrong tree, and anyway, they don’t like cities and think the problems are terrible.

    My analysis is that they think urbanization is terrible, but they don’t consider the alternative, which is distributing the same number of people over a much larger area. Nah, urbanization is great. You don’t need a car, public transportation can work, and whatever your interests are, there are people doing that within public-transportation distance. The author of the book I’m reading (an intro to literary criticism) teaches a course at a university less than an hour from here, for example. They complain about how difficult the supply chain is, but, again, the alternative is distributing those people over a larger area, making the supply-chain problem far worse.

    But one of their complaints was how horrifically crowded cities are, and to demonstrate that point, the photo of Tokyo they presented was, they implicitly claimed, a typical downtown street. The street in the pic was Takeshita Dori.

    Now, this is one hilarious howler. First of all, backstreets in Tokyo are largely devoid of traffic and people. I’m on a street that’s smack dab in the middle of downtown Tokyo (search for “David in Tokyo” on Google maps), and there’s hardly any traffic. Far less than there was on Beacon Hill around 1970, for example. Sure, the major streets that border the triangular region** I’m in are busy (the corners are the Meiji Kinenkan, Yotsuya, and Yotsuya sanchome, if you’re a map fan). But once you get off a main street, there’s not much traffic. And that’s a major defining characteristic of Tokyo: the geography (and street layouts) are so crazy that you don’t go onto a back street unless you have business there.

    So how did they find a street that’s packed with people? Takeshita Dori* is a major tourist attraction and a sort of mecca for high-school kids hoping to be discovered by talent scouts. It’s like saying “Anaheim is crowded and the architecture is ugly” and showing a photo of Disneyland at the peak of the busy season.

    *: https://en.wikipedia.org/wiki/Harajuku

    ** The three corners of this triangle are all clean 90-degree corners. Just like Boston’s Boston Common, which has FIVE 90-degree corners.
