“Like a harbor clotted with sunken vessels”: update

A few years ago I reported on this story:

In 2005, Michael Kosfeld, Markus Heinrichs, Paul Zak, Urs Fischbacher, and Ernst Fehr published a paper, “Oxytocin increases trust in humans.” According to Google, that paper has been cited 3389 times.

In 2015, Gideon Nave, Colin Camerer, and Michael McCullough published a paper, “Does Oxytocin Increase Trust in Humans? A Critical Review of Research,” where they reported:

Behavioral neuroscientists have shown that the neuropeptide oxytocin (OT) plays a key role in social attachment and affiliation in nonhuman mammals. Inspired by this initial research, many social scientists proceeded to examine the associations of OT with trust in humans over the past decade. . . . Unfortunately, the simplest promising finding associating intranasal OT with higher trust [that 2005 paper] has not replicated well. Moreover, the plasma OT evidence is flawed by how OT is measured in peripheral bodily fluids. Finally, in recent large-sample studies, researchers failed to find consistent associations of specific OT-related genetic polymorphisms and trust. We conclude that the cumulative evidence does not provide robust convergent evidence that human trust is reliably associated with OT (or caused by it). . . .

Nave et al. has been cited 101 times.

OK, fine. The paper’s only been out 3 years. Let’s look at recent citations, since 2017:

“Oxytocin increases trust in humans”: 377 citations
“Does Oxytocin Increase Trust in Humans? A Critical Review of Research”: 49 citations

OK, I’m not the world’s smoothest googler, so maybe I miscounted a bit. But the pattern is clear: New paper revises consensus, but, even now, old paper gets cited much more frequently.

What’s happened since then? Here’s a quick look at total citations:

The old paper with the false conclusions is increasing its lead! Since 2019, Google says “about 440” citations for the old paper, “about 64” for the new one. So the ratio is improving, but still.
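
For what it's worth, the "ratio is improving" claim can be checked directly from the approximate Google counts quoted above (a throwaway sketch, using only the figures in this post):

```python
# Approximate Google Scholar citation counts quoted in the post.
old_since_2017, new_since_2017 = 377, 49   # Kosfeld et al. 2005 vs. Nave et al. 2015
old_since_2019, new_since_2019 = 440, 64

ratio_2017 = old_since_2017 / new_since_2017  # about 7.7 cites of the old paper per cite of the critique
ratio_2019 = old_since_2019 / new_since_2019  # about 6.9

# The absolute gap keeps growing, but the ratio shrinks slightly: "improving, but still."
print(f"{ratio_2017:.1f}x -> {ratio_2019:.1f}x")
```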

I also came across this:

A registered replication study on oxytocin and trust

Carolyn H. Declerck, Christophe Boone, Loren Pauwels, Bodo Vogt & Ernst Fehr

In an influential paper, Kosfeld et al. (2005) showed that intranasal administration of oxytocin (OT) increases the transfers made by investors in the trust game—suggesting that OT increases trust in strangers. Subsequent studies investigating the role of OT in the trust game found inconclusive effects on the trusting behaviour of investors but these studies deviated from the Kosfeld et al. study in an important way—they did not implement minimal social contact (MSC) between the investors and the trustees in the trust game. Here, we performed a large double-blind and placebo-controlled replication study of the effects of OT on trusting behaviour that yields a power of more than 95% and implements an MSC condition as well as a no-social-contact (NoC) condition. We find no effect of OT on trusting behaviour in the MSC condition. Exploratory post hoc analyses suggest that OT may increase trust in individuals with a low disposition to trust in the NoC condition, but this finding requires confirmation in future research.

There’s nothing wrong with people publishing research that turns out to be mistaken. No problem at all. Sometimes you can’t know a path is a dead end until you walk down it.

The problem is not (necessarily) with the original study. The problem is with a scientific culture that doesn’t have a good way of letting go of these mistakes. Like a harbor clotted with sunken vessels.

17 thoughts on ““Like a harbor clotted with sunken vessels”: update”

  1. The problem with counting citations is that we don’t know the polarity of the citation. Maybe all the citations are criticisms. Given academic publishing conventions, wouldn’t most authors citing the reviews also cite the original paper? For example,

    Although the original study [25] that popularized this theory has been the subject of fierce methodological critiques (e.g., [26]),

    That [25] is Kosfeld et al. It didn’t get more interesting after the comma—it was basically saying, hey there might be something here.

    So no such luck. Instead, it’s getting used as ad hominem ammunition for a line of inquiry

    Thus, substantial effects of IN OXT were described on social cognition [138, 140,141,142], fear [143, 144], empathy [145, 146], trust [147], and even xenophobia [148, 149] often accompanied by altered neuronal activity [150].

    Kosfeld et al. is 147. The next citation I looked up concluded its abstract with “enough evidence exists to warrant funding large-scale studies/trials.”

    P.S. What’s with all of this science stuff being paywalled? It’s time for us scientists to stop giving away all of our content and then having our institutions buy it back.

    • On a slight tangent:

      I think it’s absolutely scandalous that academics routinely cite papers that they have read in only the most cursory detail. It seems like there is a game where more citations = victory, and the further back in time they go, the more learned one is. It’s a more polished version of who has the bigger…credentials

      More on topic:

      I agree that there is a massive issue in citation tracking. But not sure how one would clean it up other than via some intermediary one has to trust to track better than a nose count.

      • Allan:

        I’ve been known to cite papers I haven’t carefully read, and the reason I do this is not that I’m playing a game where more citations = victory and the further back in time they go, the more learned one is. Rather, I include citations as an attempt to be fair and give credit, and also to help the reader who wants to trace the ideas backward. I find it very frustrating when academics don’t cite key work that’s come before, thus depriving readers of historical context.

      • > academics routinely cite papers that they haven’t read

        Agreed, this is problematic, but I think it is a symptom of the fact that there is just so much being published and it is impossible to read even just the papers most likely to be relevant to your work. As a result, it is hard to verify what papers say and whether they have any good reason to say it. So academics do what we all do—take shortcuts like relying on the title/abstract, knowledge of the authors, and opinions of others.

        > more citations = victory

        I agree, I think this is a latent belief that many academics have. But it is exacerbated by modern scientific publishing which encourages short reports of isolated studies with little attempt at theoretical integration (compare the popularity of the 2005 oxytocin paper with the 2015 review!). So instead of citing, say, a handful of theoretical/review papers that attempt to synthesize many studies, one ends up citing the hundred individual studies (while potentially ignoring the hundred other isolated studies that come to opposite conclusions).

        > the further back in time they go = more learned one is

        This may be true in some fields, but in data science or machine learning, it seems like just the opposite rule applies. If you’re citing anything more than 5 years old, you’re out of touch and not worth listening to.

        • I strongly agree that review articles seldom get the respect they deserve (if they’re good). I think this is because the implicit audience for most academic work is the small circle of specialists in the topic, and they typically regard themselves as “above” reviews in their field. So citing, much less foregrounding, reviews is a way of saying you are an outsider, not really a member of the club.

          But of course an insightful review that exposes unresolved questions, potential contradictions, unexplored pathways etc. is a high-level contribution that deserves recognition. FWIW, in my own work I always go out of my way to recommend such reviews to readers. And this also means you have to actually read or at least carefully skim the review to determine whether it is truly value-adding.

      • If I’m coming into an issue new (as so often happens when I’ve been asked to look at the econometrics of an issue I don’t know much about) I’d rather have too many citations than too few, and I don’t mind that much if the citations aren’t apt. The author is doing the research that otherwise I’d have no way to do.

        In the old days (mid ’70s) my standard way of learning anything new was to try and find the most recent discussions in journals and work backwards through the references. I still do that and the only big problem is that you can only find the now-forgotten gems by second- and third-order references from the new stuff.

      • To All:

        I didn’t mean to imply that a large number of citations was always indicative of gamesmanship! Indeed, there are many good reasons to have a large number of references (and I take no issue with reference count in and of itself). My qualm is with researchers who pad their reference lists for unsavory reasons. And it has been my impression, at least in the fields that I follow, that the latter is more prevalent than the former. It’s intentional obscurantism, and I view it as unethical behavior.


        All else being equal I agree with you, and I would rather have too many than too few to chase if there were no choice. But we do have a choice: to actually curate our references and at least attempt to describe why we cite them, either explicitly or implicitly by placement. Frequently they are not explicitly described (no blame, given length constraints), but often even the cite’s location leaves you scratching your head as to why it exists. God forbid you actually chase down a reference; frequently doing so makes you even more confused about the cite (or, worse yet, about the author’s take on the reference).


        Agree about ML and other areas that have the apparent trait of progressing fast. In those circumstances the game would be played as you describe it!

      • > But not sure how one would clean it up other than via some intermediary

        I was a bit concerned too, but Bob’s two-sample follow up survey convinced me that this wouldn’t actually be too difficult.

        Like, if even a quarter of the citations since the review came out were positive (at least deferring without question to the original study), that seems like a big problem, right? And I assume it’s a much higher fraction.

  2. Would the conventional laws of gravity & motion still be valid to you if you had never heard of Isaac Newton and his published works?
    Do you know or care who specifically designed the brakes in your automobile? (a very personal life or death issue)

    ‘Tracing ideas backward by name’ is an optional pursuit of historians, but largely unimportant to STEM.

    Citations are mostly an arbitrary social tradition from simpler times of communication, but now archaic for 98.6% of Earth population.
    They still prosper in academia as a ‘coin of the realm’ to help monetize (directly or indirectly) the huge mass of publishing output.

    • Citations are useful in various ways. One not mentioned in the discussion so far is that, if you know the literature in a field, you can tell a lot about an author by whom they cite, or don’t cite. It is particularly revealing when they cite a paper inappropriately.

      • “It is particularly revealing when they cite a paper inappropriately.”

        100% on this. Even more disturbing is when you find they’ve cited YOUR paper inappropriately. (Sometimes I’ve gone down a trail of papers caused by a single paper that cited mine in the wrong context. The subsequent papers probably did not even look at my paper’s abstract. Oh well.)

        Sometimes if I find something particularly surprising I will hunt down the citation to see if I agree with the citing author’s interpretation (especially if I think I have read that paper before). In the case where I find the claim surprising or counter to my prior knowledge (which obviously biases the result), I would estimate about half the time I disagree and a small but non-zero percent of the time the cited paper explicitly declares results opposite of what the citing paper suggests.

        This is a problem that I think is pretty common in sciences that require everyone to have a deep understanding of a large subset of the field, while the research is often done within a very specialized set of circumstances. So you trust others in your specialized section for their interpretations of papers within that overarching conversation. This is where really well-done review articles should come in… but again, if it’s super specialized, how do you know whether the review is really well done? You don’t know who the authorities are, or which ones are representing a viewpoint that may or may not be under dispute.

  3. The first article’s title is pithy and exciting (although incorrect).
    The second’s title is … well what’s the opposite of “click bait”? Is “click repellent” a thing? Need a title like

    “You — Yes, You — Disproved the 2005 Paper By Trusting It Even Though You’re Not Snorting Oxytocin (Probably)”

    Other thoughts on a better title?

  4. I’ve been using PubPeer to keep up with “flagged” research – do others do this as well? For example, this Kosfeld paper came up flagged for me on ResearchGate. Is this widely used? If so, it would improve their results.
