This study could be just fine, or not. Maybe I’ll believe it if there’s an independent preregistered replication.

David Allison sent along this article, “Sexually arousing ads induce sex-specific financial decisions in hungry individuals,” by Tobias Otterbring and Yael Sela, and asked whether I buy it.

I replied that maybe I’ll believe it if there’s an independent preregistered replication. I’ve just seen too many of these sorts of things to ever believe them at first sight.

Allison responded:

My intuition agrees with yours. The thing that initially caught my eye was a study with high human interest appeal of an evolutionary psychology finding of the type that some have described as “just so stories.” I too have published such ‘just so stories’ and many people (including me) are drawn to them, but lately questions about their robustness and replicability have been raised. For a recent example, see here.

The second thing that caught my eye was that the key findings involve a high-order interaction. Higher order interactions of course can be real, and even prespecified, but when a finding is, by definition, so “conditionally dependent” it raises the question of whether this is perhaps a chance finding due, in part, to the often inherent multiple testing involved in subgrouping analyses and higher order interactions. In addition, interaction tests often have low power, which under reasonable assumptions tends to increase the proportion of statistically significant findings that are false positives.

In a further look at the paper, I see no statement that the study was pre-registered. I also note that there seem to be post hoc data analytic decisions made which, as you have described in your paper on “a garden of forking paths,” may also lead to non-replicable findings. There is no statement that the assignment of subjects to conditions was random. An ad hoc measure of hunger was used when there are standard pre-existing measures available (e.g., here: https://www.ncbi.nlm.nih.gov/pubmed/3981480). Better still, it would have been relatively easy to randomize subjects to either skip breakfast and lunch for a day or not, which would have been a more valid approach for assessing the causal effects of hunger.

When all of these factors are put together, it raises skepticism. None of these factors mean that the finding is wrong or that the study is not cool, interesting, and well executed and honestly reported, but it does support the intuition that the result has a low subjective probability of replication.
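The two statistical points in Allison's note, multiple testing and low power, can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions: the tests are treated as independent, and the values for alpha, power, and the prior probability that a tested effect is real are hypothetical numbers chosen for the example, not figures taken from the paper.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent
    tests when every null hypothesis is true (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def positive_predictive_value(power: float, alpha: float, prior: float) -> float:
    """Share of statistically significant results that reflect a real effect.

    prior is the assumed fraction of tested hypotheses that are actually true.
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # More subgroup or interaction tests means more chances for a fluke.
    for n_tests in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, n_tests)
        print(f"{n_tests:2d} tests: P(at least one false positive) = {rate:.3f}")

    # Lower power means a significant result is less likely to be real.
    for power in (0.8, 0.5, 0.2):
        ppv = positive_predictive_value(power, alpha=0.05, prior=0.1)
        print(f"power {power:.1f}: P(real effect | significant) = {ppv:.3f}")
```

Under these assumed numbers, dropping power from 0.8 to 0.2 cuts the chance that a significant result is real roughly in half, even though the per-test alpha never changes.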

18 thoughts on “This study could be just fine, or not. Maybe I’ll believe it if there’s an independent preregistered replication.”

  1. From the paper:

    “Historically, women have faced greater pressure than men to inhibit potentially maladaptive sexual responses because pregnancy has a much higher biological cost for women in terms of parental investment (Bjorklund & Kipp, 1996; Trivers, 1972). Our findings suggest that exposure to sex cues not only enhances women’s inhibitory responses in the sexual domain, but may also improve their ability to exert self control in other domains and make patient financial decisions. In ancestral times, it should have been more harmful for women to engage in uncommitted sex when the availability of nutritious foods was scarce, given the vast metabolic costs associated with a pregnancy. Conversely, resource scarcity has been shown to trigger more shortsighted behaviors in men, presumably in a response designed to increase the chances of passing their genes on before it is too late (Wilson & Daly, 1985).”

    I am surprised to see these old papers so credulously cited in this day and age. This stuff has a distinct 1980s feel to it. The term “just so stories” goes back to Rudyard Kipling, but the term was derisively applied to evolutionary psychology by Stephen Jay Gould. The quoted text is a perfect example. The basic premise is that if a trait can be described, even a slight behavioral preference, then it must have been selected for according to the rules of survival of the fittest. The problem is that the better you understand the complexity of evolutionary selection, the less likely you are to believe the premise. Gould in particular had a very nuanced understanding of just how messy selection can be. So even setting all the statistical issues aside, papers like this are built upon premises that themselves are at best highly controversial, and at worst highly dubious.

    • “The basic premise is that if a trait can be described, even a slight behavioral preference, then it must have been selected for according to the rules of survival of the fittest.”

      To me, “survival of the fittest” does a very poor job of describing what evolution really is. The phrase “survival of the fit enough to have survived up to a certain point” is a better description of what actually happens in evolution.

    • They all need to reread Darwin. Most traits have to be neutral at any given moment. That is a necessary premise for evolution to work. There have to be variants that are neutral before the extinction event but beneficial under that selective pressure. If every trait were optimized to a particular environment, then a change to that environment would decimate that species. The presumption should be when we see variation within a species that most of it is neutral (maybe it was subject to evolutionary pressure in the past, but if so, why isn’t the trait pervasive?). All of the evolutionary stories about male/female difference strike me as contrived. When we put our preconceptions to the side, men and women don’t seem that different. We now have women and men in the same jobs and roles. Where differences are minor, why would we look to natural selection as an explanation? There likely was selective pressure at some point that explains why we have five fingers because we all have five fingers, but do people believe that blue eyes are the result of natural selection?

      • All of the evolutionary stories about male/female difference strike me as contrived. When we put our preconceptions to the side, men and women don’t seem that different.

        I love this debate tactic. State something as your personal viewpoint and ask the other side to convince you otherwise. Of course, there is never quite enough evidence to convince you otherwise. The ingenuity of the ploy is that it makes you the arbiter of what is convincing and what is not.

        I think of this as the “Invincible Ignorance” gambit.

        The reproductive differences between women and men are so small that I can barely see them. Convince me otherwise.

        Reproduction is almost completely unrelated to evolution. Convince me otherwise.

        • Certainly, I wasn’t talking about reproductive differences. As I said, I am talking about variation within the species. If men and women are almost always different in some trait, it seems likely that that trait is the result of selective pressure, which reproductive differences may well be, but a lot of claims that I have seen (I agree that is just a personal observation) are about somewhat small differences between men and women in some social trait that is measured by some noisy test, like the claim above that women are better at thinking long term than men. It doesn’t seem like a big difference to me. Lots of men can engage in long term planning, so why would we assume that this difference is the result of selective pressure? If there were selective pressure, it would eliminate the women who are not as good at long term planning as men, and we would see a big disparity, not a little one.

        • I agree that evolutionary psychology, like many areas, has many papers that try to extend the paradigm too far and end up making trivial and dubious observations based on noise. The question is whether the field has anything of value to teach us. Beggar that I am, I’ll settle for just a few nuggets of value.

        • The question with evolutionary psych is whether having “many papers that try to extend the paradigm too far and end up making trivial and dubious observations based on noise” is a feature or a bug. I tend to believe it is a bug, but I am open to being proved wrong.

      • “In ancestral times, it should have been more harmful for women to engage in uncommitted sex when the availability of nutritious foods was scarce, given the vast metabolic costs associated with a pregnancy. Conversely, resource scarcity has been shown to trigger more shortsighted behaviors in men, presumably in a response designed to increase the chances of passing their genes on before it is too late (Wilson & Daly, 1985).”

        The reason this is a “just-so” story is that it’s non-randomly selecting a very small set of the factors that impact human sexual behavior (the factors the authors tested for or have easy access to some bit of information for) and creating a story that fits only those factors, ignoring tens, hundreds or possibly even thousands of other relevant factors both known and unknown. Furthermore, like many narratives involving pre-history, it’s projecting them back into an environment we know almost nothing about. This is a common feature of earth disaster stories as well.

      • “They all need to reread Darwin.”

        I want to plug a different book, Oliver Sacks’ “Island of the Colorblind.” This fascinating book really brings home how complicated selection can be, when we see that historical contingency is powerful enough to cause a severely deleterious genetic mutation to increase significantly in an isolated population. It should be mandatory reading for all evolutionary psychologists!

  2. I tried to read the article but couldn’t get past the addition of hunger. Talk about confounding dimensions! You can’t unravel that to well-ordered. Now if they’d added, ‘on Wednesday’, then I’d be all in because everyone knows that hot bods on hump day makes people want to smoke Camels (one or two humps), and that urge to smoke can’t be expressed in the modern work place so they instead do …

  3. Ask a psychologist to measure a simple everyday thing and this is what you get:

    “Next, all participants replied to six items meant to measure their
    level of hunger (1 = strongly disagree; 7 = strongly agree). An exploratory factor analysis with varimax rotation revealed that only four
    items loaded on a common hunger factor. These items (Right now, I feel
    very hungry; It feels like I have good appetite right now; I need to do
    something about my hunger; I would like to have something appetizing
    right now) had an eigenvalue of 3.39, explained 56.52% of the variance, and did not contain any cross-loadings above 0.40. Thus, they
    were averaged to create a hunger index (α = 0.88; M = 3.65,
    SD = 1.86; skewness: 0.33, kurtosis: −1.04)”

    I can just imagine the day-to-day hassle, having to conduct all those factor analyses to determine if their spouse _really_ is hungry, and what questions _possibly_ could correlate with that pesky latent feature that is otherwise known as hunger.

  4. “Results indicate that exposure to sexually arousing (vs. neutral or no) ads makes men more financially impatient than women.”

    Is this controversial or even open to question? :)

    An attractive scantily clad woman can sell a shoelace to a man for fifty bucks. But not even ads for ED products directed at women try to use sexual allure to sell to women – it’s all about how good they’ll feel the next day (good morning, good morning!)! Hilarious.
