Someone writes in:
In the most recent absurd embodiment paper on wobbly stools leading to wobbly relationship beliefs, Psych Sci asked for a second study with a large N. The authors performed it, found no effect and then performed a mediation analysis to recover the effect. It’s a good example for garden of forking paths given that the mediation analysis is decided on post hoc and there are a number of ways to approach the problem.
No kidding! The article, by Amanda Forest, David Kille, Joanne Wood, and Lindsay Stehouwer, is called, “Turbulent Times, Rocky Relationships: Relational Consequences of Experiencing Physical Instability.” Almost a parody of a Psych Science tabloid-style paper. From the abstract:
Drawing on the embodiment literature, we propose that experiencing physical instability can undermine perceptions of relationship stability. Participants who experienced physical instability by sitting at a wobbly workstation rather than a stable workstation (Study 1), standing on one foot rather than two (Study 2), or sitting on an inflatable seat cushion rather than a rigid one (Study 3) perceived their romantic relationships to be less likely to last. . . . These findings indicate that benign physical experiences can influence perceptions of relationship stability, exerting downstream effects on consequential relationship processes.
This is no joke. Here’s how the paper begins:
The earthquake that struck Sichuan, China, in 2008 made headlines not only because of the tremendous loss of life it caused, but also because after the quake, Sichuan came to lead the country in number of divorces (Zhiling, 2010). Experts and popular media outlets made causal claims (e.g., “Earthquake Boosts Divorce Rate,” 2010). If the earthquake truly caused changes in Sichuan’s divorce rate, why might this be? Emotional distress, financial hardship, and mortality salience may well contribute. Sociologist Guang Wei speculated that Sichuan residents “decided to live each day to the fullest . . . if they do not get along with their spouses, they decide to part ways” (Zhiling, 2010, paras. 9–10). We examine a different feature of earthquakes that may affect relationships: physical instability.
Don’t get them wrong, though. They very graciously admit to not having the whole story:
Certainly, the shaking ground was not solely responsible for the change in the divorce rate in Sichuan.
That’s a relief!
Getting to my correspondent’s criticisms, yes, lots of forking paths in preparing the dataset:
Data for 54 participants were collected. Because our main measure was perceived stability of a person’s relationship with a particular partner, only data from participants in exclusive romantic relationships were included in the analyses (36 exclusively dating, 8 cohabiting, 2 engaged, and 1 married) . . . Data from 3 participants—1 who stood instead of sitting and 2 who communicated with friends while completing the questionnaire—were omitted from analyses. . . . Participants responded to six items (α = .96) regarding the stability of their current romantic relationship. These included the four items from Study 1 as well as similar items querying confidence in still being together in 10 and 20 years. . . . Data from 4 participants—1 who reported not having followed the posture instructions and 3 who correctly guessed the study hypothesis—were omitted from analyses. . . .
And forking paths in the analysis:
We averaged participants’ ratings of their beliefs that they would remain with their partners over each of the four time periods in the items on relationship stability. . . . Mediation analysis using PROCESS Model 4 revealed a significant indirect effect of condition on reports of relationship quality via perceived relationship stability . . .
Lots and lots of mediation analyses. But what happened to the main effect in the replications?
The physical-stability manipulation used in this study did not produce significant condition differences in negative affect. . . . Contrary to prediction, a one-way ANOVA yielded no evidence of a direct effect of condition on perceived relationship stability.
And our old friend, the difference between significant and non-significant:
Participants’ experience of negative affect did not differ between the two conditions, F < 1, which suggests that mood is not a viable alternative explanation for the observed condition differences. . . . A model in which the order of perceived relationship stability and relationship quality was reversed did not yield a significant indirect effect . . .
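To make the "significant versus non-significant" point concrete, here is a minimal arithmetic sketch. The numbers are hypothetical and are not taken from the paper: one comparison with an estimate of 25 and a standard error of 10, another with an estimate of 10 and the same standard error. The sketch assumes the two estimates are independent and approximately normal. The function names `two_sided_p` and `compare` are made up for this illustration.

```python
from statistics import NormalDist


def two_sided_p(estimate, std_error):
    """Two-sided p-value for a normal estimate tested against zero."""
    z = estimate / std_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def compare(est_a, se_a, est_b, se_b):
    """P-values for each estimate and for the difference between them.

    Assumes the two estimates are independent, so their variances add.
    """
    diff = est_a - est_b
    se_diff = (se_a ** 2 + se_b ** 2) ** 0.5
    return {
        "p_a": two_sided_p(est_a, se_a),
        "p_b": two_sided_p(est_b, se_b),
        "p_diff": two_sided_p(diff, se_diff),
    }


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    result = compare(25.0, 10.0, 10.0, 10.0)
    for name, p in result.items():
        print(name, round(p, 3))
```

Under these assumptions the first estimate clears the .05 threshold and the second does not, yet the difference between them (15, with a standard error of about 14) is nowhere near significant. Reading one "significant" and one "non-significant" result as evidence that the two differ is not supported by the arithmetic.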
And good old “marginally significant”:
Although we observed only a marginally significant effect of posture condition on perceived relationship stability, it is widely accepted that indirect (i.e., mediated) effects can be examined even in the absence of any direct link between a predictor and outcome.
The research team found a newsworthy result which did not appear in the replications. But that didn’t stop them from doing some mediation analyses and finding some statistical significance and some non-significance in various places along their forking paths. They wove this together and wrote it up as if they’d discovered something important.
Let’s check the score. Again, from the abstract:
Participants who experienced physical instability by sitting at a wobbly workstation rather than a stable workstation (Study 1), standing on one foot rather than two (Study 2), or sitting on an inflatable seat cushion rather than a rigid one (Study 3) perceived their romantic relationships to be less likely to last.
Is this true? For study 1, yes: after all their choices in data construction and data analysis, they achieved “p less than .05” (p = .034, to be precise). For study 2, despite their flexibility in excluding people and defining the outcome, they were only able to get p down to .069. For study 3, nothing at all, “F less than 1,” as they put it. And they really did have lots of chances to win: they bought lots of tickets for the “p less than .05” lottery. For example:
For our behavioral measure, participants were asked to select and send a “thinking of you” electronic greeting card (e-card) to their romantic partners. Each participant chose an e-card design from six choices that had been prerated and selected to vary in intimacy (for details, see Supplemental Material). The intimacy of the card design selected was one outcome of interest. However, we observed no direct or indirect effects of stability condition on card design intimacy, so we do not discuss it further.
The authors get full credit for reporting this—but no credit for realizing what this sort of thing does to their analysis! They consistently report their successes in detail and downplay the null findings. That’s called capitalizing on chance.
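The lottery metaphor can be illustrated with a small simulation. This is a sketch under loudly simplified assumptions, not a reconstruction of the paper's analyses: there is no true effect, each "ticket" is an independent two-group comparison with 25 people per group (a hypothetical sample size), and the outcome has known unit variance so that a simple z-test is exact. In real forking paths the alternative analyses are correlated, so the inflation would typically be smaller than the independent case shown here, but the direction is the same.

```python
import random
from statistics import NormalDist, fmean


def p_value_two_groups(n_per_group, rng):
    """Two-sided z-test p-value for a difference in means with no true effect.

    Both groups are drawn from N(0, 1), so the known-variance z-test applies.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    z = (fmean(a) - fmean(b)) / (2.0 / n_per_group) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def chance_of_a_win(n_tickets, n_per_group=25, n_sims=4000, seed=1):
    """Share of simulated null studies where at least one comparison has p < .05."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        p_values = [p_value_two_groups(n_per_group, rng) for _ in range(n_tickets)]
        if min(p_values) < 0.05:
            wins += 1
    return wins / n_sims


if __name__ == "__main__":
    # Columns: tickets, simulated rate, analytic rate 1 - 0.95**k for independent tests.
    for k in (1, 3, 5, 10):
        print(k, round(chance_of_a_win(k), 3), round(1 - 0.95 ** k, 3))
```

With one ticket the chance of a "win" is the nominal 5%. With five independent tickets the analytic rate is 1 - 0.95^5, or about 23%, even though nothing is going on. Reporting the winners in detail and setting the losers aside is exactly what makes the nominal p-values uninterpretable.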
Published in Psychological Science: if we reward this sort of research behavior, I see no reason we won’t get lots more of it. I have no reason to think the authors and journal editor are trying to mislead anyone; rather, I’m guessing they’re true believers. They did their own replication and it failed. But they did not do the next step and place their theory and methods under criticism. Too bad.