In a discussion of yesterday’s post on studies that don’t replicate, Nick Brown did me the time-wasting disservice of pointing out a recent press release from Psychological Science which, as you might have heard, is “the highest ranked empirical journal in psychology.”
The press release is called “Feeling Blue and Seeing Blue: Sadness May Impair Color Perception” and it describes a recently published article by Christopher Thorstenson, Adam Pazda, and Andrew Elliot, which reports that “sadness impaired color perception along the blue-yellow color axis but not along the red-green color axis.”
Unfortunately the claim of interest is extremely difficult to interpret, as the authors do not seem to be aware of the principle that the difference between “significant” and “not significant” is not itself statistically significant. Finding that the blue-yellow effect clears the p < .05 threshold while the red-green effect does not is not evidence that the two effects differ; for that, you’d need to compare the two effects to each other directly.
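To make the principle concrete, here’s a minimal numerical sketch in Python. The effect sizes and standard errors are made up for illustration, not taken from the paper: one comparison clears p < .05, the other doesn’t, yet the difference between the two effects is nowhere near significant.

```python
# A minimal sketch of the "significant vs. not significant" point,
# with made-up numbers (illustrative, not the paper's actual estimates).
import math
from scipy.stats import norm

def two_sided_p(est, se):
    """Two-sided p-value for a normal-theory test of est against zero."""
    return 2 * norm.sf(abs(est / se))

# Hypothetical effects of sadness on the two color axes, same standard error.
est_by, se_by = 25.0, 10.0   # blue-yellow: z = 2.5, "significant"
est_rg, se_rg = 10.0, 10.0   # red-green:   z = 1.0, "not significant"

print(f"blue-yellow effect: p = {two_sided_p(est_by, se_by):.3f}")  # ~0.012
print(f"red-green effect:   p = {two_sided_p(est_rg, se_rg):.3f}")  # ~0.317

# The comparison the claim actually rests on: the difference between effects.
diff = est_by - est_rg
se_diff = math.sqrt(se_by**2 + se_rg**2)  # independent SEs add in quadrature
print(f"difference:         p = {two_sided_p(diff, se_diff):.3f}")  # ~0.289
```

So one axis being “significant” and the other not tells you almost nothing about whether the two axes actually differ; the direct comparison is the test that matters.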
The paper also exhibits other characteristic features of Psychological Science-style papers, including small samples of college students, lots of juicy researcher degrees of freedom in data-exclusion rules and in the choice of outcomes to analyze, and weak or vague theoretical support for the reported effects.
The theoretical claim was “maybe a reason these metaphors emerge was because there really was a connection between mood and perceiving colors in a different way,” which could be consistent with almost any connection between color perception and mood. And then once the results came out, we get this: “‘We were surprised by how specific the effect was, that color was only impaired along the blue-yellow axis,’ says Thorstenson. ‘We did not predict this specific finding, although it might give us a clue to the reason for the effect in neurotransmitter functioning.'” This is of course completely consistent with a noise-mining exercise, in that just about any pattern can fit the theory, and then the details of whatever random thing comes up are likely to be a surprise.
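To get a feel for how often pure noise hands you exactly this sort of “specific” finding, here’s a rough simulation sketch. The two-axis setup and group size are placeholders I made up, not the study’s actual design:

```python
# A rough pure-noise simulation: two outcome axes, zero true effect on both.
# How often does exactly one axis come up "significant" at p < .05?
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
n_sims, n_per_group = 10_000, 65  # group size is a made-up placeholder

def exactly_one_significant():
    hits = 0
    for _ in range(2):  # two independent axes (say, blue-yellow, red-green)
        sad = rng.normal(size=n_per_group)
        neutral = rng.normal(size=n_per_group)
        hits += ttest_ind(sad, neutral).pvalue < 0.05
    return hits == 1

rate = np.mean([exactly_one_significant() for _ in range(n_sims)])
print(f"exactly one 'significant' axis: {rate:.1%}")  # ~9.5% by chance alone
```

And that roughly 1-in-10 figure is before allowing for exclusion rules, alternative outcome measures, and the rest of the forking paths, each of which multiplies the chances that some publishable pattern turns up.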
It’s funny: it’s my impression that, when a scientist reports that his findings were a surprise, that’s supposed to be a positive thing. It’s not just turning the crank, it’s scientific discovery! A surprise! Like penicillin! Really, though, if something was a surprise, maybe you should take more seriously the possibility that you’re just capitalizing on chance, seeing something in one experiment and then being motivated to find it in another. It’s the scientific surprise two-step, a dance discussed by sociologist Jeremy Freese.
As usual in such settings, I’m not saying that Thorstenson et al. are wrong in their theorizing, or that their results would not show up in a more thorough study on a larger sample. I’m just saying that they haven’t really made a convincing case, as the patterns they find could well be explainable by chance alone. Their data offer essentially no evidence in support of their theory, but the theory could still be correct, just unresolvable amid the experimental noise. And, as usual, I’ll say that I’d have no problem with this sort of result being published, just without the misplaced certainty. And, if the editors of Psychological Science think this sort of theorizing is worth publishing, I think they should also be willing to publish the same thing, even if the comparisons of interest are not statistically significant.
OK, on to the main event. After Nick alerted me to this paper, I thought I should post something on it. But my post needed a title. Here were the titles I came up with:
“Feeling Blue and Seeing Blue: Desire for a Research Breakthrough May Impair Statistics Perception”
“Stop me before I blog again”
“The difference between ‘significant’ and ‘not significant’ is enough to get published in the #1 journal in psychology”
“They keep telling me not to use ‘Psychological Science’ as a punch line but then this sort of thing comes along”
“This week in Psychological Science”
But maybe you have a better suggestion?
Winner gets a free Stan sticker.
P.S. We had another one just like this a few months ago.
P.P.S. I have nothing against Christopher Thorstenson, Adam Pazda, or Andrew Elliot. I expect they’re doing their best. It’s not their fault that (a) statistical methods are what they are, (b) statistical training is what it is, and (c) the editors of Psychological Science don’t know any better. It’s all too bad, but it’s not their fault. I laugh at these studies because I’m too exhausted to cry, that’s all. And, before you feel too sorry for these guys or for the editors of Psychological Science or think I’m picking on them, remember: if they didn’t want the attention, they didn’t need to publish this work in the highest-profile journal of their field. If you put your ideas out there, you have to expect (ideally, hope) that people will point out what you did wrong.
I’m honestly surprised that Psychological Science is still publishing this sort of thing. They’re really living up to their rep, and not in a good way. PPNAS, I expect, will publish just about anything, as it’s not peer-reviewed in the usual way. But Psych Science is supposed to be a real journal, and I’d expect, or at least hope for, better from them.
P.P.P.S. Lots of great suggestions in the comments, but my favorite is “Psychological Science publishes another Psychological Science-type paper.”
P.P.P.P.S. I feel bad that the whole field of psychology gets tainted by this sort of thing. The trouble is that Psychological Science is the flagship journal of the Association for Psychological Science, which I think is the main society for psychology research. The problem is not haters like me who draw attention to these papers; the problem is that this sort of work is regularly endorsed and publicized by the leading journal in the field. When the Association for Psychological Science regularly releases press releases touting this kind of noise study, it does tell us something bad about the field of psychology. Not about all the work in the field, not most of the work in the field, not the most important work in the field. Psychology is important and I have huge respect for many psychology researchers. Indeed I have huge respect for much of the research within statistics that has been conducted by psychologists. And I say, with deep respect for the field, that it’s bad news that its leading society publicizes work that is not serious and has huge, obvious flaws: flaws that might not have been obvious 10 or even 5 years ago, when most of us were not so aware of the problems associated with the garden of forking paths, but which for the past couple of years have been widely known. They should know better; indeed I’d somehow thought they’d cleaned up their act, so I was surprised to see this new paper, front and center in their leading journal.