I pointed Steven Pinker to my post, “How much time (if any) should we spend criticizing research that’s fraudulent, crappy, or just plain pointless?”, and he responded:
Clearly it *is* important to call out publicized research whose conclusions are likely to be false. The only danger is that it’s so easy and fun to criticize, with all the perks of intellectual and moral superiority for so little cost, that there is a moral hazard to go overboard and become a professional slasher and snarker. (That’s a common phenomenon among literary critics, especially in the UK.) There’s also the risk of altering the incentive structure for innovative research, so that researchers stick to the safest kinds of paradigm-twiddling. I think these two considerations were what my late colleague Dan Wegner had in mind when he made the bumbler-pointer contrast — he himself was certainly a discerning critic of social science research. [Just to clarify: Wegner is the person who talked about bumblers and pointers but he was not the person who sent me the email characterizing these as “our only choices in life.”—AG.]
The other comment is that I don’t think that evolutionary psychology is a worse offender at noise-mining than social psychology in general. Quite the contrary, the requirement that a psychological mechanism enhance reproductive success in a pre-modern environment at least imposes a modicum of aprioricity on hypotheses, which is entirely lacking in non-evolutionary (and defiantly atheoretical) social psychology. The worry that you can spin scientifically respectable evolutionary hypotheses post hoc for any finding is, in my view, greatly exaggerated. The Griskevicius finding may be wrong, for all the usual reasons, but the hypothesis is well motivated by prior theory and research.
To which I replied:
I think there are 3 things going on:
1. The science. As Lakatos and other philosophers of science have emphasized, any real scientific theory will make all sorts of predictions. The mapping of theory to prediction is a messy and necessary part of science. So a theory can be valid even if it is difficult to test. Indeed, part of the reason for testing a theory is often not to confirm or dispute the theory’s validity but to refine the theory.
2. Data collection. The studies by Griskevicius etc. have an extremely low ratio of signal to noise. Variability is high, measurements are crude, comparisons are performed between subjects, and this is all with a background of small effects that vary in sign and magnitude. As a result, the studies provide essentially zero information about the theory.
3. Multiple comparisons. Multiple comparisons come in because they explain how researchers such as Bem, Griskevicius, etc., manage to consistently find statistical significance (typically, many statistically significant comparisons in a single study) even though their noise level is so high. Multiple comparisons is the answer, and the point of our garden of forking paths paper is to explain how this problem can arise even for studies that are well motivated by substantive theory.
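Points 2 and 3 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of any actual study: the true effect is set to exactly zero, each hypothetical study has 20 people per group, the noise standard deviation is treated as known, and the "forking paths" are crudely stood in for by ten independent comparisons from which the smallest p-value is taken. All function names and parameter values here are made up for illustration.

```python
import math
import random
import statistics


def two_sided_p(z):
    # Two-sided p-value for a standard normal z statistic.
    return math.erfc(abs(z) / math.sqrt(2))


def one_comparison(rng, n, effect, sd):
    # One between-subject comparison: two groups of n noisy measurements.
    treated = [rng.gauss(effect, sd) for _ in range(n)]
    control = [rng.gauss(0.0, sd) for _ in range(n)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = sd * math.sqrt(2.0 / n)  # assumption: sd is known, so a z-test applies
    return two_sided_p(diff / se)


def share_with_a_finding(n_studies=2000, n=20, k=10, effect=0.0, sd=1.0, seed=1):
    # Fraction of simulated studies in which at least one of k
    # available comparisons reaches p < 0.05.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [one_comparison(rng, n, effect, sd) for _ in range(k)]
        if min(p_values) < 0.05:
            hits += 1
    return hits / n_studies


single = share_with_a_finding(k=1)
many = share_with_a_finding(k=10)
print(f"one pre-chosen comparison: {single:.2f}")
print(f"best of ten comparisons:   {many:.2f}")
```

With a single pre-chosen comparison the share of "significant" studies should sit near the nominal 5%. With ten independent comparisons to choose from, it should be close to 1 - 0.95^10, or roughly 40%, even though there is nothing to find. In the forking-paths setting the choices need not be made explicitly as ten separate tests, so this best-of-ten rule is a deliberate simplification.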
In short, my claim is not that the theories of Griskevicius etc. are wrong (about that, I have no idea) and my central criticism of them is not data-mining and multiple comparisons. Rather, my problem is that the study design is such that the data provide essentially no information about the science. I’d have no problem with the theory being presented as such; my problem is with the incorrect (in this case) claims that the data add anything to the story.
Regarding the incentive structure, I fear that the current lack of incentives to criticize gives researchers an incentive to do small noisy studies, which they can then sometimes publish in places such as Psychological Science. I would love it if the incentives were to change so that researchers would put more effort into careful measurement and design!