Social science research has been getting pretty bad press recently, what with the Excel buccaneers who didn’t know how to handle data with different numbers of observations per country, and the psychologist who published dozens of papers based on fabricated data, and the Evilicious guy who wouldn’t let people review his data tapes, etc etc. And that’s not even considering Dr. Anil Potti.
On the other hand, the revelation of all these problems can be taken as evidence that things are getting better. Psychology researcher Gary Marcus writes:
There is something positive that has come out of the crisis of replicability—something vitally important for all experimental sciences. For years, it was extremely difficult to publish a direct replication, or a failure to replicate an experiment, in a good journal. . . . Now, happily, the scientific culture has changed. . . . The Reproducibility Project, from the Center for Open Science is now underway . . .
And sociologist Fabio Rojas writes:
People may sneer at the social sciences, but they hold up as well. Recently, a well known study in economics was found to be in error. People may laugh because it was an Excel error, but there’s a deeper point. There was data, it could be obtained, and it could be replicated. Fixing errors and looking for mistakes is the hallmark of science. . . .
I agree with Marcus and Rojas that attention to problems of replication is a good thing. It’s bad that people are running incompetent analyses or faking data all over the place, but it’s good that they’re getting caught. And, to the extent that scientific practices are improving to help detect error and fraud, and to reduce the incentives for publishing erroneous and fraudulent results in the first place, that’s good too.
But I worry about a sense of complacency. I think we should be careful not to overstate the importance of our first steps. We may be going in the right direction but we have a lot further to go. Here are some examples:
1. Marcus writes of the new culture of publishing replications. I assume he’d support the ready publication of corrections, too. But we’re not there yet, as this story indicates:
Recently I sent a letter to the editor of a major social science journal pointing out a problem in an article they’d published. They refused to publish my letter, not because of any argument that I was incorrect, but because they judged my letter to not be in the top 10% of submissions to the journal. I’m sure my letter was indeed not in the top 10% of submissions, but the journal’s attitude presents a serious problem, if the bar to publication of a correction is so high. That’s a disincentive for the journal to publish corrections, a disincentive for outsiders such as myself to write corrections, and a disincentive for researchers to be careful in the first place. Just to be clear: I’m not complaining about how I was treated here; rather, I’m griping about the system in which a known error can stand uncorrected in a top journal, just because nobody managed to send in a correction that’s in the top 10% of journal submissions.
2. Rojas writes of the notorious Reinhart and Rogoff study that, “There was data, it could be obtained, and it could be replicated.” Not so fast:
It was over two years before those economists shared the data that allowed people to find the problems in their study. If the system really worked, people wouldn’t have had to struggle for years to try to replicate an unreplicable analysis.
And, remember, the problem with that paper was not just a silly computer error. Reinhart and Rogoff also made serious mistakes handling their time-series cross-sectional data.
3. Marcus writes in a confident tone about progress in methodology: “just last week, Uri Simonsohn [and Leif Nelson and Joseph Simmons] released a paper on coping with the famous file-drawer problem, in which failed studies have historically been underreported.” I think Uri Simonsohn is great, but I agree with the recent paper by Christopher Ferguson and Moritz Heene that the so-called file-drawer problem is not a little technical issue that can be easily cleaned up; rather, it’s fundamental to our current practice of statistically-based science.
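To make the file-drawer point concrete, here is a minimal simulation sketch. It is not taken from any of the papers mentioned above. It assumes a small true effect of 0.1, the same standard error of 0.5 for every study, and a rule that only estimates more than 1.96 standard errors from zero get published. All of these numbers, and the function name simulate_file_drawer, are made up for illustration.

```python
import random
import statistics


def simulate_file_drawer(true_effect=0.1, std_error=0.5, n_studies=10_000, seed=0):
    """Simulate many studies of one effect and 'publish' only the significant ones.

    All parameter values are illustrative assumptions, not estimates from any
    real literature.
    """
    rng = random.Random(seed)
    # Each study yields a noisy estimate of the same underlying effect.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # The file drawer: only estimates with |z| > 1.96 make it into print.
    published = [e for e in estimates if abs(e) / std_error > 1.96]
    return {
        "share_published": len(published) / n_studies,
        "mean_all": statistics.mean(estimates),
        "mean_abs_published": statistics.mean(abs(e) for e in published),
    }


result = simulate_file_drawer()
print(result)
```

Under these assumptions only a small fraction of studies clear the bar, and every one that does must have an estimate of at least 0.98 in absolute value (1.96 times the standard error), roughly ten times the assumed true effect. Averaging the published estimates cannot undo that, which is one way of seeing why the problem is not a small technical issue.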
And there’s pushback. Biostatisticians Leah Jager and Jeffrey Leek wrote a paper, which I strongly disagree with, called “Empirical estimates suggest most published medical research is true.” I won’t go into the details here—my take on their work is that they’re applying a method that can make sense in the context of a single large study but which won’t generally work with meta-analysis—my point is that there remains a constituency for arguments that science is basically OK already.
I respect the view of Marcus, Rojas, Jager, Leek, and others that the current environment of criticism has in some ways gone too far. All those people do serious, respected research, and those of us who do serious research know how difficult it can be to publish in good journals, how hard we work—out of necessity—to consider all possible alternative explanations for any results we find, how carefully we document the steps of our data collection and analysis, and so forth. But many problems still remain.
Thomas Basbøll analogizes the difficulties of publishing scientific criticism to problems with the subprime mortgage market before the crash. He quotes Michael Lewis:
To sell a stock or bond short you need to borrow it, and [the bonds they were interested in] were tiny and impossible to find. You could buy them or not buy them but you couldn’t bet explicitly against them; the market for subprime mortgages simply had no place for people in it who took a dim view of them. You might know with certainty that the entire mortgage bond market was doomed, but you could do nothing about it.
And now here’s Basbøll:
I had a shock of recognition when I read that. I’ve been trying to “bet against” a number of stories that have been told in the organization studies literature for years now, and the thing I’m learning is that there’s no place in the literature for people who take a dim view of them. There isn’t really a genre (in the area of management studies) of papers that only points out errors in other people’s work. You have to make a “contribution” too. In a sense, you can buy the stories people are telling you or not buy them but you can’t criticize them.
This got me thinking about the difference between faith and knowledge. Knowledge, it seems to me, is a belief held in a critical environment. Faith, we might say, is a belief held in an “evangelical” environment. The mortgage bond market was an evangelical environment in which to hold beliefs about housing prices, default rates, and credit ratings on CDOs. There was no simple way to critique the “good news” . . .
Eventually, as Lewis reports, people were able to bet against the subprime mortgage market, but it wasn’t easy. And the fact that some investors, with great difficulty, were able to do it, doesn’t mean the financial system is A-OK.
Basbøll’s analogy may be going too far, but I agree with his general point that the existence of a few cases of exposure should not make us complacent. Marcus’s suggestions on cleaning up science are good ones, and we have a ways to go before they are generally implemented.
P.S. Coincidentally, Jeff Leek posted something today on the same topic, but with a slightly different perspective (he refers to “the current over-pessimism about science”). Leek argues, reasonably enough, “that people are using a few high-profile cases to hyperventilate about the real, solvable, and recognized problems in the scientific process” and he worries that “the rational reasonable problems we have, with enough hyperbole, will make it look like the scientific process ‘sky is falling’” and lend support to political attacks on science more generally. I think Jeff and I should be able to agree to the following:
– Science is hard, we all make mistakes, the system has problems but all human systems have problems, in working to fix these problems we shouldn’t throw the research baby out with the bathwater that is the changing rules of scientific communication.
– We’re not there yet, we still live in a world in which it’s easier to publish and hype an elaborate flawed claim than to report a simple correction, a world in which data sharing is far from the norm, and where social and statistical biases lead to systematic overreporting of dramatic claims and systematic overestimation of effect sizes.
Leek is making the valid point that the sort of doomsaying that has been needed to draw attention to problems in scientific communication and to motivate improvements can also be used, in a guilt-by-association sense, to disparage good science. And, even in popular culture, my impression is that things aren’t as bad as they used to be. Sure, vaccine deniers and global warming deniers and all the other deniers are out there, but it’s not like the 70s when people were buying millions of copies of Chariots of the Gods, The Jupiter Effect, and The Bermuda Triangle, right?