Significance testing, the replication crisis, and the market for lemons

In an article, “Accounting research and the significance test crisis,” David Johnstone writes:

There are now hundreds of published papers and statements, echoing what has been said behind closed doors for decades, namely that much if not most empirical research is unreliable, simply wrong or at worst fabricated. The problems are a mixture of flawed statistical logic . . . fishing for significant results and publications, selective reporting . . . and ultimately the ‘‘agency problem” that researchers charged by funding bodies . . . are motivated more by the personal need to publish and please other researchers. Expanding on that theme, the supply of empirical research in the “market for statistical significance” is described in terms of “market failure” and “the market for lemons”.

He elaborates:

The problem for genuine research is that the process of learning while experimenting, gathering more data, improving proxies and experimental or statistical controls, and even things like cleaning data and removing outliers, can be explained equally by either “good” or “bad” science. To the outsider, they often look for all intents and purposes the same, which is a problem that will not go away in any research environment where work is published in elite journals even when it will not or cannot be replicated with new and independent data. The mechanics of information asymmetry, adverse selection and the market for lemons suggest that genuine research effort will go relatively unrewarded.

Here’s the background on that lemons thing:

A problem in the market for statistical significance is that the intrinsic qualities of the thing being sold are not observable to the buyer, much as in the Akerlof (1970) “market for lemons.” Many of the Akerlof corollaries apply. Information asymmetry between authors (sellers) and readers (buyers) will allow and reward opportunistic behaviours by authors and leave an adverse selection problem for journal editors in the case of papers rejected at other journals or sent to journals in more need of copy. Genuine researchers may be driven out of production, if they have no way to effectively “signal” the true quality of their work. In a limiting case, all papers published will be assumed to be “lemons”, as could hold true if enough genuine researchers find more rewarding applications for their skills and honesty, and the market will collapse.

Another finance analogy paints statistical researchers as akin to noise traders, where statistical noise masquerading as reliable evidence or a meaningful pattern can tempt belief and investment in a false lead. Even the investigator does not know when false assumptions (e.g. a false model) or pet hypotheses have been confirmed merely by luck or noise.

What’s amusing is that the “market for lemons” idea comes from economics, but empirical economics researchers are often entirely oblivious to these selection problems. Even famous, award-winning economists working on important problems can entirely miss the idea. Maybe the connection that Johnstone makes to the well-known lemons problem will help.
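Akerlof's unraveling argument is mechanical enough to simulate. The numbers below are assumptions chosen for illustration, not from Akerlof (1970): quality is uniform on [0, 2000], buyers value a car at 1.5 times its quality but observe only the average quality on offer, and the price adjusts to buyers' expected value.

```python
import random

# Toy Akerlof (1970) unraveling with assumed numbers: sellers know quality,
# buyers only see the average quality of the cars actually offered.
random.seed(0)
qualities = [random.uniform(0, 2000) for _ in range(100_000)]

price = 1500.0  # arbitrary starting offer price
for _ in range(20):
    # Sellers only part with cars worth no more than the going price.
    offered = [q for q in qualities if q <= price]
    avg_quality = sum(offered) / len(offered) if offered else 0.0
    new_price = 1.5 * avg_quality  # buyers bid their expected value
    if abs(new_price - price) < 1e-9:
        break
    price = new_price

print(price)  # ratchets down toward zero: only the worst lemons remain
```

Note that every trade here would be efficient (buyers value each car at 1.5× what sellers do), yet each price cut drives the best remaining cars out of the market, so the price converges toward zero. Johnstone's claim is that a literature sorted by apparent significance faces the same dynamic.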

15 thoughts on “Significance testing, the replication crisis, and the market for lemons”

  1. To the outsider, they often look for all intents and purposes the same

    It really isn’t hard. Any outsider can easily tell the difference between science and NHST by checking whether the researchers tested their own hypothesis or a default null hypothesis. It was explained perfectly well in 1967:

    Because physical theories typically predict numerical values, an improvement in experimental precision reduces the tolerance range and hence increases corroborability. In most psychological research, improved power of a statistical design leads to a prior probability approaching 1/2 of finding a significant difference in the theoretically predicted direction. Hence the corroboration yielded by “success” is very weak, and becomes weaker with increased precision. “Statistical significance” plays a logical role in psychology precisely the reverse of its role in physics.

    It is no surprise that, if you reverse the logic of science, you end up producing a vast amount of misinformation and progress grinds to a halt. That is exactly what we observe in every field that adopts it.

    Now we are dealing with a kind of mass denial because the mind recoils at how bad this problem is.

  2. Akerlof is married to Janet Yellen, and indeed married her not that long after he published his paper on markets with asymmetric information, so you know he takes this stuff seriously.

    • Slightly more seriously, the market for lemons in academic papers is just like the market for lemons in just about everything. It’s just Sturgeon’s Law. Or, to quote Daniel Dennett: “90% of everything is crap. That is true, whether you are talking about physics, chemistry, evolutionary psychology, sociology, medicine – you name it – rock music, country western. 90% of everything is crap.”

      • Jonathan:

        Sure, but I think something like 99% of Dennett’s work is crap!

        More seriously, the market for lemons isn’t just Sturgeon’s law—it’s also an explanation for Sturgeon’s law, right? A theoretical model that predicts the Sturgeon effect?

        P.S. We should’ve included Sturgeon and Clarke in the Creators of Laws or Rules category of our speakers competition.

        • You are correct. Akerlof’s Market for Lemons is an explanation of (and a prediction of) Sturgeon’s Law, so long as you assume that asymmetric information is sufficiently rampant. That’s how you get one of those Nobel Prizes…. (An exercise for the reader: why aren’t 90 percent of Nobel Prizes crap?)

        • Jonathan:

          Indeed, someone should look into the fractal implications of Sturgeon’s law. For example, 90% of my papers aren’t crap! I’d say it’s more like 50% for me.

        • @Andrew: One problem with doing a fractal analysis of Sturgeon’s Law is that the fraction of work involving fractals that’s crap is larger than Sturgeon’s 90%, presenting some sort of paradox.

  3. I would think there’s a real difference between academic papers and lemon cars. The buyer of a car wants to avoid a “lemon” that doesn’t work well. The market for used cars can collapse if every buyer thinks that any such car must be a lemon, and thus refuses to enter that market. But people can publish academic papers that are unreliable and fail to replicate because they meet the criteria journals use for publication, and being published in the journal is what the author wants. A hypothetical reader who actually cares about the reliability of the paper isn’t actually necessary for the “market” to continue functioning.

    • I agree. I don’t think the analogy is too helpful beyond “asymmetric information is a problem in the published-manuscripts market.” But still, using the lemons analogy is a nice way to try to make the problem stick in people’s minds for longer.

  4. Is there a converse to the market for lemons?

    I’ve often considered firearms silencers. They’re tightly regulated in the US, to such an extent that you need to pay for a $200 tax stamp for each one you want to own. Mechanically and physically, silencers do not have to be complex: a small metal tube with some baffles inside. At the high end, there’s no end to the technological advancements trumpeted by manufacturers (3D printing allowing arbitrarily complex baffle designs to direct gas flow, fancy alloys, modular designs), but there’s no reason they can’t be made cheaply enough to be disposable and sold in blister packs at the checkout at Walmart. Except that even a dirt-cheap one would still require you to buy a $200 tax stamp, so if you have to spend at least $200 you’re not going to buy a Harbor Freight silencer.

    This phenomenon feels like it should have a name.

    • I think one concept from economics would be transaction costs. Regulation can reduce transaction costs (when you buy groceries at the supermarket you don’t need to check that the pepper was not adulterated with wood chips and that the beer has the alcohol content it says it has) or increase them (e.g., tariffs or stamp taxes).

  5. I reject the premise that a market for good science exists.

    Oh, there’s a market for publications/citations among authors, a market for eyeball-grabbing results among editors, a market for time and effort on the part of volunteer reviewers, a market for prestige among universities and other research institutions (as a means to obtain funding and top-tier researchers and programs), and a market among journalists and other media for clickbait. The closest we get to a market for good science is the competition for funding, but at least in the social sciences, that as often as not relies upon a strong CV, facility with jargon, and consistency with published results.

    Certainly, there are real people who read and use published social science research. There is demand. But a market requires more than demand, it requires a means of influencing those with the supply. How do you reward a sincere author/journal and punish a compromised author/journal? By buying journal subscriptions? A drop in the bucket next to university subscriptions. By citing papers? See above–also, reviewers will insist you cite the classic and/or hot papers, regardless. By only submitting to/reviewing for journals with integrity? How’s that look on your CV?

    Fine, much of this is hyperbole, but you get the point. If we want to influence authors to write/editors to accept/reviewers to approve good science, and to reject bad science, it is insufficient to educate people about what good science is. There are too many actors who are disingenuous, or who are sincere but disincentivized to learn. Maybe we need an organized bloc of researchers who prioritize good science: a cartel. Might not be a big cartel, but it would have a helluva lot more influence than each of us acting individually.

    • The end stage of the market for lemons is that anyone with a quality used car refuses to sell into the market because they can’t get paid for the quality they know they have.

      That is precisely what has been going on for the last 20 years or so in science. More and more quality researchers are not continuing in science, or are doing so inside pharma companies or oil exploration or engineering firms or medical device manufacturing or SpaceX or the NSA or whatever. I only know one person from my PhD cohort who is still in academia.

      • Right. There are apple sellers and apple buyers (where apples = good science), but there is no Apple Market. The only market in town is the Lemon Market, which happens to have a sign out front that says “RED JUICY APPLES!” (because that’s an effective slogan). So, we take our apples to the Lemon Market, because we believe the hype and there’s nowhere else to take them anyway. Then we get frustrated because so many people are willing to bake at least some lemons into their apple pies if the lemons are cheaper and more plentiful (which of course they are at the Lemon Market), and to call those pies apple pies. So of course the apple sellers, like your friends, leave the market.

        This is different from the used car lemon market, where buyers aren’t experts (mechanics) and can’t tell a lemon from an apple. But almost all consumers of scientific journal articles are science experts. We may not be able to tell whether a study is good science if the paper isn’t transparent, but we can almost always tell if it’s not transparent, and we are free to refuse to cite those articles, and to refuse to read or submit to or review for those journals. We mostly do not do so because we know or quickly learn we’re in a lemon market, and we figure out that it’s lemon pies or nothing. Lemons in this kind of market cannot lead to market collapse, because lemon markets do not collapse from people buying lemons.

        In short: The solution is not to get the lemons out of the apple market. The solution is to establish an apple market and bar the lemon sellers.

        • Man I have a hard time parsing all of that. But I think you’re saying somehow the “market” for science articles is a market *without* asymmetric information because scientists are the ones consuming science and scientists are experts and therefore can discern which is real and which is bullshit.

          But I don’t think that’s right. There are several issues. One of which is that even if you’re selling lemon cars to mechanics, the mechanic can’t tear down the car and see all the wear on all the parts. If you’ve been drag racing your car at the redline a lot, that’s information you have and the mechanic doesn’t. Similarly in science, if you’re collecting data and throwing out data points that don’t fit your narrative or doing a crap job of surgeries, or not actually blinding people to the meds they receive, but you say you are… then that’s information you have but the readers don’t. There’s a lot of such information. People have tried to replicate biology papers and can’t because the papers don’t actually say enough about what was done. Even with help from the biology lab that first published the stuff replications are hard.

          Second, there’s collusion. Scientists write grants saying they’re going to do X and then do Y instead **all the time** and the funding agencies hire the colleagues of the scientists to evaluate the utility of grants saying they’ll do X, and this is evaluated on the basis of “does this sound like stuff I like” by friends of the grant applicants…. this is a big issue, particularly in Bio.

          Third, the “sellers” get paid by third parties… They get paid by grants which are given by funding agencies to those people who publish a lot of papers and talk a lot of advertising talk (aka “grantsmanship”), and they get paid by promotions which are given by departments to people who bring in a lot of grants and who publish a lot of papers.

          Essentially no one gets paid by selling “good science” into the “science market”. I.e., if I want to read a good paper about how adiposity affects heart disease risk through time in adult humans… I can’t, because no one has written one and you can’t buy such a thing anyway. What I can do is read a bunch of papers in which people argue that their meaningless study on BMI, a known flawed measure, changes how future studies should be done, with more research needed, and more grants! It’s not as if someone who collects definitive data and does the real-deal study and publishes a proper analysis of how to assess risk from fat gain will get $50M to do a bunch of other really good studies… In fact this person will likely have to pay $5k to a publisher for the privilege of publishing, and fight their way through peer review with peers who desperately want that study deep-sixed because it invalidates their entire career.

          It’s cynical but there’s a lot of truth in there.
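Returning to the Meehl (1967) passage quoted in the first comment, his asymmetry can be sketched in a toy simulation. The setup below is my own assumption for illustration, not Meehl's: every effect is small but real ("crud"), and its sign is independent of the theory's directional prediction, so the theory guesses direction no better than a coin flip.

```python
import random

# Toy rendering of Meehl's point (assumed setup, not his): with a small
# real effect of random sign, raising the sample size makes significance
# near-certain, so the probability of "significant in the predicted
# direction" climbs toward 1/2 and corroboration weakens.
random.seed(2)

def directional_success_rate(n, trials=1000):
    hits = 0
    for _ in range(trials):
        effect = random.choice([-1, 1]) * 0.1  # small real effect, random sign
        sample_mean = sum(random.gauss(effect, 1) for _ in range(n)) / n
        z = sample_mean * n ** 0.5  # z test with known sd = 1
        if z > 1.96:  # significant AND in the predicted (positive) direction
            hits += 1
    return hits / trials

# Rate is low at small n, then rises toward 0.5 (not 1.0) as power grows
print(directional_success_rate(25), directional_success_rate(2500))
```

In physics, by contrast, increased precision shrinks the tolerance band around a point prediction, so passing the test at high precision is a genuinely risky, and therefore corroborating, outcome.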
