bla bla bla PEER REVIEW bla bla bla

OK, I’ve been saying this over the phone to a bunch of journalists during the past month so I might as well share it with all of you . . .

1. The peers . . .

The problem with peer review is the peers. Who are “the peers” of four M.D.’s writing up an observational study? Four more M.D.’s who know just as little about the topic. Who are “the peers” of a sociologist who likes to bullshit about evolutionary psychology but who doesn’t know much about the statistics of sex ratios? Other sociologists who like to bullshit about evolutionary psychology but who don’t know much about the statistics of sex ratios. Who are “the peers” of a couple of psychologists who like to imagine that hormonal changes will induce huge, previously undetected changes in political attitudes, and who think this can be detected using a between-person study of a small and nonrepresentative sample? That’s right, another couple of psychologists who like to imagine that hormonal changes will induce huge, previously undetected changes in political attitudes, and who think this can be detected using a between-person study of a small and nonrepresentative sample. Who are “the peers” of a contrarian economist who likes to make bold pronouncements based on almost no data, and whose conclusions don’t change even when people keep pointing out errors in his data? That’s right, other economists who like to make bold pronouncements based on almost no data, and whose conclusions don’t change even when people keep pointing out errors in their data. Who are “the peers” of a wacky business-school professor who cares more about cool experiments than data management and who doesn’t seem to mind if the numbers in his tables don’t add up? Yup, it’s other business-school professors who care more about cool experiments than data management and who don’t seem to mind if the numbers in their tables don’t add up. Who are “the peers” of fake authors of postmodern gibberish? Actual authors of postmodern gibberish, of course.

I think you get the idea.

Peer review is fine for what it is—it tells you that a paper is up to standard in its subfield. Peer reviewers can catch missing references in the literature review. That can be helpful! But if peer review catches anything that the original authors didn’t understand . . . well, that’s just lucky. You certainly can’t expect it.

So, when the editor of Lancet writes:

And when an M.D. who doesn’t know poop about data or statistics but is willing to write this:

I just think they don’t know what they’re talking about.

2. I get it . . .

I get why you like peer review even if you’re not one of the winners, even if you haven’t directly benefited from the peer-review system the way the above people have. The reason you like “peer review” is that it seems better than two alternatives: (1) “political review” and (2) “pal review.” For all its flaws, peer review is (usually) about the quality of the paper, not about politics, logrolling, trading of favors, etc. Sure, this sometimes happens—sometimes a journal editor will print flat-out lies because he’s friends with the author of an article—but peer review, with its layers of anonymity, really can allow papers by outsiders to get accepted and papers by insiders to get rejected. Not always—some politics remains—but I see the appeal of peer review as a preferable alternative to a pure old boys’ network.

But . . .

3. The actual alternative to peer review is . . .

Instead of thinking of the alternative to peer review as backroom politics, think of the alternative to peer review as post-publication review, which, in addition to all its other benefits (most notably, you can get out of the circle of ignorance of “the peers”), has the benefit of efficiency.

4. All the papers that don’t get retracted . . .

One problem with the attention given to the fatally flawed papers that get retracted is that we forget all the fatally flawed papers that aren’t retracted.

For example, from that long paragraph above:

– The beauty-and-sex-ratio paper: Not retracted. 58 citations. Endorsement from Freakonomics never rescinded.

– The ovulation-and-voting paper: Not retracted. 107 citations. Fortunately this one never got taken seriously by the news media.

– The gremlins paper: Not retracted. 1064 citations. Might still be influencing some policy debates.

Etc etc etc.

And, if we just want to look at papers published in Lancet:

– That seriously flawed Iraq survey: Never retracted. 779 citations. Came up in policy debates.

– That hopeless-from-the-start paper on gun control based on an unregularized regression with 50 data points, 25+ predictors, and a bunch of ridiculous conclusions: OK, this one had only 75 citations and, fortunately, it was blasted in the press when it came out. So, too bad that something like 75 people thought this paper was worth citing (and, no, a quick glance at the citations suggests that these are not 75 papers using this as an example to show people how not to do policy analysis). As I wrote the other day, the useless study is included in a meta-analysis published in JAMA—and one of the authors of that meta-analysis is the person who said he did not believe the Lancet paper when it came out! But it’s in the literature now and it’s not going away.
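To see how hopeless that gun-control regression design is on its face, here’s a minimal simulation in the same spirit (synthetic data, nothing from the actual paper): with 50 observations and 25 pure-noise predictors, an unregularized least-squares fit will report a substantial R² even though there is nothing to find.

```python
# A minimal sketch (hypothetical data, not the Lancet analysis): fit an
# unregularized regression with n = 50 observations and p = 25 predictors
# where the outcome is pure noise, and watch OLS "explain" the variance.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 25
X = rng.normal(size=(n, p))        # 25 predictors, 50 data points
y = rng.normal(size=n)             # outcome unrelated to every predictor
y = y - y.mean()                   # center so R^2 is the usual quantity

beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # unregularized least squares
resid = y - X @ beta
r_squared = 1 - (resid @ resid) / (y @ y)
print(f"R^2 on pure noise: {r_squared:.2f}")   # typically near p/n = 0.5
```

With p/n = 0.5, roughly half the variance gets “explained” for free, which is why an analysis like this needs regularization (or far fewer predictors) before any of its conclusions can be taken seriously.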

5. So . . .

When the editor of Lancet bragged about his journal’s peer-review process, he didn’t say, “Yeah, but our bad about that vaccine denial and the Iraq survey and the gun control analysis. For political reasons we can’t actually retract these papers, but we’ll try to do better next time.” No, he didn’t mention those articles at all.

6. A statistical argument . . .

It might be that everything I’m saying here is wrong. Not wrong in the details—I’m pretty sure these particular articles are fatally flawed—but wrong in the conclusions I’m implicitly drawing. After all, Lancet publishes, oh, I dunno, 1000 papers a year? 2000 maybe? Even if it’s just 1000, then I’m saying that they published (and didn’t retract) 2 bad papers in the past 15 years? That’s a failure rate of 2/15000. Even if I’ve missed a few and there are 2 fatally flawed papers a year, that’s still only a 0.1% failure rate, which would be not bad at all! My own failure rate (as measured by the number of papers I’ve had to issue major corrections for, divided by the total number of papers I’ve published) is about 1%. So here I am criticizing Lancet for being maybe 10 times more reliable than I am!
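The back-of-the-envelope comparison above is easy to write out explicitly; the volume figures below are the same guesses as in the text (1000 or 2000 papers a year), not actual Lancet counts.

```python
# Back-of-the-envelope failure rates, using the guessed publication volumes
# from the text above (these are assumptions, not real journal statistics).
flagged = 2      # fatally flawed, unretracted papers counted here
years = 15
my_rate = 0.01   # ~1% of my own papers have needed major corrections

for papers_per_year in (1000, 2000):
    long_run_rate = flagged / (papers_per_year * years)
    pessimistic_rate = 2 / papers_per_year   # suppose 2 flawed papers slip through per year
    print(f"{papers_per_year}/yr: 15-year rate {long_run_rate:.4%}, "
          f"pessimistic yearly rate {pessimistic_rate:.2%}")
```

Either way the journal’s implied failure rate comes out at a small fraction of a percent, which is the point: a handful of known-bad papers can’t settle the question without something like a random-sample audit.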

I don’t know. I really don’t know. The only way to really get a bead on this, I think, would be to take a random sample of papers from the journal and carefully review them. I see bad papers because people send me bad papers. Sometimes they send me good papers too, but I probably see a lot of the worst.

How bad are things? Back in 2013 or so, I think Psychological Science was really bad. I remember looking at their This Week in Psychological Science feature a few times, and it seemed that more than half the papers they were featuring were junk science; see slides 14-16 of this presentation. I didn’t do a careful survey, and maybe I just caught the journal at a bad time, but it really seemed that they had no control over what they were publishing. The cargo cult scientists had basically hacked the system. They’d figured out the cheat codes and were driving their Pac-Men all over the board.

Lancet can’t be that bad.

I’ll say this, though. It may well be that 99%, or 90%, of Lancet articles are just fine, in the sense that their flaws, such as they are (and just about no research paper is flawless), are not overwhelming, so that the articles represent real contributions to science (including negative contributions such as, “This treatment does not do much”). If so, great. But it may well be that 99%, or 90%, of Medrxiv articles are just fine too. I just don’t know.

7. Half-full or half-empty . . .

So, the peer-review system is either the last bastion protecting us from a revised old boys’ network, or a waste of time and resources that could better be spent on post-publication review. It’s either an efficient if imperfect tool for sifting through millions of research articles published each year, or an absolute disaster. Probably it’s both.

I don’t know what to think. Consider computer science. They mostly seem to have abandoned journals; instead they have something like 10 major conferences a year, and the idea is to publish a bunch of papers in each conference. Getting published in a conference proceedings is competitive, but it’s different than publishing in a journal. Or maybe it’s more like publishing in a medical journal, I’m not sure. Hype seems to be important. You gotta show that your method is an order of magnitude better than the alternatives. Which is tough: how can you improve performance by an order of magnitude, 10 times a year? But people manage to do it, I guess using the same methods of hype that led the other James Watson to think in 1998 that cancer was two years away from being cured.

8. Again . . .

Surgisphere appears to be the Theranos, or possibly the Cornell Food and Brand Lab, of medical research, and Lancet is a serial enabler of research fraud (see this news article by Michael Hiltzik), and it’s easy to focus on that. But remember all the crappy papers these journals publish that don’t get retracted, cos they’re not fraudulent, they’re just crappy. Retracting papers just cos they’re crappy—no fraud, they’re just bad science—I think that’s never ever ever gonna happen. Retraction is taken as some kind of personal punishment meted out to an author and a journal. This frustrates me to no end. What’s important is the science, not the author. But it’s not happening. So, when we hear about glamorous/seedy stories of fraud, remember the bad research, the research that’s not evilicious but just incompetent, maybe never even had a chance of working. That stuff will stay in the published literature forever, and journals love publishing it.

As we say in statistics, the shitty is the enemy of the good.

9. Open code, open data, open review . . .

So, you knew I’d get to this…

Just remember, honesty and transparency are not enuf. Open data and code don’t mean your work is any good. A preregistered study can be a waste of time. The point of open data and code is that it makes it easier to do post-publication review. If you’re open, it makes it easier for other people to find flaws in your work. And that’s a good thing.

An egg is just a chicken’s way of making another egg.

And the point of science and policy analysis is not to build beautiful careers. The purpose is to learn about and improve the world.

61 Comments

  1. Zhou Fang says:

    Hi Andrew, did you see my recent comments on the newest Harvard covid catastrophe?

Please can you help signal boost. I need a respectable white guy, it seems, because pointing out blatant errors as a person with a Chinese name only gets me called a CCP bot. :(

    My twitter thread on the issue

    https://twitter.com/Fang__z/status/1270360169248108547

  2. Jag Bhalla says:

    Agree, need a form of “unpeer review” to escape that circle of collective ignorance or error.
    Here’s an example where what passes muster among peers is laughable when seen from the outside, by unpeer experts in related fields
    Key Climate Scientists Want Economists Thrown Out of IPCC
    https://bigthink.com/ipcc-climate-science

  3. This makes sense to me, but I’ve also been trying to think about best practices in literal self-policing, as I’d like to catch my own mistakes before post-publication review!

    Obviously mainly I need to work on developing my expertise. But since I’m at an institution that doesn’t do enough statistics to have that much support capacity, I’m guessing contracting is the way to go. I have contracted with a statistics editor for that sort of thing once before, and I think I benefited from it, but I don’t really have much of a sense of best practices or where to go find new people if the last person I used isn’t available.

  4. Garnett says:

    I don’t see how “open code, open data, open review” necessarily threatens a journal’s bottom line. It’s not going to require extra resources from the journal, is it? Why wouldn’t they make these a matter of policy? Institutional inertia?

    • Zhou Fang says:

      I suspect the big problem is papers coming from private companies. In those cases the methods and data could be trade secrets – i.e. not legally protected but commercially sensitive.

      • Garnett says:

        Thanks for this insight.
        I suppose, then, that the journal has no incentive to commit to “open code, open data, open review” unless they want to risk losing submissions from private industry.

        However, they could still commit to open review.

You’d think they’d like that b/c it steers interested persons to the journal’s online content. Isn’t that an opportunity for more revenue?

  5. Eli Rabett says:

There have always been junk papers; what has changed is the number of journals and the trollers (in the original and modern sense) who sift through them looking for weaponry and clickbait.

Also, FWIW, using conference proceedings as publications is pretty general across engineering; SPIE has over 11,000 volumes of conference proceedings:

    https://spie.org/publications/conference-proceedings/browse-by-volume-number/browse-list-of-proceedings-for-a-volume-range?start_volume_number=11500&end_volume_number=11523

  6. jd says:

    >Retraction is taken as some kind of personal punishment meted out to an author and a journal. This frustrates me to no end.
    >And the point of science and policy analysis is not to build beautiful careers. The purpose is to learn about and improve the world.

Yes! Conversations with various PI’s have indicated to me that retraction of a paper is a horrible thing and terrible for one’s career. I don’t really know. But this sounds strangely at odds with the spirit of science, because how can one possibly make it through a career mistake-free?

    I am not even sure I agree with retraction at all. Why should the paper be retracted? How about just a comment at the top of the article indicating where/what the errors are. Then it serves as a good example and/or informs further research of incorrect ideas. I guess this would be a ‘correction’. Why not have a ‘correction’ for everything, rather than a ‘retraction’?

    • Joshua says:

      jd –

      > I am not even sure I agree with retraction at all. Why should the paper be retracted? How about just a comment at the top of the article indicating where/what the errors are. Then it serves as a good example and/or informs further research of incorrect ideas. I guess this would be a ‘correction’. Why not have a ‘correction’ for everything, rather than a ‘retraction’?

      Yah. I agree with this. I think that in the end, a “retraction” policy more likely reinforces the mistaken notion that error-finding isn’t an integral part of the scientific process, and that instead the futile notion of pursuing scientific “truth” is the proper goal.

    • Martha (Smith) says:

      Andrew said,
      “What’s important is the science, not the author”

      jd said,
      “Why not have a ‘correction’ for everything, rather than a ‘retraction’?”

      +1 to both!

Peer reviews and citations remind me of market forecasts. Once everybody starts using them they no longer serve their purpose. From my experience, some editors behave like cult leaders supported by fanatic peer reviewers who want to advance their cause. For example, the last paper I submitted to a journal had nothing to do with behavioral finance, but one of the reviewers wanted me to include it. Now I only post my papers on SSRN, and if any reader wishes to publish them in a journal he is welcome to ask me for permission.

  8. John Williams says:

    Fortunately, graduate students seem to be conscious of the problem. I’ve given talks on a few campuses about a particularly bad paper in Science claiming that dams on the Mekong River could be good for fisheries there, and a bunch of students pulled out their cell phones and took pictures of my concluding slide:

    Conclusion II:
    Recognize the dual nature of scientific activity.
    On the one hand, we are trying to figure out how the world works. On the other, we are trying to get ahead, to earn promotions or bonuses or better jobs, or to help our graduate students or post-docs. Only an economist could believe that some invisible hand will reliably transmute the latter activity into the former. Rather, we should recognize that both activities go on at once, and should read articles carefully and critically before we take them seriously.
    De omnibus dubitandum (Doubt everything).

  9. Joshua says:

    Maybe my own personal observations aren’t instructive, but I see a further function to peer review beyond serving as a fact-check to prevent erroneous material from getting published.

I also see peer review as serving a function to improve research prior to publication – where authors take input from peers seriously and revise, and as a result produce higher-quality work. For example, where a peer points out a relevant publication that the author wasn’t aware of, but which provides information relevant to the thesis of the article being reviewed.

    Is that really such a rare phenomenon?

    Also,

    > Instead of thinking of the alternative to peer review as backroom politics, think of the alternative to peer review as post-publication review, which, in addition to all its other benefits (most notably, you can get out of the circle of ignorance of “the peers”), has the benefit of efficiency.

Does this suggest an either/or frame? Why can’t there be an additive benefit to peer review plus post publication review? In holding them up as alternatives, it seems to me that you’re effectively reinforcing the view that post-publication review that uncovers errors is punitive, that the processes are in opposition. A view that they’re in opposition is only a mindset, it isn’t a reality. Maybe rather than arguing the relative merits of the two approaches, the focus should be on realizing that the two processes can have an additive effect. Indeed, you argue that post-publication review that finds errors should be looked at positively, so why not then go a step further and frame pre- and post-publication review as inextricably linked stages of an overall process?

    • Andrew says:

      Joshua:

      Yes, peer review has helped many of my papers. Peer review can filter out some (not all) of the very worst papers, it can help catch missing references, and at its best it can make good papers better. The thing that peer review can’t do is give a guarantee or near-guarantee of quality. It’s baked into peer review that it typically won’t catch bad practices that are also standard practices within a subfield. Beyond all that, peer review is often used as a tool in academic politics or score-settling, as when Perspectives on Psychological Science publishes lies.

  10. John Williams says:

    Joshua,
    It surely can function as you suggest. My wife worked very hard at constructive reviews, even doing serious copy editing, especially for foreign authors. She got zero credit for it in the promotion process.

    • Martha (Smith) says:

Sad, but all too common. On the other hand, sometimes a review can involve a lot of work, but still not bring the paper up to the standard where one would want to be a coauthor (which would probably be the criterion for getting credit for it in the promotion process).

  11. Clyde Schechter says:

    All of the high-profile medical journals, and most of the rest as well, have a stable of statisticians on their editorial board, and quantitative papers are all reviewed by a statistician. But clearly a lot gets by this process. And it isn’t always constructive: the statistician’s review of my most recent submission requested that I put significance stars on a table! On the other hand, there are some statistical reviewers who have made very insightful comments and really improved my work. It’s definitely a lottery.

    Part of the problem is that careful review requires time, thought, and effort. And in the current system there is really no recognition for doing this work.

    And let’s not forget that bringing in non-peers in post-publication review can also lead to junk reviews. Agreed that 4 MDs publishing a paper might not have the statistical sophistication to properly analyze their data–interdisciplinary teams are a much better idea. But I’ve also worked with statisticians who have no understanding of the subject matter in a paper and devise mathy answers to the wrong question.

  12. Matt says:

I don’t think Surgisphere is Theranos. Theranos was an extremely slick and well-disguised scam. Outwardly, Theranos looked like a legitimate business. In contrast, a few minutes of Googling would have revealed to anybody that Surgisphere is a very dubious outfit. Amateur sleuths showed quickly that it’s essentially a one-man outfit with hardly any employees and certainly none that could have built the amazing databases that it claimed to possess. If leading medical journals and Harvard professors had been duped by Elizabeth Holmes and Theranos, it would have been much less damning than the Surgisphere affair.

  13. RE: One problem with the attention given to the fatally flawed papers that get retracted is that we forget all the fatally flawed papers that aren’t retracted.
    —-

    This is so on point.

  14. paul alper says:

    Andrew wrote: “led the other Janes Watson to think in 1998”

    I suppose it is a typo and “Janes” is supposed to be “James.” In a recent blog, there was another typo, but this time Andrew wrote “Witson” instead of “Watson.” The person in question is the other James Watson and not the double helix guy who is outliving his reputation.

  15. Matt Skaggs says:

    General questions for academics:

    How much pal review happens as standard practice?

    How often do(es) the author(s) get to pick the reviewers?

    If you get to choose, what percentage of your papers would be rejected if you chose your three staunchest critics to peer review your papers, versus how many that would be rejected with review from your closest allies?

    Do reputable journals seek out your critics for peer review when you submit a paper?

    I have written a sum total of one paper as lead author for publication. But the journal we chose asked us for the names of who should review it, and not being in academia, we just threw up our hands. I naively thought that the journal would find reviewers. Is this typical?

    • Chris Wilson says:

      Yep. Every journal I’ve submitted to asks for a list of potential reviewers and many ask for preferences on subject matter editors too. That said, reviewers often decline, so the list you send in has an impact but doesn’t determine the selection by any means. My impression is that some journals have a policy saying the handling editor has to find at least one reviewer NOT on the provided list. Or maybe it’s just encouraged. I’ll let those with editorial experience comment there.

      • Martha (Smith) says:

        “Every journal I’ve submitted to asks for a list of potential reviewers and many ask for preferences on subject matter editors too”

Wow! I don’t recall ever experiencing this. I suppose the practice depends on what field you’re in.

    • In Physics & Biology:

      “How much pal review happens as standard practice?” — Very little. Except in rare cases, one doesn’t pick one’s reviewers, and one doesn’t know who one’s reviewers are. Also, most fields are big enough that the odds of getting a paper from one’s close friend to review are low. Getting papers by not-very-close friends is possible, and can lead to bias, but it’s also the case that one often uses anonymity to write more critically than one would otherwise.

      “How often do(es) the author(s) get to pick the reviewers?” — Directly pick: almost never. Suggest reviewers: that is standard, but one doesn’t know if the journal will pick from those suggestions or not. Also, editors have pointed out to me that (1) they never pick solely from the “suggested” people, and (2) the suggested people are typically as harsh as non-suggested ones — see the last point.

“If you get to choose, what percentage of your papers would be rejected if you chose your three staunchest critics to peer review your papers, versus how many that would be rejected with review from your closest allies?” and “Do reputable journals seek out your critics for peer review when you submit a paper?” — in reality, one doesn’t have “critics” and “allies” in the sense you suggest. Who are my work’s “critics”? It seems a bizarre question. Most of my work looks at the biophysics of the gut microbiome, using imaging as a tool and zebrafish as a model organism. There are “critics” who think that anything done in non-humans is a waste of time (over-simplifying) — would they be the appropriate reviewers? Why?

      And yes, you should know your field better than the editors of the journal, and should be able to suggest reviewers.

  16. I would be for a paid open review process.
Review is fully open “during” the process, not after accepting the paper. People should be able to know that there’s a work being reviewed. This would also reduce scooping: you would have proof that you submitted a paper on date X.

Reviewers should get paid for their reviews, but they would also answer for the quality of their work. One of the key issues of the current model is that reviewers are not getting credit for their work. No credit, no responsibility. A lot of people use the mantra: “reviewers do it for free, so you get what you get”.

  17. Wonks Anonymous says:

    One alternative to the current setup would be for people to make bets on whether findings will replicate.

  18. Richmond says:

    I cannot understand the metaphorical meaning of the sentence “An egg is just a chicken’s way of making another egg”. Can anyone help me?

    • Hopefully I’m not mis-interpreting Andrew, but I think the idea is that these particular forms — peer review, open peer review, whatever — are just the vehicles for propagating studies, and we shouldn’t confuse assessments of the vehicles with the assessments of the studies. You can have an excellent system, but it can still propagate bad science.

The analogy is that the egg (or the chicken) is a vehicle for propagating the chicken’s genes.

      (The quote isn’t quite right; “It has, I believe, been often remarked, that a hen is only an egg’s way of making another egg. http://www.online-literature.com/samuel-butler/life-and-habit/8/ )

      We’ll see if my interpretation is correct!

    • Andrew says:

      Richmond:

      What I meant was that the model is the egg and the chicken is the inference. In science, we build a model so we can make inferences, but in many cases the real purpose of the inferences is to work out the implications of the model in enough detail so that we can find flaws with the model and then improve it. The chicken of the inferences lays a new model, or egg.

  19. Anonymous says:

When rubber meets the road, all advances in science amount to individual souls recognizing the truth when they see it. I’m not sure collective, sociological-type tricks and mechanisms have ever really improved the rate at which that happens.

    • Martha (Smith) says:

      I think recognizing the untruth (e.g., flawed reasoning; “that’s the way we’ve always done it”; etc.) is at least of equal importance with “recognizing the truth when they see it”.

      • Anonymous says:

        Indeed, please change “… recognizing the truth…” to whatever works better.

        My emphasis was that whatever it is, it occurs within *individuals*. It’s not a given that *collective* tricks and mechanisms designed to separate out scientific truths work much at all.

  20. Ben says:

    > instead they have something like 10 major conferences a year, and the idea is to publish a bunch of papers in each conference

    As far as computer science papers go, I don’t really expect them to be correct, or whatever.

Like, I kind of expect that the plots are super misleading, so it’s not really too much of an issue when you read them. A 10x difference means a 2x difference in a certain context.

    But I guess this comes back to, who cares what sorta A/B testing Google is doing to sell your kids video games? With drugs and public policy it’s lives on the line!

    • Martha (Smith) says:

      “But I guess this comes back to, who cares what sorta A/B testing Google is doing to sell your kids video games? With drugs and public policy it’s lives on the line!”

      +1

  21. Bruce Nye says:

Read with some excitement. The problems in peer review have been lurking like tuberculosis for a long time; now the infection has become rather florid.

    Wrote this back in 2017 as part of a primer on how to spot Junk Science News:

    Question: Is the peer review critical?

    Answer: Put more weight on critically peer reviewed work. Beware echo chamber reviews.

One of the key things that separate the above publications is peer review. What peer review is supposed to be is an open, critical dialogue between experts in the field regarding the research results presented. Think of it as giving a paper in front of your peers about something new you’ve discovered. Naturally your peers will have questions about your methods, your results, your conclusions. Sometimes their questions will point out something weak, missing, or in error. When that happens, in science, you go back and strengthen, fill in, or correct the findings.

The quality of a journal primarily rests on its peer review process. There are two main problems that occur, though. The first problem is one where the reviewers don’t represent a range of views on the subject. Reviewers are invited by the editorial board to review submissions to the journal. If too many reviewers of a particular viewpoint are selected, it creates an uncritical review panel for publications that share their viewpoint, and an overly critical panel for publications that may go against their views.

The second problem comes from the reviewers themselves. Being a reviewer adds to one’s resume so it is a good thing to be selected. But as a reviewer it becomes difficult to openly criticize colleagues. This conflict of interest should be obvious. Often this leads to a brief, cursory review of a paper. Questions which should be asked don’t get posed. When this happens, the authors don’t get the feedback they need to fix or withdraw a paper until after it is published, when the rest of the experts get a chance to read the work, and find the faults that should have been discovered during peer review. I’m not talking about just subtle faults, but glaring errors in method, data, and results.

Journalists know very little about peer review and the public only knows that “Scientists discover …”. By the time the errors are found, the “study” is part of the public consciousness and there is no good way to remove it. There is an old saying: “What is heard cannot be forgotten”. That is why junk-science propaganda becomes entrenched in the public mind, and public policy gets supported. All because critical peer review was ignored.

    • Martha (Smith) says:

      Bruce said,
      “The second problem comes from the reviewers themselves. Being a reviewer adds to one’s resume so it is a good thing to be selected. But as a reviewer it becomes difficult to openly criticize colleagues. This conflict of interest should be obvious. Often this leads to a brief, cursory review of a paper. Questions which should be asked don’t get posed.”

      This is very disturbing.

  22. Ron Kenett says:

    My response to the thread on Twitter:

    This thread is somewhat myopic. It assumes peer review is structured and has uniform outcomes. Not so; see https://content.iospress.com/articles/statistical-journal-of-the-iaos/sji967 and https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3591808

  23. Felipe Pait says:

    The purpose of peer review is not to ensure that everything published is completely correct.

    The purpose of peer review is to ensure that every publication is written in a meaningful way, so that researchers can read it and confirm, use, extend, or contradict its results. Without peer review we might have lots of meaningless rants without clear arguments, with no rhyme or reason. Journals would lose the purpose of communication if they were overwhelmed with papers full of statements that are not even wrong, cannot be confirmed nor denied.

    Which, come to think about it, is just what this blog post is. What does it even say?

    • Ben says:

      > a meaningful way, so that researchers can read it and confirm, use, extend, or contradict its results

      But then once flaws have been pointed out in something, it shouldn’t just stay with the flaws for someone else to find (what if they don’t know things are flawed?)!

      This is attached to:

      > One problem with the attention given to the fatally flawed papers that get retracted is that we forget all the fatally flawed papers that aren’t retracted.

      And

      > The purpose of peer review is not to ensure that everything published is completely correct.

      We don’t have to solve all scientific problems at review. Indeed! Post-publication review happens after the review.

  24. fin says:

    I think there is an unstated assumption in your argument here, Andrew: that if your paper is reviewed by peers whose positions are very close to yours (all sociologists who like evo psych, all psychologists who like just-so stories etc), then your paper is going to get an easy ride. That seems true for certain types of paper (ones that support the common assumptions of you and your peers) but my impression is that peers can also be more critical of papers in “their” area, rather than going easy on them. This can happen for reasons of intergroup competition, interpersonal animosity, gamesmanship, protectionism etc (“Competition in academia is so vicious because the stakes are so small.”).

    If you are selecting people to review a paper, you really want to find people who are going to be as critical as they can be: and it can make sense to select “close peers” for just this reason.

  25. I think it’s worth going back to the basics: a Journal is a place where an Editorial Board selects contents, and Peer Review is a tool to help them select valid and interesting ones (no more, no less); it is an imperfect tool (are there perfect ones?) and it doesn’t need to be the only one, but without content selection (before publication) there is no Journal. It is an Editorial Board’s responsibility to choose whether to use Peer Review or not (as well as to dictate its rules), and it is an Editorial Board’s responsibility to choose whether publication should or should not be followed by Post-Publication Review (as well as how much it should be moderated). Ideally, my preference goes to a Journal using Peer Review and open to Post-Publication Review, but I am not part of any Editorial Board and, as a reader, I can only compromise. However, I do think it’s good to have a variety of approaches, and this is why I am stressing the role of each Editorial Board in the decision process.
    Preprint Archives are different, in that there is no Editorial Board (at least not one responsible for their contents) and no content selection (only some sort of authors’ accreditation). As a matter of fact, I still prefer to follow Journals, rather than Preprint Archives, because the world is wide and I find content selection useful. It is true that Post-Publication Review can select contents in a Preprint Archive, but I think it’s safer to complement the Wisdom of the Crowd with the wisdom of an Editorial Board. In the end, I think both strategies of selection can be useful and there’s room for both of them. I do download from and upload to Preprint Archives.
    My two cents.

  26. zbicyclist says:

    Two interesting items, unrelated to each other.

    1. The Surgisphere website is down. https://surgisphere.com/

    “The hosting account for surgisphere.com has been suspended.
    If you are the owner of this hosting account, please contact accm@suresupport.com.”

    2. This post has been picked up by The Browser (a subscription site which picks 4 items of interest each day). Here’s how they blurbed it:

    Peer Review
    Andrew Gelman | Statistical Modelling | 11th June 2020
    Nine problems with peer review. The first is there in the name: The peers. “Who are the ‘peers’ of four MDs writing up an observational study? Four more MDs who know just as little as the topic. Peer review is fine for what it is. It tells you that a paper is up to standard in its subfield. Peer reviewers can catch missing references in the literature review. That can be helpful! But if peer review catches anything that the original authors didn’t understand — well, that’s just lucky. You certainly can’t expect it” (1,860 words)

  27. Mark Dynarski says:

    For some years I was an associate journal editor, and the role meant that I needed to identify peer reviewers for submissions and also synthesize their comments so that authors could focus on the important comments in revising their papers.

    The peer reviews I got were all over the map. Some reviewers dug in hard and went page by page in their comments. Others sent two paragraphs saying something like ‘pretty good paper, might want to run the model a few other ways.’ Some had useful and sometimes brilliant insights that improved the paper. Some wanted to complain about why the paper had been written at all.

    But consider that the peer reviewers and I were doing difficult work for free. I was not in a position to say a particular review was inadequate and needed to be done better. Really I had no recourse but to accept the peer review (which often was late and in response to messages urging them to finish) and move on.

    Peer review was and is a flawed system. There are many moving parts inside any empirical study, and expecting peer reviewers to spot mistakes seems to be asking too much. But paying reviewers will probably help. One organization I know had a contract with a Federal agency and the work called for a lot of reviews. They paid reviewers a full amount for reviews within six weeks and less after that. My sense is the payment and the penalty for being late really worked. But academic journals have a different model. More’s the pity.

    • Ben says:

      > But consider that the peer reviewers and I were doing difficult work for free.

      I think this is compatible with what Andrew is saying (peer review isn’t working). What you’re saying is, as hard as you try, you can’t make it work.

      I assume lots of reviewers are putting in honest effort and they just can’t keep up. I don’t think it’s disparaging to those good intentions to say that, given the difficulty people have experienced so far (the peer review system has been The System for however long), it’s unlikely that, however good the intentions, people will ever be able to keep up.

      Delegation, which I think post-publication review is, is one way to try to solve this. We could try to pump money into the review system so people can spend more time on it, but that’s going to run up against getting money out of publication (open access stuff).

  28. HK says:

    Regarding point #7, from what I know, computer science operates with very different logics. If you design a new algorithm that improves upon existing algorithms by X%, it counts as a contribution. Thus, the standards for gauging what constitutes a conceptual contribution are quite objective. Very unlike the social sciences, where it is far more subjective, and it is probably easier to cover up favoritism on the grounds of “taste”. Herbert Simon, in “The Sciences of the Artificial”, distinguishes between artificial (or design-based) sciences such as engineering, which are applied and aim at optimizing, and natural sciences, which aim to present a theory of how things are. Social scientists often aim for both the practical impact of the former and the legitimacy of the latter, and often fail at both. I am not making a value judgment here; it is just the way things are.

  29. Atat says:

    Post-publication review isn’t going to improve the situation by much. Not even for research that addresses a fundamental question in an area. For highly publicized research on Twitter? Maybe.

    Consider this 2014 paper by high-profile researchers in the area of autism heritability. They conducted one of the largest twin/familial studies ever and came up with a heritability estimate of roughly 50%. Bear in mind that this is against the backdrop of the huge debate around antivaxxers, vaccines, and autism, so the environment-versus-genes question is as fundamental as it gets. And along comes this study with the largest sample ever.
    https://jamanetwork.com/journals/jama/fullarticle/1866100

    They also mention that they used a more modern method for determining concordance/discordance (a basic measure in any twin study) of the occurrence of autism in both twins. Before this, most heritability estimates were upwards of 70% (except Hallmayer et al.), and it was expected that autism would be shown to be almost entirely genetic.

    But they got 50%. Not a whimper. All went well. 759 citations.

    In 2017, the same researchers “republished” the exact same paper, with the exact same data, but now with 90% heritability. The abstract says, in effect, “oops, we were wrong and are going back to conventional methods of determining concordance/discordance”. They got huge coverage in the media. It was a blockbuster. It was hailed as settling the autism/vaccine debate from first principles: if autism is itself genetic, then environment, and hence vaccines, are inadmissible as causes.
    https://jamanetwork.com/journals/jama/fullarticle/2654804

    But what happened during those three years? A black hole of post-publication peer review. For one of the most high-profile, closely scrutinized areas of research.
