“With that assurance, a scientist can report his or her work to the public, and the public can trust the work.”

Dan Wright writes:

Given your healthy skepticism of findings/conclusions from papers that have passed peer review, I thought I would forward the following from the Institute of Education Sciences.

Here is a sample quote:

Simply put, peer review is a method by which scientists who are experts in a particular field examine another scientist’s work to verify that it makes a valid contribution to the evidence base. With that assurance, a scientist can report his or her work to the public, and the public can trust the work.

Unless, of course, it’s published in the Journal of Personality and Social Psychology, or Psychological Science, or the PPNAS, or . . .

32 thoughts on ““With that assurance, a scientist can report his or her work to the public, and the public can trust the work.””

  1. > “With that assurance, a scientist can report his or her work to the public, and the public can trust the work.”

    It’s fine to have utopian visions, but I’d be happy with a much lower (and readily achievable) standard: “Peer review is a method by which scientists who are at least reasonably knowledgeable in a particular field examine another scientist’s work to establish that it does not contain obvious errors or plagiarism.”

    • This will never happen, but it would have been extremely useful to see what the reviews of the LaCour Science study looked like.

      I’m trying to think of cases where I caught abuse/misconduct/fraud or something close to it. It happened only once in 15 years: a scientist trying to re-publish old data as new (the author didn’t even bother to change the initials of the subjects, so I am unsure whether they knew they were doing anything wrong). Of course, there must be cases that I missed. I suspect, though, that several people have just started demanding that the editor exclude me as a reviewer ;). A lot of papers come out that directly address my work, but I never get to review them… I’m excluding statistical abuse here, because that’s the standard situation.

      • I think we are seeing a dilution of editorial / reviewer responsibility. And that is sad.

        We really need more conscientious reviewing & editors that actually care & stand by the quality of work that they publish.

  2. On the other hand, editors and reviewers are forced to take time off from their own research to review others’ work. It would help a lot if the data and code were released as a matter of course along with the paper. If I had data and code with a paper, I would definitely look at them, not just the paper. My suspicion is that most published claims would not hold up under such close scrutiny. There’s nothing like an outside party to vet your code and data.

    • Well, other researchers have to take time off to review your work too at some point. It’s quid pro quo.

      Maybe we should institute a system of reviewer credits? Unless you review enough you don’t get people to review your papers.

      Alternatively, just move to a system of paid review. There’s enough funding money floating around. One less silly sociology / psych study might free enough $$ to pay 100 reviewers.

    “This will never happen, but it would have been extremely useful to see what the reviews of the LaCour Science study looked like.”

    Probably the reviews did not raise any questions about the legitimacy of the data; I wouldn’t expect them to. Have you read the technical report from Stanford that pointed out the problems? The problems were uncovered in the course of attempting to replicate the study findings, and that relied on an extremely detailed analysis of the data: there was nothing obviously wrong with the summary results in the article. An ordinary peer reviewer would a) probably not suspect fraud based on the summary results, b) ordinarily not have the data available, and c) probably not invest the time needed to do this kind of detailed exploration even if he/she did have the data.

    Just to do a conventional review of a paper, without any attempt to get the data and replicate it myself, typically takes me about 20 hours of work. I need to read the paper several times to be sure I understand what it is saying. I need to check key references. I need to identify any obvious inconsistencies in the reported results and any overreaches in their reported interpretation. Then I have to write it all up in a way that the editor and authors will be able to understand and respond to. If I were expected to additionally rework the data myself and replicate the analyses of every paper I review, I would probably just stop doing peer review altogether. Maybe that would be a good thing for the world, but I don’t know how many people you will find who would take the time necessary to do such reviews at all. We do all have day jobs, after all.

    “Nothing like an outside party to vet your code and data.”

    I’m not so sure about that. Even assuming the code is in a language the reviewer uses, vetting other people’s code is an extremely difficult task, far more so than writing your own code to accomplish the same purpose in most cases. And the original author of the code probably understands that code far better than any second reader ever will.

    For all of these reasons, I think that Chris G is really quite right. What he/she outlines is probably the most we can expect from any sustainable peer-review process and we need to eradicate the fantasy that it can accomplish anything more than that.

    • I’m not sure. This wasn’t an ordinary result. To anyone familiar with the field, LaCour’s results were quite spectacular. There were obvious red flags even without detailed analysis, e.g., the high compliance rates in his surveys and the very good correlation on re-survey.

      Any good, conscientious referee should have been skeptical & tried to dig deeper. Ask questions. I’m not saying it was an obvious fraud that must have been caught. But it would be interesting to see how much due diligence the referees did.

      • I only want to know if any reviewer said something like what Krosnick said when he heard about this work from This American Life: “that’s very surprising and doesn’t fit with a huge literature of evidence. It doesn’t sound plausible to me.”

    • “Probably the reviews do not raise any questions about the legitimacy of the data. I wouldn’t expect them to. Have you read the technical report from Stanford that pointed up the problems?”

      “An ordinary peer reviewer would, a) probably not suspect fraud based on the summary results”

      Clyde, you say that reviewers would not have caught anything odd in LaCour’s study. But I see this in a recent article:

      “But even before Broockman, Kalla, and Aronow published their report, LaCour’s results were so impressive that, on their face, they didn’t make sense. Jon Krosnick, a Stanford social psychologist who focuses on attitude change and also works on issues of scientific transparency, says that he hadn’t heard about the study until he was contacted by a “This American Life” producer who described the results to him over the phone. “Gee,” he replied, “that’s very surprising and doesn’t fit with a huge literature of evidence. It doesn’t sound plausible to me.” A few clicks later, Krosnick had pulled up the paper on his computer. “Ah,” he told the producer, “I see Don Green is an author. I trust him completely, so I’m no longer doubtful.” ”

      Source:
      http://nymag.com/scienceofus/2015/05/how-a-grad-student-uncovered-a-huge-fraud.html

      If this Stanford prof is not confabulating doubts post hoc, he really was skeptical until he saw Green’s name on the paper. The fact that he saw Green’s name on it and immediately decided to trust the result is actually part of the problem in peer review. You see a famous name on the paper and immediately decide to trust it. I can see the logic of that: a fresh grad student’s work is likely to have mistakes and so on, a seasoned researcher’s not. But reality does not match that plausible scenario. Experienced researchers often can’t do even basic things right. I routinely read papers by world authorities who play with stopping rules, run 8,000 t-tests with the p-value threshold set at 0.001, report null results of low-powered studies as if they were positive findings (i.e., they don’t know what a p-value tells you), selectively publish results that magically and always support their own theoretical position, etc., etc. Very recently an experienced researcher, a full professor with 15–20 years of experimental work behind them, told me that they didn’t know how to do a paired t-test and would have to ask a colleague about the details (a post-doc had done an incorrect analysis on a paper I co-authored with this person).
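      Since the paired t-test came up: it is nothing more exotic than a one-sample t-test on the per-subject differences. A minimal sketch in Python, using made-up numbers purely for illustration:

```python
import math
import statistics

# Hypothetical before/after scores for five subjects (made-up data).
before = [5.1, 4.8, 6.0, 5.5, 4.9]
after = [5.9, 5.2, 6.3, 6.1, 5.4]

# A paired t-test is a one-sample t-test on the per-subject differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_d = statistics.mean(diffs)       # mean of the differences
sd_d = statistics.stdev(diffs)        # sample SD of the differences
t = mean_d / (sd_d / math.sqrt(n))    # t statistic, n - 1 degrees of freedom

print(f"t({n - 1}) = {t:.3f}")
```

      In practice one would hand the same two lists to scipy.stats.ttest_rel, which computes the same statistic and also returns the p-value from the t distribution with n − 1 degrees of freedom.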

      “Even assuming the code is in a language the reviewer uses, vetting other people’s code is an extremely difficult task, far more so than writing your own code to accomplish the same purpose in most cases. And the original author of the code probably understands that code far better than any second reader ever will.”

      You are right that reading other people’s code is hard. But that is partly because people write bad and careless code (to his credit, Andrew is pretty open about his messy code, in that he declines to release code because it’s too messy; my code is also messy, but I am trying to improve), and that is part of the problem with doing science. If authors were required to release code and data with the paper for review, then even if busy people like you are unwilling to examine them, the requirement would force the submitter to do a second check to make sure they didn’t mess something up, and to document their code. By the way, if anyone were to release their data and code with a paper I was reviewing, I would definitely go through it. It would add some hours to the review process, but it would be time well spent. Note that many don’t even release their data after publication; I would say around 75% do not, even when asked.

      None of this means Chris G is not right. I agree with him that his minimum achievable standard is laudable (especially since it seems we can’t even meet that relatively low criterion with peer review).

      A further point, related to the first one, is reviewer ignorance. I have had reviewers complain about the relevance of replicating old results, on the grounds that replications add nothing new to the debate, and I have seen (other people’s) papers rejected because reviewers think that combining information from past studies is just cheating. Editors of major journals think that lower p-values give stronger evidence for the particular alternative hypothesis being pushed by the authors. Peer review from such people? No thanks. If I could give a failing grade to my fellow reviewers when I do peer review, I would. Unfortunately, there is no culture of commenting on other people’s reviews: you just get the decision letter from the editor, and it’s already over. I have often been amazed by an editor’s decision to reject someone’s paper because some reviewer came up with a comment that reflects a lack of basic understanding of statistics.

      • “The fact that he just saw Green’s name on it and immediately decided to trust the result is actually part of the problem in peer review. You see a famous name on the paper, and immediately decide to trust it.”

        Peer reviews are anonymous (both for the reviewer and the submitter) – the draft submitted to reviewers does not contain any information that could expose the author(s)’ identity. In reality, because so many working papers are circulated, it’s not all that uncommon for a reviewer to have a pretty good guess at who the author might be (and perhaps it’s not too hard for submitters to make a reasonable guess as to who might referee their papers). But in the case of the LaCour and Green paper, who knows. I’m sure the two anonymous reviewers aren’t eager to expose their own oversight.

        • “Peer reviews are anonymous (both for the reviewer and the submitter) -the draft submitted to reviewers does not contain any information that could expose the author(s)’s identity.”

          That’s simply not true. There are fields where this happens, but it is by no means the norm, e.g., in psychology and psycholinguistics. I rarely review papers in which the authors’ names are not released, and I (almost) always sign my reviews (after I got tenure, always).

      • I didn’t realize Andrew refuses to release code? Are you sure?

        My impression was that Andrew was pretty much in favor of open data / accessible code in the interest of replication.

        • Rahul:

          I don’t generally refuse to release code; I think what Shravan is talking about is that for many of my projects, the code is so messy that there’s nothing that I can easily release. I’m trying to do better, but for the old projects the code is a mess.

        • For my part, I am trying now to release all data and code with each publication on my home page. Sometimes, though, when most of the work was done by a student, I have difficulty enforcing that rule, so I also sometimes fail to release all data and code automatically with a paper when it is published. The messiness of my code is a perpetual problem that I have not yet solved; the issue usually is lack of time. I imagine that is also the source of Andrew’s difficulties with adding a data+code link to every published paper on his home page.

        • One way to make this succeed is for journals and editors to take a strong stand: no publication until the data and code have been deposited online. That might provide more motivation.

        • One difficulty is when the data are individual-level data covered by FERPA or other regulations, or owned by people other than the researchers. Often the data can be requested, but then, as with the census, error might be added (or other tricks used to make identification impossible). Sometimes the contract between the researchers and the organizations from whom the data are collected requires the names of all the people who will have access to the individual data.

          A problem with editors needing to look at the data and code is how to pay for this (assuming something is done with them). Would this mean the authors paying for it (even when a paper is rejected), raising the price of journals, or just assuming that the ever-increasing number of submissions will be far outnumbered by the number of methods folks wanting to check code for free?

        • To be fair, I program for a living (around 1/2 to 3/4 of each day) and it’s difficult to do. The best way is to use a source control system like git and have others review every check-in (or pull request, depending on what your source control system calls it). On my team, we require 2 reviewers for every commit. It ensures that at least one other person can read your code, and a good reviewer can help a lot.
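          The gate described above (no commit reaches the main branch without review) needs nothing fancier than a feature-branch workflow. A minimal sketch using plain git in a throwaway directory (all file, branch, and author names here are made up):

```shell
# Minimal feature-branch workflow: work happens on a branch, a reviewer
# inspects the diff, and only then is it merged. Runs in a temp directory.
set -e
cd "$(mktemp -d)"
git init -q project
cd project
git config user.name "Author"
git config user.email "author@example.com"

echo "print('hello')" > analysis.py
git add analysis.py
git commit -qm "Initial analysis script"
MAIN=$(git symbolic-ref --short HEAD)  # main or master, depending on git version

# The author develops on a feature branch, never directly on the main branch.
git checkout -qb feature/cleanup
echo "print('hello, world')" > analysis.py
git commit -qam "Clarify greeting"

# The reviewer sees exactly what would change before approving.
git diff "$MAIN"..feature/cleanup

# After approval, merge with an explicit merge commit recording the review.
git checkout -q "$MAIN"
git merge -q --no-ff feature/cleanup -m "Merge feature/cleanup (reviewed: 2 approvals)"
git log --oneline
```

          Hosting platforms automate the same gate with a protected branch plus a required number of approving reviews per pull request.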

  4. The problem with views like the one reported by the IES is that many people believe not just what they read in Science and Nature, but often believe papers in weak journals as if they were in Science or Nature. It would be nice if everyone believed what Chris G. writes above.

    On the LaCour & Green reviews, are the reviewers allowed to release them?

      • Allowed by the journal, I mean. I know most journals do not allow reviewers to release them, but where people have asked to publish reviews (even anonymously), I know the journals ask the reviewers. I guess whether each reviewer would want their review published depends on what the review said. When I review and am then sent all the reviews, I usually read them. Some are poor quality, as I assume all readers of this blog know. It does show that peer review is not perfect.

  5. Unfortunately, it seems, the answer is that there is “no assurance” that the public can trust the work.

    It is called “regulatory capture” in other industries.

  6. I believe I have a “solution” to these problems. The old system of peer-review is clearly broken. A journal could have an initial editorial decision regarding whether a paper appears appropriate and potentially interesting enough for further review. Passing this stage would put the paper out for public review – with a requirement that data and code also be publicly provided. Authors could put this on their resume for credit towards promotion and tenure (if such things continue to exist). But they only count as having created something potentially worthwhile.

    Next, a period of public review – anonymous or not – lasts for a few months. At the end of this period, the editors review all the comments and make a further decision. They may decide to count attributed comments more than anonymous ones – or not. They may ask for further revisions – or not. When they are through, if the paper is accepted for more official publication, this will count more heavily as a successfully peer-reviewed publication. It will have passed the “wisdom of crowds,” which, of course, includes plenty of unwise comments along with wise ones.

    I think this system would perform far better than the existing one. However, I am not naive. For this to be implemented, it will take a strong editorial board – there is a lot more work for them to do as well as more professional exposure (since they will be required to exercise more overt judgement). And, those with professional stature have more to lose through this system than they stand to gain – hence, I am skeptical that this will ever be adopted. I have had such a system in mind for 20 years now but don’t know how to make it happen. I don’t have the academic stature myself and doubt that those that could make this happen are willing to do so. But I believe that incremental changes to the current system will not achieve much and that a more radical approach is needed. I think this system would result in generally higher quality of published work, far more effort at replication (including the efforts of students to get involved in the process, which has its own motivational rewards), and more timely publication of research. More widely read research, as well.

    Any takers among the esteemed readers of this blog?

    • Bill:

      The last line of the story is interesting. He quotes Kenneth Rogoff on his notorious “Excel error” paper: “He said in an email message that he did not think peer review would have caught the error.”

      Probably not, but some sort of open-data rule would’ve allowed it to have been caught much sooner.

        • Thanks Bill

          “Another answer to the problem of fraudulent research, though, might be more research. The federal government could sponsor studies to determine how much cheating goes on, how much harm it causes and how best to combat it.”

          Interesting, and it sounds like auditing to me (and about time), although it’s the erroneous results that are likely far more important.

        • I had a quick look: http://www.nsf.gov/pubs/2014/nsf14546/nsf14546.htm

          Any idea if they are actually going to get access to raw data and study documentation?
          (They might have funded researchers who are compelled to provide that under their funding agreement.)

          It would seem the first step would be to get a sense of how often various errors occur along with “occasional” fraud.
