A world of Wansinks in medical research: “So I guess what I’m trying to get at is I wonder how common it is for clinicians to rely on med students to do their data analysis for them, and how often this work then gets published”

In the context of a conversation regarding sloppy research practices, Jordan Anaya writes:

It reminds me of my friends in residency. Basically, while they were med students for some reason clinicians decided to get them to analyze data in their spare time. I’m not saying my friends are stupid, but they have no stats or programming experience, so the fact that they are the key data analyst for these data sets concerns me, especially considering they don’t have much free time to devote to the work anyways. So I guess what I’m trying to get at is I wonder how common it is for clinicians to rely on med students to do their data analysis for them, and how often this work then gets published.

I asked Jordan if I could post this observation, and he added:

Here’s some more details. I recently got an email from a friend saying they needed my help. They had previously taken a short introduction to R course and used those skills to analyze some data for a clinician. However, recently that clinician sent them some more data and now their code no longer worked so they asked me for help [“I need to do a proportion and see if it is significant . . .”]

They gave me the file and I confirmed R couldn’t read it (at least with the default read.csv—I’m not an R expert), so I looked at it in Python. There were extra rows, some missing data, and the job variable sometimes had commas in it so you couldn’t use comma as a delimiter. Anyways, I provided them with the information they wanted, and didn’t mention that some of their requests sounded like p-hacking.
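The kind of file problems described here (extra rows, missing values, unquoted commas inside a field) can often be spotted before any analysis by counting the fields in each row. What follows is only a minimal sketch in Python, not the code Jordan actually used: the sample data, the column names, and the helper names `profile_rows` and `flag_bad_rows` are all made up for illustration.

```python
import csv
import io
from collections import Counter


def profile_rows(text, delimiter=","):
    """Parse delimited text and count how many fields each row ends up with."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows, Counter(len(row) for row in rows)


def flag_bad_rows(rows, expected_fields):
    """Return (row_number, row) pairs whose field count differs from the header's."""
    return [
        (number, row)
        for number, row in enumerate(rows, start=1)
        if len(row) != expected_fields
    ]


# Hypothetical sample: row 3 has an unquoted comma in the job field,
# row 4 is blank, and row 5 has a missing job value.
sample = (
    "id,job,outcome\n"
    "1,nurse,1\n"
    "2,teacher, retired,0\n"
    "\n"
    "3,,1\n"
)

rows, widths = profile_rows(sample)
bad = flag_bad_rows(rows, expected_fields=len(rows[0]))

print(widths)  # distribution of field counts across rows
for number, row in bad:
    print(f"row {number}: {len(row)} fields -> {row}")
```

Note that the row with the missing job value still has the right number of fields, so a check like this catches structural damage but not missing data; that needs a separate pass.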

My other friend is familiar with my Wansink work and what p-hacking is, and he told me he recently ran a bunch of tests on a data set for a clinician so I told him he p-hacked for that clinician, and he said that yes, he did.
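To see why running "a bunch of tests" on one data set is a problem, it helps to do the arithmetic. The sketch below assumes the simplest textbook case, which is an assumption and not a description of what the friend actually did: every null hypothesis is true, the tests are independent, and each uses the same threshold `alpha`. The function name `familywise_error_rate` is made up for illustration.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one p < alpha among n independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests


# Assumed scenario: no real effects anywhere, independent tests.
for n_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(n_tests)
    print(f"{n_tests:>2} tests: P(at least one 'significant' result) = {rate:.3f}")
```

Under these assumptions, twenty tests give roughly a 64% chance of at least one "significant" result even when nothing is going on. Real analyses are messier (tests are correlated, and the choices are often made after looking at the data), so this is a rough guide only.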

Since I didn’t do my internship years I don’t know how these collaborations come about. I imagine a clinician needs help analyzing data and doesn’t have grad students so they turn to med students, and the med students don’t want to turn down a chance at getting their name on a paper.

As with Wansink, the big problem is not p-hacking but just a lack of understanding of the data, a disaster of the data collection/processing/analysis steps, and a lack of adult responsibility throughout.

P.S. Anaya adds:

I have another update for you. I talked with my friend today and they have refused to do any more work on the project! This was supposed to be a short analysis during their 4th year of med school, but I guess there were some data handling problems, so the clinician got someone else to get the data out of the database, which I guess is what I saw last year (by which time the med student was a first year resident and didn’t have time to work on the project). Anyways, now the med student (now a second year resident) doesn’t trust the data enough to devote any more of their time to the project! So we have people who can’t handle data giving their pu pu platter to a med student (now second year resident) who doesn’t know how to program. I’m really starting to understand what happened with some of Wansink’s papers now.


  1. Fortunately there are good researchers, but they are very hard to identify.

    Not only is what Jordan worried about rather common, it apparently has been a common practice for clinician researchers to have three people analyse their data separately so they can choose the most publishable “mistaken analysis”. Often they are even paying them, so apparently it’s worth the cost to them.

    Sadly, I worked with a med student who, during their research year, complained too publicly about all the nonsensical analyses they were asked to do and was “fired”. They had to repeat that year of their program. They chose a non-research stream. Now, the primary investigator of that research group received a prestigious award a few years later specifically for their kind mentoring of young clinical researchers.

    To get a sense of the full problem: one research manager once told me they did not bother to check the analyses their bio-statisticians did for their group because they would have no idea if it was correct or not.

    (The clinicians I worked with would get second opinions on my work – at least once in a while.)

    • Zad Chow says:

      > “… once told me they did not bother to check the analyses their bio-statisticians did for their group because they would have no idea if it was correct or not.”

      I cannot help but wonder whether this is a problem in research and statistics, or generally just a human problem. And I wonder about this because I was watching a series about medical diagnoses not too long ago and remember the show mentioning how cancer patients rarely consult other oncologists for a second opinion after getting a diagnosis, etc.

    • BenK says:

      The quip – not quite a joke – was that MDs were never good researchers; which I suppose makes identifying good researchers easier because of self-identification.

  2. Andrew Halim says:

    About the ‘Wansink’ in medical research, it is more common than you think, if not the norm. All med students are required to write (not publish) a paper as part of their graduation requirements. In a year, you would get 12,000 med students and each one of them gets one senior clinician as their supervisor, which roughly equates to >12,000 papers written annually. Why more than 12,000? Because the teaching clinicians are supposed to write papers as well, but a lot of them will ask their med students to do it. More often than not, you will also be coerced to produce a statistically significant result out of rubbish study design/data. As a med student, you cannot refuse if you want to stay in med school, and it was ‘fun times’.

    Roughly, <5% of those senior clinicians would understand the basic concepts of statistics, and only a few universities have access to statisticians/biostatisticians. Even then, med students have to pay a huge price if they want to see a statistician. In the end, you have sleep-deprived med students who have no clue about statistical analysis writing seemingly legitimate 'research' papers while trying to survive med school. Given that they have always done bad science for the sake of fulfilling administrative requirements, and given the persistent cult-like behaviors in the medical sector, I don't think this pattern will go away anytime soon.

    P.S. These are based on my observations in Indonesia; they may not apply elsewhere. Thoughts?
    I am an ex-doctor (General Practitioner) from Indonesia and now I work as an information analyst/statistician in NZ.

    • E says:

      Yes, this is very likely true in the US too. Research is often a “requirement” of residency — maybe one project each year.

      Physicians want an analysis that will give them something interesting, satisfy the requirements to graduate, not embarrass them too much in a 10 minute talk to the department, and possibly be presented as a poster or written up. The usual statistics education that doctors have may be one (or two) lectures while in medical school on sensitivity and specificity. So 2-by-2 tables from which these can be calculated or t-tests from Excel are the norm.

      Even when the department (or clinician) has access to a statistician, there is a horrible language barrier between the two. Residents in training have too much on their plate to attempt to learn and understand the analysis (which they have to do on the evenings and weekends) and the consulting statistician may not have an incentive to teach or do more than a simple NHST analysis.

    • Andrew says:

      Andrew H., E:

      It sounds like the problem is not the requirement of research, but (a) the requirement that this research be summarized quantitatively rather than qualitatively, and (b) the implicit requirement or pressure or demand for statistical significance.

    • Zad Chow says:

      I suspect that a lot of med students and physicians with little knowledge of statistics will often consult the first few results of Google when conducting such quantitative analyses, so perhaps part of the solution may be flooding the front page of Google for certain search queries with results like “Best Statistical Practices”, etc.

  3. Terry says:

    Somebody needs to do a survey of medical research and produce a handy guide to which results should be believed and which should not. Or some sort of more nuanced evaluation scheme. Or anything at all.

    How is anyone on the outside supposed to come to grips with the upshot of this blog post? Vague complaining is not good enough. Produce something we can use.

    • Malcolm says:

      It’s called “critical appraisal”, and it’s an industry in itself.

      And the Cochrane Collaboration aims to be the handy guide you seek. Opinions differ on how successful it’s been, but I’d classify it as serious effort.

      • Terry says:

        Thanks. Glad to see someone is doing something about this.

        So maybe this problem is not such a problem after all. Maybe the decisions that matter are not being polluted by these problems.

      • Carlos Ungil says:

        Regarding the Cochrane Collaboration, they dropped the “collaboration” bit from the name a few years ago. There have also been internal disagreements about how the organization is managed, and last year one prominent member was expelled and many others resigned in protest.

        • If I remember correctly, Keith O’Rourke was kicked out of Cochrane for basically being too pessimistic about mechanically doing meta-evaluations the way the rest of the group wanted? This is just my memory of something he posted here, so I hope he’ll correct me if I’m wrong, the point being that politics, money, etc. pollute medicine everywhere you turn, even supposedly independent non-profit analyses.

          • Actually I was banned from the email discussion group of Cochrane’s Statistical Methods Group. Not so much politics and money as politics and prestige.

            The followers of the “thought leaders” of the group as well as the “thought leaders” themselves were annoyed at my critical technical comments which they claimed were overly provocative.

            I think this is just group dynamics heading towards group think – “thought leaders” emerge, others start to follow them and want to believe they are almost never wrong and then the “thought leaders” start to believe that themselves and start to think they need to enforce their views.

            The Peter Gotzsche thing seems to also involve money which may be what made it more serious.

    • Anoneuoid says:

      You can’t trust anything produced by that process. Why do you think people live just as long in Cuba as in the US despite chronic shortages of modern drugs and equipment?

      Health insurance != healthcare != health

  4. Clyde Schechter says:

    In my experience at two medical schools since the mid-1980s, this is pretty common.

    Most of the clinicians would prefer to have professional statistical support for their projects, but it is really not very available. The well-trained statisticians generally are available mainly to those who have grant funds to support them. Medical students are absolutely not doing the analyses on grant funded projects. But lots of clinicians try to write papers based on data they can gather from their practices, and the older ones have zero familiarity with statistics. They also know that, at least currently, medical schools do require a one-semester course in biostatistics/epidemiology, so having a medical student do the analysis is a (small) step up from trying to do it on their own. The quality of those courses is variable, but rare is the medical student whose statistical comprehension goes beyond the quest for p < 0.05. And, in fact, in many of those courses the educational objectives explicitly focus on preparing the students to be able to read journal articles somewhat critically, not to be able to do statistical analyses themselves.

    Some medical centers have some people with serious statistical knowledge make themselves available on a (very) limited basis. Typically they might be assigned to give a few hours of time to a project for free, and charge a relatively nominal fee for a few hours more. But that's it. If you want serious statistical support, you have to pay professional-level fees. At many medical centers, no unpaid professional statistical support is available at all. As research is at best a minor sideline in the careers of most clinicians, the result is what your friend observed.

    The other side of this is that many of the studies these clinicians do are poorly designed, and cannot be salvaged with analysis. So the powers that be may well be correct in their judgment that they should not have their statistically savvy employees spend much time on them.

  5. Jim Hatton says:

    Fifty years ago I worked for a doctor in a teaching hospital as a programmer/math person. He asked me to do a t-test for a sample size of 7 (actually one of the dogs had died), so I was to use a sample size of 6. I refused. I thought the outlier was relevant. He got an even less knowledgeable person than me to do it.

    • Andrew Halim says:

      Well, I suppose nowadays most people will try to get a sample size of at least 30.
      It is an over-generalization but then again the changes are widespread.

      Perhaps, just like crafting public health messages, it is better to spread simple statistical messages to a lot of clinicians rather than expecting them to have decent statistical knowledge.

    • Jon says:

      Refusing cooperation based only on sample size is a terrible approach to science. Sample size on its own says very little. Can you give us some context? In many clinical situations, a sample size of 6 would be very informative and could provide serious insights.

  6. Jordan Anaya says:

    I was talking with a faculty member at Hopkins and unprompted he started talking about how clueless the doctors were when it came to research. I asked him what the motivation was for the doctors to publish, and apparently publications are tied to their promotions.

    • jd says:

      Not surprising.
      I believe in med school there is a biostat ‘lecture’ and the main concern is how many questions from biostat appear on Step 1 (I think it was 2?). As a resident, research is a requirement, which, in my opinion should be re-thought, considering how exhausted and over-worked residents are. They are either tacked onto an existing project or try to do their own in the brief time they have (which makes for an extremely limited study).
      So it’s requirements all the way down.
