It can be hard to get the data from a published study. Here are 2 frustrating examples:

In comments to a recent post, Unanon wrote:

I [Unanon] don’t understand the difficulty in getting vaccine data. A lot of posters and readers in this blog are affiliated with universities—prominent ones, at that. Also, it looks like Pfizer has a decent track record in sharing their clinical trial data:

https://cdn.pfizer.com/pfizercom/research/research_clinical_trials/CTDS_Current_Pfizer_Clinical_Trial_Data_Sharing_Metrics_2021_Q4.pdf

It shouldn’t be impossible to obtain the information from them. If folks are truly earnest in getting the data, why not work on the data access request? It makes sense there should be *some* amount of red tape to jump through (nominal costs help weed out bad actors, etc). https://www.pfizer.com/science/clinical-trials/trial-data-and-results/data-requests

To which Dale replied:

Since you linked Pfizer’s report on data requests, you must have read their policy document. Among the lengthy description of the hoops you need to jump through, there is this:

“Data from these trials will be made available 18 months after the primary study completion date.”

That would appear to make it difficult to get the data Daniel is seeking in the near future. Still, your point about the readers of this blog is well taken. Many are well connected and might be successful at jumping through Pfizer’s hoops. Unfortunately, since I am not at an Ivy League institution, my efforts have not been successful in the past. Pfizer portrays their policies as going “above and beyond” what is required. Unfortunately, that is a very low bar, indeed.

I have no experience with Pfizer or vaccine data but I can share two of my experiences regarding data availability. The short answer is that affiliation with prominent universities is no guarantee of success.

1. In 1989, as a graduate student, I contacted a research group at the Environmental Protection Agency to ask about some raw data related to a series of papers they’d published. They refused to send it to me. Many years later I wrote up the story, and several years after that I reanalyzed the published data in a way that seemed good to me. I’d still like to see the raw data but I guess at this point it would be impossible; the lab notebooks have probably been thrown out. I wish I’d made a formal Freedom of Information Act request back in 1989; I was just too accepting of the idea that they could just say no to a data request. Just to be clear: I’m not saying that I should’ve had some sort of priority because of my university affiliation; rather, I think they should’ve sent the data to anyone who requested it. It’s not like their mailbox was filling up with data requests, right? And . . . the data were measurements on dead chickens, so I don’t think we have to worry about privacy or confidentiality.

2. Last week someone pointed me to a published article and asked me what I thought. The article seemed ok and I wanted to check out some things on the data, so I clicked through to the data availability statement and it said that the data would be shared in ClinicalTrials.gov after publication. But the paper was published and the data weren’t there! I sent an email to the authors, who replied, “The data will be forthcoming in ClinicalTrials.gov, as stated in our data sharing agreement.” A couple days later I checked again and the repository had appeared on the website, with lots of details of the study but no data yet. But I’m encouraged that the webpage has appeared. I’ve scheduled this post on the usual lag, so we’ll see if the data appear before this post does!

P.S. I noticed this post was about to appear so I sent off another email to the authors of the paper described in item 2 above. No response yet; we’ll see what happens.

P.P.S. On 23 Feb 2022 they replied to my question with the response, “The data is forthcoming soon, and we will let you know as soon as it is posted.” I followed up on 5 Sep 2022 asking for updates, and on 7 Sep they replied that the data were entered on the website and they were waiting for an administrative review. I’ve not heard anything since then. Just to be clear, I’m not suggesting anything nefarious here; it’s just the usual story that data sharing is not anyone’s first priority.

16 thoughts on “It can be hard to get the data from a published study. Here are 2 frustrating examples:”

  1. In a sane world people receiving public funding to produce public goods (knowledge) who hoard that good after receiving the funding would get jail time. That’s called defrauding the federal govt.

    To get an exception would require explicitly stating the exception in the grant application and would automatically lower your grant score.

    We don’t live in a sane world.

    • Daniel:

      Example #1 above is even worse because they were actually Federal employees!

      That said, I disagree with you regarding jail. Jail seems like a barbaric punishment and there should be many other better ways to do this.

        • “The economist” is much too general here. Robin Hanson has a reputation for out-of-the-box and attention-grabbing ideas. They are often “nuts” although, as Andrew suggests, compared to what? Many of our existing policies are nuts as well. I rarely agree with Hanson’s proposals, but I think he is correct to focus on incentives. That is the problem I have with the JAMA announcement – they say lots of nice things about data sharing, but then state that there is no penalty for those who choose not to share. Much of my economics training I’d like to forget, but the focus on the primacy of incentives is not one of those things. If there is no penalty for not sharing data, then, as they say, “solve for the equilibrium.”

      • The good: lots of stuff released. I believe there are 414 documents and plenty of data files.
        The bad: I can’t find anything resembling a data dictionary, nor anything that indexes all those documents.
        Perhaps I haven’t sorted it out yet, and if anyone can help steer me to the relevant documents (for exploring the data on vaccine effectiveness, for example), I’d appreciate that.

        Given that this information release was the result of a court case, it is perhaps understandable that it isn’t provided in a more friendly format. On the other hand, it is a missed opportunity to establish a reputation for openness and engagement with communities of interest. For now, it strikes me as an adversarial statement. After all, methodological terrorists we are.

  2. In this case, I don’t think they have the data you’d want. The nature of the measurements is too imprecise to justify more than 2-3 doses roughly scaled by body mass.

    Think about just comparing weight to a neutralizing antibody assay. First of all, the measured levels are a snapshot of something constantly waning but also fluctuating daily and due to various environmental factors (stress, exposure to similar epitopes). Then there are problems with the assay itself. E.g., what type of cell, how much ACE2 is expressed, how about furin, etc.? Then are they counting number of foci or what, and how do they deal with syncytia formation?

    Then the assay is only a proxy anyway. What you really want to know is whether the person is protected against severe covid (I’m guessing you are only looking at blood Ab levels here). So really you also want to know how the T and B cells react, which brings a whole new set of errors like those for the neutralization assay.

    And the virus is always mutating, which will also add uncertainty. This is all before getting into accounting for side effects and other costs, or errors in administration.

    So whatever dosing table got developed is going to be based more on questionable assumptions than reality.

  3. If you are concerned that there is a potentially fundamental flaw with the paper discussed in #1, you could post your concerns to PubPeer (link below). That paper is still getting cited, as recently as 2021 (by both you and someone else). As you say, it is certainly too late to see the data itself, and it would be a hassle for you to write it all up, but who knows.

    Link: https://pubpeer.com/publications/46049A0F852FE1925BA0CD14F2E85F

      • Ahh, I see.

        Also, I agree that the authors were wildly unethical for not providing data. It’s so common for groups to publish what looks like important work, and then refuse to share data.

        “Oh my gosh, telephone poles cause cancer!”
        “That’s nuts! Can I see the data?”
        “Nah, I’d rather just keep this important data all to myself, thanks.”

  4. This is a notice to readers of this post. The editors of JAMA have just announced two new policies for their journals: one deals with article access for the public and the other with data access. Here are links to both announcements:

    https://jamanetwork.com/journals/jama/fullarticle/2799369

    https://jamanetwork.com/journals/jamaophthalmology/fullarticle/2799717?guestAccessKey=ee9f732d-5c18-4164-aa4b-807ab052c9f9&utm_source=silverchair&utm_medium=email&utm_campaign=article_alert-jamaophthalmology&utm_content=olf&utm_term=121422

    • Welcome sentiments, but I find this a bit lacking:
      “While the JAMA Network journals endorse the principle of data sharing and have adopted this policy to encourage the sharing of data, sharing is not mandated by the JAMA Network. An author’s intention to share data or not will not be considered in the editorial decision.”

      After the lengthy description of the history of data sharing and numerous statements about how they support it, the quoted line is what I believe is of importance. It is still up to the authors and it won’t affect publication. I’m not sure why they couldn’t be a bit stronger – even accepting that there are reasons for data not to be shared, why shouldn’t it be considered in the editorial decision?

  5. > The short answer is that affiliation with prominent universities is no guarantee of success.
    There is probably no denying that affiliation with prominent universities would still increase the chance of getting the EPA’s attention. Maybe it is a more objective metric for U.S. News to adopt to rank colleges?
