Should drug companies be required to release data right away, not holding data secret until after regulatory approval?

Dale Lehman writes:

The attached article from the latest NEJM issue (study and appendix) caught my attention – particularly Table 1 which shows the randomized vaccine and control groups. The groups looked too similar compared with what I am used to seeing in RCTs. Table S4 provides various medical conditions in the two groups, and this looks a bit more like what I’d expect. However, I was still a bit disturbed, sort of like seeing the pattern HTHTHTHTHT in 10 flips of a coin. So, there is a potential substantive issue here – but more importantly a policy issue which I will get to shortly.

The Potential Substantive Issue

I have no reason to believe the study or data analysis was done poorly. Indeed, the vaccine appears to be quite effective, and my initial suspicions about the randomization seem less salient to me now. But, just to investigate it further, I looked at the confidence intervals and coverage if the assignment had been purely random. Out of 35 comparisons between the control and vaccine groups (some demographic and some medical – I made no adjustment for the fact that these comparisons are not independent), 17% fell outside of a 95% confidence interval and 40% outside of a 68% (one standard deviation) confidence interval. This did not reinforce my suspicions, as the large sample size made the similarities between the 2 groups less striking than I initially thought.
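
The letter does not say exactly how the intervals were computed, so the following is only a minimal sketch of one plausible reading: for each baseline characteristic, compare the two arms' proportions against the standard error expected under purely random assignment, then tally how many comparisons fall outside the 95% and 68% ranges. The arm sizes and counts are invented for illustration and are not taken from the NEJM tables. As a benchmark, if the comparisons were independent and assignment purely random, roughly 5% and 32% would be expected to fall outside the two ranges.

```python
import math

def balance_z(count_a, n_a, count_b, n_b):
    """z-score for the difference in proportions between two arms,
    using the pooled standard error expected under pure random assignment."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def share_outside(z_scores, threshold):
    """Fraction of comparisons whose |z| exceeds the threshold."""
    return sum(abs(z) > threshold for z in z_scores) / len(z_scores)

# Hypothetical arm sizes and baseline counts (vaccine arm, placebo arm).
# These numbers are made up; they are NOT from the study.
N_VACCINE, N_PLACEBO = 20000, 20000
rows = [(9000, 9000), (4000, 3800), (1500, 1540), (600, 650)]

z_scores = [balance_z(a, N_VACCINE, b, N_PLACEBO) for a, b in rows]
outside_95 = share_outside(z_scores, 1.96)  # outside a 95% interval
outside_68 = share_outside(z_scores, 1.0)   # outside a one-standard-deviation interval
print(outside_95, outside_68)
```

Like the comparison in the letter, this sketch makes no adjustment for dependence between the baseline characteristics.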

So, as a comparison, I looked at another recent RCT in the NEJM (“Comparative Effectiveness of Aspirin Dosing in Cardiovascular Disease,” NEJM, May 27, 2021). Doing the same comparisons of the difference between the control and treatment groups in relation to the confidence intervals, 36% of the comparisons fell outside of a 95% confidence interval and 68% outside of the 68% confidence interval. This is closer to what I normally see – it is difficult to match control and treatment groups through random assignment, which is why I always try to do a multivariate analysis (and, I believe, why you always are asking for multilevel studies).

So, this particular vaccine study seems to have matched the 2 groups more closely than the second study, but my initial suspicions were not heightened by my analysis. So, what I wanted to see was some cross tabulations to see if the two groups’ similarities continued at a more granular level. Which brings me to the more important policy issue.

Policy Issue

The data sharing arrangement here stated that the deidentified data would be made available upon publication. The instructions were to access it through the Yale Open Data Access Project site. This study was not listed there, but there is a provision to apply for access to data from studies not listed. So, I went back to the Johnson & Johnson data sharing policy to make sure I could request access – but that link was broken. So, I wrote to the first author of the study. He responded that the link was indeed broken, and

“However, our policy on data sharing is also provided there on the portal and data are made available AFTER full regulatory approval. So apologies for that misstatement in the article we are working to correct this with NEJM. For trials not listed, researchers are welcome to submit an inquiry and provide additional information for consideration by the YODA Project.”

I requested clarification regarding what “full regulatory approval” meant and the response was “specifically it means licensure in the US and EU.”

I completely understand Johnson & Johnson’s concern about releasing data prior to regulatory approval. Doing otherwise would seem like a poor business decision – and potentially one that would stand in the way of promoting public health. However, my concern is with the New England Journal of Medicine, and their role in this process. Making data available only after regulatory approval seems to offer little opportunity for post-publication (let alone, pre-publication) review that has much relevance. And, we know what happened with the Surgisphere episode earlier this year, so I would think that the Journal might have a heightened concern about data availability.

I don’t think the issue of availability of RCT data prior to regulatory approval is a simple one. There are certainly legitimate concerns on all sides of this issue. But I’d be interested if you want to weigh in on this. Somehow, the idea that a drug company seeking regulatory approval will only release the data after obtaining that approval – and using esteemed journals in this way – just feels bad to me. Surely there must be better arrangements? I have also attached the Johnson & Johnson official policy statement regarding data sharing and publication. It sounds good – and even involves the use of the Yale Open Data Access Project (an Ivy League institution, after all) – but it does specify that data availability follows regulatory approval.

I agree that it seems like a good idea to require data availability. I’m not so worried about confidentiality or whatever. At least, if I happened to have been in the study, I wouldn’t care if others had access to an anonymized dataset including my treatment and disease status, mixed with data on other patients. I’m much more concerned the other way, about problems with the research not being detected because there are no outside eyes on the data, along with incentives to do things wrong because the data are hidden and there’s a big motivation to get drug approval.

9 thoughts on “Should drug companies be required to release data right away, not holding data secret until after regulatory approval?”

  1. I’d go farther. If you want an exclusive monopoly on a drug or medical device you MUST provide 100% of the data you will rely on in a machine readable format on a government run data repository downloadable by any citizen prior to applying and you MUST get 100% pre-approval from all your patients to release their data. End of story.

    Fuck anyone who thinks they can get the government to shoot or jail their competitors without the public OWNING 100% of the data… And yes enforcing patents means the government will use whatever force is necessary to ensure no one competes with you for 20 years up to and including killing anyone who resists arrest with sufficient force. It’s not an exaggeration and we should treat patents with a LOT of skepticism and a demand that it’s the public that OWNS the information, the company just gets a brief period to rent it from us exclusively.

  2. Apologies if I’m missing something, but in the paper they say they used a stratified randomization. Doesn’t that entirely explain why the arms in this study match more closely on certain key demographics than you would expect from a non-stratified randomization?

    “Participants were randomly assigned in a 1:1 ratio, with the use of randomly permuted blocks, to receive either Ad26.COV2.S or saline placebo. Randomization was conducted with an interactive Web-response system and stratified according to trial site, age group, and the presence or absence of coexisting conditions that have been associated with an increased risk of severe Covid-19.”
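
To illustrate why the quoted design tends to produce closely matched arms, here is a minimal sketch of permuted-block randomization run separately within each stratum. The block size of 4, the stratum labels, and the function names are assumptions made for illustration; the paper does not report its block size, and this is not the trial's actual system. With blocks of 4, the two arms can never differ by more than 2 participants within any stratum.

```python
import random

def permuted_block_assign(n, block_size, rng):
    """Assign n participants 1:1 using randomly permuted blocks."""
    arms = []
    while len(arms) < n:
        half = block_size // 2
        block = ["vaccine"] * half + ["placebo"] * half
        rng.shuffle(block)  # random order within each balanced block
        arms.extend(block)
    return arms[:n]  # the last block may be only partly used

def stratified_assign(strata, block_size=4, seed=0):
    """Run an independent permuted-block sequence within each stratum."""
    rng = random.Random(seed)
    by_stratum = {}
    for i, s in enumerate(strata):
        by_stratum.setdefault(s, []).append(i)
    assignment = [None] * len(strata)
    for idxs in by_stratum.values():
        arms = permuted_block_assign(len(idxs), block_size, rng)
        for i, arm in zip(idxs, arms):
            assignment[i] = arm
    return assignment

def max_imbalance(strata, assignment):
    """Largest |vaccine count - placebo count| across strata."""
    diff = {}
    for s, arm in zip(strata, assignment):
        diff[s] = diff.get(s, 0) + (1 if arm == "vaccine" else -1)
    return max(abs(d) for d in diff.values())

# Hypothetical participants, each labeled by (site, age group, coexisting condition).
gen = random.Random(1)
strata = [
    (gen.choice(["site A", "site B"]), gen.choice(["18-59", "60+"]), gen.choice([True, False]))
    for _ in range(1000)
]
assignment = stratified_assign(strata)
worst = max_imbalance(strata, assignment)
print(worst)  # never more than block_size // 2
```

Because balance is enforced within every combination of the stratification variables, the arms will look more alike on those variables than simple coin-flip assignment would make them.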

  3. Finally, a topic that I know a little about!!! When I was working at PhRMA as head of regulatory affairs, we undertook a project to come up with a set of principles for the conduct of clinical trials and communication of clinical trial results. We had a small working group of seven company reps and I was joined by a PhRMA attorney who supported our division. It took us just short of a year to reach agreement and get all the necessary approvals, but we were both surprised and pleased by the ultimate work product: https://phrma.org/resource-center/Topics/Cost-and-Value/PhRMA-Principles-on-Conduct-of-Clinical-Trials (the first version was completed in 2008 and modified several times since then). We did have a lot of discussion about the sharing of data sets and there was never a consensus on how far the document should go in that regard. There were some companies who argued for transparency and availability upon approval of the drug but not pre-approval.

    I’m sure that there are a number of blog readers here that do not think the document goes far enough. That’s OK with me BUT what really galls me is that this somehow impugns the work the FDA statisticians and data analysts do during their review of the license application. FDA also does a lot of work (as do companies) monitoring data integrity and compliance. It’s not all about patient confidentiality.

    During my career, I was the principal spokesperson on clinical trials and drug safety. I spent a lot of time doing media on SSRI issues, which was always challenging (especially the lack of understanding about subjective versus objective clinical trial endpoints).

    As an aside, I was the project manager for the Observational Medical Outcomes Partnership (OMOP) which ultimately morphed into the Observational Health Data Sciences and Informatics (OHDSI, pronounced “Odyssey”). PhRMA provided the initial funding for this and David Madigan, then at Columbia, was one of the early OMOP investigators.

    • You are certainly correct to point out the role of the FDA analysts and I don’t mean to impugn them. But when they are the only non-industry people reviewing the data, then the rest of us are put in the position of “trust us.” I am not comfortable being put in that position. In any case, the reason for my sending this issue to Andrew concerns the role of the NEJM in this process. When they publish the study results without the data, they are implicitly affirming the existing review process and safeguards. After Surgisphere and (many) other episodes, I was hoping for a more proactive role on their part. Naive of me, I guess.

      I guess I’d like to put the burden of proof on pharma, the FDA, and the journals. Exactly why must the trial data be withheld until regulatory approval? It’s not hard to think of reasons, but at the very least, they should have to make that case.

      • In a previous career, I worked in pharma but left that over a decade ago so my knowledge may be dated. At that time, NEJM required a de-identified dataset and analysis programs with the submission of a manuscript describing results of a pharma sponsored clinical trial. There was a stat reviewer whose role was to reproduce results in the manuscript. Presumably, if they objected to the analysis itself, they were free to voice those objections.

        My conjecture on the approval date is that for a new molecular entity, until approval is granted, the availability of the drug is very limited and pharma companies probably feel that an open forum of the trial results isn’t necessary.

        Let’s face it though, pharma is extremely reluctant to release individual patient data for any of its trials. It won’t happen unless pressure is put on them to do so.

        • Maybe someone can clarify: all I find on the NEJM website regarding clinical trial data is

          “A protocol document (or equivalent) and a statistical analysis plan (SAP) should be submitted with each manuscript. The SAP should contain enough detail to enable someone else to replicate the analysis in a similar data set. Original and final versions of these documents are important to include if the documents have evolved. Documents should be dated, and changes in the study plans and accompanying rationale should be described. For clinical trials, formal protocol amendments and corresponding changes to the SAP will typically be available and should be included with the manuscript submission.”

          Also, the provision of the data clearly doesn’t apply generally, or the Surgisphere episode would never have happened. It is possible that there are special requirements for pharma clinical trials but I’m not seeing it. Again, maybe someone can clarify whether this was, or is, the policy at NEJM.

  4. Dale: in 2016, the NEJM Editor in Chief wrote that availability of data six months after publication would be a requirement for authors for publication.

    DOI:10.1056/NEJMe1601087

    However that Editor retired in 2018. I don’t know if the new editor supports this or not. Worthwhile to ask him in my opinion. I support your efforts to make individual level data available publicly – it is the responsible thing to do.
