Controversies in vaping statistics, leading to a general discussion of dispute resolution in science

Episode 2

Brad Rodu writes:

The Journal of the American Heart Association on June 5, 2019, published a bogus research article, “Electronic cigarette use and myocardial infarction among adults in the US Population Assessment of Tobacco and Health [PATH],” by Dharma N. Bhatta and Stanton A. Glantz (here).

Drs. Bhatta and Glantz used PATH Wave 1 survey data to claim that e-cigarette use caused heart attacks. However, the public use data shows that 11 of the 38 current e-cigarette users in their study had a heart attack years before they first started using e-cigarettes.

The article misrepresents the research record; presents a demonstrably inaccurate analysis; and omits critical information with respect to (a) when survey participants were first told that they had a heart attack, and (b) when participants first started using e-cigarettes. The article represents a significant departure from accepted research practices.

For more background, see this news article by Jayne O’Donnell, “Study linking vaping to heart attacks muddied amid spat between two tobacco researchers,” which discusses the controversy and also gives some background on Rodu and Glantz.

I was curious, so I followed the instructions on Rodu’s blog to download the data and run the R script. I did not try to follow all the code; I just ran it. Here’s what pops up:

This indeed appears consistent with Rodu’s statement that “11 of the 38 current e-cigarette users were first told that they had a heart attack years before they started using e-cigarettes.” The above table only has 34 people, not 38; I asked Rodu about this and he said that the table doesn’t include the 4 participants who had missing info on age at first heart attack or age at first use of e-cigarettes.

How does this relate to the published paper by Bhatta and Glantz? I clicked on the link and took a look.

Here’s the relevant data discussion from Bhatta and Glantz:

As discussed above, we cannot infer temporality from the cross‐sectional finding that e‐cigarette use is associated with having had an MI and it is possible that first MIs occurred before e‐cigarette use. PATH Wave 1 was conducted in 2013 to 2014, only a few years after e‐cigarettes started gaining popularity on the US market around 2007. To address this problem we used the PATH questions “How old were you when you were first told you had a heart attack (also called a myocardial infarction) or needed bypass surgery?” and the age when respondents started using e‐cigarettes and cigarettes (1) for the very first time, (2) fairly regularly, and (3) every day. We used current age and age of first MI to select only those people who had their first MIs at or after 2007 (Table S6). While the point estimates for the e‐cigarette effects (as well as other variables) remained about the same as for the entire sample, these estimates were no longer statistically significant because of a small number of MIs among e‐cigarette users after 2007. . . .

And here’s the relevant table (from an earlier version of Bhatta and Glantz, sent to me by Rodu):

699 patients with MI’s, of whom 38 were vaping.

Table 1 of the paper shows the descriptive statistics at Wave 1 baseline; 643 (2.4%) adults reported that they had a myocardial infarction. Out of those 643 people, a weighted 10.2% were former e-cigarette users, 1.6% some-day e-cigarette users, and 1.5% every-day e-cigarette users. 1.6% + 1.5% = 3.1%, and 3.1% * 643 is about 20, not 34 or 38. It seems that the discrepancy here arises from comparing weighted proportions with raw numbers, an issue that often arises with survey data and does not necessarily imply any problems with the published analysis.
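To see how a weighted proportion and a raw count can disagree like this, here is a minimal sketch in Python. The survey weights are invented for illustration (they are not the PATH weights), and the assumption that current e-cigarette users carry smaller weights than other respondents is mine, chosen only to reproduce the direction of the discrepancy.

```python
# Hypothetical respondents with an MI: (is_current_ecig_user, survey_weight).
# These values are made up for illustration; they are not PATH data.
respondents = (
    [(True, 0.5)] * 38       # 38 current e-cigarette users, assumed to have small weights
    + [(False, 1.0)] * 605   # 605 other respondents, assumed to have larger weights
)

# Unweighted (raw) count and share of current users in the sample.
raw_count = sum(1 for is_user, _ in respondents if is_user)
raw_share = raw_count / len(respondents)

# Weighted share: each respondent counts in proportion to the survey weight.
total_weight = sum(weight for _, weight in respondents)
weighted_share = sum(weight for is_user, weight in respondents if is_user) / total_weight

# Multiplying a weighted share by the raw sample size does not recover the raw count.
implied_count = weighted_share * len(respondents)

print(f"raw count: {raw_count}, raw share: {raw_share:.3f}")
print(f"weighted share: {weighted_share:.3f}, implied count: {implied_count:.1f}")
```

With these made-up weights, the raw count stays at 38 while the weighted share times the sample size comes out much smaller, which is the same kind of mismatch as 20 versus 38.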

But Rodu’s criticism seems more serious. Bhatta and Glantz are making causal claims based on correlation between heart problems and e-cigarette use, so it does seem like it would be appropriate for them to exclude from their analysis the people who didn’t start e-cigarette use until after their heart attacks. Even had they done this, I could see concerns with any results—the confounding with cigarette smoking is the 800-pound gorilla in the room, and any attempt to adjust for this confounding will necessarily depend strongly on the model being used for this adjustment—but removing those 11 people from the analysis, that seems like a freebie.
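The exclusion itself is easy to state in code. Here is a rough sketch in Python, not Rodu's R script: the field names and the four example records are hypothetical, and treating equal ages as "not before" is my assumption, since ages reported in whole years cannot order events within the same year.

```python
from typing import NamedTuple, Optional


class Respondent(NamedTuple):
    # Hypothetical field names, not the PATH variable names.
    age_first_mi: Optional[int]     # age when first told of a heart attack
    age_first_ecig: Optional[int]   # age at first e-cigarette use


def classify(r: Respondent) -> str:
    """Label a respondent by whether the first MI preceded first e-cigarette use."""
    if r.age_first_mi is None or r.age_first_ecig is None:
        return "missing"
    if r.age_first_mi < r.age_first_ecig:
        return "mi_before_ecig"
    # Ties are kept here; that is an assumption, not a rule from the paper.
    return "mi_not_before_ecig"


# Made-up records for illustration only.
sample = [
    Respondent(50, 58),    # MI years before vaping: would be excluded
    Respondent(61, 60),    # vaping came first: kept
    Respondent(None, 45),  # missing age at first MI: cannot be classified
    Respondent(47, 47),    # same age: order unknown, kept under the tie assumption
]

counts: dict[str, int] = {}
for r in sample:
    label = classify(r)
    counts[label] = counts.get(label, 0) + 1

# The analysis sample keeps only respondents whose MI did not precede vaping.
keep = [r for r in sample if classify(r) == "mi_not_before_ecig"]
print(counts, len(keep))
```

Reporting the "missing" group separately matters here, because that is exactly the gap between the 34 people in the table and the 38 in the paper.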

Is it appropriate for Rodu to describe Bhatta and Glantz’s article as “bogus”? That seems a bit strong. It seems like a real article with a data issue that Rodu found, and the solution would seem to be to perform a corrected analysis removing the data from the people who had heart problems before they started vaping. This won’t make the resulting findings bulletproof but it will at least fix this one problem, and that’s something. One step at a time, right?

Episode 1

Rodu has had earlier clashes with this research group.

Last year, he sent me the following email:

An article recently published in the journal Pediatrics claimed that teen experimental smokers who were e-cigarette triers or past-30-day users at baseline were more likely to be regular smokers one year later than experimental smokers who hadn’t used e-cigs. The authors used regression analysis of a publicly available longitudinal FDA survey dataset (baseline ~2013, follow-up survey one year later). Although the authors used lifetime cigarette consumption to restrict their study to experimental smokers at baseline (LCC ranging from one puff but never a whole cigarette to 99 cigarettes), they ignored this baseline variable as a confounder in their analysis. When I reproduced their analysis and added the LCC variable, the positive results for e-cigarettes essentially disappeared, negating the authors’ core claim.

I [Rodu] called in my blog (here and here) for retraction of this study because the analysis was fatally flawed, and I published a comment on the journal’s website (here). The authors dismissed my criticism, responding with the strange explanation that LCC at baseline is a mediator rather than a confounder. The journal editors apparently believe that the authors’ response is adequate; I believe it is nonsensical.

I believe that this study uses faulty statistics to make unfounded causal claims that will be used to justify public health policies and regulatory actions.

Rodu added:

In my second blog post (here), I stated that “Chaffee et al. called our addition of the LCC information a ‘statistical trick.’” They used that term in a response appearing on the Pediatrics website from March 30 to April 23 (here, courtesy of Wayback Machine). Yesterday a completely new response appeared with the same March 30 date; “statistical trick” disappeared and “mediator” appeared (here).

I agree with Rodu that in this study you should be adjusting for lifetime cigarette consumption at baseline. How exactly to perform this adjustment is a statistical and substantive question, but I’m inclined to agree that not performing the adjustment is a mistake. So, yeah, this seems like a problem. Also, a pre-treatment exposure variable is not a mediator, and “statistical tricks” are OK by me!
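To illustrate why the adjustment matters, here is a toy sketch in Python with invented counts (not the survey data). I assume two strata of baseline lifetime cigarette consumption, with e-cigarette users concentrated in the high-consumption stratum and, within each stratum, the same rate of progression to regular smoking for users and non-users. The stratum labels and all numbers are hypothetical.

```python
# Hypothetical counts, invented for illustration: (n, number progressing to regular smoking).
strata = {
    "low_lcc":  {"ecig": (100, 10),  "no_ecig": (900, 90)},
    "high_lcc": {"ecig": (400, 160), "no_ecig": (100, 40)},
}


def risk(n: int, events: int) -> float:
    """Proportion of a group that progressed."""
    return events / n


def crude_risk_ratio(strata: dict) -> float:
    """Risk ratio for e-cig users vs non-users, ignoring baseline consumption."""
    totals = {"ecig": [0, 0], "no_ecig": [0, 0]}
    for groups in strata.values():
        for group, (n, events) in groups.items():
            totals[group][0] += n
            totals[group][1] += events
    return risk(*totals["ecig"]) / risk(*totals["no_ecig"])


def stratum_risk_ratios(strata: dict) -> dict:
    """Risk ratio within each baseline-consumption stratum."""
    return {
        name: risk(*groups["ecig"]) / risk(*groups["no_ecig"])
        for name, groups in strata.items()
    }


crude = crude_risk_ratio(strata)
within = stratum_risk_ratios(strata)
print(f"crude risk ratio: {crude:.2f}")
print({name: round(rr, 2) for name, rr in within.items()})
```

In this constructed example the crude comparison shows a large association while the within-stratum comparisons show none. That says nothing about what the real data show; it only demonstrates that an apparent effect can vanish once a baseline variable is taken into account, which is the pattern Rodu reports.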

I was curious enough about this to want to dig in more—if nothing else, this seemed like a great example of measurement error in regression and the perils of partial adjustment for a confounder. It can be good to work on a live example where there is active controversy, rather than reanalyzing the Electric Company example and the LaLonde data one more time.

So I asked Rodu for the data, and shared it with some colleagues. Unfortunately we got tangled in the details—this often happens with real survey data! We contacted the authors of the paper in question to clear up some questions, and they, like Rodu, were very helpful. Everyone involved was direct and open. However, the data were still a mess and eventually we gave up trying to figure out exactly what was happening. As far as I’m concerned, this is still an open problem, and a student with some persistence should be able to get this all to work.

So, for now, I’d say that Rodu’s statistical point is valid and that the authors should redo the analysis as he suggests. Or maybe some third party can do so, if they’re willing to put in the effort.

Where there’s smoking, there’s fire

Tobacco research is a mess, and it’s been a mess forever.

On one side, you have industry-funded work. Notoriously, in past decades the cigarette industry was not just sponsoring biased studies (forking paths, file drawers, etc.); they were actively spreading disinformation, purposely polluting scientific and public discourse with the goal of delaying or reducing the impact of public awareness of the dangers of smoking, and delaying or reducing the impact of public regulation of cigarettes and smoking.

On the other side, the malign effects of smoking, and the addictive nature of nicotine, have been known for so long that anti-smoking studies are sometimes not subject to strict scrutiny. Anti-smoking researchers are the good guys, right?

There’s still a lot of debate about second-hand smoke, and I don’t really know what to think. Being trapped in a car with two heavy smokers is one thing; working in a large office space where one or two people are smoking is something much less.

There are similar controversies regarding studies of social behavior. When, a couple decades ago, cities started banning smoking in restaurants, bars, and other indoor places, there were lots of people who were saying this was a bad idea, Prohibition Doesn’t Work, etc.—but it seems that these indoor smoking bans worked fine. Lots of smokers wanted to quit and didn’t mind the inconvenience.

So, moving to these recent disputes: both sides are starting with strong positions and potential conflicts of interest. But these data questions are specific enough that they should be resolvable.

How to resolve scientific disputes?

But this gets us to the other problem with science, which is that it does not have clear mechanisms for dispute resolution. As we’ve discussed many times in this space, retraction is not scalable, twitter fights are a disaster, we can’t rely on funding agencies to save us—certainly not in this example!

I get lots of emails from people who see me as a sort of court of last resort, a trusted third party who will look at the evidence and report my conclusions without fear or favor, and that’s fine—but I’m just one person, and I make mistakes too!

One could imagine some sort of loose confederation of vetters—various people like me who’d look at the evidence in individual disputes. But is that scalable? And if it became more formal, I’d be concerned that it would be subject to the same distortions regarding the power structure. Can you imagine: a dispute-resolution committee in social psychology, under the supervision of Robert Sternberg, Susan Fiske, and the editorial board of Perspectives in Psychological Science? Fox in the goddamn chicken coop.

It may be that, right now, Pubpeer is the best thing going, and maybe it can be souped up in some way to be even more useful. I have some concern that Pubpeer can be gamed in the same way as Amazon reviews—but even a gamed Pubpeer could be better than nothing.

26 thoughts on “Controversies in vaping statistics, leading to a general discussion of dispute resolution in science”

  1. Regarding the last section and scientific disputes – shouldn’t everything be left open to dispute? Is resolution actually necessary? This seems like a good checks and balances, as far as science goes.

    Perhaps you are speaking more in terms of policy, since often these science matters have policy implications? In any case, I would fear a powerful “dispute resolution committee” composed of ‘scientific experts’, deemed virtually all-knowing in the public eye due to “science”.

    • Jd:

      Yes, everything is open to dispute at some level, but we should be able to come to tentative conclusions. For example, after this investigation and all the direct evidence we’ve seen, I’m not inclined to take “power pose” seriously. And, for statistical reasons, it seems pretty unambiguous that those beauty-and-sex-ratio claims are nothing but noise. (The underlying theories could be correct, or incorrect, or partially correct, on a qualitative level, but the articles on the topic offer essentially zero empirical evidence regarding these theories.) On the other hand, I have no idea what to conclude from this bird fight.

      • I agree that some sort of consensus or tentative conclusions (or even stronger than tentative) should (and does) occur among experts in a field. This is how science moves forward, right?

        However, I would argue that reaching this via some sort of formal dispute resolution committee would be the downfall of any dissent or reasonable scientific discourse whatsoever. Even if it started with a nice panel, it wouldn’t last for too long before becoming politically or personally motivated, with that kind of power attached. I think it would always end with foxes in the chicken house.

        At any rate, what would be the end product? That you could take something like power pose to the committee and make it go away?

        • Jd:

          I agree that there could not be such a formal committee—and, if there were such a committee, it should probably be disbanded immediately! See second-to-last paragraph of my above post. My point in that discussion is that informal dispute resolution will be happening whether we want it or not, so it could be worth thinking about.

        • Yes, I am responding to that paragraph.
          First, I wanted to throw my blog commenter opinion out there that anything formal is a bad idea. I agree with you on this.

          Second, I am wondering if “dispute resolution” in the sense that I think you are describing is necessary. As a “trusted third party” aren’t you simply a peer review of sorts? It’s like taking a manuscript to a second peer review after it is published. Why would we need anything that acts as “dispute resolution”? Such language has a certain finality to it.

          Perhaps I am misunderstanding, but when you use words like “court of last resort” and “dispute resolution” (even if “informal”) it has sort of a ring of finality to it. And that finality bit bothers me.

        • Jd:

          It’s not that I think that dispute resolution is necessary or even that it’s a good idea. What I’m saying is that there’s enough of a demand for dispute resolution that, in some form or another, it will continue to happen.

          I don’t consider myself a “court of last resort” (or even the police); these are terms that other people use, sometimes in a hopeful way (someone appealing to me because they feel this is their only remaining hope for justice) and sometimes disparagingly (calling me or others “replication police” or “Stasi” or whatever, even though we do not ourselves claim this role in any way).

          I think there’s a demand for dispute resolution, and I think dispute resolution is going to happen. With that in mind, I think that those of us who are called upon to resolve disputes should make our uncertainties clear at all stages.

  2. “However, the data were still a mess and eventually we gave up trying to figure out exactly what was happening.”

    What do you mean by “a mess”? Poorly organized, so no one can really tell what’s there? Or inherently complicated?

    Whichever in this case, and though things are steadily changing for the better, we need to keep pushing the open data / quality data meme so that we’re totally rearranging our thinking about research. People are stuck on the idea that the “outcome” of research is some conclusion. That’s wrong: conclusions are ephemeral.

    The highest priority “outcome” of any research should be a well documented data set that can be analyzed again by others if occasion demands. IMO that would go a long way to preventing problems in the first place.

        • Daniel:

          It was the usual issues: data not in Ascii format, issues of which variables and which data points were excluded from different analyses, variables constructed by combining other variables, different survey questions measuring the same thing or similar things, dropout and other missing data, survey weights, maybe some other issues that I don’t recall right now.

      • :) There’s publication pressure for sure and agenda driven people too, but I think it’s also a natural tendency for honest people to want to get to the “tell your story” part. That’s the fun part for curious, interested and intelligent people, right? It is for me anyway.

        I watched an interview with Red Hot Chili Peppers last night. The interviewer asked how they keep interested after all these years, and Anthony said it’s because when you’ve created something at the end of the day that’s just super cool. I think researchers want to do that too. I think that’s the real motivation for most people, creating something cool and sharing it. So I guess one way to drive better data is to get people to think of the data set as the immutable fundamental aspect of their creation – the blocks that make the pyramid that protects the stunning gold artifacts.

  3. >>some sort of loose confederation of vetters—various people like me who’d look at the evidence in individual disputes. <<

    Isn't that the role played by journal editors and reviewers at least once upon a time? Or basically the confederation already exists but we've got to change their current work profile.

    • Anon:

      But journal editors very rarely take on the job of adjudicating controversies. Almost always they just pass judgment on individual papers, and, even then, they’re typically not in the job of assessing the correctness of the claims. Also the whole system of peer review is more about enforcing conformity than anything else. (Yes, journals are aware of this issue, hence they’ll publish papers such as Bem’s 2011 ESP paper specifically because they don’t want to be censoring bold, speculative ideas—but this is just another form of conformity, in that they will publish bold, speculative ideas of a certain sort.)

      In any case, I would not want scientific journals to take on the job of adjudicating disputes. Can you imagine Psychological Science or PNAS issue a ruling on the correctness of the ovulation-and-voting or himmicanes studies? It would all get very political very fast.

  4. There are similar controversies regarding studies of social behavior. When, a couple decades ago, cities started banning smoking in restaurants, bars, and other indoor places, there were lots of people who were saying this was a bad idea, Prohibition Doesn’t Work, etc.—but it seems that these indoor smoking bans worked fine. Lots of smokers wanted to quit and didn’t mind the inconvenience.

    This is the kind of thing that really, really bugs me. Indoor smoking bans worked fine for some segment of the population. Prohibition always does that. But people lost their life’s work, businesses, and livelihoods. There was a general loss of personal liberty and smokers have been shut out into the cold and demonized generally. It started the slippery slope that led to smoking bans in parks and other public places, that leads to the homeless and poor being punished. Now we have vaping banned in the bars and restaurants, just because smoking is and vaping looks the same. The entire anti-science charade behind SHS, and the allowable falsification of risks and dangers “for the betterment of society” is bringing us an elitist cabal of progressive scientists who believe that anything they are allowed to do is right and moral. In the end, they’ve latched onto vaping and are attacking it because their whole house of cards would come crumbling down and their reputations would be permanently ruined if any of this were to come to light in this generation. They prefer the tens to hundreds of millions of deaths from cigs to allowing this new vice to flourish. My god, they are actually seriously proposing limiting what flavors people may enjoy! Orwell would have blushed at that as a bridge too far in credibility if he had put it in 1984.

    So don’t tell me or lie to yourself that not being allowed to smoke in bars and restaurants has people happy, it only has the people who are left going to bars and restaurants happy – and the bars and restaurants that survived the purge are happy for the increased business.

    • PNB:

      Is it really true that, as a result of smoking bans in restaurants, bars, and other indoor places, that “people lost their life’s work, businesses, and livelihoods”? Was there really a “purge” of bars and restaurants? I know that, before the bans, there were people saying this would happen, that the smoking bans would be bad for business. But I didn’t hear that these anticipated negative effects actually happened.

      • Me either, in fact my recollection was that in the medium and long term in CA, attendance at bars and restaurants rose, but I’ll admit to it just being a vague recollection.

        The idea that liberty was reduced because people were no longer allowed to emit smoke into the environment willy-nilly is stupid. Whether or not it was fatal or whatever was irrelevant; emitting smoke into a room is pollution straight up. Smokers were given privileged preference to be able to pollute everyone’s environment because the tobacco industry paid a lot of money to normalize that scourge. People should be allowed to pollute their private dwellings perhaps, but allowing it in restaurants, parks, hospitals, and airplanes is just insane. Imagine if you could go into a hospital with heavy incense burning pots and swing them all over the place… no different.

        I will admit to hating these anti-vapers, who seem to want to demonize a thing that absolutely must be the best harm reduction strategy we have for smokers. I’m fine with limiting the sales of nicotine containing products to people over 18 though.

        • Yes, it’s true. Notice the term “attendance rose.” This is one of those clues that show you exactly how you’re being manipulated. In the long term, attendance will always rise as the population grows. Attendance may have risen, but what about the places the “poorer people go” – you don’t see those, but those are vibrant, verdant public spaces. They went away and the well funded and higher-end spots survived. The bars closed and the restaurants took the former business dollars. And everyone always says “the number of bars and restaurants stayed the same.” But whenever you combine things it gives rise to inaccuracies. The number of bars goes down, the number of restaurants rises. Look at the traditional English pub. It’s holding on by a thin thread, much of the death of it due to anti-smoking actions.

          The business moved from the lowly and low-brow to the well-financed and rich. There are very few here who understand the lower classes and their struggle.

          Ask yourself this one single thing. If it was truly just a health issue, why wouldn’t they simply make a requirement for air quality? Something that could be measured like any other workplace? If it was simply to combat SHS issues, instead of a malicious attempt to punish smokers and businesses that catered to them, why wouldn’t they simply treat it like any other workplace and test the air quality? Those businesses that wanted to bring in smokers could have spent the money on air filtration and whatnot and either met the standard or not.

        • If you have some data on this I’d be happy to look at it. Here in CA the cost of everything has been skyrocketing for decades. The struggle of poor people is real for sure. I don’t think that the problem of not being able to go drink $12 beers and eat $17 nachos and smoke cigarettes at local English pubs rates in even the top 100 problems for Spanish speaking construction day laborers or any of the residents of Watts. The local English pubs near me do fine as long as they offer personal sized arugula pizzas and fish and chips with a side of pesto aioli. The one native Brit I know had a good time there and seemed amused when the owner’s dog relieved himself in the corner of the room.

          Cigarettes are an addictive drug with straightforward air pollution externalities. Some places *have* set up negative pressure isolated smoking rooms. I’ve also seen hookah shops and things. The fact of the matter is rents are so high you can’t offer an inexpensive “working man’s pub” in places like LA or the Bay Area. It’s the rent, not the no-smoking policy.

    • The impetus for second-hand smoke was political: it offered a non-paternalist reason to restrict smoking. Advocates of smoking bans could claim that they were not trying to discourage smokers from harming themselves (which was considered extreme, at the time), but rather preventing smokers from harming others: people who didn’t wish to be exposed to smoke. This is a reasonable argument, although one can disagree about what constitutes actionable harm.

      Ultimately, the issue was won partly on the basis of the science, which seemed to suggest that non-smokers were exposed to unacceptably high risk by visiting and working at businesses that allowed smoking.

      It now appears that the risk of occasional SHS exposure was exaggerated. But since smoking bans are already a fait accompli in much of the world, there are few incentives for researchers to seriously study the issue anymore. (Indeed, rather than carefully review SHS, some are instead pursuing even less plausible risks, like “third-hand” smoke.)

      While the motivations of campaigners may’ve been good, and reduced smoking is obviously a win for health, the movement’s politics have left a legacy of illiberal public policy – a set of precedents for steamrolling over individual preferences and implementing the wishes of the elite. They’ve left a tobacco-control field where research is often conducted merely to give plausible “sciencey” cover to some pre-selected legal or regulatory measure.

      That dynamic is spreading, as exemplified by the campaigns against vaping, energy drinks, snack foods, soft drinks, etc. In all these cases, support for officials’ preferred policy moves is manufactured by weak, politically-motivated research, boasting conclusions that claim or suggest far more than their study methods can support. The shoddy research is regurgitated by reporters, most of whom have no scientific background and cannot judge the strength of the work. This process continues until eventually the “facts” become instilled by sheer repetition, paving the way for the policy change. For example, after years of misinformation, the US public now widely believes that vaping is as risky as smoking. Regulators are, accordingly, moving to crack down.

      Science that is more concerned with getting a particular answer than getting at the truth is junk, and to counter it, you need critics who are motivated to find flaws. But who will play that role? Industry?

      Nobody wants a return to the days of the tobacco industry lying to consumers. But on the other hand, there are few people within the field willing to stick their necks out and criticize bad science. Even egregious errors are rarely called out when they are believed to serve a “good” cause.

      Perhaps open science is the answer to this. Peer review could be effectively crowd-sourced. It will take a huge culture shift, however, to seriously tolerate criticism that comes from outside the academy.


  5. There are two mechanisms that already address scientific disputes.

    1. Peer review.
    2. History.

    In the first case, front-loaded discrimination helps keep research on the straight and narrow. Okay, it isn’t perfect, reviewers being human. That’s why science is iterative, leading to the second case, the weight of history.

    Twitter fights aren’t a problem to be reduced. They’re great! They reveal someone is paying attention.

    It doesn’t matter if retraction isn’t scalable. It isn’t about controlling someone else to make them say the thing you want. What do you do if you don’t like how a test turned out? Test again! Reproduce the research, fix it, make it better, get the refinement peer-reviewed and published.

    Just like legislation, it isn’t supposed to be easy. With laws, you want it to be hard to make them and to change them. All the contention in Congress and across its houses isn’t bad. You want to slow things down and compel rigorous scrutiny. Dispute is good.

    Similarly, scientific disputes are valuable, even precious. The dialectic is precisely why science works. The way to resolve disputes is with science itself.

    Jai Jeffryes
