A rise in premature publications among politically engaged researchers may be linked to Trump’s election, study says

A couple people pointed me to this news story, “A rise in premature births among Latina women may be linked to Trump’s election, study says,” and the associated JAMA article, which begins:

Question Did preterm births increase among Latina women who were pregnant during the 2016 US presidential election?

Findings This population-based study used an interrupted time series design to assess 32.9 million live births and found that the number of preterm births among Latina women increased above expected levels after the election.

Meaning The 2016 presidential election may have been associated with adverse health outcomes of Latina women and their newborns.

Hmmm, the research article says “may have been associated” but then ups that to “appears to have been associated.”

On one hand, I find it admirable that JAMA will publish a paper with such an uncertain conclusion. On the other hand, the conclusions got stronger once they made their way into news reports. In the above-linked article, “may have been associated” becomes “an association was found” and then “We think there are very few alternative explanations for these results.”

There’s also a selection issue. It’s fine to report maybes, but then why this particular maybe? There are lots and lots of associations that may be happening, right?

Let’s look at the data

In any case, they did an interrupted time series analysis, so let’s see the time series:

I don’t think the paper’s claim, “In the 9-month period beginning with November 2016, an additional 1342 male (95% CI, 795-1889) and 995 female (95% CI, 554-1436) preterm births to Latina women were found above the expected number of preterm births had the election not occurred,” is at all well supported by these data. But you can make your own judgement here.

Also, I’m surprised they are analyzing raw numbers of pre-term births rather than rates.

In general

Medical journals sometimes seem to show poor judgment when it comes to stories that agree with their political prejudices. See for example here and here.

Look. Don’t get me wrong. This topic is important. We’d like to minimize preterm births, and graphs such as shown above (ideally using rates, not counts, I think) should be a key part of a monitoring system that will allow us to notice problems. It should be possible to look at such time series without pulling out one factor and wrapping this sort of story around it. I think this is a problem with scientific publication, that journals and the news media want to publish big splashy claims.


  1. Adan Becerra says:

    I just wanted to say great title! And yes, without seeing proportions/rates, it's hard to draw any conclusions.

    • Adan Becerra says:

      I haven’t actually read the paper, but a perinatal epidemiologist I trust read the methods carefully, and according to her the total number of births is taken into account in each monthly estimate of the expected counts.

      Still, there could be issues with confounding.

      • elin says:

        They have a note in the supplement that says “Regressions are adjusted for population at risk and preterm births to non-Latina women. Includes 103 months ending July 2017.”

  2. Thanatos Savehn says:

    I’m a perma-never-Trumper … but… Down here in Texas we’ve had numerous media stories about more anencephalic babies being delivered along the border; and the claim is that it’s all Trump’s fault. The better explanation is, it seems to me, that poor people, from poor countries with no routine/free access to folic acid, who walk 600 miles to deliver their babies in America, are more likely to have babies with anencephaly – Trump whether or no.

  3. Anonymous says:

    Other potential causes of the rise in premature births among Latina women?

    02 Nov. 2016: Chicago Cubs first World Series win in 108 years
    06 Nov. 2016: Indian government declares levels of air pollution in Delhi an emergency situation
    08 Nov. 2016: High temp in Albuquerque, NM reaches 64°, weather partly sunny
    09 Nov. 2016: Order of Nunavut welcomes three new members: Louie Kamookak, Ellen Hamilton and Red Pedersen
    09 Nov. 2016: Wolfmother live in Budapest!
    13 Nov. 2016: the closest, brightest, and biggest full Moon since January 26, 1948

    So I guess now it’s just down to the significant digits on the date….

  4. yyw says:

    It seems that a model trained using Obama years data doesn’t predict the last couple of years’ data very well.

    • elin says:

      Yes, as a monitoring model it’s good to point out what look like changes so that clinicians and others can be alerted, just as they should be alerted to an out-of-expected-range number of flu cases. If I were writing that up I’d let other people do the speculation on causes. I don’t think I would have analyzed the data the way they did, however.

      • Andrew says:


        I’d start with a graph of rates. I think the authors of the above-linked paper are relying way too strongly on their model. I had similar concerns about the paper discussed here, where again there was a time series that showed no clear pattern, followed by a model with all sorts of dramatic claims. Or here: not a time series in that case, but another example of a graph that showed nothing much, followed by a model that gave a result labeled as statistically significant.

        Researchers seem to have the idea that they can routinely use statistics to find significant patterns even when nothing much seems to be happening in the data. I’m skeptical.

        • Elin says:

          What I meant by my comment is that I don’t have a problem with a basically automated analysis of epidemiological data noticing a blip. But that’s a far cry from making a cause-and-effect statement. Also … what’s with the 2009 data? Given 2009, why are the predictions (and numbers) for 2010 so low? Something seems not right.

          I feel like time series models are very easily abused and subject to a lot of cookbook analysis. One thing I’d do is analyze the Hispanic and non-Hispanic rates separately instead of using the non-Hispanic numbers as part of the Hispanic model. Likewise, I’d separate it out by region or state if possible, because that would give you a sense of consistency. That is, treat it like the hierarchical data that it is. But most of all I’d want a lot more post-interruption data. I wouldn’t be at all surprised if there was more emotional distress and fear of going to public clinics for prenatal care, but translating that to actual preterm deliveries seems … premature.
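
One way to read the suggestion above: compute the preterm rate separately within each group (ethnicity × state × month) before modeling, rather than folding the non-Hispanic counts into the Hispanic model. A minimal sketch, with entirely made-up column names and toy data (not the paper's data layout):

```python
# Hypothetical sketch: per-group preterm rates from a birth-level table.
# Column names (ethnicity, state, month, preterm) and values are made up.
import pandas as pd

births = pd.DataFrame({
    "ethnicity": ["latina", "latina", "other", "other"],
    "state":     ["TX", "TX", "TX", "TX"],
    "month":     ["2016-11"] * 4,
    "preterm":   [1, 0, 0, 1],      # 1 = preterm birth
})

rates = (births
         .groupby(["ethnicity", "state", "month"])["preterm"]
         .mean()                     # preterm rate within each cell
         .rename("preterm_rate")
         .reset_index())
print(rates)
```

Each group's rate series could then be modeled separately (or hierarchically), which would show whether any post-election blip is consistent across states and groups.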

  5. Ralph Giorno says:

    Man you guys nailed it! But here’s my huge worry: What else is JAMA publishing for political reasons that might have deleterious effects on patients because it is published in such a ‘prestigious’ journal? I fear the rate of politically motivated publication in this and other high-flying medical journals might be very high.

  6. Fr. says:

    Might be of related interest to some here:

    > US infant mortality and the President’s party
    (International Journal of Epidemiology, 2014)

    With a thoughtful editorial on the same topic in the same issue.

  7. elin says:

    I was writing a comment, but in looking at their pages and pages of results, this is much worse than you realize. I believe that the plots are based on the differenced data (not the absolute counts) found, for example, on pp 26-27 for the males (MOBS and MEXP). I don’t see similarly named variables for females, but the NSFHISPS and NSMHISPS minimum and maximum seem to match the graph for female observed (the ones on page 20, not elsewhere). Really … I was originally going to complain that the y-axis should be in the correct units (the numbers of premature births to Hispanics per month seem to be between 35k and 42k), but cutting off a digit does not give you the numbers on the graph.

    Meanwhile my tremendously insightful comment about including pre-ACA data with no indicator for that interruption now seems a waste of time.

    • elin says:

      This is why a simple table of descriptives is so useful.

    • Carlos Ungil says:

      > (the numbers of premature births to Hispanics per month seem to be between 35k and 42k)

      Where are you getting those figures from? There were 3,855,500 births in the US in 2017 (321k per month), 898,764 of Hispanic origin (75k per month).

      • jim says:

        Carlos Ungil says: “Where are you getting those figures from?”

        I believe elin is referring to the number of **premature** births as given in the paper at issue.

        • c says:

          My point is the number of premature births to Hispanics cannot be between 35k and 42k (unless around half of the births are premature!).

          According to the paper: “Our analyses included 16 825 845 live male and 16 034 882 live female singleton births (32 860 727 live births) from January 1, 2009, through July 30, 2017; nearly one-quarter of these births (23.5%) were to Latina women. Preterm infants represented 11.0% of male and 9.6% of female births to Latina women and 10.2% and 9.3% of those to other women.”

          32.86 million singleton births in 8 years and 7 months are 319k per month. If 23.5% are to Latina women that’s 75k per month (coincidentally the same figure I gave above for 2017). If premature births are around 10% (11.0% and 9.6%, depending on the gender) that’s around 7-8 thousand (in line with the combination of the male and female charts).
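
The back-of-the-envelope arithmetic above is easy to verify directly (figures taken from the quoted passage; the ~10% preterm share is the rough average of the 11.0% male and 9.6% female figures):

```python
# Check of the arithmetic in the comment above, using the paper's quoted
# aggregate figures.
total_births = 32_860_727        # singleton births, Jan 2009 - Jul 2017
months = 8 * 12 + 7              # 103 months
per_month = total_births / months
latina_per_month = per_month * 0.235          # 23.5% to Latina women
preterm_per_month = latina_per_month * 0.10   # roughly 10% preterm

print(round(per_month))          # ~319,000 births per month
print(round(latina_per_month))   # ~75,000 to Latina women per month
print(round(preterm_per_month))  # ~7,500 preterm, i.e. 7-8 thousand
```

So the monthly number of preterm births to Latina women has to be in the 7–8 thousand range, not 35k–42k.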

          • elin says:

            From their computer outputs.

            • Carlos Ungil says:

              You may be misunderstanding those computer outputs. You mention MOBS and MEXP, and for some reason that escapes me you say those are differences, but as far as I understand they are birth counts. MOBS corresponds to the last 9 observations of PSMHIS (pages 37-39).

              You mention NSFHISPS and NSMHISPS, which are also birth counts. The latter is equal to PSMHIS and NSFHISPS is equal to PSFHIS. These are monthly numbers of premature births to Hispanic women and seem to correspond to the points in the charts.

              Perhaps you’re looking at the term births, I don’t know.

  8. Dale Lehman says:

    I downloaded the data and did some time series modeling on the rate (not number) of preterm births, and the actual data fall within the 90% confidence interval of the forecast. This was a forecast based on the pre-election Obama years, compared against the following 14 months. I’m sure they did something more sophisticated than I did (although my modeling was not overly simple), but I just don’t think their story is in the data. When the modeling must get that complicated to uncover the headline story, I get increasingly suspicious. A few simple graphs (of the right measures) are often a good cure for these over-hyped claims.

    • Martha (Smith) says:

      “When the modeling must get that complicated to uncover the headline story, I get increasingly suspicious. A few simple graphs (of the right measures) are often a good cure for these over-hyped claims.”


    • Andrew says:

      Dale, Martha:

      Further discussion on that general issue here. That episode frustrated me because I saw people explicitly push back against the idea that looking at the graph could tell us anything. There were people who wanted to make the claim that, even though the graph looks like there’s nothing going on, we should believe the claim because it is supported by a robustness study. To me, this was a case of, “Who ya gonna believe, me or your lyin’ eyes?” But lots of people in econometrics seemed to be trained to believe the output of the computer program (if it’s statistically significant), graphical evidence be damned.

      • Martha (Smith) says:

        “But lots of people in econometrics seemed to be trained to believe the output of the computer program (if it’s statistically significant), graphical evidence be damned.”

        Not just in econometrics. I’ve seen a lot of education and psychology papers that do things like believe the computer output even when the data are so extremely skewed (and the measure has a “floor” and a range that ensure the variable must have a strongly skewed distribution, and the sample size is small) that an analysis assuming something close to normality is totally inappropriate.

    • Carlos Ungil says:

      I have the feeling that by looking at year-over-year differences (*) they capture in the constant term a linear trend, and that may explain at least in part why the model doesn’t work well when extrapolated into the future (they do indeed find a negative constant term in their regressions). Maybe if they estimated the model using the first 85 months of Obama’s presidency (instead of the 94 months they used, to analyse months 95-103), the period 86-94 would also look abnormal? Note that this is pure speculation: I don’t understand well what they do and I’ve not tried to replicate the result.

      (*) it’s unclear to me why they would do that, because it seems it may also add undesirable noise to the birth counts one month and two months later that are also included in the regression.
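
The point about the constant term absorbing a linear trend can be seen in a two-line toy example: the 12-month difference of a linear trend is a constant, so any trend in the original series ends up in the intercept of a model fit to the differenced series.

```python
# Toy illustration: year-over-year differencing turns a linear trend into
# a constant (here 12 * slope), which a regression absorbs into its
# intercept. Illustrative numbers only.
import numpy as np

t = np.arange(36)
y = 50 + 2.0 * t             # linear trend, slope 2 per month
yoy = y[12:] - y[:-12]       # 12-month (year-over-year) differences
print(yoy[:5])               # every entry equals 12 * 2 = 24
```

Extrapolating such a model then relies on the trend staying exactly linear, which could be part of why it misfits later months.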

    • elin says:

      I do not think they did anything super cutting edge … their software is about 20 years old.
