In an otherwise unrelated thread on Brutus vs. Mo Willems, an anonymous commenter wrote:
Researchers found that the risk of autism in twins depended on the month they were born in, with January being 80% riskier than December.
The link is from a 2005 article in the fun magazine New Scientist, “Autism: Lots of clues, but still no answers,” which begins:
The risk of autism in twins appears to be related to the month they are born in. The chance of both babies having the disorder is 80 per cent higher for January births than December births.
This was one of the many findings presented at the conference in Boston last week. It typifies the problems with many autism studies: the numbers are too small to be definitive – this one was based on just 161 multiple-birth babies – and even if the finding does stand up, it raises many more questions than it answers.
The article has an excellently skeptical title and lead-off, so I was curious what’s up with the author, Celeste Biever. A quick search shows that she’s currently Chief News and Features editor at Nature, so still in the science writing biz. That’s good!
The above link doesn’t give the full article but I was able to read the whole thing through the Columbia University library. The relevant part is that one of the authors of the birth-month study was Craig Newschaffer of the Johns Hopkins School of Public Health. I searched for *Craig Newschaffer autism birth month* on Google Scholar and found an article, “Variation in season of birth in singleton and multiple births concordant for autism spectrum disorders,” by L. C. Lee, C. J. Newschaffer, et al., published in 2008 in Paediatric and Perinatal Epidemiology.
I suppose that, between predatory journals and auto-writing tools such as Galactica, the scientific literature will be a complete mess in a few years, but for now we can still find papers from 2008 and be assured that they’re the real thing.
The searchable online version only gave the abstract and references, but again I could find the full article through the Columbia library. And I can report to you that the claim that the “chance of both babies having the disorder is 80 per cent higher for January births than December births,” is not supported by the data.
Let’s take a look. From the abstract:
This study aimed to determine whether the birth date distribution for individuals with autism spectrum disorders (ASD), including singletons and multiple births, differed from the general population. Two ASD case groups were studied: 907 singletons and 161 multiple births concordant for ASD.
161 multiple births . . . that’s about 13 per month, which sounds basically impossible as a basis for any real evidence of a December-versus-January difference. But let’s see what the data say.
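First, though, a quick calibration of my own (a simulation, nothing to do with the paper’s data): scatter 161 births uniformly across the twelve months, many times over, and see how often pure chance produces an “80 per cent higher” January.

```python
# Quick calibration of the noise level (my simulation, not the paper's data):
# 161 births scattered uniformly across 12 months, many times over.
import numpy as np

rng = np.random.default_rng(0)
n_births, n_sims = 161, 100_000

months = rng.integers(0, 12, size=(n_sims, n_births))   # 0 = January, ..., 11 = December
counts = np.stack([(months == m).sum(axis=1) for m in range(12)], axis=1)
jan, dec = counts[:, 0], counts[:, 11]

# How often does pure chance make January at least 80% higher than December?
print("P(Jan >= 1.8 * Dec):", (jan >= 1.8 * dec).mean())

# And how often does *some* pair of months differ by at least 80%?
ratio = counts.max(axis=1) / np.maximum(counts.min(axis=1), 1)
print("P(some month >= 1.8 * some other month):", (ratio >= 1.8).mean())
```

At this sample size, an 80 per cent January-versus-December gap shows up by chance a nontrivial fraction of the time, and some pair of months differing by that much is close to a sure thing.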
From the article:
Although a pattern of birth seasonality in autism was first reported in the early 1980s, the findings have been inconsistent. The first study to examine autism births by month was conducted by Bartlik more than two decades ago. That study compared the birth month of 810 children diagnosed with autism with general births and reported that autism births were higher than expected in March and August; the effect was more pronounced in more severe cases. A later report analysed data from the Israeli national autism registry which had information on 188 individuals diagnosed with autistic disorder. It, too, demonstrated excess births in March and August. Some studies, however, found excess autism births in March only.
March and August, huh? Sounds like noise mining to me.
Anyway, that’s just the literature. Now on to the data. First they show cases by day:

Ok, that was silly, no real reason to have displayed it at all. Then they have graphs by month. They use some sort of smoothing technique called empirical mode decomposition, whatever. Anyway, here’s what they’ve got, first for autistic singleton births and then for autistic twins:


Looks completely random to me. The article states:
In contrast to the trend of the singleton controls, which were relatively flat throughout the year, increases in the spring (April), the summer (late July) and the autumn (October) were found in the singleton ASD births (Fig. 2). Trends were also observed in the ASD concordant multiple births with peaks in the spring (March), early summer (June) and autumn (October). These trends were not seen in the multiple birth controls. Both ASD case distributions in Figs. 2 and 3 indicated a ‘valley’ during December and January. Results of the non-parametric time-series analyses suggested there were multiple peaks and troughs whose borders were not clearly bound by month.
C’mon. Are you kidding me??? Then this:
Caution should be used in interpreting the trend for multiple concordant births in these analyses because of the sparse available data.
Ya think?
Why don’t they cut out the middleman and just write up a bunch of die rolls?
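In that spirit, here’s a toy example of my own: generate one fake year of monthly counts that are pure noise and watch how naturally they read as “peaks” and “valleys.”

```python
# Pure-noise "monthly counts" (my toy example, nothing to do with the actual
# data): about 161 births spread over a year with no seasonal effect at all.
import numpy as np

rng = np.random.default_rng(3)
counts = rng.poisson(161 / 12, size=12)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
for m, c in zip(months, counts):
    print(f"{m}  {'*' * c}  ({c})")
# Every run produces a few months that look like "peaks" and a couple that
# look like a "valley", which is the same kind of story the quoted passage tells.
```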
Then this:
Figures 4 and 5 present relative risk estimates from Poisson regression after adjusting for cohort effects. Relative risk for multiple ASD concordant males was 87% less in December than in January with 95% CIs from 2% to 100%. In addition, excess ASD concordant multiple male births were indicated in March, May and September, although they were borderline for statistical significance.
Here are the actual graphs:

No shocker that if you look at 48 different comparisons, you’ll find something somewhere that’s statistically significant at the 5% level and a couple more items that are “borderline for statistical significance.”
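To put a rough number on that, here’s a back-of-the-envelope calculation and a small simulation of my own (treating the 48 comparisons as independent pure-noise tests, which is a simplification):

```python
# Rough multiple-comparisons arithmetic (my own back-of-the-envelope, treating
# the comparisons as independent pure-noise tests).
import numpy as np

n_comparisons = 48      # the 48 comparisons mentioned above
alpha = 0.05

# If nothing is going on and the tests were independent, the chance of at
# least one nominally "significant" result is:
print("analytic:", 1 - (1 - alpha) ** n_comparisons)    # about 0.91

# Same thing by simulation: 48 pure-noise z-statistics per simulated study.
rng = np.random.default_rng(1)
z = rng.standard_normal((100_000, n_comparisons))
print("simulated:", (np.abs(z) > 1.96).any(axis=1).mean())
```

Under that simplification, you’d expect at least one nominally significant month more than 90 per cent of the time even with no real effect anywhere.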
This is one of those studies that (a) shows nothing, and (b) never had a chance. Unfortunately, statistics education and practice are focused on data analysis and statistical significance, not so much on design. This is just a ridiculously extreme case of noise mining.
In addition, I came across an article, “The Epidemiology of Autism Spectrum Disorders,” by Newschaffer et al., published in the Annual Review of Public Health in 2007, that doesn’t mention birth month at all. So, somewhere between 2005 and 2007, it seems that Newschaffer decided that whatever birth-month effects were out there, they weren’t important enough to include in a 20-page review article. Then a year later they published a paper with all sorts of bold claims. That doesn’t make a lot of sense to me.
Shooting a rabbit with a cannon?
Ok, this is getting ridiculous, you might say. Here we are picking apart an obscure paper from 15 years ago, an article we only heard about because it was indirectly referred to in a news article from 2005 that someone mentioned in a blog comment.
Is this the scientific equivalent of searching for offensive quotes on Twitter and then getting offended? Am I just being mean by going through the flaws of this paper from the archives?
I don’t think so. I think there’s value to this post, for two reasons.
1. Autism is important! There’s a reason why the government funds a lot of research on the topic. From the above-linked paper:
The authors gratefully acknowledge the following people and institutions for their resources and support on this manuscript:
1 The Autism Genetic Resource Exchange (AGRE) Consortium. AGRE is a programme of Cure Autism Now and is supported, in part, by Grant MH64547 from the National Institute of Mental Health to Daniel H. Geschwind.
2 Robert Hayman, PhD and Isabelle Horon, DrPH at the Maryland Department of Health and Mental Hygiene Vital Statistics Administration for making Maryland State aggregated birth data available for this analysis.
3 Rebecca A. Harrington, MPH, for editorial and graphic support.
Drs Lee and Newschaffer were supported by Centers for Disease Control and Prevention cooperative agreement U10/CCU320408-05, and Dr. Zimmerman and Ms. Shah were supported by Cure Autism Now and by Dr Barry and Mrs Renee Gordon. A preliminary version of this report was presented in part at the International Meeting for Autism Research, Boston, MA, May 2005.
This brings us to two points:
1a. All this tax money spent on a hopeless study of monthly variation in a tiny dataset is money that wasn’t spent on more serious research into autism or, for that matter, on direct services of some sort. Again, the problem with this study is not just that the data are indistinguishable from pure noise. The problem is that, even before starting the study, a competent analysis would’ve found that there was not enough data here to learn anything useful; a rough version of that check is sketched just after point 1b below.
1b. Setting funding aside, attention given to this sort of study (for example, in that 2005 meeting and in the New Scientist article) is attention not being given to more serious research on the topic. To the extent that we are concerned about autism, we should be concerned about this diversion of attentional resources. At best, other researchers will just ignore this sort of pure-noise study; at worst, other researchers will take it seriously and waste more resources following it up in various ways.
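On point 1a, here’s the kind of back-of-the-envelope design check I have in mind (my own numbers and approximations, not the paper’s), using the standard result that the log of a ratio of two Poisson counts has standard error roughly sqrt(1/n1 + 1/n2):

```python
# Rough pre-study precision check (my own approximation, not from the paper):
# with ~161 concordant multiple births spread over 12 months, how well can a
# month-vs-month rate ratio be estimated at all?
import math

per_month = 161 / 12                                  # about 13.4 expected births per month
se_log_rr = math.sqrt(1 / per_month + 1 / per_month)  # SE of log rate ratio, two-month comparison

rr = 1.8   # the "80 per cent higher" January-vs-December figure
lo = rr * math.exp(-1.96 * se_log_rr)
hi = rr * math.exp(+1.96 * se_log_rr)
print(f"SE of log rate ratio: {se_log_rr:.2f}")
print(f"95% CI around an observed rate ratio of 1.8: ({lo:.2f}, {hi:.2f})")
# The interval runs from below 1 to nearly 4: even an apparent 80% difference
# is consistent with no effect at all, before any multiple-comparisons
# adjustment. That is knowable before collecting a single case.
```

Any plausible seasonal effect is far smaller than what a design like this could ever distinguish from noise, and that could have been worked out on the back of an envelope before the study began.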
Now, let me clarify that I’m not saying the authors who did this paper are bad people or that they were intending to waste government money and researchers’ attention. I can only assume they were 100% sincere and just working in a noise-mining statistical paradigm. This was 2005, remember, before “p-hacking,” “researcher degrees of freedom,” and “garden of forking paths” became commonly understood concepts in the scientific community. They didn’t know any better! They were just doing what they were trained to do: gather data, make comparisons, highlight “statistical significance” and “borderline statistical significance,” and tell stories. That’s what quantitative research was!
And that brings us to our final point:
2. That noise-mining paradigm is still what a lot of science and social science looks like. See here, for example. We’re talking about sincere, well-meaning researchers, plugged into the scientific literature and, unfortunately, pulling patterns out of what is essentially pure noise. Some of this work gets published in top journals, some of it gets adoring press treatment, some of it wins academic awards. We’re still there!
For that reason, I think there’s value in taking a look at a clear case of bad research. Not everything’s a judgment call. Some analyses are clearly valueless. Another example is the series of papers by that sex-ratio researcher, all of which are a mixture of speculative theory and pure noise mining, and all of which would be stronger without the distraction of data. Again, they’d be better off just reporting some die rolls; at least then the lack of relevant information content would be clearer.
P.S. One more time: I’m not saying the authors of these papers are bad people. They were just doing what they were trained to do. It’s our job as statistics teachers to change that training; it’s also the job of the scientific community not to reward noise-mining—even inadvertent noise-mining—as a career track.