Happy birthday

Posted on December 19, 2013 5:01 PM by Andrew

The above is Aki’s decomposition of the birthdays data (the number of babies born each day in the United States, from 1968 through 1988) using a Gaussian process model, as described in more detail in our book.

19 thoughts on “Happy birthday”

D.O. on December 19, 2013 at 8:47 pm said:

It’s a bit surprising that the significant dips on Labor Day and Thanksgiving do not have bumps before or after. Probably they are eaten up by the smoothing procedure, maybe because both Labor Day and Thanksgiving fall on fixed days of the week.
- Andrew on December 20, 2013 at 1:38 am said:
  
  D.O.:
  
  Yes, as we discuss in the book, the model could be improved by replacing the daily spikes by little functions with “ringing” so that a dip on a particular day corresponds to smaller increases on the days right before and after. In the above graphs, I think that some of the daily effects have been inappropriately absorbed into the seasonal effect.
Rahul on December 20, 2013 at 12:17 am said:

One minor question: Why normalize to an arbitrary 100 scale? Wouldn’t the graph be a tad more informative if you kept actual “num. of births” units?

E.g., how many actual births happen on an average Friday?
- Andrew on December 20, 2013 at 1:42 am said:
  
  Rahul:
  
  Good point. I think it would make sense to put the top graph (trends) on an absolute scale (perhaps #births per day, as you suggest) and the others on relative scales. Also, looking at the description in the book, it appears that we fit an additive model directly on the data, but now I’m thinking it would make more sense to work on the log scale.
  - Rahul on December 20, 2013 at 1:45 am said:
    
    Or exploit the unused right hand y-axis. You could relabel that in #births?
Aki Vehtari on December 20, 2013 at 3:36 am said:

D.O.: Bumps before and after Labor Day and Thanksgiving can be seen if we plot each year separately. In that plot you could also see that the effects for Labor Day and Thanksgiving are about the same size as for Independence Day. Andrew preferred this plot, where we show the effect for each fixed day of the year, and so the effect of a holiday that falls on a fixed weekday is spread over several dates in this plot.

Rahul: The scale is not arbitrary. I first used an absolute scale, but since I was interested in comparing the sizes of the different effects, it required extra mental effort to work out whether the relative changes were big or not. I used a % scale because it looks prettier than having decimals (0.8, 0.9, 1.0, 1.1). This scale also has the benefit that when I made a similar figure for Finland, I could immediately see that the sizes of the relative effects were similar. During these years there were about 10,000 births on an average Friday. We could have the absolute scale on the right.

Andrew: The data is count data, but with such high mean counts it can be approximated very well with a Gaussian model. The log scale is not needed to ensure positivity and would transform the distribution away from Gaussian.
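
As a rough numerical check of the Gaussian approximation mentioned above, the sketch below compares a Poisson distribution with mean 10,000 (the approximate Friday count quoted in this comment) against a normal density with the same mean and variance. Treating the daily counts as Poisson is an assumption made here for illustration, the function names are hypothetical, and this is not code from the book.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of count k, computed on the log scale for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def normal_pdf(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

mu = 10_000          # approximate births on an average Friday, per the comment
sd = math.sqrt(mu)   # Poisson assumption: variance equals the mean

# Largest relative gap between the two within three standard deviations.
max_rel_err = max(
    abs(poisson_pmf(k, mu) / normal_pdf(k, mu, mu) - 1)
    for k in range(int(mu - 3 * sd), int(mu + 3 * sd) + 1)
)
skewness = 1 / math.sqrt(mu)  # Poisson skewness; a Gaussian has skewness 0

print(f"max relative error within 3 sd: {max_rel_err:.4f}")
print(f"Poisson skewness: {skewness:.3f}")
```

Under these assumptions the two curves differ by only a few percent even three standard deviations from the mean, which is consistent with the claim that a Gaussian model fits counts of this size well.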
- Andrew on December 20, 2013 at 8:31 am said:
  
  Aki:
  
  But wouldn’t the log scale help when considering the long term trend (which moves about 20% from min to max)? Put it this way: suppose there is a fixed multiplicative effect of day of year or day of week or whatever. In the additive model, this will show up as a larger effect in 1976 (when the total #births is lowest). And, indeed, if you look at the day-of-week effects, the curve for 1976 is pretty high. It’s not the highest—1988 is the highest, presumably because there were real changes during this period with more scheduled births—but it’s up there, perhaps an artifact of the additive model for what fundamentally is a multiplicative process.
  
  Regarding the Gaussian approximation, I wonder if there would be a way to do a multiplicative model by fitting an additive model on the log of the raw data and just adjusting the data variance accordingly. So the computation would be just as easy, it’s just that instead of approximating the binomial density with a Gaussian, we’d be applying the Gaussian approx to the density of the log of a binomially-distributed random variable.
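
The two ideas in this reply can be sketched numerically. The first part shows that a fixed multiplicative effect produces an additive deviation that changes with the baseline level, while its deviation on the log scale stays constant. The second part checks the first-order (delta method) approximation Var(log Y) ≈ Var(Y) / E(Y)², one possible way of "adjusting the data variance" for a Gaussian fit to the log counts. All numbers are hypothetical, the helper names are made up for this example, and a Gaussian stand-in with variance equal to the mean is assumed in place of real count data.

```python
import math
import random
import statistics

def effects_of_fixed_multiplier(level, multiplier):
    """Return the (additive, log-scale) deviation caused by one fixed factor."""
    additive = level * multiplier - level
    log_scale = math.log(level * multiplier) - math.log(level)
    return additive, log_scale

def delta_method_log_variance(mean, variance):
    """First-order (delta method) approximation to Var(log Y)."""
    return variance / mean ** 2

# Hypothetical values for illustration only; none are taken from the data.
multiplier = 0.8                    # assumed fixed multiplicative day effect
for level in (8_000.0, 10_000.0):   # two assumed baseline daily counts
    additive, log_scale = effects_of_fixed_multiplier(level, multiplier)
    print(f"level={level:.0f}: additive={additive:.0f}, log-scale={log_scale:.4f}")

# Simulation check of the variance adjustment, using a Gaussian stand-in for
# counts with Poisson-like variance (variance equal to the mean).
mean = 10_000.0
rng = random.Random(0)
draws = [rng.gauss(mean, math.sqrt(mean)) for _ in range(20_000)]
simulated = statistics.variance([math.log(y) for y in draws])
approx = delta_method_log_variance(mean, mean)  # equals 1 / mean here
print(f"simulated Var(log y)={simulated:.3e}, delta method={approx:.3e}")
```

With a mean this large, the log-scale variance is tiny and nearly constant, so an additive Gaussian model on the log counts would need only this fixed variance adjustment under the stated assumptions.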
Art on December 24, 2013 at 1:55 am said:

I find it interesting that the actual number of births is highest in late September. Does that imply that while there may be fewer mothers giving birth on Christmas Day, prospective parents are busier conceiving on (or around) Christmas Day? :)
stringph on December 25, 2013 at 1:34 pm said:

What, no error bars or bands?

.. and no line connecting Sunday with Monday — the biggest difference out of all adjacent days?
Tehpet on December 26, 2013 at 12:46 am said:

Are the numbers for leap year day births normalized for the infrequency of that day?
- Andrew on December 26, 2013 at 3:32 am said:
  
  y
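
The thread does not say how the leap-day normalization is done. One plausible reading, sketched below as an assumption rather than as the method used for the figure, is to divide the total births on each calendar date by the number of times that date occurs in the 1968 through 1988 span, so that February 29 is averaged over leap years only. The function names are hypothetical.

```python
from datetime import date

def occurrences(month, day, first_year, last_year):
    """Count the years in [first_year, last_year] in which the date exists."""
    count = 0
    for year in range(first_year, last_year + 1):
        try:
            date(year, month, day)
        except ValueError:  # e.g. February 29 in a non-leap year
            continue
        count += 1
    return count

def mean_births_per_occurrence(total_births, month, day,
                               first_year=1968, last_year=1988):
    """Average births per occurrence of a calendar date over the data span."""
    return total_births / occurrences(month, day, first_year, last_year)

print(occurrences(2, 29, 1968, 1988))  # 6 leap years in the span
print(occurrences(2, 28, 1968, 1988))  # 21 years in the span
```

Without such an adjustment, a raw total for February 29 would look roughly 6/21 as large as its neighbors simply because the date occurs less often.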
Rahul on December 26, 2013 at 4:06 am said:

Can you see any blips for 9/11 or that big North East Blackout of 2003 etc.? I wonder.