Identify the time series: a little statistics puzzle

The graph above comes from the article, Branum, A. M., Parker, J. D., and Schoendorf, K. C. (2009). Trends in US sex ratio by plurality, gestational age and race/ethnicity. Human Reproduction 24, 2936-2944.

The four dashed lines show the proportion of boy births each year, among white single births, white multiple births, black single births, and black multiple births. (Ethnicity is defined as that of the mother.) The solid curves are smoothed versions for each group.

The challenge is to identify which of the four lines is which.

This could be good for an introductory statistics course.

23 thoughts on “Identify the time series: a little statistics puzzle”

  1. I understand how to distinguish multiple/single births from the graph, but not how to distinguish black/white. I don’t see how you could without prior knowledge about boy birth rates across race… maybe that’s the point? I agree it’s a good question for beginner statisticians.

    • I think it’s simply a question of sample size, although it helps if you know that twins are quite rare (roughly 1%) and the black population is smaller than the white population (roughly 10% vs 60% IIRC) but you definitely know you see more black people than twin people (just look around your hypothetical classroom), so there can’t be more white multiples than black all-births.

      So I reason that white single births > black single births > white multiple births > black multiple births in sample size, and then I simply map that onto the ordering of how wiggly the lines are, because I know bigger = less wiggly.

      And that’s also the top-down order: the top black line looks least wiggly, so that’s the largest group (“white singles”); the second line, in gray, is a bit more wiggly (see the first and last points, and that little pyramid in the middle), so that’s “black singles”; the third line down, in black, looks more wiggly than that but less wiggly than the final line, so that’s “white multiples”; and the bottom black line looks extremely wiggly, so I infer that must be the smallest group (“black multiples”).

      (As for why the sex ratio would appear to systematically be more female-skewed for black populations compared to white populations and almost in violation of Fisher, I can’t guess from this graph but presumably the original research article goes into that topic.)

      • Yeah, that’s what I gathered about the differences between multiple/single births. I didn’t really look at the graph closely enough to notice the slight difference in variation between the top two lines before I made that comment. But good to know I was on the right track anyhow.
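The "bigger = less wiggly" argument in this thread can be put in numbers with the binomial standard error of a proportion, sqrt(p(1-p)/n). What follows is a minimal sketch only: the annual birth counts are hypothetical placeholders chosen to respect the ordering argued above, not figures from the Branum et al. article, and the 0.51 boy proportion is a round illustrative value.

```python
import math

def proportion_se(p, n):
    """Binomial standard error of an observed proportion p based on n births."""
    return math.sqrt(p * (1 - p) / n)

# HYPOTHETICAL annual birth counts, chosen only to respect the ordering
# white singles > black singles > white multiples > black multiples.
# They are not taken from the article.
hypothetical_births = {
    "white singles": 2_000_000,
    "black singles": 500_000,
    "white multiples": 60_000,
    "black multiples": 15_000,
}

p_boy = 0.51  # illustrative proportion of boy births

# Expected size of the year-to-year noise in each group's line.
ses = {group: proportion_se(p_boy, n) for group, n in hypothetical_births.items()}

for group, n in hypothetical_births.items():
    print(f"{group:16s} n={n:>9,d}  SE={ses[group]:.5f}")
```

Under these assumed counts the standard error grows as the group shrinks, which is the pattern the commenters read off the graph: the smaller the group, the wigglier its line.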

  2. I think top black dashed line is single white, then grey dashed is single black. It’s hard to tell the next two lines apart, but I think the upper one is white multiple and then black multiple. I find the sawtooth patterns in the lower two interesting and wonder what’s driving that and what the mechanism might be.

    • The sawtooth (seasonal, though not typical seasonality) is strange, but I also go with both of those being the multiple births. I’d think multiple births are fewer than single births, so I’d expect greater variability with the smaller sample size. I have no prior regarding whether the proportion of males is higher among whites or blacks. The sawtooth pattern is even more puzzling in that the two series appear to be positively correlated for the first half of the data and negatively correlated in the second half.

        • By “seasonal” I only meant a recurring pattern. But I believe the data points are every two years judging by the points where it changes. And my guess is that it is not pure noise (since the slopes seem to alternate so regularly), but I don’t know what it is.

      • So, any idea why the white proportions are higher than the black proportions? Biology, choice, or something else? And, any idea for the sawtooth pattern in multiple births?

        • Dale:

          The general idea is that girl fetuses are hardier than boy fetuses, so anything associated with a more difficult pregnancy will be associated with a relatively higher proportion of girl births. Problems with births are higher among blacks than whites, so it makes sense that the proportion of girl births is higher among blacks. Problems with births are much higher for multiple births, so it makes sense that the proportion of girls is much higher for multiple births. And it is.

        • Andrew, it should be possible to estimate the number of difficult pregnancies (lost fetuses) it would take to account for the observed differences, and whether this number is consistent with known rates of fetal loss.

        • Joe,

          Oh, there’s a whole literature on the topic. Differences between groups are very small, so it’s hard to study. The proportion of girl births is about 48.5% to 49%, and it doesn’t vary much by group. The higher rate of girl births among twins is well known and is one of the largest effects out there.

        • “The general idea is that girl fetuses are hardier than boy fetuses, so anything associated with a more difficult pregnancy will be associated with a relatively higher proportion of girl births. Problems with births are higher among blacks than whites, so it makes sense that the proportion of girl births is higher among blacks.”

          I was in the process of writing this type of response, but then deleted it, because it seems to contradict the graph, which shows male birth proportion exceeding 0.500 for single births. If male births are so problematic (and they are), why don’t we see the female birth proportion exceeding 0.500?

        • Unanon:

          I recall reading somewhere that males are something like 55% at conception and then it goes down from there; at every point after conception, survival probability is lower for the male embryo or fetus than for the female.

        • Andrew: in following up on this post, I read a different explanation

          Apparently a study showed that the sex ratio at conception is equal, as would be expected from a simple understanding of it. However, there are differences in which sex survives different periods of the pregnancy. Apparently males are most vulnerable in the 1st and 3rd trimesters, but females are more vulnerable in the 2nd, and overall the loss of pregnancies with a female fetus is greater than that of pregnancies with a male fetus.

          Oh, here’s where I read that: Wow, this is great, an NPR story about a paper in PNAS!! :)

          https://www.npr.org/sections/health-shots/2015/03/30/396384911/why-are-more-baby-boys-born-than-girls

        • Andrew:

          Interesting! Is this the final word?

          I saw that the NPR article was a little old. But I love how NPR always explains science all matter-of-factly: this study “shows” this and that study “shows” that. Science is indisputable!!! :)
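Joe's suggestion in this thread, to work out how much differential fetal loss it would take to explain a given sex ratio, can be sketched with a little odds arithmetic. If boys are a fraction p0 of conceptions and a male conception is r times as likely as a female one to reach birth, the boy proportion at birth is p0·r / (p0·r + 1 − p0). The code below is an illustration under loud assumptions: the 55% at conception is the recollection quoted in the thread (which the later comment disputes), the 51% at birth is a round number in line with the "48.5% to 49%" girl births mentioned above, and the function names are made up for this sketch.

```python
def birth_proportion(p_conception, rel_survival):
    """Boy proportion at birth, given the boy proportion at conception and
    the male-to-female ratio of survival probabilities (rel_survival)."""
    boys = p_conception * rel_survival
    girls = 1 - p_conception
    return boys / (boys + girls)

def relative_male_survival(p_conception, p_birth):
    """Male-to-female survival ratio needed to move the boy proportion
    from p_conception to p_birth (a ratio of odds)."""
    odds_conception = p_conception / (1 - p_conception)
    odds_birth = p_birth / (1 - p_birth)
    return odds_birth / odds_conception

# ASSUMED inputs, for illustration only (see the lead-in).
p0 = 0.55       # recalled boy share at conception, unverified
p_birth = 0.51  # round boy share at birth

r = relative_male_survival(p0, p_birth)
print(f"implied male/female survival ratio: {r:.3f}")
print(f"check, birth proportion recovered: {birth_proportion(p0, r):.3f}")
```

The same function can be run with an equal split at conception (p0 = 0.5), in which case a boy majority at birth requires a survival ratio above 1. So the two accounts in the thread imply opposite directions of net loss.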

  3. My analysis:
    1) The top two probably are single births, the bottom two probably are plural. There are many more single births than plural births, so the ratios for multiple births should show greater year-on-year variability.
    2) The bottom two are likely to be tied to the corresponding top two. A quick calculation shows that multiplying the chosen single male birth rates, approximately 0.514 and 0.508, by 0.98 gives roughly the bottom two eyeball averages of 0.503 and 0.498 respectively. The simplest hypothesis is that the proportional reduction is the same for both groups.
    3) Squinting at the top two series seems to show greater variance around the smoothed curve for the lower one. Since the black population is smaller than the white population, a greater variance in the year-on-year data is a reasonable guess.
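The quick calculation in point 2 is easy to reproduce. The sketch below uses only the eyeballed values quoted in that comment (0.514, 0.508, 0.503, 0.498) and the 0.98 factor; the labels are made up here, and because the inputs are read off a graph by eye, the comparison is rough at best.

```python
# Eyeballed averages quoted in the comment above; labels are illustrative.
single_rates = {"upper pair": 0.514, "lower pair": 0.508}
multiple_eyeball = {"upper pair": 0.503, "lower pair": 0.498}

factor = 0.98  # common proportional reduction proposed in the comment

for pair, single in single_rates.items():
    predicted = single * factor
    observed = multiple_eyeball[pair]
    implied = observed / single  # factor each pair would need on its own
    print(f"{pair}: predicted {predicted:.4f}, eyeballed {observed:.3f}, "
          f"implied factor {implied:.4f}")
```

Both predictions land within 0.001 of the eyeballed multiple-birth averages, which is consistent with the "same reduction" hypothesis, though with readings this coarse it is not strong evidence for it.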
