Amusing example of the fallacy of controlling for an intermediate outcome, or, the tyranny of statistical methodology and how it can lead even well-intentioned sociobiologists astray

I’m sorry to see that the Journal of Theoretical Biology published an article with the fallacy of controlling for an intermediate outcome. Or maybe I’m happy to see it because it’s a great example for my classes. The paper is called “Engineers have more sons, nurses have more daughters” (by Satoshi Kanazawa and Griet Vandermassen) and the title surprised me, because in my acquaintance with such data, I’ve seen very little evidence of sex ratios at birth varying much at all. The paper presents a regression in which:

– the units are families

– the outcome is the number of sons in the family (they do another regression where the outcome is the number of daughters)

– the predictors of interest are indicators for whether the parent’s occupation is “systematizing” (e.g., engineering) or “empathizing” (e.g., nursing)

– the regression also controls for parent’s education, income, age, age at first marriage, ethnicity, marital status, and number of daughters (for the model predicting the # of sons) or number of sons (for the model predicting # of daughters).

The coefficients of “systematizers” and “empathizers” are statistically significant and large (i.e., “practically significant”) and the authors conclude that systematizers are much more likely to have boys and empathizers are much more likely to have girls, and then back this up with some serious biodeterministic theorizing (e.g., referring to occupations as “brain types”!).

But, but, but . . . their results cannot be interpreted in this way!

The problem is that their analysis controls for intermediate outcomes (a problem which I noted in another study of sons and daughters, although in that case the fallacy had only minor effects on the results). Controlling for total #kids of the other sex is a huge problem since different people may very well go for another kid or not after having one son, or one daughter, or whatever. In addition, income is an intermediate outcome (to the extent we think of occupation as a “treatment”) so it does not make sense to compare two people with different occupation-classes and the same income. Also, occupation itself is intermediate, in that one might, for example, decide to become a social worker after having a girl. Divorce is another intermediate outcome (as has in fact been studied).
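To see how controlling for the number of kids of the other sex can manufacture a difference, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not the paper's data or method: two hypothetical groups of parents, each with Pr(boy) = 0.5, where one group keeps going until the first son and the other until the first daughter, with a cap of four kids. The names `have_family` and `summarize` are made up, and comparing families within the "exactly one daughter" stratum stands in for the regression adjustment.

```python
import random

def have_family(rng, wanted_sex, max_kids=4, p_boy=0.5):
    """Keep having kids until one of `wanted_sex` arrives or max_kids is reached."""
    sons = daughters = 0
    while sons + daughters < max_kids:
        sex = "boy" if rng.random() < p_boy else "girl"
        if sex == "boy":
            sons += 1
        else:
            daughters += 1
        if sex == wanted_sex:
            break
    return sons, daughters

def summarize(families):
    """Overall share of boys, and mean number of sons among one-daughter families."""
    total_sons = sum(s for s, _ in families)
    total_kids = sum(s + d for s, d in families)
    one_daughter = [s for s, d in families if d == 1]
    return {
        "prop_boys": total_sons / total_kids,
        "mean_sons_given_one_daughter": sum(one_daughter) / len(one_daughter),
    }

rng = random.Random(1)  # fixed seed so the sketch is reproducible
want_son = [have_family(rng, "boy") for _ in range(20000)]
want_daughter = [have_family(rng, "girl") for _ in range(20000)]
stats_son = summarize(want_son)
stats_daughter = summarize(want_daughter)
print(stats_son)
print(stats_daughter)
```

In this toy setup the overall share of boys is close to one half in both groups, yet among families with exactly one daughter the two groups differ in their average number of sons. The gap comes entirely from the stopping rule, not from any difference in the sex of a given birth.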

The funny thing is that the authors almost . . . almost . . . saw the problem, when they wrote:

when we control for all the variables included in our equations, those who have more biological daughters have fewer biological sons, and those who have more biological sons have fewer biological daughters. This seems to suggest that parents specialize in producing children of one sex or the other, some producing mostly or exclusive boys, and others producing mostly or exclusively girls.

Well, yeah. No biological explanation needed here: just think (for simplicity) of what the data would look like if every couple had 1 kid, or 2 kids. The actual data are a mixture of different numbers of kids, and there’s no reason ahead of time to expect these regression coefficients to be zero, even if (as is roughly the case) the sex of babies born is completely random.
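The thought experiment can be sketched in a few lines. This is an illustrative simulation under stated assumptions, not a reanalysis of the paper: each birth is a boy with probability 0.5, family size is drawn at random from a fixed list, and `simulate_families` and `ols_slope` are hypothetical helper names. It regresses number of sons on number of daughters with no other predictors.

```python
import random

def simulate_families(n_families, sizes, p_boy=0.5, seed=0):
    """Return (sons, daughters) pairs; each family's size is drawn from `sizes`."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        n_kids = rng.choice(sizes)
        sons = sum(rng.random() < p_boy for _ in range(n_kids))
        families.append((sons, n_kids - sons))
    return families

def ols_slope(pairs):
    """Least-squares slope from regressing y on x, given (y, x) pairs."""
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    cov = sum((y - mean_y) * (x - mean_x) for y, x in pairs)
    var = sum((x - mean_x) ** 2 for _, x in pairs)
    return cov / var

two_kids = simulate_families(20000, sizes=[2])   # every couple has 2 kids
mixed = simulate_families(20000, sizes=[1, 2])   # half 1 kid, half 2 kids
slope_two = ols_slope(two_kids)   # sons regressed on daughters
slope_mixed = ols_slope(mixed)
print(slope_two, slope_mixed)
```

With exactly two kids per family, sons = 2 - daughters, so the slope is -1 by construction. With the mixture of one-kid and two-kid families the slope is still clearly negative, even though every birth in the simulation is a coin flip.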

I’m sorry to see that this slipped by peer review, but I’m really sorry to see that it got slashdotted.

P.S. As a bonus point, the paper presents regression results to four significant digits (e.g., 03498 +/- .1326).

P.P.S. Juicy sociobiological quote: “Thus, it pays parents in good conditions to bet on male rather than female offspring.” Perhaps they should have done a study in Las Vegas and checked out the actual betting odds…

P.P.P.S. OK, I’m sorry for making fun of the study. I’ve proven a false theorem myself (well, I guess “proved” isn’t the right word), glass houses and all that.

P.P.P.P.S. Just to bring this to a close: the authors make “the following novel prediction: individuals who have Type S brains should have more sons than daughters, while individuals who have Type E brains should have more daughters than sons” [italics in the original]. But they never check this (as far as I can see)! They just run regressions.

This is truly an example of the tyranny of statistical methodology: if only these biologists (well, actually one of them is at an Institute of Management and another is at a Center for Gender Studies) had calculated some simple averages, they might have learned something, but through fancy regression they were led astray.
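The "simple averages" could be as little as the pooled share of girls within each occupation group. A minimal sketch follows; the rows are invented placeholders and not numbers from the paper, and `girl_share_by_group` is a hypothetical helper name.

```python
from collections import defaultdict

def girl_share_by_group(records):
    """Pooled share of girls per group: total girls / total kids."""
    totals = defaultdict(lambda: [0, 0])  # group -> [girls, kids]
    for group, sons, daughters in records:
        totals[group][0] += daughters
        totals[group][1] += sons + daughters
    # Assumes every group has at least one child in total.
    return {group: girls / kids for group, (girls, kids) in totals.items()}

# Invented placeholder rows: (occupation type, sons, daughters).
toy_records = [
    ("systematizing", 2, 1),
    ("systematizing", 1, 1),
    ("empathizing", 0, 2),
    ("empathizing", 3, 1),
]
shares = girl_share_by_group(toy_records)
print(shares)
```

Pooling girls over all kids in a group, rather than modeling the count of girls per family, keeps family size out of the comparison.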

17 thoughts on “Amusing example of the fallacy of controlling for an intermediate outcome, or, the tyranny of statistical methodology and how it can lead even well-intentioned sociobiologists astray”

  1. This seems to me to be an example of collinearity, where intermediate outcomes bias the regression weights due to their relationship with the predictor of interest. Is the intermediate outcomes problem simply a special case of collinearity, or is there something else going on here that I am missing?

    Thanks for the interesting topics, and keep up the good work.

  2. Looks like you could get a cheap paper in JTB. :-)

    Seriously, it would be good to have a rebuttal. Otherwise some biologists will believe it.


  3. It's more a problem of endogeneity, not collinearity (although if the intermediate outcome were orthogonal to all of the other variables, then it wouldn't be a problem, but that's true for many regression problems including omitted variable bias)

  4. P.S. As a bonus point, the paper presents regression results to four significant digits.

    Why a bonus point? Jacob Cohen: "A less profound application of the less-is-more principle is to our habits of reporting numerical results. There are computer programs that report by default four, five, or even more decimal places for all numerical results. Their authors might well be excused because, for all the programmer knows, they may be used by atomic scientists. But we social scientists should know better than to report our results to so many places. What, pray, does an r = .12345 mean? or, for an IQ distribution, a mean of 105.6345? For N = 100, the standard error of the r is about .1 and the standard error of the IQ mean about 1.5. Thus, the 345 part of r = .12345 is only 3% of its standard error, and the 345 part of the IQ mean of 105.6345 is only 2% of its standard error. These superfluous decimal places are no better than random numbers. They are actually worse than useless because the clutter they create, particularly in tables, serves to distract the eye and mind from the necessary comparisons among the meaningful leading digits. Less is indeed more here."
    Things I have learned (so far).
    By Cohen, Jacob
    American Psychologist. 45(12), Dec 1990, 1304-1312.

    From Andrew's favorite psychology journal! Although I suggest you should switch your preference to "Current Directions in Psychological Science".


  5. Denis,

    I was being ironic about the bonus point. Or, to put it another way, I was using a golf-like scoring system in which more points is bad.

  6. I rarely read papers in whole, but I like to see their results, e.g., Tables 1 & 2. While the linear regression models are nasty, practically all applications of linear regression fail to properly deal with correlated variables. The causal issues are valid, but it would be quite hard to do the proper analysis that would involve tracing the employment and marriage history of each person in the study.

    The results are interesting: married people have more sons, blacks have more daughters, education reduces the number of children, systematizing occupations increase the number of sons, and empathizing occupations increase the number of daughters; overall, empathizing occupations produce more children.

    Similarly, Table 2 says that blacks have larger families, mother's empathizing occupation contributes a lot to the number of children, especially daughters, mother's education and especially mother's systematizing occupation suppresses children.

  7. Aleks,

    I agree that this is a fun dataset and that the regression results are intriguing. I disagree with your implication that the causal issues are so difficult as to be hopeless. This paper made some clear and avoidable mistakes.

    To start with, they should present some aggregate statistics. For example, a table (or, better still, graph) of aggregate percentages of girls (i.e., total #girls/(total #girls + total #boys)) for each combination of parents' occupations. Then, if they want to do regressions, fine: control for ethnicity and maybe education, but don't control for #kids of the other sex. This is where all the problems come in. Well, that and the fact that they use number of girls (or boys) as an outcome rather than proportion.

    Another way to see it is to break it up into 2 problems: (a) probability of going for another kid, and (b) probability that a given kid is a girl or boy. My understanding from what I've read is that (b) is almost entirely random. Hence, although they're talking about (b) all over the paper, the actual results are mostly due to a complex manifestation of (a) in the regression context.

    To put it another way: yes, there's interesting info there about family size being predicted by various parental characteristics. That's also well known and wouldn't have resulted in a published paper (let alone slashdotting!). What was new (but, I think, false) about this paper was the claim about the sex ratio. So, no, I don't think they should get credit for performing an analysis that takes a well-known pattern, pipes it through a regression with huge causal problems, and then comes up with a result which they can misinterpret.

    P.S. You write "it would be quite hard to do the proper analysis that would involve tracing the employment and marriage history of each person in the study." Well, yeah, but that's why studies that do this sort of hard work get extra respect!

  8. Andrew writes:

    "even if (as is roughly the case), the sex of babies born is completely random."

    "…(b) probability that a given kid is a girl or boy. My understanding from what I've read is that (b) is almost entirely random"

    This would come as a surprise to at least the last two generations of biologists. You might want to read a bit more before holding forth so confidently.

  9. Anon,

    I'd recommend our new book, but it won't be out until October. For something at a more basic level, I'd recommend The Statistical Sleuth by Ramsey and Schafer (although I don't know if they address this particular issue).


    I admit to ignorance of biology. I looked up the sex ratio stuff a few years ago when discussing it in a book, and my impression was that Pr(girl) is between .48 and .49 under a wide range of conditions. I wouldn't be shocked to see subgroups where Pr(girl) could differ by 1% from this average rate, but I'd be surprised to see effects of 10% or 20% or more, which seemed to be implied by the paper discussed in the blog entry.

    On the other hand, I certainly believe that two generations of biologists know more than I do on the matter, so if there's evidence of large variations in the human sex ratio, please point me to the appropriate references and I can correct it in the next printing of our book.

  10. The intermediate outcome issue was raised (using means) by H. C. Lombard, a Swiss physician, in 1835. His data base was 4,000+ death certificates that had age at death and profession on them. Lombard calculated average age-at-death for each profession and discovered that the most dangerous profession was that of "student" with an average-age-at-death of 20.2! He recognized the problem for students, but believed all the other results.
    (safest professions were chocolatiers and retirees).

  11. It seems like either physiological or choice factors could be influencing the results. Did the authors state a belief that it was one or the other? Can you post a pdf for us lowly blogsurfer nonacademic civilians?

  12. Excellent. This is science at its greatest. Studies and rebuttals and a vigorous, objective debate as to the exact methodologies and whether it is correct.

    I once heard someone say the greatest moment in science is when a person who has stuck to his beliefs all his life sees new data, and congratulates his opponent on making an excellent point/discovery.

    Hope the ensuing debate is healthy AND objective. We are all after the right answer after all. Not to be right. Thanks for the interesting read.

  13. Just so everyone's clear, it's the causation that's the big thing, as Andrew points out. "Engineers have more sons, nurses have more daughters" can be true in a statistical sense, but not necessarily with a biological cause, if engineers are more likely to "keep trying" until they get a son whereas nurses are more likely to "keep trying" until they get the daughter which they want.

    After all, we don't assume that the fantastically improbable sex ratios in India (which are worse among the educated), which grow more skewed the more daughters a family has without a son, arise because the parents' bodies react to conceive a son. No, we recognize that sex-selective abortion goes on.

    This study did correct for ethnicity, which might reduce any hypothesized effects along the lines of recent Asian and Middle Eastern immigrants disproportionately favoring sons and disproportionately being engineers. But social effects could still happen.

    The "parents specialize" bit is especially silly. Of course a larger # of daughters will be correlated with a smaller # of sons and vice versa.

    Also, occupation itself is intermediate, in that one might, for example, decide to become a social worker after having a girl.

    That presumably would go to a study similar to the politics one that you mentioned before. (Which also had issues, of course.)

    OTOH, some of the reports I've read on this study are claiming that it found a fairly large (even significant) effect on one's first child. Is that claim true? It would seem to be more significant.

    I'm baffled as to why they didn't include the overall aggregate proportions of kids of each sex in there. You're absolutely right there.

  14. Ah, wait. I was getting this article confused with another one in the Journal of Theoretical Biology by the same author. Now Kanazawa claims that, according to the National Longitudinal Study of Adolescent Health data, people judged as attractive are 36% more likely to have a daughter as their first child.
