How not to analyze noisy data: A case study

I was reading Jenny Davidson’s blog and came upon this note on an autobiography of the eccentric (but aren’t we all?) biologist Robert Trivers. This motivated me, not to read Trivers’s book, but to do some googling which led me to this paper from PLOS ONE, “Revisiting a sample of U.S. billionaires: How sample selection and timing of maternal condition influence findings on the Trivers-Willard effect.”

This paper is really bad. It has a bunch of fatal statistical errors.

The paper is not on a particularly important topic, it seems to have had little or no scientific influence or media coverage, and it was published in a non-prestigious journal.

So this post is not about casting doubt on some Ted talk or whatever.

Rather, consider this as a case study in statistical errors. For this purpose, perhaps it’s a good thing that the paper in question is obscure. Statistical errors occur all over the place—indeed it is reasonable to suppose they are more common in obscure work.

It just happens that this particular paper is on a topic with which I’m already familiar, so it’s particularly easy for me to spot the errors. When you read a bad paper on a familiar topic, the errors just pop right out; it’s as if you were wearing 3-D glasses.

The paper

Here’s the abstract:

Based on evolutionary theory, Trivers & Willard (TW) predicted the existence of mechanisms that lead parents with high levels of resources to bias offspring sex composition to favor sons and parents with low levels of resources to favor daughters. This hypothesis has been tested in samples of wealthy individuals but with mixed results. Here, I argue that both sample selection due to a high number of missing cases and a lacking specification of the timing of wealth accumulation contribute to this equivocal pattern. This study improves on both issues: First, analyses are based on a data set of U.S. billionaires with near-complete information on the sex of offspring. Second, subgroups of billionaires are distinguished according to the timing when they acquired their wealth. Informed by recent insights on the timing of a potential TW effect in animal studies, I state two hypotheses. First, billionaires have a higher share of male offspring than the general population. Second, this effect is larger for heirs and heiresses who are wealthy at the time of conception of all of their children than for self-made billionaires who acquired their wealth during their adult lives, that is, after some or all of their children have already been conceived. Results do not support the first hypothesis for all subgroups of billionaires. But for males, results are weakly consistent with the second hypothesis: Heirs but not self-made billionaires have a higher share of male offspring than the U.S. population. Heiresses, on the other hand, have a much lower share of male offspring than the U.S. average. This hints to a possible interplay of at least two mechanisms affecting sex composition. Implications for future research that would allow disentangling the distinct mechanisms are discussed.

Set aside the theoretical problems with this work, as I’ll just be talking about the statistics.

Dead on arrival

The biggest error in the paper, the error that makes the whole thing worthless, is that the noise is so much larger than the signal.

Let’s do the math. N=1165 children in the study. The comparison with the least uncertainty would be simply to take the raw proportion of girls in this sample and compare to the known proportion in the general population. The standard error is simply .5/sqrt(1165) = .015, that’s 1.5 percentage points.
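
The arithmetic is easy to check in R. This is a minimal sketch: the variable names are mine, and the 0.5 is the usual approximation to sqrt(p(1-p)) for a proportion close to one half.

```r
# Standard error of a sample proportion, using p = 0.5 as an approximation
# (birth sex ratios are close enough to 50/50 that this barely matters).
n <- 1165                      # number of children in the study
p <- 0.5                       # assumed proportion, used only for the variance
se <- sqrt(p * (1 - p) / n)    # same as 0.5 / sqrt(n)
print(round(se, 3))            # roughly 1.5 percentage points
```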

Now, effect sizes. The difference in proportion girl births, comparing billionaires to the general population, has to be much much smaller than this. Compare billionaires to other white people, it will be even smaller. It’s really hard to imagine any “billionaire difference” to be anywhere near the difference in proportion girl births, comparing white and black Americans, which is around .005.

Suppose the true effect size is .002. (I actually think it’s less.) Even if it’s as large as .002, if the standard error is 0.015, that’s basically impossible to detect. We’re in kangaroo territory.

Let’s do the design analysis:

> retrodesign(.002, .015)
[1] 0.052

[1] 0.35

[1] 17.6

That’s right, if the true effect size is .002, this study has a power of 5.2% (that is, a 5.2% chance of getting a statistically significant p-value), a type S error rate of 35% (that is, a 35% chance that an estimate, if statistically significant, would be in the wrong direction), and an exaggeration factor of 17 (that is, an estimate, if statistically significant, would be on average 17 times larger than the true effect).

Or what if you wanted to make the bold, bold claim that billionaires differ from the general population in their sex ratio by as much as whites differ from blacks? Run the program, and you still get a power of only 6%, a type S error rate of 17%, and an exaggeration factor of 7.
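
The post calls retrodesign() without showing its definition. Here is a sketch of a function with the same interface, assuming the estimate is normally distributed around a positive true effect A with standard error s, and a two-sided 5% threshold. The name retrodesign_sketch, the n_sims and seed arguments, and the use of simulation for the exaggeration factor are my assumptions; it is not necessarily identical to the function that produced the output above.

```r
# A sketch of a design analysis, assuming the estimate is normally
# distributed around the true effect A (> 0) with standard error s.
# The function name and simulation settings are illustrative assumptions.
retrodesign_sketch <- function(A, s, alpha = 0.05, n_sims = 1e6, seed = 1) {
  z <- qnorm(1 - alpha / 2)            # significance threshold in s.e. units
  p_hi <- 1 - pnorm(z - A / s)         # Pr(significant with the correct sign)
  p_lo <- pnorm(-z - A / s)            # Pr(significant with the wrong sign)
  power <- p_hi + p_lo
  type_s <- p_lo / power
  # Exaggeration factor by simulation: average |estimate| among the
  # statistically significant replications, relative to the true effect.
  set.seed(seed)
  estimate <- A + s * rnorm(n_sims)
  significant <- abs(estimate) > s * z
  exaggeration <- mean(abs(estimate[significant])) / A
  list(power = power, type_s = type_s, exaggeration = exaggeration)
}

r1 <- retrodesign_sketch(0.002, 0.015)   # hypothesized effect of .002
r2 <- retrodesign_sketch(0.005, 0.015)   # white/black difference of .005
print(round(unlist(r1), 3))
print(round(unlist(r2), 3))
```

Under these assumptions the closed-form pieces should reproduce the power and type S figures quoted above. The exaggeration factor comes from simulation, so it will wobble slightly with the seed.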

In short, such a study is hopeless no matter what. It’s dead on arrival. It’s a wild throw of the dice to even attain statistical significance, but it’s worse than that, as any statistically significant estimate would be essentially noise.

Deader on arrival

But what about the other analyses in the paper, for example the comparisons between subgroups of billionaires? For these comparisons, the statistics are even worse!

Let’s consider a best-case scenario, comparing two groups that are (essentially) equal-sized: 582 babies in one group, 583 in the other. The difference in proportion girls in these groups will have standard error sqrt(.5^2/582 + .5^2/583) = .029, that’s twice the standard error from above. (That’s the general pattern, that comparisons or interactions have twice the standard error of averages or main effects: you get a factor of sqrt(2) from the halving of the within-group sample size and another factor of sqrt(2) from the differencing.)
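
The factor of 2 can be checked directly. This is a small sketch with my own variable names, again using 0.5 in place of sqrt(p(1-p)) and treating the two groups as independent.

```r
# Standard error of a difference between two independent proportions,
# compared with the standard error of the pooled proportion.
n1 <- 582
n2 <- 583
se_diff <- sqrt(0.5^2 / n1 + 0.5^2 / n2)   # s.e. of the between-group difference
se_all  <- 0.5 / sqrt(n1 + n2)             # s.e. of the overall proportion
print(round(se_diff, 3))
print(round(se_diff / se_all, 2))          # ratio of the two standard errors
```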

So this aspect of the study is even more useless. Again, let’s consider a hypothetical effect size of .002:

> retrodesign(.002, .029)
[1] 0.05

[1] 0.42

[1] 33.9

Power of 5%, type S error rate of 42%, exaggeration factor of 33. You can’t get much noisier than that.

Researcher degrees of freedom

The paper has other errors, of course. It almost has to, given that statistical significance was found under such inauspicious conditions.

The most obvious problem is multiple comparisons: the researcher has many degrees of freedom in deciding what to look at, hence he can keep looking and looking until he finds something statistically significant. In the paper at hand, we see:

– Billionaires compared to the general population,
– Heirs compared to self-made billionaires,
– Comparison just of male billionaires,
– Comparison of heiresses to the general population,
– Comparison of heiresses to self-made billionaires,
– Comparison of heiresses to heirs.

The author does a multiple comparisons correction and finds no significance, which is kinda funny because then he reports the differences as if they reflect real patterns in the population.

In any case, the multiple comparisons correction understates the problem because (a) there are lots of other comparisons floating around in the data that the researcher could’ve noticed and surely would’ve reported had they been notable, and (b) there are a bunch more researcher degrees of freedom in the data-exclusion and data-classification rules (for example, the division of heiresses into those who inherited from parents and those who inherited from spouses).

Again, given the variation and sample size in the context of possible effect sizes, the study had no chance of succeeding in any case, so I don’t don’t don’t recommend anyone try a preregistered replication. The point of the above discussion of forking paths and degrees of freedom is just to explain how the researcher could’ve found statistical significance out of what is essentially pure noise.
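
To get a feel for how easily noise produces "findings," here is a small simulation. It assumes six independent comparisons of pure noise, each summarized by a standard normal z-score. That independence is an assumption of mine for illustration: the comparisons in the paper overlap, and the simulation ignores the extra forking paths discussed above.

```r
# Chance of at least one "significant" result among k comparisons of pure
# noise. Assumes the k comparisons are independent, which the overlapping
# subgroups in the paper are not; this is only an illustration.
set.seed(123)
k <- 6                                    # comparisons listed above
n_sims <- 100000                          # simulated studies
z <- matrix(rnorm(n_sims * k), nrow = n_sims, ncol = k)
any_sig <- rowSums(abs(z) > qnorm(0.975)) > 0
sim_rate <- mean(any_sig)                 # share of studies with a "finding"
exact_rate <- 1 - 0.95^k                  # closed form under independence
print(round(c(simulated = sim_rate, exact = exact_rate), 3))
```

Even under these simplified assumptions, roughly a quarter of pure-noise studies would turn up at least one unadjusted p-value below .05.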

Interpretation of results

Finally, the paper at hand also demonstrates several standard mistakes associated with p-values:
– The use of one-sided tests in a context where departures in either direction would be notable,
– The reporting of a p-value near .05 as “almost statistical significance,”
– A “robustness check” that is almost identical to the original analysis (in this case, a logistic regression instead of a comparison of proportions),
– Selected non-significant differences interpreted based on their signs as being “consistent with the stated hypothesis,”
– An observed proportion being reported as “considerably lower than that of the general population,” without noting that this difference is entirely explainable by chance,
– A non-significant difference being taken as evidence of the null hypothesis (“Given that this difference is not statistically significant, it speaks against the first hypothesis that billionaires have a higher percentage of male offspring than the general population.”),
– And, of course, comparisons between significance and non-significance.

Followed by tons of storytelling. It’s tea-leaf-reading without the tea.
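
The last item on the list is worth a quick illustration. The numbers below are hypothetical, not taken from the paper: two subgroup estimates with the same standard error, one of which clears the .05 threshold while the other does not. The sketch assumes independent groups and normal sampling distributions.

```r
# Hypothetical numbers, not taken from the paper: two subgroup estimates
# with the same standard error, one "significant" and one not.
est <- c(group_a = 0.06, group_b = 0.03)
se  <- c(group_a = 0.03, group_b = 0.03)
z_each <- est / se                               # z-scores of about 2 and 1
p_each <- 2 * (1 - pnorm(abs(z_each)))           # two-sided p-values

# The comparison that matters is the difference between the two estimates.
diff_est <- est[["group_a"]] - est[["group_b"]]
diff_se  <- sqrt(sum(se^2))                      # assumes independent groups
p_diff   <- 2 * (1 - pnorm(abs(diff_est / diff_se)))

print(round(p_each, 3))
print(round(p_diff, 3))
```

One estimate is "significant" and the other is not, yet the difference between them is well within what chance alone would produce.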

I have no desire to pick on this particular researcher—that’s why I have not mentioned his name in this post. The name is no secret (you can find it by just clicking on the link above that has the research article), but I want to focus on the very very common statistical errors rather than on which faceless scientist happened to be making them that day.


These are real errors, and they’re avoidable errors. But you’ll make them too, over and over again, if you do statistics using the grab-some-data-and-look-for-statistical-significance approach.

The point of this post is not to pile on and criticize an obscure paper in an obscure journal by an author we’ve never heard of. The point is to help you and your colleagues avoid these same errors in your own work, errors you might well make in higher-stakes situations where you’re under pressure to find results and where you might not see the forest for the trees.

The paper discussed above is almost a laboratory setting of statistical misunderstanding, where a researcher was able to use standard statistical tools to wrap himself in a web of confusion. Again, it’s nothing personal—statistics is hard, and I’m sorry to say that we in the statistical profession often sell our methods as a way of distilling certainty from noise.

The author of this paper inadvertently made a whole bunch of errors all in one place. As discussed above, it is no coincidence that these errors occurred together. When you start with hopelessly noisy data and you add to this the practical necessity to obtain statistical significance, all hell will break loose. It’s kinda sad to have to admit that the dataset you spent so many months painfully constructing, does not have enough information to answer any of your research questions—but that’s how it goes sometimes. Just too bad nobody told this guy about these issues before he started his study.

So remember these statistical errors here, in this clean setting, and watch out for them in your world.


  1. John Hall says:

    I really like this post. You should think about turning it into a short paper.

  2. Bill Harris says:

    Have you thought of incorporating retrodesign into the arm package to make its use a bit more convenient?

  3. Bill Harris says:

    You note, “Selected non-significant differences interpreted based on their signs as being ‘consistent with the stated hypothesis,’” but that sounds at least related to a statement I recall from Gelman & Hill that it’s often useful to include non-significant predictors that have the expected sign. Yes, I think you and Jennifer made that point in a predictive sense, and I sense the author is writing about inference, but perhaps that distinction is worth making explicit to avoid confusion.

  4. Jeff McLeod says:

    The same problem occurs in medical research when assessing the relative risk of acquiring a rare disease by analyzing the risk ratios between the two groups. The ratio’s standard error is proportional to 1/N, so the same problem exists. However the risk ratio itself has a very salient meaning, easily communicated. An RR = 1.5 means that the index group has 1.5 times the risk of acquiring the disease compared to the comparison group.

    Editors of prestigious medical journals (JAMA, Lancet, etc) say that if you don’t have a relative risk ratio of at least 4.0, forget it. They implicitly weigh the impact of the result on public health policy before publishing the article.

    Here, the editors are the gatekeepers between signal and noise. Of course because it’s medicine, the public health implications of effect sizes are easier to envision. How do you set the criteria of importance for social science effect sizes? I’m convinced it is possible, but it would require some discipline.

    • Garnett says:

      Wow! relative risk of at least 4 for publication??? What if the consequences of getting that rare disease are very, very serious?

    • george says:

      “Editors of prestigious medical journals (JAMA, Lancet, etc) say that if you don’t have a relative risk ratio of at least 4.0, forget it. They implicitly weigh the impact of the result on public health policy before publishing the article.”

      Good story but not actually true. Look in Table 2 here for some RRs much smaller than 4, published in JAMA. Search the JAMA archives for “relative risk” and you can find a bunch more.

      Also note Andrew’s “exaggeration factor”; seeing a larger-than-expected effect size in a published low-power analysis, alone, is not good evidence of an actually-large effect. Reported RR=4 may not be as impressive as it initially seems.

      • Anoneuoid says:

        This. I looked it up earlier and saw the same thing, it seems untrue. Second, that scheme actually does not encourage the submission of good scientific reports any more than the statistical significance obsession. It actually seems like an even more direct method at producing the same problems, it is still based on NHST thinking.

        It isn’t effect size that indicates a finding is interesting. It is a high ratio of p(Hi)p(O|Hi)/sum(p(H[0:n])p(O|H[0:n])), ie that the observations are much more likely under one hypothesis than any other people can come up with. Optimize for that and you will be on the right track. Unfortunately (apparently, since most people do not seem to enjoy that type of task) that means you will have to spend some effort on actually turning your speculations into a quantifiable model.

  5. Jonathan says:

    Ironically enough this was posted the same day that this was posted on a prominent blog:

  6. gdanning says:

    The billionaires study was discussed in the Wikipedia page on the Trivers-Willard effect, which is not a very long article, so it stood out. I say “was discussed” because I just edited the page to delete that paragraph, and cited this post in the “explanation of edits” section. We will see what happens.

  7. Lauren says:

    In an otherwise good paper, what’s wrong with “The reporting of a p-value near .05 as “almost statistical significance””?

    • Paul says:

      Probably because .05 is an arbitrary cut-off and your conclusion shouldn’t really change if your p-value is .045 vs .055.

    • I understand it like this: The principle behind NHST is that it is supposed to be a minimum criterion for whether any conclusions can be drawn, from which follows that a result is either significant or it is not. Only if the result is significant can we continue the analysis and draw any conclusions, weak or strong. You can’t start by fixing alpha to, for instance, 0.05 and then say that a p=0.06 is “almost significant” and still be working within the NHST framework. In a sense, there is no such thing as “almost significant” within the NHST paradigm.

  8. Lauren says:

    In my view, it’s good on you for not mentioning the scientist’s name.

  9. Adam says:


    I wonder if you’d be willing to compile these errors into a longer list that we could use as a checklist of things to avoid in our own research, or even in our teaching.

    For me as a non-statistician scientist, I think it would be very useful to help me avoid wasting my time on bad research endeavors.

    • I appreciate the desire to avoid wasting time and resources, but I think the goal of using a “checklist” to see if you’ve avoided all the pitfalls is maybe the wrong way to view statistics. One of the big problems with statistics is the “canned one size fits all software pushbutton” approach. I don’t even personally use the term statistics that much, to me, the question is in building mathematical models, statistics is just one of the tools for doing that the way calculus or algebra or graph theory is… focus your efforts on learning how to actively DO GOOD MODELING rather than avoiding a set of traps in a pushbutton approach.

  10. Ironically this paper cites your work twice, in footnotes 16 and 39.

    16. Gelman AB, Weakliem DL (2009) Of beauty, sex, and power: Too little attention has been paid to the statistical challenges in estimating small effects. American Scientist.

    39. Gelman A, Hill J, Yajima M (2012) Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness 5: 189–211. doi: 10.1080/19345747.2011.618213

    It seems that the author believes that he has dealt with the issues of small effects and multiple comparisons–but this seems dubious, especially in the second instance. He’s actually going so far as to *justify* the lower p-values in terms of the multilevel models.

    “In a simultaneous coefficient test that corrects for multiple comparisons, the difference between heiresses and heirs remains significant in a one-sided (p≈.032) and almost significant in a two-sided test (p≈.064). The slightly lower p-values may be explained by the higher power of multilevel models [39] – although here that effect should be small, given the extremely low degree of within-parent correlation.”

    I may be misinterpreting this, but it seems that he’s saying, “Well, because of the multilevel models, we expect lower p-values anyway, although the effect of the multilevel model should actually be small in this case.” This seems both contradictory and inconsistent with the argument of the paper cited.

    • I realized afterward that I could make sense of the quote only in the context of the whole paper. When I started piecing the argument together, though, I saw a big underlying problem.

      The entire reason for subdividing the billionaire population into heirs, heiresses, and self-made billionaires is that–supposedly–the heirs and heiresses were already wealthy, or knew of their future wealth, at the time of conception:

      “In order to fill this gap and to analyze the role of timing, I revisit data on the richest 400 US Americans [19], amended with information on whether wealth was inherited or self-made, an information that allows one to distinguish the intergenerational timing of wealth accumulation: A much higher percentage of heirs will already (know to) be wealthy when conceiving their first and all subsequent children, whereas self-made billionaires may not have achieved high-wealth status upon conception of at least some of their children.”

      The author does not actually know how many heirs and heiresses in this study *did* know of their wealth at the time of conception. Sure, one can imagine that if their parents are already billionaires, they can expect a lot of money–but it’s quite possible that the percentage of exceptions (in this small sample) is significant, which throws off the entire premise.

      Then, on top of that, we’re looking at only 37 heiresses here–and the author brings it down to 26:

      “If the inheritance comes from an heir, wealth-status was achieved before the childbearing years. Therefore, these women should be treated like those married to a living heir. In contrast, if the inheritance comes from a self-made billionaire, wealth-status may not have been achieved upon conception of at least some of the children of these women. In this case they should be treated like women married or partnered to a living self-made billionaire. Therefore, the 11 heiresses identified above who inherited from their deceased partner should be reassigned accordingly. This leaves the following dyadic constellations that are used in subsequent analyses: 262 self-made billionaires with wives or partners, 86 heirs with wives or partners, and 26 heiresses who inherited from their parents with husbands or partners.”

      So, now we’ve got 26 heiresses. Supposedly their share of male offspring is 42.7%. Suppose they have an average of two children. That’s approximately 22 boys and 30 girls in all. If you add just one heiress with a boy, you’re already up to 43.4%, which just suggests how unstable this is.

      If you consider, on top of this, that we really don’t know how many of these heiresses actually knew of their wealth (and were confident in it) at the time of conception, then these numbers tell us even less.

  11. Radford Neal says:

    I don’t understand why you think the power analysis should be conducted assuming that the effect, if it exists, is small.

    The hypothesis is that natural selection has led to wealthy people having more sons, because investment in sons can produce a huge payoff in number of grandchildren, much more than for daughters (assuming a society with polygamy, lots of illegitimate children, or some such…)

    If the hypothesis is true, it would be because some genes exist that implement this conditional policy preferring sons. For these genes to be selected for, I would think the effect of these genes would need to be large, especially since they have an effect only in the intrinsically rare situation of high relative wealth.

    So a study looking for a large effect seems fine to me. Except, maybe, that we’ve already eliminated such a large effect based on casual observation…

    • Andrew says:


      All things are possible, but I doubt there could be any large difference in Pr(girls) comparing billionaires to others. You’re talking about something that would happen over many many generations, in which case there’s nothing so special about people who are currently billionaires. Researchers have studied sex ratio in all sorts of ways and have not come up with much. Just about the only difference of more than a half a percentage point comes when looking at extreme poverty or famine. I just don’t think the math adds up to allow a big difference for billionaires or any other such socially-defined group.

      • Michael says:

        I think according to the author’s hypothesis, this is not something that happens over many generations, it is a programmed response to the environment that is in everyone’s DNA, and would be triggered by wealth (somewhat similar to how water fleas can grow protective shells or not, depending on the presence of chemicals that indicate the presence of predators during early development). That would mean that you would want to look at children conceived while their parents are billionaires.

        I agree with your overall conclusions, though, if there was a large effect to be found, it should have been found by the earlier studies or in the main effect in this one.

    • Jonathan says:

      Whoa! We know the sex ratios of births so natural selection hasn’t biased selection toward boys in general and everything we know about fertility says that poor people have more kids, not fewer because more children have an overall survival advantage in this cruel world and, it seems, that a wealth effect is not “genetic” but rather that people with stuff want to pass enough of it to their progeny that they don’t want to dilute it among a lot of children. Unless you’re breeding warriors, like ants, then more wealth tends to lead to fewer kids. It may be that fewer kids means the number of sons stands out; you may notice the sons who take over businesses because so often women change names on marriage and taking over the business has generally fallen to males. But the number of sons compared to daughters? There is no selection effect for that because survival of your corporate wealth is not the same as survival of your genes, which pass to either sex. That is the silly thing being tested: the idea that natural selection somehow extends to this notion that very wealthy people have more sons than daughters because somehow that is a selected result. But that confuses what selection is: natural selection confers a survival advantage over time and survival means something literal, like your life in your genes, and not the survival of your bank account or other attribute attained during life. In other words, the study actually goes back to the Darwin versus the Lamarckian idea that you inherit acquired traits, which shows how stupid it is. Dressing this dross up in statistical mis-speak gets at a deeper lack of understanding of something that’s not exactly a secret, that characteristics acquired during life aren’t Darwinian.

    • Alex Gamma says:

      People really need to stop thinking about natural selection only in terms of genes, ESPECIALLY in the human case. Cultural evolution is hugely important in humans and it is entirely possible that most of the behaviors and mental traits that matter to us have been shaped by it. The stupid equation of evolution with genes is what you get when people get their education from the selfish gene shit of Dawkins & co.

      Go read some good books:

      Jablonka E, Lamb MJ. Evolution in Four Dimensions. Cambridge: The MIT Press; 2005.
      Richerson PJ, Boyd R. Not By Genes Alone. University of Chicago Press; 2008.
      Oyama S. The ontogeny of information. Cambridge: Cambridge University Press; 1985.

      • Jonathan says:

        Cultural evolution doesn’t rely on natural selection but rather on transmitted acquired characteristics that are shared across populations which change members over time but which cohere from x time to y time thus allowing transmission across members. (See elephant culture disruption by hunting.) Cultural changes can reflect better exploitation of a niche that conveys an advantage over time but it also can reflect worse and can convey a disadvantage over time. That’s a crucial difference: it imposes group behavior that may be good and may be bad, that may confer survival but may cut it short. But in any case, take cultural evolution ideas and apply them: that means they’d be saying the issue is the rich killing daughters pre or post birth. But the rich can afford daughters and daughters actually expand the reach of your influence and fortune through marriage and progeny alliances with other males. The data from China’s 1 child policy says that given 1 child there was abortion of more girls (using ultrasound to identify sex), but that’s a situation where a culture imposes a requirement that forces an ordering to the choice of progeny, either a boy or a girl. Billionaires have no such restriction. What is the cultural evolution restriction imposed that would shape billionaire behavior? I don’t see one. What about for the poor? There is some evidence of abortion to favor sons when resources are scarcer, which is the opposite of what the study proposed. If that is true, then I don’t know the extent to which it may be cultural.

        • Alex Gamma says:

          My comment was a reply to RadfordNeal’s automatic jump from “natural selection” to “genes”, which caused some exasperation. I did not mean to imply there was an explanation of birth rates in billionaires in terms of cultural evolution.

          • Radford Neal says:

            I’m baffled. You’re admitting that cultural evolution is irrelevant to the question at hand, but were exasperated that I didn’t consider it?

            And actually, the relevant question is not what I think of genes vs. culture, but what the people formulating the hypothesis thought.

            I think you and others are basically assuming the hypothesis is false, and on that basis criticizing any test of it.

            • Andrew says:


              The problem’s the math. It’s not about the hypothesis being true or false; it’s that to study it empirically you’d need data on hundreds of thousands of people, and also very strict control of selection bias. N=100 or even N=1000 isn’t close to enough. If the researchers in question wanted to write a purely speculative paper, fine. But once they want to do more, they need a more fully developed theory, some serious data, or some combination of the two. Right now they have neither.

              • Radford Neal says:

                But if the hypothesis were actually true, a huge sample would NOT be needed to verify it, because the hypothesis is plausible only if the effect is large.

              • Andrew says:


                Part of the problem is that there is no precise hypothesis. That is the forking-paths thing that keeps coming up. There are various open-ended evolutionary stories which could manifest themselves in data in many different ways. But whatever the hypothesis is, I disagree completely that it is only plausible if the effect is large. In sex ratios, as in many many other areas of science, there are lots more small effects than large effects out there running around.

              • Radford Neal says:

                You think it’s plausible that natural selection will have developed (and maintained, against random genetic changes) a complex conditional strategy for producing more sons when the parents have high status, when on the rare occasions when it is activated, the effect is to increase the proportion of sons by less than 0.002?

              • Andrew says:


                As various discussants have pointed out, there are many different processes operating here, some of which might cause upper-class people to have more sons and others of which might cause upper-class people to have more daughters, and also lots of correlational factors such as upper-class people having children at earlier or later ages. We would expect any of these factors to have very small effects on the sex ratio because, empirically, the sex ratio has been very stable across populations and over time. You’re talking about “a complex conditional strategy for producing more sons when the parents have high status,” but what the paper in question is studying is a simple difference in proportions. If you want to study a complex conditional strategy etc. you’ll have to do a lot better than that. Again, kangaroo problem. Tons of noise, a signal that’s extremely close to zero, small sample size, and many potential sources of bias. The whole thing’s just impossible without much better data and a much better theoretical framework. What these researchers are doing is, effectively, playing with random numbers and using these random patterns to tell stories. It worked fine for Philip K. Dick when he used the I Ching to help write The Man in the High Castle but it’s not such a good way to do science.

              • Keith O'Rourke says:

                Wondering if these issues are somehow involved

                “Thus, our speculation is that because lawyers [colourful academics] train themselves in instrumental [how to persuade], as opposed to inquisitive analysis[how to find out], the result is ingrained arrogance. There is less willingness to acknowledge the kind of uncertainty needed to make an RCT [critical background/context analysis] relevant.”

              • Martha (Smith) says:

                Radford said: “You think it’s plausible that natural selection will have developed (and maintained, against random genetic changes) a complex conditional strategy for producing more sons when the parents have high status,…”

                Wow, this presents a very different view of “natural selection” than mine. I see natural selection simply as the process that has led to what is the situation at the time the phenomenon is observed. No “strategies”. Just “how this (might have) arisen” — and usually involving a lot of chance.

              • Radford Neal says:

                Martha: Natural selection is quite capable of producing organisms whose genes guide development of a neural/hormonal/whatever system that implements various strategies. Why would you think otherwise? Do you think people and other organisms are incapable of doing anything that involves “strategy” (but then how did the word “strategy” ever get invented…)? Or do you think that evolution/creation/whatever is based on something other than natural selection?

              • Martha (Smith) says:

                Radford said: “Natural selection is quite capable of producing organisms whose genes guide development of a neural/hormonal/whatever system that implements various strategies.”

                I’d say something less strong, such as “Natural selection results in organisms whose genes influence development of various neural/hormonal/whatever systems that implement various mechanisms with some degree of reliability.”

                (Saying “strategy” instead of “mechanism” sounds like anthropomorphizing to me.)

            • Keith O'Rourke says:

              > what the people formulating the hypothesis thought
              Yes, but only if their thought was based on and responsive to experience shared in the scientific community.

              The experience shared in the scientific community that Andrew pointed to did suggest a feather on a bathroom scale is an appropriate metaphor.

      • Brad Stiritz says:

        >Cultural evolution is hugely important in humans..

        Good grief, you talk about cultural evolution and then go on to trash the man who coined the term “meme”?!

        I read quite a bit of “Not By Genes Alone”. IMHO, not remotely in the same league scientifically as Dawkins.

        • Alex Gamma says:

          Statistically speaking, it was certainly very likely that a fan of Dawkins would reply here before I meet anybody who has absorbed Susan Oyama, Paul Griffiths, David Sloan Wilson, Eva Jablonka, R. Boyd & P. Richerson, Mary Jane West-Eberhard, Samir Okasha, Larry Moss, Richard Lewontin, Kevin Laland, etc.

          Anyway, you apparently think that “meme theory” (is there a theory?) makes a substantial contribution to the understanding of cultural evolution. I disagree. I could maybe see a role for it as a mathematical model that describes the dynamics of how certain ideas spread, but without capturing the true underlying causal mechanisms. Something like “obesity spreads by contagion”.

          Our disagreement about the intellectual stature of Dawkins is hardly going to be resolved on this blog. The only thing I can offer is more references.

          • Brad Stiritz says:

            >you apparently think that “meme theory” (is there a theory?) makes a substantial contribution to the understanding of cultural evolution

            Are you aware that the very concept of a “meme” has itself now gone viral, evolving as well into “internet meme”? Google currently has almost 500 million cached references to “meme”, which they define as follows:

            “an element of a culture or system of behavior that may be considered to be passed from one individual to another by nongenetic means, especially imitation.”

            So that’s a one-word neologism from “The Selfish Gene”, with almost half a billion citations. Can you provide any similar examples from your long reading list, with similarly broad influence (say within an order of magnitude on the Google meter?)

            • Alex Gamma says:


              Yes, I know that the word “meme” has become a meme that is hugely popular. I don’t know how that would show that it contributes much to a theory of cultural evolution. So far, I’m not aware of any substantial contribution.

              As for citations and Google meters, truths are unfortunately not determined by majority vote.

              • Brad Stiritz says:

                Google defines “theory” as follows:

                “a supposition or a system of ideas intended to explain something, especially one based on general principles independent of the thing to be explained.”

                I think it should be obvious to any casual reader that Dawkins’ chapter on the meme clearly meets this criterion.

                The citation count is prima facie evidence that Dawkins has contributed something of useful semantic and explanatory value.

    • AL says:

      Thoughts from a geneticist:

      Natural selection, if it ever produces more sons, will act over many generations, I mean 10 or 20, not 2 or 3. Number of sons is desirable for survival, so it is constantly under strong selection pressure. No major genes are expected (if they exist, they are extremely infrequent, so the power is close to 0).

      OTOH, wealthy people historically tended to have fewer sons for social reasons (birth control, age at marriage, etc.) but also because this keeps the fortune together. Poor peasants tended to have many sons because this implies cheap workmen. This is nicely explained in Weber’s “Peasants into Frenchmen”.

  12. Simon Gates says:

    >The stupid equation of evolution with genes is what you get when people get their education from the selfish gene shit of Dawkins & co.

    I’m really not sure what you’re getting at here. Genes have quite a lot to do with evolution, don’t they? Care to elaborate your objection to Dawkins’s ideas?

    • Alex Gamma says:


      Genes are important mechanisms of inheritance, enabling evolution by natural selection. However, and particularly so in humans, genes are not all that is inherited. They are merely one of a number of systems of inheritance that interact to create evolutionary outcomes (read Jablonka & Lamb’s “Evolution in Four Dimensions” and Oyama’s “The Ontogeny of Information” on this).

      Particularly important for humans is cultural evolution, which has some importantly different characteristics than genetic evolution: it acts on a faster timescale, is Lamarckian and does not necessarily produce benefits in terms of reproductive fitness. It can probably explain many social behaviors and psychological traits of modern humans. Cultural evolution may also preferentially work by group selection, i.e. to a stronger degree than non-cultural evolution (read Sober & Wilson’s “Unto Others” on this; needless to say that all forms of Darwinian evolution interact, see gene-culture coevolution etc).

      The notion that genes are selfish, even if not intended literally, has created endless confusion. It has e.g. greatly influenced the debate on the possibility of human altruism and led to views that “selfishness” is some kind of evolutionary biological imperative that would render the existence of altruism a perplexing mystery. This is still a very popular view today.

      Dawkins and the selfish gene idea have also contributed strongly to the unjustified banishment of group selection (“multi-level selection”), which has set back evolutionary research by decades (again see Sober & Wilson’s “Unto Others”).

      By his own admission, the selfish gene idea is not a scientific *theory* but a *perspective* that Dawkins wanted to urge on his audience. It is a perspective that in some cases is very illuminating, but in many others neglects important non-genetic aspects (see e.g. Okasha’s “Evolution and the Levels of Selection” on this).

      Despite his assertions to the contrary, Dawkins’ views do encourage genetic determinism. He says he’s decidedly not a genetic determinist and I believe him, but his constant assertions that traits and phenotypes are all created in the interest of genes and genes alone leave it a mystery how anything else but genes could be a systematic factor in the development of traits.

      My general diagnosis of the current misère in understanding evolution is that the separation of genotype and phenotype, the juxtaposition of genes to everything else (the “environment”), while initially well intended, has gone spectacularly wrong.

      We should know now that genes are only one part of a causal network of interactions that systematically re-occur in every generation, re-constructing the traits of the members of this generation, thus leading to the phenomenon of inheritance.

      We should know that genes do not determine traits and there is no reason to treat them as special in development (yet we’re still frantically focussing on finding genes for everything and talking about genes containing instructions).

      Behavioral geneticists keep on trying to separate genetic from non-genetic influences, although, given the inextricable interaction between them, this is neither possible nor helpful (even if true, a finding that 80% of the variation in intelligence is “genetic” leads to no actionable insight).

      You can see that I feel strongly about this but I hope I’ve been able to give some valuable pointers.

      • Simon Gates says:

        Alex – thanks for the reply. Really interesting. I used to be an evolutionary biologist a long time ago and it’s really interesting to catch up a bit with the field.

      • Brad Stiritz says:

        >We should know that genes do not determine traits

        “people with blue eyes have a single, common ancestor. A team at the University of Copenhagen have tracked down a genetic mutation which took place 6-10,000 years ago and is the cause of the eye colour of all blue-eyed humans alive on the planet today.”

        • Martha (Smith) says:

          If indeed it is true that there is “a genetic mutation which took place 6-10,000 years ago and is the cause of the eye colour of all blue-eyed humans alive on the planet today,” that still wouldn’t say that the gene *determines* the trait of having blue eyes — there might be people who have inherited the mutation but do not have blue eyes. (E.g., I wonder if perhaps people with green eyes also have the mutation?)

          • Brad Stiritz says:

            Hi Martha,

            Google defines “determine” as follows:

            “cause (something) to occur in a particular way; be the decisive factor in”.

            If that genetic mutation is the ultimate cause and decisive factor in having blue eyes, then I would submit that the referenced study is supportive of the claim that genes do in fact determine traits.

            • Brad Stiritz says:

              To be more precise, some genes do in fact determine some traits.

              • Martha (Smith) says:

                Dictionary definitions are often too fuzzy for talking about scientific topics. I can accept that genes influence traits, but I would need strong evidence that there are not also other factors that affect whether or not the trait in fact occurs. Or put another way, I can accept that genes are part of the cause of traits, but I am very skeptical that they deterministically cause traits. Randomness is ubiquitous in life: when a cell divides, the stuff inside is likely not to be evenly divided, and so different daughter cells may function differently.

      • Adam says:


        Would you agree that cultural evolution depends on having a brain designed to absorb, modify, and transmit “culture?” And given that human culture is just one of many possible cultures that could be acquired, the brain must in some way be structured to acquire, modify, transmit human culture specifically? Hopefully you agree that both of these points are uncontroversial.

        The point is that since the brain is a product of evolution by natural selection, in some sense, cultural evolution is not an independent mechanism of inheritance but itself a product of natural selection. It is the result of a brain shaped by natural selection so that it develops to reliably absorb, modify, and transmit certain types of cultural information.

        TL; DR: No natural selection, no brain. No brain, no cultural evolution.

        • Martha (Smith) says:

          Adam said: “a brain shaped by natural selection so that it develops to reliably absorb, modify, and transmit certain types of cultural information.”

          This sounds like a “purposive” view of natural selection/evolution that I can’t buy into. I’d say something like “a brain that has developed by natural selection and that happens to be reasonably capable of absorbing, modifying, and transmitting some types of cultural information.”

          • Adam says:


            No teleology here. Just linguistic shortcuts, commonly used in evolutionary sciences, to ease the exposition.

            • Martha (Smith) says:

              Unfortunately, linguistic shortcuts often lead to misunderstandings — such as “statistical significance”: bad enough as is, but worse when shortened to just “significance”, and even worse when woven into phrases such as “approaching significance.”

        • Alex Gamma says:


          It sounds reasonable to me to say that cultural evolution depends on having a brain, at least on our planet. Whether the brain has been shaped by natural selection to specifically allow for cultural evolution is hard to know. It’s doubtful we’ll ever be able to know what, if any, the selection pressures were that promoted the building of a brain. As said in another post, this is a historical question and evidence is unlikely to ever be conclusive since the relevant processes don’t leave behind traces.

          Also, it is not a given that the whole brain has some single evolutionary function (like facilitating the generation and transmission of culture). It may be that parts of it are evolutionary adaptations and others aren’t. It seems overwhelmingly likely that a complex organ like the brain would be able to do countless things which it was not selected for.

          In any case, I don’t see what you think is proven or disproven if the brain has evolved by natural selection and is in turn a necessary condition for cultural evolution. That doesn’t change anything about the importance of cultural evolution.

          • Adam says:


            OK, great that we agree on those basic points. I only raise them because it seems that what follows doesn’t fit with the suggestion, if I read you right, that cultural evolution is an alternative mechanism of inheritance to natural selection. Since we agree on those basic points, it’s not clear how it can be an alternative, since its very existence depends on natural selection: the neural/psychological mechanisms that enable cultural evolution are either adaptations or by-products of them. Cultural evolution is a psychological phenomenon proximately caused by the brain and distally caused by evolution. Totally agree that cultural evolution is important, but to treat it as an alternative to NS seems to mix up the levels of analysis (proximal vs distal).

            • Alex Gamma says:

              I never said cultural evolution is an alternative “mechanism of inheritance to natural selection”, even if I interpret your phrase charitably. My claim is that cultural evolution may be much more important to our social behaviors and mental traits than genetic or biological evolution.

              Another question is whether cultural evolution works via natural selection or some other form of selection. That is partly a terminological issue. But I think there are reasons why one might want to consider this a different form of selection (it doesn’t necessarily operate via reproductive fitness, it is Lamarckian, it depends on human intention etc.)

  13. Donald says:

    Dawkins’ ideas were cutting edge decades ago, were good in the ’90s, were still mentioned in the new millennium, and are now basically never taught in any in-depth manner. I am currently a graduate student in the life sciences at the University of California, and Dawkins’ notions are more historical than anything.

    • Liqun Luo says:

      Are you sure? Do you specialize in EVOLUTIONARY biology? In fact, Dawkins’ idea of selfish genes is only a popularization of W.D. Hamilton’s inclusive fitness theory. This theory is still supported by the community of evolutionary biology. See e.g. Abbot, P. et al., “Inclusive fitness theory and eusociality,” Nature 471 (2011).

      • Martha (Smith) says:

        I’m not an evolutionary biologist, but for several years participated in an evolutionary biology journal club and served on Ph.D. committees in that area. The consensus (substantiated by evidence) there seemed to be that evolution does occur at the level of the individual gene, but at many other levels as well. In other words, Dawkins’ theory captures some of what happens, but is too simplistic to capture it all.

        • Liqun Luo says:

          Surely R. Dawkins (like E.O. Wilson) is a good scholar. But the real heroes are W.D. Hamilton, Robert Trivers, George Williams, George Price, etc.

          In China, it seems some biologists also support some kind of multi-level selection. But it seems the advance of science needs reduction.

          Several Oxford evolutionary biologists (and Robert Trivers) are ardent advocates of inclusive fitness theory as a modern interpretation of Darwin’s theory. According to them, this viewpoint is also a mainstream viewpoint in evolutionary biology.

          Steven Frank at UCI seems tolerant of multi-level selection. I like his work very much.

  14. Paul says:

    In your standard error calculations, where did the .5 in the numerator come from?

  15. Brad Stiritz says:

    Martha>I am very skeptical that [genes] deterministically cause traits

    So I take it you believe it’s not correct to call Down syndrome, Fragile X syndrome, etc., genetic disorders?

    What are your thoughts on BRCA gene testing?

    • Martha (Smith) says:

      I have no objection to calling Down syndrome, Fragile X syndrome, etc. genetic disorders, since they involve gene variants; I interpret “genetic disorder” as a disorder involving one or more gene variants. But we don’t necessarily know everything about a “genetic disorder” just by knowing the gene variants involved. Genetic expression varies according to a variety of factors. So it’s not a matter of “the gene is everything”; environment, epigenetic factors, random variation in results of cell division, etc. can plausibly also increase or decrease the risk of actually having a disorder, or affect its severity.

      Similarly, I think making BRCA testing available is a good thing — but we need to bear in mind (and make sure that women receiving the testing are aware) that having a BRCA variant associated with breast or ovarian cancer simply means that the woman has a greater than average risk of contracting breast or ovarian cancer; it does not mean that a woman will definitely contract one of those cancers. We don’t really know all the other factors involved and how they might interact.

      In short: In the phrase you quote from me, “skeptical” refers to the “deterministically” part. Yes, genes are involved in many disorders (and most aspects of life) — but not necessarily deterministically. (I’d even hazard a guess that rarely are they involved deterministically, but instead various other factors are typically also involved.)

      • Brad Stiritz says:

        In yeast biology anyway, it doesn’t sound like genetic determinism is rare. We’ll have to see how the proportion shakes out for humans.

        “In yeast, only one in five genes is essential. If any of the approximately 1,200 critical genes are destroyed (out of 6,000), the result is death.

        ..the yeast work has implications for human cells, which share many of the same biological mechanisms. Errors in multiple genes most likely underlie hereditary diseases that cannot be pinpointed to a single causal gene”

        • Martha (Smith) says:

          I don’t see how your quotes give evidence of genetic determinism. (Or perhaps we mean different things by “genetic determinism”?)

          • Brad Stiritz says:

            The way I interpret the Quanta quote is: for 20% of the yeast genome, a single disorder completely determines the outcome (death). So those genes are in fact “everything”. From the life-or-death perspective of the individual yeast cell, it will die if one of those critical genes is missing or defective, period.

            To me, this is genetic determinism with zero ambiguity. But to your question, we may have to go back to the semantics of what it means for something to be “determined”. In my own words, it means we’re making a statement about inferential knowledge: X implies Y.

            In this case, we have roughly Russian-roulette odds. I’ve seen “The Deer Hunter”, so I wouldn’t take that bet :\

  16. Bill Harris says:

    @Martha: You wrote, “Saying “strategy” instead of “mechanism” sounds like anthropomorphizing to me.” Perhaps so, but this news article about Eshel Ben-Jacob’s work (I’ve found more technical articles, but this came up first tonight) sounds intriguing in that direction.

    • Bill Harris says:

      “A Cyber War on Cancer” is a bit more specific about these ideas. I don’t know how well Ben-Jacob’s ideas are accepted–or what that says (or doesn’t say) about their validity or utility.

      • Martha (Smith) says:

        From the second link:

        ““All this sounds like an enterprise which is programed, and not just a collection of accidental, opportunistic cells that decide to leave the tumor and just take their chance that they will arrive or not,” Dr. Ben-Jacob said.”

        This sounds like striking down a straw man. Yes, there are forms of communication that have been developed by natural selection (including calcium and sodium ion channels, which are crucial in many life forms, including human). The hypothesis that there are such communication pathways among cells with common ancestry is plausible, and it is plausible that such communication pathways may have developed by natural selection — but “just a collection of accidental, opportunistic cells that decide to leave the tumor and just take their chance that they will arrive or not” is not a description of natural selection.

    • Martha (Smith) says:

      Ben-Jacob wrote: “I proposed there could be pieces of the genome that move and exchange between bacteria giving chemical messages. Thus, not all changes are due to random mutation at replication.”

      I don’t subscribe to the idea that “all changes are due to random mutation at replication,” but that is because there are other random events that can influence changes, e.g., random events in the environment and randomness in the result of cell divisions.

  17. Liqun Luo says:

    I agree with you about the statistical issues. In fact, I have learned a lot from this post.

    Theoretically, the Trivers-Willard theory holds well in a population in a biological sense. For humans, it is sometimes difficult to discern such a population. If a whole national population is roughly regarded as such a population, probably a very large sample is needed to test the theory.
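
    To put a rough number on “a very large sample,” here is a minimal sketch in Python. It assumes independent births and a share of boys near 0.5, so the standard error of the observed share is sqrt(p(1-p)/n), which is about 0.5/sqrt(n). The effect size of 0.3 percentage points is a hypothetical value chosen for illustration, not an estimate from the paper or the post, and the function names are made up for this example.

```python
import math

def se_proportion(n, p=0.5):
    """Standard error of a sample proportion from n independent births."""
    return math.sqrt(p * (1 - p) / n)

def n_for_se(target_se, p=0.5):
    """Smallest n whose standard error is at most target_se."""
    return math.ceil(p * (1 - p) / target_se ** 2)

# Hypothetical effect: a 0.3 percentage point shift in the share of boys.
effect = 0.003

# For the effect to sit about 2 standard errors from zero,
# the standard error can be at most half the effect.
n_needed = n_for_se(effect / 2)

print(f"SE with 400 births:   {se_proportion(400):.4f}")
print(f"Births needed for SE <= {effect / 2}: {n_needed}")
```

    Under these assumptions, a few hundred births give a standard error of a couple of percentage points, many times the hypothetical effect, while bringing the standard error down to half the effect takes on the order of a hundred thousand births.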

Leave a Reply