I took the above headline from a news article in the (London) Independent by Jeremy Laurance reporting a study by Jan-Emmanuel De Neve, James Fowler, and Bruno Frey that reportedly just appeared in the Journal of Human Genetics.

One of the pleasures of blogging is that I can go beyond the usual journalistic approaches to such a story: (a) puffing it, (b) debunking it, (c) reporting it completely flatly. Even convex combinations of (a), (b), (c) do not allow what I’d like to do, which is to explore the claims and follow wherever my exploration takes me. (And one of the pleasures of building my own audience is that I don’t need to endlessly explain background detail as was needed on a general-public site such as 538.)

OK, back to the genetic secret of a happy life. Or, in the words of the authors of the study, a gene that “explains less than one percent of the variation in life satisfaction.”

**“The genetic secret” or “less than one percent of the variation”?**

Perhaps the secret of a happy life is in that one percent??

I can’t find a link to the journal article, which appears, based on the listing on De Neve’s webpage, to be single-authored, but I did find this Googledocs link to a technical report from January 2010 that seems to have all the content. Regular readers of this blog will be familiar with earlier interesting research of Fowler and Frey working separately; I had no idea that they had been collaborating.

De Neve et al. took responses to a question on life satisfaction from a survey that was linked to genetic samples. They looked at a gene called 5HTT which, according to their literature review, has been believed to be associated with happy feelings.

I haven’t taken a biology class since 9th grade, so I’ll give a simplified version of the genetics. You can have either 0, 1, or 2 alleles of the gene in question. Of the people in the sample, 20% have 0 alleles, 45% have 1 allele, and 35% have 2. The more alleles you have, the happier you’ll be (on average): The percentage of respondents describing themselves as “very satisfied” with their lives is 37% for people with 0 alleles, 38% for those with one allele, and 41% for those with two alleles.

The key comparison here comes from the two extremes: 2 alleles vs. 0. People with 2 alleles are 4 percentage points (more precisely, 3.6 percentage points) more likely to report themselves as very satisfied with their lives. The standard error of this difference in proportions is sqrt(.41*(1-.41)/862+.37*(1-.37)/509) = 0.027, so the difference is not statistically significant at a conventional level.
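For anyone who wants to check that standard error, here is a minimal sketch of the calculation in Python. The function name is my own invention, and the inputs are the rounded proportions and group sizes quoted above, so treat the result as approximate.

```python
import math

def diff_in_proportions(p1, n1, p2, n2):
    """Return (difference, standard error, z-score) for two independent proportions."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff / se

# 2-allele group: 41% very satisfied, n = 862; 0-allele group: 37%, n = 509
diff, se, z = diff_in_proportions(0.41, 862, 0.37, 509)
print(f"difference = {diff:.3f}, se = {se:.3f}, z = {z:.2f}")
```

The z-score comes out below the conventional cutoff of about 2, which is all that "not statistically significant" means here.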

But in their abstract, De Neve et al. reported the following:

Having one or two alleles . . . raises the average likelihood of being very satisfied with one’s life by 8.5% and 17.3%, respectively.

How did they get from a non-significant difference of 4% (I can’t bring myself to write “3.6%” given my aversion to fractional percentage points) to a statistically significant 17.3%?

**A few numbers that I can’t figure out at all!**

Here’s the summary from Stephen Adams, medical correspondent of the Daily Telegraph:

The researchers found that 69 per cent of people who had two copies of the gene said they were either satisfied (34) or very satisfied (35) with their life as a whole.

But among those who had no copy of the gene, the proportion who gave either of these answers was only 38 per cent (19 per cent ‘very satisfied’ and 19 per cent ‘satisfied’).

This leaves me even more confused! According to the table on page 21 of the De Neve et al. article, 46% of people who had two copies of the gene described themselves as satisfied and 41% described themselves as very satisfied. The corresponding percentages for those with no copies were 44% and 37%.

I suppose the most likely explanation is that Stephen Adams just made a mistake, but it’s no ordinary confusion because his numbers are so specific. Then again, I could just be missing something big here. I’ll email Fowler for clarification but I’ll post this for now so you loyal blog readers can see error correction (of one sort or another) in real time.

**Where did the 17% come from?**

OK, so setting Stephen Adams aside, how can we get from a non-significant 4% to a significant 17%?

– My first try is to use the numerical life-satisfaction measure. Average satisfaction on a 1-5 scale is 4.09 for the 0-allele people in this sample and 4.25 for the 2-allele people, and the difference has a standard error of 0.05. Hey–a difference of 0.16 with a standard error of 0.05–that’s statistically significant! So it doesn’t seem just like a fluctuation in the data.

– The main analysis of De Neve et al., reported in their Table 1, appears to be a least-squares regression of well-being (on that 1-5 scale), using the number of alleles as a predictor and also throwing in some controls for ethnicity, sex, age, and some other variables. They include error terms for individuals and families but don’t seem to report the relative sizes of the errors. In any case, the controls don’t seem to do much. Their basic result (Model 1, not controlling for variables such as marital status which might be considered as intermediate outcomes of the gene) yields a coefficient estimate of 0.06.

They then write, “we summarize the results for 5HTT by simulating first differences from the coefficient covariance matrix of Model 1. Holding all else constant and changing the 5HTT gene of all subjects from zero to one long allele would increase the reporting of being very satisfied with one’s life in this population by about 8.5%.” Huh? I completely don’t understand this. It looks to me that the analyses in Table 1 are regressions on the 1-5 scale. So how can they transfer these to claims about “the reporting of being very satisfied”? Also, if it’s just least squares, why do they need to work with the covariance matrix? Why can’t they just look at the coefficient itself?

– They report (in Table 5) that whites have higher life satisfaction responses than blacks but lower numbers of alleles, on average. So controlling for ethnicity should increase the coefficient. I still can’t see it going all the way from 4% to 17%. But maybe this is just a poverty of my intuition.

– OK, I’m still confused and have no idea where the 17% could be coming from. All I can think of is that the difference between 0 alleles and 2 alleles corresponds to an average difference of 0.16 in happiness on that 1-5 scale. And 0.16 is practically 17%, so maybe when you control for things the number jumps around a bit. Perhaps the result of their “first difference” calculations was somehow to carry that 0.16 or 0.17 and attribute it to the “very satisfied” category?

**1% of variance explained**

One more thing . . . that 1% quote. Remember? “the 5HTT gene explains less than one percent of the variation in life satisfaction.” This is from page 14 of the De Neve, Fowler, and Frey article. 1%? How can we understand this?

Let’s do a quick variance calculation:

– Mean and sd of life satisfaction responses (on the 1-5 scale) among people with 0 alleles: 4.09 and 0.8

– Mean and sd of life satisfaction responses (on the 1-5 scale) among people with 2 alleles: 4.25 and 0.8

– The difference is 0.16 so the explained variance is (0.16/2)^2 = 0.08^2

– Finally, R-squared is explained variance divided by total variance: (0.08/0.8)^2 = 0.01.
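The same arithmetic as a short Python sketch. The helper `explained_variance` is a name I made up; it treats the prediction as a two-point distribution, and it assumes (as the quick calculation does) that half the people sit at each extreme and that the total sd is about 0.8.

```python
def explained_variance(diff, share=0.5):
    """Variance of a two-point prediction whose values differ by `diff`,
    with a fraction `share` of people at the higher value."""
    return share * (1 - share) * diff ** 2

diff = 4.25 - 4.09   # 2 alleles vs. 0 alleles, on the 1-5 scale
sd_total = 0.8       # sd of the responses, taken as the total sd

r_squared = explained_variance(diff) / sd_total ** 2
naive = (diff / sd_total) ** 2   # what you get if you forget to halve the difference

print(round(r_squared, 3), round(naive, 3))
```

With equal shares, `share * (1 - share) * diff ** 2` is just `(diff / 2) ** 2`, which is where the division by 2 comes from: each group sits half the difference away from the overall mean.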

A difference of 0.16 on a 1-5 scale ain’t nothing (it’s approximately the same as the average difference in life satisfaction, comparing whites and blacks), especially given that most people are in the 4 and 5 categories. But it only represents 1% of the variance in the data. It’s hard for me to hold these two facts in my head at the same time. The quick answer is that the denominator of the R-squared–the 0.8–contains lots of individual variation, including variation in the survey response. Still, 1% is such a small number. No surprise it didn’t make it into the newspaper headline . . .

Here’s another story of R-squared = 1%. Consider a 0/1 outcome with about half the people in each category. For example, half the people with some disease die in a year and half live. Now suppose there’s a treatment that increases survival rate from 50% to 60%. The unexplained sd is 0.5 and the explained sd is 0.05, hence R-squared is, again, 0.01.
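A rough check of the survival example in Python. I am assuming, which the story does not say explicitly, that half the patients get the treatment and half do not; the variable names are mine.

```python
p_control, p_treated = 0.50, 0.60   # one-year survival rates from the example

# ASSUMPTION: half the patients are treated, half are not
p_overall = (p_control + p_treated) / 2
var_total = p_overall * (1 - p_overall)             # variance of the 0/1 outcome
var_explained = ((p_treated - p_control) / 2) ** 2  # each group is 0.05 from the mean

r_squared_exact = var_explained / var_total
r_squared_quick = (0.05 / 0.5) ** 2                 # the back-of-envelope version

print(round(r_squared_exact, 4), round(r_squared_quick, 4))
```

The exact and quick versions agree to about two decimal places: a ten-point jump in survival, which nobody would call trivial, still explains only about 1% of the variance.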

**Summary (for now):**

I don’t know where the 17% came from. I’ll email James Fowler and see what he says. I’m also wondering about that Daily Telegraph article but it’s usually not so easy to reach newspaper journalists so I’ll let that one go for now.

**P.S.** According to his website, Fowler was named the most original thinker of the year by The McLaughlin Group. On the other hand, our sister blog won an award by the same organization that honored Peggy Noonan. So I’d call that a tie!

**P.P.S.** Their data come from the National Survey of Adolescent Health, which for some reason is officially called “Add Health.” Shouldn’t that be “Ad Health” or maybe “Ado Health”? I’m confused where the extra “d” is coming from.

**P.P.P.S.** De Neve et al. note that the survey did not actually ask about happiness, only about life satisfaction. We all know people who appear satisfied with their lives but don’t seem so happy, but the presumption is that, in general, things associated with more life satisfaction are also associated with happiness. The authors also remark upon the limitations of using a sample of adolescents to study life satisfaction. Not their fault–as is appropriate, they use the data they have and then discuss the limitations of their analysis.

**P.P.P.P.S.** De Neve and Fowler have a related paper with a nice direct title, “The MAOA Gene Predicts Credit Card Debt.” This one, also from Add Health, reports: “Having one or both MAOA alleles of the low efficiency type raises the average likelihood of having credit card debt by 14%.” For some reason I was having difficulty downloading the pdf file (sorry, I have a Windows machine!) so I don’t know how to interpret the 14%. I don’t know if they’ve looked at credit card debt and life satisfaction together. Being in debt seems unsatisfying; on the other hand you could go in debt to buy things that give you satisfaction, so it’s not clear to me what to expect here.

**P.P.P.P.P.S.** I’m glad Don Rubin didn’t read the above-linked article. Footnote 9 would probably make him barf.

**P.P.P.P.P.P.S.** Just to be clear: The above is not intended to be a “debunking” of the research of De Neve, Fowler, and Frey. It’s certainly plausible that this gene could be linked to reported life satisfaction (maybe, for example, it influences the way that people respond to survey questions). I’m just trying to figure out what’s going on, and, as a statistician, it’s natural for me to start with the numbers.

**P.^7S.** James Fowler explains some of the confusion in a long comment.

Anonymous:

I started by calling it 0.16 but then my R-squared was 4% not 1% so I realized I'd been doing something wrong. I thought about it and then realized that the correct sd for the numerator was 0.08.

Here's the quick approximate calculation. You start with a predicted value of 4.17 for everyone. Then add the genetic predictor: now the prediction is 4.09 or 4.25. This new distribution has sd 0.08, that is, half of the difference which was 0.16.

There's also the middle group (the people with one allele), but to first approximation we can ignore them because they have essentially no influence on the linear coefficient.

Whoa, that footnote 9 paper has been cited 22k times according to Google scholar!

http://www.snpedia.com/index.php/Rs25531

has a collection of other papers related to this variation.

It does seem to have some cognitive effect, but no one has quite pinned down what that is. Thanks for digging a little deeper.

http://personal.lse.ac.uk/deneve/Rplot.jpg has a plot with the 17% difference.

Sorry for being dense, but why do you divide by 2 in the explained variance calculation (0.16/2)^2?

Dean:

Yeah, I saw that plot too–but where did the numbers come from? I'm also unclear about whether he's using "likelihood" as a synonym for "probability" or whether there is some transformation going on.

It sounds like they modeled the outcome as binary (very satisfied vs. not very satisfied). If they did that, perhaps the 17% comes from the odds ratio, though there is no mention of such a model in the google docs file (only the OLS from what i can see). Working backwards using the divide by 4 rule, exp(.04*4)=1.17. I've seen people interpret the odds ratio as a "percent increase in [some outcome]" before, and without referring to the odds scale.

The "The MAOA Gene Predicts Credit Card Debt" paper has the same type of "likelihood" graph with the same y-axis label. In that paper, it seems the 14% increase is a relative probability of debt for the highest genotype (43.5%) vs. the lowest genotype (38.2%) based on the same type of simulation model. The difference in probability from the simulation corresponds to what they would have gotten by transforming the odds ratio (1.24) to a risk difference. Perhaps they did this in the happiness paper, as well.

Agreed. It doesn't really explain much, but I assume it is part of the embargoed paper…

Depending on the details, this case could be a nice addition to your small effects talk.

Also, Baron & Kenny's effect on the social science literature is an important part of the problems with the social psych literature on processes (my comments on this: http://bit.ly/iOqAIR). When will it stop?

Dan:

I was thinking this too, but in the article they make a point of using least squares rather than probit, and they don't talk about logit at all.

Dear Andrew,

Thanks for this fascinating set of questions. Let me address as many as I can.

First, one source of confusion is that there are two papers. The one that has received the media attention is a solo-authored piece by Jan-Emmanuel de Neve which is due to be published online at JHG by the end of the week. You can either get the paper from Jan or wait until it is posted by the journal.

The other paper coauthored by Jan, Bruno Frey, Nicholas Christakis, and myself can be found at SSRN (see http://papers.ssrn.com/sol3/papers.cfm?abstract_i…

Adding to the confusion, the Googledocs version you found is old, and some of the analysis has changed since then. I'll just focus on the most recent version of the paper in my answers.

In the SSRN-posted paper, we report the results from several OLS regressions that show an extra allele is associated with a 0.06 to 0.08 bump (0.02 to 0.3 S.E.) in the expected well-being score, which ranges from 1 (very dissatisfied) to 5 (very satisfied). We calculated the expected happiness score from these models for each individual by multiplying their observed values times the estimated coefficients and adding them up (and you are right that simulating from the covariance matrix produces identical results to just reading the regression table, but this is an easy way to incorporate additional uncertainty arising from correlation in the coefficient estimates). We then binned people by rounding to the nearest whole number. We can compare the number of people who get binned in each category when we assume everyone has 0 alleles to the case when we assume everyone has 1 or 2 alleles. The results in Figure 2 summarize this procedure: the number of people in the "very satisfied" bin increases by 9% with 1 allele and by 17% with 2 alleles. To be clear, this is measured as 100*(N|2 alleles)/(N|0 alleles) - 100. In other words, a modified risk ratio.

There are several other ways we could report this. We could just say a 0.06 point bump in a 5-point scale. We could also report the percentage increase in the "very satisfied" group as a fraction of all individuals rather than as a fraction of those we predicted to be in the "very satisfied" group. In that case, the number would be 100*((N|2 alleles)-(N|0 alleles))/N.
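Here is a small Python sketch of the binning procedure as I read the description above. Everything about the data is an assumption: the baseline predictions are synthetic draws (mean 4.1, sd 0.25), not Add Health predictions, and the function names are invented for illustration. Only the 0.06 per-allele coefficient is taken from the regression discussed in the post.

```python
import random

def very_satisfied_count(baseline_scores, alleles, coef=0.06):
    """Count predicted scores that round up to 5 (score >= 4.5)
    when everyone is assigned `alleles` long alleles."""
    return sum(1 for s in baseline_scores if s + coef * alleles >= 4.5)

def modified_risk_ratio(n_new, n_base):
    """Percent change relative to the baseline bin: 100*N_new/N_base - 100."""
    return 100 * n_new / n_base - 100

def share_change(n_new, n_base, n_total):
    """Change in the bin, in percentage points of all individuals."""
    return 100 * (n_new - n_base) / n_total

random.seed(0)
# SYNTHETIC predicted scores with everyone set to 0 alleles (hypothetical values)
baseline = [random.gauss(4.1, 0.25) for _ in range(2000)]

n0, n1, n2 = (very_satisfied_count(baseline, k) for k in (0, 1, 2))
for k, n in ((1, n1), (2, n2)):
    print(k, n,
          round(modified_risk_ratio(n, n0), 1),
          round(share_change(n, n0, len(baseline)), 1))
```

The point of the sketch is only that the two summaries can look very different: when the "very satisfied" bin is small to begin with, a small shift in predicted scores can produce a large relative change in the bin while moving only a few percent of all individuals.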

It's not always clear what the best way to report a result is. We were hoping to choose the one that was most intuitive to both a scientific and lay audience, but since we confused at least one extraordinarily-capable (and award-winning!) scientist, we may need to rethink this strategy….

Your point about the 4% difference in the raw data being insignificant is correct if we select on the dependent variable and only look at cases where people are "very satisfied" — but instead we assume there is information in the other categories of the DV about a relationship we assume is linear across all categories. We poked at this linearity assumption a bit, for example testing the thresholds estimated in an ordered logit against an assumption of linearity, and we did not find any obvious contradiction to this assumption, but that obviously is also a function of the power we have to detect such effects which is determined by the size of the sample.

At this point, Jan has communicated with The Telegraph about their report on the study, and I understand that they have already corrected the confusing numbers there (which had nothing to do with the associations reported in the study), so hopefully they now line up with those in his solo-authored piece.

I'm very glad you emphasize the 1% of variance explained…. I ALWAYS emphasize this with reporters, that any complex social trait is likely to be influenced by hundreds of genes, so it's not right to say we have found "the gene" for happiness. But it gets lost. So repeating it is good.

james

P.S. I think your sister blog wins the battle of the dubious honorifics!

P.P.S. You are not the only one confused by "Add Health" — this is the self-determined name of the study (see http://www.cpc.unc.edu/projects/addhealth) but researchers use any number of variants (AddHealth, AdHealth, Add-Health, etc.), which makes it hard when doing article searches! I assume they wanted to communicate the idea that the point of the study was to "add" to our health.

P.P.P.S. Bruno Frey consistently makes the point that life satisfaction is a better measure of happiness than others for the purpose of measuring individual utility (If you want to see Frey's rationale, look here: http://www.bsfrey.ch/articles/407_04.pdf). In contrast, affect (a psychological measure of happiness) is a more immediate and less cognitive subjective evaluation. I have published articles using both measures (our BMJ happiness paper uses an affect measure from the CES-D), and they are highly correlated.

P.P.P.P.S. The credit card debt paper is here: http://papers.ssrn.com/sol3/papers.cfm?abstract_i… The key sentence from that paper is this: "Holding all other variables constant at their mean and varying the MAOA genotype of all subjects from high to low would increase the reporting of credit card debt in this population from 38.2% (95%CI: 35.2%–41.1%) to 43.5% (95%CI: 40.5%–46.5%). This implies that the marginal effect of having one or both MAOA alleles of the low efficiency type raises the average likelihood of having credit card debt by about 14%." So again, the 14% is from a risk ratio, but we also give the numbers there so folks can think of it in other ways (e.g. it would increase the incidence of self-reported credit card debt by 5%). And no, we have not looked at credit card debt and happiness in the same model.
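To see how the same two percentages give a 14%, a 5%, or something else depending on the scale, here is a minimal Python sketch. The function name is hypothetical; the inputs are the 38.2% and 43.5% quoted above.

```python
def risk_summaries(p_low, p_high):
    """Summarize a change in probability on three common scales."""
    def odds(p):
        return p / (1 - p)
    return {
        "risk_difference_points": 100 * (p_high - p_low),  # percentage points
        "risk_ratio_percent": 100 * (p_high / p_low - 1),  # relative increase
        "odds_ratio": odds(p_high) / odds(p_low),
    }

summary = risk_summaries(0.382, 0.435)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The risk ratio gives roughly the 14% figure and the risk difference gives the roughly 5-point figure. The odds ratio is a third number again, which is why it matters to say which scale a "percent increase" refers to.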

P.P.P.P.P.S. It's true that mediation analyses are observational, and doubly-problematic because they are sensitive to specification, measurement, and other errors at two stages rather than just one. But I do not accept responsibility for Don Rubin's hypothetical gastrointestinal problems! The perfect should not be the enemy of the good.

P.P.P.P.P.P.S. I'm *happy* for the detailed feedback!

P.P.S. Their data come from the National Survey of Adolescent Health, which for some reason is officially called "Add Health." Shouldn't that be "Ad Health" or maybe "Ado Health"? I'm confused where the extra "d" is coming from.

I don't have first-hand knowledge (you can ask one of your Columbia colleagues in Soc whether it's true), but the lore goes that in the planning phases of the survey a central goal was to collect data on adolescent sexual behavior (including sexual networks and STDs), but there were fears that if this were the core of the study its funding would become a political issue and get sunk by unfriendly legislators. The solution: "add health". That is, turn it into a general, comprehensive study of adolescent health which included the controversial stuff as one component of many. And this is what happened. (Add Health did collect the sexual/network data; see e.g. Bearman, Moody & Stovel's well-known AJS paper and many more besides.)