I was just reading an old post and came across this example which I’d like to share with you again:

Here’s a story of R-squared = 1%. Consider a 0/1 outcome with about half the people in each category. For example, half the people with some disease die in a year and half live. Now suppose there’s a treatment that increases survival rate from 50% to 60%. The unexplained sd is 0.5 and the explained sd is 0.05, hence R-squared is 0.01.
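
For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes that half the people get the treatment, which the example does not state but which is what gives an explained sd of 0.05. The function name and its arguments are made up for illustration. The explained sd is the sd of the predicted survival probability (0.5 or 0.6, so 0.05 either side of the mean 0.55). The unexplained sd is the sd of the 0/1 outcome around that prediction, sqrt(p(1-p)), which is close to 0.5 in both groups.

```python
import math


def r_squared_binary(p_control, p_treated, frac_treated=0.5):
    """Explained sd, unexplained sd, and R-squared for a 0/1 outcome.

    frac_treated=0.5 is an assumption; the example does not state it.
    """
    # Overall mean of the predicted probability across both groups.
    mean = frac_treated * p_treated + (1 - frac_treated) * p_control

    # Explained variance: variance of the predicted probabilities.
    explained_var = (
        frac_treated * (p_treated - mean) ** 2
        + (1 - frac_treated) * (p_control - mean) ** 2
    )

    # Unexplained variance: average Bernoulli variance p(1-p) within groups.
    unexplained_var = (
        frac_treated * p_treated * (1 - p_treated)
        + (1 - frac_treated) * p_control * (1 - p_control)
    )

    # R-squared is explained variance over total variance.
    r2 = explained_var / (explained_var + unexplained_var)
    return math.sqrt(explained_var), math.sqrt(unexplained_var), r2


explained_sd, unexplained_sd, r2 = r_squared_binary(0.5, 0.6)
print(f"explained sd   = {explained_sd:.3f}")    # 0.050
print(f"unexplained sd = {unexplained_sd:.3f}")  # 0.495
print(f"R-squared      = {r2:.4f}")              # 0.0101
```

Rounding the unexplained sd to 0.5 gives (0.05/0.5)^2 = 0.01, the figure quoted above.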

Some classic elaborations (of a point that, 30-some years later, still evades the comprehension of far too many econometricians…):

Abelson, R.P. A Variance Explanation Paradox: When a Little is a Lot. Psychological Bulletin 97, 129-133 (1985).

Rosenthal, R. & Rubin, D.B. A Note on Percent Variance Explained as A Measure of the Importance of Effects. J. Applied Social Psychol. 9, 395-396 (1979).

Both among my favourite papers. The Abelson paper also links nicely to the Dickens and Flynn model of the Flynn effect (another of Andrew’s recurring topics, ISTR). Rosenthal later uses the Salk vaccine trial as a real-world example that is even more extreme (R^2 less than 0.1% and an effect of huge practical importance).

Speaking of recurring topics . . . these references supply further evidence of a longstanding principle of statistics.

Two marketing researchers walk into a bar and start talking about their respective days. The first marketing researcher analyzes customer satisfaction data. She complains that although she is finding significant effects for individual variables, her R-squared values are too small and she does not have confidence in her results. “I’m just missing some important predictors and cannot explain customer satisfaction.” The second marketing researcher works in direct marketing. She couldn’t care less about R-squared. “I’m looking for anything that will improve my response rate. I’m not concerned about all that extraneous stuff that determines response rate. I can’t control or affect it. It averages out in the long run.” Neither is quite sure why the other does not understand the point that she is making.

Sorry, how do you calculate the unexplained and explained sd?