James Heckman recently posted this article, which is based on a paper from 1980. (This sort of thing happens; for example, I just published an article based on work from 1986.) Heckman’s tongue-in-cheek article begins:
This paper uses data available from the National Opinion Research Center’s (NORC) survey on religious attitudes and powerful statistical methods to evaluate the effect of prayer on the attitude of God toward human beings.
He sets up a model for the intensity of prayer, given its effectiveness. The key assumption is as follows:
Accept on faith that the conditional density of x [the intensity of prayer in the population] given y [God’s attitude arrayed on a scale ranging from 0 to 1] is of the form g(x|y) = a(y) exp(xy).
That is, the higher y is, the more prayer we’d see, which makes sense. (Heckman labels the function a(y) as “unknown,” but, unless I’m missing something, a(y) is a normalizing constant that can be calculated in closed form by integrating exp(xy) over x. Perhaps this mistake, if it is one, can be caught before the article appears in press.)
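To make the parenthetical concrete, here is a sketch of that closed form. It assumes, hypothetically, that x ranges over a bounded interval [0, M]; the quoted passage does not state the support of x, and M is my notation, not Heckman's.

```latex
\[
a(y) \;=\; \left( \int_0^M e^{xy}\,dx \right)^{-1}
      \;=\; \frac{y}{e^{My} - 1} \quad (y > 0),
\qquad
a(0) \;=\; \frac{1}{M}.
\]
```

If x were instead allowed to range over all of [0, ∞), the integral would diverge for every y ≥ 0, so some restriction on x is needed for a(y) to exist at all.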
Given the reasonable enough model above, Heckman points out that you can differentiate the density of x and learn something about the distribution of y, the effectiveness of prayer.
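One way to reconstruct that step, in my own notation rather than Heckman's: write H for the (unknown) distribution of y and f for the marginal density of x, and assume that differentiation under the integral sign is allowed.

```latex
\begin{align*}
f(x) &= \int_0^1 a(y)\, e^{xy}\, dH(y), \\
f'(x) &= \int_0^1 y\, a(y)\, e^{xy}\, dH(y), \\
E(y \mid x) &= \frac{\int_0^1 y\, a(y)\, e^{xy}\, dH(y)}{f(x)}
            = \frac{f'(x)}{f(x)}
            = \frac{d}{dx} \log f(x).
\end{align*}
```

So the conditional mean of y depends only on the density of x, which can be estimated from the survey; neither a(y) nor H has to be known.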
What does it all mean?
Of course Heckman is joking, but it appears he might be making a more serious point when he comments:
Provided conditional density (1) is assumed, we do not need to observe a variable in order to compute its conditional expectation with respect to another variable whose density can be estimated. For example, one can extend current empirical work in a variety of areas of economics to estimate the effect of income on happiness or the effect of income inequality on democracy.
I don’t think this is literally an issue. True, all four of the variables Heckman mentions—income, happiness, income inequality, and democracy—can only be measured with error, but certainly they can be (and are) measured when they are studied empirically.
But I got a little worried that maybe there’s something more going on here, some reason I should be giving a little less credence to studies linking economics to psychology and political science. Is Heckman implying that those cross-disciplinary studies have, at bottom, no more foundation than his argument on the effectiveness of prayer?
So I went back to Heckman’s article to try to find the flaw in the reasoning. (By “flaw,” I don’t mean that Heckman was making a mistake; rather, I’m speaking of the hidden logical flaw that makes the reasoning flow, just as in those mathematical arguments where you “prove” 1=0 by means of a series of algebraic expressions that include a division-by-zero.)
Rereading carefully, I found the flaw. I actually think this article would be a good one for a take-home exam in a theoretical statistics class. I’ll give the answer below.
The flaw in the reasoning is that the probability algebra implicitly assumes that x and y are two random variables defined over a common population. In Heckman’s argument, the distribution of x represents variation across people (as measured, in this case, from survey data). The distribution of y must then, correspondingly, be the distribution of God’s attitude as perceived by the population, which is less interesting than a measure of “God’s [true] attitude” would be. It’s a subtler (and funnier) version of the correlation/causation distinction: population variation should not be interpreted causally. Again, I don’t think this has much relevance to studies of economics and happiness and democracy (since these outcomes can be measured directly), but it’s fun to go through the argument and see where the rules are changed in midstream.
P.S. I’m sure some people will get on my case because I’m picking apart a joke. But Heckman’s an interesting thinker, and it’s not a bad idea to take his jokes seriously.
P.P.S. If you do take the model seriously, another amusing point is that the estimate of E(y|x) is negative for some values of x. But y is restricted to fall between 0 and 1; this indicates that the model does not fit the data! (See this article—figure 7 in particular—for similar reasoning in another setting.)
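The P.P.S. point can be checked numerically. The sketch below is my own illustration, not Heckman's computation. It assumes that E(y|x) under the model equals the derivative of log f(x), where f is the marginal density of x. It also assumes, hypothetically, that x lives on [0, M] with M = 5, that y takes 11 equally weighted values in [0, 1], and that the "observed" density of x decays like exp(-x/2). None of these choices comes from the article.

```python
import numpy as np

def normalizer(y, M):
    """a(y) = 1 / integral_0^M exp(x*y) dx, using the limit 1/M at y = 0.

    The bounded support [0, M] is an assumption made for this sketch.
    """
    y = np.asarray(y, dtype=float)
    safe = np.where(y == 0.0, 1.0, y)  # avoid 0/0 in the unused branch
    return np.where(y == 0.0, 1.0 / M, safe / np.expm1(M * safe))

def marginal_and_posterior_mean(x, y_grid, weights, M):
    """Return f(x) and E(y|x) under g(x|y) = a(y) * exp(x*y)."""
    # joint[i, j] = weights[j] * a(y_j) * exp(x_i * y_j)
    joint = weights * normalizer(y_grid, M) * np.exp(np.outer(x, y_grid))
    f = joint.sum(axis=1)
    post_mean = (joint * y_grid).sum(axis=1) / f
    return f, post_mean

M = 5.0                                   # hypothetical upper bound for x
x = np.linspace(0.0, M, 2001)
y_grid = np.linspace(0.0, 1.0, 11)        # hypothetical values of y
weights = np.full(y_grid.size, 1.0 / y_grid.size)

# Inside the model: the direct posterior mean and d/dx log f(x) should agree.
f, post_mean = marginal_and_posterior_mean(x, y_grid, weights, M)
score = np.gradient(np.log(f), x)         # numerical derivative of log f

# A hypothetical observed density that decreases in x (unnormalized is fine,
# because a constant factor drops out of the derivative of the log).
f_obs = np.exp(-0.5 * x)
score_obs = np.gradient(np.log(f_obs), x)

print("model:    E(y|x) ranges over", post_mean.min(), "to", post_mean.max())
print("observed: implied E(y|x) ranges over", score_obs.min(), "to", score_obs.max())
```

Any density generated by the model gives an implied E(y|x) inside [0, 1], so the log density of x can never slope downward. A density of x that decreases anywhere implies a negative E(y|x), which is the sign of misfit described above.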
Compare Francis Galton in 1872:
http://www.galton.org/essays/1870-1879/galton-187…
Andrew – thanks for a very interesting post! I think Heckman's main point is a different one though.
The way I see it, what he says is simple: If you *assume* g(x|y) and you have data on the distribution of x, you can work out f(y|x). The problem is that you are of course none the wiser for it as your inference is purely based on an assumption! (in fact, you might as well have assumed f(y|x) to start with and save yourself the hassle) His mention of 'powerful techniques' etc etc serves to highlight that lots of statistical results out there are the result of nothing more than a far-fetched and ridiculously specific assumption disguised by the use of some 'powerful' technique. You can always 'assume' a function out of thin air, and completely drive the results, and the use of data in the whole enterprise only serves to give an air of unearned respectability to the results. (hence his joke in the end: 'we conjecture that this powerful method can be extended to the more general case when X is not observed either.')
The funny thing here is that this very criticism applies to many of the applications of Heckman's own two-step selection model. In many cases, the model is identified purely through the functional form assumptions (i.e. the fact a probit is used to estimate the selection process and a linear model for the second stage). Whatever results are obtained in this way (i.e. without further information distinguishing selection and second-stage decisions) rest entirely on the assumed functional form – and there's no good reason why these functional forms would be 'correct'.
Which brings me to your PPS: I don't think there's necessarily anything terribly wrong with fitting a function that ranges beyond [0,1] for something that we *know* is constrained to lie in [0,1]. This is a frequent critique of the linear probability model, with people urging that a probit or logit needs to be used instead, but of course in many (most?) applications we are interested in a good approximation around the range of interest, and the extremes (i.e. values of x that result in predicted probabilities higher than 1 or lower than 0) are less interesting or important in fitting the model. A linear approximation may well work better than a probit/logit one around the range of interest (or at the very least, there's no reason to expect it would not work as well as a probit/logit), even if we know that for extreme values the LPM cannot possibly be giving correct predictions.
I put this comment together in a hurry so I'm not sure if it makes sense; if it doesn't, let me know and we can continue the discussion.
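The linear-probability point in the comment above can be illustrated with a minimal sketch. The logistic curve stands in as a hypothetical "true" probability and [-1, 1] as a hypothetical "range of interest"; both are assumptions chosen for illustration and do not come from the comment.

```python
import numpy as np

def logistic(x):
    """Hypothetical 'true' probability curve used only for this illustration."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical range of interest for x.
x_mid = np.linspace(-1.0, 1.0, 201)
p_mid = logistic(x_mid)

# Least-squares straight line fitted to the curve on that range.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(x_mid, p_mid, 1)

# How far the line strays from the curve inside the range...
max_err_mid = np.max(np.abs(intercept + slope * x_mid - p_mid))
# ...and what it predicts well outside it.
pred_far = intercept + slope * 4.0

print(f"max abs error on [-1, 1]: {max_err_mid:.4f}")
print(f"linear prediction at x = 4: {pred_far:.3f} (true value {logistic(4.0):.3f})")
```

In this setup the line stays close to the curve inside the range but predicts a probability above 1 at x = 4, which is the trade-off the comment describes.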
I don't see why we need to spend so much time figuring out what God thinks. If you ride the A train long enough, you'll be bound to find out from some deep-voiced preacher for whom error is not a working concept.
Scientific investigation of prayer has always struck me as a sort of fool's errand. The Wikipedia entry on Efficacy of Prayer calls Galton's essay a satire, which is more or less what I concluded after a page or two. At first I thought he had fallen victim to a classical epidemiological fallacy in using age at death as an endpoint (see "Longevity of jazz musicians: flawed analysis" by Kenneth Rothman 1992; AJPH 82(5):761), but he was so brilliant that it might have been deliberate.
Wikipedia cites several clinical trials of efficacy of prayer in medical settings. Typically, patients are randomized to get or not get intercessory prayers, with improved health as an endpoint. Most such trials have yielded null results. It has always struck me that the principal design flaw in such trials was failure to account for the possibility that one or more of the patient's heirs, impatient for a share of the estate, might actually be praying to hasten the inevitable along, thereby nullifying or counterbalancing the pro-patient prayers.
Prayer is older than alchemy but equivalent in most respects.
People in ancient times were afraid and didn't know how the world works. They had no scientific method and their economic growth was basically 0% for millennia with few significant changes to static institutions or social hierarchy. Inertia is a very powerful societal force.
It's time to move on.