A reporter contacted me to ask my thoughts on the recent Nobel prize in economics. I didn’t know that this had happened so I googled *nobel prize economics* and found the heading, “David Card, Joshua Angrist and Guido Imbens Win Nobel in Economics.” Hey—I know two of these people!
Fortunately for you, our blog readers, I’d written something a few years ago on the topic of a Nobel prize in economics for causal inference, so I can excerpt from it here.
Causal inference is central to social science and especially economics. The effect of an intervention on an individual i (which could be a person, a firm, a school, a country, or whatever entity is being affected by the treatment) is defined as the difference in the outcome yi, comparing what would have happened under the intervention to what would have happened under the control. If these potential outcomes are labeled as yiT and yiC, then the causal effect for that individual is yiT − yiC. But for any given individual i, we can never observe both potential outcomes yiT and yiC, so the causal effect is impossible to measure directly. This is commonly referred to as the fundamental problem of causal inference, and it is at the core of modern economics.
Resolutions to the fundamental problem of causal inference are called “identification strategies”; examples include linear regression, nonparametric regression, propensity score matching, instrumental variables, regression discontinuity, and difference in differences. Each of these has spawned a large literature in statistics, econometrics, and applied fields, and all are framed in response to the problem that it is not possible to observe both potential outcomes on a single individual.
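The fundamental problem, and the way randomization sidesteps it, can be sketched in a small simulation. This is my own illustrative code, not anything from the economics literature; the variable names and the effect size of 2.0 are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical potential outcomes for each unit i: y_iC (control) and y_iT (treated).
y_C = rng.normal(0.0, 1.0, n)
y_T = y_C + 2.0 + rng.normal(0.0, 0.5, n)  # unit-level effects vary around 2.0

# The fundamental problem: each unit reveals only ONE of its two potential outcomes.
z = rng.integers(0, 2, n)            # randomized treatment assignment
y_obs = np.where(z == 1, y_T, y_C)   # the counterfactual stays hidden

# Randomization nonetheless identifies the *average* effect,
# even though no unit-level effect y_iT - y_iC is ever observed.
ate_true = (y_T - y_C).mean()
ate_hat = y_obs[z == 1].mean() - y_obs[z == 0].mean()
print(ate_true, ate_hat)  # both approximately 2.0
```

The difference in observed means recovers the average treatment effect because random assignment makes the treated and control groups comparable, which is exactly the logic that the identification strategies above try to reproduce in nonexperimental data.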
From this perspective, what is amazing is that this entire framework of potential outcomes and counterfactuals for causal inference is all relatively recent.
Here is a summary of the history by Guido Imbens:
My [Imbens’s] understanding of the history is as follows. The potential outcome framework became popular in the econometrics literature on causality around 1990. See Heckman (1990, American Economic Review, Papers and Proceedings, “Varieties of Selection Bias,” 313–318) and Manski (1990, American Economic Review, Papers and Proceedings, “Nonparametric Bounds on Treatment Effects,” 319–323). Both those papers read very differently from the classic paper in the econometric literature on program evaluation and causality, published five years earlier (Heckman and Robb, 1985, “Alternative Methods for Evaluating the Impact of Interventions,” in Heckman and Singer (eds.), Longitudinal Analysis of Labor Market Data), which did not use the potential outcome framework. When the potential outcome framework became popular, there was little credit given to Rubin’s work, but there were also no references to Neyman (1923), Roy (1951), or Quandt (1958) in the Heckman and Manski papers. It appears that at the time the notational shift was not viewed as sufficiently important to attribute to anyone.
Haavelmo is certainly thinking of potential outcomes in his 1943 paper, and I [Imbens] view Haavelmo’s paper (and a related paper by Tinbergen) as the closest to a precursor of the Rubin Causal Model in economics. However, Haavelmo’s notation did not catch on, and soon econometricians wrote their models in terms of realized, not potential, outcomes, not returning to the explicit potential outcome notation until 1990.
The causality literature is actually one where there is a lot of cross-discipline referencing, and in fact a lot of cross-discipline collaborations between statisticians, econometricians, political scientists and computer scientists.
The potential-outcome or counterfactual-based model of causal inference has led to conceptual, methodological, and applied breakthroughs in core areas of econometrics and economics.
The key conceptual advances come from the idea of a unit-level treatment effect, yiT − yiC, which, although it is unobservable, can be aggregated in various ways. So, instead of the treatment effect being thought of as a parameter (“β” in a regression model), it is an average of individual effects. From one direction, this leads to the “local average treatment effect” of Angrist and Imbens, the principal stratification idea of Frangakis and Rubin, and various other average treatment effects considered in the causal inference literature. Looked at another way, the fractalization of treatment effects allows one to determine what exactly can be identified from any study. A randomized experiment can estimate the average treatment effects among the individuals under study; if those individuals are themselves a random sample, then the average causal effect in the population is also identifiable. With an observational study, one can robustly estimate a local average treatment effect in the region of overlap between treatment and control groups, but inferences for averages outside this zone will be highly sensitive to model specification. The overarching theme here is that the counterfactual expression of causal estimands is inherently nonparametric and unbounds causal inference from the traditional regression modeling framework. The counterfactual approach thus fits in very well with modern agent-based foundations of micro- and macro-economics which are based on individual behavior.
On the applied side, I think it’s fair to say that economics has moved in the past forty years to a much greater concern with causality, and much greater rigor in causal measurement, with the key buzzword being “identification.” Traditionally, in statistics, identification comes from the likelihood, that is, from the parametric statistical model. The counterfactual model of causal effects has shifted this: with causality defined nonparametrically in terms of latent data, there is a separation between (a) definition of the estimand, and (b) the properties of the estimator—a separation that has been fruitful both in the definition of causal summaries such as various conditional average treatment effects, and in the range of applications of these ideas. Organizations such as MIT’s Poverty Action Lab and Yale’s Innovations for Poverty Action have revolutionized development economics using randomized field experiments, and similar methods have spread within political science. I’m sure that a Nobel Prize will soon be coming to Duflo, Banerjee, or some others in the subfield of experimental economic development. [This was written in 2017, a couple years before that happened. — AG] Within micro-economics, identification strategies have been used not just for media-friendly “cute-o-nomics” but also in areas such as education research and the evaluation of labor and trade policies where randomized experiments are either impossible or impractical to do at scale.
Joshua Angrist and Guido Imbens are two economists who have worked on methods and applications for causal inference. In an influential 1994 paper in Econometrica, they introduced the concept of the local average treatment effect, which is central to any nonparametric understanding of causal inference. This idea generalizes the work of Rubin in the 1970s in defining what quantities are identifiable from any given design. Imbens has also done important work on instrumental variables, regression discontinuity, matching, propensity scores, and other statistical methods for causal identification. Angrist is an influential labor economist who has developed and applied modern methods for causal inference to estimate treatment effects in areas of education, employment, and human capital. The work of Angrist and Imbens is complementary in that Imbens has developed generally applicable methods, and Angrist and his collaborators have solved real problems in economics.
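The local average treatment effect idea can be illustrated with the Wald/instrumental-variables estimator under noncompliance. This is a simplified simulation of my own devising (the compliance rates and effect sizes are invented for illustration), not a reproduction of the Angrist-Imbens analysis:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical setup: random "encouragement" instrument z, actual treatment d.
# "Compliers" take the treatment only when encouraged; some non-compliers are
# "always-takers"; there are no defiers (the monotonicity assumption).
complier = rng.random(n) < 0.6
always = ~complier & (rng.random(n) < 0.5)
z = rng.integers(0, 2, n)
d = np.where(complier, z, always.astype(int))

# Heterogeneous effects: 1.0 for compliers, 0.3 for everyone else.
tau = np.where(complier, 1.0, 0.3)
y = rng.normal(0.0, 1.0, n) + tau * d

# Wald / IV estimator: ratio of the instrument's effect on y
# to the instrument's effect on d.
itt_y = y[z == 1].mean() - y[z == 0].mean()
itt_d = d[z == 1].mean() - d[z == 0].mean()
late_hat = itt_y / itt_d
print(late_hat)  # approximately 1.0, the complier-average effect
```

Note that the estimate converges to the effect among compliers (1.0 here), not the population average: this is the sense in which the identified quantity is "local," determined by who the instrument actually moves into treatment.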
Previous Nobel prizes that are closely related to this work include Trygve Haavelmo, Daniel McFadden, and James Heckman.
Haavelmo’s work on simultaneous equations can be seen as a bridge between macroeconomic models of interacting variables, and the problems of causal identification. As noted in the Imbens quote above, Haavelmo’s work can be seen to have anticipated the later ideas of potential outcomes as a general framework for causal inference. The ideas underlying McFadden’s work on choice models have been important in modern microeconomics and political science and are related to causal inference in that all these models depend crucially on unmeasurable latent individual variables. Unobservable continuous utility parameters can be thought of as a form of potential outcome. I do not know of any direct ways in which these research streams influenced each other; rather, my point is that these different ideas, coming from different academic perspectives and motivated by different applied problems, have a commonality, which is the explicit modeling of aggregate outcomes and measures such as choices or local average treatment effects, in terms of underlying distributions of latent variables. This approach now seems so natural that it can be hard to realize what a conceptual advance it is, as compared to direct modeling of observed data.
Heckman’s work on selection models is important very broadly in microeconomics, as it is the nature of economic decision making that choices do not, in general, look like random assignments. Indeed it can be said that all of microeconomics resides in this gap between statics and dynamics. Heckman’s selection model differs from the most popular modern approaches to identification in that it relies on a statistical model rather than any sort of natural experiment, but it falls within the larger category of econometric methods for removing or reducing the biases that arise from taking naive comparisons or regressions and considering these as causal inferences.
You can take all the above not as an authoritative discussion of the history of econometrics but rather as a statistician’s perspective on how these ideas fit together. See this 2013 post and discussion thread for some further discussion of the history of causal inference in statistics and econometrics.
P.S. More here.
I guess I’d say it all started with Sewall Wright, who seems to be completely forgotten from all these histories.
Pearl traces things back to Sewall Wright and Haavelmo, and Wright in particular is singled out in The Book of Why. His comments on the award:
https://twitter.com/yudapearl/status/1447596404101160964
George Davey Smith also points to Sewall Wright in this recorded talk https://www.youtube.com/watch?v=Ai5Vf74xVmQ&t=6s
I’m curious what you, Andrew, and/or other commenters think: was Rubin slighted? True, he’s not an economist, but haven’t mathematicians won the econ prize before? Why not a statistician?
Besides popularizing the RCM, he was Imbens’s co-author on their textbook (not to mention a bunch of papers), and (I guess this is my statistician bias) I think of Angrist, Imbens, and Rubin (1996) as really central to a lot of this stuff.
Adam:
The ideas and work of Rubin (along with others) were honored, and I think that’s the most important thing. What ultimately matters is the work, not the names and personalities. If it were up to me, they’d give the award to research projects, not to individual people.
> was Rubin slighted?
Yes. Isn’t that undeniable? There is no single person more responsible for the rise of the potential outcomes approach than Rubin. Everything traces back to Rubin (1974) and related work. Imbens’ most cited paper is co-authored with Angrist and Rubin. That same paper is Angrist’s most cited research. That paper is:
Angrist, J. D., Imbens, G. W., and Rubin, D. B. Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91 (434), 444–455.
Given that Rubin was slighted, the more interesting question is Why?
There’s a maximum of three people that can win in a year. Isn’t it a general criticism of the Nobel that this often results in people slighted?
https://en.wikipedia.org/wiki/Nobel_Memorial_Prize_in_Economic_Sciences
“As with the Nobel Prizes, no more than three people can share the prize for a given year; they must still be living at the time of the Prize announcement in October; and information about Prize nominations cannot be disclosed publicly for 50 years.[23]”
D:
I agree that Rubin’s work on causal inference is important and influential. I also don’t think it’s any great mystery that a committee of economists giving a prize for economics will choose three economists to be the recipients.
Please change the blog formatting back! This font is so hard to read!
Agreed — I prefer the previous style / color scheme as well, but at least change the font to something legible!
I played around a bit with this and made it more readable:
https://addons.mozilla.org/en-US/firefox/addon/custom-style-script/
URL:
https://statmodeling.stat.columbia.edu/*
CSS:
div#page {
max-width: 100%;
}
.singular .entry-content {
width: 100%;
}
body, input, textarea {
font-weight: 500;
line-height: 1.1;
}
.commentlist .avatar {
width: 0px;
height: 0px;
}
.commentlist > li.comment {
width: 900px;
}
.wp-block-latest-comments__comment-excerpt p {
line-height: 1;
}
.wp-block-latest-comments__comment-meta, .wp-block-latest-comments__comment-date {
line-height: 1;
}
Then used this add-on for the dark theme:
https://addons.mozilla.org/en-US/firefox/addon/darkreader/
It is much better:
https://i.ibb.co/cw3qGYd/comments.png
https://i.ibb.co/YyCVq9v/homescreen.png
I couldn’t figure out how to align the comments to the left without breaking something else yet. That is the only big issue remaining, imo.
Another important thing is Andrew should bring back the recent comments to the article pages. It appears they are only on the homepage now.
Also, make the recent-comments list much longer. I used it all the time, with 50 entries or more.
I figured out how to align the comments to the left, also uncentered the article titles and decreased the title font size a bit. Then got rid of the avatars in the recent comment feed. Adding this CSS results in a readable site on my PC. I have no idea on mobile:
https://pastebin.com/dvbwDYGp
Oooh this is fun. Thanks!
Yes! And is there a reason why the text only occupies the middle 20% of my screen? It’s like the bad old days of 2004 all over again.
Basically, A/B testing (NHST) has led to web designers optimizing for people scrolling to the bottom of the page, staying on the page the longest, and similar easy-to-measure metrics. This is why you get low-contrast text arranged in relatively short lines.
Anyway, you can change the CSS of any page you visit often pretty easily to suit your own preferences. Then everyone gets what they deem is best for themselves instead of a one-size-fits-all solution based on weird metrics.
Thanks for the add-on recommendation and CSS code, Anon. It’s sooooo much better now!
Yes, the blog is another victim of modern web aesthetics with illegible poor-contrast fonts; e.g., see “How the Web Became Unreadable”: https://www.wired.com/2016/10/how-the-web-became-unreadable/
On my phone, the horrible font is the primary problem. On my computer, half the screen is also wasted by huge areas of white (and blue) space on the left and right.
The font doesn’t seem too bad to me, but the bright blue stuff outside the margins is hard on the eyes.
I also prefer the comments to be right below the post.
+1. Or at least to some design that doesn’t only use a fourth of my screen.
I asked my friends/neighbors, who are ages 32, 39, 27, 40, and 62, whether they can read the letters. All said the letters are faint. The whitespace is really white also.
The issue is that once you get 4+ deep in the comments the page is 70%+ void of information and there are only a few words per line of text. This actively prevents effective communication, and as demonstrated above can be fixed by changing a dozen or so lines of CSS (probably less since I haven’t messed with CSS for years either). I can’t imagine anyone actually wants to read a comment with 3 words per line, etc.
We can’t fix the short list of recent comments or add the recent comment sidebar to the article pages (instead of only the homepage) this way though. Andrew needs to have that done in the html.
Anon:
I don’t know how to do this myself, but our sysadmin can do it. I’m trying to collect the changes and do them all at once rather than asking him to keep going in and doing one little thing after another.
Makes sense. Thanks!
I was getting scared for a minute.
I have mixed feelings about the Nobel prize, especially the Nobel prize in Economics. I agree with Andrew that ultimately scientific awards should go to projects rather than individual people. There are additional problems inherent in these awards, such as giving prestige to people who have an effect on policy (Hayek noted this in his acceptance speech), giving credit for work without necessarily crediting the people whose work it builds on, and treating science as an individual rather than a team effort.
Nevertheless, the Nobel prize does seem to capture the public imagination. It’s a way to publicize ideas and methods such as natural experiments and difference in differences. Partly, I believe many people can relate more to an idea if they can associate a person with it, which isn’t really how science works but it can be a vehicle by which to popularize science and scientific ideas.
Again I still have pretty mixed feelings. Working in statistics and policy evaluation, I have a pretty solid understanding of LATE, which may be a reason I feel particularly ambivalent about this year’s award.
Ambivalent:
My feelings are similar to yours; see for example here, here, here, and here.