Jas sends along this paper (with Devin Caughey), entitled Regression-Discontinuity Designs and Popular Elections: Implications of Pro-Incumbent Bias in Close U.S. House Races, and writes:
The paper shows that regression discontinuity does not work for US House elections. Close House elections are anything but random. It isn’t election recounts or something like that (we collect recount data to show that it isn’t). We have collected much new data to try to hunt down what is going on (e.g., campaign finance data, CQ pre-election forecasts, corrections to many errors in the Lee dataset). The substantive implications are interesting. We also have a section that compares in detail the Gelman and King versus the Lee estimand and estimator.
I had a few comments:
David Lee is not estimating the effect of incumbency; he’s estimating the effect of the incumbent party, which is a completely different thing. The regression discontinuity design is completely inappropriate for estimating the effect of incumbency.
See here for a discussion (from 5 years ago) of David Lee’s work. I guess I should’ve published this, but Lee’s idea seemed so evidently inappropriate (at least for the problem of estimating incumbency advantage in the U.S.) that it didn’t seem worth devoting effort to it. Leigh Linden did convince me that the estimate makes sense for India, though, where it really does seem to make more sense to think of incumbency as a property of a political party rather than of an individual legislator.
Gary King and I distinguished between incumbency effect and incumbent party effect in our 1990 AJPS paper, where we explicitly laid out the causal inference. I also discussed this example in the linear regression chapter in Bayesian Data Analysis.
Finally, the most sophisticated analysis of incumbency advantage and its variation that I know of is my recent JASA paper with Huang.
In short, we’re estimating the effect of incumbency, Lee is estimating the effect of incumbent party. You can see this through a thought experiment in which all congressmembers are term-limited to serve only two years. There would then never be any incumbents running for reelection (thus, no incumbency effect) but there would be an incumbent party effect. And, correspondingly, the Gelman and Huang (or Gelman and King) estimate of incumbency advantage would be undefined, but the Lee estimator of incumbent party effect would work just fine.
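The term-limits thought experiment can be sketched as a small simulation. This is only an illustration under made-up assumptions: the vote-share model, the 5-point `party_effect`, the bandwidth `h`, and the function names are all hypothetical and are not taken from Lee's or Caughey and Sekhon's data or code. The point is just that a discontinuity estimate at the 50% threshold is still well defined when no incumbent ever runs, while the incumbency indicator has no variation to learn from.

```python
import numpy as np

def simulate_term_limited(n=20000, party_effect=0.05, noise_sd=0.03, seed=0):
    """Simulate two elections per district when every winner is term-limited.

    All numbers are hypothetical. `party_effect` is the assumed bump in
    election-2 vote share for the party that won election 1.
    """
    rng = np.random.default_rng(seed)
    v1 = rng.uniform(0.3, 0.7, n)            # Democratic vote share, election 1
    dem_won = (v1 > 0.5).astype(float)       # incumbent party going into election 2
    incumbent_running = np.zeros(n)          # term limits: no incumbent on any ballot
    v2 = 0.25 + 0.5 * v1 + party_effect * dem_won + rng.normal(0.0, noise_sd, n)
    return v1, v2, incumbent_running

def rd_jump(v1, v2, cutoff=0.5, h=0.05):
    """Local linear fit on each side of the cutoff; return the gap at the cutoff."""
    left = (v1 >= cutoff - h) & (v1 < cutoff)
    right = (v1 > cutoff) & (v1 <= cutoff + h)
    # np.polyfit returns [slope, intercept] for degree 1; x is centered at the cutoff,
    # so each intercept is the fitted value exactly at the threshold.
    left_fit = np.polyfit(v1[left] - cutoff, v2[left], 1)
    right_fit = np.polyfit(v1[right] - cutoff, v2[right], 1)
    return right_fit[1] - left_fit[1]

def has_variation(x):
    """A regression coefficient on x is only identified if x actually varies."""
    return bool(x.max() > x.min())

v1, v2, incumbent_running = simulate_term_limited()
print("estimated incumbent-party jump:", round(rd_jump(v1, v2), 3))
print("incumbency indicator varies:", has_variation(incumbent_running))
```

In this toy setup the discontinuity estimate recovers something close to the assumed party effect, while any regression that includes the incumbency indicator has nothing to estimate, which is the distinction between the two estimands.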
And then I had a few comments on the Caughey and Sekhon paper in particular:
1. Perhaps you should make your title and abstract more general, since really the key contribution of your paper is not about incumbency (since, as you point out, the different estimates happen to be pretty similar for the U.S. Congress) but rather more generally about natural experiments.
2. The implication I get from the beginning of the paper is that, if only the RD assumptions were valid, Lee’s estimate would be just fine. Later on, you clarify that there are tradeoffs: basically, Lee is attempting to trade off validity for reliability. (I say that he’s trading off validity because I don’t think anyone would really consider the incumbent-party effect to be an incumbency effect.) So it’s clear by the end. But I didn’t think the point was so clear at the beginning. And a connection to the concepts of reliability and validity might be useful. That’s a general issue in causal inference: do you want a biased, assumption-laden estimate of the actual quantity of interest, or a crisp randomized estimate of something that’s vaguely related that you happen to have an experiment (or natural experiment) on?
3. You’re not gonna be surprised to hear that I think Table 2 should be a graph. Also, I really really don’t recommend fitting fourth-order polynomials. This is a weird thing that economists do; I don’t know why they do it unless it’s from some sort of strange tradition. High-order polynomials blow up at the extremes. Splines would be better. I suppose I should write a paper about this. Maybe you’d be interested in collaborating on such an effort?
4. I think your conclusion is great, but I don’t get the point about “mass produced” studies. We only need one at a time, right? Mass production seems like a lot to ask in any case.
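To make the concern in point 3 concrete, here is a rough simulation of how a polynomial fit behaves at the edge of the data, which is exactly where a discontinuity estimate is read off. Everything in it is an assumption for illustration: the linear true curve, the noise level, and the function name `edge_and_center_sd` are hypothetical, and it does not fit splines, only polynomials of different orders. It refits the curve to many noisy datasets and records how much the fitted value wobbles at the boundary (x = 1) compared with the middle (x = 0).

```python
import numpy as np

def edge_and_center_sd(degree, n=200, n_sims=500, noise_sd=1.0, seed=0):
    """Spread of the fitted value at the edge and at the center of the data.

    Fits a polynomial of the given degree to repeated noisy draws from a
    hypothetical linear curve and returns (sd at x = 1, sd at x = 0).
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n)    # fixed design; x = 1 plays the role of the cutoff
    truth = 0.5 * x                  # assumed smooth true curve
    edge, center = [], []
    for _ in range(n_sims):
        y = truth + rng.normal(0.0, noise_sd, n)
        coefs = np.polyfit(x, y, degree)
        edge.append(np.polyval(coefs, 1.0))
        center.append(np.polyval(coefs, 0.0))
    return float(np.std(edge)), float(np.std(center))

for degree in (1, 2, 4):
    edge_sd, center_sd = edge_and_center_sd(degree)
    print(f"degree {degree}: sd at edge = {edge_sd:.3f}, sd at center = {center_sd:.3f}")
```

In this setup the fourth-order fit is noticeably noisier at the boundary than in the middle, and noisier at the boundary than the straight-line fit, even though the true curve is linear and the extra terms buy nothing.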
This is good stuff. Caughey and Sekhon cover some ground in the incumbency problem, do some innovative data analysis, and connect to important larger questions regarding reliability and validity in causal inference. In particular, they address the temptation to choose simple designs or simple models in order to obtain clean identification, which might end up identifying a parameter that doesn’t quite line up with the original causal question of interest.