Yesterday I spoke at the Princeton economics department. The title of my talk was:
“Unbiasedness”: You keep using that word. I do not think it means what you think it means.
The talk went all right—people seemed ok with what I was saying—but I didn’t see a lot of audience involvement. It was a bit like pushing at an open door. Everything I said sounded so reasonable, it didn’t seem clear where the interesting content was.
The talk went like this: I discussed various examples where people were using classical unbiased estimates. There was one example with a silly regression discontinuity analysis controlling for a cubic polynomial, where it's least squares so it's unbiased but the model makes no sense. And another example with a simple comparison in a randomized experiment, where selection bias (the statistical significance filter and various play-the-winner decisions) pushes the estimate higher, so that the reported estimate is biased even though the particular statistical procedure being reported is nominally unbiased.
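To make that selection-bias point concrete, here's a toy simulation. The numbers (a true effect of 0.1, a standard error of 0.5) are invented for illustration, not taken from any of the studies I discussed: each individual estimate is unbiased, but once you condition on statistical significance, the reported estimates are way too high.

```python
# Toy simulation (invented numbers): each individual estimate is unbiased,
# but the subset that passes the significance filter is biased upward.
import numpy as np

rng = np.random.default_rng(0)
true_effect = 0.1   # small true effect, made up for illustration
se = 0.5            # standard error of each study's estimate
n_studies = 100_000

estimates = rng.normal(true_effect, se, n_studies)  # unbiased by construction
significant = estimates > 2 * se                    # the significance filter

print(round(estimates.mean(), 3))               # close to the true 0.1
print(round(estimates[significant].mean(), 3))  # far above 0.1
```

No individual study did anything "biased" here; the bias comes entirely from which estimates get reported.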
My point was: Here are these methods that respected researchers (including economists) use, that get published in top journals, but which are clearly wrong, in the sense of giving estimates and uncertainty statements that we don’t believe.
Why are people using such methods, in one case using a clearly inappropriate model and in the other case avoiding clearly appropriate adjustments?
I think part of the problem is a prioritizing of “unbiasedness” and a misunderstanding of what this really means in the practical world of data analysis and publication. The idea is that unbiased estimates are seen as pure, and that it’s ok to use an analysis that’s evidently flawed, if it does not “bias” the estimate. So, in a regression discontinuity setting, it’s considered ok to control for a high-degree polynomial because this fits into the general idea that, if you control for unnecessary predictors in a regression, you’re fine: it adds no bias and all that might happen is that your standard error gets bigger. Now, ok, nobody wants a big standard error, but remember that the usual goal in applied work is “statistical significance.” So . . . as long as your estimate is more than 2 standard errors away from 0, you’re cool. The price you paid in terms of variance was, apparently, not too high.
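Here's a sketch of that variance price, with made-up numbers rather than anyone's actual data: a simulated discontinuity design where the true relation is linear, comparing the jump estimate when controlling for a linear term versus a degree-8 global polynomial in the running variable. Both estimators are unbiased; the high-degree fit just buys you a noticeably noisier estimate for nothing.

```python
# Toy regression-discontinuity simulation (invented numbers): the jump
# estimator is unbiased under both a linear and a degree-8 polynomial
# control, but the high-degree fit pays a real price in variance.
import numpy as np

rng = np.random.default_rng(1)
n, sims = 200, 2000
jump = 0.2  # true discontinuity at the cutoff, made up for this sketch

def estimate_jump(x, y, degree):
    # OLS of y on a treatment indicator plus a global polynomial in x
    T = (x >= 0).astype(float)
    X = np.column_stack([T] + [x**k for k in range(degree + 1)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]  # coefficient on the treatment indicator

results = {1: [], 8: []}
for _ in range(sims):
    x = rng.uniform(-1, 1, n)                            # running variable
    y = jump * (x >= 0) + 0.5 * x + rng.normal(0, 1, n)  # truly linear
    for deg in (1, 8):
        results[deg].append(estimate_jump(x, y, deg))

for deg in (1, 8):
    est = np.asarray(results[deg])
    # both means land near the true 0.2; the degree-8 sd is clearly larger
    print(deg, round(est.mean(), 3), round(est.std(), 3))
```

And once that inflated standard error meets the significance filter, the "harmless" extra variance stops being harmless.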
In my talk, I then continued by briefly describing our Xbox analysis, as an example of how we can succeed by adjusting. Instead of clinging to a nominally unbiased procedure, we can reduce actual bias by modeling.
As I said above, the people in the audience (mostly economists and political scientists) pretty much agreed with everything I said, except that they disagreed with my claim that “minimizing bias is the traditional first goal of econometrics.”
Or, maybe they accepted that this was a traditional first goal but they said that econometrics has moved on. In particular, I was informed that econometricians are much more interested in interval estimation than point estimation, and their typical first goal now is coverage. In fact, I was told that I could pretty much keep my talk as it was and just replace “unbiasedness” with “coverage” everywhere and it would still work. Thus, various conventional approaches for obtaining 95% intervals are believed to be ok because they have 95% coverage. But, because of selection and omitted variables, these intervals don’t have that nominal coverage. And that’s good to know.
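The coverage version of the argument runs the same way. In a toy simulation with invented numbers (a small true effect of 0.1, standard error 0.5), the conventional 95% intervals have their advertised coverage unconditionally, but among the intervals that pass the significance filter, coverage collapses:

```python
# Toy simulation (invented numbers): nominal 95% coverage holds over all
# studies but fails badly among the statistically significant ones.
import numpy as np

rng = np.random.default_rng(2)
true_effect, se, n = 0.1, 0.5, 200_000  # made up for illustration

est = rng.normal(true_effect, se, n)
lo, hi = est - 1.96 * se, est + 1.96 * se
covers = (lo <= true_effect) & (true_effect <= hi)
significant = np.abs(est) > 1.96 * se   # two-sided significance filter

print(round(covers.mean(), 3))               # ~0.95 overall, as advertised
print(round(covers[significant].mean(), 3))  # far below 0.95 for the
                                             # results that get reported
```

So, replacing "unbiasedness" with "coverage" really does leave the talk intact: the procedure's nominal property is fine; it's the selection on what gets reported that breaks it.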
The other point that was made to me after the talk was that, yes, some of the work I criticized was by respected economists—but this work was not published in econ journals. One of the papers was published in the tabloid PPNAS, for example. And these Princeton people assured me that had the work I’d criticized been presented in their seminar, they would’ve seen the problems—the omitted variable bias in one example and the selection bias in the other.
The point I made that still holds, I think, is the critique of what is commonly viewed as inferential conservatism. I feel that a central stream in econometric thinking is to play it safe, to favor unbiasedness and robustness over efficiency. And my central point is that the choices that look like "playing it safe" (for example, using least squares with no shrinkage, or taking simple comparisons with no adjustments) are, in practice, only used when the resulting estimates are more than 2 standard errors away from 0—and this selection sets us up for lots of problems.
So, I agree that it’s misleading to think of unbiasedness as the first goal in modern econometrics, but it remains my impression that there’s a misguided tendency in econometrics to downplay methods that increase statistical efficiency.