Comments on: Difference-in-difference estimators are a special case of lagged regression

By: Z

Mon, 06 May 2019 21:23:40 +0000

By: Corey

Corey — Mon, 06 May 2019 20:40:22 +0000

In reply to Anonymous. So... you're saying they aren't the same in general. Anyway, causal identification (which, if you get it right, lets you figure out what an intervention will do) doesn't just depend on the equations but also on conditional independence assumptions (or equivalently, the causal graph backing the equations).

By: Daniel Lakeland

Daniel Lakeland — Mon, 06 May 2019 20:00:13 +0000

In reply to AnonymousCommentator.

I think the real problem here is one of cultural confusion. Bayesian models will look like:

Outcome[i] = Fb(EarlierOutcome, Covariates, Parameters) + Error[i]

The quantity of interest will be the posterior distribution over Parameters.

Whereas typical econometric “unbiased estimator” methods will want

Outcome[i] – EarlierOutcome[i] = Fe(TreatmentIndicator, Covariates, Parameters) + Error[i]

and the quantity of interest is the “unbiased point estimate” of the Parameters, usually a linear coefficient of the treatment indicator.

If you restrict the Bayesian model to use EarlierOutcome in a strictly *linear* way, restrict the EarlierOutcome coefficient to be 1, restrict the usage of Covariates, and eliminate various structural equation assumptions in Fb, etc., then you can convert the first model into the second form.

In this sense, the second form is a special case of the first one.
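Spelled out, the restriction looks roughly like this. It is only a sketch, and it assumes the treatment indicator is one of the inputs to Fb (the first model above does not list it separately):

```latex
\begin{align*}
\text{Outcome}_i
  &= F_b(\text{EarlierOutcome}_i,\ \text{Covariates}_i,\ \text{Parameters}) + \text{Error}_i \\
% Restriction: EarlierOutcome enters linearly with coefficient 1
F_b
  &= 1 \cdot \text{EarlierOutcome}_i
     + F_e(\text{TreatmentIndicator}_i,\ \text{Covariates}_i,\ \text{Parameters}) \\
% Substitute and move EarlierOutcome to the left-hand side
\text{Outcome}_i - \text{EarlierOutcome}_i
  &= F_e(\text{TreatmentIndicator}_i,\ \text{Covariates}_i,\ \text{Parameters}) + \text{Error}_i
\end{align*}
```

The last line is the second form, so every model of the second form is also a model of the first form with the EarlierOutcome coefficient pinned at 1.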

My impression is that the attraction of the second form is that with appropriate assumptions you can maybe get unbiased estimates of the TreatmentIndicator coefficient without having to make all the structural mechanistic assumptions that go into the Fb Bayesian model.

I personally don’t find that to be a convincing argument. It’s like saying that if you randomly tweak certain screws under the hood of your car you can get the fastest lap time without even knowing what a fuel injector is or whether the car is gasoline, diesel, or electric. Maybe so, but I doubt it in practice, and besides, the main thing I want to know is exactly what all the knobs do.

By: AnonymousCommentator

AnonymousCommentator — Mon, 06 May 2019 19:33:16 +0000

Andrew — You may want to change the title of your post because difference-in-difference is not a special case of lagged regression.

Li and Ding state this explicitly in the very nice article that you link to. I quote below:
“Gelman (2007) pointed out that restricting beta to equal 1 in (6) gives identical least squares estimators for tau from models (5) and (6). This suggests that, under these two linear models, the difference-in-difference estimator is a special case of the lagged dependent-variable regression estimator. However, the nonparametric identification Assumptions 1 and 2 are not nested, and the difference-in-differences estimator is not a special case of the lagged-dependent variable adjustment estimator in general.”

Their reference to “Gelman (2007)” is one of your blog posts. Li and Ding are politely pointing out that you are mistaken in your statements about differences-in-differences because you have forgotten about more general cases.

Thank you for linking to Li and Ding’s paper. It was a useful read.

By: A. Tasso

A. Tasso — Mon, 06 May 2019 18:01:36 +0000

I see that, in response to the comment Jens posted on Gelman’s blog from 2007, Gelman said “Thanks for the comments. I’ll take a look more carefully and get back to you all.” But I don’t see that he posted a follow-up response.

By: Anonymous

Anonymous — Mon, 06 May 2019 15:10:54 +0000

From that 2007 post, a comment: https://statmodeling.stat.columbia.edu/2007/02/15/differenceindif/#comment-42233

“they are not the same. In fact, they are based on very different assumptions. ….
[DID case]

Y_1i – Y_0i = beta_0 + beta_1 D_i + e_i

so need mean-independence (e orthogonal to D_i). while the lagged regression yields:

Y_1i = gamma_0 + gamma_1 Y_0i + gamma_3 D_i + e_i ”

Apparently taking a statistics course makes you lose all notion of algebra:

setting gamma_1 to 1, solving for “Y_1i – Y_0i”, and inspecting coefficients shows that the models are *exactly the same* when beta_0 = gamma_0 and beta_1 = gamma_3.
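For anyone who wants to check the algebra numerically, here is a minimal sketch in Python with NumPy. The simulated data, the function names (`simulate`, `did_fit`, `lagged_fit`) and the true coefficient values are all made up for illustration and are not taken from the 2007 post or from Li and Ding. It fits the DID regression, the lagged regression with gamma_1 fixed at 1, and the unrestricted lagged regression by least squares.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(n=500, seed=0):
    """Hypothetical two-period data; the groups differ at baseline."""
    rng = np.random.default_rng(seed)
    d = rng.integers(0, 2, size=n).astype(float)        # treatment indicator D_i
    y0 = 1.0 + 0.8 * d + rng.normal(size=n)             # earlier outcome Y_0i
    y1 = 0.5 + 0.6 * y0 + 2.0 * d + rng.normal(size=n)  # later outcome Y_1i
    return y0, y1, d

def did_fit(y0, y1, d):
    """DID: Y_1i - Y_0i = beta_0 + beta_1 D_i + e_i."""
    X = np.column_stack([np.ones_like(d), d])
    beta_0, beta_1 = ols(X, y1 - y0)
    return beta_0, beta_1

def lagged_fit(y0, y1, d, fix_gamma1=None):
    """Lagged regression: Y_1i = gamma_0 + gamma_1 Y_0i + gamma_3 D_i + e_i.

    If fix_gamma1 is given, gamma_1 is held at that value and only
    gamma_0 and gamma_3 are estimated (Y_0i enters as a fixed offset).
    """
    if fix_gamma1 is None:
        X = np.column_stack([np.ones_like(d), y0, d])
        gamma_0, gamma_1, gamma_3 = ols(X, y1)
        return gamma_0, gamma_1, gamma_3
    X = np.column_stack([np.ones_like(d), d])
    gamma_0, gamma_3 = ols(X, y1 - fix_gamma1 * y0)
    return gamma_0, fix_gamma1, gamma_3

if __name__ == "__main__":
    y0, y1, d = simulate()
    print("DID (beta_0, beta_1):        ", did_fit(y0, y1, d))
    print("lagged, gamma_1 fixed at 1:  ", lagged_fit(y0, y1, d, fix_gamma1=1.0))
    print("lagged, gamma_1 unrestricted:", lagged_fit(y0, y1, d))
```

With gamma_1 fixed at 1 the two fits return the same intercept and the same treatment coefficient. The unrestricted fit generally returns a different treatment coefficient when the groups differ at baseline. So this checks the equivalence of the two linear models under the restriction, and says nothing about whether the identification assumptions behind them are nested.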