Piss-poor monocausal social science

Dan Kahan writes:

Okay, have done due diligence here & can’t find the reference. It was in a recent blog post, and was more or less an aside, but you ripped into researchers (pretty sure econometricians, but this could be my memory adding to your account recollections it conjured from my own experience) who purport to make estimates or predictions based on a multivariate regression in which the value of a particular predictor is set at some level while the others are “held constant” etc., on the ground that variance in that particular predictor independent of covariance in the other model predictors is unrealistic. You made it sound, too, as if this were one of the pet peeves in your menagerie, leading me to think you had blasted into it before.

Know what I’m talking about?

Also, isn’t this really just a way of saying that the model is misspecified, at least if the goal is to make a valid & unbiased estimate of the impact of that particular predictor? The problem can’t be that one is using a regression to try to estimate the impact of variance in one member of a set of independent influences on some state of affairs or other phenomenon (that’s commonplace); it has to be that a model that posits variance in one predictor w/o corresponding covariance in another w/ which it is collinear will yield invalid or biased predictions. Better just to leave the collinear covariate out of the model, b/c then the regression estimate of the impact of the predictor of interest will “include” the covarying impact of the (now omitted) covariate (or maybe better yet, combine the covariate and the predictor of interest into some sort of index; just don’t try to estimate the effect of one “holding the other constant” if that’s not the way the world works).

My reply:

I don’t recall this post at all! Maybe you’re thinking of some other curmudgeon. . . . In any case, I’m reminded of the advice I often give that each causal inference typically requires its own analysis. I’m generally skeptical of an analysis where someone picks out one coefficient to address Hypothesis 1, another to address Hypothesis 2a, and so forth. If a causal inference can be framed via a natural experiment on some variable x, you want to control for things that come before x and not what comes after (I’m thinking of logical order, which is related to but is not identical to time order).
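Here is a minimal simulation sketch of that last point, not anything from the original exchange. Everything in it is an assumption made up for illustration: the variable names (z for something that comes before x, m for something that comes after), the linear data-generating process, and the coefficients. Under those assumptions the total effect of x on y is 2.0 by construction, so we can see which regression gets close to it.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients of y on the given columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process; all coefficients are invented.
z = rng.normal(size=n)                  # comes before x (confounder)
x = 0.8 * z + rng.normal(size=n)        # the variable we "experiment" on
m = 0.5 * x + rng.normal(size=n)        # comes after x (mediator)
y = 1.0 * x + 2.0 * m + 1.5 * z + rng.normal(size=n)

# Total effect of x on y: direct path plus the path through m.
total_effect = 1.0 + 2.0 * 0.5

b_naive = ols(y, x)[1]         # controls for nothing: confounded by z
b_pre = ols(y, x, z)[1]        # controls for what comes before x
b_post = ols(y, x, z, m)[1]    # also controls for what comes after x

print(f"total effect by construction: {total_effect:.2f}")
print(f"y ~ x          coefficient on x: {b_naive:.2f}")
print(f"y ~ x + z      coefficient on x: {b_pre:.2f}")
print(f"y ~ x + z + m  coefficient on x: {b_post:.2f}")
```

In this setup the regression that adjusts only for z should land near the total effect, the unadjusted one overshoots because x and z move together, and the one that also “holds m constant” recovers only the direct path. Which of those numbers is the right answer depends on the causal question being asked, which is why each inference tends to need its own analysis.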


  1. fred says:

    Dan, is it something like this? (In particular, the challenges of interpreting main effects in the presence of interactions?)

  2. revo11 says:

    What if the treatment is a categorical variable? Is it preferable to treat each category as a separate analysis?

  3. Dave says:

    I think he’s referring to this post, which refers to this post, which refers to this post.

  4. dmk38 says:

    Dave nailed it! Thanks. Post that Fred linked is also very pertinent.