Methodological terrorism. For reals. (How to deal with “what we don’t know” in missing-data imputation.)

Kevin Lewis points us to this paper, by Aaron Safer-Lichtenstein, Gary LaFree, and Thomas Loughran, on the methodology of terrorism studies. This is about as close to actual “methodological terrorism” as we’re ever gonna see here.

The linked article begins:

Although the empirical and analytical study of terrorism has grown dramatically in the past decade and a half to incorporate more sophisticated statistical and econometric methods, data validity is still an open, first-order question. Specifically, methods for treating missing data often rely on strong, untestable, and often implicit assumptions about the nature of the missing values.

Later, they write:

If researchers choose to impute data, then they must be clear about the benefits and drawbacks of using an imputation technique.

Yes, definitely. One funny thing about missing-data imputation is that the methods are so mysterious and so obviously subject to uncheckable assumptions that there’s a tendency for researchers to just throw up their hands and give up: either go for crude data-simplification strategies, such as throwing away all cases where anything is missing, or just impute without any attempt to check the resulting inferences.
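
To see why the throw-away-incomplete-cases strategy can go wrong, here is a minimal simulation sketch (Python with numpy; the data-generating numbers are invented for illustration, not taken from the terrorism literature). The outcome y depends on x, and y is more likely to be missing when x is large, so complete-case analysis comes out biased low, while even a simple regression imputation that uses x recovers roughly the right answer:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: y depends linearly on x (made-up coefficients).
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)

# Missingness depends on the observed x: the larger x is, the more
# likely y is to be missing.
p_miss = 1.0 / (1.0 + np.exp(-2.0 * x))
missing = rng.random(n) < p_miss
y_obs = np.where(missing, np.nan, y)

# Crude strategy: throw away every case where y is missing.
complete_case_mean = np.nanmean(y_obs)

# Simple regression imputation, fit on the observed cases only.
keep = ~missing
slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
y_filled = np.where(missing, intercept + slope * x, y_obs)

print(f"true mean of y:     {y.mean():.2f}")
print(f"complete-case mean: {complete_case_mean:.2f}")  # noticeably too low
print(f"regression-imputed: {y_filled.mean():.2f}")     # close to the truth
```

Of course the fix only works here because the missingness depends on a fully observed variable; if it depends on the missing values themselves, no automatic procedure can save you, which is exactly the “strong, untestable, and often implicit assumptions” point from the quoted paper.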

My preference is to impute and then check assumptions, as here. That said, in practice this can be a bit of work, so in a lot of my own applied work I kinda close my eyes to the problem too. I should do better.
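
For concreteness, here is one minimal sketch of an impute-then-check loop (Python, using scikit-learn’s IterativeImputer as a stand-in for whatever multiple-imputation routine you prefer; the function name and the particular diagnostic, comparing the distribution of imputed values with the observed values across several completed datasets, are my own illustration rather than anything from the linked paper):

```python
import numpy as np
# IterativeImputer is still marked experimental in scikit-learn,
# so the enabling import is required.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)

def impute_and_check(X, n_imputations=5):
    """Multiply impute X, then print a crude diagnostic: compare the
    distribution of imputed values with the observed values, column by
    column, pooled across the imputed datasets."""
    miss = np.isnan(X)
    completed = []
    for m in range(n_imputations):
        imp = IterativeImputer(sample_posterior=True, random_state=m)
        completed.append(imp.fit_transform(X))

    for j in range(X.shape[1]):
        obs = X[~miss[:, j], j]
        imputed = np.concatenate([Xc[miss[:, j], j] for Xc in completed])
        if imputed.size == 0:
            continue
        print(f"column {j}: observed mean {obs.mean():.2f} (sd {obs.std():.2f}), "
              f"imputed mean {imputed.mean():.2f} (sd {imputed.std():.2f})")
    return completed

# Toy example with values missing in one of three correlated columns.
X = rng.normal(size=(500, 3))
X[:, 1] += 0.8 * X[:, 0]              # made-up correlation strength
X[rng.random(500) < 0.3, 1] = np.nan  # about 30% of column 1 missing
impute_and_check(X)
```

Big gaps between the observed and imputed summaries aren’t automatically a problem, since under a reasonable missingness mechanism the imputed values should differ from the observed ones, but they are a prompt to look harder at the imputation model rather than taking the filled-in data on faith.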

5 thoughts on “Methodological terrorism. For reals. (How to deal with “what we don’t know” in missing-data imputation.)”

  1. Just as we should be clear about the biases and assumptions introduced by listwise and pairwise deletion. Which biases and assumptions do you want to highlight in our studies? No technique is without issues.

  2. I always find it unusual that researchers will avoid imputation because of the complexity of the methods, yet think they are fine excluding cases with missing data (some even consider it to increase rigor). Do they not understand that if there’s a systematic reason for the data being missing, excluding those cases will simply bias the results?

    • While I agree with this as a general principle, it does depend on the problem being studied. You might argue that it is better to model every case in a dataset than to remove it, but modeling cases without enough data to be useful is practically indistinguishable from deletion. I certainly acknowledge the benefit of imputation as a normative practice, since it forces researchers to think more deeply about their data.
