Causal inference and decision trees

Causal inference and decision analysis are two areas of statistics in which I’ve seen very little overlap: the work in causal inference is typically very “foundational,” with continuing reassessment based on first principles, whereas decision analysis is more of a meat-and-potatoes Bayesian inference: slap down a probability model, stick in a utility function, and turn the crank. (With all this processing, this must be ground beef and mashed potatoes.)

Actually, though, causal inference and decision analysis are connected at a fundamental level. Both involve manipulation and potential outcomes. In causal inference, the “causal effect” (or, as Michael Sobel would say, the “effect”) is the difference between what would happen under treatment A and what would happen under treatment B. The key to this definition is that either treatment could be applied to the experimental unit by some agent (the “experimenter”).
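One conventional way to write this definition down, using potential-outcomes notation that is an assumption here and does not appear in the discussion above: let Y_i(A) and Y_i(B) be the outcomes that unit i would show under treatments A and B.

```latex
\tau_i = Y_i(A) - Y_i(B),
\qquad
\bar{\tau} = \mathrm{E}\left[ Y_i(A) - Y_i(B) \right]
```

Only one of Y_i(A) and Y_i(B) is ever observed for a given unit, which is why the average effect, rather than the unit-level effect, is usually the estimand.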

In parallel, decision analysis concerns what would happen if decision A or decision B were chosen. When drawing decision trees, we let squares and circles represent decision and uncertainty nodes, respectively. To map onto causal inference, the squares would represent potential treatments and the circles would represent uncertainty in outcomes, or population variability.
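The mapping between squares, circles, and potential treatments can be sketched in a few lines of Python. This is only an illustration: the class names `Chance` and `Decision`, the option labels, and every probability and utility below are hypothetical and are not taken from any real analysis.

```python
from dataclasses import dataclass


@dataclass
class Chance:
    """Circle node: uncertain outcomes as (probability, utility) pairs."""
    outcomes: list

    def expected_utility(self) -> float:
        total_p = sum(p for p, _ in self.outcomes)
        if abs(total_p - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")
        return sum(p * u for p, u in self.outcomes)


@dataclass
class Decision:
    """Square node: each option (potential treatment) leads to a chance node."""
    options: dict

    def expected_utilities(self) -> dict:
        return {name: node.expected_utility() for name, node in self.options.items()}

    def best(self) -> str:
        eu = self.expected_utilities()
        return max(eu, key=eu.get)


# Hypothetical numbers: a bad outcome (utility -100) occurs with
# probability 0.1 under option A and 0.2 under option B.
tree = Decision({
    "A": Chance([(0.9, 0.0), (0.1, -100.0)]),
    "B": Chance([(0.8, 0.0), (0.2, -100.0)]),
})

eu = tree.expected_utilities()
# Difference in expected utility between the two options: the
# decision-analytic counterpart of an average effect of A versus B.
effect = eu["A"] - eu["B"]
print(eu, effect, tree.best())
```

In this sketch the decision analysis (choose the option with the highest expected utility) and the causal comparison (the difference between the two branches) are read off the same tree.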

In practice, the two areas of research are not always so closely connected. For example, in our decision analysis for home radon, the key decision is whether to remediate your house for radon. The causal effect of this decision on reducing the probability of lung cancer death is assumed to follow a specified functional form as estimated from previous studies. For our decision analysis we don’t worry too much about the details of where that estimate came from.

But in thinking about causal effects, the decision-making framework might be helpful in distinguishing among different possible potential-outcome frameworks.

2 thoughts on “Causal inference and decision trees”

  1. I think this is related…

    Can anyone recommend a useful contemporary review of statistics and the law?

    We have a problem involving matching a high-dimensional vector of observations with a large database of predictions. People are interested in the reliability of a putative match. The only analogous discussion I've seen in the literature is for presentation of DNA matching evidence in criminal trials.
