“Causal” is like “error term”: it’s what we say when we’re not trying to model the process.

After my talks at the University of North Carolina, Cindy Pang asked me a question regarding causal inference and spatial statistics: both topics are important in statistics but you don’t often see them together.

I brought up the classical example of agricultural studies, in which different levels of fertilizer are applied to different plots, the plots have a spatial structure (for example, laid out in rows and columns), and the fertilizer can spread through the soil to affect neighboring plots. This is known in causal inference as the spillover problem, and the way I’d recommend attacking it is to set up a parametric model for the spillover as a function of distance, which determines the effective level of fertilizer reaching each plot, so that you could directly fit a model of the effect of the fertilizer on the outcome.
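As a toy illustration of that recommendation, one could simulate plots on a grid, define each plot’s exposure as a distance-weighted sum of the applied doses, and recover the decay and effect parameters by nonlinear least squares. The grid layout, the exponential-decay form, and all parameter values below are my own assumptions for illustration, not from any real trial:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
n = 25
coords = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
dose = rng.binomial(1, 0.5, n).astype(float)   # randomized fertilizer assignment
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

def outcome_model(dose, alpha, beta, ell):
    # exposure = own dose plus exponentially decaying spillover from neighbors
    exposure = np.exp(-dist / ell) @ dose
    return alpha + beta * exposure

# simulate yields under known parameters, then recover them by nonlinear least squares
y = outcome_model(dose, 2.0, 1.5, 0.7) + rng.normal(0, 0.1, n)
(alpha_hat, beta_hat, ell_hat), _ = curve_fit(
    outcome_model, dose, y, p0=[1.0, 1.0, 1.0],
    bounds=([-np.inf, -np.inf, 0.01], [np.inf, np.inf, 10.0]))
print(alpha_hat, beta_hat, ell_hat)
```

Because the spillover is modeled directly, fitting this is “just inference”: the causal effect of fertilizer at any distance comes out of the fitted parameters.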

The discussion got me thinking about the way in which we use the term “causal inference” in statistics.

Consider some familiar applications of “causal inference”:
– Clinical trials of drugs
– Observational studies of policies
– Survey experiments in psychology.

And consider some examples of problems that are not traditionally labeled as “causal” but do actually involve the estimation of effects, in the sense of predicting outcomes under different initial conditions that can be set by the experimenter:
– Dosing in pharmacology
– Reconstructing climate from tree rings
– Item response and ideal-point models in psychometrics.

So here’s my thought: statisticians use the term “causal inference” when we’re not trying to model the process. Causal inference is for black boxes. Once we have a mechanistic model, it just feels like “modeling,” not like “causal inference.” Issues of causal identification still matter, and selection bias can still kill you, but typically once we have the model for the diffusion of fertilizer or whatever, we just fit the model, and it doesn’t seem like a causal inference problem, it’s just an inference problem. To put it another way, causal inference is all about the aggregation of individual effects into average effects, and if you have a direct model for individual effects, then you just fit it directly.

This post should have no effect on how we do any particular statistical analysis; it’s just a way to help us structure our thinking on these problems.

P.S. Just to clarify: In my view, all the examples above are causal inference problems. The point of this post is that only the first set of examples are typically labeled as “causal.” For example, I consider dosing models in pharmacology to be causal, but I don’t think this sort of problem is typically included in the “causal inference” category in the statistics or econometrics literature.

41 thoughts on ““Causal” is like “error term”: it’s what we say when we’re not trying to model the process.”

  1. Obvious once you’ve heard it, but an incredibly useful lens on the world. Another gold nugget from a blog that has proven itself a very rich vein.

  2. Nicely captured. Would be good to have a formal way of expressing “modeling of individual effects with a mechanistic model” within the framework of causal inference. This could be possible by using random effects describing individual response, adding them to the DAG, and conditioning on them. This could also be a way to replace “sequential exchangeability” with a mechanistic model.

    Anyone, any thoughts on this? Or better, a reference to a paper discussing this?

    • Individual effects are well captured within the framework of causal inference, because every SCM (structural causal model) can be regarded as “mechanistic model” and gives you, for every individual u, the counterfactual Y(x,u), for every X and Y.
      At the same time, Andrew’s statement that you need the mechanistic model to get individual effects is wrong.
      See chapter 9 and section 11.9.1. Also, see a recent conceptual paper on personalized decision making: https://ucla.in/33HSkNI
      No model at all is needed, just experimental and observational data from same population.

        Thank you! You say that it is possible to estimate individual causal effects using causal inference if you have appropriate data. This seems to be compatible with, and to further support, the original statement from Andrew that mechanistic models of individual effects do (or can) provide causal estimates; however, these models are not usually thought of as providing causal estimates.

        Taking up the example from above, dosing in pharmacology. This gives you a rich body of literature when you search for it on Google Scholar:
        https://scholar.google.com/scholar?hl=en&q=pkpd+modeling

        As in your conceptual paper, most of this work looks into estimating individual treatment effects. The approach works best if you have rich data with several or many observations for each subject. The models try to honor any available pharmacological understanding of the drug and the disease and aim at a detailed description of the available data, including systematic trends, variability, and individual parameters. Technically, the models are often nonlinear mixed effects models with random effects to describe differences between subjects, and use modeling & simulation similar to some of the g-computation methods.
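        For concreteness, here is a minimal sketch of the kind of mechanistic dosing model discussed above: a one-compartment model with first-order absorption, with a log-normal random effect on clearance to capture between-subject differences. All parameter values are illustrative, not from any study:

```python
import numpy as np

def concentration(t, dose, ka, CL, V):
    """One-compartment model with oral dosing and first-order absorption."""
    ke = CL / V  # elimination rate constant (1/h)
    return dose * ka / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

rng = np.random.default_rng(2)
t = np.linspace(0.1, 24, 50)   # hours after a single 100 mg dose
CL_pop = 3.0                   # population clearance (L/h), made up
for subject in range(5):
    CL_i = CL_pop * np.exp(rng.normal(0, 0.3))  # log-normal between-subject variation
    c = concentration(t, dose=100.0, ka=1.2, CL=CL_i, V=30.0)
    print(subject, c.max())    # peak concentration varies by subject
```

        The model is causal in the sense of the original post: it predicts each subject’s concentration curve under any hypothetical dose, not just the one observed.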

        Thus, reformulating my question: do we need to use SCMs or g-computation with sequential exchangeability, or any other method established for “causal inference,” to analyze these data in order to get valid causal estimates (assuming that the data support this)? Or can we use the established mechanistic modeling methods and show that the estimates are valid causal estimates? My strong impression is that this should be possible, as suggested by Andrew and as alluded to in my first reply, but I’m not sure that this has already been done.

        Does anyone have further input, ideas, references?

        • Maybe of interest, Sarah Ackley’s recent commentary- she’s straddling mathematical modeling in the infectious disease epi tradition and causal modeling as has come to dominate epi:
          Sarah F Ackley, Justin Lessler, M Maria Glymour, Dynamical Modeling as a Tool for Inferring Causation, American Journal of Epidemiology, Volume 191, Issue 1, January 2022, Pages 1–6, https://doi.org/10.1093/aje/kwab222

  3. “To put it another way, causal inference is all about the aggregation of individual effects into average effects…”

    Isn’t it just the opposite? You measure the aggregate / average effect and attempt to disaggregate it into individual causes. That’s why it so frequently fails. In fact it’s only in the best-case scenario that you even know what you’re measuring, since you don’t even know how many components make up the individual effects, let alone what they are or their relative magnitudes.

    It’s great that you point out this distinction. Scientists have been using the approach of isolating individual effects and modeling them for centuries, so it would be great for statisticians and data analysts to use a successful approach too.

  4. Interesting. But at some level isn’t everything a black box? Gravity causes things to fall to the ground, but that’s just because we have a name (gravity) for the regularity of things falling to the ground. And in a real-world model of things falling to the ground, you have various intervening effects (air pressure, high-mass objects just beneath the earth’s surface) that might actually have effects if the experiment is sensitive enough. In your fertilizer example, distance doesn’t “cause” anything — some things (hopefully) highly correlated with distance do.

    • And while distance cannot be caused by the fertilizer either it may not be so clear for other covariates. (Does the fertilizer diffuse better when the soil is warmer or does its diffusion have an effect on the temperature?)

    • Jonathan:

      Regarding your last sentence: In the fertilizer example, I’m not saying that “distance” causes anything; it’s the application of fertilizer that has a causal effect, an effect that varies by distance.

  5. With the fertilizer example, even if we have a mechanistic model for spillover, aren’t we still doing causal inference in some meaningful sense when we estimate average treatment effects based on collections of individual measurements? More specifically, we still depend on aspects of the design and various statistical assumptions to get average control and treatment estimates of whatever we’re measuring and then interpret them as unbiased estimates of the full set of potential controls and treatments, right? Or maybe this is just what you were referring to as the black box?

    • Noah:

      Yes, we are indeed still doing causal inference in the setting where the spread of the fertilizer is being modeled. I’m not saying this isn’t causal inference; I’m saying that this doesn’t usually get thought of as causal inference in statistics jargon. Just like, when I do pharmacology, it falls in the category of “pharmacology” or “mechanistic modeling,” not in the category of “causal inference,” even though it actually is causal inference.

      • I think Noah is saying that in the fertilizer example even after modeling fertilizer spread you still need to do what is usually thought of as causal inference. This is because accounting for interference (even if there is a model for the interference process) is still “traditional” causal inference.

  6. This is really fascinating/clarifying — I wonder if there is an in-between space and whether there is any literature on this. For instance, pharmacology models in the lab might not work as well in modeling clinical trials where there are other potential confounders, so you somehow want to get “causal” estimates of parameters that will generalize. In my own field of cognitive science, I imagine this is one thing that limits models derived from precise lab experiments being applied to more real-world data/scenarios. I remember seeing something related to this idea previously on the blog (https://statmodeling.stat.columbia.edu/2020/01/08/how-to-cut-using-stan-if-you-must/) but I’m wondering if there are other ways of thinking about it as well. I’d imagine there is work on structural models in Econ that is related but would be curious if anyone knows of specific references.

    • No way Andrés.
      Causal MODELS are models. Hierarchical models are a type of causal model. All your career has been based on causal models. Causal models try to model causality, a very real phenomenon of Nature. So what is causality, really? This is how I, an ex-physicist like you, define it:

      Causality is a time-induced ordering between two events, the transmission of information (and the associated energy) from the earlier of the two events to the later one, and the physical response of the later event to the reception of that information.

      And what is Causal Inference: It’s a theory that models causality using probability and statistics. The theory distinguishes between correlation and causation, and it tries to answer “Why?” and “What if?” type questions.

    • “I’d imagine there is work on structural models in Econ that is related but would be curious if anyone knows of specific references.”

      Pretty much all of econometrics, beyond the elementary OLS level, is based on this. In lieu of mechanistic models, econometricians use economic theory to come up with an underlying model of what’s going on, and then choose an appropriate estimator.

      E.g. if the price of something goes up, does that cause people to buy less of it, so we’ll see a fall in the quantity? Or does it cause people to want to sell more of it, so we’ll see an increase in quantity? Without an economic model, we have to fumble around, looking not just at the correlation of price and quantity but also trying to figure out the role of possible covariates.

      Or we could apply the economic model of supply and demand and realize that there are two major forces at work, and model how supply and demand interact, and based on that model choose to use two-stage least squares, instrumental variables, etc. to do the estimation.
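      A toy simulation (all numbers invented) illustrates the problem and the fix: when unobserved demand and supply shocks both move the equilibrium price, regressing quantity on price is biased, but a cost shifter that enters only the supply equation identifies the demand slope via two-stage least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
cost = rng.normal(size=n)   # supply-side cost shifter: the instrument
u_d = rng.normal(size=n)    # unobserved demand shock
u_s = rng.normal(size=n)    # unobserved supply shock

# equilibrium of demand q = -1.0*p + u_d and supply q = p - 2*cost + u_s
price = (u_d - u_s + 2 * cost) / 2
quantity = -1.0 * price + u_d

def ols(y, x):
    """Intercept and slope from a simple least-squares fit of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

naive_slope = ols(quantity, price)[1]   # biased: price is correlated with u_d
b0, b1 = ols(price, cost)               # first stage: project price on the instrument
tsls_slope = ols(quantity, b0 + b1 * cost)[1]   # second stage: close to the true -1.0
print(naive_slope, tsls_slope)
```

      The point of the sketch is the one made above: the choice of estimator follows from the economic model of how supply and demand jointly determine the data.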

      The problem of course is coming up with that economic model, and getting agreement that it’s the right one to use. What if someone says that ordinary supply-and-demand is not appropriate to use in studying this market because some firms have market power or even a monopoly? Or that the market is not in equilibrium as assumed by supply-and-demand models, or that the market is actually composed of several sub-markets, or etc. etc.

      Economics does have an advantage over other social sciences in that many of its research areas are ones where standard economic models can work reasonably well. When a good becomes more expensive, people buy less of it: you can rely on that economic law to hold, as long as other variables don’t change their values too much.

      But needless to say there are many other areas where there are a lot of economic models that could be used, that disagree with each other, and no one knows which one is best. Economists can’t predict when the next recession will come any better than geologists can predict when the next big earthquake will hit California. (I.e. we know it’s coming, we even have some idea of how frequently it happens, but we don’t know exactly when it’ll happen.)

  7. apologies in advance for the pedantry.

    I’ve designed fertilizer experiments for the last three years. While I get your meta-point, because something like this is how I’d recommend approaching the problem, this is a bad practical example. ClimateCorp (not where I worked) spent an immense amount of money trying to solve this and did poorly.

    (1) The academic agronomists I work with disagree about how much the fertilizer will flow in a well-designed trial on mostly flat land, but their numbers are all small: between 0 and 2 rows. The typical algorithm is therefore just to measure results from the middle… e.g. the middle 2 rows of a 4 row plot.
    (2) You don’t have a measurement that’s strongly controlled by fertilizer at that spatial resolution.

    You do see spatial effects in larger on-farm trials that a diffusive model can help you deal with. And if your field has a lot of topography and is thus inconsistent, you could see more fertilizer drift… but that’s really driven by the hydrology. So you actually need to model an interaction of the hydrology/topography term with the treatment structure… I’ve never seen enough statistical power to do this.

    Your best approach is to use a crop/soil model, but those require a lot of calibration, so in practice you can’t tune this.

  8. this title reminds me of the slam-sham paper: even in a randomized controlled trial we can do some sort of regression/modeling; even with iid sample draws we can do some sort of quadrature; etc. We have been so used to the fit-the-model-as-much-as-you-want approach, while its general benefit or cost (for robustness and estimation efficiency) is not well understood.

    • Yuling:

      Yes, I agree. A few things are going on. One thing is that in causal inference people will often put too high a value on “unbiasedness,” which can cause big problems as discussed in that sham-chickens paper. Another thing is the connection between randomization, or causal identification more generally, and robustness, which is a point made by Rubin in a classic 1978 paper.

  9. “To put it another way, causal inference is all about the aggregation of individual effects into average effects”

    Andrew: Your idealized characterization begs the question of how often the necessary conditions are satisfied. In the case of observational / correlational research in the social and behavioral sciences the answer is pretty much never. The Introduction and Discussion sections of most empirical reports traffic freely in speculations about causal processes when they refer to things that activate, blunt, buffer, compromise, disrupt, exacerbate, suppress other things. These and related terms don’t refer to mere statistical operations, mathematical functions or abstract universal laws. They’re action words, verbs, explicit theoretical references to putatively causal generative mechanisms—things influencing other things, hypothesized psychological processes— that can exist only in concrete living individuals. In the Methods and Results sections, however, these individuals are nowhere to be found. The unit of analysis for data analytic and statistical modeling purposes is instead an aggregate data structure characterized by things like conditional dependencies, effect sizes, and the relative statistical “contributions” of specific variables to variance in the dependent variable(s) of interest. And then something magic happens. The linguistic landscape of the Discussion section is no longer statistical but theoretical once again. Variables don’t just correlate and predict as they did in the Methods and Results sections. We’re told now that they also influence and explain one another, as readers are invited to join the authors in concluding that the aggregate-level data structure reveals something causally important about the psychological processes of concrete, living individuals they never actually studied.

    The most troubling thing about this practice is that it isn’t considered troubling at all!!

    John

      • Andrew says:
        >>But the term “causal inference” in statistics and econometrics typically seems to be reserved for black-box setting.

        >>Once we have a mechanistic model, it just feels like “modeling,” not like “causal inference.”

        Andrew, isn’t what you mean by “reserved for black-box setting” equal to the more conventional statement “reserved for dealing with missing information”?

        Also, when you use the expressions “feels like modelling” and “mechanistic model”, isn’t that equal to “feels like predicting”?

        Because if this is the case, it seems to me that what you are saying is that “causal inference” and “prediction” are different. That is also what Judea means when he says that causal inference is about rungs 2 and 3, whereas prediction is about rung 1.

        What a tangled web we weave.

        • Robert:

          No, I’m not saying that causal inference and prediction are different! I think of causal inference as a special case of prediction, where we’re making predictions conditional on hypothetical treatment assignment.

          Also, no, I do not equate “missing information” with “black box.” For example, a multi-compartment model in pharmacology has lots of missing information, but I think of it as a “mechanistic model,” not a “black box.”

        • Okay, Andrew, but let me clarify that what I meant by “missing information” is the missing information in a counterfactual (this is the familiar Rubin interpretation of counterfactuals). I did not mean the unobserved, hidden nodes in, for instance, a Kalman filter or a hidden Markov model.

    • Thanks for the citation!

      I’d add that I see causality as equally if not more fundamental than probability in applied statistics. All inferential statistics depend on some kind of model (set of assumptions) about the causal mechanism generating the data – mechanisms that include (beyond any targeted effect) factors affecting selection for observation and treatment as well as factors affecting measurements and outcomes. Experimental design is then seen as largely about eliminating (blocking) causal effects of those factors, which is sometimes emulated (not always with complete success) by conditioning on the factors.

      In my experience, the abstraction of causal assumptions into acausal probability distributions has appeared as a major obstacle for users trying to understand what the resulting statistics do and don’t mean in the application context. Consequently, my ideal of a general stat course covers logic and causation before probability and statistics (in fact that’s how I taught stats for several decades, introducing both potential-outcome and graphical causal models under the causation section).

      For more details see
      Greenland, S. (2022). The causal foundations of applied probability and statistics. Ch. 31 in: Dechter, R., Halpern, J., and Geffner, H., eds. Probabilistic and Causal Inference: The Works of Judea Pearl. ACM Books, no. 36, 605-624, https://dl.acm.org/doi/10.1145/3501714.3501747, https://arxiv.org/abs/2011.02677

      • I learn more from the comments section of this blog alone than from a year’s worth of NBER working papers. Thank you all and thanks Andrew for keeping the lost art of blogging alive.

  10. Yeah, I totally agree with this in terms of the labeling/framing of what is a causal inference problem and what is plain-old model-fitting and inference. The only thing that I think is tricky here is that the mechanistic/process models are still usually pretty abstract, e.g. transmission models or spatial models are closer in spirit to linear regression than to the actual processes they purport to represent, even if they’re something less of a black box than a process-free model. That’s not inherently problematic, but I think we’ve seen through covid that it’s easy to get kind of mystified by more realistic models and forget how abstract they really are.

  11. A rare, half-way agreement with Andrew who wrote:
    “statisticians use the term “causal inference” when we’re not trying to model the process.”
    True, statisticians use the term when they are not trying, or when they can’t.
    Causal Inference folks use this term when they ARE trying to model the process, and succeed.

    • Judea:

      See P.S. above. I do consider mechanistic models (for example, for dosing in pharmacology) to be causal. And I’m a statistician. But the term “causal inference” in statistics and econometrics typically seems to be reserved for black-box settings. In that sense, the above post is all about terminology—but it’s not just about terminology, in that I am concerned that the formulation of causal inference in terms of black boxes (no modeling of the mechanism) can give researchers an excuse to . . . not model the mechanism! So I think we’re in agreement here, except I disagree with your implicit distinction between “statisticians” and “Causal Inference folks.” There’s lots of overlap between those two categories of researcher.

      • Re-reading your post, I pause at every line that mentions “causal inference” and I say to myself: This is not my “causal inference,” and if Andrew is right that this is what statisticians mean by “causal inference,” then there are two non-intersecting kinds of “causal inference” in the world, one used by statisticians and one by people in my culture whom, for lack of a better word, I call “causal inference folks.”
        I cannot go over every line, but here is a glaring one: “causal inference is all about the aggregation of individual effects into average effects, and if you have a direct model for individual effects, then you just fit it directly.”
        Not in my culture. I actually go from average effects to individual effects. See https://ucla.in/33HSkNI.
        Moreover, I have never seen “a direct model for individual effects” unless it is an SCM. Is that what you had in mind? If so, how does it differ from a “mechanistic model”? What would I be missing if I use SCM and never mention “mechanistic models”?

        Bottom line, your post reinforces my explicit distinction between “statisticians” and “Causal Inference folks,” to the point where I can hardly see an overlap. To make it concrete, let me ask a quantitative question: How many “statisticians” do you know who subscribe to the First Law of Causal Inference, or to the Ladder of Causation, or to the backdoor criterion, etc.? These are foundational notions that we “causal inference folks” consider to be the DNA of our culture, without which we are back in the pre-1990 era.

        • Judea:

          I agree that the phrase “causal inference” is used in different ways by different people. That is the point of my post. There, I gave three examples of problems that I’ve worked on (dosing in pharmacology, reconstructing climate from tree rings, and item response and ideal-point models in psychometrics) that I consider to be causal inference, but they would not be put in the “causal inference” category in statistics books. There are other problems I’ve worked on (estimating the effects of incumbency advantage and the Millennium Villages project) that I also consider to be causal inference, and they would be considered causal inference in statistical terminology. It struck me that a key difference between the two sets of examples is that in the first set of examples we have an individual causal model. For instance, in the dosing problem we have a differential equation model of the flow of the drug through compartments of the body: I consider this a causal model in that it makes predictions about potential outcomes defined by different dosings. I’m fine also calling this a mechanistic model.

          You ask, “What would I be missing if I use SCM and never mention ‘mechanistic models’?” My answer is that you’d be just fine! The point of the above post is to discuss the way in which these different problems are placed into categories. Each problem can be solved on its own, and indeed there are lots of different ways of solving problems of dosing, climate reconstruction, etc., and we often use terms such as “differential equation modeling,” “inverse problems,” etc., which are focused on the type of mathematical model being used rather than on the statistical aspects of the problem. There’s no reason why you should use the term “mechanistic model” if you don’t like it; it’s a term I’m using here because it helps me understand the characterization of these different problems.

        • All,
          I have summarized the above in a blog entry titled: “What statisticians mean by `Causal Inference’: Is Gelman’s blog representative?”
          https://ucla.in/39uPjDc
          which provides additional references to “The first Law”, Individual-level causation, and more. It is being discussed now in my Twitter account @yudapearl; you are welcome to join.

  12. I really like the mechanistic vs. statistical framing of causality. I think that these are distinct concepts that reflect real differences that we often paper over. A mechanistic model, I would argue, is one that (if we knew all the components) would work 100% of the time. Consider modeling a light switch, for instance: if the switch is on, then we know that the light will turn on. Many biological, physical, and other processes work like this. In opposition to mechanistic, I would frame the other type of causality as inherently probabilistic and consequently statistical.

    I think that there is a tendency to want to be able to reduce all probabilistic causal models to mechanistic (or assume that the probabilistic is a proxy of the mechanistic) but I do think that there are situations where there are legitimate probabilistic causes. Probabilistic causes work at the level of the group and so we can make mechanistic-type conclusions at the group level but there is an inherent randomness if we want to apply it to individuals. An example from physics is boiling water–you can show mechanistic-type causes at the group level (average, probabilistic, speed of molecules can be determined precisely), but there is inherent randomness at the individual level (individual molecule speeds will vary).

    • This is a useful distinction, even though I would see your two examples as extremes, with most real situations lying in between. How about fasting for 6 hours, then eating an apple or a candy, and then following up on blood sugar and hunger over the next few hours? It is not fully mechanistic: it will be different each time a person does it, and different people will react somewhat differently.

      The distinction is useful, since for the two extremes one would clearly use different modeling techniques to describe the data. For mechanistic/deterministic situations, we use things such as differential equations to describe the data. Situations dominated by uncertainty/variability use statistics to describe the observations. In-between we have tools such as nonlinear mixed effects models.

      Going to causal inference: mechanistic/deterministic situations cover causal inference implicitly. For statistics, powerful methods for causal inference have been developed over the last few decades, including tools such as do-calculus, counterfactuals, and structural causal models. In between, when combining mechanistic with statistical aspects, my impression is that we still lack clarity for causal inference. Here, further research may be needed more than strong words.
