Toward a Shnerbian theory that establishes connections between the complexity (nonlinearity, chaotic dynamics, number of components) of a system and the capacity to infer causality from datasets

Nadav Shnerb writes:

Regarding your latest post on causal claims, I believe there must be a theory of the “complexity-predictability relationship” that establishes connections between the complexity (nonlinearity, chaotic dynamics, number of components) of a system and the capacity to infer causality from datasets. For example, when a system’s dynamics are chaotic, the occurrence of state A at t=0 preceding state B at t=T provides very little insight into the system’s behavior when initiated at a slightly perturbed state A+dA and so forth. Are you familiar with any existing theories that explore this concept?

Recently, I delved into a variant of ANOVA, by examining the impact of random perturbations in selected elements of a random matrix on its highest eigenvalue. The outcomes imply that as the size of the matrix increases, the viability of such an analysis diminishes. This observation has prompted me to ponder these questions.
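
As a rough illustration of the kind of experiment described in the email, here is a minimal sketch. It is a guess at the flavor of the analysis, not Shnerb's actual setup, which the email does not spell out. The assumptions are all mine: entries drawn uniformly from (0, 1) so that a plain power iteration finds the top eigenvalue, a single randomly chosen entry bumped by a fixed `delta`, and the hypothetical helper names `top_eigenvalue` and `mean_shift`.

```python
import random


def top_eigenvalue(matrix, iterations=100):
    """Largest eigenvalue of a matrix with positive entries, via power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        value = max(product)  # vec has max entry 1, so this tends to the eigenvalue
        vec = [x / value for x in product]
    return value


def mean_shift(n, delta=0.5, trials=10, seed=0):
    """Average change in the top eigenvalue after bumping one random entry by delta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumption: i.i.d. uniform(0, 1) entries, not any particular physical model.
        a = [[rng.random() for _ in range(n)] for _ in range(n)]
        base = top_eigenvalue(a)
        i, j = rng.randrange(n), rng.randrange(n)
        a[i][j] += delta
        total += abs(top_eigenvalue(a) - base)
    return total / trials


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, mean_shift(n))
```

In this toy setup one would expect the shift from a single entry to fall off roughly like delta/n, so the contribution of any one element becomes harder to pick out as the matrix grows. That is at least consistent with the observation in the email, though it is not a substitute for the actual analysis.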

This is an interesting topic; it relates to our piranha theorems, I think. Shnerb is a physicist so he’s coming at this problem from a different direction than we are, as social scientists and mathematicians.

This also reminds me of the concept of unfolding-flower models and, more generally, the idea that we can infer more parameters when we get more data, with interesting patterns that arise for nonlinear models. There’s a connection here to phase transitions in statistical physics, where when you add more energy to a system, additional quantum levels get activated. I’ve wanted to pursue this idea further for decades but have never quite taken the time to work it out carefully in a series of examples. Maybe writing this post is a first step!

6 thoughts on “Toward a Shnerbian theory that establishes connections between the complexity (nonlinearity, chaotic dynamics, number of components) of a system and the capacity to infer causality from datasets”

  1. Interestingly, if you have an entire trajectory, the fact that it diverges rapidly from epsilon perturbations implies that you can very precisely determine the parameters defining the measured trajectory.

    So yeah, two time points spread far apart are meaningless, but multiple time points measured through time mean you’re absolutely zeroing in on reality.
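
A minimal sketch of this point, using the logistic map as a stand-in chaotic system. The map, the values r = 3.9 and x0 = 0.2, noise-free observations, a known initial state, and the helper names `logistic_trajectory` and `misfit` are all assumptions made for illustration. The idea is to compare how badly a slightly wrong parameter fits a short record versus a long one.

```python
def logistic_trajectory(r, x0, steps):
    """Iterate x -> r * x * (1 - x) and return the states x_0 .. x_steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def misfit(r_candidate, observed, x0):
    """Sum of squared differences between a simulated and an observed trajectory."""
    simulated = logistic_trajectory(r_candidate, x0, len(observed) - 1)
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


if __name__ == "__main__":
    r_true, x0 = 3.9, 0.2  # hypothetical "true" system
    for steps in (3, 60):
        observed = logistic_trajectory(r_true, x0, steps)
        # A candidate parameter that is wrong in the sixth decimal place.
        print(steps, misfit(r_true + 1e-6, observed, x0))
```

With only a few time points the slightly wrong parameter is nearly indistinguishable from the true one; over a long record the same error produces a large misfit, so the data rule it out. With observation noise the picture is murkier, and the misfit surface of a chaotic system can be very jagged, so this should be read as a cartoon of the argument rather than an estimation recipe.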

  2. I find the language here confusing. Doesn’t “chaotic” refer to systems for which future states cannot be predicted from current states by any method? Also, “chaotic system” seems like an oxymoron since the word “system,” at least in engineering, is used to denote a functional entity that produces a stable outcome in a definable regime.

    For any system capable of producing a stable outcome, there is a discrete set of equations that fully describe the system. The development and use of those equations falls under control theory. These would be “causation” equations, describing how energy flows through the system in a causal way. So for me, this sentence:

    “…a theory of the “complexity-predictability relationship” that establishes connections between the complexity (nonlinearity, chaotic dynamics, number of components) of a system and the capacity to infer causality from datasets”

    I would say that yes there is, and it is called “control theory.” Am I missing something?

    • It’s not really that future states cannot be predicted from current states. It’s that small perturbations in the parameters lead to very large changes in the behavior over time. Since the actual value of the parameters cannot be measured with arbitrarily small error, the future behavior of the system cannot be reliably calculated. That’s a de facto limitation, but the underlying equations are definite. In addition, some systems have chaotic regimes and non-chaotic regimes.

      A model that helps me to think about these things is a river. Some parts of the river may be in laminar flow, but others not. If we drop two wood chips into the river side by side, they will slowly move apart. We can’t even say that they will eventually move downstream at the average rate of the river water because one might, e.g., get caught in an eddy near the shore and never make it all the way downstream. Another chip might get caught in a rotor and released at some time later. Yet there are probably some useful statistical properties that could be discovered.

      • This is the right way to think about it. It’s easy to get chaotic behavior from simple, well-defined equations of motion. Basically, a ball bouncing in a 2D box that has a circular obstruction in the middle is chaotic. Just the slightest change in initial position or velocity will eventually result in a very different position (the difference grows exponentially in time).

        The fact that trajectory differences grow exponentially in time means that, if you have accurate measurements, the initial conditions can be recovered to 1/exp(kt)-type accuracy: the longer you observe, the tighter the bounds on the initial conditions (and/or other parameters).

        If you have errors in observations, but you have a long trajectory and a Bayesian model, you can often get quite good identification of important parameters.
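
The exponential-divergence argument in this thread can be sketched numerically. The example below swaps the bouncing ball for the logistic map with r = 4, purely because it fits in a few lines; the starting point 0.3, the tolerance `eps`, and the helper name `horizon` are illustrative assumptions. It asks how long two trajectories stay within `eps` of each other when they start a distance `d` apart.

```python
def logistic(x, r=4.0):
    """One step of the logistic map, chaotic at r = 4."""
    return r * x * (1.0 - x)


def horizon(d, x0=0.3, eps=0.01, max_steps=200):
    """First step at which trajectories started at x0 and x0 + d differ by more than eps.

    Returns None if they stay within eps for max_steps steps.
    """
    a, b = x0, x0 + d
    for step in range(max_steps + 1):
        if abs(a - b) > eps:
            return step
        a, b = logistic(a), logistic(b)
    return None


if __name__ == "__main__":
    for d in (1e-4, 1e-8, 1e-12):
        print(d, horizon(d))
```

If the separation grows like exp(kt), one would expect each factor-of-10^4 improvement in the starting accuracy to buy only a roughly constant number of extra steps. Read in reverse, a trajectory that matches the data for t steps pins the initial condition down to something like exp(-kt), which is the point made above.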

  3. Don’t think you really infer causality from data. It is something assumed by whatever theory or model.

    E.g., the current mainstream theory is that each event is collectively “caused” by every event in its past light cone.

    Of course our models are abduced (guessed) from some kind of data/experience so in that sense causality is still being “inferred”. The *capacity* to infer causality is the relative predictive skill of the model (relative to precision and accuracy of whatever other models have been suggested).

  4. Thanks for the piranha paper. It certainly does stimulate thought about what we are actually doing with data and what we hope to achieve. I would like to touch briefly on a couple of points in the paper.

    I work in the epidemiological area, and identifying causality hinges on understanding data-generating mechanisms. The discussion of the piranha problem motivated me to think about our data-centric methods to uncover causality. From the paper, Cook’s advice about context, “the structure of the system,” is absolutely vital in my view. As the paper states, “system-level variation puts a limit on what can be learned about the average effects of particular interventions”. The trouble is, we mostly don’t fully know system-level variation. We make hypotheses about data-generating mechanisms without having “all the data” about system-level variation. At present I have to agree with the notion that, more generally, a low-dimensional structure is probably what we can reasonably expect from our data and methods.

    The need for theorems to capture the surprising regularity we seem to see around us makes me think about paths of least energy from dynamics theory and perhaps is tied up with information theory/entropy.

    Regarding the absence of limiting results for high dimensions, I think we already have that, courtesy of Michel Talagrand, whose work recently won him the Abel Prize. His inequalities go a long way toward explaining why we see regularity, in my view.

    Your reference to Terence Tao’s discussion about transitivity of correlations was great, and putting correlation in high-dimensional data in a geometric context hinted at the results obtained by Talagrand.

    In the preamble about chaos theory on this page, it is important to distinguish between deterministic, probabilistic systems and systems that may be a hybrid of the two. Theory in one may not necessarily translate to the other but just the same, I think there would be overlap.

    Thanks again for an incredibly stimulating paper and discussion.
