Is your model converging?

This post is by Aki

I too often see people saying their model is converging or not converging. Sure, if you are doing iterative model building as part of your Bayesian workflow you could say that that iterative process eventually converges to the final model, but it seems people are actually talking about whether the inference algorithm is converging.

A Bayesian model describes a joint distribution of data and parameters. If we condition on observed data, we get the posterior distribution. We often use iterative inference algorithms to make posterior inference. If the inference algorithm doesn’t converge, the convergence problems don’t depend only on the model, but on the model, parameterisation, and the data, which together determine the geometry of the posterior. The same model with a different parameterisation or different data leads to a different posterior geometry. For the same posterior, different iterative algorithms or algorithm choices can also lead to different convergence problems. (We have several examples of iterative inference algorithm convergence problems in the soon-to-appear Bayesian workflow book.)
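As a minimal sketch of how the same model can have a different posterior geometry under a different parameterisation, consider a linear regression y ~ Normal(alpha + beta * x, sigma). Assuming flat priors and known sigma (assumptions made here only to keep the algebra closed-form), the posterior of (alpha, beta) is Gaussian with covariance sigma^2 (X^T X)^{-1}, which does not depend on y. The predictor values and the helper name `posterior_geometry` are hypothetical and chosen for illustration.

```python
import numpy as np


def posterior_geometry(x, sigma=1.0):
    """Summarise the posterior of (alpha, beta) for
    y ~ Normal(alpha + beta * x, sigma), assuming flat priors and known sigma.

    Under those assumptions the posterior covariance is
    sigma^2 * (X^T X)^{-1}, where X has columns [1, x].
    Returns (intercept-slope correlation, condition number of the covariance).
    """
    X = np.column_stack([np.ones_like(x), x])
    cov = sigma**2 * np.linalg.inv(X.T @ X)
    corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    cond = np.linalg.cond(cov)
    return corr, cond


# Hypothetical predictor whose values sit far from zero.
x_raw = np.arange(10.0, 31.0)
# Same model, different parameterisation: the intercept now refers to mean(x).
x_centred = x_raw - x_raw.mean()

corr_raw, cond_raw = posterior_geometry(x_raw)
corr_centred, cond_centred = posterior_geometry(x_centred)

print(f"raw:     corr = {corr_raw:.3f}, condition number = {cond_raw:.1f}")
print(f"centred: corr = {corr_centred:.3f}, condition number = {cond_centred:.1f}")
```

With the raw predictor the intercept and slope are strongly negatively correlated and the covariance is far worse conditioned; after centring the correlation is zero. The model is the same in both cases, so any difference in how easily an algorithm explores the posterior comes from the parameterisation.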

If you want someone to help with possible inference convergence problems, it is not sufficient to say which model you have; you also need to describe the parameterisation, data, and algorithm. Stop talking about models (not) converging (unless doing iterative model building) and talk about the inference algorithm (not) converging, as it is more accurate and implies dependency on the posterior and algorithm.
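To make "the inference algorithm (not) converging" concrete, here is a minimal sketch of one common diagnostic, a basic split-R-hat computed from several chains of draws. It is an illustration only: it assumes the draws for one scalar quantity are stored as an array of shape (chains, iterations), the simulated "chains" are hypothetical independent normal draws rather than output from a real sampler, and the diagnostic your software reports may be a refined variant (for example a rank-normalised one).

```python
import numpy as np


def split_rhat(draws):
    """Basic split-R-hat for one scalar quantity.

    draws: array of shape (n_chains, n_iterations).
    Values close to 1 are consistent with the chains having mixed;
    clearly larger values indicate that they have not.
    """
    draws = np.asarray(draws, dtype=float)
    half = draws.shape[1] // 2
    # Split every chain in two so that drift within a chain is also detected.
    halves = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    within = halves.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)    # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_plus / within)


rng = np.random.default_rng(0)
# Four hypothetical chains that all explore the same distribution.
mixed = rng.normal(size=(4, 1000))
# The same draws, but two chains are shifted as if stuck in another region.
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(f"R-hat, mixed chains: {rhat_mixed:.3f}")
print(f"R-hat, stuck chains: {rhat_stuck:.3f}")
```

Note that the function takes draws, not a model: the diagnostic describes how one algorithm run behaved on one posterior, which is why reporting it together with the parameterisation, data, and algorithm settings is more informative than saying a model did not converge.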

7 thoughts on “Is your model converging?”

  1. Good points. And they remind me of two other things:

    1. Rubin has argued that, in any applied setting, the model you are fitting is the model you’re computing, which is not necessarily the same as the model you’ve written in mathematical notation. This holds for complicated stochastic iterative algorithms and also for very basic approaches such as optimization as in maximum likelihood. That’s one reason for doing posterior predictive checks, as then you’re comparing the data to the model you have actually fit.

    2. Statistical workflow involves nested looping. The inner loop is the iterative algorithm that’s being used to fit the model. (Except for the simplest models, the fitting algorithm will be iterative.) The outer loop is that you’re fitting different models. In general it’s better to have threaded rather than nested looping. But in our workflow we rarely have threaded looping. We’ll fit models one at a time, each with its own iterative algorithm. I guess that we (and data analysts more generally) could do better by embedding in a threaded looping framework.

  2. Well, duh… everyone who has ever talked about problems with convergence has very clearly meant problems with inference algorithms not converging under a given model, data, and parameterization. Natural language is just a lot simpler when you can assume the other party can make inferences from context clues.

    • Anon:

      Just because something seems obvious to you (“Well, duh”), it doesn’t mean it’s known to “everyone.” Aki’s had many collaborators and has taught zillions of students, and I think it’s a safe bet that this is a real misunderstanding he’s talking about here.

    • I agree that it’s not obvious why “inference algorithm (not) converging […] implies dependency on the posterior and algorithm” but “model (not) converging” doesn’t imply dependency on anything else. What would it mean for a model to converge anyway?

  3. My response – which I wrote and didn’t post because I am trying to be less negative here – was along the lines of what Anon wrote. The post left me wondering which fields Aki was referring to, and what it is about causal inference in those fields that makes this sort of stuff confusing, but there was nothing in the post about that. That would have been interesting to me.

    If you spent your career working on math models of mechanical systems, you would never be confused about this. The “inference algorithms” are just canned programs that you adapt to your system, and everyone knows that under some parameterizations your model will not converge upon a single solution because of the limitations of the algorithms.

    (Also, how do you know a priori that you didn’t goof up the model in such a way that the inference algorithm would have converged if you did things correctly, but your model won’t converge because you goofed it up?)

  4. I wrote this post in the context of Bayesian workflow and refer to the Bayesian Workflow book. The post is based on what I’ve seen while teaching Bayesian workflow to 200+ students per year, reading thousands of Stan Discourse posts with questions from applications in many different fields, many Bsky posts, and, recently, some AI agent skill instructions. As Bayesian workflow has iterative model building, I am recommending that in this context “model (not) converging” not be used for “inference (not) converging.” I don’t remember seeing “model not converging” in any context other than the Bayesian workflow context, but that is likely because I’m focusing on reading Bayesian workflow-related topics. I don’t remember seeing “model not converging” years ago in a Bayesian context. Thanks for pointing out that there are fields where talking about models not converging, referring to inference problems, is common. Maybe that and the increasing use of Bayesian workflow have made it leak into the Bayesian workflow context, too.
