State-space models in Stan

Michael Ziedalski writes:

For the past few months I have been delving into Bayesian statistics and have (without hyperbole) finally found statistics intuitive and exciting. Recently I have gone into Bayesian time series methods; however, I have found no libraries that can implement those models.

Happily, I found Stan because it seemed among the most mature and flexible Bayesian libraries around, but is there any guide or book you could recommend for approaching state-space models through Stan? I am referring to more complex models, such as those found in State-Space Models, by Zeng and Wu, as well as Bayesian Analysis of Stochastic Process Models, by Insua et al. Most advanced books seem to use WinBUGS, but that software is closed-source and a bit older.

I replied that he should post his question on the Stan mailing list and also look at the example models and case studies for Stan.

I also passed the question on to Jim Savage, who added:

Stan’s great for time series, though mostly because it allows you to flexibly write down whatever likelihood you want and put very flexible priors on everything, then fits it swiftly with a modern sampler and lets you run diagnostics that are difficult or impossible elsewhere!

Jeff Arnold has a fairly complete set of implementations for state-space models in Stan here. I’ve also got some more introductory blog posts that might help you get your head around writing out some time-series models in Stan. Here’s one on hierarchical VAR models. Here’s another on Hamilton-style regime-switching models. I’ve got a half-written tutorial on state-space models that I’ll come back to when I’m writing the time-series chapter in our Bayesian econometrics in Stan book.

One of the really nice things about Stan is that you can write out your state as parameters. Because Stan can efficiently sample from parameter spaces with hundreds of thousands of dimensions (if a bit slowly), this is fine. It’ll just be slower than a standard Kalman filter. It also changes the interpretation of the state estimate somewhat (more akin to a Kalman smoother, given you use all observations to fit the state).
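
To make that concrete, here is a minimal sketch (illustrative only, with made-up variable names, not one of the linked models) of a local-level model in which the latent state is declared directly as a parameter:

    data {
      int<lower=1> T;                 // number of time points
      vector[T] y;                    // observed series
    }
    parameters {
      vector[T] alpha;                // latent state, sampled directly rather than filtered out
      real<lower=0> sigma_obs;        // measurement noise sd
      real<lower=0> sigma_state;      // state innovation sd
    }
    model {
      sigma_obs ~ normal(0, 1);
      sigma_state ~ normal(0, 1);
      alpha[1] ~ normal(0, 10);       // vague prior on the initial state
      for (t in 2:T)
        alpha[t] ~ normal(alpha[t - 1], sigma_state);  // random-walk state evolution
      y ~ normal(alpha, sigma_obs);   // measurement equation
    }

Because all T values of alpha are sampled jointly with the hyperparameters, the posterior for the state is conditioned on the whole series, which is why it behaves like a smoother rather than a filter.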

Here’s an example of such a model.

Actually that last model had some problems with the between-state correlations, but I guess it’s still a good example of how to put something together in Markdown.

14 thoughts on “State-space models in Stan”

  1. The R package ctsem is designed for hierarchical continuous-time state-space modeling using Kalman filter implementations within Stan.

    For a super-fast intro, see the examples section of the ctStanFit function, or here are some related works:

    Hierarchical Bayesian continuous time system formulation
    https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling

    Software vignette: it discusses only the linear state-space, Bayesian approach, but this can be adjusted to frequentist / non-linear approaches as per the examples in the ctStanFit help.
    https://www.researchgate.net/publication/310747987_Introduction_to_Hierarchical_Continuous_Time_Dynamic_Modelling_With_ctsem

    • Another package I’ve used for univariate state-space modeling of time-varying coefficients is “walker”, which uses Stan and the Kalman filter to estimate the parameters.

      “An alternative solution used by walker is based on the marginalization of the regression coefficients β during the MCMC sampling by using the Kalman filter. This provides fast and accurate inference of marginal posterior p(σ|y). Then, the corresponding joint posterior p(σ,β|y)=p(β|σ,y)p(σ|y) can be obtained by simulating the regression coefficients given marginal posterior of standard deviations. ” – walker vignette

  2. There are lots of examples of HMMs and related models, as well as lots of spatio-temporal models, floating around. There’s a world of difference between discrete cases like HMMs, where you can use dynamic programming, and continuous cases like time series, where you can’t. Among the continuous models, there’s also a big difference between conjugate models, like a lot of the Kalman filters, and non-conjugate models, like a lot of the spatio-temporal models we use.

    Actually that last model had some problems with the between-state correlations

    That’s a perennial problem with spatio-temporal models, since you’re setting up the correlations in the model. A helpful reparameterization can be to model the differences. For example, if you have a latent time series alpha[1], …, alpha[T], the direct model alpha[t + 1] ~ normal(alpha[t], sigma), where alpha is a parameter, leads to dependence between alpha[t] and alpha[t + 1]. Alternatively, you can parameterize in terms of alpha[1] and the differences delta[t + 1] = alpha[t + 1] - alpha[t], with delta[t] ~ normal(0, sigma).
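
    As a minimal sketch of that reparameterization (the variable names are just illustrative), you can declare the differences as parameters and rebuild the state in the transformed parameters block:

        data {
          int<lower=2> T;
          vector[T] y;
        }
        parameters {
          real alpha1;                    // initial state
          vector[T - 1] delta;            // differences, a priori independent
          real<lower=0> sigma;            // innovation sd
          real<lower=0> sigma_obs;        // measurement sd
        }
        transformed parameters {
          vector[T] alpha;
          alpha[1] = alpha1;
          for (t in 2:T)
            alpha[t] = alpha[t - 1] + delta[t - 1];  // so alpha[t] - alpha[t - 1] = delta[t - 1]
        }
        model {
          alpha1 ~ normal(0, 10);
          sigma ~ normal(0, 1);
          sigma_obs ~ normal(0, 1);
          delta ~ normal(0, sigma);       // same prior as alpha[t + 1] ~ normal(alpha[t], sigma)
          y ~ normal(alpha, sigma_obs);   // measurement equation
        }

    The prior dependence now lives in the deterministic transform rather than in the parameters the sampler sees, which often helps the geometry.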

    • “A helpful reparameterization can be to model the differences”.

      This little nugget has done more to improve my time series models than anything else I’ve read online or in the Stan docs. Thanks, Bob!

  3. I like Stan a lot, but the biggest issue I’ve had with time series modelling in Stan is that it can be hard to enforce stationarity for more complicated types of models. The Stan documentation has a good example for AR series, but for a VAR with many variables it can be much more challenging.
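
    For reference, the AR(1) case handled in the manual comes down to a constrained declaration of the coefficient; a minimal sketch along those lines:

        data {
          int<lower=2> T;
          vector[T] y;
        }
        parameters {
          real alpha;                        // intercept
          real<lower=-1, upper=1> phi;       // AR(1) coefficient, constrained to the stationary region
          real<lower=0> sigma;               // innovation sd
        }
        model {
          alpha ~ normal(0, 1);
          sigma ~ normal(0, 1);
          for (t in 2:T)
            y[t] ~ normal(alpha + phi * y[t - 1], sigma);
        }

    For a VAR the analogous condition involves the eigenvalues of the companion matrix, which is why there is no comparably simple box constraint.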

    • We’ve had a lot of discussion about this with Andrew. He’d said at one point that he doesn’t like hard constraints on stationarity because the data may not be consistent with stationarity. What then happens is that probability mass that is trying to get beyond the constraint boundaries piles up near them, causing both statistical problems (it shifts posterior mean estimates) and computational problems (unconstrained values run off to infinity). I found an interesting discussion with historical references (I didn’t even know ResearchGate had message boards!) that covers a lot of the same ground. This is all way beyond my understanding of time series, though.

      We don’t have an easy way of enforcing constraints on complex roots or eigenvalues. I don’t know if there’s a way to reparameterize to build up a stable set of coefficients.

      Ben Goodrich has been working on building constrained time series parameterizations, with the necessary Jacobians, directly into Stan. I’m not sure what the scope of that effort is, but these models may get easier to fit going forward, at least in some cases.

      • I appreciate the reply, Bob.

        In the frequentist community, a common strategy is to difference I(n) variables until they are stationary. So for instance, if you have a VAR(p) where each series is I(1), then estimating it with OLS can lead to issues with inference (for the same reason we use Dickey-Fuller unit-root tests instead of t-tests). If you difference each series to be I(0), then the stationarity issues often go away (there may be other reasons why this model does not forecast well; this is where VECM models come in). It’s been some time since I’ve fit a Bayesian VAR, but my recollection is that if it is a VAR of a small number of I(0) variables with just a few lags, then stationarity is not a big issue. However, as the number of variables or lags increases, it increasingly becomes an issue, because when you simulate the coefficients there might be some simulated sets of coefficients that are non-stationary. This means you have to check all of them for stationarity and throw out the ones that will produce useless forecasts. (Differencing itself is easy to push into Stan’s transformed data block; see the sketch at the end of this comment.)

        I think I had mentioned at a Stan meetup recently that I tried figuring out how to simulate stationary coefficients, and the paper made my head spin. Honestly, if there were something like a Bountysource for Stan that could send some money Ben Goodrich’s way, I would be interested.
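
        As a sketch of the differencing route (all names made up), the first differences can be built in transformed data and a simple VAR(1) fit to them:

            data {
              int<lower=3> T;                // time points in levels
              int<lower=1> K;                // number of series
              matrix[T, K] y;                // levels, assumed I(1)
            }
            transformed data {
              matrix[T - 1, K] dy;           // first differences, hopefully I(0)
              for (t in 1:(T - 1))
                dy[t] = y[t + 1] - y[t];
            }
            parameters {
              vector[K] c;                   // intercepts
              matrix[K, K] A;                // VAR(1) coefficients on the differences
              vector<lower=0>[K] tau;        // innovation scales
              cholesky_factor_corr[K] L;     // innovation correlations (Cholesky factor)
            }
            model {
              c ~ normal(0, 1);
              to_vector(A) ~ normal(0, 0.5);
              tau ~ normal(0, 1);
              L ~ lkj_corr_cholesky(2);
              for (t in 2:(T - 1))
                dy[t]' ~ multi_normal_cholesky(c + A * dy[t - 1]',
                                               diag_pre_multiply(tau, L));
            }

        A VECM would add an error-correction term in levels, which this sketch deliberately leaves out.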

    • Hi,
      Have a look at https://cran.r-project.org/web/packages/Rlgt/index.html
      The package uses Stan and provides a few models that extend some Exponential Smoothing models (in a state-space, single-source-of-error formulation) with a non-linear trend, a Student-t error distribution, and functions for the error size (heteroscedasticity).
      They are designed to deal with fast-growing series. In tests, e.g. on the M3 data set, they beat Exponential Smoothing, ARIMA, and Theta.

  4. You might want to look into Chronikis, an open-source language for Bayesian time-series modeling that currently focuses on linear state-space models, and compiles to Stan. Here’s the announcement on the Stan mailing list:

    https://discourse.mc-stan.org/t/chronikis-a-language-for-bayesian-time-series-models-that-compiles-to-stan/8410

    Here’s the website:

    http://opensource.adobe.com/Chronikis/

    Chronikis has good support for (quasi-)periodic components of a model. The period can be arbitrarily large and need not even be an integer, and the number of parameters defining the component is independent of the period. Chronikis accomplishes this by approximating the zero-mean Gaussian process defined by a variant of MacKay’s periodic covariance function with an equivalent linear state-space model, via a Fourier series decomposition.

    It’s also worth mentioning that Stan has a gaussian_dlm_obs distribution that computes the log likelihood for a linear state-space model, in the common case that all the matrices defining the model are time-invariant. Chronikis makes use of this.
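
    For anyone curious, a minimal local-level use of gaussian_dlm_obs might look like the following sketch (assuming the (F, G, V, W, m0, C0) argument order and the variables-by-time layout of y described in the Stan functions reference; the names are made up):

        data {
          int<lower=1> T;
          matrix[1, T] y;                     // one series; rows are variables, columns are time points
        }
        transformed data {
          matrix[1, 1] F = [[1]];             // observation matrix
          matrix[1, 1] G = [[1]];             // transition matrix: local level / random walk
          vector[1] m0 = rep_vector(0, 1);    // prior mean of the initial state
          matrix[1, 1] C0 = [[10]];           // prior variance of the initial state
        }
        parameters {
          real<lower=0> sigma_obs;            // measurement noise sd
          real<lower=0> sigma_state;          // state innovation sd
        }
        model {
          matrix[1, 1] V = [[square(sigma_obs)]];   // observation covariance
          matrix[1, 1] W = [[square(sigma_state)]]; // state covariance
          sigma_obs ~ normal(0, 1);
          sigma_state ~ normal(0, 1);
          y ~ gaussian_dlm_obs(F, G, V, W, m0, C0); // Kalman-filter likelihood, states marginalized out
        }

    Because the states are marginalized out by the Kalman filter, only the two scale parameters are sampled, which is the main speed advantage over declaring the state as parameters.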
