The Stan Core Roadmap

Here’s the plan for Stan core development that Bob presented at StanCon last week (that is, back at the end of August 2018):

Part I. Rear-View Mirror

Stan 2.18 Released
Multi-core Processing has Landed!
Multi-Process Parallelism
Map Function
New Built-in Functions
Manuals to HTML
Improved Effective Sample Size
Foreach Loops
Data-qualified Arguments
Bug Fixes and Enhancements
Math Library Enhancements
CmdStan Enhancements

Part II. The Road in Front

GPU Support
GPU Speedup, Cholesky (40+ times)
PDEs, DAEs & Definite Integrals
Tuples (i.e., Product Types)
Ragged Arrays
Lambdas and Function Types
Independent Generated Quants
Adjoint-Jacobian Product Functor
Mass Matrix/Step Size Init
Variadic Functions, not Packing

Part III. The Longer Road

Faster Compile Times
Blockless Stan Language
Blockless Linear Regression
Non-Centered Normal Module
Protocol Buffer I/O
Logging Standards

We have lots more plans, but these are the specific items on the agenda for the Stan language.

By the time this post appears, we should be able to report on which items in the list above have already been done, what else has been done, and what’s planned next.

P.S. Follow the link and you can watch all the talks from StanCon.

5 thoughts on “The Stan Core Roadmap”

  1. This is a great summary of recent changes. I really need to get around to watching more of those videos…

    I have much more experience with RStan than with any other interface. I know that Stan has a CmdStan interface that you could in principle call from all kinds of different programming languages. However, I understand there is some kind of overhead here. Is it possible to use Stan from C++ alone, with no additional overhead?

  2. CmdStan basically calls the C++ with no overhead other than the I/O required to get data in and draws out. So while you can run directly in C++ (reading the CmdStan code’s the easiest way to see how to do that), there’s really nothing to be gained unless you need to interface with other C++.

    The overhead caused by RStan vs. CmdStan is minimal until you start running up against memory limitations. I’m not sure if PyStan is measurably slower than CmdStan or not.

    • I was thinking of the I/O as the overhead, but it’s been a while since I’ve thought about CmdStan (after posting I looked on the groups and remembered I had asked about it 2-3 years ago, so to some extent I haven’t learned much in that time…).

      My typical workflow in RStan (the same as most other people’s, I imagine) is to load some data into memory, process it, call the stan function on it, and then do whatever analysis I need afterward. My understanding is that if I were doing all this in C++ with CmdStan, then after processing the data I would need to save it to a file, call CmdStan on it, and then use I/O again to get the draws out for analysis. My concern is mostly with the first step: what if I have a big data set, say bigger than 100MB in memory in R, and want to run Stan on it? Or what if I want to do a lot of Stan runs on parts of that large data set? The I/O slowdown becomes more of a negative the bigger the data set or the more times you need to run the sampler.
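To make the output half of that round trip concrete, here is a minimal sketch in Python of reading draws back from a CmdStan-style CSV file. It assumes the usual layout (comment lines starting with `#`, one header row, then one row per draw); the function name `read_stan_csv` and the sample text are hypothetical, made up for illustration, and this is not code from CmdStan or any Stan interface.

```python
import csv
import io


def read_stan_csv(text):
    """Parse CmdStan-style CSV text into {column_name: [float, ...]}.

    Hypothetical helper: assumes '#' comment lines, one header row,
    and purely numeric draw rows.
    """
    # Drop comment lines and blank lines; comments may appear between draws.
    rows = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.reader(io.StringIO("\n".join(rows)))
    header = next(reader)
    draws = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            draws[name].append(float(value))
    return draws


# Made-up sample output, not the result of a real run.
sample = """# model = example_model (hypothetical)
lp__,accept_stat__,mu
-7.2,0.98,1.5
# Adaptation terminated
-6.9,0.91,1.1
"""

draws = read_stan_csv(sample)
mu_mean = sum(draws["mu"]) / len(draws["mu"])
```

The parsing itself is a single pass over the text, so the cost the comment worries about comes mainly from writing and re-reading large files on every run rather than from anything clever in the reader.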

  3. We’ve added a slick new location/scale transform not on the roadmap. And the transpiler is nearly complete in OCaml, which will open the road to the new language changes (plus compiler optimizations and lint [pedantic] mode).

    GPU is plugging along and should be out in 2.19 (because 2.19 is essentially coming out when GPU is ready). There’s progress on the definite integrator and on the algebraic solver (the latter not listed). Mass matrix init is done in CmdStan and standalone generated quantities are nearly done.
