
Parallel in Stan

by Andrew Gelman and Bob Carpenter

We’ve been talking about the many ways that parallel computing is used, or could be used, in Stan. Here are a few:

– Multiple chains (Stan runs 4 or 8 on my laptop automatically)

– Hessians scale linearly in computation with dimension and are super useful. And we now have a fully vetted forward mode other than for ODEs.

– EP (data partitioning)

– Running many parallel chains, stopping perhaps before convergence, and weighting them using stacking (Yuling and I are working on a paper on this)

– Bob’s idea of using many parallel chains spawned off an optimization, as a way to locate the typical set during warmup

– Generic MPI for multicore, in-box and out-of-box, for parallel density evaluation

– Multithreading for parallel forward and backward time exploration in HMC

– Multithreading parallel density evaluation

– GPU kernelization of sequence operations

– Multithreading for multiple outcomes in density functions

– Then there’s all the SSE optimization down at the CPU level for pipelining.
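Several of the items above (the MPI and multithreading bullets) come down to the same pattern: split the data into shards, evaluate each shard’s log density in parallel, and sum. Here’s a minimal sketch in Python using `multiprocessing` — the Normal(mu, 1) model and the sharding scheme are illustrative assumptions, not Stan’s actual implementation:

```python
import math
from multiprocessing import Pool

def shard_log_density(args):
    # Log density of one data shard under a Normal(mu, 1) model.
    mu, shard = args
    return sum(-0.5 * (y - mu) ** 2 - 0.5 * math.log(2 * math.pi)
               for y in shard)

def parallel_log_density(mu, data, n_shards=4):
    # Partition the data into shards, evaluate each shard in its own
    # worker process, and sum the per-shard log densities.
    shards = [data[i::n_shards] for i in range(n_shards)]
    with Pool(n_shards) as pool:
        parts = pool.map(shard_log_density, [(mu, s) for s in shards])
    return sum(parts)

if __name__ == "__main__":
    data = [0.1, -0.3, 0.5, 1.2, -0.7, 0.0, 0.9, -1.1]
    print(parallel_log_density(0.0, data))
```

The sum of per-shard log densities equals the full-data log density, so the parallel and serial versions agree; the payoff is that each worker only touches its own shard.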
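On the Hessian item: a full Hessian can be assembled from D directional derivatives of the gradient, one per coordinate, and those D passes are embarrassingly parallel. Here’s a finite-difference sketch of that structure — finite differences stand in for Stan’s forward-over-reverse autodiff, which I’m only assuming here, not showing:

```python
def grad(f, x, h=1e-5):
    # Central-difference gradient (a stand-in for reverse-mode autodiff).
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def hessian(f, x, h=1e-4):
    # Column i of the Hessian is the directional derivative of the
    # gradient in direction e_i: D passes total, each costing about one
    # gradient evaluation, and each independent of the others.
    n = len(x)
    H = []
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        gp, gm = grad(f, xp), grad(f, xm)
        H.append([(gp[j] - gm[j]) / (2 * h) for j in range(n)])
    return H

if __name__ == "__main__":
    f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]
    # Exact Hessian of f is [[2, 3], [3, 0]].
    print(hessian(f, [1.0, 2.0]))
```

This is the sense in which the cost scales linearly with dimension: linearly in the number of gradient-cost passes, even though the Hessian itself has D(D+1)/2 unique entries to store.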

P.S. Thanks to Zad for the above image demonstrating parallelism.


  1. Harlan says:

    Say I have a nice-looking model written in Stan. What is the easiest way to do a simulation study to investigate the operating characteristics of said model? Can I run Stan in parallel, once for every different simulated dataset?

  2. Eric says:

    I heard recently on a C++ podcast about an NVidia compiler that can convert C++17 code into parallel code that runs on a GPU. Is this related to the “GPU kernelization of sequence operations” mentioned?

  3. Joshua Pritikin says:

    “Hessians scale linearly in computation with dimension” — How is this true? I mean, at a minimum, a Hessian of dimension D has D(D+1)/2 unique entries.
