Superfast not because of faster convergence but because they use a clever acceptance/rejection trick so that most of the time they don’t have to evaluate the entire target density. It’s written in terms of single-step Metropolis but I think it should be possible to do it in HMC or Nuts, in which case we could try it out in Stan.

From the abstract:

The idea behind the generic acceleration is to divide the acceptance step into several parts, aiming at a major reduction in computing time that outranks the corresponding reduction in acceptance probability. The division decomposes the “prior x likelihood” term into a product such that some of its components are much cheaper to compute than others. Each of the components can be sequentially compared with a uniform variate, the first rejection signalling that the proposed value is considered no further, This approach can in turn be accelerated as part of a prefetching algorithm taking advantage of the parallel abilities of the computer at hand. We illustrate those accelerating features on a series of toy and realistic examples.

What the paper really needs, though, is a “money shot,” a graph showing the potentially huge gains attainable in some settings. My impression from talking with Christian is that this new algorithm could be much faster than what we’re doing now, but in the paper they just show gains of up to a factor of 5 or so. If they do things right, they could get gains in wall time per effective sample size of a factor of 10 or 100 or more, no?

P.S. More from X here.

Super cool! I would expect HMC/NUTS not to benefit as much because it doesn’t speed up gradient stuff, but still very useful.