This post is from Bob.
Sabbatical visitors
If you work in computational stats or ML (or even in other branches of applied math) and have a sabbatical coming up and would like to spend it at the Center for Computational Mathematics, which is part of the Flatiron Institute in NYC, please drop me a line:
Pre-faculty visitors
Some of our postdoc applicants have wound up getting faculty offers, at which point we can make them visiting researcher offers at better than postdoc salaries for a year. We have great computing resources, great physical space, and a wonderful set of colleagues across a range of scientific computing areas of interest.
What we did last year
Last year, Nawaf Bou-Rabee was on sabbatical here and Sifan Liu was here as a pre-faculty visitor. Four different papers about which I’m excited came out of this collaboration and we really feel like we’re just getting warmed up, if I may be permitted a pun.
- Gibbs self-tuning (GIST) for dynamic HMC, co-authored with Nawaf and Milo Marsden.
- GIST for step size adaptation, co-authored with Nawaf, Milo, and Tore Kleppe.
- WALNUTS for local step size tuning, with Nawaf, Sifan, and Tore.
- No-underrun sampler (NURS), which is a gradient-free, implementable version of the hit-and-run sampler, with Nawaf, Sifan, and Stefan Oberdörster.
Nawaf stayed on as a part-time visiting researcher, and with Tore and Sifan, we’ve turned our attention to mass matrix adaptation.
Sifan has left us for Duke, where I have zero doubt she’ll be hugely successful. She has the same kind of research X-ray vision that I last experienced working with Matt Hoffman. I frankly couldn’t keep up. These projects with us were just a fraction of what she worked on while here. She also collaborated with a bunch of new people locally and came up with a couple of novel normalizing flow implementations that connect to quasi-Monte Carlo. I can’t wait to see what she does next.
Going forward: flows and diffusions
This year, Luhuan Wu is here as a pre-faculty visitor before she heads off to a faculty position at Johns Hopkins in Applied Math and Statistics. Most of our new ML postdocs this year have worked on diffusion models and normalizing flows (this includes Luhuan, Mark Goldstein, and Louis Grenioux), as have many of our research scientists and previous postdocs (though a group of three postdocs who collaborated on a project that combined a diffusion model with HMC to denoise galactic dust in order to measure the cosmic microwave background all got jobs and left). I hope to spend at least half of my time going forward, starting in January, working on normalizing flows and diffusions with the goal of developing a practical tool that applied statisticians can use. What got me super excited about this was Justin Domke’s five-month leave of absence here: his work with Abhinav Agrawal on normalizing flows actually really works robustly, in many hard cases better than Stan, though it takes a bajillion flops, for which you need a good GPU. If you’re interested in this project as a visitor here, please let me know!
To anyone who’s reading this and might be interested, let me just echo Bob that Flatiron is great. I go there one day a week and work with Bob and his colleagues, and lots of interesting conversations are happening all over.
> working on normalizing flows and diffusions with the goal of developing a practical tool that applied statisticians can use
What do you think is missing from the existing stuff? I think diffrax was the last thing I used. I have a vague sense that there were sharp edges, but maybe that was me just being unfamiliar with what I was doing. Looking at it again, it looks quite nice.
Iirc the flows I tried ended up working like the box art said they would, which was kinda surprising to me — my expectation is these fancy algo things end up being super finicky. So I always have them in the back of my mind to play with again.
I looked up diffrax and it appears to be an ODE solver library in JAX. I’m talking about normalizing flows and diffusions used for variational inference, which doesn’t involve any ODE solving per se. Of course, our applied models often involve ODEs, for example in pharmacokinetic models.
I think a number of things need to happen before a typical Stan user could effectively use normalizing flows. First, it needs to be a bit easier to write models. PyMC and NumPyro get you part of the way there, but I’m not super keen on either system’s syntax and they’re both up to their necks in Python programming. Second, the tools themselves need to be more automated. Most of them are bristling with control parameters, and I think it’s going to require some experimentation to see what works. Third, I think we need to run some large-scale evaluations to convince statisticians these things are robust and to understand where they are not. Fourth, posterior analysis tooling needs to be integrated along with suggestions for workflow. Fifth, they often require hardware beyond what our current users have, but I’m not going to be able to solve that problem.
It was nice to see that the computing resources are indeed impressive:
https://www.simonsfoundation.org/flatiron/scientific-computing-core/
“Powering the work of the Flatiron Institute is a high-performance computer cluster designed, deployed and maintained by SCC. It comprises 180,000 cores, 600 GPUs, and 60 petabytes of raw storage.”
What we have is great for an academic institution, but chump change for big tech.
Our Scientific Compute Core (SCC) keeps it built out to the point where it’s easy to get access without waiting. Our postdocs can easily burn 10K H100 hours for a project and often an order of magnitude more than that for bigger projects. That’s really good for an academic institution, but chump change for big tech. For comparison, an hour on AWS with an H100 GPU of the kind we have costs on the order of US$5.