Postdocs in probabilistic modeling! With David Blei! And Stan!

David Blei writes:

I have two postdoc openings for basic research in probabilistic modeling.

The thrusts are (a) scalable inference and (b) model checking. We
will be developing new methods and implementing them in probabilistic
programming systems. I am open to applicants interested in many kinds
of applications and from any field.

“Scalable inference” means black-box VB and related ideas, and “probabilistic programming systems” means Stan! (You might be familiar with Stan as an implementation of NUTS for posterior sampling, but Stan is also an efficient program for computing probability densities and their gradients, and as such is an ideal platform for developing scalable implementations of variational inference and related algorithms.)
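To make “black-box VB” concrete: the score-function gradient of the ELBO needs only log-density evaluations of the model, no model-specific derivations. Here is a minimal sketch (not Stan's implementation) on a made-up conjugate toy model, one Gaussian observation with a Gaussian prior, where the exact posterior is N(1, 0.5), so the fitted variational mean can be checked:

```python
import numpy as np

rng = np.random.default_rng(0)

x_obs = 2.0   # single observed data point
s = 0.7       # std. dev. of q is held fixed; only the mean m is learned

def log_joint(z):
    # log p(z) + log p(x | z) for z ~ N(0,1), x ~ N(z,1), up to constants
    return -0.5 * z**2 - 0.5 * (x_obs - z)**2

def log_q(z, m):
    # log q(z; m) up to constants
    return -0.5 * ((z - m) / s)**2

m = -3.0   # deliberately poor initialization
for t in range(2000):
    z = rng.normal(m, s, size=200)       # Monte Carlo draws from q
    score = (z - m) / s**2               # d/dm log q(z; m)
    f = log_joint(z) - log_q(z, m)
    grad = np.mean(score * f)            # score-function ELBO gradient
    m += 0.05 / (1.0 + 0.01 * t) * grad  # Robbins-Monro step size
# exact posterior is N(1, 0.5), so m should end up near 1
```

Because the loop touches log_joint only as a black box, swapping in any other model (for instance, one whose density and gradient Stan computes) leaves the estimator unchanged; that is the sense in which the method is “black box.”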

And you know I like model checking.

Here’s the full ad:

===== POSTDOC POSITIONS IN PROBABILISTIC MODELING =====

We expect to have two postdoctoral positions available for January 2014 (or later). These positions are in David Blei’s research group in the Computer Science Department at Princeton University. They are one-year positions with likely renewal to two years. They are for doing basic research in probabilistic modeling.

We will have two main research thrusts:

(a) Developing new scalable methods of approximate posterior
inference. We are interested in developing generic variational
methods for massive data sets and streaming data sets. For example,
see our recent work on stochastic variational inference (Hoffman et
al., 2013) and nonconjugate variational inference (Wang and Blei,
2013).
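The stochastic variational inference recipe of Hoffman et al. (2013) can be sketched in a few lines for a conjugate toy model: form a noisy natural-parameter estimate from a minibatch scaled up to the full data size, then average it in with a decaying step size. This is an illustrative reduction, not the paper's algorithm verbatim; the model (Gaussian mean with known unit variance, N(0,1) prior) and all constants are invented, and conjugacy lets us check the answer exactly:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 10_000
data = rng.normal(1.5, 1.0, size=N)   # y_i ~ N(mu, 1), true mu = 1.5

# Exact conjugate posterior for mu under prior N(0, 1):
#   precision = 1 + N, mean = sum(y) / (1 + N)
exact_mean = data.sum() / (1 + N)

# SVI: noisy natural-gradient steps using minibatches of size S.
S = 100
eta = 0.0                                  # natural-parameter sum term
for t in range(500):
    batch = rng.choice(data, size=S, replace=False)
    eta_hat = (N / S) * batch.sum()        # minibatch scaled to full data
    rho = (t + 10) ** -0.7                 # decaying step size rho_t
    eta = (1 - rho) * eta + rho * eta_hat  # natural-gradient update
svi_mean = eta / (1 + N)
# svi_mean should agree closely with exact_mean
```

The key trick is that the noisy estimate eta_hat is unbiased for the full-data sufficient statistic, so each iteration costs O(S) rather than O(N) while the iterates still converge to the exact posterior's natural parameters.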

(b) Developing new methods for calculating model fitness and new ways
of diagnosing model misfit. We are interested in developing modern
methods related to predictive sample re-use (Geisser, 1975) and
posterior predictive checks (Rubin, 1984; Meng, 1994; Gelman et al.
1996).
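A posterior predictive check in the sense of Rubin (1984) and Gelman et al. (1996) can be sketched in a few lines: draw replicated datasets from the posterior predictive distribution and ask whether a discrepancy statistic T(y) looks extreme against the T(y_rep) distribution. The toy model below is deliberately misspecified (the data's true scale is 3 while the model assumes 1), so the check should fire; the model and numbers are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Observed data secretly has sigma = 3, but the model assumes sigma = 1.
y = rng.normal(0.0, 3.0, size=50)
n = len(y)

# Conjugate posterior for mu under y_i ~ N(mu, 1), prior mu ~ N(0, 10^2)
post_var = 1.0 / (n + 1.0 / 100.0)
post_mean = post_var * y.sum()

def T(data):
    # discrepancy statistic: sample standard deviation
    return data.std(ddof=1)

# Draw replicated datasets from the posterior predictive and compare
reps = []
for _ in range(1000):
    mu = rng.normal(post_mean, np.sqrt(post_var))
    y_rep = rng.normal(mu, 1.0, size=n)
    reps.append(T(y_rep))
p_value = np.mean(np.array(reps) >= T(y))  # posterior predictive p-value
# an extreme p-value flags the misspecified sigma
```

Here the replicated standard deviations cluster near 1 while the observed one is near 3, so the p-value is essentially zero, correctly diagnosing the misfit.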

We will implement these ideas in modern probabilistic programming
systems and exercise them in several problem domains. (Though the
work is for general methodological research, we encourage applicants
who are already interested in specific applied problems.) More
broadly, our goal is to tighten the probabilistic modeling
pipeline—posit a model, estimate a posterior, check the model,
revise the model—in the service of scientific and technological
applications.

QUALIFICATIONS

Applicants should have a PhD and experience with applied probabilistic
modeling. Our research will be in statistical machine learning, but
we will happily consider applicants from fields outside of Computer
Science and Statistics (e.g., Physics, Biology, Social Sciences, and
Astronomy).

TO APPLY

Send your CV to [email protected] and arrange to have two
letters of reference sent to the same address. Optionally, you may
also include a research statement.

7 thoughts on “Postdocs in probabilistic modeling! With David Blei! And Stan!”

  1. Not yet. I’m going to write it all up after Stan 2.0 is out.

    For now, the forward mode is incomplete in that the probability functions aren’t yet implemented. It’ll take some more template metaprogramming on OperandsAndPartials to get it to fly, but Daniel Lee and I have the design down.

    For figuring out how auto-diff works, look at the agrad/rev and agrad/fwd unit tests. There are examples for almost every function there, and the set of functions available is documented in the Stan manual. The C++ built-in operators, like += and its ilk, are covered too.

    For a fairly complete set of usage examples embedded in C++ functors, look at the develop branch file src/stan/agrad/autodiff.hpp. It shows how to do gradients, Jacobians, Hessians, Hessian-vector products, gradient-vector products (i.e., directional derivatives), and some more complicated things we need for RMHMC (like the gradient of the trace of a matrix-Hessian product).
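    For readers who want the idea behind agrad/fwd without reading templated C++: forward-mode auto-diff carries a value and its derivative together and propagates both through every operation. The following is a toy Python analogue of that idea, not Stan code; the class and function names are invented for illustration:

```python
import math

class Dual:
    """Minimal forward-mode autodiff value: tracks f and df together."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __mul__(self, o):
        # product rule: (uv)' = u'v + uv'
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

def exp(x):
    # chain rule: (e^u)' = e^u * u'
    return Dual(math.exp(x.val), math.exp(x.val) * x.der)

# d/dx [x * exp(x) + 3x] at x = 1 is exp(1) * (1 + 1) + 3
x = Dual(1.0, 1.0)       # seed derivative = 1 for the input variable
y = x * exp(x) + 3 * x   # y.val is the value, y.der the derivative
```

    Reverse mode (agrad/rev) instead records the expression graph and sweeps backward, which is why one pass yields the gradient with respect to all inputs at once; that is the mode Stan's samplers use for log-density gradients.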

    • Eventually — we promised to build one as part of our NSF grant.

      We’re working on Python first.

      If someone wants to volunteer to integrate with MATLAB, that’d be great. There are two ways to do it—calling Stan as an external process and integrating in-memory with MATLAB. We took the latter approach with Python and R, but it’s much easier to do the former.

      • One reason to consider doing the MATLAB interface out-of-process is that it would let people who use Octave and other MATLAB-esque language implementations use it. Since R and Python are open source with open reference implementations, in-process integration makes sense in those contexts, but the closed nature of MATLAB gives more incentive to keep the Stan side out-of-process so that it can be used beyond the commercial MATLAB implementation.

  2. “Stan is also an efficient program for computing probability densities and their gradients”

    I’m currently writing a C++ library for large-scale reversible jump MCMC applications (they will run inside Nested Sampling too, to get over the first order phase transitions). I would love to use Stan so that I can program some distributions without having to do all the gradients too, but I’m not smart enough to understand the Stan code. :(
