## GPstuff: Bayesian Modeling with Gaussian Processes

I think it’s part of my duty as a blogger to intersperse, along with the steady flow of jokes, rants, and literary criticism, some material that will actually be useful to you.

So here goes.

Jarno Vanhatalo, Jaakko Riihimäki, Jouni Hartikainen, Pasi Jylänki, Ville Tolvanen, and Aki Vehtari write:

The GPstuff toolbox is a versatile collection of Gaussian process models and computational tools required for Bayesian inference. The tools include, among others, various inference methods, sparse approximations and model assessment methods.

We can actually now fit Gaussian processes in Stan. But for big problems (or even moderately-sized problems), full Bayes can be slow. GPstuff uses EP, which is faster. At some point we’d like to implement EP in Stan. (Right now we’re working with Dave Blei to implement VB.)

GPstuff really works. I saw Aki use it to fit a nonparametric version of the Bangladesh well-switching example in ARM. He was sitting in his office and just whipped up the model and fit it.

1. David says:

just when I start learning about GP and realize it could solve some of my problems, you blog on that useful GP code. That’s a welcome coincidence (for me). Thanks a lot !!! Let’s see if it can cope with 3 millions data points !

2. Also of note, I’ve been working on improving (ie, speeding up) GP models in Stan. Hopeful this will be finished for the next release.

3. For those of you like me who didn’t know what EP or VB stood for, EP is Expectation Propagation http://research.microsoft.com/en-us/um/people/minka/papers/ep/ and VB is Variational Bayesian methods. http://en.wikipedia.org/wiki/Variational_Bayesian_methods

• Nick Cox says:

This is helpful. VB triggered with me “Visual Basic”, and then “surely not…”. (Not that I use Visual Basic, naturally.)

• Yes, this was the only thing that I could think of for “VB” too, and EP made me think of “EM” or Expectation Maximization, which is related. anyway, glad I’m not the only reader of this blog that doesn’t have the technical abbreviations entirely OTTOMT.

• Andrew says:

We have sections on VB and EP in BDA3.

• For someone who professes to hate acronyms, Andrew sure uses a lot of them.

“BDA3” is Bayesian Data Analysis, 3rd Edition, but as of yet, it’s not publicly available.

On the other hand, the standard machine learning texts all discuss both VB and EP: (1) Kevin Murphy’s new book, (2) David MacKay’s old book (available free from the author), and (3) Chris Bishop’s book.

• Andrew says:

Those sources are all great, but our treatment in BDA3 should be particularly clear for readers with a statistical background (I hope).

• We’re particularly interested in two recent developments for variational methods.

One development is Wang and Blei’s approach to nonconjugate models. Usually variational methods are restricted to models where certain conditional expectations are easy to compute. This new development will let us exploit the full power of Stan’s modeling language:

Chong Wang, David M. Blei. 2012–2013. Variational Inference in Nonconjugate Models.
http://arxiv.org/abs/1209.4360

The other development is stochastic variational inference by Matt Hoffman (who you may know as the inventor of NUTS), with help from Wang, Blei and Paisley. Stochastic variational inference is like other stochastic optimization in that it lets you fit models with small batches of data in memory at any one time.

Matt Hoffman, David M. Blei, Chong Wang, John Paisley. 2012–2013. Stochastic Variational Inference.
http://arxiv.org/abs/1206.7051

• David J. Harris says:

I’m really excited about this. Would Stan’s existing code support Gaussian random fields (i.e. Gaussian processes with multiple response variables)? If not, do you have any plans to implement them?

Thanks for making such a great piece of software

• Marcus says:

I’m not entirely sure what you mean by a Gaussian random field but there’s a fair chance Stan already does supports it in one form or another but it may be impractically slow. If you can express the the log posterior then you can run the model in Stan. If you’d like to see it better supported, post the details (a paper reference would suffice) here or on one of the stan lists (or you can email me directly) and I’ll see if I can work on it.

• David J. Harris says:

Awesome, I’ll see if I can make it work when I have a spare moment.

If you’re interested, definition 2.1.1 and section 2.2.2 of this pdf from Jonathan Taylor may be enough to clear things up. It’s literally just like a GP except that you’re trying to predict a vector instead of a scalar.

Also, thanks for offering your support–I’ll email you if I run into any trouble.

• Marcus says:

Ah, ok. Yes, that’s definitely implementable in Stan and there will be significantly better support for it (ie, faster and easier to code) in the next release (or maybe two releases, depending on how much time I have and when the next release happens).

• David J. Harris says:

Amazing. Thanks!

• Sam Gershman says:

You might also want to look at the inference algorithm developed by me, Dave Blei and Matt Hoffman for inference in nonconjugate models:
http://web.mit.edu/sjgershm/www/GershmanHoffmanBlei12.pdf
It’s very flexible and can be used basically for any model you can apply NUTS to. We’ve released a Matlab implementation which you can get here: http://web.mit.edu/sjgershm/www/npv.v1.zip

4. nottrampis says: