Fränzi and Tobias’s book is now real:

Fränzi Korner-Nievergelt, Tobias Roth, Stefanie von Felten, Jérôme Guélat, Bettina Almasi, and Pius Korner-Nievergelt (2015) *Bayesian Data Analysis in Ecology Using Linear Models with R, BUGS, and Stan*. Academic Press.

This is based in part on the in-person tutorials that they and the other authors have been giving on statistical modeling for ecology.

The book starts at the beginning with an introduction to R, regression, and ANOVA, discusses maximum likelihood estimation, moves on to generalized linear models including “mixed effects” models, then proceeds to Bayesian modeling with MCMC computation for inference, and winds up with some case studies involving BUGS and Stan. Everything works up from simple “hello world” type programs through real examples, which is something I really appreciate in computational texts.

Stan is primarily showcased in three fully worked-out examples (which I also really appreciate as a reader), all of which appear in Chapter 14, “Advanced Ecological Models”:

(14.2) zero-inflated Poisson mixed model for analyzing breeding success,

(14.3) occupancy model to measure species distribution, and

(14.5) analyzing survival based on mark-recapture data.

So, do the owls represent the prior, the groundhogs the likelihood, and the bird the posterior? That’s a strong prior!

Bob, have you also seen that Doing Bayesian Data Analysis, 2nd Edition, by Kruschke 2014, has a whole chapter (17 pages) on Stan?

Yes, I have — we’ve been mentioning it in the “broader impacts” sections of our talks and grant proposals! I should blog about that one, too, but I only have the contents, not the book itself.

I want to point readers of this blog to a phenomenal new book that will come out soon, by Richard McElreath:

http://xcelab.net/rm/statistical-rethinking/

This is going to be a very important book for teaching non-statisticians Bayesian methods. It is also a study in good writing, none of that vapid PG Wodehouse humor that people sometimes attempt when explaining statistics to non-statisticians.

Thanks for the pointer. It’s another book using Stan and R!

There are a couple of sample chapters available from the website. As I said in the main body of this post, I very much like books that are organized around working code and simulation to illustrate theory. I don’t know what McElreath means by “maximum entropy” in “The book teaches generalized linear multilevel modeling (GLMMs) from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy.” I’m hardly a philosopher of statistics. [update: Though now I’m taking a guess based on the Wikipedia page on max ent distributions. I had no idea the normal distribution had maximum entropy among distributions with a fixed mean and standard deviation!]
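A minimal sketch of that maximum entropy fact, assuming the standard closed-form differential entropies (in nats) for three families matched to the same standard deviation. The function names are made up for this illustration, and checking three families is only a sanity check, not a proof.

```python
import math

def entropy_normal(sd):
    # Differential entropy of a normal: 0.5 * log(2 * pi * e * sd^2)
    return 0.5 * math.log(2 * math.pi * math.e * sd ** 2)

def entropy_laplace(sd):
    # A Laplace with scale b has variance 2 * b^2 and entropy 1 + log(2 * b)
    b = sd / math.sqrt(2)
    return 1 + math.log(2 * b)

def entropy_uniform(sd):
    # A uniform of width w has variance w^2 / 12 and entropy log(w)
    width = sd * math.sqrt(12)
    return math.log(width)

sd = 1.0  # the mean does not affect differential entropy, so only sd matters
entropies = {
    "normal": entropy_normal(sd),
    "laplace": entropy_laplace(sd),
    "uniform": entropy_uniform(sd),
}

# Print from highest to lowest entropy; the normal comes out on top.
for name, h in sorted(entropies.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {h:.4f}")
```

Changing `sd` shifts all three entropies by the same amount, log(sd), so the ordering stays the same.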

I would think the focus on max a posteriori (MAP, i.e., posterior mode) estimation with an emphasis on multilevel modeling would be problematic given that many multilevel models don’t have posterior modes even with priors. I understand a lot of people like to start with posterior mode estimation because it seems simpler and more deterministically procedural than full Bayes via simulation; in the interest of full disclosure, I’ve even done it myself in a recent tutorial.
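Here is a rough sketch of why a posterior mode can fail to exist. It assumes a toy hierarchical model that is not taken from the book: y_j ~ normal(theta_j, sigma_j), theta_j ~ normal(mu, tau), with a half-Cauchy prior on tau. The data values and the prior scale of 5 are made up for illustration. Setting every theta_j equal to mu and shrinking tau toward zero makes the joint log density grow without bound, even though the prior on tau is proper.

```python
import math

def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z

def half_cauchy_logpdf(x, scale):
    # Density on x >= 0: 2 / (pi * scale * (1 + (x / scale)^2))
    return math.log(2 / (math.pi * scale)) - math.log1p((x / scale) ** 2)

def log_joint(theta, mu, tau, y, sigma):
    # Log of prior(tau) * prod_j normal(theta_j | mu, tau) * normal(y_j | theta_j, sigma_j)
    # (a flat prior on mu is assumed, so it contributes nothing here)
    lp = half_cauchy_logpdf(tau, 5.0)
    for th, yj, sj in zip(theta, y, sigma):
        lp += normal_logpdf(th, mu, tau)
        lp += normal_logpdf(yj, th, sj)
    return lp

# Made-up toy data, purely for illustration.
y = [1.0, -0.5, 2.0, 0.3]
sigma = [1.0, 1.0, 1.0, 1.0]

mu = sum(y) / len(y)
theta = [mu] * len(y)  # collapse all group effects onto mu

taus = [1.0, 1e-2, 1e-4, 1e-8]
values = [log_joint(theta, mu, tau, y, sigma) for tau in taus]
for tau, lp in zip(taus, values):
    print(f"tau = {tau:8.0e}   log joint = {lp:10.2f}")
```

The likelihood term stays fixed along this path while the theta_j terms contribute -J * log(tau), so an optimizer chasing the joint mode just runs off toward tau = 0.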

P.S. I have to object to Shravan’s description of P. G. Wodehouse’s humour as vapid! Bertie Wooster was a vapid character, as were some of his fellow upper-class twits, but Bertie was just a foil for Jeeves, who, like the humour, was anything but vapid. (“u” not redacted to preserve the ur-Britishness of the whole Wodehousian enterprise.)

Yes, I have to concede the point re Jeeves. I enjoyed Wodehouse when I was in school, not so much later in life. Maybe I’m just turning into a curmudgeon (well, I know I am). I recently started to re-read Graham Greene, one of my favourites (note spelling) from my distant youth, but it was a no-go from the very start.

Agreed on MAP estimation. MAP estimation is not used for multilevel models. It just appears in the linear models chapters. All the GLMs are done by comparing MAP estimates to full Bayes, so students get a sense of the risks of MAP estimation. It’s a whole lot easier to get people into Bayes without having to identify it with MCMC, I find.

And map2stan is like Stan, but without having to write data and parameter blocks. Also, it does automatic imputation of missing values.

> get people into Bayes without having to identify it with MCMC

I would agree; that seems to be important, as black box dis-understanding spontaneously occurs with MCMC and MC.

I liked the Golems in your online slides, especially the small world of Golems and the [real] big world.

In a webinar I found the distinction between the [real] big world and the analyst’s modelling of it [Golems world] poorly grasped and hard to get across (some material here – https://phaneron0.wordpress.com/2012/11/23/two-stage-quincunx-2/ )

I was also worried about the MAP, but as it’s just a ladder for folks to climb up quickly and kick aside, no problem.

Thanks, Keith. Really my goal is to deflate the reader’s expectations for what statistics, of any paradigm, can do. I think there are far too many incentives to exaggerate the power of our methods, as Andrew often comments.

Additionally, there are far too many methods, in general, I think.

People spend too much time coming up with esoteric methods of marginal improvement in some cherry-picked region of parameter space but too little time validating / benchmarking / fine-tuning current methods or guiding users on which method to use where.

I think we have too much method-space fragmentation. Somewhat like Linux distros.

> esoteric methods of marginal improvement in some cherry-picked region of parameter space

Those are the only things that one can publish (using standard methods, no matter how appropriate, “does not involve enough technical development to merit publication in our journal”).

> far too many incentives to exaggerate the power of our methods

Totally agree and perhaps especially so in introductory courses.

(Once, with a group of highly published statisticians in a SAMSI research working group I was herding, I suggested the use of “less wrong” wording rather than better/best/optimal wording in the minutes/reports. In discussion, they agreed the motivation was very appropriate but asked that we not do so, as it felt too uncomfortable.)

(Sorry for hijacking this thread.)

For map2stan, is that based on what you had earlier that converted an lme4-style model description to code? And when you say automatic imputation of missing values, how do you do that? If you’re using Stan, it requires the missing values to have a probability model (everything would be uniform by default) and be continuous. Or are you using some standalone (multiple) imputation scheme?

map2stan doesn’t use those lme4 formulas, no. It’s basically a simplified full-Bayes model specification. There are some examples on the GitHub repository:

https://github.com/rmcelreath/rethinking

If you scroll down the README, you’ll see examples of multilevel models and imputation.

Re imputation, if NAs are present in a variable ‘x’ but no distribution is assigned to variable ‘x’, it throws an error and stops. If it finds a distributional assumption, it uses it for imputation in the expected fully Bayesian way.
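As a rough illustration of the general idea (this is not map2stan’s code, and all names here are hypothetical), the sketch below treats each missing ‘x’ as one more unknown in the joint log density. It assumes a normal distribution on x and a simple linear regression of y on x, and it leaves out priors on the other parameters for brevity.

```python
import math

def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z

def fill_missing(x, x_imputed):
    """Replace each None in x, in order, with the next value from x_imputed."""
    n_missing = sum(v is None for v in x)
    if n_missing != len(x_imputed):
        raise ValueError("need exactly one imputed value per missing entry")
    it = iter(x_imputed)
    return [next(it) if v is None else v for v in x]

def log_joint(x_imputed, mu_x, sigma_x, alpha, beta, sigma_y, x, y):
    # The imputed values are arguments, just like the regression parameters.
    filled = fill_missing(x, x_imputed)
    lp = 0.0
    for xi, yi in zip(filled, y):
        # Distributional assumption on x: it covers observed and missing entries alike.
        lp += normal_logpdf(xi, mu_x, sigma_x)
        # Outcome model: y depends on x whether x was observed or imputed.
        lp += normal_logpdf(yi, alpha + beta * xi, sigma_y)
    return lp

# Made-up toy data with one missing predictor value.
x = [0.5, None, 1.5]
y = [1.0, 2.0, 3.0]
lp = log_joint([1.0], mu_x=1.0, sigma_x=1.0, alpha=0.0, beta=2.0,
               sigma_y=1.0, x=x, y=y)
print(lp)
```

A sampler would then explore the imputed values jointly with the other parameters, which is why each missing value needs a distribution and has to be continuous.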

Stan does all of the hard work. It’s awesome.