Applied regression and multilevel modeling books using Stan

Edo Navot writes:

Are there any plans in the works to update your book with Prof. Hill on hierarchical models to a new edition with example code in Stan?

Yes, we are planning to break it up into 2 books and do all the modeling for both books in Stan. It’s waiting on some new functionality we’re building in Stan to do maximum likelihood, penalized maximum likelihood, and maximum marginal likelihood, and also to fit various standard models such as linear and logistic regression automatically.
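
As a rough illustration of how the three estimation targets named above differ, here is a minimal Python sketch. It is not Stan's implementation and not code from the book. It assumes a toy model with one observation per group, y_j ~ Normal(theta_j, sigma) with sigma known, and theta_j ~ Normal(mu, tau). The function names are hypothetical.

```python
from statistics import fmean


def max_likelihood(y):
    """Maximum likelihood: with one observation per group and no pooling,
    the estimate of each theta_j is just the observation y_j."""
    return list(y)


def max_marginal_likelihood(y, sigma):
    """Maximum marginal likelihood for the hyperparameters (mu, tau^2).

    Integrating out theta_j gives y_j ~ Normal(mu, sqrt(sigma^2 + tau^2)),
    so mu_hat is the sample mean and tau^2 is the excess of the (1/n)
    sample variance over sigma^2, truncated at zero.
    """
    mu_hat = fmean(y)
    total_var = fmean([(v - mu_hat) ** 2 for v in y])
    tau2_hat = max(0.0, total_var - sigma ** 2)
    return mu_hat, tau2_hat


def penalized_max_likelihood(y, sigma, mu, tau2):
    """Penalized maximum likelihood for each theta_j.

    Maximizes log Normal(y_j | theta_j, sigma) + log Normal(theta_j | mu, tau),
    which shrinks each y_j toward mu by the factor tau^2 / (tau^2 + sigma^2).
    """
    if tau2 == 0.0:
        return [mu for _ in y]  # complete pooling
    weight = tau2 / (tau2 + sigma ** 2)
    return [mu + weight * (v - mu) for v in y]


if __name__ == "__main__":
    y = [2.0, 4.0, 6.0, 8.0]
    mu_hat, tau2_hat = max_marginal_likelihood(y, sigma=1.0)
    print(max_likelihood(y))
    print(mu_hat, tau2_hat)
    print(penalized_max_likelihood(y, 1.0, mu_hat, tau2_hat))
```

In this toy setting, plugging the marginal-likelihood estimates of mu and tau into the penalized fit gives partially pooled estimates that sit between the raw observations and the overall mean.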

17 thoughts on “Applied regression and multilevel modeling books using Stan”

    • It’ll be at least a year before we can start because Rob Trangucci’s just building out some of Ben Goodrich and Richard McElreath’s prototypes for the Stan functionality now.

      I’ve been urging Andrew to just go full Bayes the whole way — none of the data in Gelman and Hill is so huge you couldn’t fit it with Stan’s existing MCMC. (I believe Andrew’s main reason for wanting max marginal likelihood is speed [and maybe connecting to tradition and other literature, but I’m not sure about that]).

      • Bob – why go full Bayes the whole way?

        To my mind, the book is so useful partly because it gives such a gentle introduction. You can literally give it to a bright undergraduate politics student and they’d be able to teach themselves statistics. Besides, for simple modelling jobs, the gain from going full Bayes is pretty slim (aside from being taught to think in terms of probability models, which certainly clarifies thinking).

        • Jim,

          Why would you think that a bright undergraduate politics student would struggle more with a fully Bayesian exposition than they would with a classical treatment?

          I suspect that if Andrew, Jennifer, and Bob wanted to do this, the result would be just as accessible to your undergraduates as the current text.

          And the gain by going full Bayes… the clarity and consistency of thinking associated with modeling uncertainty with random variables is *huge*.

          That said, I’ve heard claims like this before, that seem to indicate that Bayesian statistics is somehow more difficult to learn than classical statistics. In my experience this is not the case. In fact, some Bayesian concepts are *way* more intuitive than their classical counterparts.

        • JD,

          Agree on your clarity and consistency points. But on the ease of learning, I think you’re being too generous to undergraduate politics students. For a big part of the audience for the book, fundamental concepts for Bayesian stats, like likelihood, marginalisation, invariant distributions etc. are rightly introduced once the reader understands why they’re building models in the first place. And using the lmer examples at the beginning is a great way to illustrate that ‘why’.

          I basically taught myself stats from that book (after having taken several uninspiring undergraduate stats courses). What I found refreshing was that the first half of the book in particular teaches intuition in modelling approaches and causal reasoning, rather than spending too much time talking about the mechanics of estimators.

        • I agree that “some Bayesian concepts are *way* more intuitive than their classical counterparts.” In fact, one common problem in learning frequentist statistics is that many (probably most) students tend to interpret the frequentist concepts in a Bayesian rather than frequentist manner, thus misunderstanding the concepts. My (admittedly limited) experience in introducing Bayesian statistics to students who have some background in frequentist statistics is that they think Bayesian is better, because it’s what they intuitively “want”.

          That being said, there’s still the question of how best to teach Bayesian statistics to e.g., political science undergraduates. Has anyone here used John Kruschke’s book “Doing Bayesian Data Analysis”? I noticed the second edition on the new book shelf a couple of weeks ago. It says (p. 1), “This book is speaking to a person such as a first-year graduate student or advanced undergraduate in the social or biological sciences.” I’d be interested in hearing what users think of it.

        • It’s a big can of worms.

          My guess at Andrew’s approach is that students have had some experience with regression and its purposes (though no real understanding) and he starts with that and builds from it.

          Also an applied understanding of Bayesian statistics probably takes years of study and practice.

          Working through these materials* https://phaneron0.wordpress.com/2012/11/23/two-stage-quincunx-2/ with a number of people with differing backgrounds was and continues to be surprising – there is something non-intuitive about representing and thinking about uncertainty regardless of the simplicity and directness of the representations/methods.

          * Galton’s two-stage quincunx provides an arguably adequate way to represent and show most of what goes on in Bayesian statistics.
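
One common reading of the two-stage quincunx can be sketched as a small simulation. This is an illustrative sketch, not the linked materials' own code. It assumes the first bank of pins plays the role of the prior, the second bank plays the role of the sampling noise, and the balls that land in the observed bottom bin, traced back to their first-stage positions, represent the posterior. The function name and parameters are hypothetical.

```python
import random


def two_stage_quincunx(n_balls, rows_stage1, rows_stage2, observed, seed=0):
    """Drop balls through two banks of pins and keep the first-stage
    positions of the balls whose final position equals `observed`."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_balls):
        # Stage 1: each pin deflects the ball one step left or right ("prior").
        x = sum(rng.choice((-1, 1)) for _ in range(rows_stage1))
        # Stage 2: further deflections added to the stage-1 position ("noise").
        y = x + sum(rng.choice((-1, 1)) for _ in range(rows_stage2))
        # Conditioning: keep only balls that match the observed final bin.
        if y == observed:
            kept.append(x)
    return kept


if __name__ == "__main__":
    kept = two_stage_quincunx(50_000, 10, 10, observed=6)
    print(len(kept), sum(kept) / len(kept))
```

With equal numbers of rows in the two stages, the retained first-stage positions should average about half the observed final position, which is the familiar shrinkage of a posterior mean toward the prior mean.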

  1. Prof. Gelman,
    Wondering if the updated Multilevel book(s?) is nearing completion? I’m a big fan of the first edition, so looking forward to the updated version.

    CC

  2. Professor Gelman,

    Like others, I’m really looking forward to “Advanced Regression and Multilevel Models.” In the meantime, I have been trying to work through “Data Analysis Using Regression and Multilevel/Hierarchical Models,” especially the later chapters on multilevel modeling.

    When I tried to get Bugs set up (I’m already an R user) to actually run some of the models, I followed this link mentioned at the start of Appendix C: http://www.stat.columbia.edu/~gelman/arm/software/ then clicked on “(Occasionally updated) instructions for downloading and using the software (in R and Bugs) we use to fit, plot, understand, and use regression models.” But it looks like the original instructions were removed and now it’s a link to Stan. Were those instructions archived somewhere, possibly? Stan is awesome, but I was still hoping to use Bugs, if possible, to be able to closely follow the textbook example code.

    Thank you,
    Brian

    • Brian:

      I don’t recommend doing anything in Bugs anymore. Here is Stan code for the examples in my book with Jennifer. I can’t vouch for every line of code—I haven’t gone through it myself—but I think most of it is fine. Also we have lots of lmer and glmer, and now I’d do this using stan_glmer.

  3. Hi Andrew,

    Is there an updated timeline for “Advanced Regression and Multilevel Models”? I’m looking forward to its release but would like to read about multilevel modeling sooner rather than later.

    Thanks,
    Joe

    • Joe:

      Thanks for asking. We just finished Active Statistics and then this year we want to finish Bayesian Workflow, then Advanced Regression and Multilevel Models should be next. Best guess is 2024.
