What should Yuling include in his course on statistical computing?

Yuling writes:

I have been creating a new course for a PhD level core class on statistical computing in my department. Due to my personal bias, this course will be largely computing methods for Bayesian inference (or probabilistic ML).

I have created a tentative schedule for this course.

Any suggestions on the topics or reading materials that I should try to cover?

Please leave your suggestions for topics and reading materials, or any other thoughts you have on statistical computing, in the comments.

My suggestion is that Yuling have the students read my article with Aki, “What are the most important statistical ideas of the past 50 years?”, which is not just about computing but could help give students a sense of the motivation behind many of the developments in computational statistics.

1 thought on “What should Yuling include in his course on statistical computing?”

  1. I would love to have the textbook you’re going to write based on this class :-).

    As I told Yuling via email, there’s always the danger of trying to pack too much into a class. All of these algorithms are subtle and just developing intuitions about concentration of measure, how distance behaves in high dimensions, and even what log concavity entails can be challenging. So as a solution, I’m going to suggest even more topics :-).

    If you’re looking for big topics that you missed, I’d include maximum marginal likelihood in continuous cases and then INLA. There is also the issue of surrogates, or whatever you call building a neural network or GP to model your density in the style of simulation-based inference (SBI).

    How much are you hoping to introduce the algorithms and practical methods and how much are you hoping to go into the convergence proofs? The papers are all about the latter, but then there’s a ton of applied math stuff you’d need to cover.

    For autodiff, I think you should include Giles’s extended matrix autodiff paper. That sets up the problem more neatly than I’ve seen elsewhere, including in Charles’s survey. There’s also a really great survey paper from Mohamed et al. on Monte Carlo (stochastic) gradients that covers a lot of what you cover elsewhere.

    For optimization for machine learning, you need to cover Adam and its relatives. I love the Bottou description of SGD, but nobody uses vanilla SGD any more.
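
For readers who have not seen it, the Adam update mentioned in the comment above can be sketched in a few lines. This is a minimal illustration and not code from the course or from the commenter: the function names `adam_minimize` and `quad_grad`, the toy quadratic objective, and the hyperparameter values are all assumptions chosen for the example. It uses the exact gradient of a two-dimensional quadratic rather than a stochastic gradient, just to show the moment estimates and bias correction.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimize a function of a list of floats, given its gradient, with Adam."""
    x = list(x0)
    m = [0.0] * len(x)  # running average of gradients (first moment)
    v = [0.0] * len(x)  # running average of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        for i in range(len(x)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            # Bias correction: both averages start at zero, so rescale early on.
            m_hat = m[i] / (1 - beta1 ** t)
            v_hat = v[i] / (1 - beta2 ** t)
            # Per-coordinate step, scaled by the root of the second moment.
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def quad_grad(x):
    # Gradient of the toy objective f(x) = (x[0] - 3)^2 + 10 * (x[1] + 1)^2,
    # which has its minimum at (3, -1). Hypothetical example, not from the post.
    return [2 * (x[0] - 3), 20 * (x[1] + 1)]

x_opt = adam_minimize(quad_grad, [0.0, 0.0])
print(x_opt)
```

The two coordinates have curvatures that differ by a factor of ten, and the per-coordinate scaling by the second-moment estimate is what lets a single learning rate handle both. Vanilla SGD would correspond to dropping `m`, `v`, and the division, and stepping by `lr * g[i]` directly.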
