Charles Margossian, Aki Vehtari, Daniel Simpson, Raj Agrawal write: Gaussian latent variable models are a key class of Bayesian hierarchical models with applications in many fields. Performing Bayesian inference on such models can be challenging as Markov chain Monte Carlo algorithms struggle with the geometry of the resulting posterior distribution and can be prohibitively slow. […]

**Statistical computing** category.

## Super-duper online matrix derivative calculator vs. the matrix normal (for Stan)

I’m implementing the matrix normal distribution for Stan, which provides a multivariate density for a matrix with covariance factored into row and column covariances. The motivation: A new colleague of mine at Flatiron’s Center for Comp Bio, Jamie Morton, is using the matrix normal to model the ocean biome. A few years ago, folks in […]
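To make "covariance factored into row and column covariances" concrete, here is a minimal sketch in Python with NumPy. It is not Stan's implementation; the function names `matrix_normal_logpdf` and `mvn_logpdf` are made up for illustration. It assumes an n × p matrix X with mean M, an n × n row covariance U, and a p × p column covariance V, and it checks the matrix normal log density against the equivalent multivariate normal on the column-stacked matrix, whose covariance is the Kronecker product of V and U.

```python
import numpy as np


def matrix_normal_logpdf(X, M, U, V):
    """Log density of an n x p matrix X under MatrixNormal(M, U, V).

    U is the n x n row covariance and V is the p x p column covariance.
    (Hypothetical helper for illustration, not a Stan function.)
    """
    n, p = X.shape
    R = X - M
    # Use linear solves rather than explicit matrix inverses.
    U_inv_R = np.linalg.solve(U, R)                      # U^{-1} R, shape n x p
    quad = np.trace(np.linalg.solve(V, R.T @ U_inv_R))   # tr(V^{-1} R' U^{-1} R)
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    return -0.5 * (n * p * np.log(2 * np.pi)
                   + p * logdet_U + n * logdet_V + quad)


def mvn_logpdf(x, mu, S):
    """Log density of a vector x under MultiNormal(mu, S), for comparison."""
    d = x.size
    r = x - mu
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(S, r))


if __name__ == "__main__":
    rng = np.random.default_rng(1234)
    n, p = 4, 3
    A = rng.normal(size=(n, n))
    B = rng.normal(size=(p, p))
    U = A @ A.T + n * np.eye(n)   # random positive-definite row covariance
    V = B @ B.T + p * np.eye(p)   # random positive-definite column covariance
    X = rng.normal(size=(n, p))
    M = rng.normal(size=(n, p))

    factored = matrix_normal_logpdf(X, M, U, V)
    # Column-major vec(X) has covariance kron(V, U).
    dense = mvn_logpdf(X.reshape(-1, order="F"),
                       M.reshape(-1, order="F"),
                       np.kron(V, U))
    print("factored:", factored)
    print("dense:   ", dense)
```

The factored version only ever solves against the n × n and p × p matrices, whereas the dense version has to build and factor an np × np covariance. That difference is the practical reason to work with the factored form.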

## Statistical Workflow and the Fractal Nature of Scientific Revolutions (my talk this Wed at the Santa Fe Institute)

Wed 3 June 2020 at 12:15pm U.S. Mountain time: “Statistical Workflow and the Fractal Nature of Scientific Revolutions.” How would an A.I. do statistics? Fitting a model is the easy part. The other steps of workflow (model building, checking, and revision) are not so clearly algorithmic. It could be fruitful to simultaneously think about automated inference and […]

## Stan pedantic mode

This used to be on the Stan wiki but that page got reorganized so I’m putting it here. Blog is not as good as wiki for this purpose: you can add comments but you can’t edit. But better blog than nothing, so here it is. I wrote this a couple years ago and it was […]

## It’s “a single arena-based heap allocation” . . . whatever that is!

After getting 80 zillion comments on that last post with all that political content, I wanted to share something that’s purely technical. It’s something Bob Carpenter wrote in a conversation regarding implementing algorithms in Stan: One thing we are doing is having the matrix library return more expression templates rather than copying on return as […]

## Laplace’s Demon: A Seminar Series about Bayesian Machine Learning at Scale

David Rohde points us to this new seminar series that has the following description: Machine learning is changing the world we live in at a breakneck pace. From image recognition and generation, to the deployment of recommender systems, it seems to be breaking new ground constantly and influencing almost every aspect of our lives. […]

## We need better default plots for regression.

Robin Lee writes: To check for linearity and homoscedasticity, we are taught to plot residuals against y fitted value in many statistics classes. However, plotting residuals against y fitted value has always been a confusing practice that I know that I should use but can’t quite explain why. It is not until this week I […]

## New Within-Chain Parallelisation in Stan 2.23: This One’s Easy for Everyone!

What’s new? The new and shiny reduce_sum facility released with Stan 2.23 is far more user-friendly and makes it easier to scale Stan programs with more CPU cores than it was before. While Stan is awesome for writing models, as the size of the data or complexity of the model increases it can become impractical […]

## Bayesian analysis of Santa Clara study: Run it yourself in Google Colab, play around with the model, etc!

The other day we posted some Stan models of coronavirus infection rate from the Stanford study in Santa Clara county. The Bayesian setup worked well because it allowed us to directly incorporate uncertainty in the specificity, sensitivity, and underlying infection rate. Mitzi Morris put all this in a Google Colab notebook so you can run […]

## MRP with R and Stan; MRP with Python and Tensorflow

Lauren and Jonah wrote this case study which shows how to do Mister P in R using Stan. It’s a great case study: it’s not just the code for setting up and fitting the multilevel model, it also discusses the poststratification data, graphical exploration of the inferences, and alternative implementations of the model. Adam Haber […]

## Webinar on approximate Bayesian computation

X points us to this online seminar series which is starting this Thursday! Some speakers and titles of talks are listed. I just wish I could click on the titles and see the abstracts and papers! The seminar is at the University of Warwick in England, which is not so convenient—I seem to recall that […]

## Conference on Mister P online tomorrow and Saturday, 3-4 Apr 2020

We have a conference on multilevel regression and poststratification (MRP) this Friday and Saturday, organized by Lauren Kennedy, Yajuan Si, and me. The conference was originally scheduled to be at Columbia but now it is online. Here is the information. If you want to join the conference, you must register for it ahead of time; […]

## More coronavirus research: Using Stan to fit differential equation models in epidemiology

Seth Flaxman and others at Imperial College London are using Stan to model coronavirus progression; see here (and I’ve heard they plan to fix the horrible graphs!) and this Github page. They also pointed us to this article from December 2019, Contemporary statistical inference for infectious disease models using Stan, by Anastasia Chatzilena et al. […]

## Fit nonlinear regressions in R using stan_nlmer

This comment from Ben reminded me that lots of people are running nonlinear regressions using least squares and other unstable methods of point estimation. You can do better, people! Try stan_nlmer, which fits nonlinear models and also allows parameters to vary by groups. I think people have the sense that maximum likelihood or least squares […]

## Estimates of the severity of COVID-19 disease: another Bayesian model with poststratification

Following up on our discussions here and here of poststratified models of coronavirus risk, Jon Zelner writes: Here’s a paper [by Robert Verity et al.] that I think shows what could be done with an MRP approach. From the abstract: We used individual-case data from mainland China and cases detected outside mainland China to estimate […]

## Prior predictive, posterior predictive, and cross-validation as graphical models

I just wrote up a bunch of chapters for the Stan user’s guide on prior predictive checks, posterior predictive checks, cross-validation, decision analysis, poststratification (with the obligatory multilevel regression up front), and even bootstrap (which has a surprisingly elegant formulation in Stan now that we have RNGs in transformed data). Andrew then urged me to […]

## “A Path Forward for Stan,” from Sean Talts, former director of Stan’s Technical Working Group

Sean Talts was talking about his ideas of how Stan should move forward, given anticipated developments in the probabilistic programming infrastructure. I encouraged him to write up his ideas in some sort of manifesto form, and he did so. Here it is. The title is “A Path Forward for Stan,” and it begins: Stan has […]

## 100 Things to Know, from Lane Kenworthy

The sociologist has this great post: Here are a hundred things worth knowing about our world and about the United States. Because a picture is worth quite a few words and providing information in graphical form reduces misperceptions, I [Kenworthy] present each of them via a chart, with some accompanying text. This is great stuff. […]

## Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future. Three principles of naming follow: 1. Names should mean something. 2. Names should be as short as possible. 3. Use your judgement to balance (1) and (2). The third […]

## Computer-generated writing that looks real; real writing that looks computer-generated

You know that thing where you stare at a word for long enough, it starts to just look weird? The letters start to separate from each other, and you become hyper-aware of the arbitrariness of associating a concept with some specific combination of sounds? There’s gotta be a word for this. Anyway, I was reminded […]