Archive of posts filed under the Statistical computing category.

Conference on Mister P online tomorrow and Saturday, 3-4 Apr 2020

We have a conference on multilevel regression and poststratification (MRP) this Friday and Saturday, organized by Lauren Kennedy, Yajuan Si, and me. The conference was originally scheduled to be at Columbia but now it is online. Here is the information. If you want to join the conference, you must register for it ahead of time; […]

More coronavirus research: Using Stan to fit differential equation models in epidemiology

Seth Flaxman and others at Imperial College London are using Stan to model coronavirus progression; see here (and I’ve heard they plan to fix the horrible graphs!) and this GitHub page. They also pointed us to this article from December 2019, Contemporary statistical inference for infectious disease models using Stan, by Anastasia Chatzilena et al. […]

Fit nonlinear regressions in R using stan_nlmer

This comment from Ben reminded me that lots of people are running nonlinear regressions using least squares and other unstable methods of point estimation. You can do better, people! Try stan_nlmer, which fits nonlinear models and also allows parameters to vary by groups. I think people have the sense that maximum likelihood or least squares […]

Estimates of the severity of COVID-19 disease: another Bayesian model with poststratification

Following up on our discussions here and here of poststratified models of coronavirus risk, Jon Zelner writes: Here’s a paper [by Robert Verity et al.] that I think shows what could be done with an MRP approach. From the abstract: We used individual-case data from mainland China and cases detected outside mainland China to estimate […]

Prior predictive, posterior predictive, and cross-validation as graphical models

I just wrote up a bunch of chapters for the Stan user’s guide on prior predictive checks, posterior predictive checks, cross-validation, decision analysis, poststratification (with the obligatory multilevel regression up front), and even bootstrap (which has a surprisingly elegant formulation in Stan now that we have RNGs in transformed data). Andrew then urged me to […]

“A Path Forward for Stan,” from Sean Talts, former director of Stan’s Technical Working Group

Sean Talts was talking about his ideas of how Stan should move forward, given anticipated developments in the probabilistic programming infrastructure. I encouraged him to write up his ideas in some sort of manifesto form, and he did so. Here it is. The title is “A Path Forward for Stan,” and it begins: Stan has […]

100 Things to Know, from Lane Kenworthy

The sociologist has this great post: Here are a hundred things worth knowing about our world and about the United States. Because a picture is worth quite a few words and providing information in graphical form reduces misperceptions, I [Kenworthy] present each of them via a chart, with some accompanying text. This is great stuff. […]

Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future. Three principles of naming follow: 1. Names should mean something. 2. Names should be as short as possible. 3. Use your judgement to balance (1) and (2). The third […]

Computer-generated writing that looks real; real writing that looks computer-generated

You know that thing where you stare at a word for long enough, it starts to just look weird? The letters start to separate from each other, and you become hyper-aware of the arbitrariness of associating a concept with some specific combination of sounds? There’s gotta be a word for this. Anyway, I was reminded […]

Is data science a discipline?

Jeannette Wing, director of the Columbia Data Science Institute, sent along this link to this featured story (their phrase) on their web site. Is data science a discipline? Data science is a field of study: one can get a degree in data science, get a job as a data scientist, and get funded to do […]

What can we do with complex numbers in Stan?

I’m wrapping up support for complex number types in the Stan math library. Now I’m wondering what we can do with complex numbers in statistical models. Functions operating in the complex domain: The initial plan is to add some matrix functions that use complex numbers internally: fast Fourier transforms, asymmetric eigendecomposition, Schur decomposition. The eigendecomposition […]

Deep learning workflow

Ido Rosen points us to this interesting and detailed post by Andrej Karpathy, “A Recipe for Training Neural Networks.” It reminds me a lot of various things that Bob Carpenter has said regarding the way that some fitting algorithms are often oversold because the presenters don’t explain the tuning that was required to get good […]

Rank-normalization, folding, and localization: An improved R-hat for assessing convergence of MCMC

With Aki, Dan, Bob, and Paul: Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this paper we show that the convergence diagnostic R-hat of Gelman and Rubin (1992) has serious flaws. R-hat will fail to correctly […]
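
For readers who have not computed it by hand, here is a minimal sketch of the classic between/within-chain R-hat that the abstract above refers to. It is not the rank-normalized, folded version the paper proposes, and the function name classic_rhat and the simulated chains are made up for illustration.

```python
import math
import random
import statistics


def classic_rhat(chains):
    """Classic (non-split, non-rank-normalized) R-hat for one scalar quantity.

    chains: list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    b_over_n = statistics.variance(chain_means)
    # Pooled variance estimate that mixes within- and between-chain variance.
    var_plus = (n - 1) / n * w + b_over_n
    return math.sqrt(var_plus / w)


# Hypothetical data: four chains drawing from the same normal distribution,
# and the same chains shifted apart so that they disagree about the mean.
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = [[x + 5.0 * k for x in chain] for k, chain in enumerate(mixed)]

rhat_mixed = classic_rhat(mixed)
rhat_stuck = classic_rhat(stuck)
print(rhat_mixed, rhat_stuck)
```

Values near 1 mean the chains agree in location and scale; the shifted chains should give a value well above 1.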

An article in a statistics or medical journal, “Using Simulations to Convince People of the Importance of Random Variation When Interpreting Statistics.”

Andy Stein writes: On one of my projects, I had a plot like the one above of drug concentration vs response, where we divided the patients into 4 groups. I look at the data below and think “wow, these are some wide confidence intervals and random looking data, let’s not spend too much time more […]

A normalizing flow by any other name

Another week, another nice survey paper from Google. This time: Papamakarios, G., Nalisnick, E., Rezende, D.J., Mohamed, S. and Lakshminarayanan, B., 2019. Normalizing Flows for Probabilistic Modeling and Inference. arXiv 1912.02762. What’s a normalizing flow? A normalizing flow is a change of variables. Just like you learned way back in calculus and linear algebra. Normalizing […]
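
For reference, the change-of-variables formula from calculus that the teaser alludes to can be written as follows. The notation here (an invertible, differentiable map f with y = f(x), and densities p_X and p_Y) is my own choice for illustration, not necessarily the survey's.

```latex
p_Y(y) \;=\; p_X\!\bigl(f^{-1}(y)\bigr)\,
\left|\det \frac{\partial f^{-1}(y)}{\partial y}\right|
```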

Making differential equation models in Stan more computationally efficient via some analytic integration

We were having a conversation about differential equation models in pharmacometrics, in particular how to do efficient computation when fitting models for dosing, and Sebastian Weber pointed to this Stancon presentation that included a single-dose model. Sebastian wrote: Multiple doses lead to a quick explosion of the Stan codes – so things get a bit […]

How good is the Bayes posterior for prediction really?

It might not be common courtesy of this blog to make comments on a very-recently-arxiv-ed paper. But I have seen two copies of this paper entitled “how good is the Bayes posterior in deep neural networks really” left on the tray of the department printer during the past weekend, so I cannot underestimate the popularity of […]

Monte Carlo and winning the lottery

Suppose I want to estimate my chances of winning the lottery by buying a ticket every day. That is, I want to do a pure Monte Carlo estimate of my probability of winning. How long will it take before I have an estimate that’s within 10% of the true value? It’ll take… There’s a big […]
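
A back-of-the-envelope sketch of the question above, under assumptions that are mine rather than the post's: each ticket is an independent draw with the same win probability p, "within 10%" is read as the estimate's standard error being 10% of p, and the value of p below is hypothetical.

```python
import math


def draws_for_relative_error(p, rel_err=0.10):
    """Draws needed so the Monte Carlo standard error equals rel_err * p.

    The estimate is the fraction of wins in n independent Bernoulli(p) draws,
    with standard error sqrt(p * (1 - p) / n). Setting that equal to
    rel_err * p and solving for n gives n = (1 - p) / (p * rel_err**2).
    """
    return math.ceil((1.0 - p) / (p * rel_err ** 2))


p_win = 1e-6  # hypothetical win probability, not taken from the post
n_draws = draws_for_relative_error(p_win)
years_at_one_ticket_per_day = n_draws / 365.25
print(n_draws, years_at_one_ticket_per_day)
```

The point of the formula is that the required number of draws scales like 1/p for rare events, so the rarer the win, the longer pure Monte Carlo takes to pin it down.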

MRP Conference at Columbia April 3rd – April 4th 2020

The Departments of Statistics and Political Science and Institute for Social and Economic Research and Policy at Columbia University are delighted to invite you to our Spring conference on Multilevel Regression and Poststratification. Featuring Andrew Gelman, Beth Tipton, Jon Zelner, Shira Mitchell, Qixuan Chen and Leontine Alkema, the conference will combine a mix of cutting […]

Rao-Blackwellization and discrete parameters in Stan

I’m reading a really dense and beautifully written survey of Monte Carlo gradient estimation for machine learning by Shakir Mohamed, Mihaela Rosca, Michael Figurnov, and Andriy Mnih. There are great explanations of everything including variance reduction techniques like coupling, control variates, and Rao-Blackwellization. The latter’s the topic of today’s post, as it relates directly to […]