Archive of entries posted by

The Economist not hedging the predictions

Andrew’s hedge that he’s predicting vote intent and not accounting for any voting irregularities either never made it to the editorial staff at The Economist or they chose to ignore it. Their headline still reports that they’re predicting the electoral college, not voter intent: Predicting voter intent is a largely academic exercise in that all […]

Hiring at all levels at Flatiron Institute’s Center for Computational Mathematics

We’re hiring at all levels at my new academic home, the Center for Computational Mathematics (CCM) at the Flatiron Institute in New York City. We’re going to start reviewing applications January 1, 2021. A lot of hiring We’re hoping to hire many people for each of the job ads. The plan is to grow CCM […]

More on absolute error vs. relative error in Monte Carlo methods

This came up again in a discussion from someone asking if we can use Stan to evaluate arbitrary integrals. The integral I was given was the following: where the -ball is assumed to be in dimensions so that . (MC)MC approach The textbook Monte Carlo approach (Markov chain or plain old) to evaluating such an […]

Probabilities for action and resistance in Blades in the Dark

Later this week, I’m going to be GM-ing my first session of Blades in the Dark, a role-playing game designed by John Harper. We’ve already assembled a crew of scoundrels in Session 0 and set the first score. Unlike most of the other games I’ve run, I’ve never played Blades in the Dark, I’ve only […]

Drunk-under-the-lamppost testing

Edit: Glancing over this again, it struck me that the title may be interpreted as being mean. Sorry about that. It wasn’t my intent. I was trying to be constructive and I really like that analogy. The original post is mostly reasonable other than on this one point that I thought was important to call […]

Super-duper online matrix derivative calculator vs. the matrix normal (for Stan)

I’m implementing the matrix normal distribution for Stan, which provides a multivariate density for a matrix with covariance factored into row and column covariances. The motivation A new colleague of mine at Flatiron’s Center for Comp Bio, Jamie Morton, is using the matrix normal to model the ocean biome. A few years ago, folks in […]

Make Andrew happy with one simple ggplot trick

By default, ggplot expands the space above and below the x-axis (and to the left and right of the y-axis). Andrew has made it pretty clear that he thinks the x axis should be drawn at y = 0. To remove the extra space around the axes when you have continuous (not discrete or log […]

Prior predictive, posterior predictive, and cross-validation as graphical models

I just wrote up a bunch of chapters for the Stan user’s guide on prior predictive checks, posterior predictive checks, cross-validation, decision analysis, poststratification (with the obligatory multilevel regression up front), and even bootstrap (which has a surprisingly elegant formulation in Stan now that we have RNGs in transformed data). Andrew then urged me to […]

Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future. Three principles of naming follow: 1. Names should mean something. 2. Names should be as short as possible. 3. Use your judgement to balance (1) and (2). The third […]

Is data science a discipline?

Jeannette Wing, director of the Columbia Data Science Institute, sent along this link to this featured story (their phrase) on their web site. Is data science a discipline? Data science is a field of study: one can get a degree in data science, get a job as a data scientist, and get funded to do […]

What can we do with complex numbers in Stan?

I’m wrapping up support for complex number types in the Stan math library. Now I’m wondering what we can do with complex numbers in statistical models. Functions operating in the complex domain The initial plan is to add some matrix functions that use complex numbers internally: fast Fourier transforms, asymmetric eigendecomposition, and Schur decomposition. The eigendecomposition […]

A normalizing flow by any other name

Another week, another nice survey paper from Google. This time: Papamakarios, G., Nalisnick, E., Rezende, D.J., Mohamed, S. and Lakshminarayanan, B., 2019. Normalizing Flows for Probabilistic Modeling and Inference. arXiv 1912.02762. What’s a normalizing flow? A normalizing flow is a change of variables. Just like you learned way back in calculus and linear algebra. Normalizing […]

Monte Carlo and winning the lottery

Suppose I want to estimate my chances of winning the lottery by buying a ticket every day. That is, I want to do a pure Monte Carlo estimate of my probability of winning. How long will it take before I have an estimate that’s within 10% of the true value? It’ll take… There’s a big […]

Abuse of expectation notation

I’ve been reading a lot of statistical and computational literature and it seems like expectation notation is abused as shorthand for integrals by decorating the expectation symbol with a subscripted distribution like so: This is super confusing, because expectations are properly defined as functions of random variables. For example, the square bracket convention arises because […]

Rao-Blackwellization and discrete parameters in Stan

I’m reading a really dense and beautifully written survey of Monte Carlo gradient estimation for machine learning by Shakir Mohamed, Mihaela Rosca, Michael Figurnov, and Andriy Mnih. There are great explanations of everything including variance reduction techniques like coupling, control variates, and Rao-Blackwellization. The latter’s the topic of today’s post, as it relates directly to […]

Royal Society spam & more

Just a rant about spam (and more spam) from pay-to-publish and closed-access journals. Nothing much to see here. The latest offender is from something called the “Royal Society.” I don’t even know to which king or queen this particular society owes allegiance, because they have a .org URL. Exercising their royal prerogative, they created an […]

Beautiful paper on HMMs and derivatives

I’ve been talking to Michael Betancourt and Charles Margossian about implementing analytic derivatives for HMMs in Stan to reduce memory overhead and increase speed. For now, one has to implement the forward algorithm in the Stan program and let Stan autodiff through it. I worked out the adjoint method (aka reverse-mode autodiff) derivatives of the […]

Macbook Pro (16″ 2019) quick review

I just upgraded yesterday to one of the new 2019 Macbook Pro 16″ models: Macbook Pro (16″, 2019), 3072 x 1920 pixel display, 2.4 GHz 8-core i9, 64GB 2667 MHz DDR4 memory, AMD Radeon Pro 5500M GPU with 4GB of GDDR6 memory, 1 TB solid-state drive US$4120 list including Apple […]

Field goal kicking—like putting in 3D with oblong balls

Putting Andrew Gelman (the author of most posts on this blog, but not this one) recently published a Stan case study on golf putting [link fixed] that uses a bit of geometry to build a regression-type model based on angles and force. Field-goal kicking In American football, there’s also a play called a “field goal.” […]

Econometrics postdoc and computational statistics postdoc openings here in the Stan group at Columbia

Andrew and I are looking to hire two postdocs to join the Stan group at Columbia starting January 2020. I want to emphasize that these are postdoc positions, not programmer positions. So while each position has a practical focus, our broader goal is to carry out high-impact, practical research that pushes the frontier of what’s […]