Archive of posts filed under the Bayesian Statistics category.

Reverse-engineering the problematic tail behavior of the Fivethirtyeight presidential election forecast

We’ve been writing a bit about some odd tail behavior in the Fivethirtyeight election forecast, for example that it was giving Joe Biden a 3% chance of winning Alabama (which seemed high), it was displaying Trump winning California as in “the range of scenarios our model thinks is possible” (which didn’t seem right), and it […]

Interactive analysis needs theories of inference

Jessica Hullman and I wrote an article that begins, Computer science research has produced increasingly sophisticated software interfaces for interactive and exploratory analysis, optimized for easy pattern finding and data exposure. But assuming that identifying what’s in the data is the end goal of analysis misrepresents strong connections between exploratory and confirmatory analysis and contributes […]

“Model takes many hours to fit and chains don’t converge”: What to do? My advice on first steps.

The above question came up on the Stan forums, and I replied: Hi, just to give some generic advice here, I suggest simulating fake data from your model and then fitting the model and seeing if you can recover the parameters. Since it’s taking a long time to run, I suggest just running your 4 […]
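As a rough illustration of the fake-data check described above, here is a minimal sketch in Python. It is not the forum poster's model: a simple linear regression fit by least squares stands in for the Stan fit, and the names simulate and fit_line and the "true" parameter values are hypothetical choices made only for this example.

```python
import random


def simulate(a, b, sigma, n, rng):
    """Simulate fake data from y = a + b*x + Normal(0, sigma) noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0, sigma) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares estimates of intercept and slope (a stand-in for a real model fit)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = ybar - b_hat * xbar
    return a_hat, b_hat


# Assumed "true" parameters, chosen arbitrarily for the check.
true_a, true_b, true_sigma = 1.0, 2.0, 0.5

rng = random.Random(1)  # fixed seed so the check is reproducible
xs, ys = simulate(true_a, true_b, true_sigma, 500, rng)
a_hat, b_hat = fit_line(xs, ys)

print(f"true a={true_a}, estimated a={a_hat:.3f}")
print(f"true b={true_b}, estimated b={b_hat:.3f}")
```

If the estimates land far from the values used to simulate the data, the problem is in the model or the fitting code rather than in the real data, which is the point of running the check first.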

Calibration problem in tails of our election forecast

Following up on the last paragraph of this discussion, Elliott looked at the calibration of our state-level election forecasts, fitting our model retroactively to data from the 2008, 2012, and 2016 presidential elections. The plot above shows the point prediction and election outcome for the 50 states in each election, showing in red the states […]

Stan’s Within-Chain Parallelization now available with brms

The just-released R package brms version 2.14.0 supports within-chain parallelization of Stan. This new functionality is based on the recently introduced reduce_sum function in Stan, which allows sums over (conditionally) independent log-likelihood terms to be evaluated in parallel, using multiple CPU cores at the same time via threading. The idea of reduce_sum is to exploit […]

More on martingale property of probabilistic forecasts and some other issues with our election model

Edward Yu writes: I’m wondering if you’ve seen Nassim Taleb’s article arguing that we should price election forecasts as binary options. You seem to be generally fine with this approach, as when Nate Silver asked your colleague: On the off-chance our respective employers would allow it, which they almost certainly wouldn’t in my case, could […]

They’re looking for Stan and R programmers, and they’re willing to pay.

Tom Vladeck writes: I am one half of a company building a media mix model, primarily for online e-commerce brands. Our modeling is done in Stan, and we are looking to hire part time developers (paid, of course, at a real rate) to build and maintain our Stan models and R code. They can be […]

Postdoc in Bayesian spatiotemporal modeling at Imperial College London!

Seth Flaxman writes: We are hiring a postdoctoral research associate with a background in statistics or computer science to join a vibrant team at the cutting edge of the emerging field of spatiotemporal statistical machine learning (ST-SML). ST-SML draws in equal parts on Bayesian spatiotemporal statistics, scalable kernel methods and Gaussian processes, and recent deep […]

Election Scenario Explorer using Economist Election Model

Ric Fernholz writes: I wanted to tell you about a new website I built together with my brother Dan. The 2020 Election Scenario Explorer allows you to explore how electoral outcomes in individual states influence the national election outlook using data from your election model. The map and tables on our site reveal some interesting […]

Election forecasts: The math, the goals, and the incentives (my talk this Friday afternoon at Cornell University)

At the Colloquium for the Center for Applied Mathematics, Fri 18 Sep 3:30pm: “Election forecasts: The math, the goals, and the incentives.” Election forecasting has increased in popularity and sophistication over the past few decades and has moved from being a hobby of some political scientists and economists to a major effort in the news […]

Information, incentives, and goals in election forecasts

Jessica Hullman, Christopher Wlezien, and I write: Presidential elections can be forecast using information from political and economic conditions, polls, and a statistical model of changes in public opinion over time. We discuss challenges in understanding, communicating, and evaluating election predictions, using as examples the Economist and Fivethirtyeight forecasts of the 2020 election. Here are […]

Post-stratified longitudinal item response model for trust in state institutions in Europe

This is a guest post by Marta Kołczyńska: Paul, Lauren, Aki, and I (Marta) wrote a preprint where we estimate trends in political trust in European countries between 1989 and 2019 based on cross-national survey data. This paper started from the following question: How to estimate country-year levels of political trust with data from surveys […]

Problem of the between-state correlations in the Fivethirtyeight election forecast

Elliott writes: I think we’re onto something with the low between-state correlations [see item 1 of our earlier post]. Someone sent me this collage of maps from Nate’s model that show: – Biden winning every state except NJ – Biden winning LA and MS but not MI and WI – Biden losing OR but winning […]

More on that Fivethirtyeight prediction that Biden might only get 42% of the vote in Florida

I’ve been chewing more on the above Florida forecast from Fivethirtyeight. Their 95% interval for the election-day vote margin in Florida is something like [+16% Trump, +20% Biden], which corresponds to an approximate 95% interval of [42%, 60%] for Biden’s share of the two-party vote. This is bugging me because it’s really hard for me […]
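The conversion from vote margin to two-party share in the excerpt above can be checked with a few lines of Python. This is a sketch under one assumption: the margin is expressed in percentage points of the two-party vote, so that share = 50 + margin / 2. The function name biden_share_from_margin is made up for the example.

```python
def biden_share_from_margin(margin_pct):
    """Convert a Biden-minus-Trump margin (percentage points of the
    two-party vote) into Biden's two-party vote share in percent."""
    return 50.0 + margin_pct / 2.0


# Endpoints of the interval quoted in the post:
# Trump +16 is a Biden margin of -16; Biden +20 is a margin of +20.
low = biden_share_from_margin(-16.0)
high = biden_share_from_margin(20.0)

print(f"Biden two-party share interval: [{low:.0f}%, {high:.0f}%]")
```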

Florida. Comparing Economist and Fivethirtyeight forecasts.

Here’s our current forecast for Florida: We’re forecasting 52.6% of the two-party vote for Biden, with a 95% predictive interval of approx [47.0%, 58.2%], thus an approx standard error of 2.8 percentage points. The 50% interval from the normal distribution is mean +/- 2/3 s.e., thus approx [50.7%, 54.5%]. Yes, I know these predictive distributions […]
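The interval arithmetic in the excerpt above can be reproduced as follows. This is only a sketch of the normal-approximation calculation the post describes (95% interval taken as mean +/- 2 s.e., 50% interval as mean +/- 2/3 s.e.); the variable names are chosen for the example, and the exact normal multiplier is computed at the end for comparison.

```python
from statistics import NormalDist

mean = 52.6               # forecast Biden share of the two-party vote, in percent
lo95, hi95 = 47.0, 58.2   # approximate 95% predictive interval from the post

# Treat the 95% interval as mean +/- 2 standard errors.
se = (hi95 - lo95) / 4.0

# The 50% interval is roughly mean +/- 2/3 of a standard error.
half50 = (2.0 / 3.0) * se
lo50, hi50 = mean - half50, mean + half50

# Exact multiplier for a central 50% normal interval, to compare with 2/3.
z50 = NormalDist().inv_cdf(0.75)

print(f"s.e. = {se:.1f} percentage points")
print(f"50% interval = [{lo50:.1f}%, {hi50:.1f}%]")
print(f"exact 50% multiplier = {z50:.3f} (versus 2/3 = {2 / 3:.3f})")
```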

More limitations of cross-validation and actionable recommendations

This post is by Aki. Tuomas Sivula, Måns Magnusson, and I (Aki) have a new preprint paper that analyzes one of the limitations of cross-validation: “Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison.” The normal distribution has been used to present the uncertainty in cross-validation for a single model and in model comparison at least since […]

Bayesian Workflow (my talk this Wed at Criteo)

Wed 26 Aug 5pm Paris time (11am NY time): The workflow of applied Bayesian statistics includes not just inference but also model building, model checking, confidence-building using fake data, troubleshooting problems with computation, model understanding, and model comparison. We move toward codifying these steps in the realistic scenario in which we are fitting many models […]

David Spiegelhalter wants a checklist for quality control of statistical models?

David Spiegelhalter writes in with a quick question: Although I don’t do any technical stuff now, I find myself arguing for using quantified expert judgement in assessing a distribution for the size of systematic biases in estimates from lower-quality data-sources, particularly for official stats such as migration estimates, but also in other areas. We have […]

Cmdstan 2.24.1 is released!

Rok writes: We are very happy to announce that the next release of Cmdstan (2.24.1) is now available on Github. You can find it here. New features: a new ODE interface; functions for hidden Markov models with a discrete latent variable; elementwise pow operator and matrix power function; Newton solver; support for the […]

“I just wanted to say that for the first time in three (4!?) years of efforts, I have a way to estimate my model. . . .”

After attending a Stan workshop given by Charles Margossian at McGill University, Chris Barrington-Leigh wrote: I just wanted to say that for the first time in three (4!?) years of efforts, I have a way to estimate my model. Your workshop helped me and pushed me to be persistent enough to code up my model. […]