## Stan’s Super Bowl prediction: Broncos 24, Panthers 13

We ran the data through our model: not just the past season but the past 17 seasons (that’s what we could easily access), using a Gaussian process to let team abilities vary over time. Because we’re modeling individual game outcomes, our model automatically controls for imbalances such as Carolina’s notoriously easy schedule. And we don’t just model win/loss or even score differential; we model points for each team, which allows us to estimate offense and defense numbers for each team. We also model the separate types of score (TD, FG, etc.) so that we can get some shot at predicting the actual scores.
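For readers wondering what "points for each team from separate score types" could look like, here is a minimal Python sketch. It is not the model described above: it leaves out the Gaussian process over seasons entirely, and the log-linear form, the base rates, the 7-point touchdown, and every function name are assumptions made up for illustration.

```python
import math
import random

# Points per scoring type. Assumption: every touchdown is worth 7
# (extra point always made); safeties and two-point tries are ignored.
POINTS = {"td": 7, "fg": 3}

# Hypothetical league-average counts per team per game, not estimated from data.
BASE_RATES = {"td": 2.4, "fg": 1.6}


def scoring_rate(base_rate, offense, defense):
    """Expected count of one score type: log-linear in offense minus opposing defense."""
    return base_rate * math.exp(offense - defense)


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_score(offense, defense, base_rates, rng):
    """Simulate one team's points against an opponent with the given defense ability."""
    total = 0
    for kind, points in POINTS.items():
        rate = scoring_rate(base_rates[kind], offense, defense)
        total += points * sample_poisson(rate, rng)
    return total


if __name__ == "__main__":
    rng = random.Random(2016)
    # Hypothetical abilities on the log scale; 0.0 means league average.
    team_a = simulate_score(offense=0.1, defense=0.0, base_rates=BASE_RATES, rng=rng)
    team_b = simulate_score(offense=0.0, defense=0.2, base_rates=BASE_RATES, rng=rng)
    print(f"Team A {team_a}, Team B {team_b}")
```

Modeling touchdowns and field goals separately means simulated scores land on totals that football can actually produce, which a model of raw point differential would not guarantee.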

Our model isn’t perfect; there’s a lot more information out there we’re not using. No play-level data or even player-level data. Still, it’s what our model predicts: Broncos 24, Panthers 13.

P.S. (9 Feb) Hey, the game’s over. What actually happened? Broncos 24, Panthers 10. Pretty good! Actually better than we might expect—we got lucky. But we’ll take it.

Go Stan!

P.P.S. Damn! We forgot to preregister. But you can take our word for it that this is the only analysis we did with these data.

P.P.P.S. To all you vexatious replication bullies who keep buggin me about the data: I’ll release my Excel files when I damn well please. We spent a year working on this paper, sweating out every number, sweating out over what we were doing, and then to see people blogging about it in real time; that’s not the way science really gets done. And so it’s a little hard for us to respond to all of the blog posts that are coming out.

1. Jaime says:

I’d love to see code / full model. Maybe in the Stan Examples Wiki? https://github.com/stan-dev/example-models/wiki

2. Jonathan (another one) says:

So Stan couldn’t predict that Gano’s field goal would miss by 2 feet? What use is MCMC then?

• Dale says:

Actually, in many of the Monte Carlo simulations, the field goal was missed.

• Tom M says:

Well, my model (completely subjective prior plus uniform likelihood) correctly predicted the final score. Unfortunately, gremlins got into my Excel spreadsheet and switched the team labels.

3. Arun says:

Is this posted anywhere? Would be interested to see what the Stan model looks like and how the Gaussian process is included.

4. Giri says:

Wow that is close; must have been those moderately-strongly-weak informative priors!

5. Roy says:

Rumor has it that the Broncos each did a power pose for six minutes an hour before the game and that was the difference. I also heard you had gotten wind of this and included it in your model, all of which will be covered in your upcoming TED talk. It will be interesting to hear you explain the significant effect of the power pose to a live audience; I have never had the chance to hear you speak.

6. Laplace says:

Damn! We forgot to preregister

You’re in good company. Kepler and Newton conducted pure post hoc data dredging/curve fitting on purely observational data, without preregistering anything. Their results (Kepler’s laws of planetary motion & Newton’s law of gravity) wouldn’t have passed a modern frequentist significance test (according to Jeffreys), yet they had the gall to publish their results in non-peer-reviewed books.

That was 3 or 4 centuries ago. Just imagine what those amateurs could have accomplished with preregistration, peer review, randomized controlled trials to determine causality, and Frequentism. Hell, they might have solved the whole “expansive poses” issue. Sometimes I wonder if the Universe is playing a giant practical joke on us.

• John Hall says:

To be fair to Kepler and Newton, you can’t run an experiment on planetary motion.

• Bob says:

I don’t think that “curve fitting” describes Newton’s calculation of the speed of sound. Garden of Forking paths might describe how he dealt with the fact that his theoretical model did not match the data. But, he presented his theoretical model first and then offered some (incorrect) patches to his model to move the prediction closer to the data.

More than a century later, Laplace added a correction term that accounted for the fact that the compression and expansion of air in soundwaves is adiabatic, not isothermal, and got a formula that gives answers much closer to measurements.

Bob

• Laplace says:

We should turn this into a competition. Whenever someone makes a claim about how science should be done, is done, or only can be done, the contestants have to find the most egregiously successful example of a famous scientist doing the exact opposite and making a big discovery.

7. Dzhaughn says:

Good form: you didn’t even predict in your “on deck this week” that you would blog a prediction.

8. Hernan Bruno says:

I started using STAN after seeing Andrew’s analysis of the Football (Soccer) World Cup games. That convinced me that STAN was the way to go and was easy to use. So I totally fell for this post. I only understood the joke reading the comments. That is the degree of my faith in STAN and Prof. Gelman.

Now, it would be nice if someone did the analysis ;)

9. Chris G says:

You know I’m pretty sure that if you’d offered Panthers +9.5 to the general public that you’d have scored a bigger windfall than if you’d won Powerball. What’s the honorarium for a TED talk?

(I bet Roy is right – see his comment above. The power posing probably was what made the difference.)

10. Damn it, I sent this to my students before I realized it was a joke. (I had them do Super Bowl score predictions as an exercise the week before the game…)