It’s here! (and here’s the page with all the Stan case studies).

In this case study, I’m following up on two earlier posts, here and here, which in turn follow up this 2002 paper with Deb Nolan.

My Stan case study is an adaptation of a model fit by Columbia business school professor and golf expert Mark Broadie to a dataset that he put together. I pointed Broadie to the case study and he gave two comments.

First, he wrote:

Purely as a fitting exercise, the logistic model using a linear function of distance, x, does not work well. What does fit well is using log(x), log(x)^2 and log(x)^3, where x is the initial distance of the putt. I believe there are simple physics reasons why log(x) and powers should work better than x.
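Broadie's suggested functional form can be sketched as follows. The coefficients below are made-up placeholders chosen only so the curve decreases with distance; they are not fitted values:

```python
import math

def invlogit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def p_make(x, b0, b1, b2, b3):
    """Success probability under the suggested form:
    logit(p) = b0 + b1*log(x) + b2*log(x)**2 + b3*log(x)**3,
    where x is the initial putt distance."""
    lx = math.log(x)
    return invlogit(b0 + b1 * lx + b2 * lx**2 + b3 * lx**3)

# Placeholder coefficients for illustration only.
b = (2.0, -1.0, -0.1, -0.01)
probs = [p_make(x, *b) for x in (2, 8, 20)]
assert probs[0] > probs[1] > probs[2]  # success rate falls with distance
```

In practice the four coefficients would be estimated from the putt-by-putt counts, e.g. with a binomial GLM.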

That makes sense. But I think this whole epicycle thing is not the best way to go here, given that we can fit a more direct model using the geometry of the shot.

Second, Broadie read this passage from my case study: “The model is unrealistic in other ways, for example by assuming distance error is strictly proportional to distance aimed, and assuming independence of angular and distance errors. Presumably, angular error is higher for longer putts. Again, though, we can’t really investigate such things well using these data which are already such a good fit to the simple two-parameter model we have already fit.”

Broadie responded:

I agree and view this example as an interesting simple modeling exercise that shows the difference between a “purely fitting exercise” and building models from “first principles” (in two steps). In reality, putts are not hit over flat level surfaces, so the “real” modeling problem involves green contour information in addition to the putt distance and success results. A simple model of a green surface is to assume a planar green surface that has a non-zero slope. Then the putt difficulty can be described by both the distance and the green slope vector. This planar green model produces putts that “curve” or “break” as they travel from the initial position to the hole.

The model can also be expanded by allowing the parameters to vary by golfer and playing conditions.

**P.S.** I wrote this example up as a Stan case study because it’s a great example of probabilistic modeling and Bayesian workflow. Usually, the main benefits of the Bayesian framework are regularization, partial pooling, and the use of prior information. In this particular example, it’s none of these; rather, the benefit is that we can cleanly write and fit our models in Stan, allowing a comfortable workflow in which we can play around with our models and graph the results in R. If we were to work with more granular data and try to estimate different parameters for each golfer using a hierarchical model, then some of those other virtues of Bayes would come into play.

Cool! I first saw this golf model example on a YouTube video and thought it was the coolest thing.

I probably missed something in my reading, but why does the new data (in red) look so much smoother than the old data (blue)? The old data looks more ‘real’ with more variability. Is the new data actual data from golf putts?

Jd:

I think the new data are smoother for two reasons. First, the sample size for the new data is much higher, so you don't see the pure noise variation that you see in the older, smaller dataset. Second, the new data were gathered in a more organized way; I think they're all the data from some large set of tournaments. I'm not quite sure where the older dataset came from, and there could be some problems with measurement or selection.

Andrew, this case study is really fabulous, as it emphasizes that Bayesian analyses aren't, in general, just frequentist analyses plus a prior. The real advantage of Bayesian methods is that they give you a logical way to do statistics on models involving mechanism. Here the mechanisms are things like geometric relationships between angles and distances; relationships between initial energy, energy dissipation, and distance traveled; and relationships between mathematical assumptions (binomial vs. normal) and the actual shape of the model errors, etc.

At each step, you take a stylized fact such as "even if you have a very specific, constant-sized angular error, the ability to sink the putt changes with distance, because the hole takes up a smaller and smaller angle at larger distances, with the angle roughly proportional to 1/distance," and then incorporate this knowledge into the model. Later you incorporate things like "binomial errors assume the variance is related directly to sample size, but the variance is actually related more to distance traveled," etc.
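The angular-error mechanism from the case study can be sketched directly: the putt drops if the angle error is smaller than asin((R − r)/x), and with a Normal(0, σ) angle error this gives Pr(make) = 2Φ(asin((R − r)/x)/σ) − 1. The σ used below, 0.026 radians (about 1.5 degrees), is roughly the order of the case study's fitted value:

```python
import math

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Ball and hole sizes from the rules of golf, converted to feet.
R = (4.25 / 2) / 12   # hole radius (4.25-inch diameter)
r = (1.68 / 2) / 12   # ball radius (1.68-inch diameter)

def p_make_angle(x, sigma_angle):
    """Probability the ball drops under the angular-error model:
    the putt succeeds if |angle error| < asin((R - r)/x),
    with angle error ~ Normal(0, sigma_angle)."""
    threshold = math.asin((R - r) / x)
    return 2 * Phi(threshold / sigma_angle) - 1

short_putt = p_make_angle(2.0, 0.026)   # near certain at 2 feet
long_putt = p_make_angle(15.0, 0.026)   # much harder at 15 feet
assert short_putt > long_putt
```

Note how the success probability falls with distance purely through the geometry: the threshold angle shrinks roughly like 1/x, exactly the stylized fact above.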

People often seem to think that a Bayesian analysis is just about taking a frequentist analysis and adding a prior and interpreting the results as posterior probability… but it’s not, not even close.

They need to measure sigma_angle some other way to check the model.

I agree. When I first saw the golf example, it was eye-opening for me.

I wasn't taught Bayesian methods in school. I have switched to them over the last few years, and it went like this: an actual estimate and uncertainty seemed more useful than a hypothesis test -> I looked at bootstrapping and CIs -> the definition of a CI wasn't easy -> the average PI interprets a CI as a posterior probability interval anyway -> the Bayesian interpretation seemed like what was wanted. Then: I almost always needed models with varying intercepts and slopes -> to get a good estimate of the CI for parameters in lme4, it seemed I needed to do bootstrapping or MCMC -> this took a long time -> running the model in brms was actually faster than running it quickly in lme4 plus bootstrapping. Plus brms seemed way more flexible and had tools for presenting results. Then: I realized I have some prior information -> I can use this in Bayesian models. And in general, running models with varying slopes and intercepts just seems easier.

So far, it just seemed like a much better way to run the same frequentist models and report results in a more useful way and avoid NHST.

Then I saw this golf model in a YouTube video, and it really made me realize how much more one could do with some thinking plus expertise. Now to actually try to implement this.

Yep, I followed a similar path to rediscovering science after being horribly mistrained to think about things the wrong way.

+1. Your path is quite common I think :)

As of 2019, players are allowed to leave the flag stick (or the “pin”) in the hole when putting. There is some uncertainty about how this affects performance. On the one hand, the pin acts as a backstop so you can hit the ball harder and reduce the break, but on the other hand it can deflect balls that might have gone in.

So the analysis can be redone to see what has changed because of the new rule. Some players leave the pin in and some don’t, and some leave it in sometimes and not others, so it would be particularly interesting if the data included information as to whether the pin was left in on a given putt.

I think it *is* a cool example, but I might be very strange in that I prefer the logistic regression, because it doesn't force the probability to be 100% at zero distance. The reason is that golfer "yips" might make even short putts a non-100% situation.

http://www.statisticool.com/golfputtingmodels.htm

For the new data, I'm going to assume the 1996 and 2016–2018 data are independent, merge the datasets, and run a standard logistic regression on the combined data.

Summary:

- golfers are predicted to miss more putts than they make once distance > ~12 ft

- prob(making a putt at distance = 0) = 94%; maybe not 100% at small distances because of the well-known "yips"

- the model over-predicts success at short distances and under-predicts it at long distances

Now I’ll do the same thing, but if a distance is 16, I’ll merge those counts in with the distance 16.43 cases, etc., instead of treating them like separate categories. For example:

16       201                     27

16.43    35712                   7196

16.43    35913 (= 35712 + 201)   7223 (= 7196 + 27)

I wouldn’t have merged like this if it was 16.67. I would have merged that in with the 17 category. Dichotomania.
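The merging step can be sketched like this, assuming each row is (distance, attempts, successes):

```python
# Fold near-duplicate distance bins into the nearest existing bin
# (here, the 16-foot counts into the 16.43-foot bin).
rows = {
    16.00: (201, 27),
    16.43: (35712, 7196),
}

def merge_bin(rows, src, dst):
    """Add the (attempts, successes) counts of bin `src` into bin `dst`."""
    a_src, s_src = rows.pop(src)
    a_dst, s_dst = rows[dst]
    rows[dst] = (a_dst + a_src, s_dst + s_src)

merge_bin(rows, 16.00, 16.43)
assert rows[16.43] == (35913, 7223)
```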

Doing that I get…not much changes at all from what I did at first.

I realized there was a ton of overdispersion. So I refit with Williams' method to accommodate it. Much better, I think, but still not great.

Justin

Justin:

Look at the data. The golfers made 45183 out of 45198 putts that were an average of 0.28 feet from the hole. 45183/45198 = 0.9997. Not 0.94.

Hi Andrew, I think this is what I was doing. The values a and b are the estimated intercept and slope from the logistic regression.

Original data: a = 2.231, b = -0.2557; predicted prob of success for x = 0: 0.903

New data: a = 2.836, b = -0.2481; predicted prob of success for x = 0: 0.9446

Original data appended with new data (i.e., one big dataset): a = 2.832, b = -0.2481; predicted prob of success for x = 0: 0.9443
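The predicted probability at x = 0 is just the inverse logit of the fitted intercept, so these numbers are easy to check:

```python
import math

def invlogit(z):
    """Inverse logit: probability implied by a linear predictor."""
    return 1.0 / (1.0 + math.exp(-z))

# Predicted success probability at distance x = 0 is invlogit(a),
# since the slope term b*x vanishes there.
p_original = invlogit(2.231)   # original data
p_new = invlogit(2.836)        # new data
p_combined = invlogit(2.832)   # one big dataset
assert p_original < p_new      # new data implies a higher intercept probability
```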

What were the populations of the original and new data? Amateurs and pros, or all pro players?

Justin