Jacob Schumaker writes:
Reformed political scientist, now software engineer here.
Re: the hot hand fallacy fallacy from Miller and Sanjurjo, has anyone discussed why a basic regression doesn’t solve this? If they have I haven’t seen it.
The idea is just that there are other ways of measuring the hot hand. The way I think of it, it’s the difference in the probability of making a shot between someone who just made a shot and someone who didn’t. In that case, your estimate is unbiased, right? The fallacy identified by Miller and Sanjurjo only matters if you analyze the data in a certain way, right?
My quick answer: (a) hotness is not just about the last shot you made or missed, so yours will be a very noisy measure, and (b) with finite sequences, your approach will have the same bias as Gilovich et al.’s estimate. To put it another way, the regression you can do on the data is not the regression you want to do: it’s a regression with measurement error in x, which gives you a biased estimate; and there’s also the selection issue that Miller and Sanjurjo discussed. There’s really no easy way out.
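For what it’s worth, point (b) is easy to check by simulation: even for i.i.d. shots with no hot hand at all, the within-sequence estimate of P(hit | previous hit), averaged over short sequences, comes out well below the true probability. A minimal R sketch, using sequences of length 4 in the spirit of Miller and Sanjurjo’s coin-flip example (all parameters arbitrary):

```r
set.seed(1)
# for each short sequence, estimate P(hit | previous hit) within that sequence
phh <- replicate(1e5, {
  s <- rbinom(4, 1, 0.5)     # 4 i.i.d. fair "shots": no hot hand by construction
  idx <- which(s[1:3] == 1)  # hits that are followed by another shot
  if (length(idx) == 0) NA else mean(s[idx + 1])
})
mean(phh, na.rm = TRUE)  # ~0.405, well below the true 0.5
```

The true conditional probability is 0.5 by construction, but the average of the within-sequence estimates is about 0.405: that’s the selection bias Miller and Sanjurjo identified, before any hot-hand effect even enters the picture.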
There have now been two compelling studies of the hot hand in baseball. To one degree or another they suffer from all these biases, but I think overall they are really clever analyses.
https://www.gsb.stanford.edu/insights/jeffrey-zwiebel-why-hot-hand-may-be-real-after-all
https://fivethirtyeight.com/features/baseballs-hot-hand-is-real/
To analyze the hot-hand hypothesis in basketball properly, you’d really want to be able to control for defensive effort. I can’t recall whether any of these studies do this, but you would expect defensive alignments to react to a hot hand, which would reduce the shooter’s p(success | history) but not necessarily his/her p(success | history, defensive effort), which is how I’d define the hot hand.
I’ve heard many people make arguments like this, but I confess I’ve never understood it. If the hot hand phenomenon includes defensive effort to stop it, why not where the shot is released from as well? Why not atmospheric conditions?
What started as a simple example of the fact that human minds impose patterns on random events (with the observation that to many people roulette wheels have hot hands) has become a complex scientific explanation of something that seems to be really, really uninteresting. (And unlike Shravan, I love sports examples.) Do we really want a model of the probability that player X makes a shot in state of the world Z1,…,Zn? And we want that model not to prove we can model probabilities, but to ask the question of whether adding history variables adds significantly to our understanding of p?
I mean, I get Miller and Sanjurjo, and it was definitely useful as a statistical teaching tool. But the real animating question here is whether the probability changes *enough* so that the naive pattern-making cerebral apparatus’ impression of a change in p is indicative of a real-world change in p. And it absolutely isn’t. People can’t tell the difference between short random binary sequences with p=.45 and p=.50. Neither, it turns out, can statisticians without tons of data. We knew that before we started, didn’t we?
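To put a rough number on that last point: a standard two-proportion power calculation (the usual normal approximation, nothing specific to basketball) says you’d need on the order of 1,500 shots in each condition to reliably tell p=.45 from p=.50:

```r
# shots per condition needed to distinguish .45 from .50
# at the usual 5% significance level with 80% power
pw <- power.prop.test(p1 = 0.45, p2 = 0.50, sig.level = 0.05, power = 0.8)
ceiling(pw$n)  # on the order of 1,500 shots per condition
```

No real game, and no lab study of remotely typical size, gets anywhere near that per player.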
The hot hand phenomenon is (to me) a certainty after n shots that shot n+1 will be good. It is not refuted when shot n+1 has good defense. That’s just called “good defense broke up the hot hand.” Irrefutable.
Jonathan:
The probability of making a shot could well change by much more than 5 percentage points, comparing a player when he is hot to when he is cold. The point of all the statistical analysis by Miller, Sanjurjo, and many others is that the simple estimate is severely biased, both because of that weird probability thing that Miller and Sanjurjo identified, and also because of attenuation when “hotness” is measured in a highly noisy way based on the success or failure of the previous shot.
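A toy illustration of the attenuation (all numbers invented for illustration): suppose a latent “hot” state is worth a full 10 percentage points and is fairly persistent, but all we observe is the previous shot, a noisy proxy for that state. The naive comparison then recovers only a small fraction of the true gap:

```r
set.seed(123)
n <- 1e5
# sticky latent hot/cold state: stays put with probability 0.9 each shot
hot <- numeric(n)
hot[1] <- rbinom(1, 1, 0.5)
for (t in 2:n) hot[t] <- if (runif(1) < 0.9) hot[t - 1] else 1 - hot[t - 1]
p <- ifelse(hot == 1, 0.55, 0.45)  # true hot-cold gap: 10 percentage points
y <- rbinom(n, 1, p)
prev <- y[-n]; cur <- y[-1]
gap_hat <- mean(cur[prev == 1]) - mean(cur[prev == 0])
gap_hat  # far smaller than the true 0.10 gap
```

With these made-up numbers the estimated gap comes out around a single percentage point, because one made or missed shot is a weak signal of the latent state. That’s measurement error in x doing its usual attenuating work.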
Fair enough, I guess, but again, my interpretation of the hot hand is that it isn’t about probabilities at all. It’s about certainties which are occasionally incorrect. I realize that sounds like a probabilistic statement to a statistician, but when people were watching Vinnie Johnson shoot in the 80’s, people were deciding whether p=1 or p=0 on that shot. It’s not the phenomenon that Gilovich was exploring… his (flawed) methodology was simply to try to look at an implication of the underlying phenomenon.
Let me put it another way. If Vinnie makes 3 in a row and shot 4 is blocked, no one would take that as evidence of a lack of a hot hand, but the stats have to.
Well, the record for free throws made in a row is in the thousands. Basically the limiting factor becomes fatigue/boredom rather than skill at that level.
So the real question is why players are missing at all when undefended. How often do you miss when throwing the trash out (I bet p~1)? Same for some people and free throws.
Yes… which is why no one thinks there’s a hot hand in free throws. Probably because they’re correct.
I don’t know much about this but a quick search shows it is a rather common belief:
The hot hand in basketball: On the misperception of random sequences
https://pdfs.semanticscholar.org/f472/0326b81d5528c0458510cd87ea7b57418c54.pdf
Revisiting the Hot Hand Theory with Free Throw Data in a Multivariate Framework
https://econpapers.repec.org/article/bpjjqsprt/v_3a6_3ay_3a2010_3ai_3a1_3an_3a2.htm
A Cold Shower for the Hot Hand Fallacy
http://asfee2015.sciencesconf.org/61541/document
Also, something isn’t clear to me. Is it a corollary of the hot hand fallacy that warming up doesn’t work and fatigue plays no role (i.e., the claim is that p(success) doesn’t increase or decrease during the session/game)?
Well, no: people might believe that the hot hand is merely one of many factors, one that can be overwhelmed by fatigue, or not being warmed up, or any number of things. The idea of the hot hand is that, ceteris paribus, players who made the last shot are more likely to make the next one; it is only one of many influences on whether a player makes a shot.
It’s certainly possible that hitting the first free throw makes it more likely you hit the second. If people want to call that a hot hand, well, they can call it anything they like. But it’s certainly not the phenomenon that gave the phrase “hot hand” currency.
Jonathan:
your point: “People can’t tell the difference between short random binary sequences with p=.45 and p=.50. Neither, it turns out, can statisticians without tons of data. We knew that before we started, didn’t we?”
Nice one. This is a point we like to make (but in a different way). The conceptual distinction between these cases is that statisticians see 0s and 1s, which are not that diagnostic of the underlying probability of success, whereas people (i.e., players and coaches) see much more (e.g., mechanics, mood), which *may* be diagnostic of the underlying probability. There is some suggestive evidence that this is the case. For example, even in GVT’s data, when re-analyzed, players can predict shot outcomes at rates better than chance: https://ssrn.com/abstract=3032826
You say that this “has become a complex scientific explanation of something that seems to be really, really uninteresting.” I guess it depends on what your goal is, and where it goes from here. Some people think the idea of flow, the zone, or whatever is interesting to study. Others think it is uninteresting because it is completely obvious that confidence, focus, motivation, and motor control will vary enough to affect the ability of a professional athlete. Still others think (thought) that both parties are misguided because there is nothing there; they are just interpreting patterns in randomness. I’d put my money on everyone’s point having some seed of a truth that is often underestimated but also often blown out of all proportion.
> having some seed of a truth that is often underestimated but also often blown out of all proportion.
And current academic processes (anything that converts uncertainty into confident claims with poor expected sign and magnitude errors) enthusiastically throw these seeds of truth into a fire while reciting, “If you love me, pop and fly; if you hate me, lay and die”
pop(corn)-science!
+1 (though I still like your paper)
thx!
Joshua:
Where exactly in your MS paper are the results for computing the bias when k>1?
I gather there’s no neat formula except when k=1, but you can easily calculate
it for any specific k,n,p. Is that right?
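For small n it can indeed be brute-forced by enumerating all 2^n sequences and weighting each sequence’s within-sequence estimate by its probability, conditioning on sequences where the estimate is defined. A sketch (function name mine; exponential in n, so toy sizes only):

```r
# bias of the within-sequence estimate of P(hit | k consecutive hits)
# for sequences of length n with i.i.d. success probability p
streak_bias <- function(n, k, p) {
  seqs <- expand.grid(rep(list(0:1), n))  # all 2^n hit/miss sequences
  est <- apply(seqs, 1, function(s) {
    # positions i where shots i..i+k-1 are all hits and shot i+k exists
    idx <- which(sapply(1:(n - k), function(i) all(s[i:(i + k - 1)] == 1)))
    if (length(idx) == 0) NA else mean(s[idx + k])
  })
  pr <- apply(seqs, 1, function(s) prod(ifelse(s == 1, p, 1 - p)))
  ok <- !is.na(est)
  sum(est[ok] * pr[ok]) / sum(pr[ok]) - p  # expected estimate minus truth
}
streak_bias(4, 1, 0.5)  # -2/21, i.e. about -0.095
```

For n=4, k=1, p=.5 this reproduces the familiar 0.405 expected value (bias of about -0.095), and you can plug in any small n, k, p you like.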
Thanks, but if you are warming up, you will be more likely to make more shots in a row; if you are getting fatigued, you are more likely to miss more shots in a row. Wouldn’t this look like hot/cold hands in the sequence data? Perhaps there is a data aggregation/reduction issue here.
http://asfee2015.sciencesconf.org/61541/document
So the hot hand is defined as something other than warm-up/fatigue and good/bad-day effects. If we believe there is a fallacy here, then, it amounts to saying that the performance level does change, but only at the beginning and end of a session; it just can’t change (much) in the middle of a session… I didn’t expect this.
Also, before I looked closer at that paper I made a little sim of warm-up/fatigue. At least the statistics p(H_i | M_{i-1}) and p(H_i | H_{i-1}) seem pretty indistinguishable even if the probability of success varies by ~12%: https://image.ibb.co/gP4yJG/hothand.png
Trying out the “code” tag this time:
n <- 100   # number of simulated sessions
x <- 150   # half-length of a session: each session has 2*x + 1 shots
k <- 0.5   # strength of the warm-up/fatigue effect
t <- -x:x
# success probability peaks mid-session (0.5) and drops toward the ends
p <- 1 / (1 + exp(k * (t / x)^2))
# p <- rep(0.5, length(t))  # constant-p baseline for comparison

# simulate n sessions assuming warm-up and fatigue; one column per session
dat <- replicate(n, rbinom(length(p), 1, p))

out <- matrix(nrow = ncol(dat), ncol = 2)
colnames(out) <- c("pHH", "pHM")
for (i in 1:ncol(dat)) {
  hits <- which(dat[, i] == 1)
  miss <- which(dat[, i] == 0)
  hits <- hits[hits != nrow(dat)]  # drop the last shot: it has no successor
  miss <- miss[miss != nrow(dat)]
  out[i, ] <- c(mean(dat[hits + 1, i]),  # P(hit | previous hit)
                mean(dat[miss + 1, i]))  # P(hit | previous miss)
}

par(mfrow = c(2, 2))
plot(p, ylim = c(0, .5), main = paste0("k = ", k))
plot(cumsum(dat[, 1]))
lines(cumsum(sample(0:1, nrow(dat), replace = TRUE)), col = "red", lwd = 2)
hist(out[, "pHH"], breaks = seq(0, 1, by = .01))
hist(out[, "pHM"], breaks = seq(0, 1, by = .01))