Sander Greenland recently published a paper with a very clear and thoughtful exposition on why causality, logic and context need full consideration in any statistical analysis, even strictly descriptive or predictive analysis. For instance, in the concluding section – “Statistical science (as opposed to mathematical statistics) involves far more than data – it requires realistic […]

**Miscellaneous Statistics** category.

## Further formalization of the “multiverse” idea in statistical modeling

Cristobal Young and Sheridan Stewart write: Social scientists face a dual problem of model uncertainty and methodological abundance. . . . This ‘uncertainty among abundance’ offers spiraling opportunities to discover a statistically significant result. The problem is acute when models with significant results are published, while those with non-significant results go unmentioned. Multiverse analysis addresses […]

## Greek statistician is in trouble for . . . telling the truth!

Paul Alper points us to this news article by Catherine Rampell, which tells this story: Georgiou is not a mobster. He’s not a hit man or a spy. He’s a statistician. And the sin at the heart of his supposed crimes was publishing correct budget numbers. The government has brought a relentless series of criminal […]

## What went wrong with the polls in 2020? Another example.

Shortly before the election the New York Times ran this article, “The One Pollster in America Who Is Sure Trump Is Going to Win,” featuring Robert Cahaly, who on election day forecast Biden to win 235 electoral votes. As you may have heard, Biden actually won 306. Our Economist model gave a final prediction of […]

## The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think?

Ian Fellows writes: I [Fellows] just wrote up a little Bayesian analysis that I thought you might be interested in. Specifically, everyone seems fixated on the 90% effectiveness lower bound reported for the Pfizer vaccine, but the true efficacy is likely closer to 97%. Please let me know if you see any errors. I’m basing […]

## Bayesian Workflow

Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Paul-Christian Bürkner, Lauren Kennedy, Jonah Gabry, Martin Modrák, and I write: The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and […]

## Concerns with our Economist election forecast

A few days ago we discussed some concerns with Fivethirtyeight’s election forecast. This got us thinking again about some concerns with our own forecast for The Economist (see here for more details). Here are some of our concerns with our forecast: 1. Distribution of the tails of the national vote forecast 2. Uncertainties of state […]

## “Valid t-ratio Inference for instrumental variables”

A couple people pointed me to this recent econometrics paper, which begins: In the single IV model, current practice relies on the first-stage F exceeding some threshold (e.g., 10) as a criterion for trusting t-ratio inferences, even though this yields an anti-conservative test. We show that a true 5 percent test instead requires an […]

## Body language and machine learning

Riding on the street, I can usually tell what cars in front of me are going to do, based on their “body language”: how they are positioning themselves in their lane. I don’t know that I could quite articulate what the rules are, but I can tell what’s going on, and I know that I […]

## Between-state correlations and weird conditional forecasts: the correlation depends on where you are in the distribution

Yup, here’s more on the topic, and this post won’t be the last, either . . . Jed Grabman writes: I was intrigued by the observations you made this summer about FiveThirtyEight’s handling of between-state correlations. I spent quite a bit of time looking into the topic and came to the following conclusions. In order […]

## Reference for the claim that you need 16 times as much data to estimate interactions as to estimate main effects

Ian Shrier writes: I read your post on the power of interactions a long time ago and couldn’t remember where I saw it. I just came across it again by chance. Have you ever published this in a journal? The concept comes up often enough and some readers who don’t have methodology expertise feel more […]
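For readers who want the arithmetic behind the factor of 16 in the title, here is one common way to arrive at it. This is a sketch under assumptions that the excerpt does not state, and it may not match the referenced post in every detail: a balanced 2×2 design with $N$ observations in total, a common residual standard deviation $\sigma$, and an interaction assumed to be half the size of the main effect.

```latex
\begin{align*}
\text{Main effect (difference of two halves of the data):}\quad
  \mathrm{se}_{\text{main}} &= \sqrt{\frac{\sigma^2}{N/2} + \frac{\sigma^2}{N/2}} = \frac{2\sigma}{\sqrt{N}} \\
\text{Interaction (difference of differences, four cells of size } N/4\text{):}\quad
  \mathrm{se}_{\text{int}} &= \sqrt{4 \cdot \frac{\sigma^2}{N/4}} = \frac{4\sigma}{\sqrt{N}} = 2\,\mathrm{se}_{\text{main}} \\
\text{Assumed effect sizes:}\quad
  \theta_{\text{int}} &= \tfrac{1}{2}\,\theta_{\text{main}} \\
\text{Ratio of estimate to standard error:}\quad
  \frac{\theta_{\text{int}}}{\mathrm{se}_{\text{int}}} &= \frac{1}{4} \cdot \frac{\theta_{\text{main}}}{\mathrm{se}_{\text{main}}}
\end{align*}
```

Standard errors scale as $1/\sqrt{N}$, so recovering the lost factor of 4 takes $4^2 = 16$ times as much data. If the interaction were assumed to be as large as the main effect, the same calculation would give a factor of 4 instead.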

## She’s wary of the consensus based transparency checklist, and here’s a paragraph we should’ve added to that zillion-authored paper

Megan Higgs writes: A large collection of authors describes a “consensus-based transparency checklist” in the Dec 2, 2019 Comment in Nature Human Behavior. Hey—I’m one of those 80 authors! Let’s see what Higgs has to say: I [Higgs] have mixed emotions about it — the positive aspects are easy to see, but I also have […]

## Alexey Guzey plays Stat Detective: How many observations are in each bar of this graph?

How many data points are in each bar of the top graph above? (See here for background.) It’s from this article: Milewski MD, Skaggs DL, Bishop GA, Pace JL, Ibrahim DA, Wren TA, Barzdukas A. Chronic lack of sleep is associated with increased sports injuries in adolescent athletes. Journal of Pediatric Orthopaedics. 2014 Mar 1;34(2):129-33. […]

## Reasoning under uncertainty

John Cook writes, “statistics is all about reasoning under uncertainty.” I agree, and I think this is a good way to put it. Statistics textbooks sometimes describe statistics as “decision making under uncertainty,” but that always bothered me, because there’s very little about decision making in statistics textbooks. “Reasoning” captures it much more than “decision […]

## Uri Simonsohn’s Small Telescopes

I just happened to come across this paper from 2015 that makes an important point very clearly: It is generally very difficult to prove that something does not exist; it is considerably easier to show that a tool is inadequate for studying that something. With a small-telescopes approach, instead of arriving at the conclusion that […]

## It’s kinda like phrenology but worse. Not so good for the “Nature” brand name, huh? Measurement, baby, measurement.

Federico Mattiello writes: I thought you might find this thread interesting, it’s about a machine learning paper building a “trustworthiness score” from faces databases and historical (mainly British) portraits. It checks many bias boxes I believe, but my biggest complaint (I know it shouldn’t be) is the linear regression of basically spherical clouds of points: […]

## Randomized but unblinded experiment on vitamin D as a coronavirus treatment. Let’s talk about what comes next. (Hint: it involves multilevel models.)

Under the heading, “Here we go again,” Dale Lehman writes: If you want to blog on the continuing theme – try this (it’s from Marginal Revolution, the citation): https://marginalrevolution.com/marginalrevolution/2020/09/a-vitamin-d-bet.html https://www.sciencedirect.com/science/article/pii/S0960076020302764 Vitamin D Can Likely End the COVID-19 Pandemic What is striking is the analysis by the Rootclaim group – repeated reliance on p values as […]

## A question of experimental design (more precisely, design of data collection)

An economist colleague writes in with a question: What is your instinct on the following. Consider at each time t, 1999 through 2019, there is a probability P_t for some event (e.g., it rains on a given day that year). Assume that P_t = P_1999 + (t-1999)A. So P_t has a linear time trend. What […]

## Does this fallacy have a name?

Rafa Irizarry writes: What do we call it when someone thinks cor(Y,X) = 0 because lim h -> 0 cor( X, Y | X \in (x-h, x+h) ) = 0 Example: Steph, Kobe, and Jordan are average (or below average) height in the NBA so height does not predict being good at basketball. GRE math […]
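The pattern Irizarry describes is easy to reproduce in a simulation. The sketch below is illustrative only: the linear model, the sample size, the window center `x0`, and the half-width `h` are all assumptions made for the example and do not come from the post. It draws correlated pairs, then compares the correlation in the full sample with the correlation among points whose `x` falls in a narrow window.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

# Assumed data-generating model: y = x + noise, both standard normal,
# so the population correlation is 1/sqrt(2), roughly 0.71.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

full_cor = pearson(x, y)

# Condition on x lying in a narrow window (x0 - h, x0 + h).
x0, h = 1.0, 0.1  # hypothetical window, chosen for illustration
window = [(xi, yi) for xi, yi in zip(x, y) if x0 - h < xi < x0 + h]
wx, wy = zip(*window)
window_cor = pearson(wx, wy)

print(f"points in window: {len(window)}")
print(f"correlation, full sample: {full_cor:.2f}")
print(f"correlation, within window: {window_cor:.2f}")
```

Inside the window, `x` barely varies, so nearly all the variation in `y` is noise and the within-window correlation sits close to zero even though the full-sample correlation is strong. Shrinking `h` pushes it closer to zero still, which is the limit in the expression above.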

## We want certainty even when it’s not appropriate

Remember the stents example? An experiment was conducted comparing two medical procedures, the difference had a p-value of 0.20 (after a corrected analysis the p-value was 0.09) and so it was declared that the treatment had no effect. In other cases, of course, “p less than 0.10” is enough for publication in PNAS and multiple […]