Archive of posts filed under the Miscellaneous Statistics category.

Everything that can be said can be said clearly.

The title, as many may know, is a quote from Wittgenstein. It is one that has haunted me for many years. As a first-year undergrad, I had mistakenly enrolled in a second-year course that was almost entirely based on Wittgenstein’s Tractatus. Alarmingly, the drop date had passed before I grasped I was supposed […]

Taking the bus

Bert Gunter writes: This article on bus ridership is right up your alley [it’s a news article with interactive graphics and lots of social science content]. The problem is that they’re graphing the wrong statistic. Raw ridership is of course sensitive to total population. So what they should have been graphing is rates per person, not […]
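To make the raw-versus-rate point concrete, here is a minimal sketch. The ridership and population figures are invented for illustration and are not taken from the article; the function name `trips_per_person` is likewise just a label chosen here.

```python
# Hypothetical figures for illustration only; not from the news article.
ridership = {2010: 50_000_000, 2018: 52_000_000}   # annual bus trips
population = {2010: 1_000_000, 2018: 1_300_000}    # residents


def trips_per_person(trips, people):
    """Return annual trips per resident."""
    return trips / people


rates = {
    year: trips_per_person(ridership[year], population[year])
    for year in ridership
}

for year in sorted(rates):
    print(year, f"raw={ridership[year]:,}", f"per person={rates[year]:.1f}")
```

With these made-up numbers, raw ridership goes up while trips per person go down, which is the kind of reversal a raw-count graph would hide.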

What are my statistical principles?

Jared Harris writes: I am not a statistician but am a long time reader of your blog and have strong interests in most of your core subject matter, as well as scientific and social epistemology. I’ve been trying for some time to piece together the broader implications of your specific comments, and have finally gotten […]

Who are you gonna believe, me or your lying eyes?

This post is by Phil Price, not Andrew. A commenter on an earlier post quoted Terence Kealey, who said this in an interview in Scientific American in 2003: “But the really fascinating example is the States, because it’s so stunningly abrupt. Until 1940 it was American government policy not to fund science. Then, bang, the […]

Rethinking Rob Kass’ recent talk on science in a less statistics-centric way.

Reflection on a recent post on a talk by Rob Kass has led me to write this post. I liked the talk very much and found it informative, perhaps especially for its call to clearly distinguish abstract models from brute force reality. I believe that is a very important point that has often been lost […]

FDA statistics scandal update

The other day we reported on the director of the FDA who got embarrassed after garbling some statistics at a news conference. At the time, I wrote: The commissioner of the FDA might well be too busy to be carefully reading the individual studies. I assume the fault is with whatever assistant prepared the numbers for […]

Statistics is hard, especially if you don’t know any statistics (FDA edition)

Paul Alper shares this story: From the NYT: Dr. Stephen M. Hahn, the commissioner of the Food and Drug Administration, said 35 out of 100 Covid-19 patients “would have been saved because of the administration of plasma.” He later walked this back because of confusion between Absolute Risk Reduction and Relative Risk Reduction, a common […]
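The distinction between absolute and relative risk reduction is easy to see with a small calculation. This is only a sketch: the two mortality risks below are hypothetical, chosen so that the relative reduction comes out to 35%, and are not the figures from the plasma study.

```python
def risk_reductions(control_risk, treated_risk):
    """Return (absolute risk reduction, relative risk reduction)."""
    arr = control_risk - treated_risk   # difference in risk
    rrr = arr / control_risk            # difference as a share of control risk
    return arr, rrr


# Hypothetical risks, assumed for illustration only.
control_risk = 0.10    # 10 deaths per 100 untreated patients
treated_risk = 0.065   # 6.5 deaths per 100 treated patients

arr, rrr = risk_reductions(control_risk, treated_risk)
saved_per_100 = arr * 100

print(f"relative risk reduction: {rrr:.0%}")
print(f"absolute risk reduction: {arr:.1%}")
print(f"patients saved per 100 treated: {saved_per_100:.1f}")
```

Under these assumed risks, a 35% relative reduction corresponds to about 3.5 patients per 100, not 35 per 100.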

Rob Kass: “The truth of a theory is contingent on both our state of knowledge and the purposes to which it will be put.”

Here’s a presentation, Exaggerated Claims Undermine Science by Ignoring the Scientific Method, by Rob Kass, a statistician who over the years has done a lot of interesting work on statistical theory and applications, especially in neuroscience. A few years ago, we discussed Kass’s thoughts on statistical pragmatism. And here’s a discussion of a couple of […]

Do we trust this regression?

Kevin Lewis points us to this article, “Do US TRAP Laws Trap Women Into Bad Jobs?”, which begins: This study explores the impact of women’s access to reproductive healthcare on labor market opportunities in the US. Previous research finds that access to the contraception pill delayed age at first birth and increased access to a […]

Some possibly different experiences of being a statistician working with an international collaborative research group like OHDSI.

This post is by Keith O’Rourke and, as with all posts and comments on this blog, is just a deliberation on dealing with uncertainties in scientific inquiry and should not be attributed to any entity other than the author. Starting at the end of March, I thought it would be a good idea to let […]

Theorizing, thought experiments, fake-data simulation

I think of theorizing as like thought experiments or, in statistics, fake-data simulation: A way of exploring the implications of one’s ideas, essentially a form of deductive reasoning. Arguably, much of fiction serves this purpose too, of mapping out the implications of existing postulates, and, conversely, revealing implicit postulates or assumptions that drive our thinking.
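For readers who have not seen fake-data simulation in practice, here is a minimal sketch of the idea: pick parameter values, simulate data from the assumed model, fit the model, and see whether the fit recovers what you put in. The linear model, the parameter values, and the helper names `simulate` and `fit_line` are all assumptions made for this illustration.

```python
import random


def simulate(n, a, b, sigma, rng):
    """Simulate y = a + b*x + noise for n points with x uniform on [0, 10]."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [a + b * xi + rng.gauss(0, sigma) for xi in x]
    return x, y


def fit_line(x, y):
    """Ordinary least squares for a single predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


rng = random.Random(1)          # fixed seed so the run is reproducible
true_a, true_b = 2.0, 0.5       # the "postulates" we choose to explore
x, y = simulate(1000, true_a, true_b, sigma=1.0, rng=rng)
est_a, est_b = fit_line(x, y)

print(f"true intercept={true_a}, estimated={est_a:.2f}")
print(f"true slope={true_b}, estimated={est_b:.2f}")
```

If the estimates land far from the assumed values, that points to a problem in the fitting procedure or in one's understanding of the model, before any real data are involved.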

Some thoughts inspired by Lee Cronbach (1975), “Beyond the two disciplines of scientific psychology”

I happened to come across this article today. It’s hardly obscure, with over 3000 citations according to Google Scholar, but it was new to me. It’s a wonderful article. You should read it right away. OK, click on the above link and read the article. Done? OK, then read on.

How much of public health work “involves not technology but methodicalness and record keeping”?

Palko points us to this interesting point from Josh Marshall: I [Marshall] am always struck by, amazed at how much of public health work involves not technology but methodicalness and record keeping. In purely technological terms much of it could have been done 100 years ago or, in outlines at least, 500 years ago. Phones […]

Thomas Basbøll will like this post (analogy between common—indeed, inevitable—mistakes in drawing, and inevitable mistakes in statistical reasoning).

There’s a saying in art that you have to draw things the way they look, not the way they are. This reminds me of an important but rarely stated principle in statistical reasoning, the distinction between evidence and truth. The classic error of novices when drawing is to draw essences—for example, drawing a head as […]

Know your data, recode missing data codes

We had a class assignment where students had to graph some data of interest. A pair of students made the above graph, as a reminder that some data cleaning is often necessary. The students came up with the excellent title as well!
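As a small illustration of the kind of cleaning the title refers to, here is a sketch that recodes sentinel values before summarizing. The codes `-99` and `999`, the example ages, and the function name `recode_missing` are all hypothetical; the students' actual data and codes are not given here.

```python
import math

# Hypothetical sentinel values standing in for "missing"; check the codebook
# of your own dataset for the real ones.
MISSING_CODES = {-99, 999}


def recode_missing(values, codes=MISSING_CODES):
    """Replace missing-data codes with NaN and convert the rest to float."""
    return [math.nan if v in codes else float(v) for v in values]


ages = [34, 999, 27, -99, 45]            # made-up raw data
clean = recode_missing(ages)
observed = [v for v in clean if not math.isnan(v)]

naive_mean = sum(ages) / len(ages)       # treats the codes as real ages
clean_mean = sum(observed) / len(observed)

print(f"mean with codes left in: {naive_mean:.1f}")
print(f"mean after recoding:     {clean_mean:.1f}")
```

Leaving the codes in place pulls the summary far from any plausible age, which is the same distortion that shows up when such values are plotted unexamined.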

Regression and Other Stories translated into Python!

Ravin Kumar writes in with some great news: As readers of this blog likely know Andrew Gelman, Jennifer Hill, and Aki Vehtari have recently published a new book, Regression and Other Stories. What readers likely don’t know is that there is an active effort to translate the code examples written in R and the rstanarm […]

Varimax: Sure, it’s always worked but now there’s maths!

“Some day you will be loved” (Death Cab for Cutie). Here is a paper that I just read that I really liked: Karl Rohe and Muzhe Zeng’s Vintage Factor Analysis with Varimax Performs Statistical Inference. (Does anyone else get a Simon Smith and His Amazing Dancing Bear vibe off that title? No? Just me? […]

Math error in herd immunity calculation from CNN epidemiology expert

Michael Weissman and Sander Greenland write: Sanjay Gupta and Andrea Kane just ran an extensive front-page CNN article reporting that some residual T-cell immune responses cross-react with SARS-CoV-2, perhaps enough to provide many people with some protection. The article seemed straightforward and reasonable enough until it got to this strangely erroneous statement: For herd immunity, […]

“RA Fisher and the science of hatred”

Mark Brown points us to this thoughtful article by Richard Evans regarding the controversy over Ronald Fisher, who during the twentieth century made huge contributions to genetics and statistical theory and methods and who also had serious commitments to racism and eugenics. The controversy made its way into statistics. The Committee of Presidents of Statistical […]

Thinking about election forecast uncertainty

Some twitter action: Elliott Morris, my collaborator (with Merlin Heidemanns) on the Economist election forecast, pointed me to some thoughtful criticisms of our model from Nate Silver. There’s some discussion on twitter, but in general I don’t find twitter to be a good place for careful discussion, so I’m continuing the conversation here. Nate writes: […]