Michael Nelson writes: I want to thank you for posting your last decade of publications in a single space and organized by topic. But I also wanted to share a critique of your argument style as exemplified in your Annals of Surgery correspondence [here and here]. While I think it’s important and valuable that you […]

**Teaching** category.

## More background on our research on constructing an informative prior from a corpus of comparable studies

Erik van Zwet writes: The post (“The Shrinkage Trilogy: How to be Bayesian when analyzing simple experiments”) didn’t get as many comments as I’d hoped, so I wrote a short explainer and a reading guide to help people understand what we’re up to. All three papers have the same very simple model. We abstract a […]

## “The 100 Worst Ed-Tech Debacles of the Decade”

This is a list from Audrey Watters (link from Palko). 100! Wow—that’s a long list. But it is for a whole decade. I doubt this’ll make it on to Bill Gates’s must-reads of the year, but I liked it. Just to give you a sense, I’ll share the first and last items on Watters’s list: […]

## Regression discontinuity analysis is often a disaster. So what should you do instead? Here’s my recommendation:

Summary: If you have an observational study with outcome y, treatment variable z, and pre-treatment predictors X, and treatment assignment depends only on X, then you can estimate the average causal effect by regressing y on z and X and looking at the coefficient of z. If there is lack of complete overlap in X […]
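To make the regression recommendation in the summary concrete, here is a minimal fake-data sketch in Python. Everything in it is an assumption for illustration and not taken from the post: a single predictor x, a logistic assignment rule that depends only on x, a linear outcome model, a true effect of 2.0, and the hypothetical function name `simulate_and_estimate`. Under those assumptions, and with good overlap in x, the coefficient of z should land near the true effect, while a raw difference in means should not.

```python
import numpy as np


def simulate_and_estimate(n=10_000, true_effect=2.0, seed=0):
    """Simulate an observational study and compare two estimates.

    All modeling choices here are illustrative assumptions:
    one pre-treatment predictor, assignment depending only on it,
    and a linear outcome model with a constant treatment effect.
    """
    rng = np.random.default_rng(seed)

    # Pre-treatment predictor.
    x = rng.normal(size=n)

    # Treatment assignment depends only on x (assumed logistic rule).
    p_treat = 1.0 / (1.0 + np.exp(-x))
    z = rng.binomial(1, p_treat)

    # Outcome: constant treatment effect plus a linear effect of x plus noise.
    y = true_effect * z + 3.0 * x + rng.normal(size=n)

    # Regress y on an intercept, z, and x; the coefficient of z is the estimate.
    design = np.column_stack([np.ones(n), z, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    adjusted = coef[1]

    # Unadjusted comparison, confounded because x drives both z and y.
    naive = y[z == 1].mean() - y[z == 0].mean()
    return adjusted, naive


adjusted, naive = simulate_and_estimate()
print(f"adjusted estimate (coefficient of z): {adjusted:.2f}")
print(f"naive difference in means:            {naive:.2f}")
```

The sketch only covers the easy case. It says nothing about what to do when overlap in X is incomplete or the linear model is wrong, which is where the truncated excerpt is heading.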

## It was a year ago today . . .

We posted the following item: “We taught a class using Zoom yesterday. Here’s what we learned.” I was full of earnest thoughts. If you’d asked me whether I’d still be teaching on Zoom a year later, what would I have said? I’m not sure. The most relevant piece of information I can share with you […]

## Statistical fallacies as they arise in political science (from Bob Jervis)

Bob Jervis sends along this fun document he gives to the students in his classes. Enjoy. Theories of International Relations Assume that all the facts and assertions in these paragraphs are correct. Why do the conclusions not follow? (This does not mean that the conclusions are actually false.) What are the alternative explanations for the […]

## Toronto Data Workshop on Reproducibility

I (Lauren not Andrew writing) will be speaking at an upcoming online workshop on reproducibility (free and open). More details here. Looking at the talk outlines, I’m really looking forward to it. I think we can generally agree that reproducibility is a good thing, and something we want to strive for, but in practice there’s […]

## Kill the math in the intro stat course?

David Kane writes: Our introductory classes in statistics and data science use too much mathematics. The key causal effect which our students want our classes to have is to improve their future performance and opportunities. The more professional their computing skills (in the context of data analysis), the greater their likely success. Introductory courses should […]

## Summer training in statistical sampling at University of Michigan

Yajuan points us to this summer program:

## The textbook paradox: “Textbooks more than a very few years old cannot even be given away, but new textbooks are mostly made by copying from former ones”

The above remark, from Alan Dunne, applies to mature fields more than to new fields. For example, I guess the textbooks on deep learning are pretty recent, so anything a few years old really would be out of date. Even in subfields that have been around for a while, it can take a while for textbook […]

## New textbook, “Statistics for Health Data Science,” by Etzioni, Mandel, and Gulati

Ruth Etzioni, Micha Mandel, and Roman Gulati wrote a new book that I really like. Here are the chapters: 1 Statistics and Health Data 1.1 Introduction 1.2 Statistics and Organic Statistics 1.3 Statistical Methods and Models 1.4 Health Care Data 1.5 Outline of the Text 1.6 Software and Data 2 Key Statistical Concepts 2.1 Samples and […]

## Thanks, commenters!

The person who sent me this question (“You’re a data scientist at a local hospital and you’ve been asked to present to the physicians on communicating statistical information to patients. What should you say?”) the other day read the comment thread and responded: Thank you so much for putting the question to your readership. Their […]

## You’re a data scientist at a local hospital and you’ve been asked to present to the physicians on communicating statistical information to patients. What should you say?

Someone who wishes to remain anonymous writes: I just read your post reflecting on crappy talks . . . I’m reaching out because I’m a data scientist at a local hospital in the US and I’ve been asked to present to our physicians about communicating statistical information to patients (e.g., how to interpret the results […]

## Reflections on a talk gone wrong

The first talk I ever gave was at a conference in 1988. (This isn’t the one that went wrong.) I spoke on Constrained maximum entropy methods in an image reconstruction problem. The conference was in England, and I learned about it from a wall poster. They had travel funding for students. I sent in my […]

## Sketching the distribution of data vs. sketching the imagined distribution of data

Elliot Marsden writes: I was reading the recently published UK review of food and eating habits. The above figure caught my eye as it looked like the distribution of weight had radically changed, beyond just its mean shifting, over past decades. This would really change my beliefs! But in fact the distributional data wasn’t available […]

## Weakliem on air rage and himmicanes

Weakliem writes: I think I see where the [air rage] analysis went wrong. The dependent variable was whether or not an “air rage” incident happened on the flight. Two important influences on the chance of an incident are the number of passengers and how long the flight was (their data apparently don’t include the number […]

## Debate involving a bad analysis of GRE scores

This is one of these academic ping-pong stories of a general opinion, an article that challenges the general opinion, a rebuttal to that article, a rebuttal to the rebuttal, etc. I’ll label the positions as A1, B1, A2, B2, and so forth: A1: The starting point is that Ph.D. programs in the United States typically […]

## What are the most important statistical ideas of the past 50 years?

Aki and I wrote this article, doing our best to present a broad perspective. We argue that the most important statistical ideas of the past half century are: counterfactual causal inference, bootstrapping and simulation-based inference, overparameterized models and regularization, multilevel models, generic computation algorithms, adaptive decision analysis, robust inference, and exploratory data analysis. These eight […]

## Basbøll’s Audenesque paragraph on science writing, followed by a resurrection of a 10-year-old debate on Gladwell

I pointed Thomas Basbøll to my recent post, “Science is science writing; science writing is science,” and he in turn pointed me to his post from a few years ago, “Scientific Writing and ‘Science Writing,’” which stirringly begins: For me, 2015 will be the year that I [Basbøll] finally lost all respect for “science writing”. […]

## Is causality as explicit in fake data simulation as it should be?

Sander Greenland recently published a paper with a very clear and thoughtful exposition on why causality, logic and context need full consideration in any statistical analysis, even strictly descriptive or predictive analysis. For instance, in the concluding section – “Statistical science (as opposed to mathematical statistics) involves far more than data – it requires realistic […]