Archive of posts filed under the Causal Inference category.

Causal inference data challenge!

Susan Gruber, Geneviève Lefebvre, Tibor Schuster, and Alexandre Piché write: The ACIC 2019 Data Challenge is live! Datasets are available for download, no registration required (see the bottom of the challenge page). Check out the FAQ. The deadline for submitting results is April 15, 2019. The fourth Causal Inference Data Challenge is taking place […]

Does Harvard discriminate against Asian Americans in college admissions?

Sharad Goel, Daniel Ho and I looked into the question, in response to a recent lawsuit. We wrote something for the Boston Review: What Statistics Can’t Tell Us in the Fight over Affirmative Action at Harvard. Sections: Asian Americans and Academics; “Distinguishing Excellences”; Adjusting and Over-Adjusting for Differences; The Evolving Meaning of Merit; Character and Bias […]

Coursera course on causal inference from Michael Sobel at Columbia

Here’s the description: This course offers a rigorous mathematical survey of causal inference at the Master’s level. Inferences about causation are of great importance in science, medicine, policy, and business. This course provides an introduction to the statistical literature on causal inference that has emerged in the last 35-40 years and that has revolutionized the […]

“The Book of Why” by Pearl and Mackenzie

Judea Pearl and Dana Mackenzie sent me a copy of their new book, “The book of why: The new science of cause and effect.” There are some things I don’t like about their book, and I’ll get to that, but I want to start with a central point of theirs with which I agree strongly. […]

“She also observed that results from smaller studies conducted by NGOs – often pilot studies – would often look promising. But when governments tried to implement scaled-up versions of those programs, their performance would drop considerably.”

Robert Wiblin writes: If we have a study on the impact of a social program in a particular place and time, how confident can we be that we’ll get a similar result if we study the same program again somewhere else? Dr Eva Vivalt . . . compiled a huge database of impact evaluations in […]

Matching (and discarding non-matches) to deal with lack of complete overlap, then regression to adjust for imbalance between treatment and control groups

John Spivack writes: I am contacting you on behalf of the biostatistics journal club at our institution, the Mount Sinai School of Medicine. We are working Ph.D. biostatisticians and would like the opinion of a true expert on several questions having to do with observational studies—questions that we have not found to be well addressed […]

Debate about genetics and school performance

Jag Bhalla points us to this article, “Differences in exam performance between pupils attending selective and non-selective schools mirror the genetic differences between them,” by Emily Smith-Woolley, Jean-Baptiste Pingault, Saskia Selzam, Kaili Rimfeld, Eva Krapohl, Sophie von Stumm, Kathryn Asbury, Philip Dale, Toby Young, Rebecca Allen, Yulia Kovas, and Robert Plomin, along with this response […]

A potential big problem with placebo tests in econometrics: they’re subject to the “difference between significant and non-significant is not itself statistically significant” issue

In econometrics, or applied economics, a “placebo test” is not a comparison of a drug to a sugar pill. Rather, it’s a sort of conceptual placebo, in which you repeat your analysis using a different dataset, or a different part of your dataset, where no intervention occurred. For example, if you’re performing some analysis studying […]
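The teaser above explains the placebo-test logic and why it runs into the "difference between significant and non-significant is not itself statistically significant" problem. A minimal simulated sketch of that point (all data here are synthetic and the helper `ols_slope` is a hypothetical illustration, not anything from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    """Return the simple-OLS slope estimate and its standard error."""
    x = x - x.mean()
    slope = (x * y).sum() / (x * x).sum()
    resid = y - y.mean() - slope * x
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (x * x).sum())
    return slope, se

n = 200
treat = rng.integers(0, 2, n).astype(float)   # real treatment indicator
y_real = 0.5 * treat + rng.normal(size=n)     # outcome with a true effect
placebo = rng.integers(0, 2, n).astype(float) # "treatment" where no intervention occurred
y_placebo = rng.normal(size=n)                # outcome with no effect

b1, se1 = ols_slope(treat, y_real)
b0, se0 = ols_slope(placebo, y_placebo)

# The questionable practice: declare the design validated because the real
# estimate is "significant" (|b/se| > 2) while the placebo estimate is not.
# A direct comparison instead tests whether the two estimates differ:
z_diff = (b1 - b0) / np.sqrt(se1**2 + se0**2)
print(z_diff)
```

The point of `z_diff` is that one estimate clearing the significance threshold while the other misses it does not imply the two estimates are distinguishable from each other.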

What to do when your measured outcome doesn’t quite line up with what you’re interested in?

Matthew Poes writes: I’m writing a research memo discussing the importance of precisely aligning the outcome measures to the intervention activities. I’m making the point that an evaluation of the outcomes for a given intervention may net null results for many reasons, one of which could simply be that you are looking in the wrong […]

Don’t get fooled by observational correlations

Gabriel Power writes: Here’s something a little different: clever classrooms, according to which physical characteristics of classrooms cause greater learning. And the effects are large! Moving from the worst to the best design implies a gain of 67% of one year’s worth of learning! Aside from the dubiously large effect size, it looks like the […]

Discussion of effects of growth mindset: Let’s not demand unrealistic effect sizes.

Shreeharsh Kelkar writes: As a regular reader of your blog, I wanted to ask you if you had taken a look at the recent debate about growth mindset [see earlier discussions here and here]. Here’s the first salvo by Brooke Macnamara, and then the response by Carol Dweck herself. The debate […]

The gaps between 1, 2, and 3 are just too large.

Someone who wishes to remain anonymous points to a new study of David Yeager et al. on educational mindset interventions (link from Alex Tabarrok) and asks: On the blog we talk a lot about bad practice and what not to do. Might this be an example of how *to do* things? Or did they just […]

John Hattie’s “Visible Learning”: How much should we trust this influential review of education research?

Dan Kumprey, a math teacher at Lake Oswego High School, Oregon, writes: Have you considered taking a look at the book Visible Learning by John Hattie? It seems to be permeating and informing reform in our K-12 schools nationwide. Districts are spending a lot of money sending their staffs to conferences by Solution Tree to […]

When anyone claims 80% power, I’m skeptical.

A policy analyst writes: I saw you speak at ** on Bayesian methods. . . . I had been asked to consult on a large national evaluation of . . . [details removed to preserve anonymity] . . . and had suggested treading carefully around the use of Bayesian statistics in this study (basing it […]

Let’s be open about the evidence for the benefits of open science

A reader who wishes to remain anonymous writes: I would be curious to hear your thoughts on motivated reasoning among open science advocates. In particular, I’ve noticed that papers arguing for open practices have seriously bad or nonexistent causal identification strategies. Examples: Kidwell et al. 2017, Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method […]

China air pollution regression discontinuity update

Avery writes: There is a follow up paper for the paper “Evidence on the impact of sustained exposure to air pollution on life expectancy from China’s Huai River policy” [by Yuyu Chen, Avraham Ebenstein, Michael Greenstone, and Hongbin Li] which you have posted on a couple times and used in lectures. It seems that there […]

Data-based ways of getting a job

Bart Turczynski writes: I read the following blog post with a lot of excitement. Then I reread it and paid attention to the graphs and models (which don’t seem to be actual models, but rather, well, lines). The story makes sense, but the science part is questionable (or at least unclear). Perhaps you’d like to have […]

He wants to know what to read and what software to learn, to increase his ability to think about quantitative methods in social science

A law student writes: I aspire to become a quantitatively equipped/focused legal academic. Despite majoring in economics at college, I feel insufficiently confident in my statistical literacy. Given your publicly available work on learning basic statistical programming, I thought I would reach out to you and ask for advice on understanding modeling and causal inference […]

About that claim in the NYT that the immigration issue helped Hillary Clinton? The numbers don’t seem to add up.

Today I noticed an op-ed by two political scientists, Howard Lavine and Wendy Rahn, entitled, “What if Trump’s Nativism Actually Hurts Him?”: Contrary to received wisdom, however, the immigration issue did not play to Mr. Trump’s advantage nearly as much as commonly believed. According to our analysis of national survey data from the American National […]

Trying to make some sense of it all, but I can see it makes no sense at all . . . stuck in the middle with you

“Mediation analysis” is this thing where you have a treatment and an outcome and you’re trying to model how the treatment works: how much does it directly affect the outcome, and how much is the effect “mediated” through intermediate variables. Fabrizia Mealli was discussing this with me the other day, and she pointed out that […]
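The teaser above describes the basic decomposition behind mediation analysis: a total effect splitting into a direct effect and an effect carried through an intermediate variable. A minimal linear-model sketch of that decomposition (synthetic data and hypothetical names throughout; this illustrates the standard product-of-coefficients idea, not anything specific from the post):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
t = rng.integers(0, 2, n).astype(float)     # treatment
m = 0.8 * t + rng.normal(size=n)            # mediator, affected by treatment
y = 0.5 * t + 0.6 * m + rng.normal(size=n)  # outcome: direct path + mediated path

def slope(x, z):
    """Simple-OLS slope of z on x (intercept handled by centering)."""
    x = x - x.mean()
    return (x * (z - z.mean())).sum() / (x * x).sum()

total = slope(t, y)  # total effect of t on y

# Direct effect: coefficient on t after adjusting for the mediator m.
X = np.column_stack([np.ones(n), t, m])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
direct = beta[1]

a = slope(t, m)        # effect of treatment on the mediator
indirect = a * beta[2] # product-of-coefficients mediated effect

print(total, direct + indirect)  # in linear models these coincide exactly
```

In this all-linear setting the identity total = direct + indirect holds algebraically; the subtleties the post gestures at arise once the linear model no longer holds or the mediator is itself confounded.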