I was at the UCLA statistics preprint site, which is full of interesting papers (we should do something like that here at Columbia), and came across this paper by Richard Berk on randomized experiments.

From the abstract to Berk’s paper:

Although it seems to be common knowledge that random assignment balances experimental and control groups on all confounders, other features of randomized field experiments are somewhat less appreciated. These include the role of random assignment in statistical inference and representations of the mechanisms by which the treatment has its impact. Randomized experiments also have important limitations and are subject to the fidelity with which they are implemented. In the end, randomized field experiments are still the best way to estimate causal effects, but are a considerable distance from perfection.

The paper is interesting and accessible, with a focus on policy analysis. Berk discusses why treatment interactions are hard to estimate, for example in meta-analysis, where a treatment can be more effective in some settings than in others: variation between studies is typically observational even when the studies themselves were randomized. (We struggled with this issue in our analysis of the effects of incentives on survey response rates.) I’d be a little less skeptical of meta-analysis than Berk is, but that probably reflects the differences in the particular applied problems we’ve worked on over the years.

Berk gives a list of advantages of randomization. I’d like to add one more (from Section 7.6 of Bayesian Data Analysis): if you are using no unit-level information in the design, then randomization is, mathematically, the only way to go. (For example, if you give treatment A to all the men and treatment B to all the women, then you’re using the “sex” variable in the treatment assignment. If you alternately assign A,B,A,B,…, then you’re using “time of entry into the study” in the assignment.) All statistical inference principles, whether Bayesian or sampling-theory, recognize the need to include design information in the analysis. Once unit-level information is available for use in the design, some element of randomization is useful in making inferences more robust, as Berk notes in his paper.
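To make the A,B,A,B example concrete, here is a minimal Python sketch contrasting the two assignment rules. The function names and the equal-split design are my own illustrative assumptions, not anything taken from Berk's paper or from Bayesian Data Analysis.

```python
import random


def alternating_assignment(n):
    """Assign A,B,A,B,... by order of entry (hypothetical helper).

    The label is a deterministic function of the unit's position, so
    "time of entry into the study" is part of the design.
    """
    return ["A" if i % 2 == 0 else "B" for i in range(n)]


def randomized_assignment(n, seed=None):
    """Completely randomized assignment with a near-equal split (hypothetical helper).

    The labels are shuffled without looking at any unit-level
    information, so the assignment depends on nothing but the random draw.
    """
    rng = random.Random(seed)
    labels = ["A"] * (n // 2) + ["B"] * (n - n // 2)
    rng.shuffle(labels)
    return labels
```

In the first function, knowing when a unit entered the study tells you its treatment exactly; in the second, it tells you nothing.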

A related point is that the randomization distribution can be directly used for model checking and hypothesis testing, either using sampling-theory inference or, from the Bayesian perspective, via posterior predictive checking.
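As a rough sketch of the sampling-theory version of this idea, the code below approximates the randomization distribution of a difference in means by re-drawing the treatment assignment many times under the sharp null hypothesis of no effect for any unit. The test statistic, the function name, and the Monte Carlo approximation are assumptions I've made for illustration; it also assumes the original design was a completely randomized one with fixed group sizes.

```python
import random
from statistics import mean


def randomization_test(treated, control, n_draws=10000, seed=0):
    """Monte Carlo randomization test for a difference in means (illustrative).

    Under the sharp null, each unit's outcome is the same whichever
    treatment it receives, so we can reshuffle the labels and recompute
    the statistic to see what the design alone would produce.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)  # one hypothetical re-randomization
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add one to numerator and denominator so the estimated
    # p-value is never exactly zero.
    p_value = (extreme + 1) / (n_draws + 1)
    return observed, p_value
```

A call such as `randomization_test([5, 6, 7], [1, 2, 3])` returns the observed difference together with an estimated two-sided p-value. The same reshuffling can be used with any other test statistic.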