Jeff Leek writes:

I just wrote this post about the 5 most influential papers in statistics from 2000-2010. I would be really curious to know your list too.

Scarily enough I can’t think of any truly influential papers from that decade. I suppose this means I’m getting old!

P.S. I did once make a list of the top 5 unpublished papers in statistics.

I would suggest the following two, as they _finally_ implemented a Bayesian approach for addressing the real uncertainties due to lack of randomization in comparative studies by using informative priors for biases.

Wolpert, R. L., & Mengersen, K. L. (2004). Adjusted likelihoods for synthesizing empirical evidence from studies that differ in quality and design: effects of environmental tobacco smoke. Statistical Science, 19(3), 450-471.

Greenland, S. (2005). Multiple‐bias modelling for analysis of observational data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 168(2), 267-306.

As for measures of influence (e.g. citations), it might be worth pointing out that in 2007/8 neither Wolpert nor Greenland was aware of the other’s paper, although both cite DM Eddy as the pioneer.

(Notice I am posting the same list on Andrew Gelman’s blog)

Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein (2010). “Particle Markov chain Monte Carlo methods” (with discussion). Journal of the Royal Statistical Society: Series B 72.3: 269-342.

Aït‐Sahalia, Yacine (2002). “Maximum Likelihood Estimation of Discretely Sampled Diffusions: A Closed‐form Approximation Approach.” Econometrica 70.1: 223-262.

Marjoram P, Molitor J, Plagnol V, Tavaré S (2003) Markov chain Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences 100(26):15324–15328

Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society: Series B, 71(2), 319-392.

Some motivation: in Andrieu et al. it is proved for the first time that the use of an unbiased approximation to the likelihood (for example via sequential Monte Carlo), embedded in an MCMC algorithm, results in exact Bayesian inference, considerably strengthening the (already widespread) use of SMC methods for Bayesian inference, even beyond the framework of state-space models.
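To make the idea concrete, here is a minimal sketch of a Metropolis-Hastings sampler that plugs in an unbiased likelihood estimate. It is not the particle filter construction from Andrieu et al.: the toy model (y given z is N(z, 1), z given theta is N(theta, 1), prior theta is N(0, 100)), the plain Monte Carlo estimator, and all function names are assumptions chosen for illustration. The one detail that matters is that the estimate attached to the current state is stored and reused, never recomputed.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_lik_estimate(theta, data, n_particles, rng):
    """Log of an unbiased Monte Carlo estimate of p(data | theta).

    Toy model (an assumption): z ~ N(theta, 1), y | z ~ N(z, 1).
    For each observation we average p(y | z_i) over fresh draws z_i;
    the product of independent unbiased estimates is itself unbiased.
    """
    total = 0.0
    for y in data:
        logs = [log_normal_pdf(y, rng.gauss(theta, 1.0), 1.0)
                for _ in range(n_particles)]
        m = max(logs)  # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(v - m) for v in logs) / n_particles)
    return total


def pseudo_marginal_mh(data, n_iter=5000, n_particles=20, step=0.5, seed=0):
    """Random-walk MH using the noisy likelihood estimate in place of the exact one."""
    rng = random.Random(seed)
    theta = 0.0
    # Estimated log posterior (up to a constant) of the current state: kept, not refreshed.
    log_post = (log_normal_pdf(theta, 0.0, 100.0)
                + log_lik_estimate(theta, data, n_particles, rng))
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        log_post_prop = (log_normal_pdf(prop, 0.0, 100.0)
                         + log_lik_estimate(prop, data, n_particles, rng))
        if rng.random() < math.exp(min(0.0, log_post_prop - log_post)):
            theta, log_post = prop, log_post_prop
        chain.append(theta)
    return chain
```

In this toy model the marginal likelihood is available exactly (y given theta is N(theta, 2)), so the output can be checked against the analytic posterior. That is the only reason for choosing such a simple example.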

Aït‐Sahalia gives a closed-form approximation to the transition density of a diffusion process, which (when applicable) improves both the computational speed and the quality of the approximation.

Marjoram et al. offer the first MCMC algorithm for Approximate Bayesian Computation (ABC), the latter being an increasingly important methodological and computational tool.
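A rough sketch of the likelihood-free MCMC idea, under stated assumptions: the model (y_i is N(theta, 1)), the N(0, 100) prior, the sample mean as summary statistic, the tolerance `eps`, and the function name are all illustrative choices, not taken from the paper. A proposal is accepted only if data simulated under it land within `eps` of the observed summary and a prior-ratio Metropolis step also passes (the random-walk proposal is symmetric, so its ratio cancels).

```python
import math
import random


def abc_mcmc(observed, n_iter=10000, eps=0.2, step=0.5, seed=1):
    """Likelihood-free MCMC for the mean of a N(theta, 1) model (toy assumption)."""
    rng = random.Random(seed)
    n = len(observed)
    obs_mean = sum(observed) / n  # summary statistic of the observed data

    def log_prior(t):
        # N(0, 100) prior, up to an additive constant
        return -0.5 * t * t / 100.0

    theta = obs_mean  # start where simulated data can plausibly match
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        # Simulate a data set under the proposal; the likelihood is never evaluated.
        sim_mean = sum(rng.gauss(prop, 1.0) for _ in range(n)) / n
        if abs(sim_mean - obs_mean) <= eps:
            # Symmetric proposal: only the prior ratio enters the acceptance step.
            if rng.random() < math.exp(min(0.0, log_prior(prop) - log_prior(theta))):
                theta = prop
        chain.append(theta)
    return chain
```

Shrinking `eps` brings the sampled distribution closer to the true posterior at the cost of a lower acceptance rate.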

Rue et al. is the reference for the so-called INLA.

I meant to write: “Notice I am posting the same list on simplystatistics.org”

Yes, I think that the INLA paper by Rue et al is the most influential paper in this list.

No mention of the elastic net? (Zou, Hui & Trevor Hastie (2005): “Regularization and variable selection via the Elastic Net”, JRSS (B) 67(2):301-320). As far as I can tell from Google Scholar, it has over 2000 citations.

Minka’s 2001 thesis on EP? DIC from 2002?

“It is too early to say”

Zhou Enlai, asked for his assessment of the 1789 French Revolution

Same here.

I’m surprised at the omission of:

Spiegelhalter, D. J.; Best, N. G.; Carlin, B. P. & van der Linde, A. (2002) Bayesian measures of model complexity and fit (with discussion) Journal of the Royal Statistical Society, Series B, 64, 583-639.

This paper, which introduced DIC, had a profound effect on a lot of my work, and a lot of other people’s as well.

Personally, I would add:

Plummer, M. (2008) Penalized loss functions for Bayesian model comparison Biostatistics, 9, 523-539.

Which really helped me understand the first reference, but I don’t think it is as well known.

This was the decade where the influence of papers in journals was overtaken by the influence of published software – especially open-source software. Consider the gap of nearly thirty years between Don Rubin introducing multiple imputation and it becoming this season’s must-have once it popped up in R and Stata (not open-source but definitely affordable compared to SAS or SPSS!!!)