Progress in the past decade

Posted on January 1, 2020 9:00 AM by Andrew

It’s been a busy decade for our research.

Before going on, I’d like to thank hundreds of collaborators, including students; funders from government, nonprofits, and private industry; blog commenters and people who have pointed us to inspiring research, outrages, beautiful and ugly graphs, cat pictures, and all the rest; all those of you who have shared your disagreements and pointed out my errors and my failures in communication; and, most of all, family and friends for your love and support.

Bayes and Stan

Our biggest contribution was Stan, which represents a major research effort in itself (thanks, Bob, Matt, Daniel, and so many others!), has motivated lots of research in Bayesian inference and computation, has facilitated tons of applied work by ourselves and others, and has inspired other probabilistic programming languages targeted to particular classes of models and applications.

Relatedly, during the past decade we completed the third edition of BDA (thanks, Aki!) and Regression and Other Stories (thanks, Jennifer and Aki!). Here’s a list of our published papers on Bayesian methods and computation in the past decade, in reverse chronological order:

[2019] Bayesian hierarchical spatial models: Implementing the Besag York Mollié model in Stan. {\em Spatial and Spatio-temporal Epidemiology}.
(Mitzi Morris, Katherine Wheeler-Martin, Daniel Simpson, Stephen Mooney, Andrew Gelman, and Charles DiMaggio)
[2019] The experiment is just as important as the likelihood in understanding the prior: A cautionary note on robust cognitive modelling. {\em Computational Brain and Behavior}.
(Lauren Kennedy, Daniel Simpson, and Andrew Gelman)
[2018] R-squared for Bayesian regression models. {\em American Statistician}.
(Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari)
[2018] Limitations of “Limitations of Bayesian leave-one-out cross-validation for model selection.” {\em Computational Brain and Behavior}.
(Aki Vehtari, Daniel P. Simpson, Yuling Yao, and Andrew Gelman)
[2018] Yes, but did it work?: Evaluating variational inference. {\em Proceedings of the 35th International Conference on Machine Learning}.
(Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman)
[2018] Using stacking to average Bayesian predictive distributions (with discussion). {\em Bayesian Analysis} {\bf 13}, 917–1003.
(Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman)
[2017] The prior can often only be understood in the context of the likelihood. {\em Entropy} {\bf 19}, 555.
(Andrew Gelman, Daniel Simpson, and Michael Betancourt)
[2017] Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. {\em Statistics and Computing} {\bf 27}, 1413–1432.
(Aki Vehtari, Andrew Gelman, and Jonah Gabry)
[2017] Stan: A probabilistic programming language. {\em Journal of Statistical Software} {\bf 76} (1).
(Bob Carpenter, Andrew Gelman, Matt Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell)
[2017] Consensus Monte Carlo using expectation propagation. {\em Brazilian Journal of Probability and Statistics} {\bf 31}, 692–696.
(Andrew Gelman and Aki Vehtari)
[2017] Automatic differentiation variational inference. {\em Journal of Machine Learning Research} {\bf 18}, 1–45.
(Alp Kucukelbir, Dustin Tran, Rajesh Ranganath, Andrew Gelman, and David M. Blei)
[2017] Fitting Bayesian item response models in Stata and Stan. {\em Stata Journal} {\bf 17}, 343–357.
(Robert Grant, Daniel Furr, Bob Carpenter, and Andrew Gelman)
[2015] Automatic variational inference in Stan. {\em Neural Information Processing Systems}.
(Alp Kucukelbir, Rajesh Ranganath, Andrew Gelman, and David Blei)
[2015] Stan: A probabilistic programming language for Bayesian inference and optimization. {\em Journal of Educational and Behavioral Statistics} {\bf 40}, 530–543.
(Andrew Gelman, Daniel Lee, and Jiqiang Guo)
[2015] Simulation-efficient shortest probability intervals. {\em Statistics and Computing} {\bf 25}, 809–819.
(Ying Liu, Andrew Gelman, and Tian Zheng)
[2014] Weakly informative prior for point estimation of covariance matrices in hierarchical models. {\em Journal of Educational and Behavioral Statistics} {\bf 40}, 136–157.
(Yeojin Chung, Andrew Gelman, Sophia Rabe-Hesketh, Jingchen Liu, and Vincent Dorie)
[2014] Difficulty of selecting among multilevel models using predictive accuracy. {\em Statistics and Its Interface} {\bf 7}.
(Wei Wang and Andrew Gelman)
[2014] Bootstrap averaging: Examples where it works and where it doesn’t work. {\em Journal of the American Statistical Association} {\bf 109}, 1015–1016.
(Andrew Gelman and Aki Vehtari)
[2014] Multiple imputation for continuous and categorical data: Comparing joint and conditional approaches. {\em Political Analysis} {\bf 22}, 497–519.
(Jonathan Kropko, Ben Goodrich, Andrew Gelman, and Jennifer Hill)
[2014] On the stationary distribution of iterative imputations. {\em Biometrika} {\bf 1}, 155–173.
(Jingchen Liu, Andrew Gelman, Jennifer Hill, Yu-Sung Su, and Jonathan Kropko)
[2014] Understanding predictive information criteria for Bayesian models. {\em Statistics and Computing} {\bf 24}, 997–1016.
(Andrew Gelman, Jessica Hwang, and Aki Vehtari)
[2014] The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. {\em Journal of Machine Learning Research} {\bf 15}, 1351–1381.
(Matt Hoffman and Andrew Gelman)
[2013] Two simple examples for understanding posterior p-values whose distributions are far from uniform. {\em Electronic Journal of Statistics} {\bf 7}, 2595–2602.
(Andrew Gelman)
[2013] Inherent difficulties of non-Bayesian likelihood-based inference, as revealed by an examination of a recent book by Aitkin. {\em Statistics \& Risk Modeling} {\bf 30}, 1001–1016.
(Andrew Gelman, Christian Robert, and Judith Rousseau)
[2013] Nonparametric models can be checked. {\em Bayesian Analysis}.
(Andrew Gelman)
[2013] A nondegenerate estimator for hierarchical variance parameters via penalized likelihood estimation. {\em Psychometrika} {\bf 78}, 685–709.
(Yeojin Chung, Sophia Rabe-Hesketh, Andrew Gelman, Jingchen Liu, and Vincent Dorie)
[2013] Does quantum uncertainty have a place in everyday applied statistics? {\em Behavioral and Brain Sciences} {\bf 36}, 285.
(Andrew Gelman and Michael Betancourt)
[2012] Why we (usually) don’t have to worry about multiple comparisons. {\em Journal of Research on Educational Effectiveness} {\bf 5}, 189–211.
(Andrew Gelman, Jennifer Hill, and Masanao Yajima)
[2011] Inference from simulations and monitoring convergence. In {\em Handbook of Markov Chain Monte Carlo}, ed.\ S. Brooks, A. Gelman, G. Jones, and X. L. Meng. CRC Press.
(Andrew Gelman and Kenneth Shirley)
[2011] Multiple imputation with diagnostics (mi) in R: Opening windows into the black box. {\em Journal of Statistical Software} {\bf 45} (2).
(Yu-Sung Su, Andrew Gelman, Jennifer Hill, and Masanao Yajima)
[2010] Review of {\em The Search for Certainty}, by Krzysztof Burdzy. {\em Bayesian Analysis}.
(Andrew Gelman)
[2010] Adaptively scaling the Metropolis algorithm using expected squared jumped distance. {\em Statistica Sinica} {\bf 20}, 343–364.
(Cristian Pasarica and Andrew Gelman)

The method developed in that paper with Pasarica did not directly come to much. But a few years later the idea of maximizing expected squared jumped distance helped motivate Matt Hoffman to develop the very useful NUTS (no-U-turn sampler) algorithm. This demonstrates the potential benefit of pushing through our research ideas, even when they don’t lead to anything right away.

Voting, public opinion, and sample surveys

The motivation for all the above work on Bayesian methods and computing was to make progress in applied problems. Here’s our recent published work in voting, public opinion, and sample surveys:

[2019] Voter registration databases and MRP: Toward the use of large scale databases in public opinion research. {\em Political Analysis}.
(Yair Ghitza and Andrew Gelman)
[2018] Bayesian inference under cluster sampling with probability proportional to size. {\em Statistics in Medicine}.
(Susanna Makela, Yajuan Si, and Andrew Gelman)
[2018] Disentangling bias and variance in election polls. {\em Journal of the American Statistical Association}.
(Houshmand Shirani-Mehr, David Rothschild, Sharad Goel, and Andrew Gelman)
[2017] 19 things we learned from the 2016 election (with discussion). {\em Statistics and Public Policy} {\bf 4}.
(Andrew Gelman and Julia Azari)
How special was 2016? (rejoinder to discussion).
(Julia Azari and Andrew Gelman)
[2017] The 2008 election: A preregistered replication analysis. {\em Statistics and Public Policy} {\bf 4}.
(Rayleigh Lei, Andrew Gelman, and Yair Ghitza)
Online appendix.
[2017] Learning about networks using sampling. {\em Journal of Survey Statistics and Methodology} {\bf 5}, 22–28.
(Andrew Gelman)
[2016] High-frequency polling with non-representative data. In {\em Political Communication in Real Time: Theoretical and Applied Research Approaches}, 89–105.
(Andrew Gelman, Sharad Goel, David Rothschild, and Wei Wang)
[2016] Will public opinion about inequality be packaged into neatly partisan positions? {\em Pathways}, Winter, 27–32.
(Andrew Gelman and Leslie McCall)
[2016] The mythical swing voter. {\em Quarterly Journal of Political Science} {\bf 11}, 103–130.
(Andrew Gelman, Sharad Goel, Douglas Rivers, and David Rothschild)
[2015] Incorporating the sampling design in weighting adjustments for panel attrition. {\em Statistics in Medicine} {\bf 34}, 3637–3647.
(Qixuan Chen, Andrew Gelman, Melissa Tracy, Fran H. Norris, and Sandro Galea)
[2015] Bayesian nonparametric weighted sampling inference. {\em Bayesian Analysis} {\bf 10}, 605–625.
(Yajuan Si, Natesh Pillai, and Andrew Gelman)
[2015] American democracy and its critics. Review of {\em American Democracy}, by Andrew Perrin. {\em American Journal of Sociology} {\bf 120}, 1562–1564.
(Andrew Gelman)
[2015] Forecasting elections with non-representative polls. {\em International Journal of Forecasting} {\bf 31}, 980–991.
(Wei Wang, David Rothschild, Sharad Goel, and Andrew Gelman)
[2015] Hierarchical models for estimating state and demographic trends in U.S. death penalty public opinion. {\em Journal of the Royal Statistical Society A} {\bf 178}, 1–28.
(Kenneth Shirley and Andrew Gelman)
[2014] The twentieth-century reversal: How did the Republican states switch to the Democrats and vice versa? {\em Statistics and Public Policy} {\bf 1}, 1–5.
(Andrew Gelman)
[2014] How Bayesian analysis cracked the red-state, blue-state problem. {\em Statistical Science} {\bf 29}, 26–35.
(Andrew Gelman)
[2013] Deep interactions with MRP: Election turnout and voting patterns among small electoral subgroups. {\em American Journal of Political Science} {\bf 57}, 762–776.
(Yair Ghitza and Andrew Gelman)
[2013] Charles Murray’s {\em Coming Apart} and the measurement of social and political divisions. {\em Statistics, Politics and Policy} {\bf 4}, 70–81.
(Andrew Gelman)
[2013] A practical guide to measuring social structure using indirectly observed network data. {\em Journal of Statistical Theory and Practice} {\bf 7}, 120–132.
(Tyler McCormick, Amal Moussa, Johannes Ruf, Thomas DiPrete, Andrew Gelman, Julien Teitler, and Tian Zheng)
[2012] Red state / blue state divisions in the 2012 presidential election. {\em The Forum} {\bf 10}, 127–141.
(Avi Feller, Andrew Gelman, and Boris Shor)
[2012] Estimating partisan bias of the electoral college under proposed changes in elector apportionment. {\em Statistics, Politics and Policy} {\bf 3}, 1–13.
(Andrew C. Thomas, Andrew Gelman, Gary King, and Jonathan Katz)
[2012] Understanding persuasion and activation in presidential campaigns: The random walk and mean-reversion models. {\em Presidential Studies Quarterly} {\bf 42}, 843–866.
(Noah Kaplan, David K. Park, and Andrew Gelman)
[2012] Discussion of {\em Left Turn}, by Tim Groseclose. {\em Perspectives on Politics} {\bf 10}, 775–779.
(Justin Gross, Cosma Shalizi, and Andrew Gelman)
[2012] What is the probability your vote will make a difference? {\em Economic Inquiry} {\bf 50}, 321–326.
(Andrew Gelman, Nate Silver, and Aaron Edlin)
[2011] Economic divisions and political polarization in red and blue America. {\em Pathways} (Summer), 3–6.
(Andrew Gelman)
[2010] Protecting minorities in binary elections: A test of storable votes using field data. {\em B.E. Journal of Economic Analysis \& Policy} {\bf 10} (1).
(Alessandra Casella, Shuky Ehrenberg, Andrew Gelman, and Jie Shen)
[2010] Breaking down the 2008 vote. In {\em Atlas of the 2008 Election}, ed.\ S. Brunn.
(Andrew Gelman)
[2010] Voting by education in 2008. {\em Chance} {\bf 23}, 8.
(Andrew Gelman and Yu-Sung Su)
[2010] What do we know at 7pm on election night? {\em Mathematics Magazine} {\bf 83}, 258–266.
(Andrew Gelman and Nate Silver)
[2010] Public opinion on health care reform. {\em The Forum} {\bf 8} (1), article 8.
(Andrew Gelman, Daniel Lee, and Yair Ghitza)
[2010] A snapshot of the 2008 election. {\em Statistics, Politics and Policy} {\bf 1} (1), article 3.
(Andrew Gelman, Daniel Lee, and Yair Ghitza)
[2010] Bayesian combination of state polls and election forecasts. {\em Political Analysis} {\bf 18}, 337–348.
(Kari Lock and Andrew Gelman)
[2010] Economics and voter irrationality. Review of {\em The Myth of the Rational Voter}, by Bryan Caplan. {\em Political Psychology} {\bf 31}, 139–142.
(Andrew Gelman)
[2010] Income inequality and partisan voting in the United States. {\em Social Science Quarterly} {\bf 91}, 1203–1219.
(Andrew Gelman, Lane Kenworthy, and Yu-Sung Su)

Wow! I’d forgotten about a lot of that.

Other applied work

And here’s our recent published work in other applied areas:

[2018] Gaydar and the fallacy of decontextualized measurement. {\em Sociological Science} {\bf 5}, 270–280.
(Andrew Gelman, Greggor Matson, and Daniel Simpson)
[2018] The Millennium Villages Project: A retrospective, observational, endline evaluation. {\em Lancet Global Health} {\bf 6}.
(Shira Mitchell, Andrew Gelman, Rebecca Ross, Joyce Chen, Sehrish Bari, Uyen Kim Huynh, Matthew W. Harris, Sonia Ehrlich Sachs, Elizabeth A. Stuart, Avi Feller, Susanna Makela, Alan M. Zaslavsky, Lucy McClellan, Seth Ohemeng-Dapaah, Patricia Namakula, Cheryl A. Palm, and Jeffrey D. Sachs)
Supplementary appendix.
[2018] Review of {\em New Explorations into International Relations: Democracy, Foreign Investment, Terrorism, and Conflict}, by Seung-Whan Choi. {\em Perspectives on Politics}.
(Andrew Gelman)
[2018] Global shifts in the phenological synchrony of species interactions over recent decades. {\em Proceedings of the National Academy of Sciences}.
(Heather M. Kharouba, Johan Ehrlén, Andrew Gelman, Kjell Bolmgren, Jenica M. Allen, Steve E. Travers, and Elizabeth M. Wolkovich)
[2018] Bayesian aggregation of average data: An application in drug development. {\em Annals of Applied Statistics} {\bf 12}, 1583–1604.
(Sebastian Weber, Andrew Gelman, Daniel Lee, Michael Betancourt, Aki Vehtari, and Amy Racine-Poon)
[2017] Exploring the relationships between USMLE performance and disciplinary action in practice: A validity study of score inferences from a licensure examination. {\em Academic Medicine} {\bf 92}, 1780–1785.
(Monica M. Cuddy, Aaron Young, Andrew Gelman, David B. Swanson, David A. Johnson, Gerard F. Dillon, and Brian E. Clauser)
[2016] Age-aggregation bias in mortality trends. {\em Proceedings of the National Academy of Sciences} {\bf 113}, E816–E817.
(Andrew Gelman and Jonathan Auerbach)
[2015] A model-based approach to climate reconstruction using tree-ring data. {\em Journal of the American Statistical Association} {\bf 111}, 93–106.
(Matthew Schofield, Richard Barker, Andrew Gelman, Edward Cook, and Keith Briffa)
[2015] Centralized analysis of local data, with dollars and lives on the line: Lessons from the home radon experience. In {\em Data Science for Politics, Policy and Government}, ed.\ R. Michael Alvarez. Cambridge University Press.
(Phillip N. Price and Andrew Gelman)
[2014] “How many zombies do you know?”: Using indirect survey methods to measure alien attacks and outbreaks of the undead. In {\em Writing Today}, third edition, ed.\ Richard Johnson-Sheehan and Charles Paine.
(Andrew Gelman)
[2014] Stop and frisk: What’s the problem? {\em Criminal Law and Criminal Justice Books}.
(Andrew Gelman)
[2013] Rates and correlates of HIV and STI infection among homeless women. {\em AIDS and Behavior} {\bf 17}, 856–864.
(Carol L. M. Caton, Nabila El-Bassel, Andrew Gelman, Susan Barrow, Daniel Herman, Eustace Hsu, Ana Z. Tochterman, Karen Johnson, and Alan Felix)
[2012] Freakonomics: What went wrong? {\em American Scientist}.
(Andrew Gelman and Kaiser Fung)
[2011] Segregation in social networks based on acquaintanceship and trust. {\em American Journal of Sociology} {\bf 116}, 1234–1283.
(Thomas A. DiPrete, Andrew Gelman, Tyler McCormick, Julien Teitler, and Tian Zheng)
[2010] Economic disparities and life satisfaction in European regions. {\em Social Indicators Research} {\bf 96}, 339–361.
(Maria Grazia Pittau, Roberto Zelli, and Andrew Gelman)
[2010] Can fractals be used to predict human history? Review of {\em Bursts}, by Albert-Laszlo Barabasi. {\em Physics Today}.
(Andrew Gelman)

I included the zombies paper in the above list, but I really could’ve counted it as survey methods.

Open science and ethics

Recently we’ve been thinking a lot about open science and ethics:

[2019] A consensus-based transparency checklist. {\em Nature Human Behaviour}, doi:10.1038/s41562-019-0772-6.
(Balazs Aczel, Barnabas Szaszi, Alexandra Sarafoglou, Zoltan Kekecs, Šimon Kucharský, Daniel Benjamin, Christopher Chambers, Agneta Fisher, Andrew Gelman, et al.)
[2019] When we make recommendations for scientific practice, we are (at best) acting as social scientists. {\em European Journal of Clinical Investigation}.
(Andrew Gelman)
[2019] Childhood obesity intervention studies: A narrative review and guide for investigators, authors, editors, reviewers, journalists, and readers to guard against exaggerated effectiveness claims. {\em Obesity Reviews}.
(Andrew Brown, Douglas Altman, Tom Baranowski, J. Martin Bland, John Dawson, Nikhil Dhurandhar, Shima Dowla, Kevin Fontaine, Andrew Gelman, Steven Heymsfield, Wasantha Jayawardene, Scott Keith, Theodore Kyle, Eric Loken, J. Michael Oakes, June Stevens, Diana Thomas, and David Allison)
[2019] The implementation of randomization requires corrected analyses. Comment on “Comprehensive nutritional and dietary intervention for autism spectrum disorder—A randomized, controlled 12-month trial,” Nutrients 2018, 10, 369. {\em Nutrients} {\bf 11}, 1126.
(Colby J. Vorland, Andrew W. Brown, Stephanie L. Dickinson, Andrew Gelman, and David B. Allison)
[2019] Multiple perspectives on inference for two simple statistical scenarios. {\em American Statistician} {\bf 73} (S1), 328–339.
(Noah N. N. van Dongen, Johnny B. van Doorn, Quentin F. Gronau, Don van Ravenzwaaij, Rink Hoekstra, Matthias N. Haucke, Daniel Lakens, Christian Hennig, Richard D. Morey, Saskia Homer, Andrew Gelman, Jan Sprenger, and Eric-Jan Wagenmakers)
[2019] Large scale replication projects in contemporary psychological research. {\em American Statistician} {\bf 73} (S1), 99–105.
(Blakeley B. McShane, Jennifer L. Tackett, Ulf Bockenholt, and Andrew Gelman)
[2018] Do researchers anchor their beliefs on the outcome of an initial study? Testing the time-reversal heuristic. {\em Experimental Psychology} {\bf 65}, 158–169.
(Anja Ernst, Rink Hoekstra, Eric-Jan Wagenmakers, Andrew Gelman, and Don van Ravenzwaaij)
[2018] Ethics in statistical practice and communication: Five recommendations. {\em Significance}.
(Andrew Gelman)
[2018] Don’t characterize replications as successes or failures. Discussion of “Making replication mainstream,” by Rolf A. Zwaan et al. {\em Behavioral and Brain Sciences}.
(Andrew Gelman)
[2018] How to think scientifically about scientists’ proposals for fixing science. {\em Socius}.
(Andrew Gelman)
[2018] Learning from and responding to statistical criticism. {\em Observational Studies}.
(Andrew Gelman)
[2017] Some natural solutions to the p-value communication problem—and why they won’t work. {\em Journal of the American Statistical Association} {\bf 112}, 899–901.
(Andrew Gelman and John B. Carlin)
[2017] Honesty and transparency are not enough. {\em Chance} {\bf 30} (1), 37–39.
(Andrew Gelman)
[2017] The statistical crisis in science: How is it relevant to clinical neuropsychology? {\em Clinical Neuropsychologist} {\bf 31}, 1000–1014.
(Andrew Gelman and Hilde Geurts)
[2016] Questionable association between front boarding and air rage. {\em Proceedings of the National Academy of Sciences} {\bf 113}, E7348.
(Marcus Crede, Andrew Gelman, and Carol Nickerson)
[2016] A Bayesian bird’s eye view of ‘Replications of important results in social psychology.’ {\em Royal Society Open Science} {\bf 4}, 160426.
(Maarten Marsman, Felix Schoonbrodt, Richard Morey, Yuling Yao, Andrew Gelman, and Eric-Jan Wagenmakers)
[2016] Commentary on “Crisis in science? Or Crisis in statistics! Mixed messages in statistics with impact on science,” by Donald A. S. Fraser and Nancy M. Reid. {\em Journal of Statistical Research} {\bf 48–50}, 11–12.
(Andrew Gelman)
[2016] Increasing transparency through a multiverse analysis. {\em Perspectives on Psychological Science} {\bf 11}, 702–712.
(Sara Steegen, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel)
Supplemental materials.
[2016] The problems with p-values are not just with p-values. {\em American Statistician}.
(Andrew Gelman)
[2015] Political attitudes in social environments. Discussion of “Political diversity will improve social psychological science,” by Jose Duarte et al. {\em Behavioral and Brain Sciences} {\bf 38}, 26–27.
(Andrew Gelman and Neil Gross)
[2015] Statistics and the crisis of scientific replication. {\em Significance} {\bf 12} (3), 33–35.
(Andrew Gelman)
[2015] How is ethics like logistic regression? Ethics decisions, like statistical inferences, are informative only if they’re not too easy or too hard. {\em Chance} {\bf 28} (2), 31–33.
(Andrew Gelman and David Madigan)
[2015] Statistics and research integrity. {\em European Science Editing} {\bf 41} (1), 13–14.
(Andrew Gelman)
[2015] Disagreements about the strength of evidence. {\em Chance} {\bf 28}, 55–59.
(Andrew Gelman)
[2014] A world without statistics. {\em Significance} {\bf 11} (4), 47.
(Andrew Gelman)
[2014] The statistical crisis in science. {\em American Scientist} {\bf 102}, 460–465.
(Andrew Gelman and Eric Loken)
[2014] The Commissar for Traffic presents the latest Five-Year Plan. {\em Chance} {\bf 27} (2), 58–60.
(Andrew Gelman and Phillip N. Price)
[2014] The AAA tranche of subprime science. {\em Chance} {\bf 27} (1), 51–56.
(Andrew Gelman and Eric Loken)
[2013] Is it possible to be an ethicist without being mean to people? {\em Chance} {\bf 26} (4), 51–53.
(Andrew Gelman)
[2013] In praise of the referee. {\em International Society for Bayesian Analysis Bulletin} {\bf 20} (1), 13–19.
(Nicolas Chopin, Andrew Gelman, Kerrie Mengersen, and Christian Robert)
[2013] It’s too hard to publish criticisms and obtain data for replication. {\em Chance} {\bf 26} (3), 49–52.
(Andrew Gelman)
[2013] To throw away data: Plagiarism as a statistical crime. {\em American Scientist} {\bf 101}, 168–171.
(Andrew Gelman and Thomas Basboll)
[2013] They’d rather be rigorous than right. {\em Chance} {\bf 26} (2), 45–49.
(Andrew Gelman)
[2013] The war on data. {\em Chance} {\bf 26} (1), 57–60.
(Andrew Gelman and Mark Palko)
[2013] Preregistration of studies and mock reports. {\em Political Analysis} {\bf 21}, 40–41.
(Andrew Gelman)
[2012] Ethics and the statistical use of prior information. {\em Chance} {\bf 25} (4), 52–54.
(Andrew Gelman)
[2012] Statistics for sellers of cigarettes. {\em Chance} {\bf 25} (3), 43–46.
(Andrew Gelman)
[2012] Ethics in medical trials: Where does statistics fit in? {\em Chance} {\bf 25} (2), 52–54.
(Andrew Gelman)
[2011] Ethics and statistics: Open data and open methods. {\em Chance} {\bf 24} (4), 51–53.
(Andrew Gelman)

There are a ton of papers in that list, in part in response to recent concerns about scientific replication, and in part because at the beginning of the decade I had the idea of running a regular column on ethics and statistics for Chance magazine, with the idea of putting the columns into a book. I doubt I’ll write a book specifically on ethics and statistics—I just don’t think there would be that much of an audience for it—but I’ve learned a lot from thinking about these issues.

My favorite of my articles on open science is “What has happened down here is the winds have changed,” from 2016; it’s not on the above list because I forgot to ever send it to a magazine or journal to be officially published, so it exists only as a blog entry.

Understanding the statistical properties of statistical methods as they are used

Related to work in open science is our research into the statistical properties of the statistical methods that people actually use. Theoretical statistics is the theory of applied statistics, so this all might be labeled real-world frequentist statistics:

[2019] Are confidence intervals better termed “uncertainty intervals”? {\em British Medical Journal} {\bf 366}.
(Andrew Gelman and Sander Greenland)
[2019] Objective Randomised Blinded Investigation With Optimal Medical Therapy of Angioplasty in Stable Angina (ORBITA) and coronary stents: A case study in the analysis and reporting of clinical trials. {\em American Heart Journal}.
(Andrew Gelman, John Carlin, and Brahmajee Nallamothu)
[2019] Abandon statistical significance. {\em American Statistician} {\bf 73} (S1), 235–245.
(Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett)
[2018] Post-hoc power using observed estimate of effect size is too noisy to be useful. {\em Annals of Surgery}.
(Andrew Gelman)
[2018] The statistical significance filter leads to overconfident expectations of replicability. {\em Journal of Memory and Language} {\bf 103}, 151–175.
(Shravan Vasishth, Daniela Mertzen, Lena A. Jäger, and Andrew Gelman)
[2018] Don’t calculate post-hoc power using observed estimate of effect size. {\em Annals of Surgery}.
(Andrew Gelman)
[2018] The failure of null hypothesis significance testing when studying incremental changes, and what to do about it. {\em Personality and Social Psychology Bulletin} {\bf 44}, 16–23.
(Andrew Gelman)
[2017] Measurement error and the replication crisis. {\em Science} {\bf 355}, 584–585.
(Eric Loken and Andrew Gelman)
[2017] Type M error might explain Weisburd’s Paradox. {\em Journal of Quantitative Criminology}.
(Andrew Gelman, Torbjørn Skardhamar, and Mikko Aaltonen)
[2015] The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective. {\em Journal of Management} {\bf 41}, 632–643.
(Andrew Gelman)
[2014] Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors. {\em Perspectives on Psychological Science} {\bf 9}, 641–651.
(Andrew Gelman and John B. Carlin)
[2014] Revised evidence for statistical standards. {\em Proceedings of the National Academy of Sciences USA}.
(Andrew Gelman and Christian Robert)
[2014] Difficulties in making inferences about scientific truth from distributions of published p-values. {\em Biostatistics} {\bf 1}, 18–23.
(Andrew Gelman and Keith O’Rourke)
[2013] Interrogating P-values. {\em Journal of Mathematical Psychology} {\bf 57}, 188–189.
(Andrew Gelman)
[2013] P-values and statistical practice. {\em Epidemiology} {\bf 24}, 69–72.
(Andrew Gelman)

History, philosophy, and statistics education

My collaborators and I have also written some things on history, philosophy, and statistics education. Much of this represents ideas that my colleagues and I have been discussing for decades, that we finally got an opportunity to write up and discuss formally. Others were responses to new ideas or developments:

[2019] The principles of uncertainty. Review of “Do Dice Play God,” by Ian Stewart. {\em Nature} {\bf 569}, 628–629.
(Andrew Gelman)
[2019] Laplace’s theories of cognitive illusions, heuristics, and biases (with discussion). {\em Statistical Science}.
(Joshua B. Miller and Andrew Gelman)
[2018] Donald Rubin. In {\em Encyclopedia of Social Research Methods}, ed.\ Paul Atkinson, Sara Delamont, Melissa Hardy, and Malcolm Williams. Thousand Oaks, Calif.: Sage Publications.
(Andrew Gelman)
[2017] Beyond subjective and objective in statistics (with discussion and rejoinder). {\em Journal of the Royal Statistical Society A} {\bf 180}, 967–1033.
(Andrew Gelman and Christian Hennig)
[2015] The state of the art in causal inference: Some changes since 1972. {\em Observational Studies} {\bf 1}, 182–183.
(Andrew Gelman)
[2015] Regression: What’s it all about? Review of {\em Bayesian and Frequentist Regression Methods}, by Jon Wakefield. {\em Statistics in Medicine}.
(Andrew Gelman)
[2015] Moving forward in statistics education while avoiding overconfidence. Discussion of “Mere Renovation is Too Little Too Late: It’s Time to Rebuild the Undergraduate Curriculum from the Ground Up,” by George Cobb. {\em American Statistician} {\bf 69}.
(Andrew Gelman and Eric Loken)
[2014] How do we choose our default methods? For the Committee of Presidents of Statistical Societies (COPSS) 50th anniversary volume.
(Andrew Gelman)
[2014] When do stories work? Evidence and illustration in the social sciences. {\em Sociological Methods and Research} {\bf 43}, 547–570.
(Andrew Gelman and Thomas Basboll)
[2013] Convincing evidence. For a volume on theoretical or methodological research on authorship, functional roles, reputation, and credibility on social media, ed.\ Sorin Matei and Elisa Bertino.
(Andrew Gelman and Keith O’Rourke)
[2013] “Not only defended but also applied”: The perceived absurdity of Bayesian inference (with discussion). {\em American Statistician} {\bf 67}, 1–17.
(Andrew Gelman and Christian Robert)
The anti-Bayesian moment and its passing (rejoinder to discussion).
(Andrew Gelman and Christian Robert)
[2013] Philosophy and the practice of Bayesian statistics (with discussion). {\em British Journal of Mathematical and Statistical Psychology} {\bf 66}, 8–38.
(Andrew Gelman and Cosma Shalizi)
[2013] Rejoinder to discussion. {\em British Journal of Mathematical and Statistical Psychology} {\bf 66}, 76–80.
(Andrew Gelman and Cosma Shalizi)
[2012] What made Bell Labs special? Review of {\em The Idea Factory: Bell Labs and the Great Age of American Innovation}, by Jon Gertner. {\em Physics World}, December, 39–40.
(Andrew Gelman)
[2012] Statisticians: When we teach, we don’t practice what we preach. {\em Chance} {\bf 25} (1), 47–48.
(Andrew Gelman and Eric Loken)
[2011] Induction and deduction in Bayesian data analysis. {\em Rationality, Markets and Morals}, special topic issue “Statistical Science and Philosophy of Science: Where Do (Should) They Meet In 2011 and Beyond?”, ed.\ Deborah Mayo, Aris Spanos, and Kent Staley.
(Andrew Gelman)
[2011] Going beyond the book: toward critical reading in statistics teaching. {\em Teaching Statistics} {\bf 34}, 82–86.
(Andrew Gelman)
[2011] Bayesian statistical pragmatism. {\em Statistical Science} {\bf 26}, 10–11.
(Andrew Gelman)
[2011] Philosophy and the practice of Bayesian statistics in the social sciences. In {\em Oxford Handbook of the Philosophy of the Social Sciences}, ed.\ Harold Kincaid. Oxford University Press.
(Andrew Gelman and Cosma Shalizi)
[2010] Bayesian statistics then and now. {\em Statistical Science} {\bf 25}, 162–165.
(Andrew Gelman)

Also, Deb Nolan and I came out with the second edition of Teaching Statistics: A Bag of Tricks.

Causal inference

We’ve also done some research on causal inference:

Not a lot of papers on the topic, as I’m not always clear on what I can add to these discussions—there’s a reason that Jennifer is the main author of the causal chapters in our books—but causal inference is central to statistics (recall the title of this blog!), so I’m glad to contribute to it in some way.

Statistical graphics and visualization

Visualization is one of my favorite topics that we keep coming back to, as part of our larger effort to incorporate statistical practice into formal statistical theory and methods:

[2018] Visualization in Bayesian workflow (with discussion). {\em Journal of the Royal Statistical Society A}.
(Jonah Gabry, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman)
[2016] Graphical visualization of polling results. In {\em Oxford Handbook on Polling and Polling Methods}, ed.\ Lonna Atkeson and Michael Alvarez.
(Susanna Makela, Yajuan Si, and Andrew Gelman)
[2014] Statistical graphics for survey weights. {\em Revista Colombiana de Estadistica} {\bf 37}, 285–295.
(Susanna Makela, Yajuan Si, and Andrew Gelman)
[2013] Infovis and statistical graphics: Different goals, different looks (with discussion). {\em Journal of Computational and Graphical Statistics} {\bf 22}, 2–49.
(Andrew Gelman and Antony Unwin)
Tradeoffs in information graphics (rejoinder to discussion).
(Andrew Gelman and Antony Unwin)
[2011] Tables as graphs: The Ramanujan principle. {\em Significance} {\bf 8}, 183.
(Andrew Gelman)
[2011] Statistical graphics: making information clear — and beautiful. {\em Significance} {\bf 8}, 135–137.
(Jarad Niemi and Andrew Gelman)
[2011] Why tables are really much better than graphs (with discussion). {\em Journal of Computational and Graphical Statistics} {\bf 20}, 3–40.
(Andrew Gelman)

That’s all only part of the story

The above list is incomplete, in that it does not include unpublished papers, blogging (we’ve had something like 6000 posts and 100,000 comments in the past decade), case studies, wiki pages, and other modes of research communication.

Let me emphasize that all this work is collaborative. Even the articles published only under my name are collaborative in that they are the results of lots of reading and discussions with others. Let’s remember to avoid the scientist-as-hero narrative.

It’s been an eventful decade in the world: economic development, environmental challenges, social and political opportunities, and nearly a billion new people. Statistical modeling, causal inference, and social science are only a tiny part of all of this, and the work of my collaborators and myself is only a tiny part of statistical modeling, causal inference, and social science, but we still try in some way to develop tools for people to be able to understand and improve our physical and social environments. My colleagues and I have been privileged to have a working environment that has allowed us to make efforts in these directions, and we’ve also worked hard (with books, research articles, journalism, blogging, software, documentation, and online forums) to engage with and build communities of people who can do similar work.

8 thoughts on “Progress in the past decade”

Sameera Daniels on January 1, 2020 at 11:25 am said:

Thank you so much. Have an even more fruitful research 2020. My mind has been enriched by everyone here. Hugs.

D Kane on January 1, 2020 at 11:47 am said:

Happy New Year!

> My favorite of my articles on open science is What has happened down here is the winds have changed.

Indeed. Highly recommended to anyone who has not read it.

Diana Senechal on January 1, 2020 at 2:05 pm said:

Congratulations, thank you, and happy new decade! There’s a lot here to read, and it’s good to see so many people (and types of contributions) acknowledged.

- Andrew on January 1, 2020 at 3:48 pm said:
  
  Diana:
  
  One reason I want to credit others is that I remember what it was like to be a kid. When I was a kid, I remember thinking that lots of adults must have forgotten what it had been like, and I resolved to always remember. And I did.
  
jim on January 2, 2020 at 12:38 pm said:

Wow, what an amazing pile of work! Awesome.

Thanks to everyone for so many great insights and thoughts and so much amazing work.

jd on January 2, 2020 at 2:21 pm said:

It’s cool to see all of this in one blog post. This is a great resource too.

Thanks and kudos to everyone working on all of this!

Also, thanks for this blog. Happy new decade

Kaiser on January 3, 2020 at 1:35 am said:

Happy new year! That’s an amazing list and a super productive decade. It’s hard to catch up with everything you’re doing. The blogging – free education – is most appreciated. Thank you!

- Martha (Smith) on January 3, 2020 at 11:09 am said:
  
  +1
  

Statistical Modeling, Causal Inference, and Social Science
