Discussion of the paper by Girolami and Calderhead on Bayesian computation

Here’s my discussion of this article for the Journal of the Royal Statistical Society:

I will comment on this paper in my role as applied statistician and consumer of Bayesian computation. In the last few years, my colleagues and I have felt the need to fit models predicting survey responses given multiple discrete predictors, for example estimating voting given ethnicity and income within each of the fifty states, or estimating public opinion about gay marriage given age, sex, ethnicity, education, and state. We would like to be able to fit such models with ten or more predictors, for example religion, religious attendance, marital status, and urban/rural/suburban residence in addition to the factors mentioned above.
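To fix ideas, here is a sketch of the kind of model we have in mind (the notation is illustrative, not taken from the paper under discussion): a multilevel logistic regression in which each factor and each interaction contributes its own batch of coefficients, with each batch getting its own variance estimated from the data,

\[
\Pr(y_i = 1) \;=\; \mathrm{logit}^{-1}\!\left(\beta_0 + \alpha^{\mathrm{eth}}_{j[i]} + \alpha^{\mathrm{inc}}_{k[i]} + \alpha^{\mathrm{state}}_{s[i]} + \alpha^{\mathrm{eth}\times\mathrm{inc}}_{j[i],k[i]} + \alpha^{\mathrm{eth}\times\mathrm{state}}_{j[i],s[i]} + \cdots\right),
\qquad
\alpha^{(m)}_{\ell} \sim \mathrm{N}(0, \sigma_m^2).
\]

With ten or more factors, the number of interaction batches, and hence of latent parameters, grows very quickly, which is what makes the computation demanding.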

There are (at least) three reasons for fitting a model with many predictive factors and potentially a huge number of interactions among them:

1. Deep interactions can be of substantive interest. For example, Gelman et al. (2009) discuss the importance of interactions between income, religion, religious attendance, and state in understanding how people vote.

2. Deep interactions can increase predictive power. For example, Gelman and Ghitza (2010) show how the relation between voter turnout and the combination of sex, ethnicity, education, and state has systematic patterns that would not be captured by main effects or even two-way interactions.

3. Deep interactions can help correct for sampling problems. Nonresponse rates in opinion polls continue to rise, and this puts a premium on post-sampling adjustments. We can adjust for known differences between sample and population using poststratification, but to do so we need reasonable estimates of the average survey response within narrow slices of the population (Gelman, 2007).
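To spell out the poststratification step (standard notation, nothing specific to the paper under discussion): divide the population into cells c defined by the cross-classification of the adjustment variables, with known population counts N_c; the poststratified estimate of the population average is then

\[
\hat{\theta}^{\mathrm{PS}} \;=\; \frac{\sum_{c} N_c\, \hat{\theta}_c}{\sum_{c} N_c},
\]

where \hat{\theta}_c is the estimated average response in cell c. The difficulty is that many cells contain few or no sampled respondents, so the \hat{\theta}_c must come from a hierarchical model rather than from raw cell means.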

Our key difficulty–familiar in applied statistics but not always so clear in discussions of statistical computation–is that, while we have an idea of the sort of model we would like to fit, we are unclear on the details. Thus, our computational task is not merely to fit a single model but to try out many different possibilities. My colleagues and I need computational tools that are:

(a) able to work with moderately large datasets (aggregations of surveys with total sample size in the hundreds of thousands);
(b) able to handle complicated models with tens of thousands of latent parameters;
(c) flexible enough to fit models that we haven’t yet thought of;
(d) fast enough that we can fit model after model.

We all know by now that hierarchical Bayesian methods are a good way of estimating large numbers of parameters. I am excited about the article under discussion, and others like it, because the tools therein promise to satisfy conditions (a), (b), (c), (d) above.

2 thoughts on “Discussion of the paper by Girolami and Calderhead on Bayesian computation”

  1. Perhaps it is obvious that (b)-(d) are currently hopeless without being Bayesian.

    Given that simulation-based calculations are so important, improvements are critical.

    K?

  2. Being marched in by those we think we are marching in on.

    Given this http://www.stat.columbia.edu/~cook/movabletype/ar

    I think I can make this curious quip make sense.

    As the use of Bayesian methods in clinical trials has resulted in the FDA requiring/suggesting simulations to pin down type 1 and type 2 error rates, these now-frequentist methods require even more computation (repeated simulations of the Bayesian MCMC).

    And hence much better computational resources will perhaps encourage this more widely: the turning of Bayesian methods into frequentist methods via simulation (see the sketch below) …

    K?
    p.s. guess I better actually start reading that Berger material on Bayesian p-values
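A toy sketch of the kind of nested simulation described in the comment above, under assumed ingredients (a normal data model with a conjugate prior and an arbitrary 0.975 posterior-probability decision threshold, none of which come from the discussion itself). The posterior here is available in closed form; in a real trial design each inner analysis would be a full Bayesian MCMC run, which is exactly the computational burden being pointed to:

```python
# Toy sketch (assumed ingredients): frequentist calibration of a Bayesian
# decision rule by repeated simulation.
# Data model: y_1,...,y_n ~ Normal(theta, 1); prior: theta ~ Normal(0, 1).
# Decision rule: declare an effect if Pr(theta > 0 | y) > 0.975.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, n_sims = 50, 10_000

def declares_effect(theta_true):
    y = rng.normal(theta_true, 1.0, size=n)
    post_var = 1.0 / (1.0 + n)        # precision = prior precision (1) + n unit-variance obs
    post_mean = post_var * y.sum()    # conjugate normal posterior mean
    # posterior probability that theta > 0
    return norm.sf(0.0, loc=post_mean, scale=np.sqrt(post_var)) > 0.975

type1 = np.mean([declares_effect(0.0) for _ in range(n_sims)])  # simulate under theta = 0
power = np.mean([declares_effect(0.5) for _ in range(n_sims)])  # simulate under theta = 0.5
print(f"estimated type 1 error: {type1:.3f}, power: {power:.3f}")
```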
