Continuing with my discussion of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

Larry Wasserman, “Low Assumptions, High Dimensions”:

This article was refreshing to me because it was so different from anything I’ve seen before. Larry works in a statistics department and I work in a statistics department, but there’s so little overlap in what we do. Larry and I both work in high dimensions (maybe his dimensions are higher than mine, but a few thousand dimensions seems like a lot to me!), but there the similarity ends. His article is all about using few to no assumptions, while I use assumptions all the time. Here’s an example. Larry writes:

P. Laurie Davies (and his co-workers) have written several interesting papers where probability models, at least in the sense that we usually use them, are eliminated. Data are treated as deterministic. One then looks for adequate models rather than true models. His basic idea is that a distribution P is an adequate approximation for x1,…,xn, if typical data sets of size n, generated under P look like x1,…,xn. In other words, he asks whether we can approximate the deterministic data with a stochastic model.

This sounds cool. And it’s so different from my world! I do a lot of work with survey data, where the sample is intended to mimic the population, and a key step comes in the design, which is all about probability sampling. I agree that Wasserman’s (or Davies’s) approach *could* be applied to surveys (the key step would be to replace random sampling with quota sampling, and maybe this would be a good idea), but in the world of surveys we would typically think of quota sampling or other nonprobabilistic approaches as an unfortunate compromise with reality rather than as a desirable goal. In short, typical statisticians such as myself see probability modeling as a valuable tool that is central to applied statistics, while Wasserman appears to see probability as an example of an assumption to be avoided.

Just to be clear: I’m not *at all* saying Wasserman is wrong in any way here; rather, I’m just marveling at how different his perspective is from mine. I can’t immediately see how his assumption-free approach could possibly be used to estimate public opinion or votes cross-classified by demographics, income, and state. But, then again, maybe my models wouldn’t work so well on the applications on which Wasserman works. Bridges from both directions would probably be good.

With different methods and different problems come different philosophies. My use of generative modeling motivates, and allows, me to check fit to data using predictive simulation. Wasserman’s quite different approach motivates him to understand his methods using other tools.
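To make "check fit to data using predictive simulation" a bit more concrete, here is a minimal sketch. It is not taken from either article. It treats the observed data as fixed, simulates data sets of the same size from a candidate model, and asks how often a chosen summary of the simulated data is at least as large as the observed one. The function names, the normal model with plug-in mean and standard deviation, the choice of the maximum as the summary, and the data values are all assumptions made for illustration. A full posterior predictive check would also draw the parameters from their posterior rather than plugging in point estimates.

```python
import random
import statistics


def simulation_check(data, simulate, statistic, n_reps=1000, seed=0):
    """Fraction of simulated data sets whose statistic is >= the observed one.

    `simulate(rng, n)` must return a list of n draws from the candidate model.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistic(data)
    n = len(data)
    exceed = 0
    for _ in range(n_reps):
        replicated = simulate(rng, n)
        if statistic(replicated) >= observed:
            exceed += 1
    return exceed / n_reps


def make_normal_simulator(data):
    """Candidate model: normal with mean and sd estimated from the data (plug-in)."""
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)

    def simulate(rng, n):
        return [rng.gauss(mu, sigma) for _ in range(n)]

    return simulate


# Hypothetical data: nine unremarkable values and one extreme one.
data = [0.1, -0.4, 0.3, 0.0, -0.2, 0.5, -0.1, 0.2, -0.3, 8.0]

# How often does a normal model reproduce a maximum as large as the observed one?
p = simulation_check(data, make_normal_simulator(data), max)
print(f"share of simulated maxima >= observed maximum: {p:.3f}")
```

A share close to 0 or 1 suggests that the candidate model does not reproduce that feature of the data. Which summaries to look at is a judgment call that depends on the application.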

That Davies approach sounds very similar to your idea of posterior predictive checks. It clearly is very different from the probability sampling world, but it would seem to me that posterior predictive checks are as well. Not sure what I’m missing (but I haven’t read Wasserman’s paper, only yours, Mayo/Cox, and Senn so far).

When the quote says “Data are treated as deterministic” does it actually imply anything about how the data are collected, which seems to be your takeaway? It sounds more like what Mark says in his comment. Or perhaps a very machine-learning kind of perspective. I’m sure I’m missing the subtleties on both sides, though.

data is deterministic… low assumptions?