The journal Rationality, Markets and Morals has finally posted all the articles in their special issue on the philosophy of Bayesian statistics.
My contribution is called “Induction and Deduction in Bayesian Data Analysis.” I’ll also post my reactions to the other articles. I wrote these notes a few weeks ago and could post them all at once, but I think it will be easier if I post my reactions to each article separately.
To start with my best material, here’s my reaction to David Cox and Deborah Mayo, “A Statistical Scientist Meets a Philosopher of Science.” I recommend you read all the way through my long note below; there’s good stuff throughout:
1. Cox: “[Philosophy] forces us to say what it is that we really want to know when we analyze a situation statistically.”
This reminds me of a standard question that Don Rubin (who, unlike me, has little use for philosophy in his research) asks in virtually any situation: “What would you do if you had all the data?” For me, that “what would you do” question is one of the universal solvents of statistics.
2. Mayo defines scientific objectivity as concerning “the goal of using data to distinguish correct from incorrect claims about the world” and contrasts this with so-called objective Bayesian statistics. All I can say here is that the terms “subjective” and “objective” seem way overloaded at this point. To me, science is objective in that it aims for reproducible findings that exist independent of the observer, and it’s subjective in that the process of science involves many individual choices. And I think the statistics I do (mostly, but not always, using Bayesian methods) is both objective and subjective in that way.
3. Cox discusses Fisher’s rule that it’s ok to use prior information in the design of data collection but not in data analysis. Like a lot of hundred-year-old ideas, this rule makes sense in some contexts but not in others. Consider the notorious study in which a random sample of a few thousand people was analyzed, and it was found that the most beautiful parents were 8 percentage points more likely to have girls, compared to less attractive parents. The result was statistically significant (p<.05) and published in a reputable journal. But in this case we have good prior information suggesting that the difference in sex ratios in the population, comparing beautiful to less-beautiful parents, is less than 1 percentage point. A classical design analysis reveals that, with this level of true difference, any statistically significant observed difference in the sample is likely to be noise. (Even conditional on statistical significance, the observed difference has an over 40% chance of being in the wrong direction and will overestimate the population difference by an order of magnitude.) At this point, you might well say that the original analysis should never have been done at all---but, given that it has been done, it is essential to use prior information to interpret the data and generalize from sample to population.

Where did Fisher's principle go wrong here? The answer is simple---and I think Cox would agree with me here. We're in a setting where the prior information is much stronger than the data. If one's only goal is to summarize the data, then taking the difference of 8% (along with a confidence interval and even a p-value) is fine. But if you want to generalize to the population---which was indeed the goal of the researcher in this example---then it makes no sense to stop there. Cox illustrates the difficulty in a later quote: "[Bayesians'] conceptual theories are trying to do two entirely different things. One is trying to extract information from the data, while the other, personalistic theory, is trying to indicate what you should believe, with regard to information from the data and other, prior, information treated equally seriously. These are two very different things."

Yes, but Cox is missing something important! He defines two goals: (a) extracting information from the data, and (b) a "personalistic theory" of "what you should believe." I'm talking about something in between, which is inference for the population. I think Laplace would understand what I'm talking about here. The sample is (typically) of no interest in itself; it's just a means to learning about the population. But my inferences about the population aren't "personalistic"---at least, no more than the dudes at CERN are personalistic when they're trying to learn about particle theory from cyclotron experiments, and no more than the Census and the Bureau of Labor Statistics are personalistic when they're trying to learn about the U.S. economy from sample data.

4. Cox: "There are situations where it is very clear that whatever a scientist or statistician might do privately in looking at data, when they present their information to the public or government department or whatever, they should absolutely not use prior information, because the prior opinions on some of these prickly issues of public policy can often be highly contentious with different people with strong and very conflicting views."

Maybe. But I don't think even Cox believes this statement if it is taken literally. For example, right now I'm working on the politically controversial problem of reconstructing historical climate from tree rings. We have a lot of prior information on the processes under which tree rings grow and how they are measured. I don't think anyone would want to just take raw numbers from core samples as a climate estimate!
All the tools from Statistical Methods for Research Workers won't take you from tree rings to temperature estimates. You need some scientific knowledge and prior information on where these measurements came from.

So let me interpret what I think Cox was saying. I take him to be dividing any scientific inference into two parts, inside and outside. Priors are allowed in the inside work of scientific modeling, which uses lots of external information, from the basic assumption that the data correspond to your scientific goals, through the mathematical form of the transfer function, down to details such as an assumption of normally distributed measurement errors, which might be supported by prior experimental evidence. But Cox would prefer to avoid priors in the outside problem. In my example, I assume he'd allow prior information on the tree-ring measurement process---I don't see how you can get anywhere otherwise---but he'd rather not combine with external estimates of the temperature series.

That's a tenable position. It doesn't avoid all the controversy---manipulations of the data model can map in predictable ways to changes in the final inferences---but it could make sense. I've followed this approach in much of my own applied work, using noninformative priors and carefully avoiding the use of prior information in the final stages of a statistical analysis. But that can't always be the right choice. Sometimes (as in the sex-ratio example above), the data are just too weak---and a classical textbook data analysis can be misleading. Imagine a Venn diagram, where one circle is "Topics that are so controversial that we want to avoid using prior information in the statistical analysis" and the other circle is "Problems where the data are weak compared to prior information." If you're in the intersection of these circles, you have to make some tough choices!

More generally, there is a Bayesian solution to the problem of sensitivity to prior assumptions.
That solution is sensitivity analysis: perform several analyses using different reasonable priors. Make more explicit the mapping from prior and data to conclusions. Be open about sensitivity; don't try to sweep the problem under the rug. And, if you're going that route, I'd also like to see some analysis of sensitivity to assumptions that are not conventionally classified as "prior"---you know, those assumptions that get thrown in because they're what everybody does. For example, Cox regression is great, but additivity is a prior assumption too! (One might argue that assumptions such as additivity, logistic links, etc., are exempt from Fisher's strictures by virtue of being default assumptions rather than being based on prior information---but I certainly don't think Mayo would take that position, given her strong feelings about Bayesian default priors.)

My point here is that all statistical methods require choices---assumptions, if you will. Not all your choices can be determined or even validated from the data at hand. If you don't want your choices to be based on prior information, what other options do you have? You can rely on convention---using methods that appear in major textbooks and have stood the test of time---or maybe on theory. Both these meta-foundational approaches have their virtues, but neither is perfect: conventional methods are not necessarily good (as can be seen by noting that for many problems there are multiple conventional methods that give different results), and theory often doesn't help (for example, classical confidence intervals and hypothesis tests are insufficient in the simple sex-ratio problem noted above).
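To make the two calculations above concrete, here is a minimal sketch in Python. The specific numbers---a true difference of 0.2 percentage points, a standard error of 3.3 points, and the three priors---are hypothetical stand-ins, not the actual figures from the beauty-and-sex-ratio study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the sex-ratio study's numbers:
true_diff = 0.002   # assumed true difference in Pr(girl): 0.2 percentage points
se = 0.033          # assumed standard error of the estimated difference

# Design analysis: simulate many hypothetical replications of the study
# and look only at the ones that reach statistical significance.
est = rng.normal(true_diff, se, size=1_000_000)
sig = np.abs(est) > 1.96 * se

type_s = np.mean(est[sig] < 0)                        # wrong sign, given significance
exaggeration = np.mean(np.abs(est[sig])) / true_diff  # overestimation factor

print(f"P(wrong sign | significant) = {type_s:.2f}")
print(f"expected exaggeration factor = {exaggeration:.0f}x")

# Prior sensitivity: conjugate normal-normal posterior for the observed
# 8-percentage-point estimate under several priors on the true difference.
y = 0.08
for prior_sd in (0.003, 0.01, np.inf):  # informative, weak, flat
    prior_prec = 0.0 if np.isinf(prior_sd) else prior_sd ** -2
    post_var = 1.0 / (se ** -2 + prior_prec)
    post_mean = post_var * y / se ** 2
    print(f"prior sd {prior_sd}: posterior mean = {post_mean:.4f}")
```

Under these assumed numbers, a significant result has roughly a 40% chance of pointing the wrong way and overstates the true difference many times over, while the informative prior shrinks the 8-point estimate to a small fraction of a percentage point---which is the point of the example.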