The Royal Statistical Society (U.K.) has organized a discussion of a new paper, Frequentist accuracy of Bayesian estimates, by Brad Efron. The discussion will be an online event (a “webinar”) on 21 Oct 2015 (that’s right, “Back to the Future Day”) at ~~noon~~ 11am eastern time (4pm in the U.K.). Brad will present, I’ll ask some questions, and Peter Diggle will moderate.

Anyone can join in the online discussion; just follow the instructions here.

When you join the webinar, you sign in as a guest. Once you have the GlobalMeet screen, you'll see a “Q&A” tab near the top; click on that to send in your question, which will be discussed at the end.

Or you can ask your question directly via voice during the discussion period.

To get things started, I prepared 3 questions for Brad ahead of time. I was told that it would ruin the spontaneity if I were to post the exact questions here ahead of time, but I will tell you roughly what I’ll be asking about.

1. My first question addresses questions of sparsity and density, whether it be the sparse-parameter, dense-data scenario described so eloquently by Brad’s colleague Rob Tibshirani, or the dense-parameter, sparse-data scenario in which I often find myself. In recent years, Brad’s done a lot of work in genetics, an area which, according to Xihong Lin, features both sparsity and density in different ways.

2. In my second question, I ask what is the practical message of Brad’s paper as it relates to Bayesian methods.

3. My third question relates to what I consider to be the positive aspects of incoherence in statistical methods, a topic that I think should be of particular interest to Brad given the ways in which his work has linked strands from different areas in statistics.

Enjoy!

**P.S.** The webinar is 11am, not noon.

Andrew, it starts at 11 am Eastern time, not noon, right? Thanks!

Noon or 11AM EDT? (their website says 11am)

What's relevant for Bayesian analysis?

If the high-probability region of P(x|E) represents those possibilities for the true x which are compatible/consistent with E, then there’s absolutely no reason to think P(x|E) is equal to the frequency at which x is seen in repeated trials. It’s usually not even a meaningful question. Frequentists may believe their P(x|E) does represent the frequency of x in repeated trials, but there’s overwhelming evidence this is hardly ever true outside of their simulations, no matter how much they subjectively hope and dream it is. If frequencies in repeated trials are important, then you need to look at something like P(x_1,…,x_n|E), which requires a separate analysis [as is evidenced by the fact that infinitely many P(x_1,…,x_n|E) can have the same marginal distribution P(x_i|E)]. This can be used to estimate any function F(x_1,…,x_n), including the “frequency function”, with appropriate error bars/ranges.

That’s what needs to be compared to “frequentist” estimates if you’re so inclined. So I’m going to go ahead and say it has no relation to Bayesian stats. Then of course, there’s this:

http://www.bayesianphilosophy.com/test-for-anti-bayesian-fanaticism-part-ii/

Even simpler: Any Bayesian frequency estimate should come from P(x_1,…,x_n|E). Since there are infinitely many such joint distributions that yield a given P(x_1|E), it makes no sense to take P(x_1|E) and talk about “the” Bayesian frequency estimate.
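The point that a marginal does not pin down the joint can be made concrete with a two-coin toy example (my own illustration, not from the thread): the two joints below have identical Bernoulli(0.5) marginals, yet imply different distributions for the observed frequency (x_1 + x_2)/2.

```python
import numpy as np

p = 0.5  # common marginal P(x_i = 1 | E)

# Two joint distributions over (x1, x2) in {0,1}^2, as 2x2 probability
# tables indexed [x1][x2]; both have the same Bernoulli(0.5) marginals.
independent = np.array([[0.25, 0.25],
                        [0.25, 0.25]])  # x1 and x2 independent
comonotone  = np.array([[0.50, 0.00],
                        [0.00, 0.50]])  # x1 = x2 with probability 1

for joint in (independent, comonotone):
    assert np.allclose(joint.sum(axis=1), [1 - p, p])  # marginal of x1
    assert np.allclose(joint.sum(axis=0), [1 - p, p])  # marginal of x2

def freq_dist(joint):
    """Distribution of the frequency (x1 + x2) / 2 under the joint."""
    d = {}
    for x1 in (0, 1):
        for x2 in (0, 1):
            if joint[x1][x2] > 0:
                f = (x1 + x2) / 2
                d[f] = d.get(f, 0.0) + float(joint[x1][x2])
    return d

print(freq_dist(independent))  # {0.0: 0.25, 0.5: 0.5, 1.0: 0.25}
print(freq_dist(comonotone))   # {0.0: 0.5, 1.0: 0.5}
```

Same marginal P(x_i|E) either way, but the implied "Bayesian frequency estimate" differs, which is the commenter's point.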

The more I think about it, the less enamored I am of putting empirical Bayes and calibrated priors on a pedestal, as opposed to treating them as just an additional layer in a hierarchical model. I don’t see what it buys other than obfuscation via more nomenclature, plus sweeping top-level hyperpriors under the rug to claim ‘objectivity’.

On the webinar site it says that the start time is 11am eastern time, which is consistent with the 4pm UK time. Which is correct (11am or noon)?

Hi all. Yes, 11am, not noon.

For symmetry, we should do a “Bayesian properties of Frequentist answers”.

How well does a confidence interval represent the range of possibilities compatible with the evidence? Well, since some 95% CIs contain values all of which are provably impossible under the same assumptions used to derive the CI, they suck.
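That scenario isn't hypothetical: the classic instance is Jaynes's (1976) truncated-exponential example, where lifetimes satisfy x_i = θ + Exponential(1), so θ cannot exceed min(x), yet a shortest 90% confidence interval built from the sampling distribution of the unbiased estimator x̄ − 1 can lie entirely above min(x). A sketch, using Jaynes's data (the scipy-based interval construction below is my own; exact endpoints depend on the construction):

```python
import numpy as np
from scipy import stats, optimize

# Lifetimes x_i = theta + Expon(1), so theta <= min(x) with certainty.
# Data from Jaynes (1976):
x = np.array([12.0, 14.0, 16.0])
n = len(x)
xbar = x.mean()

# Sampling distribution of u = xbar - theta is Gamma(shape=n, scale=1/n).
g = stats.gamma(a=n, scale=1.0 / n)

# Shortest 90% interval [a, b] for u: choose the lower tail mass p_lo
# to minimize the interval's width while keeping 0.9 mass inside.
def width(p_lo):
    return g.ppf(p_lo + 0.9) - g.ppf(p_lo)

p_lo = optimize.minimize_scalar(width, bounds=(1e-9, 0.1 - 1e-9),
                                method="bounded").x
a, b = g.ppf(p_lo), g.ppf(p_lo + 0.9)

ci = (xbar - b, xbar - a)  # 90% confidence interval for theta
print(ci, "but theta must be <=", x.min())
```

Every value in the resulting interval exceeds min(x) = 12, so the 90% CI consists entirely of impossible values for θ, even though the procedure has exactly its advertised coverage.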

I’ll never get tired of these buzzfeed titles.

Dustin:

I’m glad to hear, because we have about 100 of them in the pipeline!

Might I suggest you up your ! game with some ¡!. Por ejemplo: “Click here to get your ¡FREE tix to my webinar!”

When you read that out loud, ¿doesn’t it sound more exciting? Plus then you could pronounce it “booze feed”.

Ja hoor (Dutch: “yes, sure”).

As far as the joys of incoherence: If you have doubts about a model and act accordingly, then this will appear to be “incoherent” with any Bayesian analysis that implicitly assumes Pr(model)=1.

If, however, you do a proper Bayesian treatment when Pr(model) is less than 1, it turns out that a whole host of things which everyone claims are violations of Bayesian coherence are in reality just more examples of the power of Bayesian coherence. It’s just coherence on a bigger space, one which includes “models”.

While in spirit I think you’re right, in terms of actual calculation, I don’t think we’re typically in a position to really put numerical probabilities on models. We wind up trying out a model, seeing if it agrees with various things we suspect, and moving on if it doesn’t fit very well. Sometimes it’s reasonable to actually have a small discrete set of models over which we have formal probabilities, like maybe if we know some data comes from one of several situations but the records of which situation have been lost (say concrete from several different processing plants, radar echos from several different possible airplanes, temperature measurements from several different types of remote undersea instruments… whatever). Discrete model choice typically isn’t very formal.

I don’t like to put numerical probabilities on models. I prefer continuous model expansion: building a larger model that includes the smaller models as special cases.

That said, in practice we’re often in the situation of having fit a few models and we need to combine them somehow in some simple and quick way. In this case I’m ok with some model averaging which can be taken as equivalent to putting probabilities on models. But I think I’d assign these probabilities based on cross-validated predictive accuracy, not on Bayes factors.
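One way to make this concrete is a toy "pseudo-BMA"-style weighting, where each model's weight comes from a softmax of its total cross-validated log predictive density (my own sketch of the general idea, not anything from Efron's paper; the fixed residual scale of 1 is an assumption of the toy example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)  # true relation is linear

def fit_predict_const(x_tr, y_tr, x_te):
    """Model 1: intercept only."""
    return np.full(len(x_te), y_tr.mean())

def fit_predict_linear(x_tr, y_tr, x_te):
    """Model 2: simple linear regression."""
    slope, intercept = np.polyfit(x_tr, y_tr, 1)
    return intercept + slope * x_te

models = [fit_predict_const, fit_predict_linear]
K = 10
folds = np.array_split(rng.permutation(n), K)

# Accumulate out-of-fold Gaussian log predictive density for each model,
# with a plug-in residual scale of 1 (a simplifying assumption here).
log_scores = np.zeros(len(models))
for k in range(K):
    te = folds[k]
    tr = np.concatenate([folds[j] for j in range(K) if j != k])
    for m, fp in enumerate(models):
        pred = fp(x[tr], y[tr], x[te])
        log_scores[m] += np.sum(
            -0.5 * np.log(2 * np.pi) - 0.5 * (y[te] - pred) ** 2)

# Model "probabilities" from a softmax of total cross-validated log scores
w = np.exp(log_scores - log_scores.max())
w /= w.sum()
print(dict(zip(["constant", "linear"], np.round(w, 3))))
```

Since the data really are linear in x, essentially all the weight lands on the linear model; the point is that the weights come from out-of-sample predictive accuracy rather than from Bayes factors.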

Andrew:

Should all models be accommodated as special cases of larger models? Isn’t it better for some to be just thrown away?

I couldn’t join the webinar online (the GlobalMeet app just displays “your meeting will start soon”). But on the webpage there was also a list of regular telephone numbers, so I dialed the one in Stockholm (I’m in Sweden) and it worked! I’m now listening to Brad Efron on my phone! This is truly the future! :)