Ryan King writes:

I was wondering if you have a brief comment on the state of the art for objective priors for hierarchical generalized linear models (generalized linear mixed models). I have been working off the papers in Bayesian Analysis (2006) 1, Number 3 (Browne and Draper, Kass and Natarajan, Gelman). There seems to have been continuous work for matching priors in linear mixed models, but GLMMs less so because of the lack of an analytic marginal likelihood for the variance components. There are a number of additional suggestions in the literature since 2006, but little robust practical guidance.

I’m interested in both mean parameters and the variance components. I’m almost always concerned with logistic random effect models. I’m fascinated by the matching-priors idea of higher-order asymptotic improvements to maximum likelihood, and need to make some kind of defensible default recommendation. Given the massive scale of the datasets (genetics …), extensive sensitivity analysis won’t really be an option.

My reply:

First, I don’t think it’s helpful to speak of “objective priors.” As a scientist, I try to be objective as much as possible, but I think the objectivity comes in the principle, not the prior itself. A prior distribution (indeed, any statistical model) reflects information, and the appropriate objective procedure will depend on what information you have.

That said, I do like the idea of weakly informative priors and I also respect the need for default procedures.

When it comes to non-varying parameters, I’m currently happy with my weakly informative t priors (after appropriately rescaling the predictors), as discussed in my 2008 paper with Jakulin et al.

For variance parameters estimated using full Bayes, I like the half-t family from my 2006 paper you noted above, or else something hierarchical-hierarchical if you have multiple variance parameters. For point estimation, my preferred default is the gamma(2, 1/A) prior for hierarchical scale parameters, as it keeps the estimate away from 0. Here A is set to some large value (a bit larger than you expect the scale parameter to reasonably be, as is the usual practice for weakly informative priors). This is discussed in detail in my paper with Yeojin, Sophia, Jingchen, and Vince, and it’s implemented in blmer/bglmer.
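To see why a gamma prior with shape 2 keeps a point estimate off the boundary, here is a rough sketch on a toy normal hierarchical model (not a GLMM, and not how blmer/bglmer works internally). The data, the standard errors, the grid, and the value of A are all made up for illustration, and I am reading "gamma(2, 1/A)" as shape 2 and rate 1/A. The log prior is log(tau) - tau/A up to a constant, which goes to minus infinity at tau = 0, so the posterior mode cannot sit at zero even when the likelihood peaks there.

```python
import math

# Hypothetical group estimates and standard errors (invented for illustration).
y = [1.0, -1.0, 0.5, -0.5]
se = [2.0, 2.0, 2.0, 2.0]
A = 10.0  # assumed loose upper scale for the group-level sd tau


def profile_loglik(tau):
    """Marginal log-likelihood of tau with mu profiled out (up to a constant)."""
    var = [s * s + tau * tau for s in se]
    w = [1.0 / v for v in var]
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return sum(-0.5 * math.log(v) - 0.5 * (yi - mu_hat) ** 2 / v
               for v, yi in zip(var, y))


def log_gamma_prior(tau):
    """Log density of gamma(shape=2, rate=1/A), up to a constant."""
    if tau == 0.0:
        return float("-inf")  # the density is exactly zero at tau = 0
    return math.log(tau) - tau / A


# Crude grid search over tau in [0, 10]; a real fit would use an optimizer.
grid = [i / 100 for i in range(0, 1001)]
mle_tau = max(grid, key=profile_loglik)
map_tau = max(grid, key=lambda t: profile_loglik(t) + log_gamma_prior(t))

print("maximum likelihood estimate of tau:", mle_tau)
print("posterior mode of tau under the gamma prior:", map_tau)
```

In this toy setup the observed spread of the group estimates is smaller than their standard errors would predict, so the likelihood alone puts tau at the boundary, while the penalized estimate is pulled to a positive value.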

I have a problem I wonder if you can help me with. I’ve been thinking about the “two boys” problem (two children, one’s a boy, the probability they’re both boys is 1/3) and the “boy born on Tuesday” problem (two children, one’s a boy born on a Tuesday, the probability they’re both boys is 13/27, much closer to 1/2 than in the first situation). (We are of course in an alternative universe where each child is a boy or a girl with probability exactly 1/2 and is equally likely to be born on any of the 7 days of the week, which is not true in real life.)

My problem is this: Your co-worker tells you she has two children, one a boy born on a day I will not name, but which she revealed to you. Is the probability that she has two sons 1/3 or 13/27?

If the latter, how was the information that the boy was born on a day of the week relevant? No child is born on no days, or on two, so we already knew he was born on one of seven days! If the former, which I think, could you confirm my reasoning is sound?

I think for “named day” it’s (7+7-1)/(14+14-1)=13/27. For unnamed day, it’s (7+7-7)/(14+14-7)=7/21=1/3, because if I don’t tell the puzzle-solver what the day is, he has to solve the puzzle seven times, even though I said the co-worker had “told” the day.
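The counting above can be checked by brute force. The sketch below enumerates all 14 x 14 = 196 equally likely (gender, day) pairs for two children, under the idealized assumptions of the puzzle, and conditions on each statement as a plain fact about the family. The function name and the choice of day index 2 for Tuesday are mine.

```python
from fractions import Fraction
from itertools import product

GENDERS = ("boy", "girl")
DAYS = range(7)
TUESDAY = 2  # arbitrary index; any single day gives the same answer

# Every (gender, day) combination for one child, then every ordered pair.
children = list(product(GENDERS, DAYS))
families = list(product(children, repeat=2))  # 196 equally likely families


def prob_two_boys(condition):
    """P(both boys | condition), counting equally likely families."""
    kept = [f for f in families if condition(f)]
    both = [f for f in kept if all(g == "boy" for g, _ in f)]
    return Fraction(len(both), len(kept))


at_least_one_boy = prob_two_boys(lambda f: any(g == "boy" for g, _ in f))
boy_on_tuesday = prob_two_boys(lambda f: any(c == ("boy", TUESDAY) for c in f))
# "A boy born on some day I won't name" is the same event as "at least one boy".
boy_on_some_day = prob_two_boys(
    lambda f: any(g == "boy" and d in DAYS for g, d in f))

print(at_least_one_boy, boy_on_tuesday, boy_on_some_day)
```

Under this reading, the unnamed-day condition removes no families beyond those with no boys at all, which is why it matches the plain "two boys" answer.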

An arguably sensible way to define information is the ratio

P(2 boys)/P(2 boys given at least one “boy on Tues”)

Given P(gender1, day1, gender2, day2), the above is available deductively (that is, it is mathematically obtainable without error).

In general any attempts at explanation – giving a _why_ for the ratio P(a)/P(a|b) – will almost surely be misleading for some P(a,b) – so why seek it or share it?

An example from Charlie Geyer’s web site on the Monty Hall problem might help make this clearer. At http://www.stat.umn.edu/~charlie/goats.html he gives this explanation: “You are right now if and only if you were right originally. Hence the probability of winning if you stick with your original choice is 1/3 — no case splitting and no calculation necessary. Hence the probability of winning if you switch is 2/3 because probabilities add to one, and there is no other option”. Now this is wrong for some models of the problem – in fact any model where

P(original choice wins)/P(original choice wins given door opened) is not a constant.

So, why do we expend effort to come up with explanations that are error-prone when error-free deductive methods are feasible?
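As one concrete instance of a model where the shortcut explanation fails, here is a small sketch of a Monty Hall variant with a biased host. The setup is my own assumption, not taken from the linked page: the player picks door 1, the host never reveals the car, and when the car is behind door 1 the host opens door 2 with probability q and door 3 otherwise.

```python
from fractions import Fraction


def stick_wins_given_opened(q):
    """P(car behind door 1 | host opens door k) for k = 2, 3.

    Assumes the player picked door 1 and the car is equally likely behind
    each door. q is the (hypothetical) chance the host opens door 2 when
    he is free to open either door 2 or door 3.
    """
    third = Fraction(1, 3)
    # Joint probabilities of (car location, door opened).
    joint = {
        (1, 2): third * q,
        (1, 3): third * (1 - q),
        (2, 3): third,  # host is forced to open door 3
        (3, 2): third,  # host is forced to open door 2
    }
    result = {}
    for opened in (2, 3):
        total = sum(p for (car, o), p in joint.items() if o == opened)
        result[opened] = joint[(1, opened)] / total
    return result


unbiased = stick_wins_given_opened(Fraction(1, 2))
always_door_2 = stick_wins_given_opened(Fraction(1))

print("unbiased host:", unbiased)
print("host always prefers door 2:", always_door_2)
```

With an unbiased host the conditional probability is 1/3 whichever door is opened, so the shortcut happens to work. With a host who always prefers door 2, sticking wins with probability 1/2 after door 2 is opened and never after door 3 is opened, even though the unconditional probability is still 1/3.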


Derek, the reason you have a problem with the 13/27 answer is that the correct answer to both the “two boys” and “boy born on Tuesday” problems, as you phrased them, is 1/2.

Don’t get me wrong – the *proportion* of two-boy families, within families that have at least one boy, is 1/3. And the *proportion* of two-boy families, within families that have at least one boy born on a Tuesday, is 13/27. But proportion and probability aren’t always the same thing. This was first pointed out in 1889 by the French mathematician Joseph Bertrand, in his famous “Box Paradox,” which is a simpler variation of the paradox you contrived about your co-worker. I’ll paraphrase his example as the “two boy” puzzle, to show you why the answer is 1/2.

Suppose I tell you that I have two children. What is the probability that they share the same gender? That’s easy – it’s 1/2. But suppose I tell you a gender that applies to one, or both, of my children. You might be tempted to say the probability that they share the same gender has now changed to 1/3. But you’d have to say the same thing no matter what gender I told you. And if that is correct, it means that the probability has to be 1/3 even if I don’t tell you anything. Yet we know that probability is 1/2.

The resolution of this paradox is that the condition in the conditional probability is not “One is a boy,” it is “I choose to tell you that one is a boy.” Since it is possible that I could choose to tell you “one is a girl” when I also have a boy, the probability that the condition is met is not the same as the proportion where the fact is true. If we assume – and it is all we can reasonably assume – that I will choose a fact (gender, or gender+day) at random from the set of facts available, then you have to divide the cases where the facts are different for my two children in half. The answer to the “two boy” problem is not 1/(1+2)=1/3, it is 1/(1+2/2)=1/2. And the answer to the “boy born on Tuesday” problem is not (1+12)/(1+26)=13/27, it is (1+12/2)/(1+26/2)=7/14=1/2.
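The halving argument can also be checked by enumeration. The sketch below assumes one specific reporting rule, namely that the parent picks one of the two children with probability 1/2 and states that child's fact; when both children share the fact this is the same as choosing at random from the facts available. Function and variable names are mine.

```python
from fractions import Fraction
from itertools import product

GENDERS = ("boy", "girl")
DAYS = range(7)
TUESDAY = 2  # arbitrary index for the named day

children = list(product(GENDERS, DAYS))
families = list(product(children, repeat=2))  # 196 equally likely families


def prob_two_boys_given_statement(describe, statement):
    """P(both boys | parent makes `statement`) under random child selection.

    `describe` maps a child to the fact the parent would report about it.
    Each (family, chosen child) pair carries equal weight 1/392.
    """
    hits = 0
    total = 0
    for family in families:
        for child in family:  # parent describes either child, chance 1/2 each
            if describe(child) == statement:
                total += 1
                if all(g == "boy" for g, _ in family):
                    hits += 1
    return Fraction(hits, total)


gender_only = prob_two_boys_given_statement(lambda c: c[0], "boy")
gender_and_day = prob_two_boys_given_statement(lambda c: c, ("boy", TUESDAY))

print(gender_only, gender_and_day)
```

Both values come out to 1/2 under this rule, in line with the (1+12/2)/(1+26/2) arithmetic. A different rule for how the parent decides what to say would give a different answer, which is the point of the argument.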