In response to this post the other day on prior distributions for climate sensitivity, Nicholas Lewis wrote in:
Your post refers to comments I made at ATTP’s blog about the use of Jeffreys’ prior in estimating climate sensitivity. I would like to explain why, in some but not all cases, the Jeffreys’ prior for estimating climate sensitivity peaks at zero, a physically implausible sensitivity.
Unfortunately, observational uncertainty is high, and the choice of prior has a substantial effect on the posterior density when estimating climate sensitivity. The value of perfect information about a less uncertain parameter that is primarily determined by climate sensitivity was estimated in a recent paper as $10 trillion, so prior selection for this problem is quite an important issue.
Roughly speaking, when estimating climate sensitivity (ECS) from observational data over the industrial period, the reciprocal of climate sensitivity, the climate feedback parameter, is a location parameter with approximately normal (or t distributed) uncertainty. If one does not want to incorporate prior information about the parameter being estimated, then a uniform prior seems to me the obvious choice when estimating a location parameter, here the climate feedback parameter. Even if one knows that the climate feedback parameter can’t be infinite (corresponding to ECS = 0), using a uniform prior does not bias estimation. On changing variable back to ECS, that prior transforms to c/ECS^2 (c being a constant), which peaks at zero. But the cutoff from the declining likelihood function is very sharp at low ECS, and makes the posterior density negligible at ECS values well above zero.
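To make the change of variable concrete, here is a minimal numerical sketch in Python. It is not taken from Lewis's studies: the forcing value F2X, the feedback estimate LAM_HAT, and its standard error LAM_SD are hypothetical numbers chosen only for illustration, and the relation lambda = F2X / ECS is the usual convention, assumed here so that the constant c equals F2X. The sketch expresses a uniform prior on the feedback parameter as a prior on ECS, checks that the resulting posterior carries the same probability mass as the normal posterior in the feedback parameter, and shows how small the posterior is at low ECS despite the prior peaking at zero.

```python
import math

# All numbers below are illustrative assumptions, not values from any study.
F2X = 3.7      # assumed forcing from doubled CO2, W/m^2
LAM_HAT = 2.0  # hypothetical estimate of the feedback parameter, W/m^2/K
LAM_SD = 0.5   # hypothetical standard error of that estimate


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sd):
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))


def likelihood(ecs):
    """Likelihood as a function of ECS, via lambda = F2X / ECS."""
    return normal_pdf(F2X / ecs, LAM_HAT, LAM_SD)


def prior_ecs(ecs):
    """Uniform prior on lambda re-expressed in ECS: |d lambda / d ECS| = F2X / ECS^2."""
    return F2X / ecs ** 2


def posterior_ecs(ecs):
    """Posterior density in ECS implied by a uniform prior on lambda."""
    return likelihood(ecs) * prior_ecs(ecs)


def trapezoid(f, a, b, n=4000):
    """Plain trapezoid rule on n equal steps."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Mass between ECS = 1 and ECS = 5, computed two ways.
mass_in_ecs = trapezoid(posterior_ecs, 1.0, 5.0)
mass_in_lambda = (normal_cdf(F2X / 1.0, LAM_HAT, LAM_SD)
                  - normal_cdf(F2X / 5.0, LAM_HAT, LAM_SD))

# Posterior at a low ECS relative to its value where lambda = LAM_HAT.
low_ecs_ratio = posterior_ecs(0.5) / posterior_ecs(F2X / LAM_HAT)

print(mass_in_ecs, mass_in_lambda, low_ecs_ratio)
```

With these assumed inputs the two mass calculations should agree to within the integration error, and the ratio at ECS = 0.5 should be vanishingly small, which is the sense in which the likelihood cutoff overwhelms the peak of the prior at zero.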
Although I use computed noninformative (or minimally informative) priors in my published Bayesian climate sensitivity studies, I think I am the only climate scientist to do so. Many published climate sensitivity studies take a subjective Bayesian approach and either use a wide uniform prior for ECS (which hugely fattens the upper tail of the posterior PDF, as well as shifting the central estimate, relative to use of Jeffreys’ prior), or an “expert” prior that typically dominates the likelihood function. Some studies do, however, use non-Bayesian methods, and so avoid use of subjectively chosen priors.
No doubt, as a Bayesian, you don’t like likelihood ratio and profile likelihood methods of parameter estimation. But I will nevertheless point out that when I use such methods, they give results very close to those obtained by use of Jeffreys’ prior. So also does use of a reference prior.
I replied:
In any case, I think this is a challenging problem because of the decision aspect. As you note, there’s lots of attention drawn to the center of the prior distribution, but the tails are crucial when considering expected costs. This came out in some of the discussion in comments, for example here.
Another point that came up, but which I did not emphasize in my post, is that the Jeffreys prior depends on the likelihood function, thus indirectly on various assumptions. I’m not in general opposed to noninformative priors, I just think they should be taken for what they are. It always makes sense to understand the mapping from assumptions to conclusions.
Lewis then added:
The upper tail of the estimated posterior density for climate sensitivity is indeed critical for estimates of expected damages. Unfortunately, the likelihood typically only declines gently as ECS goes to high values, as they correspond to the climate feedback parameter approaching zero more and more slowly. Hence, in my view at least, the importance here of using a prior that declines at least approximately as 1/ECS^2, to reflect the data-parameter relationship, and not a uniform prior. It is a pity that the standard parameterization is in terms of ECS rather than climate feedback parameter; if it had been I think this issue would never have arisen.
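The effect on the upper tail can be illustrated with a short Python sketch. Again, this is only a sketch under stated assumptions: the likelihood is a hypothetical normal in the feedback parameter (F2X, LAM_HAT, and LAM_SD are made-up values), the ECS range from ECS_MIN to ECS_MAX is an arbitrary truncation standing in for a "wide" uniform prior, and the threshold of 4.5 is chosen only as an example. It compares the posterior probability of a high ECS under a uniform prior in ECS with that under a prior proportional to 1/ECS^2.

```python
import math

# All numbers below are illustrative assumptions, not values from any study.
F2X = 3.7       # assumed forcing from doubled CO2, W/m^2
LAM_HAT = 2.0   # hypothetical estimate of the feedback parameter, W/m^2/K
LAM_SD = 0.5    # hypothetical standard error of that estimate
ECS_MIN = 0.05  # hypothetical lower integration limit, K
ECS_MAX = 10.0  # hypothetical upper bound of the wide uniform prior, K


def likelihood(ecs):
    """Normal likelihood in lambda = F2X / ECS, viewed as a function of ECS."""
    z = (F2X / ecs - LAM_HAT) / LAM_SD
    return math.exp(-0.5 * z * z)


def trapezoid(f, a, b, n=20000):
    """Plain trapezoid rule on n equal steps."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def tail_probability(prior, threshold=4.5):
    """Posterior P(ECS > threshold) on [ECS_MIN, ECS_MAX] for a given prior."""
    def unnormalized_posterior(ecs):
        return likelihood(ecs) * prior(ecs)

    tail = trapezoid(unnormalized_posterior, threshold, ECS_MAX)
    total = trapezoid(unnormalized_posterior, ECS_MIN, ECS_MAX)
    return tail / total


tail_uniform = tail_probability(lambda ecs: 1.0)
tail_inverse_square = tail_probability(lambda ecs: 1.0 / ecs ** 2)

print("P(ECS > 4.5), uniform prior in ECS:", tail_uniform)
print("P(ECS > 4.5), 1/ECS^2 prior:      ", tail_inverse_square)
```

Because the likelihood flattens out as ECS grows, the uniform prior leaves substantially more posterior mass above the threshold than the 1/ECS^2 prior does; how much more depends on the assumed inputs and on where the uniform prior is truncated.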
I accept that Jeffreys’ prior depends on the likelihood function, and hence on various assumptions, and I agree that one needs to understand how those assumptions affect the conclusions. In observationally-based climate sensitivity estimation, the likelihood is usually normal or t-distributed in the observations, or in some simple transformation thereof. I usually carry out the Bayesian inference in that parameterization, where the choice of prior appears more straightforward, and obtain the Jeffreys’ prior for the climate system parameters being estimated by carrying out a change of variable. (Some climate sensitivity studies by other scientists use the same method, but without saying that they are using a Bayesian approach.) The assumptions about the likelihood function for the observables and those about the relationship between the observables and the climate system parameters (model accuracy) can then be examined separately, and some sensitivity testing carried out.
Also, nuisance parameters don’t seem to be a particular problem in the studies I have carried out, I think partly because (unlike some other Bayesian studies) they estimate no more than two other uncertain parameters in addition to climate sensitivity and partly because plug-in estimates of observational etc. uncertainty taken from other sources are used, as is usual in such studies.
Lots to think about here. 20 or 30 years ago we would’ve agonized over what’s the appropriate noninformative prior, but now we’re all more comfortable talking about prior distributions as encoding useful information, and priors regularizing by downweighting scientifically implausible regions of parameter space.