I’m curious what Andrew recommends here. I don’t recall seeing hierarchical variance structure in any of the books.

]]><src img="..." width = "50%" />

Anyone can link to externally hosted images this way, too.

]]>> I’m afraid I got images

Anything wrong with doing that? ]]>

For some applications I’ve used a normal distribution for the log variance, and from there modeled the hierarchy much as one would for the mean structure.

]]>I reckon, when using uniform or truncated normal, Stan will still transform the parameter under the hood. Not with tanh though, if I’m reading the manual correctly. In hand-rolled algorithms, without using any transformations at any point, I think using uniform or truncated distribution would save one from adjusting the prior density, which might or might not be practical.

]]>atanh_y < - rnorm(10000) y <- tanh(atanh_y)

With ggplot,

ggplot(data = data.frame(y = y)) + geom_histogram(aes(y), color="lightgray", bins=50) ggsave(filename = "std-normal.jpg", width=4, height=3)

It gives you a U-shaped distribution, which is probably not what you want as a correlation prior.

A simple uniform(-1, 1) would be close, but uniform. We tend to prefer distributions concentrated around 0 to regularize correlation estimates. You can do that by cutting down on the scale of the normal, say normal(0, 0.5) instead of normal(0, 1).

With Stan, you could just truncate a normal(0, whatever) to [-1, 1] rather than transforming using tanh(). That would still allow the boundaries to have non-zero density.

Using uniform(-1, 1) is what you get in 2 dimensions setting the LKJ parameter to 1. As dimensionality increases, LKJ(1) is still uniform on correlation matrices, but each marginal correlation becomes more peaked around 0 because of the positive-definiteness constraint.

]]>atanh(rho) ~ normal

This was my first idea, but it is excluded from the wiki and there doesn’t seem to be much about it on the internet (or at least I couldn’t find much in a couple of tired minutes).

Here’re some R plots, they’re always fun, aren’t they:

prior_density = function(y, par){

dnorm(atanh(y), par[1], par[2]) * abs(1 / (1 – y^2))

}

pars = matrix(c(0.0, 1.0,

0.3, 0.8,

0.0, 0.3,

-0.8, 0.8), ncol = 2, byrow = T)

y = seq(-1, 1, length.out = 400)

par(mfrow = c(2, 2))

for(i in 1:nrow(pars)){

y_density = prior_density(y, pars[i,])

plot(y, y_density, type = “l”, ylab = “Density”, xlab = expression(rho),

ylim = c(0, 2), col = “red”, main = paste(“mu =”, pars[i,1], “sigma =”, pars[i,2]))

hist(tanh(rnorm(50000, pars[i,1], pars[i,2])), prob = T, add =T)

}