Ah, for ANOVA coding, yes, I agree; I thought you were saying that contrast coding in general must use orthogonal contrasts.

Why do they have to be orthogonal? I’ve never really worked out the math, but my guess is that it has something to do with the way ANOVA works: you are basically seeking the solution to a quadratic minimization problem. You turn that into a matrix-algebra problem by taking derivatives and setting them equal to zero. Now you don’t have a solvable problem, so you add some auxiliary assumptions, namely that the coefficients sum to zero at each level. Now you can change basis from the coefficients of each predictor to the coefficients of the contrast vectors. This change of basis must be to a new, linearly independent, complete basis. Technically it doesn’t have to be orthogonal, but if it isn’t, then the values of the coefficients are linearly related to each other. The standard errors become non-independent, and interpretation becomes difficult because we don’t have a nice posterior distribution to show how the uncertainty in the coefficients is interdependent.
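A minimal numerical sketch of the standard-error point (my own illustration, in Python; the Helmert and dummy codings for three levels are just example choices): in a balanced design, orthogonal contrast columns make X'X diagonal, so the OLS coefficient covariance, which is proportional to (X'X)^-1, is diagonal too; non-orthogonal (dummy) contrasts leave the estimates correlated.

```python
import numpy as np

n_per_cell = 10                      # balanced design: equal n in each of 3 levels
g = np.repeat([0, 1, 2], n_per_cell)

# Orthogonal (Helmert-style) contrasts for 3 levels: columns sum to zero
# and are mutually orthogonal.
helmert = np.array([[-1, -1],
                    [ 1, -1],
                    [ 0,  2]])
# Non-orthogonal ("treatment"/dummy) contrasts.
dummy = np.array([[0, 0],
                  [1, 0],
                  [0, 1]])

def coef_cov(C):
    # Intercept plus contrast columns; (X'X)^-1 is proportional to cov(beta-hat).
    X = np.column_stack([np.ones_like(g), C[g]])
    return np.linalg.inv(X.T @ X)

def off_diag(M):
    return M[np.triu_indices_from(M, k=1)]

print(np.allclose(off_diag(coef_cov(helmert)), 0))  # True: uncorrelated estimates
print(np.allclose(off_diag(coef_cov(dummy)), 0))    # False: correlated estimates
```

With the Helmert coding all off-diagonal entries of (X'X)^-1 vanish, so the coefficient standard errors can be read off independently; with dummy coding they cannot.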

That’s all a guess. But what I do know is that if you sample from the posterior of a proper Bayesian model, you can automatically answer any question you want about interdependencies between the variables. The orthogonal linear-algebra machinery basically insists on geometrically enforced independence.

Sure, I agree that we should focus on fitting models directly in Stan. (In fact when people ask me why I don’t teach ANOVA I say, of course I do, look at this multiple regression model that I teach.) Why do the contrasts *have* to be orthogonal to each other?

Great suggestion Christopher — thanks! — and of course weights can be arbitrary (e.g., depending on covariates via a suitable transform, or following any strictly positive distribution) without changing the family of the prior for scale-invariant families. This is the same strategy used in nlme, gamlss, etc., for modelling heteroskedasticity, sans the prior…

I think it’s difficult even for the mathematically sophisticated to properly code the appropriate contrasts, and then, once you have, to get the right answer unless the design has exactly the same number of cases in each batch (i.e., the contrasts break down in an unbalanced design). Let’s just give up on this formalism, which is in my opinion a special case of the multilevel model, and move on to coding the model directly in Stan and fitting it with moderately informative priors ;-)

Also, the contrasts need to be orthogonal to each other… but many of the questions we want to ask are not geometrically orthogonal.

We’ve written a tutorial paper on this topic; it may be of interest to readers of this blog: https://arxiv.org/abs/1807.10451. Comments are welcome.

I’m with Jeff. I really like the graphical ANOVA aesthetically, and as a way to summarize multi-level models. It is a communication challenge to invoke the whole framework of ANOVA, but then say “don’t interpret this like the ANOVA you were taught”. I would be curious to see more examples of its use in publications.

Neat idea! I’ll try this out right away.

One approach I’ve played with is based on the diagonal component of rstanarm’s decov() prior. If you have K groups, you can put a half-normal or half-t prior on the average standard deviation (sigma_bar) and use a simplex (phi) with a symmetric Dirichlet distribution to describe how evenly the variance is distributed among the groups.

sigma_bar ~ normal(0, s); // s is a hyperparameter; sigma_bar > 0

phi ~ dirichlet(rep_vector(a, K)); // a is a hyperparameter; phi is a K-simplex

sigma[i] = sigma_bar * sqrt(K * phi[i]); // so that mean(sigma[1:K]^2) == sigma_bar^2
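A quick sanity check of that last identity, sketched in Python rather than Stan (the values of K, a, and sigma_bar are mine, chosen arbitrarily): since phi is a simplex summing to 1, mean(sigma^2) = sigma_bar^2 * K * mean(phi) = sigma_bar^2 exactly, for any Dirichlet draw.

```python
import numpy as np

rng = np.random.default_rng(0)
K, a, sigma_bar = 5, 2.0, 1.3           # example values, not from the comment
phi = rng.dirichlet(np.full(K, a))      # symmetric Dirichlet draw; sums to 1
sigma = sigma_bar * np.sqrt(K * phi)    # per-group standard deviations

# The average squared scale recovers sigma_bar^2 exactly, whatever phi is.
print(np.isclose(np.mean(sigma**2), sigma_bar**2))  # True
```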

It has worked well enough the couple of times I’ve used it.

I revisit your ANOVA paper every year, and I think it’s time for a refresher. I have a general repulsion to ANOVA tables (I don’t find them that informative), but I think that’s because most papers I read use them as summary tables of null-hypothesis tests (and admittedly, this is the way I read them, because this is how it was taught to me).

Jeff:

The regression and Anova *models* are special cases of generalized linear models. But Anova is not just a statistical model; it’s also a way of structuring and displaying the model, batching coefficients and comparing their variances.

You’re raising a different important point, which is that statisticians typically focus on the outcome variable (continuous, binary, count, zero-inflated count, etc.), whereas practitioners often focus on the predictors (discrete, continuous, etc.). So I agree there can be communication difficulties. Perhaps we can clarify this in Regression and Other Stories.

We often see variability among subjects/classes/groups in the variability of their responses.

My usual approach is to hierarchically model the variance components as lognormal, but that doesn’t retain the benefits of half-normal priors on the variances. Is there something like a multi-level half-normal model?
