David Afshartous writes,

Verbeke & Molenberghs (2000; Linear Mixed Models for Longitudinal Data; Section 23.2, p.392) discuss an analytic procedure for power calculations. Specifically, when testing a general linear hypothesis, the distribution of the test statistic under the alternative hypothesis can be approximated by a noncentral F distribution (supported by the simulation results of Helms 1992). Power is then the probability that this noncentral F variable exceeds the appropriate quantile of the null F distribution (as opposed to a null chi-squared distribution, which doesn’t account for the variability introduced in estimating the variance components).
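
For readers who want the calculation spelled out, here is a sketch of the approximation just described. The notation is an assumption made for illustration and is not copied from the book: the hypothesis is written as H0: Lβ = 0, r is the rank of L, ν is the (estimated) denominator degrees of freedom, α is the test level, and X_i and V_i are the design matrix and marginal covariance matrix for subject i. The noncentrality parameter δ is given in the usual Wald-type form.

```latex
\text{power} \approx \Pr\!\left( F_{r,\,\nu,\,\delta} > F_{r,\,\nu;\,1-\alpha} \right),
\qquad
\delta = (L\beta)^{\top}
\left[ L \left( \sum_i X_i^{\top} V_i^{-1} X_i \right)^{-1} L^{\top} \right]^{-1}
(L\beta)
```

Here F_{r,ν,δ} is a noncentral F variable and F_{r,ν;1-α} is the critical value from the central (null) F distribution. The answer depends on ν, which is where the choice of method for estimating the denominator degrees of freedom comes in.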

What is your opinion of this approach versus your suggested simulation approach (p.437 of recent book)? Perhaps the analytic method above is not as appealing due to the dependence of the results on the method to estimate the denominator degrees of freedom in the null F distribution (albeit lower dependence for larger sample sizes)?

My reply: I have mixed feelings about power calculations in general. The topic of statistical power is hugely important (see here for my recent thoughts on the perils of underpowered studies), but I have real problems with the standard “NIH-style” power calculations. Briefly, the problem is that the calculations are set up as if there is a certain power goal, and you get the data necessary to reach it, but realistically it’s often the other way around, with a sample size set from practical considerations and then a power calculation set up to get the answer you need. We have a whole chapter on power calculations in our book but I don’t really know if I like the idea at all.

In answer to the specific question: I hate thinking about these F distributions–the last time I thought hard about them was for my 1992 paper with Rubin–so I prefer simulation despite its occasional awkwardness.
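
To make the simulation idea concrete, here is a minimal sketch in Python. It is not the code from the book. The design (treatment assigned at the cluster level, equal cluster sizes), the function name `simulate_power`, and every parameter name and default value are hypothetical. To stay within the standard library, it analyzes cluster means rather than fitting a multilevel model, which is a stand-in that is only reasonable for a balanced design like this one. Power is estimated as the fraction of fake datasets in which the estimate is more than 2 standard errors from zero.

```python
import random
import statistics


def simulate_power(effect, n_clusters=20, n_per_cluster=10,
                   sigma_cluster=0.5, sigma_y=1.0, n_sims=1000, seed=1):
    """Estimate power by fake-data simulation (illustrative sketch only).

    The first half of the clusters is treated, the rest are controls.
    Returns the fraction of simulations with |estimate| > 2 * se.
    """
    rng = random.Random(seed)
    half = n_clusters // 2
    n_significant = 0
    for _ in range(n_sims):
        cluster_means = []
        for j in range(n_clusters):
            # Cluster-level intercept plus the treatment effect, if treated.
            mu = rng.gauss(0.0, sigma_cluster) + (effect if j < half else 0.0)
            ys = [rng.gauss(mu, sigma_y) for _ in range(n_per_cluster)]
            cluster_means.append(statistics.fmean(ys))
        treated, control = cluster_means[:half], cluster_means[half:]
        # Difference in average cluster means, with its standard error.
        est = statistics.fmean(treated) - statistics.fmean(control)
        se = (statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control)) ** 0.5
        if abs(est) > 2 * se:
            n_significant += 1
    return n_significant / n_sims
```

Calling `simulate_power(0.5)` and then varying `n_clusters` or `n_per_cluster` shows how the estimated power moves with the design. For a real analysis you would replace the cluster-means step with a fit of the model you actually plan to use.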

See here for a link to a useful article on power calculations by Russ Lenth. (The link is from June, 2005; many of you probably weren’t reading this blog back then…)

I like looking at credible intervals on effects better, anyway. If your CIs are too big, you know you are underpowered. The size of the CIs gives you the same information as a power analysis, for free.
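
A rough sketch of that connection, under a normal approximation in which the interval is the estimate plus or minus about 2 standard errors (an assumption that treats credible and confidence intervals alike): once you know the standard error, the power for any hypothesized true effect follows directly. The function name `power_from_se` and its parameters are hypothetical, chosen for illustration.

```python
from statistics import NormalDist


def power_from_se(effect, se, alpha=0.05):
    """Power of a two-sided z-test, given a true effect and a standard error.

    Normal-approximation sketch: the estimate is assumed to be distributed
    as Normal(effect, se), and the result counts as significant when
    |estimate| / se exceeds the two-sided critical value.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    shift = effect / se                 # true effect in standard-error units
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
```

Under these assumptions, a true effect of about 2.8 standard errors corresponds to roughly 80% power, so an interval whose half-width is close to the size of the effect you care about indicates a study with well under 80% power.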