We were looking at some correlations–within each state, the correlations between income and different measures of political ideology–and we wanted to get some sense of sampling variability. I vaguely remembered that the sample correlation has a variance of approximately 1/n–or was that 0.5/n, I couldn’t remember. So I did a quick simulation:

> corrs <- rep(NA, 1000)
> for (i in 1:1000) corrs[i] <- cor(rnorm(100), rnorm(100))
> mean(corrs)
[1] -0.0021
> sd(corrs)
[1] 0.1

Yeah, 1/n, that's right: with n = 100, a standard deviation of 0.1 corresponds to a variance of 0.01 = 1/n. That worked well. It was quicker and more reliable than looking it up in a book.

In your example, you produced samples of uncorrelated normal random variables. If instead they were normal with correlation rho, the asymptotic variance of the sample correlation would be (1/n) * (1 - rho^2)^2, which depends on the true rho but does indeed equal 1/n when rho = 0.
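That formula is easy to check the same way as before. Here is a quick sketch in R; rho = 0.5 and n = 100 are arbitrary choices for illustration, and the correlated pair is built directly rather than with a multivariate-normal generator:

```r
rho <- 0.5
n <- 100
corrs <- replicate(10000, {
  x <- rnorm(n)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # y has correlation rho with x
  cor(x, y)
})
var(corrs)          # simulated variance of the sample correlation
(1 - rho^2)^2 / n   # asymptotic approximation: 0.005625
```

The two numbers should agree to a couple of decimal places.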

However, it can be shown that Fisher's z-transformation, (1/2) * ln[(1 + r)/(1 - r)], has asymptotic variance equal to 1/n regardless of the true correlation.
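This, too, can be verified by simulation. A sketch, again with the arbitrary choices rho = 0.5 and n = 100:

```r
rho <- 0.5
n <- 100
z <- replicate(10000, {
  x <- rnorm(n)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # correlated normal pair
  r <- cor(x, y)
  0.5 * log((1 + r) / (1 - r))               # Fisher's z, i.e. atanh(r)
})
var(z)  # close to 1/n = 0.01 even though rho != 0
```

(The finite-sample variance of z is usually quoted as 1/(n - 3), which is what the simulation tracks most closely.)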

… or even

> sd(replicate(1000, cor(rnorm(100), rnorm(100))))

[1] 0.09903454

Nice to hear someone else jogs their memory this way.

In a similar vein, I tend to use simulation instead of formal power calculations. This has the benefit that, since I am analyzing the simulated data in EXACTLY the same way as I will analyze the final data, I know I haven't selected the wrong option in power-analysis software. Not only that, but I can get actual probabilities of false positives and false negatives for various effect sizes and inference methods.

For example, what is the probability that the mysterious decision function f will say that two coins are different based on the outcome of n flips if they have probabilities of landing heads of p1 and p2 respectively?

<pre>
k1 = rbinom(10000, n, p1)  # heads counts for coin 1 across 10000 simulated experiments
k2 = rbinom(10000, n, p2)  # heads counts for coin 2
z = rep(NA, 10000)         # initialize z before the loop
for (i in 1:10000) {
  z[i] = f(k1[i], n, k2[i], n)  # one simulated experiment at a time
}
mean(z)
</pre>
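The comment leaves the decision function f unspecified. As one hypothetical choice, f could declare the coins different whenever a two-sample proportion test (R's prop.test) rejects at the 5% level; n, p1, and p2 below are likewise made-up settings for illustration:

```r
# Hypothetical decision function: reject equality at the 5% level
f <- function(k1, n1, k2, n2) {
  prop.test(c(k1, k2), c(n1, n2))$p.value < 0.05
}

n <- 200
p1 <- 0.5
p2 <- 0.6
k1 <- rbinom(10000, n, p1)
k2 <- rbinom(10000, n, p2)
z <- mapply(f, k1, n, k2, n)
mean(z)  # fraction of simulations where f says "different", i.e. estimated power
```

With p1 = p2 the same code estimates the false-positive rate instead, which is exactly the symmetry the comment is pointing at.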

Current,

Thanks for the correction. I'm off the hook in this particular case: the actual rho in our example was around 0.2, so the variance factor (1 - rho^2)^2 = 0.92, which is close enough to 1 that I don't mind missing it. But in general your point is a good one, and also an illustration of the limitations of simulation!