## Inference from simulations

We talk a lot about inference from Markov chain simulation (Gibbs sampler and Metropolis algorithm), convergence, etc. But inference from simple random samples can be nonintuitive as well.

Consider this example of inference from direct simulation. The left column shows five replications of 95% intervals for a hypothetical parameter theta that has a unit normal distribution, each based on 100 independent simulation draws. Right column: five replications of the same inference, each based on 1000 draws. For both columns, the correct answer is [-1.96, 1.96].

```Inferences based on         Inferences based on
100 random draws            1000 random draws
[-1.79, 1.69]               [-1.83, 1.97]
[-1.80, 1.85]               [-2.01, 2.04]
[-1.64, 2.15]               [-2.10, 2.13]
[-2.08, 2.38]               [-1.97, 1.95]
[-1.68, 2.10]               [-2.10, 1.97]
```

From one perspective, these estimates are pretty bad: even with 1000 simulations, either bound can easily be off by more than 0.1, and the entire interval width can easily be off by 10%. On the other hand, for the goal of inference about theta, even the far-off estimates above aren’t so bad: the interval [-2.08, 2.38] has 97% probability coverage, and [-1.79, 1.69] has 92% coverage.

So, in this sense, my intuitions were wrong and wrong again: first, I thought 100 or 1000 independent draws were pretty good, so I was surprised that these 95% intervals were so bad. But, then, I realized that my corrected intuition was itself flawed: actually, nominal 95% intervals of [-2.08, 2.38] or [-1.79, 1.69] aren’t so bad at all.

This is an example I came up with for my chapter that’s going into our forthcoming Handbook of Markov Chain Monte Carlo (edited by Galin Jones, Xiao-Li Meng, Steve Books, and myself).

1. Daniel Lakeland says:

My intuition is that those seem like pretty reasonable intervals. I guess we have different intuitions here but I might have difficulty figuring out the point you were trying to make if you put this in a book and claimed that these intervals were intuitively "pretty bad".

Also, consider this:

covrg x[2];};

sum(replicate(10000,covrg(100)));

this last sum gives 0. In other words, in none of my 95% "confidence" intervals was the true value 0 excluded from the interval. So it seems like I should have much higher than 95% confidence that my true value is in my interval…

2. Daniel Lakeland says:

Duh Whoops. confidence intervals for the parameter… the "true value" isn't 0, only the expected value of the true value is zero…

Still I think there's some kind of useful point to be made there…

covrg x[2];};

sum(replicate(10000,covrg(100)));

gives 656/10000 = .0656 which means that the expected value for your parameter is only excluded from the interval 6.6% of the time when sample size is 100.

3. john says:

Or, you could just plot the Beta distribution with parameters(0.05N, 0.95N) to see what the sampled probabilities of coverage will be, although of course expected probability will be 0.05, or whatever level you pick – assuming your model is true and you aren't using asymptotic-only results.

4. Andrew Gelman says:

Daniel: Yes, I agree. My point is that the intervals are fine for inference about theta, but they're not particularly precise as computations of the precise 2.5% and 97.5% points.

John: Yes, in this case all could be done analytically. I was just using this example, in which the true answer is simple and known, to explore aspects of simulation-based computation of posterior intervals.