A question about AIC

Jacob Oaknin asks:

Akaike's selection criterion is often justified on the basis of the empirical risk of an ML estimate being a biased estimate of the true generalization error of a parametric family, say the family, S_m, of linear regressors on an m-dimensional variable x=(x_1,..,x_m) with Gaussian noise independent of x (for instance in “Unifying the derivations for the Akaike and corrected Akaike information criteria,” by J. E. Cavanaugh, Statistics and Probability Letters, vol. 33, 1997, pp. 201-208).

On the other hand, the family S_m is known to have finite VC dimension (VC = m+1), and this fact should guarantee that the empirical risk minimizer is asymptotically consistent regardless of the underlying probability distribution, and in particular for the assumed Gaussian distribution of the noise (“An overview of statistical learning theory,” by V. N. Vapnik, IEEE Transactions on Neural Networks, vol. 10, no. 5, 1999, pp. 988-999).

What am I missing?

My reply: I’m no expert on AIC so let’s see if I can wing this and give a reasonable response without fully understanding either the question or the answer. . . . Here goes: As the saying goes, asymptotically we’re all dead. The AIC correction is of order O(1). If you divide by n so that you’re estimating average prediction error rather than total prediction error, the correction is of order O(1/n). So asymptotically the uncorrected estimate of the average prediction error is consistent, just as you’d like.
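The point about orders of magnitude can be checked numerically. Below is a minimal sketch (my own illustration, not from the original exchange) that simulates OLS regression with Gaussian noise and measures the "optimism" of the empirical risk: the gap between out-of-sample and in-sample mean squared error. The number of predictors k, the noise variance, and the replication count are all arbitrary choices for the demo. The total optimism behaves like 2k·σ²/n, so as an estimate of average prediction error the bias vanishes at rate O(1/n), which is the consistency the questioner expects.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5          # number of regression coefficients (arbitrary for this demo)
sigma2 = 1.0   # noise variance

def optimism(n, reps=2000):
    """Average gap between out-of-sample and in-sample MSE for OLS
    with n observations, estimated over many simulated replications."""
    gaps = []
    beta = np.ones(k)  # arbitrary true coefficients
    for _ in range(reps):
        # Fit on a training sample of size n
        X = rng.normal(size=(n, k))
        y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
        bhat = np.linalg.lstsq(X, y, rcond=None)[0]
        in_mse = np.mean((y - X @ bhat) ** 2)
        # Evaluate on a fresh sample from the same distribution
        Xt = rng.normal(size=(n, k))
        yt = Xt @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
        out_mse = np.mean((yt - Xt @ bhat) ** 2)
        gaps.append(out_mse - in_mse)
    return float(np.mean(gaps))

for n in (50, 200, 800):
    # The gap shrinks roughly like 2 * k * sigma2 / n as n grows
    print(n, optimism(n))
```

The per-observation bias (what the AIC penalty corrects, divided by n) goes to zero, so the uncorrected average-risk estimate is consistent; the correction still matters at any fixed finite n, which is where AIC earns its keep.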

1 thought on “A question about AIC”

  1. I think the missing concept is “asymptotically unbiased”, a property which is necessary but not sufficient for consistency.