Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models

Paul, Jonah, and Aki write:

Cross-validation can be used to measure a model’s predictive accuracy for the purpose of model comparison, averaging, or selection. Standard leave-one-out cross-validation (LOO-CV) requires that the observation model can be factorized into simple terms, but a lot of important models in temporal and spatial statistics do not have this property or are inefficient or unstable when forced into a factorized form. We derive how to efficiently compute and validate both exact and approximate LOO-CV for any Bayesian non-factorized model with a multivariate normal or Student-t distribution on the outcome values. We demonstrate the method using lagged simultaneously autoregressive (SAR) models as a case study.
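The paper works out the full details, but the key fact it exploits in the multivariate normal case is that the leave-one-out predictive density p(y_i | y_{−i}) is available in closed form from the conditional-normal formulas, so no refitting is needed. A minimal NumPy sketch for a single (mu, Sigma), hedged: in a real Bayesian workflow this would be evaluated per posterior draw and combined (e.g. via importance sampling), and the function name here is made up for illustration.

```python
import numpy as np
from scipy import stats

def loo_predictive_density(y, mu, Sigma, i):
    """Exact LOO predictive density p(y_i | y_{-i}) for y ~ N(mu, Sigma).

    Sketch of the standard conditional-normal formula, assuming mu and
    Sigma are given (in practice they come from posterior draws).
    """
    idx = np.arange(len(y)) != i
    S_oo = Sigma[np.ix_(idx, idx)]       # covariance of y_{-i}
    S_io = Sigma[i, idx]                 # cross-covariance of y_i with y_{-i}
    w = np.linalg.solve(S_oo, S_io)      # Sigma_{-i,-i}^{-1} Sigma_{-i,i}
    cond_mean = mu[i] + w @ (y[idx] - mu[idx])
    cond_var = Sigma[i, i] - S_io @ w
    return stats.norm.pdf(y[i], loc=cond_mean, scale=np.sqrt(cond_var))
```

A quick sanity check is that this equals the ratio of the joint density of y to the marginal density of y_{−i}, since p(y_i | y_{−i}) = p(y) / p(y_{−i}).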

Aki’s post from last year, “Moving cross-validation from a research idea to a routine step in Bayesian data analysis,” connects this to the bigger picture.

8 thoughts on “Efficient leave-one-out cross-validation for Bayesian non-factorized normal and Student-t models”

  1. Tangential to the post: I never understood the benefit of LOO-CV compared to 10-fold or bootstrap validation. Training on N−1 data points, predicting the single held-out point, and repeating across all data points seems like a way for models to be overtrained: the original (full) dataset barely changes at all! Maybe LOO-CV is more useful for small datasets?

    • LOO is a way to estimate the expected loss for a single new data point, having trained the model on all of the other available data points. I’m not sure what “overtrained” means here; an overfit model will show up in this score like any other. And when I deploy models, I don’t want to leave any training data on the table. Why would I?

      • You don’t leave training data on the table: you fit your selected model to your full dataset afterwards. That is different from selecting a model that generalizes best, based on the cross-validation results.

    • LOO typically has lower variance. For estimating out-of-sample performance, it can also reduce the bias that comes from training on a smaller set than you will actually use.

      There aren’t a whole lot of general results for the behavior of CV estimators.

    • There are computational tricks for computing exact LOO-CV without having to refit the model N times (the classic least-squares version uses the hat matrix, I think; I can’t remember the Bayesian one). So it’s a lot more efficient.
