The fundamental role of data partitioning in predictive model validation

David Zimmerman writes:

I am a grad student in biophysics and basically a novice to Bayesian methods. I was wondering if you might be able to clarify something that is written in section 7.2 of Bayesian Data Analysis. After introducing the log pointwise predictive density as a scoring rule for probabilistic prediction, you say:

The advantage of using a pointwise measure, rather than working with the joint posterior predictive distribution … is in the connection of the pointwise calculation to cross-validation, which allows some fairly general approaches to approximation of out-of-sample fit using available data.

But would it not be possible to do k-fold cross-validation, say, with a loss function based on the joint predictive distribution over each full validation set? Can you explain why (or under what circumstances) it is preferable to use a pointwise measure rather than something based on the joint predictive?

My reply: Yes, for sure you can do k-fold cross-validation. Leave-one-out (LOO) has the advantage of being automatic to implement in many models using Pareto-smoothed importance sampling, but for structured problems such as time series and spatial models, k-fold can make more sense. The reason we made such a big deal in our book about the pointwise calculation was to emphasize that predictive validation is fundamentally a process that involves partitioning the data. This aspect of predictive validation is hidden by AIC and related expressions such as DIC that work with the unpartitioned joint likelihood. When writing BDA3 we worked to come up with an improvement or replacement for DIC (the result was chapter 7 of BDA3, along with this article with Aki Vehtari and Jessica Hwang), and part of this was a struggle to manipulate the posterior simulations of the joint likelihood. At some point I realized that the partitioning was necessary, and this point struck me as important enough to emphasize when writing all this up.
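
To make the distinction in the question concrete, here is a minimal sketch comparing the two scores under k-fold partitioning. It is not taken from BDA3 or from any package. It assumes a toy conjugate model (normal data with known noise variance and a normal prior on the mean), chosen only because the predictive densities then have closed form. All function names and default values are hypothetical. The joint score for a fold is computed with the chain rule, conditioning on each held-out point in turn, while the pointwise score conditions every held-out point on the training data alone.

```python
import math
import random


def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def posterior(ys, prior_mean, prior_var, noise_var):
    """Conjugate posterior (mean, variance) for mu, given ys ~ N(mu, noise_var)."""
    precision = 1.0 / prior_var + len(ys) / noise_var
    mean = (prior_mean / prior_var + sum(ys) / noise_var) / precision
    return mean, 1.0 / precision


def pointwise_lpd(train, held_out, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Sum of log p(y_i | train): each held-out point is scored separately."""
    m, v = posterior(train, prior_mean, prior_var, noise_var)
    return sum(normal_logpdf(y, m, v + noise_var) for y in held_out)


def joint_lpd(train, held_out, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """log p(held_out | train) for the whole fold, built up by the chain rule."""
    seen = list(train)
    total = 0.0
    for y in held_out:
        m, v = posterior(seen, prior_mean, prior_var, noise_var)
        total += normal_logpdf(y, m, v + noise_var)
        seen.append(y)  # later held-out points condition on earlier ones
    return total


def kfold_scores(ys, k):
    """Return (pointwise, joint) log predictive scores summed over k folds."""
    folds = [ys[i::k] for i in range(k)]
    pointwise = joint = 0.0
    for i, held_out in enumerate(folds):
        train = [y for j, fold in enumerate(folds) if j != i for y in fold]
        pointwise += pointwise_lpd(train, held_out)
        joint += joint_lpd(train, held_out)
    return pointwise, joint


# Simulated data, for illustration only.
random.seed(1)
data = [random.gauss(2.0, 1.0) for _ in range(20)]
for k in (4, len(data)):
    pw, jt = kfold_scores(data, k)
    print(f"k={k}: pointwise={pw:.3f}, joint={jt:.3f}")
```

When k equals the number of data points, every fold holds a single observation and the two scores coincide, which is the leave-one-out case. With larger folds they differ, because the joint score accounts for the dependence among held-out points that comes from shared uncertainty about the parameter.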

And here’s Aki’s cross-validation FAQ and two of his recent posts on the topic:

from 2020: More limitations of cross-validation and actionable recommendations

from 2022: Moving cross-validation from a research idea to a routine step in Bayesian data analysis
