Cross validation and model assessment: comparing models with different likelihoods? The answer’s in Aki’s cross validation FAQ!

Nick Fisch writes:

After reading your paper “Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC”, I am curious as to whether the criteria WAIC or PSIS-LOO can be used to compare models that are fit using different likelihoods? I work in fisheries assessment, where we are frequently fitting highly parameterized nonlinear models to multiple data sources using MCMC (generally termed “integrated fisheries assessments”). If I build two models that solely differ in the likelihood specified for a specific data source (one Dirichlet, the other Multinomial), would WAIC or loo be able to distinguish these, or must I use some other method to compare the models (such as goodness of fit, sensitivity, etc.)? I should note that the posterior distribution will be the unnormalized posterior distribution in these cases.

My response: for discrete data I think you’d just want to work with the log probability of the observed outcome (log p), and it would be fine if the families of models are different. I wasn’t sure what the best solution was for continuous variables, so I forwarded the question to Aki, who wrote:

This question is answered in my [Aki’s] cross validation FAQ:

12 Can cross-validation be used to compare different observation models / response distributions / likelihoods?

First, to make the terms clear: p(y∣θ) as a function of y is an observation model, and p(y∣θ) as a function of θ is a likelihood. It is better to ask “Can cross-validation be used to compare different observation models?”

– You can compare models given different discrete observation models, and it’s also allowed to have different transformations of y as long as the mapping is bijective (the probabilities will then stay the same).
– You can’t compare densities and probabilities directly. Thus you can’t compare models given continuous and discrete observation models, unless you compute probabilities over intervals from the continuous model (also known as discretising the continuous model).
– You can compare models given different continuous observation models, but only if you have exactly the same y (loo functions in rstanarm and brms check that the hash of y is the same). If y is transformed, then the Jacobian of that transformation needs to be included; there is an example of this in the mesquite case study (and a sketch of the adjustment just below this list).
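To make the Jacobian point concrete, here’s a minimal sketch in Python, with simulated draws standing in for real posterior draws (the models, parameter values, and names are made up for illustration). Model A is a normal model for y and model B is a normal model for log(y); adding the log absolute Jacobian of z = log(y), which is −log(y), puts model B’s pointwise log densities back on the scale of y so the two matrices are comparable:

```python
import numpy as np
from scipy import stats

# Hypothetical example: model A is normal on y, model B is normal on log(y).
# The draws below stand in for posterior draws from two fitted models.
rng = np.random.default_rng(2)
y = rng.lognormal(mean=1.0, sigma=0.5, size=40)       # shared positive outcome
mu_a = rng.normal(3.0, 0.1, size=1000)                # model A draws (scale of y)
sigma_a = np.abs(rng.normal(1.5, 0.1, size=1000))
mu_b = rng.normal(1.0, 0.05, size=1000)               # model B draws (scale of log y)
sigma_b = np.abs(rng.normal(0.5, 0.05, size=1000))

# Model A: pointwise log densities of y directly (S draws x N observations).
log_p_a = stats.norm.logpdf(y[None, :], mu_a[:, None], sigma_a[:, None])

# Model B: log density of log(y) plus the log absolute Jacobian of z = log(y),
# which is log|dz/dy| = log(1/y) = -log(y).
log_p_b = (stats.norm.logpdf(np.log(y)[None, :], mu_b[:, None], sigma_b[:, None])
           - np.log(y)[None, :])

# With the Jacobian included, both matrices are log densities for the same y
# and can be fed to the same PSIS-LOO routine (e.g., loo in R or arviz in Python).
```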

It is better to use cross-validation than WAIC as the computational approximation in WAIC fails more easily and it’s more difficult to diagnose when it fails.
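And to make the discrete-data point concrete (my “just work with the log probability of the observed outcome,” and Aki’s first bullet), here’s a similar sketch, again with simulated stand-ins for posterior draws. A Poisson model and a negative binomial model assign probabilities to the same observed counts, so their pointwise log probabilities are directly comparable; the negative binomial uses the (mu, phi) parameterization mapped to scipy’s (n, p) via n = phi, p = phi/(phi + mu):

```python
import numpy as np
from scipy import stats

# Hypothetical example: the same observed counts y scored under a Poisson
# model and a negative binomial model. The draws stand in for posterior draws.
rng = np.random.default_rng(1)
y = rng.poisson(5.0, size=50)              # observed counts, shared by both models
lam = rng.gamma(50.0, 0.1, size=1000)      # Poisson-model draws of the rate
mu = rng.gamma(50.0, 0.1, size=1000)       # neg. binomial draws of the mean
phi = rng.gamma(20.0, 0.5, size=1000)      # neg. binomial draws of the dispersion

# Pointwise log probabilities of the observed outcomes (S draws x N observations).
log_p_pois = stats.poisson.logpmf(y[None, :], lam[:, None])
log_p_nb = stats.nbinom.logpmf(y[None, :],
                               phi[:, None],
                               phi[:, None] / (phi[:, None] + mu[:, None]))

def lpd(log_p):
    """Sum over observations of the log posterior-mean predictive probability."""
    S = log_p.shape[0]
    return (np.logaddexp.reduce(log_p, axis=0) - np.log(S)).sum()

print("Poisson lpd:          ", lpd(log_p_pois))
print("Negative binomial lpd:", lpd(log_p_nb))
# In practice you'd feed these S x N matrices to a PSIS-LOO implementation
# (e.g., the loo package in R or arviz in Python) rather than rely on in-sample lpd.
```

Because both models put probability mass on the same discrete y, the scales match; mixing a density with a probability mass function would not be comparable without first discretising the continuous model, as noted above.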

P.S. Nick Fisch is a Graduate Research Fellow in Fisheries and Aquatic Sciences at the University of Florida. How cool is that? I’m expecting to hear very soon from Nick Beef at the University of Nebraska.
