Multiple imputation for model checking

Posted on April 8, 2008 12:33 AM by Andrew

Greg Ward writes,

I have a question regarding a statement in your publication, “Multiple Imputation for Model Checking…” In the introduction, section 1.1, you mention: “Even if there is a full model for the observation process (and, hence, it is not a problem to simulate replications of the observed data)…”.

I was wondering if you could offer a reference for, or any guidance on, the part in parentheses. I have an issue (that I am unsure how to solve) involving data that are grouped, by rounding error, to the nearest cm. The grouped data exhibit a log-normal distribution. I would like to transform the data to normality for use in a regression; however, the binning (coarsening?) will not permit this. Because I have the mean and variance, I was thinking that perhaps I could simulate the distribution and use the simulated data as a transformable, continuous distribution. Your statement, to me, seemed to agree with this thought… Am I correct?

My reply: You can model the underlying (unrounded) values and then integrate their density over each rounding interval to get a rounded-data likelihood. I think we have an example in a homework exercise in chapter 3 of Bayesian Data Analysis, and we have a similar censored-data example in chapter 18 of ARM. Having fit the model, you can then check it by comparing observed to replicated data, as discussed in the paper you mention above.
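To make the rounded-data likelihood concrete, here is a minimal sketch in Python. It is not taken from either book. It assumes the underlying measurements are log-normal with parameters mu and sigma and that each recorded value y means the true value fell in [y - 0.5, y + 0.5), so each observation contributes F(y + 0.5) - F(y - 0.5) to the likelihood. For simplicity it uses a maximum-likelihood grid search and plug-in replications instead of a full Bayesian fit. The function names, the grid ranges, the simulated data, and the choice of the maximum as a test statistic are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rounded_loglik(data, mu, sigma, width=1.0):
    """Log-likelihood of data rounded to the nearest `width`.

    Each distinct rounded value y contributes
    count * log(F(y + width/2) - F(y - width/2)).
    """
    total = 0.0
    for y, count in Counter(data).items():
        p = (lognormal_cdf(y + width / 2, mu, sigma)
             - lognormal_cdf(y - width / 2, mu, sigma))
        total += count * math.log(max(p, 1e-300))  # guard against log(0)
    return total


def fit_grid(data, mus, sigmas):
    """Crude maximum-likelihood fit: best (mu, sigma) on a grid."""
    _, mu, sigma = max((rounded_loglik(data, m, s), m, s)
                       for m in mus for s in sigmas)
    return mu, sigma


def replicate(mu, sigma, n, rng, width=1.0):
    """Simulate n replicated observations, rounded like the real data."""
    return [width * round(rng.lognormvariate(mu, sigma) / width)
            for _ in range(n)]


# Fake data standing in for the real measurements (assumed values).
rng = random.Random(0)
observed = replicate(1.0, 0.5, 400, rng)

mus = [0.6 + 0.02 * i for i in range(41)]     # grid for mu
sigmas = [0.2 + 0.02 * i for i in range(31)]  # grid for sigma
mu_hat, sigma_hat = fit_grid(observed, mus, sigmas)

# Model check: compare a test statistic (here the maximum) between the
# observed data and replications drawn from the fitted model.
reps = [replicate(mu_hat, sigma_hat, len(observed), rng) for _ in range(200)]
p_value = sum(max(r) >= max(observed) for r in reps) / len(reps)

print("estimates:", round(mu_hat, 2), round(sigma_hat, 2))
print("check p-value for max:", p_value)
```

In a full Bayesian analysis you would draw (mu, sigma) from the posterior and simulate one replicated dataset per draw, rather than plugging in a single point estimate. The regression the question asks about would then be fit with the same interval likelihood, not on simulated continuous values.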