“Missing at random” and “distinct parameters”

Etienne Rivot sends in a question about models for missing data. The issues are subtle and, I think, of general interest (since we all have missing data!). They are covered in Chapter 7 of Bayesian Data Analysis, but it always helps to see these theoretical ideas in the context of a specific example.

Rivot writes:

I [Rivot] am currently writing a paper to be submitted to a fisheries review, and one of the referees raised a problem with the treatment of the missing data in our model. In a first version of the manuscript, we argued that the missing data generating process was “ignorable” (because we argued that the 2 conditions, 1) missing at random and 2) distinct parameters, were verified). But the referee argues that the “distinct parameters” condition was NOT verified.
I would greatly appreciate your opinion about that. Please find below a short description of the model and of the problem:

The problem
——————————–
Objective : estimate the number of fish, say N, in a particular site in a river
Method : successive removal method via electrofishing
Data :
– site i=1,…,n
– C1 : capture at the first pass (fish are captured by means of electro-fishing)
sampling equation : C1(i) ~ Binom(N(i),p(i))
– C2 : capture at the second pass (the same experiment with the same capture probability p(i))
sampling equation : C2(i) ~ Binom(N(i)-C1(i),p(i))
Hierarchical Bayesian model :
– priors on p(i) and N(i) have a hierarchical structure across sites i=1,…,n

The missing data problem :
Sometimes, the population size N(i) is so low that the result of the first pass is C1(i) = 0. In that case, the field crew often do not perform the second pass. Then C2(i) = NA.

The argumentation of the referee :
The “distinct parameters” condition IS NOT verified because the probability of missing data at site i depends upon the population size N(i). Indeed, the smaller the population size N(i), the greater the probability of obtaining C1 = 0 at the first pass, and the greater the probability of missing data at the second pass. Then, the parameters of the missing data generating process (i.e. the probability of a missing value C2 = NA at site i) are NOT independent of the parameters of the data generating process (i.e. N(i)).

Our argumentation (but we may be wrong) :
The “distinct parameters” condition IS verified because the probability of missing data at site i depends ONLY upon the observed value C1(i) : if C1(i) = 0, then the probability of having a missing value is greater. From that point of view, the parameters of the missing data generating process (i.e. the probability of a missing value C2 = NA at site i) depend only upon the observations C1(i) and can then be considered as independent of the unobserved latent variable N(i).
——————————–

Who is wrong ? Is it only a question of “point of view” ?

My reply: Certainly the fact that C1=0 provides information about N. But, yes, if C1 is observed, and only C1 determines whether C2 is measured, then the data are missing at random. However, you say that the field crew “often do not perform the second pass.” If, for example, they are more likely to perform the second pass when they believe there are fish at the site, then the data are _not_ missing at random.

But that has nothing to do with “distinct parameters.” The “distinct parameters” issue is different. Here, the two sets of parameters that must be “distinct” are:

(a) The usual parameters of your model: N, p, and your hierarchical parameters;

(b) The parameters of your missingness model: in this case, the probability that C2 is measured, if C1=0.

Violation of “distinct parameters” would occur if the parameters in (a) and (b) were dependent in their prior distribution; that is, if the proportion of missing cases were informative about N, p, etc. I see no reason to believe this is the case.
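
To spell out how the two conditions enter, here is a sketch of the factorization for a single site where the second pass was skipped. The notation is my own shorthand and not part of the exchange above: θ stands for the parameters in (a), φ for the parameters in (b), and I is an indicator that equals 0 when C2 is missing.

```latex
% theta: model parameters (N, p, hierarchical parameters); phi: missingness parameters
% I = 1 if the second pass is performed, I = 0 if C_2 is missing
\begin{align*}
p(C_1, I = 0 \mid \theta, \phi)
  &= \sum_{C_2} p(C_1 \mid \theta)\, p(C_2 \mid C_1, \theta)\, p(I = 0 \mid C_1, C_2, \phi) \\
  &= p(C_1 \mid \theta)\, p(I = 0 \mid C_1, \phi)
     && \text{(missing at random: $I$ depends only on observed $C_1$)} \\[4pt]
p(\theta \mid C_1, I = 0)
  &\propto \int p(\theta, \phi)\, p(C_1 \mid \theta)\, p(I = 0 \mid C_1, \phi)\, d\phi \\
  &= p(\theta)\, p(C_1 \mid \theta) \int p(\phi)\, p(I = 0 \mid C_1, \phi)\, d\phi
     && \text{(distinct parameters: $p(\theta,\phi) = p(\theta)\,p(\phi)$)} \\
  &\propto p(\theta)\, p(C_1 \mid \theta).
\end{align*}
```

The remaining integral does not involve θ, so it drops out as a constant. Missing at random is what removes C2 from the missingness term in the second line; distinct parameters is what lets the prior split in the fourth line. The two conditions do separate jobs.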

So I think you’re right.
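
The conclusion can also be checked numerically for a single site. The sketch below is only an illustration under simplifying assumptions that are not in Rivot's model: p is treated as known, N gets a flat prior on a finite grid (instead of the hierarchical priors), and `q_skip` is a hypothetical parameter for the probability that the crew skips the second pass when C1 = 0. It computes the posterior for N with and without the missingness model.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability P(K = k) for n trials with success probability p."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def posterior_N(c1, c2, p, n_max, q_skip=None):
    """Posterior over N = 0..n_max with a flat prior and known p (assumptions).

    c2 is None when the second pass was skipped.
    q_skip, if given, is the hypothetical probability that the second pass
    is skipped when c1 == 0 (it is never skipped when c1 > 0).
    """
    weights = []
    for n in range(n_max + 1):
        like = binom_pmf(c1, n, p)               # first pass: C1 ~ Binom(N, p)
        if c2 is not None:
            like *= binom_pmf(c2, n - c1, p)     # second pass: C2 ~ Binom(N - C1, p)
        if q_skip is not None:
            # Missingness model: depends only on the observed c1, not on n.
            p_skip = q_skip if c1 == 0 else 0.0
            like *= p_skip if c2 is None else (1 - p_skip)
        weights.append(like)
    total = sum(weights)
    if total == 0:
        raise ValueError("observed data have zero probability under this model")
    return [w / total for w in weights]

# A site with C1 = 0 and no second pass.
ignoring = posterior_N(c1=0, c2=None, p=0.5, n_max=30)
modeling = posterior_N(c1=0, c2=None, p=0.5, n_max=30, q_skip=0.7)
max_diff = max(abs(a - b) for a, b in zip(ignoring, modeling))
```

Because the missingness factor depends only on the observed C1, it is the same for every candidate N and cancels in the normalization, so `max_diff` is zero up to floating-point error. The posterior still favors small N, which is the sense in which C1 = 0 is informative about N without making the missingness nonignorable.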

1 thought on ““Missing at random” and “distinct parameters””

  1. I do find this counter intuitive (=> True???)

    The C1 and C2 are repeated count measurements within the same sampling unit. For purposes of this example, we can simplify things and say that the second measurement, C2, is observed if and only if C1>0. (So, there isn't any phi parameter in the notation of Gelman et al. 2004).

    Intuition tells me that this is like the nonignorable design on p. 206 of Gelman et al. 2004, in which the number of patients receiving a treatment depends on the expected size of the effect of the treatment. In the above example, there is a dependence between the expected count in the unit and the observation of the second count (within that unit). Doesn't that mean that the model parameters influence the "missingness", and hence the distinct parameters assumption is not satisfied???

    If I'm wrong, then doesn't that imply that we can stop measuring a patient if they start getting sick (say)?

    Any comments appreciated,

    Russell
