Modeling positive or negative correlations within groups

Gregor writes in with a (long) question in two parts. I’ll give each part, followed by my reply.

Gregor’s first question:

I [Gregor] have analysed data from an ethological (animal behaviour) experiment in which the behaviour of laying hens was observed. The intention of the experiment was to estimate the effect of environmental enrichment in hen cages on the behaviour of the hens. Each block in the design looked like this:

cage 1, cage 2, cage 3, cage 4
cage 5, cage 6, cage 7, cage 8

Two hens were in each cage together. Treatment (yes or no) was randomly applied to cages. 5 blocks were used and the whole thing was observed for one day in 5 weeks. Some traits (say, drinking, biting, …) were recorded as count events when they happened. Since the values per hour showed a substantial excess of zeros (sometimes more than 50%), I summed the values for each hen per day and used the model below, with offset = log(24). Now my question relates to independence between hens. Note that there were two hens in a cage, so there may be some dependency between them. Researchers in this field are notoriously strict about independence between units (animals) and I would like to hear your opinion. I guess that when I sum the values over the day, this more or less cancels out, but what if I analysed values per hour rather than per day? This might also be a more general question, since “cages” can be large and hold many animals, and a friend of mine once had a paper rejected because of this issue.

My reply: Yes, you can have positive interaction within cages (e.g., the hens stimulate behavior in each other, or some cages are in better locations than others), or negative interaction (e.g., the hens compete, or play different roles, so that if one bites more, the other bites less). To simplify, consider a continuous outcome. Here, you can model the 2 hens’ outcomes in a cage as a draw from a bivariate normal distribution, with the within-cage correlation being a parameter that is estimated from the data. In your particular Poisson regression model, it would be the log-intensity parameters that would be given the bivariate normal model.
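
To make the structure concrete, here is a minimal simulation sketch in Python of the kind of model described above; it is not Gregor's actual model, and it simulates rather than fits. Each cage gets a pair of log-intensities drawn from a bivariate normal with within-cage correlation rho, and the daily counts are Poisson with the log(24) offset. The function name and the values of sigma, base_log_rate, and the number of cages are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_cage_counts(n_cages, rho, sigma=0.5, base_log_rate=-1.0,
                         hours=24, seed=0):
    """Simulate daily counts for two hens per cage.

    The two log-intensities (log events per hour) in a cage are drawn from a
    bivariate normal with common mean base_log_rate, standard deviation sigma,
    and within-cage correlation rho. All default values are illustrative
    assumptions, not estimates from any data.
    """
    rng = np.random.default_rng(seed)
    cov = sigma**2 * np.array([[1.0, rho],
                               [rho, 1.0]])
    mean = [base_log_rate, base_log_rate]
    # One row per cage, one column per hen.
    log_rate = rng.multivariate_normal(mean, cov, size=n_cages)
    # Offset log(hours) turns an hourly rate into an expected daily count.
    counts = rng.poisson(np.exp(log_rate + np.log(hours)))
    return log_rate, counts

if __name__ == "__main__":
    for rho in (0.6, -0.6):
        log_rate, counts = simulate_cage_counts(n_cages=20000, rho=rho)
        r_latent = np.corrcoef(log_rate[:, 0], log_rate[:, 1])[0, 1]
        r_counts = np.corrcoef(counts[:, 0], counts[:, 1])[0, 1]
        print(f"rho={rho:+.1f}  latent corr={r_latent:+.2f}  "
              f"count corr={r_counts:+.2f}")
```

The correlation between the observed counts has the same sign as rho but is closer to zero, because the Poisson sampling noise is independent across the two hens. In a real analysis rho would be a parameter estimated from the data, for example in a multilevel Poisson regression, rather than fixed in advance as it is here.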

Gregor’s second question:

I [Gregor] talked yesterday with a friend who had a paper rejected due to “independence between animals”. They had a similar experiment to the one below (animal behaviour) with environmental enrichment (with or without some material to keep the animals busy). The design was like this

group1 group4
group2 group5
group3 group6

group1 and group4 had treatment1 – hay
group2 and group5 had treatment2 – straw
group3 and group6 had treatment3 – control

Each group had 16 animals (pigs). Castrated males were in groups 1-3, while females were in groups 4-6. Animal behaviour was recorded per animal for some events, and per group for events where it was not possible to accurately identify individual animals, for example aggressive encounters. Observations were done three times during the growing period. The whole experiment was done in two replications.

The analysis was not done properly, since it did not account for repeated observations per animal, but the main critique from the reviewer was that data per animal are not independent, since one animal may be aggressive and this influences the behaviour of the other animals in the group. Hierarchy is really evident in pigs and I agree with the reviewer on this point to some extent. However, I don’t know any way out from here for the analysis of such data.

My reply: The best would be to get more outcome data at the pig level, I’d think. Otherwise you could again fit a model for the 16 outcomes in each group, allowing a negative correlation. There are probably other useful approaches out there, but I’m just not familiar with the animal-statistics literature. I expect they have some standard models.
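
One way to read "a model for the 16 outcomes, allowing a negative correlation" is an exchangeable covariance matrix, in which every pair of pigs in a group shares the same correlation rho. This is only a sketch under that assumption; the function names and default values below are made up for illustration. A useful fact here is that with n animals the matrix is positive definite only if rho > -1/(n-1), so with 16 pigs the common correlation cannot go below -1/15, or about -0.067.

```python
import numpy as np

def exchangeable_cov(n, rho, sigma=1.0):
    """Covariance matrix with variance sigma^2 on the diagonal and the same
    correlation rho between every pair of the n animals in a group."""
    lower = -1.0 / (n - 1)
    if not (lower < rho < 1.0):
        raise ValueError(
            f"rho must lie in ({lower:.4f}, 1) for n={n}; got {rho}")
    return sigma**2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

def simulate_group(n=16, rho=-0.05, sigma=1.0, n_draws=1000, seed=0):
    """Draw n_draws hypothetical groups of n continuous outcomes each.
    All default values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    cov = exchangeable_cov(n, rho, sigma)
    return rng.multivariate_normal(np.zeros(n), cov, size=n_draws)

if __name__ == "__main__":
    y = simulate_group(n=16, rho=-0.05, n_draws=20000)
    corr = np.corrcoef(y, rowvar=False)
    off_diag = corr[~np.eye(16, dtype=bool)]
    print("mean pairwise correlation:", off_diag.mean().round(3))
    # Negative correlation shrinks the variance of the group total below
    # the value of 16 that independent unit-variance outcomes would give.
    print("variance of group total:", y.sum(axis=1).var().round(2))
```

The bound on rho is the practical catch: in a group of 16, competition can only induce a small negative common correlation, so a richer structure (for example, a dominant animal versus the rest) would need a different covariance model than this one.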