# Panels of judges

John Kastellec writes,

My [John’s] research question involves the voting behavior of judges on the U.S. Courts of Appeals. To give a little background, which you may or may not know already, there are 12 circuits in the U.S., divided by geographic region, that consist of anywhere from 6 to 28 judges. Almost all cases heard by these courts are heard by panels of three judges, selected (for all intents and purposes) randomly from all the judges in the circuit. Once the panel reaches a decision, the losing party can appeal either to the entire circuit (which can decide to hear the case en banc, meaning every judge on the circuit sits together to decide the case) or directly to the Supreme Court.

The response variable is whether the panel issues a liberal or conservative decision (defined in conventional legal terms). I want to model how the ideological composition of the panel (i.e. whether there are two conservative judges sitting with one liberal judge, or vice versa) in combination with the ideological composition of the entire circuit and the Supreme Court affects the probability of a liberal decision. So, for example, if two conservative judges are on a panel but the entire circuit is liberal, they are more likely to issue a liberal ruling for fear of being reversed by the entire circuit.

The data are randomly sampled (but see below) cases from the courts of appeals over many years (probably I will use 1961 to 1996). Thus, cases are individuals and circuits are groups.
The main individual-level predictors describe the composition of the panel. The group-level predictors are the circuit median and the size of the circuit, which are measured yearly. Here are my questions:

1) Is the best strategy to run separate models for each year? I had initially thought about pooling cases from every 5 years or so, but this would introduce a lot of measurement error into the group-level variables, since they change each year. The downside of year-by-year models is that there would be lots and lots of them.

2) Many of the cases in the data are heard by the same three judges, and thus are not independent observations. Should I use robust standard errors, clustered on the panel? If so, are there special considerations for doing so in multilevel observations? (I didn’t see anything in Gelman/Hill about this.)

3) The data are not randomly sampled across the entire courts of appeals, but within each circuit. That is, for each year, about 25 cases are sampled for each circuit, even though some circuits hear many more cases than others. Is it necessary to weight the data, given that my main inferences will be within-circuit?
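
To make the sampling design in question 3 concrete, here is a minimal sketch of the inverse-probability weights it implies. The circuit names and caseloads are invented for illustration and are not from John's data; the only figure taken from the question is the roughly 25 sampled cases per circuit per year.

```python
# Hypothetical caseloads for a single year (made-up numbers, not real data).
cases_heard = {"circuit_A": 400, "circuit_B": 1500, "circuit_C": 900}

# "About 25 cases are sampled for each circuit" per year, as stated in the question.
SAMPLED_PER_CIRCUIT = 25


def sampling_weights(caseloads, n_sampled):
    """Return, per circuit, how many heard cases each sampled case represents."""
    return {circuit: n / n_sampled for circuit, n in caseloads.items()}


weights = sampling_weights(cases_heard, SAMPLED_PER_CIRCUIT)
for circuit, w in weights.items():
    print(f"{circuit}: each sampled case stands for {w:.0f} cases heard")
```

Under these assumptions the weight is the same for every case within a given circuit-year, so it rescales a circuit as a whole; it only changes the relative influence of cases when circuits are pooled together.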

My response: