Matching at two levels

Steve Porter writes with a question about matching for inferences in a hierarchical data structure. I’ve never thought about this particular issue, but it seems potentially important.

Maybe one or more of you have some useful suggestions?

Porter writes:

After immersing myself in the relatively sparse literature on propensity scores with clustered data, it seems as if people take one of two approaches. If the treatment is at the cluster-level (like school policies), they match on only the cluster-level covariates. If the treatment is at the individual level, they match on individual-level covariates. (I have also found some papers that match on individual-level covariates when it seems as if the treatment is really at the cluster-level.) But what if there is a selection process at both levels?

For my research question (effect of tenure systems on faculty behavior) there is a two-step selection process: first colleges choose whether to have a tenure system for faculty; then faculty choose whether to work for a college that has a tenure system. My concern is that there will be differences between treated and untreated at both levels, and matching at only one level will not achieve balance for covariates at the other level. My idea for handling this is a three-step process: first, match multiple controls to treated schools to balance at the cluster-level, then using only faculty in the matched school sample, match again using individual-level variables. Hopefully at this point I would have enough schools and faculty within schools for a two-level HLM, using covariates at both levels to handle any remaining bias.

Any thoughts on this? Have you come across any applications where someone tries to match at two levels rather than one? Or am I missing something and overthinking this?
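To make the proposal concrete, here is a minimal sketch of the two matching stages in Python. It is not Porter's actual procedure. The function names, the array layout, and the choice of nearest-neighbor matching with replacement on standardized covariates are all assumptions made for illustration. A real analysis would more likely match on estimated propensity scores, check balance after each stage, and then fit the multilevel model.

```python
import numpy as np

def nearest_neighbor_match(treated_x, control_x, k=1):
    """For each treated row, return the positions of the k nearest control rows.

    Both inputs are 2-D arrays of shape (n_units, n_covariates). Distance is
    Euclidean on covariates scaled by their pooled standard deviation, and
    controls may be reused (matching with replacement). Illustrative only.
    """
    treated_x = np.asarray(treated_x, dtype=float)
    control_x = np.asarray(control_x, dtype=float)
    scale = np.vstack([treated_x, control_x]).std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant covariates
    diffs = treated_x[:, None, :] / scale - control_x[None, :, :] / scale
    dists = np.linalg.norm(diffs, axis=2)
    return np.argsort(dists, axis=1)[:, :k]

def two_stage_match(school_x, school_treated, faculty_x, faculty_school, k_schools=2):
    """Hypothetical two-stage matching: schools first, then faculty.

    school_x:        (n_schools, p) school-level covariates
    school_treated:  (n_schools,) 1 if the school has a tenure system
    faculty_x:       (n_faculty, q) individual-level covariates
    faculty_school:  (n_faculty,) index of each faculty member's school
    """
    school_x = np.asarray(school_x, dtype=float)
    faculty_x = np.asarray(faculty_x, dtype=float)
    faculty_school = np.asarray(faculty_school)
    school_treated = np.asarray(school_treated, dtype=bool)

    t_ids = np.flatnonzero(school_treated)
    c_ids = np.flatnonzero(~school_treated)

    # Stage 1: match several control schools to each treated school.
    nn = nearest_neighbor_match(school_x[t_ids], school_x[c_ids], k=k_schools)
    kept_controls = np.unique(c_ids[nn])

    # Stage 2: using only faculty in the retained schools, match each
    # faculty member at a treated school to the closest control faculty member.
    fac_t = np.flatnonzero(np.isin(faculty_school, t_ids))
    fac_c = np.flatnonzero(np.isin(faculty_school, kept_controls))
    nn_f = nearest_neighbor_match(faculty_x[fac_t], faculty_x[fac_c], k=1)
    pairs = np.column_stack([fac_t, fac_c[nn_f[:, 0]]])
    return kept_controls, pairs
```

As a simplification, stage 2 pools faculty across all retained control schools rather than matching only within each treated school's own matched set. Restricting to the matched set is the stricter variant and needs more faculty per school.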

I really don’t know. You could start with Rubin’s question–“What would you do if you had all the data?”–to think about what comparisons you’d like to make, if sample size were not an issue. Also, if you do end up fitting your model on a relatively small subset of your data, you could evaluate some aspects of your inferences on your larger data, to see if your fitted model gives reasonable predictions.

3 thoughts on “Matching at two levels”

  1. For Rubin's question–"What would you do if you _could randomize_"

    you likely would refrain from borrowing information between different randomized units (strata) on nuisance parameters like control rates (aka Simpson's paradox)

    some people were working on this (where the hierarchical structure was meta-analysis of observational studies) but a quick web search did not turn anything up yet.


  2. In our paper, we matched on multiple levels. In doing so, we just loaded the group (in our case, community) level variables into the matching algorithm along with the individual level variables, and matched all at once. We had to do it this way because of the rather small sample size; it is justified so long as you think there are no important unmeasured group level confounders. That is an identifying assumption. In the end, the optimal matching solution pretty much ended up looking like what would have happened had we matched on groups first and then individuals across matched groups (see the map of post-matched in the Figures, which shows entire communities being knocked out). I think there were a few exceptions though—that is, I think there were a few cases where a treated unit was matched to two control units, and those two control units were from different groups, although the groups must have shared similar group-level characteristics. Our approach allowed this to happen. It is justified if you think that bias reduction is not necessarily maximized by balancing group covariates prior to matching any individual covariates (that is, sometimes individual level imbalance may matter more).

    The multi-step procedure Steve is suggesting makes perfect sense to me. The only potential problem I see is that the data within matched groups may be too sparse for second level matching. If that should be the case, then you could do something like what we did. Or, you could use propensity weighting to balance as best you can the marginal distributions of individual characteristics across matched groups. This disregards interactions, so if you think some interactions are important, then you could include interaction terms as covariates. I think there should be some efficiency gains from this over a second round of matching. To see why, consider that matching IS a weighting scheme, but one that assigns only 0 and 1 weights. There is a neat package in R called TWANG that could automate this.
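The remark that matching is a weighting scheme with only 0 and 1 weights can be illustrated with a short Python sketch. It assumes the propensity scores have already been estimated by some model, and it does not reproduce the TWANG interface. The function names and the ATT-style weight formula (1 for treated units, p/(1-p) for controls) are choices made here for illustration.

```python
import numpy as np

def att_weights(propensity, treated):
    """Propensity weights targeting the treated group (an assumed choice):
    treated units get weight 1, controls get p / (1 - p)."""
    p = np.asarray(propensity, dtype=float)
    t = np.asarray(treated, dtype=bool)
    return np.where(t, 1.0, p / (1.0 - p))

def matching_weights(treated, matched_control_idx):
    """The weights implied by matching: 1 for treated units and for
    controls that were matched, 0 for every discarded control."""
    t = np.asarray(treated, dtype=bool)
    w = t.astype(float)
    w[np.asarray(matched_control_idx, dtype=int)] = 1.0
    return w

def weighted_mean_diff(x, treated, w):
    """Weighted treated-minus-control difference in the mean of one
    covariate, as a simple balance check on its marginal distribution."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(treated, dtype=bool)
    w = np.asarray(w, dtype=float)
    return np.average(x[t], weights=w[t]) - np.average(x[~t], weights=w[~t])
```

Calling `weighted_mean_diff` with either set of weights gives a balance diagnostic for one covariate. Under weighting, a poorly matched control is down-weighted rather than dropped, which is where the suggested efficiency gain would come from.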
