## Ranking colleges

Christopher Avery, Mark Glickman, Caroline Hoxby, and Andrew Metrick wrote a paper recently ranking colleges and universities based on the “revealed preferences” of the students making decisions about where to attend. They apply, to data on 3000 high-school students, statistical methods that have been developed to evaluate chess players, hospitals, and other things. If a student has been accepted to colleges A and B, and he or she chooses to attend A, this counts as a “win” for A and a “loss” for B.
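To make the win/loss bookkeeping concrete, here is a minimal sketch of how one student's decision could be turned into paired outcomes. The function name and the college labels are hypothetical, and this is only an illustration of the idea described above, not the authors' actual data processing.

```python
def paired_outcomes(admitted, attended):
    """Turn one student's decision into (winner, loser) pairs.

    admitted: list of colleges that accepted the student
    attended: the college the student chose (must be in `admitted`)
    """
    if attended not in admitted:
        raise ValueError("attended college must be one of the admitted colleges")
    # The chosen college "wins" against every other college that admitted the student.
    return [(attended, other) for other in admitted if other != attended]


# A student admitted to A, B, and C who attends A
# gives one win for A over B and one win for A over C.
outcomes = paired_outcomes(["A", "B", "C"], "A")
print(outcomes)
```

Note that no comparison is recorded between B and C: the student's choice says nothing about which of the two rejected colleges he or she would have preferred.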

A multinomial paired-comparisons model

Putting these data together, Avery et al. fit a logistic-regression paired comparisons model, in which each college has its own “desirability” parameter. Under the model, the probability that student i prefers college A to college B depends on the desirability of college A, minus the desirability of college B, plus or minus some additional terms corresponding to interactions between the student and the colleges (for example, a plus term if the college offers financial aid to the student). The model is estimated from all the data at once using multinomial logistic regression (multinomial because students typically choose among more than two colleges), with Bayesian inference used to capture the uncertainty in all the parameters.
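Here is a rough sketch of what such a choice model looks like for a single student, assuming a standard multinomial logit form. The function name, the desirability values, and the size of the financial-aid term are all made up for illustration; none of them come from the paper, and the paper's actual specification may differ.

```python
import math


def choice_probabilities(desirability, student_terms=None):
    """Multinomial-logit choice probabilities over one student's admitted colleges.

    desirability: dict mapping college -> desirability parameter (logit scale)
    student_terms: optional dict mapping college -> student-specific adjustment,
                   e.g. a positive term if that college offers financial aid
    """
    student_terms = student_terms or {}
    utilities = {c: d + student_terms.get(c, 0.0) for c, d in desirability.items()}
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(utilities.values())
    weights = {c: math.exp(u - top) for c, u in utilities.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical desirability values for three colleges that admitted the student.
admitted = {"A": 1.2, "B": 0.4, "C": 0.0}

base = choice_probabilities(admitted)
# Same student, but college B offers aid (an assumed +0.5 on the logit scale).
with_aid = choice_probabilities(admitted, {"B": 0.5})

print(base)
print(with_aid)
```

With only two colleges in the choice set, this reduces to the inverse logit of the difference in desirabilities, which is the paired-comparison case described above.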

Ranking the colleges

The ranking results are fun to read through and ponder. For example, Harvard (#1) has a rating of 2800 and Columbia (#8, hmmm, not bad!) has a rating of 2392, so the probability that a student will prefer Harvard to Columbia (among those students who are admitted by both and decide to attend one or the other) is invlogit((2800-2392)/173) = 91%. Wow–91% seems like a high number. I wonder if that is actually true in the data.
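The arithmetic is easy to check. This is a minimal sketch: the function names are mine, and the scale of 173 is simply taken from the formula above. It is close to 400/ln(10) ≈ 173.7, the usual conversion for Elo-style chess ratings, though I am assuming rather than confirming that this is where the number comes from.

```python
import math


def invlogit(x):
    """Inverse logit (logistic) function: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def prefer_probability(rating_a, rating_b, scale=173.0):
    """Probability that a student admitted to both prefers A to B.

    The default scale of 173 is the divisor used in the text.
    """
    return invlogit((rating_a - rating_b) / scale)


p = prefer_probability(2800, 2392)  # Harvard vs. Columbia
print(f"{p:.0%}")  # about 91%
```

Two colleges with equal ratings come out at exactly 50%, and each additional 173 points of difference adds one unit on the logit scale.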

They also consider models interacting with geography–that is, separate models for students from each of several regions of the country. It’s all pretty cool.

Why it could make a difference

And, as the authors of the paper point out, it’s not just a fun little example. Colleges have motivations to manipulate various statistics such as admissions rates which are used in commercial college rankings (such as the U.S. News ratings), and by comparison the ratings derived in this paper would seem much more difficult to manipulate. This is related to Val Johnson’s motivation for his method for assessing students within a college using a multilevel model for classroom grades, which removes the motivation for grade inflation.

Desirability vs. quality

As Avery et al. put it in their paper, they are ranking the “desirability”, not quality, of each college. I can see how this “desirability” information could be extremely important to the colleges in making admissions decisions and also in figuring out how to make the college appealing to the students who are undecided about whether to go there.

But for a student considering where to go to college, I don’t see the relevance of the “desirability” measure. Wouldn’t it be better to have some measures of how the colleges perform–even a simple customer satisfaction survey after 4 years? This is not meant as a criticism of Avery, Glickman, Hoxby, and Metrick, just more of a concern about what the results will be used for.

Picky, picky, picky

The Avery, Glickman, Hoxby, and Metrick paper is very nice, and I just have a couple of thoughts on how they could make it nicer. Most notably, they could present some of their results graphically, including the table on page 24 (should be a curve, not a list of numbers) and similarly Table 3, and then Table 4 should be presented as a gray-scale plot, not a matrix of percentages. Or maybe curves. Or maybe it could be combined with Table 3. Actually, all the tables should be graphs! Also it would be nice to see some raw data as well as comparisons to the model (for example, calibration and residual plots). They went to some effort to gather the data; they might as well display it and check its fit to the model.

In summary

A cool example, an interesting paper. Seems like a step up from the current college ranking schemes. I’d be interested to see rankings of colleges’ performance as well as desirability. I assume some data are out there?

1. Tian says:

I wonder whether the study accounted for those "early decision" options. If a student applies under Columbia's "early decision" option, he/she is legally bound not to apply to other universities at the same time (or at least this is what I heard. Am I right here?). So some (hidden) decisions on which school to choose have already been made at this stage, and these would not have been captured by the Avery et al. study, which used "admission offer" information.

Such options have been adopted by some top universities trying to "help" the students make up their minds earlier. Potentially, this will affect the "ranking" mentioned here. Are these universities hurting their "Avery et al. Rankings" by excluding some of their definitive first-choicers from the Avery et al. data, which used admission offer pairs?

2. Andrew says:

Hi Tian. Actually, the Avery et al. paper spent a bit of time discussing issues such as early decision and other admissions policies that can be used to manipulate ratings. They talk about the ways in which traditional rating systems can be manipulated and they imply that their rating system is less easily manipulated, but I don't recall any definitive results along these lines.

3. Tonisha says:

Looking for a school that does modeling

4. David Kane says:

Well there is modeling and then there is modeling. Perhaps they teach both at Columbia.