How to set up a voting system for a Hall of Fame?

Micah Cohen writes:

Our company is establishing a Hall of Fame, and I am on a committee to help set it up, which involves figuring out the voting system for inducting a candidate. We have modeled it somewhat on the voting for the Baseball Hall of Fame.

The Baseball Hall of Fame details, in short:
· Up to 40 candidates
· 600 voters
· Each elector has up to 10 votes
· A candidate has to have 75% of the votes to get inducted

Our current projected model:
· Up to 20 candidates
· About 100-120 voters (let's say 100)
· Each elector has up to 3 votes
· A candidate has to have 75% of the votes to get inducted

The last 2 points are the variables in question that we need help with: How many votes should each elector get and what percentage should the candidate have to have to get inducted?

We don’t want to make it too easy and don’t want to make it too hard. Our thought is to have 2-5 people inducted per year but we want to avoid having 0, no one, inducted.

We will assume that each candidate has an equal chance of being voted in so it’s not weighted at all.

My initial thought was to increase the number of votes that each elector can have to the same ratio as the baseball system (10 votes for up to 40 candidates) so we could increase to up to 5 votes for the 20 candidates. Or should we keep it at 3 votes and decrease the percentage that the candidate would need? The other factor would be if there are fewer than 20 candidates, let's say 15: how would this all change?

With all that being said, is there a way to find out what the right number of votes each elector should have and what the percentage should be?

Is there a way to visually see this in a graph where we could plug in the variables to see how it would change the probability of no one being elected or how many would be elected in a year?

My reply: I have no specific ideas here but I have four general suggestions:

1. To get a sense of what numbers would work for you, I recommend simulating fake data and trying out various ideas. Your results will only be as good as your fake data; still, I think this can give a lot of insight.

2. No need for a fixed rule, right? I’d recommend starting with a tough threshold for getting into the hall of fame, and then if you’re not getting enough people inducted each year, you can loosen your rules. This only works in one direction: If your rules are too loose, you can’t retroactively tighten them and exclude people you’ve already honored. But if you start out too tight, it shouldn’t be hard to loosen up.

3. This still leaves the question of how to set the thresholds for the very first year: if you set the bar too high at first, you could well have zero people inducted. One way you could handle this is to put the percentage threshold on a sliding scale so that if you have just 0 or 1 people passing, you lower the threshold until at least 2 people get in.

4. Finally, think about the motivations of voters, strategic voting, etc. The way the rules are set up can affect how people will vote.
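To make the sliding scale in suggestion 3 concrete, here is a minimal R sketch. The function name `induct_sliding`, the one-percentage-point step, and the 50% floor are all assumptions made for illustration, and the vote counts are made up.

```r
# Hypothetical helper: lower the threshold one percentage point at a time
# until at least `min_inductees` candidates pass (or the floor is reached).
# `votes` holds the number of ballots each candidate appears on.
induct_sliding <- function(votes, n_voters, start_pct = 75,
                           min_inductees = 2, floor_pct = 50) {
  # Integer arithmetic avoids floating-point surprises at the boundary
  passes <- function(pct) votes * 100 >= pct * n_voters
  pct <- start_pct
  while (sum(passes(pct)) < min_inductees && pct > floor_pct) {
    pct <- pct - 1
  }
  list(threshold_pct = pct, inducted = which(passes(pct)))
}

# Made-up ballot counts for six candidates and 100 voters
votes <- c(81, 70, 52, 40, 22, 10)
res <- induct_sliding(votes, n_voters = 100)
print(res)
```

With these made-up counts only one candidate clears 75%, so the threshold slides down until the second candidate gets in. The floor keeps a very weak year from electing someone with little support.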

1. My general opinion on voting is that all voting should be score voting. Allow each voter to score every candidate from 0 to 10, say. Add up the scores for each candidate. Then, as Andrew says, start with a threshold high enough that you induct a reasonable number, and drop the threshold if it isn’t working.

You also should scale the threshold by number of voters if that isn’t a constant.
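A rough R sketch of this score-voting idea, with the threshold expressed as a mean score so that it scales with the number of voters. The function name `score_vote`, the 7.5 cutoff, and the tiny score matrix are assumptions for illustration only.

```r
# Hypothetical helper: rows are voters, columns are candidates, entries are 0-10.
# A candidate is inducted when the summed score reaches mean_threshold * n_voters.
score_vote <- function(scores, mean_threshold = 7.5) {
  totals <- colSums(scores)
  n_voters <- nrow(scores)
  list(totals = totals, inducted = which(totals >= mean_threshold * n_voters))
}

# Made-up scores from four voters for three candidates
scores <- rbind(c(10, 6, 2),
                c( 9, 7, 0),
                c( 8, 8, 3),
                c(10, 5, 1))
res <- score_vote(scores)
print(res)
```

Because the cutoff is a mean rather than a raw total, the same rule applies whether 100 or 120 people vote.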

2. Jable says:

Maybe a dumb question but if the goal is to have 2-5 inductees each year, why not simply take the first 2-5 ranked by number of votes?

• I think the biggest concern is to maintain some reasonable standard for induction, I mean some years you get Babe Ruth, and other years you get Aunt Tillie’s over 60 league softball coach… You can’t put the latter into the Hall of Fame just because he’s the highest ranked this year… Induction should have some kind of absolute time invariant scale, while the nomination process should feed a reasonable set of candidates.

The real question is does the potential nominee pool contain enough candidates to go for more than a few years without a plummeting standard. Baseball is at least a highly competitive sport with large player pool, a long history and relatively high turnover.

3. Christian Hennig says:

“We will assume that each candidate has an equal chance of being voted in so it’s not weighted at all.”
I’m not sure whether I understand this sentence. Does this mean that when making calculations or doing experiments such as simulating fake data, you want to assume that any voter votes for any candidate with the same probability? I’d imagine that this is almost certainly grossly unrealistic, and will have a big influence on the result. The idea is that some people deserve to be in the Hall of Fame because of something they did that is known to the voters, right? So this whole idea implies that there are probably some candidates who have a much higher probability of being voted for than others.

Now if this is so, unfortunately much depends on how unequal the voting chances actually are. All equal is one extreme, all voters agree is another. I guess that by and large, the more agreement there is (i.e., the higher the chance that the best candidates are actually voted for), the larger the expected number of candidates that get into the Hall of Fame (because this makes it more likely that they reach 75%).

One implication is that if you want 2-5 candidates to go through, 3 is a pretty low number of votes per voter because it’s pretty hard to get even 3 candidates through (assuming that the number of candidates is quite a bit bigger), as 75% of the voters need to agree. If all agree you’ll have 3 candidates and I don’t see how the expected number of candidates winning can be more than three under any set of voting probabilities, although I haven’t proved or computed this so may be wrong. In reality I’d expect it to be substantially lower than 3. For equal voting probabilities it should be near impossible to get any candidate through (assuming 3 votes per voter, 100 voters, 20 candidates). Every candidate will appear on an average of at most 3/20 of the ballots, i.e. 15% (lower depending on how many votes voters use on average); 75% is *very* far away from this.

75% strikes me as pretty high value of agreement to ask for in any case; with 10 votes out of 40 candidates it may work if agreement is fairly high, i.e., a low number of candidates have a substantially higher probability than the rest (don’t ask me about precise numbers – one can probably compute these but simulating fake data is probably easier to get at them).
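The equal-probability case can be checked directly rather than simulated. If every voter uses all of their votes and picks candidates uniformly at random (the assumption from the original question, not a realistic one), the number of ballots a given candidate appears on is binomial. The function name `p_reach` below is made up for this sketch.

```r
# Probability that one given candidate appears on at least `share` of ballots
# when each voter picks n_votes distinct candidates uniformly at random.
# Each ballot then includes the candidate with probability n_votes / n_candidates.
p_reach <- function(n_voters, n_candidates, n_votes, share = 0.75) {
  needed <- ceiling(share * n_voters)
  pbinom(needed - 1, size = n_voters, prob = n_votes / n_candidates,
         lower.tail = FALSE)
}

p_company  <- p_reach(n_voters = 100, n_candidates = 20, n_votes = 3)
p_baseball <- p_reach(n_voters = 600, n_candidates = 40, n_votes = 10)

# Multiplying by the number of candidates gives an upper bound on the
# probability that anyone at all gets in
print(c(company = 20 * p_company, baseball = 40 * p_baseball))
```

Both probabilities come out vanishingly small, so under equal voting probabilities nobody is ever inducted in either system. Any useful simulation has to build in some agreement among voters.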

Anyway, the core of this message is that it is key to the problem how diverse in reality the opinions of the voters are. “All candidates are equal” is extremely pessimistic but chances are very high agreement among voters on 2-5 candidates will be rare as well.

One way of approaching this is to set a level of agreement (i.e., voting probability for the best 5 candidates) under which you’d say, OK, if they are so much better than the others, they should all qualify, and then to figure out using simulations how to set the rules so that in this situation the probability that all 5 indeed go through is high enough. (If you set the bar high, i.e., the best 5 have to be much better than the others, chances are that in reality this will mostly deliver you fewer than 5.)

• Jonathan (another one) says:

I think this is exactly right. I’d set up a prior probability for hall-worthiness in which, say, there are five excellent candidates every year and another 5 marginal candidates and then a long tail of non-candidates with much lower probabilities. Then, in the fake data, set up preferences of the voters which probabilistically mirror these assessments. So, for example, for the top 5 candidates, there ought to be about an 80 percent chance for each voter that they vote for this candidate, while for the next tier it should be maybe 60 percent. Then it is possible, by chance, for a top-tier candidate to fail or for a second-tier candidate to succeed, but there is almost no chance for a third-tier candidate. Simulate voting rules until you get the requisite 2-3 winners per year.
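A sketch of this tiered fake-data simulation in R follows. Several pieces are assumptions added for illustration: the 10 percent approval rate for the long tail, the rule that a voter who likes more candidates than they have votes keeps a random subset, and the function name `simulate_year`.

```r
# One simulated election; returns the number of candidates inducted.
# p_approve[k] is the chance that a voter considers candidate k worthy.
simulate_year <- function(p_approve, n_voters = 100, n_votes = 3,
                          threshold = 0.75) {
  counts <- integer(length(p_approve))
  for (v in seq_len(n_voters)) {
    approved <- which(runif(length(p_approve)) < p_approve)
    # Assumption: with too many favorites, the voter keeps a random subset
    if (length(approved) > n_votes) approved <- sample(approved, n_votes)
    counts[approved] <- counts[approved] + 1L
  }
  sum(counts >= threshold * n_voters)
}

set.seed(1)
# 5 excellent, 5 marginal, 10 long-tail candidates (the 0.10 is a guess)
p_approve <- c(rep(0.80, 5), rep(0.60, 5), rep(0.10, 10))

# Average number inducted per year for different numbers of votes per elector
votes_per_elector <- c(3, 5, 10, 20)
by_votes <- sapply(votes_per_elector, function(k) {
  mean(replicate(200, simulate_year(p_approve, n_votes = k)))
})
print(data.frame(votes_per_elector, mean_inducted = by_votes))
```

The cap on votes per elector matters a great deal here: a voter who likes eight candidates but has only three votes dilutes the support for each of them, so even an 80 percent favorite can fall well short of 75% of ballots.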

4. zbicyclist says:

Baseball: must appear on 75% of ballots, so this is 450 of 600, i.e. 3/40 of the 6,000 votes that can be cast in total.

Company: must appear on 75% of ballots, so this is 75 of 100, i.e. 1/4 of the 300 votes that can be cast in total.

That second task is much more daunting.

5. markus says:

1) If having no one inducted is unacceptable, the logical consequence is that it’s not actually a hall of fame, but a best of year award. Nothing wrong with that. However, it then makes sense to be explicit about this and set up the system so the person with the most votes gets in regardless and allow an additional, variable number of entrants.

2) The BBWAA rules allow something like 13 entrants (10/.75) with perfect coordination. In a year with few excellent candidates, one would expect up to 10 entrants who everyone agrees on (1939, 1945, 1946, 2006). At the company, with 3 votes max and 75% needed for induction, the maximum number of entrants with perfect coordination and everyone using all three votes is 4.
Replace whoever suggested that a system like the one the company is considering could produce 5 entrants on the team that sets up the voting system.
(FWIW, the BBWAA rules have produced between 1 and 18 entrants per year.)
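The upper bound under perfect coordination is simple arithmetic: the total number of votes available divided by the number of votes one entrant needs. A small R sketch, with the function name `max_entrants` made up for illustration:

```r
# Largest number of candidates that can all reach the threshold when every
# voter uses all votes and voters coordinate perfectly.
max_entrants <- function(n_voters, n_votes, n_candidates, threshold = 0.75) {
  needed <- ceiling(threshold * n_voters)      # ballots one entrant needs
  total  <- n_votes * n_voters                 # votes available overall
  min(n_candidates, total %/% needed)
}

baseball <- max_entrants(n_voters = 600, n_votes = 10, n_candidates = 40)
company3 <- max_entrants(n_voters = 100, n_votes = 3,  n_candidates = 20)
company5 <- max_entrants(n_voters = 100, n_votes = 5,  n_candidates = 20)
print(c(baseball = baseball, company3 = company3, company5 = company5))
```

This is only a ceiling. With realistic disagreement among voters the actual number of entrants will sit well below it.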

3) Crucial unknowns whose importance you might learn from simulating data: How much agreement is there among your voters? Is the marginal voter (i.e. the 20 in the 100-120) more likely to vote with everyone else or are they nonconformist? In turn, does increased turnout increase or decrease cohesion of votes? What about the distribution of ‘electability’ among candidates: Is it a mix of the obviously worthy and the potentially worthy, or is everyone just potentially worthy? Was there/will there be time to inform each voter sufficiently, and will there be time to reach an unofficial consensus before the vote (e.g. the BBWAA have 5 years to discuss who is worthy to be a candidate before the vote actually happens)?

Personally, I’d set up the rules so that whoever gets most votes gets in anyway and also establish a cutoff for additional entrants based on an informal sample vote or the first year’s results. If possible, I’d smooth out the results by keeping people eligible for multiple years, so that excellent candidates aren’t left out in a superb year and a mediocre year (of new candidates) is not a problem because there are strong candidates from the previous years left over.
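The first part of that rule (top vote-getter always gets in, plus anyone above a cutoff) might look like this in R. The function name `induct_top_plus_cutoff`, the 60% cutoff, and the tie-breaking behavior are assumptions for illustration; the multi-year eligibility part is not modeled.

```r
# Hypothetical rule: the top vote-getter is always inducted, and anyone else
# appearing on at least cutoff_pct percent of ballots is inducted as well.
induct_top_plus_cutoff <- function(votes, n_voters, cutoff_pct = 60) {
  top  <- which.max(votes)                     # first candidate wins ties (assumed)
  over <- which(votes * 100 >= cutoff_pct * n_voters)
  sort(union(top, over))
}

strong_year <- induct_top_plus_cutoff(c(81, 70, 52), n_voters = 100)
weak_year   <- induct_top_plus_cutoff(c(40, 55, 30), n_voters = 100)
print(list(strong_year = strong_year, weak_year = weak_year))
```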

6. Alex says:

If I mocked up the voting system properly, this R code suggests that it’s extremely unlikely that anyone would ever appear on 75% of ballots. Looks like it tops out at the top vote-getter appearing on <30% of ballots.

n_voters = 100 #number of electors
n_candidates = 20 #number of candidates on the ballot
n_votes = 3 #number of votes each elector casts
n_reps = 100 #number of fake votes you want to hold

results = matrix(NA, ncol=n_votes, nrow=n_reps) #will hold the voting results

for (i in 1:n_reps) { #cycle through number of desired vote universes
  one_universe = matrix(0, ncol=n_votes, nrow=n_voters) #one possible result of voting, voters in rows and votes in columns
  for (j in 1:n_voters) { #cycle through each voter
    one_universe[j,] = sample(1:n_candidates, n_votes) #pick n_votes distinct candidates at random; becomes someone's votes
  }

  # count the votes: what proportion of ballots do the top vote-getters land on?
  vote_count = table(one_universe) #sums up votes for each vote recipient (i.e. # 1 through n_candidates)
  top_three = tail(sort(vote_count), n_votes) #get the n_votes (here three) largest vote counts
  #dividing by the number of voters turns counts into the proportion of ballots the recipient appeared on
  results[i,] = top_three / n_voters
}

# the results matrix holds the top vote proportions for each n_rep in a row, with the proportions
# being sorted into columns in reverse order (3rd, 2nd, 1st).
max(results) #largest share of ballots any candidate reached across all fake votes

7. Epiphyte says:

You’re essentially using voting to rank the candidates. How differently would they be ranked if spending was used instead? They would be ranked…

A. worse
B. the same
C. better

If your guess is A or B… then why should money ever be used to measure/communicate importance? My guess is that the correct answer is C. The money that was raised could either be given to your company or to a charity. The more money that somebody was willing to contribute, the more influence that they’d have on the rankings.

Personally I never buy makeup, so I have absolutely no influence on how the gazillion makeup products are ranked. Is this a problem? Should everybody, regardless of interest or preference, have equal influence on how makeup products are ranked? Would equal influence truly improve the rankings?

Maybe it would be prudent to first give all your employees the opportunity to use voting and spending to rank books. If Harry Potter wins with voting, while Adam Smith wins with spending, then you’ll be able to make a better informed decision whether to use voting or spending to rank candidates.