Ten ways to rank the Tokyo Olympics, with 10 different winners, and no one losing

(This post is by Kaiser, not Andrew.)

The Tokyo Olympics ended with the U.S. once again topping the medals table, whether you count golds or all medals. “Boring!” the journalist class moaned. So they found alternative ranking schemes. For example, the BBC elevated tiny San Marino to the top by adjusting for population. These articles inspired me to write this post.

As statisticians, we have all had snarky comments thrown at us, alleging that we will manufacture any story we like out of data. In a moment of self-reflection, I decided to test this accusation. Is it possible to rank any country on top by inventing different metrics?

I start from this official medals table:

China is #2. After adjusting for the number of athletes, China is #1 in winning golds.

ROC is #3. It is #1 in medals after adjusting for the number of athletes. Its female athletes were particularly successful.

Team GB is #4. I elevate them to #1 by counting the proportion of sports they joined from which they won golds.

The host nation, Japan, came in 5th place. It is #1 when counting the proportion of medals won that were golds.

Australia finished 6th. No worries. It is #1 if I look at how much better the Aussie women were at winning golds than their male compatriots.

Italy is #7. No nation has suffered as much from close calls: it had the highest proportion of medals won that were silvers or bronzes.

Germany is #8. It had the most disappointing campaign, with the biggest drop in golds won compared to Rio 2016.

The Netherlands is #9. Its Olympic athletes showed the largest improvement in total medals compared to Rio 2016.

Our next host nation, France, is #10. It’s ranked #1 if I rank countries by how much their male athletes outperformed their female compatriots.

So I completed the challenge! It is indeed possible to anoint 10 different winners using 10 different ranking schemes. No country is left behind.

I even limited myself to the Olympics dataset, and didn’t have to merge in other data like population or GDP. Of course, the more variables we have, the easier it is to accomplish this feat.

For those teaching statistics, I recommend this as an exercise: pick some subset of countries, and ask students to come up with metrics that rank each country #1 within the subset, and write appropriate headlines. This exercise trains skills in exploring data, and generating insights.
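A minimal sketch of how such an exercise might be set up in Python. The three teams and all their numbers are made up for illustration (they are not the Tokyo results), and the names `TABLE`, `METRICS`, `winners_by_metric`, and `teams_never_first` are hypothetical helpers, not part of any library.

```python
# Hypothetical medal counts for three made-up teams (not real Olympic data).
TABLE = {
    "A": {"gold": 10, "silver": 10, "bronze": 10, "athletes": 300},
    "B": {"gold": 8, "silver": 2, "bronze": 2, "athletes": 100},
    "C": {"gold": 9, "silver": 12, "bronze": 15, "athletes": 400},
}


def total(row):
    """All medals won by one team."""
    return row["gold"] + row["silver"] + row["bronze"]


# Each metric maps one team's row to a number; higher is better.
METRICS = {
    "golds": lambda r: r["gold"],
    "total medals": total,
    "golds per athlete": lambda r: r["gold"] / r["athletes"],
    "share of medals that are gold": lambda r: r["gold"] / total(r),
}


def winners_by_metric(table, metrics):
    """Return {metric name: team ranked first under that metric}."""
    return {
        name: max(table, key=lambda team: score(table[team]))
        for name, score in metrics.items()
    }


def teams_never_first(table, metrics):
    """Teams that no metric in `metrics` puts at #1."""
    return set(table) - set(winners_by_metric(table, metrics).values())


if __name__ == "__main__":
    for metric, team in winners_by_metric(TABLE, METRICS).items():
        print(f"{team} is #1 by {metric}")
    print("Still waiting for a headline:", teams_never_first(TABLE, METRICS))
```

Students can keep adding entries to `METRICS` until `teams_never_first` comes back empty, and then write one headline per winner.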

In the end, I plead guilty as a statistician. We indeed have some superpowers.

14 thoughts on “Ten ways to rank the Tokyo Olympics, with 10 different winners, and no one losing”

    • Or, during the Games. Yes, the exercise is much easier if any other data can be merged in but it’s also much easier to come up with nonsensical stories. The tough part is to get the country to #1 (as opposed to top few).

  1. Disappointed to not see medals per capita here. Seems to me to be the most meaningful metric. It’s not perfect, but far better than any of those mentioned here. Of course it depends what you are trying to measure, but medals per capita answers a simple and relatable question: Which country gives its citizens the highest probability of winning an Olympic medal, in the absence of any further information?

    By this metric I feel New Zealand and Australia are the winners. They have many (a statistically significant amount?) medals at a very good per capita rate.

    https://medalspercapita.com/

  2. Let’s make this interesting. For some country i, X_i = (g_i, s_i, b_i, p_i, a_i) where g_i is the country’s gold medal count, s_i is the country’s silver medal count, b_i is the country’s bronze medal count, p_i is the country’s population, and a_i is the country’s athlete count.

    We want some surjective function f from the product space of (X_1 x X_2 … X_I) to an I-tuple of integral rankings between 1 and I.

    f should satisfy the following properties:

    * (g, s, b1, p, a) should beat (g, s, b2, p, a) if b1 > b2. Ditto for g, s. So if everything else is the same, more medals is better

    * (g, s, b, p1, a) should beat (g, s, b, p2, 1) if p1 y, (x, y, b, p, a) should beat (y, x, b, p, a), (g, x, y, p, a) should beat (g, y, x, p, a), (x, s, y, p, a) should beat (y, s, x, p, a). That is, exchanging silvers for golds is always beneficial, exchanging bronzes for silvers is always beneficial, exchanging bronzes for golds is always beneficial

    Let’s see what rankings these criteria eliminate. I’m pretty confident you can apply Arrow’s theorem or Gibbard-Satterthwaite here, but I’m too lazy to figure out how.

    • I fucked up the second bullet point

      * (g, s, b, p1, a) should beat (g, s, b, p2, 1) if p1 y, (x, y, b, p, a) should beat (y, x, b, p, a), (g, x, y, p, a) should beat (g, y, x, p, a), (x, s, y, p, a) should beat (y, s, x, p, a). That is, exchanging silvers for golds is always beneficial, exchanging bronzes for silvers is always beneficial, exchanging bronzes for golds is always beneficial

    • I don’t know why the formatting is always wrong!

      * (g, s, b, p1, a) should beat (g, s, b, p2, 1) if p1 y, then (x, y, b, p, a) should beat (y, x, b, p, a), (g, x, y, p, a) should beat (g, y, x, p, a), and (x, s, y, p, a) should beat (y, s, x, p, a). That is, exchanging silvers for golds is always beneficial, exchanging bronzes for silvers is always beneficial, exchanging bronzes for golds is always beneficial

    • Ah fuck it.

      Fewer athletes is better, gold is better than silver is better than bronze is the second and third bullet point. Don’t know why these comments can’t handle it.

    • I’m sure the problem has been posed; someone else may have links. Based on doing this exercise, I think the feasibility of generating a ranking scheme in which each entity can be ranked first depends on (a) the number of variables available for ranking, (b) the distribution of values within those variables, (c) how one standardizes the variables, and (d) how many entities are being ranked. The game becomes significantly harder if we require that the generated insight is “interesting”, which rules out arbitrary functions connecting variables that cannot be interpreted by humans.

  3. Writing as someone who has been working recently on ranking and selection methods, I have to say that this is truly excellent. Reminds me of the ancient Arlo Guthrie song “Alice’s Restaurant”. “You can get anything you want at Alice’s restaurant…”

  4. Not sure if this is what “somebody” was going for but…

    I think it would be interesting to see a kind of WAR or OPS for the Olympics, that combines the different measurement criteria, maybe with some kind of (subjective) weighting of the different criteria.
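A rough sketch of one weighted composite, checked against the criteria in comment 2 as the commenter summarized them in plain language: more medals is better, fewer athletes is better, and gold beats silver beats bronze. The 3/2/1 weights and the names `score` and `satisfies_criteria` are assumptions made for illustration, and population is left out because the criterion involving it did not survive the comment formatting.

```python
from itertools import product

# Assumed weights; the thread does not specify any.
# Any choice with gold > silver > bronze > 0 passes the check below.
W_GOLD, W_SILVER, W_BRONZE = 3, 2, 1


def score(g, s, b, a):
    """Weighted medals per athlete (population is left out of this sketch)."""
    return (W_GOLD * g + W_SILVER * s + W_BRONZE * b) / a


def satisfies_criteria(f, max_count=4):
    """Brute-force the three criteria over small medal and athlete counts."""
    counts = range(max_count + 1)
    athletes = range(1, max_count + 1)
    for g, s, b in product(counts, repeat=3):
        for a in athletes:
            base = f(g, s, b, a)
            # More medals of any colour is better, all else equal.
            if not (f(g + 1, s, b, a) > base
                    and f(g, s + 1, b, a) > base
                    and f(g, s, b + 1, a) > base):
                return False
            # Fewer athletes is better, provided at least one medal was won.
            if g + s + b > 0 and not base > f(g, s, b, a + 1):
                return False
    for x, y, z in product(counts, repeat=3):
        if x <= y:
            continue
        for a in athletes:
            # Trading a lesser medal for a better one is always beneficial.
            if not (f(x, y, z, a) > f(y, x, z, a)        # silver for gold
                    and f(z, x, y, a) > f(z, y, x, a)    # bronze for silver
                    and f(x, z, y, a) > f(y, z, x, a)):  # bronze for gold
                return False
    return True


if __name__ == "__main__":
    print(satisfies_criteria(score))
```

An unweighted medals-per-athlete score fails this check, because swapping a silver for a gold leaves it unchanged instead of strictly improving it.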
