By popular demand: data on presidential nominees’ names

In response to a request here, Ubs sends this data file and writes,

In the comments someone wished there were data to play with.

I’ve typed some up. No guarantees it’s perfect. Obviously, I didn’t intentionally do anything incorrect, but I didn’t spend any time proofreading it either.

Rules for inclusion

Presidential candidates:
– Listed only if win at least one full state of electoral votes
– (Means 1789, 1792, 1820 treated as unopposed)
– Note that lots of stray 3rd candidates qualify in the early years (like when a state legislature decides to give its EV to some non-candidate). For most uses it might make sense to exclude them, at the cost of losing the few legit 3rd party candidates (eg, 1912, 1968). Some later 3rd party candidates don’t even show because they get popular vote but no EV (eg, Henry Ross Perot).

Vice presidential candidates:
– Listed only if an official or de facto running-mate of qualified presidential candidate. (Sometimes requires judgment call in early years.)
– Note that it’s possible for a non-qualifying VP candidate to get more EV than a qualifying one (eg, 1832).
– If presidential candidate has multiple running mates (eg, Bryan one year, I forget which), only list the running mate(s) who win(s) at least one full state of EV.

rank
– Ranking is by # of electoral votes (eg, 1 is the 1st most EV, etc)
– Exception: winner always gets 1, even if fewer EV (ie, JQ Adams in 1824)
– Note that in complicated races (eg, 1824, 1832) the P order and V order might not match, so it doesn’t follow that V3 is running mate of P3, for example.
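The ranking rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not how the file was actually built: the function name `rank_candidates` and the dictionary layout are assumptions, and ties (which the notes don't address) simply keep their input order. The 1824 electoral vote totals are used as the test case.

```python
def rank_candidates(electoral_votes, winner):
    """Return {name: rank}. The winner always gets rank 1; everyone
    else is ranked by electoral votes, highest first.

    Hypothetical helper: the tie-breaking behaviour (input order) is
    an assumption, since the notes don't say how ties are handled.
    """
    others = sorted(
        (name for name in electoral_votes if name != winner),
        key=lambda name: electoral_votes[name],
        reverse=True,
    )
    ranks = {winner: 1}
    for position, name in enumerate(others, start=2):
        ranks[name] = position
    return ranks


# 1824: Jackson had the most EV, but Adams won in the House.
ev_1824 = {"Jackson": 99, "Adams": 84, "Crawford": 41, "Clay": 37}
ranks_1824 = rank_candidates(ev_1824, winner="Adams")
print(ranks_1824)
```

Ranking strictly by EV would put Jackson first here; the exception is what moves Adams to 1 and Jackson to 2.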

party
– anti-federalists are labeled Dem-Rep even before the party officially so named. (That party is somewhat meaningless for the period around 1820-24 when essentially there’s just the one party.)
– National Republicans labeled Whig (1828-1832)
– some candidates received support from multiple parties. Picked the bigger party, which can be a judgment call
– Lincoln/Johnson 1864 listed as Rep and Dem instead of Natl Union

The alternate first name field is for middle name guys only. Did not do short/nicknames like Al for Albert, Jimmy for James.
– possible I missed some middle names, and I’m not sure about Cabot Lodge.

State
– is how they were officially designated for constitutional purposes (in some cases not an obvious match for true home)
– for some VP candidates I’ve deduced but not confirmed; a couple not determined at all, entered as ??

last ran as
– If you test this for zero that should give unique individuals counted only once, right?
– has no way of indicating if a V1 succeeded to presidency and thus was incumbent president (or even a 0, in case of Gerald Ford). Other incumbencies can be deduced, I think.
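To make the "test this for zero" idea concrete, here is a minimal Python sketch. The field names (`year`, `name`, `last_ran_as`) and the codes in the sample rows are assumptions based on the notes above, not the actual layout of the data file: the field is taken to hold 0 on a person's first appearance and something like P1 or V1 otherwise.

```python
# Illustrative rows only; the field names and codes are assumed from
# the notes, not copied from the data file.
rows = [
    {"year": 1976, "name": "Gerald Ford", "last_ran_as": "0"},
    {"year": 1992, "name": "Bill Clinton", "last_ran_as": "0"},
    {"year": 1992, "name": "Al Gore", "last_ran_as": "0"},
    {"year": 1996, "name": "Bill Clinton", "last_ran_as": "P1"},
    {"year": 2000, "name": "Al Gore", "last_ran_as": "V1"},
]


def first_appearances(rows):
    """Keep only rows where last_ran_as is zero, i.e. the person has
    not appeared on a ticket before. Converting to str first means the
    test works whether the field was read as text or as a number."""
    return [row for row in rows if str(row["last_ran_as"]).strip() == "0"]


unique_people = first_appearances(rows)
names = [row["name"] for row in unique_people]
print(names)
```

If the field behaves as assumed, each individual shows up exactly once in the filtered rows, at the year of their first appearance.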

4 thoughts on “By popular demand: data on presidential nominees’ names”

  1. I thought this might be a cool data set to post on Many Eyes – IBM's collaborative data visualization site. I'm kind of intrigued by the idea of wiki-style analysis and viz, but have never seen a really compelling example of it working. Nonetheless, I uploaded the data and posted two simple plots: one is a tag cloud of first names, the other is a bubble chart with election outcome conditioned on name. (Georges do well, but Charleses don't.)

    These types of visualizations are not very deep but do provide some simple insights.

    By the way, I found Many Eyes to be frustrating…I actually had to create two different data sets to accomplish what I wanted.

    I'll be interested to see if it generates any "social analysis" activity.

  2. Thanks. My first attempt at the data is here, showing the twenty names with at least three appearances, ordered by number of appearances.

    No patterns spring out as yet. Later I'll add colours for party and try to order the names with tied number of appearances by some measure of central tendency, rather than by first appearance as they seem to be here.

  3. Seeing Derek's graph, I agree with Ubs's implicit comment that each person, not each nomination, should count only once.

    Also, we need more Rufuses, already!

    More seriously, if anyone really wanted to analyze this sort of data, I'd suggest dumping in the first names of all the congressmembers and cabinet officers. This shouldn't be too hard, and then we're talking real data.

  4. Take 2: this one counts each candidate only once, in the last year he appears. I tried first year, but the absence of some recent candidates, like Bill Clinton in 1996, looked too weird. Doubtless the absence of some candidates in earlier years will look strange to other people, which is one reason I wish MS Excel pivot tables understood medians.
