# Why it’s not so weird that so many dentists are named Dennis: a story of conditional probability

Ian Ayres refers to research by Brett Pelham, Matthew Mirenberg, and John Jones finding that people are disproportionately likely to have names that are related to their occupations, places of birth, etc. Pelham et al. write:

Taken together, the names Jerry and Walter have an average frequency of 0.416%, compared with a frequency of 0.415% for the name Dennis. Thus, if people named Dennis are more likely than people named Jerry or Walter to work as dentists, this would suggest that people named Dennis do, in fact, gravitate toward dentistry. A nationwide search focusing on each of these specific first names revealed 482 dentists named Dennis, 257 dentists named Walter, and 270 dentists named Jerry.

In his blog, Ayres referred to this finding but wrote:

To be honest, I [Ayres] am not fully persuaded that either of these results is true.

I think that Ayres is saying this because the effect sounds so large: Even if there really were something going on, could it really explain the difference between 482 and 257, nearly a factor of 2?

Let me repost a simple conditional probability calculation that might put Ayres’s mind at ease:

There were 482 dentists in the United States named Dennis, as compared to only about 260 that would be expected simply from the frequencies of Dennises and dentists in the population. On the other hand, the 222 “extra” Dennis dentists are only a very small fraction of the 620,000 Dennises in the country; this name pattern thus is striking but represents a small total effect. If we assume that 222 of these Dennises are “extra” dentists–choosing the profession just based on their name–that gives 222/620,000 = .036% of Dennises choosing their career using this rule. I can certainly believe that the naming effect could be as high as .036%.
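For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The counts and name frequencies come from the Pelham et al. quote, and the 620,000 figure comes from the repost. The way the chance baseline is formed here (averaging the Jerry and Walter counts and rescaling by the ratio of name frequencies) is an assumption made for illustration, not necessarily how the "about 260" was derived, and the variable names are invented.

```python
# Counts of dentists by first name, from the Pelham et al. quote.
dentists = {"Dennis": 482, "Walter": 257, "Jerry": 270}

# Name frequencies in percent, also from the quote.
freq_dennis = 0.415
freq_jerry_walter_avg = 0.416

# Approximate number of Dennises in the country, from the repost.
n_dennises = 620_000

# ASSUMPTION: chance baseline = mean of the Jerry and Walter counts,
# rescaled by the (nearly identical) name frequencies.
baseline = (dentists["Walter"] + dentists["Jerry"]) / 2
baseline *= freq_dennis / freq_jerry_walter_avg

extra = dentists["Dennis"] - baseline    # "extra" Dennis dentists
relative_excess = extra / baseline       # excess relative to chance
share_of_dennises = extra / n_dennises   # fraction of all Dennises

print(f"baseline dentists named Dennis: {baseline:.0f}")
print(f"extra Dennis dentists:          {extra:.0f}")
print(f"relative excess:                {relative_excess:.0%}")
print(f"share of all Dennises:          {share_of_dennises:.3%}")
```

Under this baseline the relative excess comes out above 80%, while the share of all Dennises is a few hundredths of a percent. Using the rounded baseline of 260 instead shifts the figures only slightly.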

## What percentage of people pick their job based on their name?

And here is my quick calculation that approximately 1% of Americans choose their career based on their first name:

I start with this approximate formula for the proportion of people who choose a career based on their first name:

p1 * first_letter_effect + p2 * first_2_letters_effect + p3 * first_3_letters_effect

Here, p1 is the proportion of careers that begin with the first letter of your name, and the “first letter effect” is the extra proportion of people in a specific career beginning with the same first letter of their name. Similarly, p2 is the proportion of careers that share the first 2 letters of your name, and the “first 2 letters effect” is the extra proportion with that career, and similarly for p3. One could go on to p4 etc., but the idea is that, after p3, the probability of actually sharing the first 4 letters is so low as to contribute essentially nothing to the total.

Now, for some quick estimates: The simplest estimates for p1, p2, p3 are 1/26, 1/26^2, 1/26^3, but that’s not quite right because not all letters are equally likely. Just to make a guess, I’ll say 1/10 for p1, 1/50 for p2, and 1/150 for p3.

What about the “letter effects”? For “Dennis” the effect was estimated to be about 222/(482-222) = .85–that is, about 85% more dentists named Dennis than would be expected by chance alone. But “Dennis” and “dentist” sound unusually alike, so let’s take a conservative value of 50% for the “first-3-letters effect.” The first-letter and first-2-letters effects must be much smaller–I’ll guess them at 5% and 15%, respectively.

In that case, the total effect is

(1/10)*.05 + (1/50)*.15 + (1/150)*.50 = 0.011, or basically a 1% effect.
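The sum is easy to reproduce. Below is a small Python sketch that plugs in the guessed values exactly as they appear in the line above (5% paired with p1, 15% with p2, 50% with p3). For comparison it also evaluates the "simplest" uniform-letter estimates 1/26, 1/26^2, 1/26^3 mentioned earlier. The list names are invented for illustration, and every input is a guess from the text, not data.

```python
# Guessed proportion of careers sharing the first 1, 2, 3 letters of a name.
p = [1 / 10, 1 / 50, 1 / 150]

# Guessed "letter effects": extra proportion choosing a matching career,
# in the same order (first letter, first 2 letters, first 3 letters).
effects = [0.05, 0.15, 0.50]

# Total effect = p1*effect1 + p2*effect2 + p3*effect3.
total = sum(pk * ek for pk, ek in zip(p, effects))
print(f"total effect with guessed p: {total:.4f}")  # 0.0113

# Comparison only: the uniform-letter estimates 1/26, 1/26^2, 1/26^3.
p_uniform = [1 / 26**k for k in (1, 2, 3)]
total_uniform = sum(pk * ek for pk, ek in zip(p_uniform, effects))
print(f"total effect with uniform p: {total_uniform:.4f}")
```

The uniform-letter version gives a noticeably smaller total, so the 1% figure depends heavily on the guesses for p1, p2, and p3.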

## More on the Pelham et al. article

Several comments on Ayres’s blog entry focused on potential problems with the Dennis/Dentist study. But there’s really much more to the Pelham et al. paper than that single example. I recommend you read the entire article. Here’s the abstract:

Because most people possess positive associations about themselves, most people prefer things that are connected to the self (e.g., the letters in one’s name). The authors refer to such preferences as implicit egotism. Ten studies assessed the role of implicit egotism in 2 major life decisions: where people choose to live and what people choose to do for a living. Studies 1-5 showed that people are disproportionately likely to live in places whose names resemble their own first or last names (e.g., people named Louis are disproportionately likely to live in St. Louis). Study 6 extended this finding to birthday number preferences. People were disproportionately likely to live in cities whose names began with their birthday numbers (e.g., Two Harbors, MN). Studies 7-10 suggested that people disproportionately choose careers whose labels resemble their names (e.g., people named Dennis or Denise are overrepresented among dentists). Implicit egotism appears to influence major life decisions. This idea stands in sharp contrast to many models of rational choice and attests to the importance of understanding implicit beliefs.

And there’s lots more good stuff there, including:

Hardware store owners were about 80% more likely to have names beginning with the letter H as compared with R. In contrast, roofers showed the reverse pattern. They were about 70% more likely to have names beginning with R as compared with H.

Expected values dictated that 308.8 of the 45,908 women sampled should have resided in cities named after Saints who happened to share their first names. The actual number of women who showed this name-city matching effect was 445, which is 44% greater than the chance value. On the basis of expected values, 3,476.0 out of 594,305 men should have lived in Saint cities bearing their first names. The actual number of men who did so was 3,956, which is 14% greater than the chance value.
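The percentages in the Saint-city passage follow directly from the quoted counts. Here is a short Python sketch that recomputes them and also expresses the excess as a share of each sample, in the same spirit as the Dennis calculation. The function and variable names are invented for illustration; only the four counts and two sample sizes come from the quote.

```python
def excess_over_chance(observed, expected):
    """Relative excess of the observed count over the chance expectation."""
    return observed / expected - 1


# Figures quoted from Pelham et al.: (observed, expected, sample size).
women = (445, 308.8, 45_908)
men = (3_956, 3_476.0, 594_305)

for label, (observed, expected, sample_size) in (("women", women), ("men", men)):
    relative = excess_over_chance(observed, expected)
    # Excess matches as a fraction of everyone sampled.
    absolute = (observed - expected) / sample_size
    print(f"{label}: {relative:.0%} above chance, "
          f"{absolute:.2%} of the sample")

women_excess = excess_over_chance(women[0], women[1])
men_excess = excess_over_chance(men[0], men[1])
```

The relative excesses match the quoted 44% and 14%, while the excess matches amount to well under 1% of either sample.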

P.S. If you go to the Name Voyager, you find that, not only did the name Dennis peak during the 1940s and 1950s, but the set of names beginning with D peaked around then also. (You’ll be happy to know that E has had the opposite pattern, while F has steadily gone down the tubes.)

## 18 thoughts on “Why it’s not so weird that so many dentists are named Dennis: a story of conditional probability”

1. I don't see how these effects contradict rational choice. If you get a psychological boost from having a same first-lettered occupation, then your "match quality" (what labor economists call occupation-worker specific productivity) for that occupation is high. With higher productivity in that occupation, your wages in it will be higher and thus it's rational to pick that occupation.

2. I am not convinced by the "saints" names study. If a city is named for a saint then the city is likely to have a strong Christian culture. And hence the descendants of the "namers" are likely to be influenced by that. So it's not so surprising that first names that are saints' names are popular in such a city – probably all saints' names are more popular there than in the general population. It would be interesting to see the proportion of Peters in St Paul and vice versa.

Most of the female saint's result is based on Marys in St Mary. However, Mary has been a very popular name in the past. In England, in 1850, roughly 26% of the female population born that year had Mary as a first name (http://www.freebmd.org.uk). I suspect Scotland and Ireland had similar proportions and in Scotland and Ireland it was traditional to pass names down through the generation (http://homepages.rootsweb.ancestry.com/~scottish/ScottishNamingPatterns.html). The Saint result to me is just the tail end of Mary being a very popular Christian first name.

My husband reminded me about an English version of "Dennis the Menace". The American and the English versions debuted on the same date (http://www.toonopedia.com/dennisb.htm). The English Dennis had a dog called Gnasher who was the father of a puppy called Gnipper – very teethy names.

3. I am also not surprised by the saint name result.

I do think that there is some potential bias in the first letter example of the roofers and hardware store owners. So what if the letter H is more common for hardware store owners? That fact by itself is not evidence that people with H-names are drawn to owning hardware stores. What about other letters in the alphabet? The only letter it is compared to is R, but perhaps (and, I imagine, almost surely) some other letter is even further away from its predicted value. How do we explain a bunch of hardware store owners with A-letter names, for instance? This result seems like a classic case of finding some evidence in support of a hypothesis and then not looking further.

4. Push: I never said anything one way or another here about rational choice, a concept that seems pretty much irrelevant to this particular example.

Megan, Jeff: I see your points . . . but that's why Pelham et al. did 10 different experiments. I recommend reading their journal article (linked to above) before calling this "a classic case of finding some evidence in support of a hypothesis and then not looking further." They did a lot.

5. This is very interesting. I remember, as a kid, that the parish bore the same name as my mother, though I think this was all pure coincidence.

For the case of occupation, however, I doubt this would apply easily to languages other than English. In French, the first syllable of Dentist and Denis sound completely different, and I can't remember ever having a dentist named Denis. I do know that a fair amount of family names in French have their origin in occupation (Boucher were butchers, Marchand were salesmen, etc), though I doubt this really applies anymore.

On a less serious note, what would be the perfect name for a statistician?

6. Andrew, you didn't say anything about rational choice, but your quoted source did.

"Implicit egotism appears to influence major life decisions. This idea stands in sharp contrast to many models of rational choice and attests to the importance of understanding implicit beliefs."

I like your little story about the 1% effect. This kind of Fermi problem is common in physics and engineering (back of the envelope type calculations). But here's a question for you, how would you estimate the standard error of the estimate? I find it's much easier to think up reasons to believe some average than it is to think up reasons to believe that your estimate has a certain accuracy.

I think it's an interesting problem, because in Engineering we often have some values that are generated essentially by this kind of process (for example the design floor load in a residence is about 25 lb/ft^2 if I remember correctly). But there's no easy way to convince yourself of some particular standard error to attach to this kind of number, even though it could be extremely useful.

7. I thought the reverse causation for the name/hometown effect was more plausible and see that they consider it is possible too:

Although it is possible that people gravitate toward places that remind them of themselves, it is also possible that the places in which people live
serve as primes that influence the names that parents give to their children. To new parents living in Georgia, for example, baby
names such as George or Georgia may simply be more accessible than more attractive alternatives such as Brett, John, or Matthew.

Though they address this: "We attempted to rule out this alternate explanation by focusing on surnames rather than first name," and find a broadly similar pattern.

Of course, in the career case, an argument that name choice by parents was based on a possible future career for the child would be just plain weird.

8. There's a link to the mere exposure literature as well (Zajonc, 1968). To oversimplify, that literature suggests that the more you are exposed to something, the more you like it.

People named Dennis are continually exposed to the sound "Den", so it's natural that they'd like it a bit more. And, as noted above, it only needs to be a "bit" in one direction to show a large percentage increase in another direction.

We should ask Paul Murky of Murky Research or statistician Margie Noverra what they think.

9. How about this phrasing: "1% of people would have ended up in different jobs if they were given different names."

One way to think about this is to ask how many other factors affecting career choice of equal importance are out there? What share of them have we identified? How do they sum up? Can they exceed one?

10. Since the counts of dentists are from a professional directory, we are not guaranteed these are birth names, they are likely to be professional names.
Perhaps the surfeit of Dennis the Dentist comes from many a Tom, Dick and Harry, and Jerry and Walter and William and Andrew, with a middle name of Dennis, finding that people remembered his name wrong rather reliably if they heard his middle name just once.
A fictitious random example – they'd mis-remember Dentist Dennis A Gelman in place of the correct Andrew D(ennis) Gelman DMD, so he bought stationery and signage and diner place-mat sponsorships to match, but Mom still calls him "my son Andy the Dentist", and the unpublished home number is listed under Andrew D or A.D., which the patients won't ask for, so they'll call the service, but Uncle Fred can still find it.

All I can say is I would NOT want to be in Accounts Receivable. Although I did give serious consideration to buying a house in Billerica, MA, but if they have to offer you flood insurance, walk away …

William D "Bill" Ricker in Boston
* D /not/ for dennis nor dentist
** with a Bahstan accent, 'Billerica' and
'Bill Ricker' are mispronounced identicahly.

11. The base rate given for the names Dennis, Jerry & Walter doesn't pan out when you review the NPI Registry file maintained by the Centers for Medicare & Medicaid Services (CMS). The NPI Registry lists every health care provider in the US who bills for services. The frequency distribution of these three names in the NPI file is:

| Name | Count | Share |
|---|---|---|
| Dennis | 4,442 | 47.42% |
| Jerry | 2,423 | 25.87% |
| Walter | 2,502 | 26.71% |

If you run the same frequency distribution where the primary taxonomy is either 122300000X (generic code for dentist) or 1223G0001X (general practice dentist), here is what you get:

| Name | Count | Share |
|---|---|---|
| Dennis | 556 | 48.06% |
| Jerry | 291 | 25.15% |
| Walter | 310 | 26.79% |

So there is a tiny difference, but not impressively so. I declare the Dennis dentist myth busted!

12. The NPI Registry lists every health care provider in the US who bills for services. The frequency distribution of these three names in the NPI file is:
Dennis 4,442 47.42%

If you run the same frequency distribution where the primary taxonomy is either 122300000X (generic code for dentist) or 1223G0001X (general practice dentist), here is what you get:
Dennis 556 48.06%

But isn't that just showing that the Dennises who don't choose to be dentists choose to be doctors?

But seriously…
Dennis is a name from northwestern Europe. People from northwestern Europe tend to have middle-class jobs. Dentistry is a middle-class job. Therefore, Dennis is overrepresented in dentistry compared with the census.

How many Dennards, Denzels, Denelles and Denishas are dentists?
(http://www.babynames.org.uk/african-american-names-dictionary.htm)

13. Good stuff here. Great read.
Now if we could name our sons and daughters according to what careers we would like them to take…

14. > Now if we could name our sons and daughters according to what careers we would like them to take…

Well, I've always liked the name "Rich".

15. I agree with what zbicyclist said, but you also have to consider that the phonetic connection applies only to English. I'd like to see a similar study done in Spanish, German, etc., but that would probably be a little too much work…

