Ethnicity and Population Structure in Personal Naming Networks

Aleks pointed me to this recent article by Pablo Mateos, Paul Longley, and David O’Sullivan on one of my favorite topics.

The authors produced a potentially cool naming network of the city of Auckland, New Zealand. I say “potentially cool” because I have such difficulty reading the article (I speak English, statistics, and a bit of political science and economics, but this one is written in heavy sociologese) that I can’t quite be sure what they’re doing. However, despite my (perhaps unfair) disdain for the particulars of their method, it’s probably good that they’re jumping in with this analysis. Others can take their data (and similar datasets from elsewhere) and do better. Ya gotta start somewhere, and the basic idea (to cluster first names that are associated with the same last names, and to cluster last names that are associated with the same first names) seems good.
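To make that basic idea concrete, here is a toy sketch in Python. It is my own illustration and not the authors' method: the names are invented, the function names `surname_links` and `cluster_surnames` are hypothetical, and I take "cluster" to mean connected components of a graph that links two surnames whenever they share enough forenames, which is about the crudest choice available.

```python
from collections import defaultdict
from itertools import combinations


def surname_links(people):
    """For each pair of surnames, count how many distinct forenames they share."""
    surnames_by_forename = defaultdict(set)
    for forename, surname in people:
        surnames_by_forename[forename].add(surname)

    weights = defaultdict(int)
    for surnames in surnames_by_forename.values():
        # Every pair of surnames carried by this forename gets one more shared name.
        for a, b in combinations(sorted(surnames), 2):
            weights[(a, b)] += 1
    return dict(weights)


def cluster_surnames(people, min_shared=1):
    """Group surnames into connected components of the shared-forename graph.

    Two surnames are linked when they share at least `min_shared` forenames.
    This is an assumed stand-in for a real clustering method.
    """
    parent = {surname: surname for _, surname in people}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), weight in surname_links(people).items():
        if weight >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for surname in parent:
        groups[find(surname)].add(surname)
    return sorted(sorted(group) for group in groups.values())


# Invented toy data: (forename, surname) pairs, not taken from the paper.
people = [
    ("Aroha", "Ngata"), ("Aroha", "Parata"), ("Hemi", "Ngata"), ("Hemi", "Parata"),
    ("Wei", "Chen"), ("Wei", "Zhang"), ("Li", "Chen"), ("Li", "Zhang"),
    ("James", "Smith"), ("James", "Brown"), ("Sarah", "Smith"),
]

clusters = cluster_surnames(people)
strict_clusters = cluster_surnames(people, min_shared=2)
```

Swapping the roles of forename and surname in the input pairs gives the mirror-image clustering of first names by the last names they share. With real data you would presumably want weighted links and a proper community-detection method rather than a hard threshold.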

I have to admit, though, that I was amused by the following line, which, amazingly, led off the paper:

Personal naming practices exist in all human groups and are far from random.

Far from random, huh? Who’d a thunk it?

And also this:

Researchers have automatically classified the 2.5 million users of a mobile phone operator in Belgium into French and Flemish speaking communities based exclusively on the topological network structure of their 800 million phone calls and texts interactions [9]. In doing so they have demonstrated the enduring importance of linguistic and geographical barriers in the age of global mobile communications, and more importantly, that they can automatically be detected using network analysis.

OK, sure, any analysis of 2.5 million users is impressive on computational grounds alone, but . . . it’s hard to be impressed that you can automatically partition phone calls and texts from two different languages, right? It’s fine to do, but it’s hardly news that people like to talk in their own language.

This is partly what goes into the “sociologese” style of writing: a sort of flattening of affect, in which seemingly strange behaviors or findings are presented deadpan, while unremarkable observations can be touted as important.

P.S. [just added] This was just a coincidence (I wrote the above post about a month ago and it was waiting its turn in the queue, whereas the item from yesterday was more recent), but it’s funny that I slammed economists one day and sociologists the next. I’m just full of stereotypes this week, I guess!

13 thoughts on “Ethnicity and Population Structure in Personal Naming Networks”

  1. I don’t dissent from anything you say about sociology – to which I can add countless examples – but these guys, judging from their affiliation, are geographers, which perhaps makes it even worse…

  2. “It’s fine to do, but it’s hardly news that people like to talk in their own language.”

    Agree (!), but still, you have to remember that this was said in Belgium, a member of the European Union. In the EU, saying such an obvious thing is revolutionary, since the EU barons seem to have forgotten it.

  3. Andrew, it’s almost like you missed the fact that they determined who is in which group SOLELY from the network structure. Yeah, it wouldn’t have been hard to do it by looking at the languages they used in the texts, but that’s not how they did it!

    I’m still not sure the result is surprising — it seems much like the example of separating liberal and conservative books by looking at the “people who bought X also bought Y” relationships on Amazon, which turned out to be pretty easy I think — but at least it’s a lot harder than saying “hey, I bet the people who send texts in Flemish are Flemish.”

    • yeah, once you’ve found the poor guy that will spend a few weeks cleaning up the data ;)

      @Andrew, i had got your point, and agreed with it. I was even quite amused to see us cited that way.
      As for your troll about Mateos et al’s paper, it made me want to read it, to check if you’re not just an as__ole (look, i’m kidding, ok?)

  4. I’ve always enjoyed your blog Andrew (although given the time lapse you can tell I’m not an everyday reader…) so it was with mixed feelings that I came across your comment on our paper – but hey, all publicity is good publicity, right?!

    Suffice it to say that the paper went around the houses to get published – PLoS ONE was our fourth port of call. The language had been flattened (in several metaphorical senses) by the time it had done the rounds of the revision process multiple times. Something to do with the difficulties of publishing more quantitative sociology (or for that matter geography) which is trying to build on new approaches in ‘network science’… if you’re not a physicist-mathematician-statistician, that is (or so it sometimes seems).

    My feeling (I speak only for myself here, not my co-authors, who may have different views) on our work (and Christophe’s) is that being able to confirm that such structures exist in the networks everyone is currently so excited about is (i) interesting in itself, and (ii) an important first step, even if it doesn’t tell us anything we didn’t already know. Our intended contribution is the concept of a names network and the clustering exercise on Auckland was intended to establish that it does indeed have interesting structure to make it worth spending any further time on it.

    What we’d like to do next is see if we can use network structure to infer the cultural-ethnic-linguistic affiliations of names in a network for which we don’t have pre-classified information. In fact, that’s where we got started on this idea anyway.

    I’m glad you like the using-forenames-to-cluster-surnames, surnames-to-cluster-forenames idea, even if we don’t do the latter in the paper. Another direction we’d like to go in is running co-clustering methods to simultaneously do both, which seems like it would make sense.

Comments are closed.