Blogger Echidne pointed me to a recent article, “The Distance Between Mars and Venus: Measuring Global Sex Differences in Personality,” by Marco Del Giudice, Tom Booth, and Paul Irwing, who find:
Sex differences in personality are believed to be comparatively small. However, research in this area has suffered from significant methodological limitations. We advance a set of guidelines for overcoming those limitations: (a) measure personality with a higher resolution than that afforded by the Big Five; (b) estimate sex differences on latent factors; and (c) assess global sex differences with multivariate effect sizes. . . . We found a global effect size D = 2.71, corresponding to an overlap of only 10% between the male and female distributions. Even excluding the factor showing the largest univariate ES [effect size], the global effect size was D = 1.71 (24% overlap).
“Psychologically, men and women are almost a different species,” said study researcher Paul Irwing, of the University of Manchester, in the United Kingdom.
The new findings may explain why some careers are dominated by men (such as engineering) and others by women (such as psychological sciences), Irwing said.
“People self-select in terms of their personality… and what they think is going to be suitable in terms of the fit,” for their career, Irwing said.
It’s too bad that men and women are almost a different species. It would be so convenient if they could mate with each other and produce fertile offspring! Oh well. In any case, Irwing himself is, like Sigmund Freud, that rare oddity, the male psychologist. He must be in that elusive 10% area of overlap!
Where I’m coming from
As the above mockery (along with occasional blog entries such as this and this) should make clear, I don’t have a lot of patience for this sort of boys-do-this, girls-do-that flavor of schoolyard evolutionary biology. (Although I hate it even more when people just make stuff up and then give their non-facts a pseudo-populist political spin.) In this particular case, any observed differences between men and women can be explained in any number of ways so I see the connection to evolution as being pretty weak.
That said, based on my glance at the Del Giudice et al. paper, their analysis seems like a good idea to me, if you separate out the politics and the evolutionary speculation and treat the analysis as entirely descriptive. (Descriptive statistics is just fine, remember?) If you pick the dimensions in which men and women differ the most, you can find a large separation. In my paper with Delia Baldassarri, we did something similar (but simpler) with political attitudes and it was hard to find much, but I suspect that personality profiles are more detailed and have more repeatability than political questionnaires (at least for the general population). It makes a lot of sense to look at differences in many dimensions in addition to studying distributions for single measurements or traits.
And the results don’t seem obviously wrong to me. From my subjective judgment, there certainly appear to be some traits and behaviors for which men are much different from women, and of course there are big statistical differences in crime rates and in some opinion items. Once you interpret these findings descriptively, I can well believe that it is possible to choose a set of items for which the difference between the average man and the average woman is much larger than the average difference between two people within either sex.
What do the numbers mean?
OK, so what about that idea that the distributions of men and women overlap by only 10%? Echidne found this response from Janet Hyde, a researcher whose work was criticized in the above-linked paper. Hyde writes:
The main innovation in the Del Giudice paper is to introduce the use of Mahalanobis D to the measurement of the magnitude of gender differences. A staple of multivariate statistics for decades, D in this application measures the distance between 2 centroids in multivariate space . . . computed by taking the linear combination of the original variables that maximizes the difference between groups. What they have shown is that, if one takes a large enough set of personality measures and then takes a linear combination to maximize gender differences, one can get a pretty big gender difference. . . .
Another important point to note is that Del Giudice and colleagues’ methods rely on subjective self-reports of personality. . . . As an example, Feingold’s meta-analysis found gender differences in anxiety ranging in magnitude between d = -.15 and -.32. That is, the differences were small, with females scoring higher. . . . In a meta-analysis of research on gender differences in temperament – some of it based on parent or other adult report, some of it based on behavioral measures – the effect size for the gender difference in fear was d = -0.12, i.e., a smaller difference . . . Moreover, a behavioral study measuring children’s distress to the insertion of an intravenous needle showed no significant gender difference. That is, the boys were as anxious and fearful as the girls. Too much of the research on gender differences has relied on subjective self-reports, when objective, behavioral measures may show much different results. . . .
As Hyde says, the mathematics of multivariate distributions are such that if you have two different distributions, as you go into high dimensions the overlap goes down.
Here’s the story in statistical notation. Consider the following simple example in d dimensions with two distributions, x (the continuous personality traits for the population of women) and y (for men). Assume x and y have different centers but the same scale, and let each distribution have the identity covariance matrix. (Actual distributions would have correlations; I’m just assuming the identity for simplicity.) As for the centers, assume that the mean of x and the mean of y differ by some small amount “a” in each dimension.
What, then, is the overlap between the two distributions, as defined along the axis that separates them? The distributions live in d dimensions and are separated by “a” in each dimension, thus the distance between their centers is a*sqrt(d)—think of the distance between opposite vertices of a d-dimensional cube. Along this dimension, x and y still separately each have variance 1. I’m not quite sure how they’re defining “overlap,” but for fixed a, if you let d get bigger and bigger, the distance can get as big as you want. For example, suppose a=.2 and d=30. Then a*sqrt(d)=1.1. Or if a=.5 and d=15, you get 1.9.
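The arithmetic in this toy example is easy to check with a short sketch. The function names here are made up for illustration, and the code assumes exactly the setup above: identity covariance and the same gap "a" in every dimension. It also inverts the formula to ask how many dimensions it would take to reach a given distance.

```python
import math


def center_distance(a: float, d: int) -> float:
    """Distance between two centers that differ by `a` in each of `d` dimensions.

    Assumes identity covariance, so Euclidean distance is the relevant one.
    """
    return a * math.sqrt(d)


def dims_needed(a: float, target: float) -> int:
    """Smallest d for which a * sqrt(d) reaches `target` (hypothetical helper)."""
    return math.ceil((target / a) ** 2)


if __name__ == "__main__":
    for a, d in [(0.2, 30), (0.5, 15)]:
        print(f"a={a}, d={d}: distance = {center_distance(a, d):.2f}")
    for a in (0.2, 0.5):
        print(f"a={a}: dimensions needed to reach 2.71 = {dims_needed(a, 2.71)}")
```

In this simplified setup, a per-dimension gap of .5 would need about 30 dimensions to reach a distance of 2.71, and a gap of .2 would need about 184. Real personality scales are correlated, so this is only a rough guide to how the distance scales with d.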
I guess what I’m saying here is that I don’t find it easy to interpret the published values of 2.7 and 1.7 (or, as the authors quaintly put it, 2.71 and 1.71). What I really want to see are some graphs. I’d like to get a sense of what these distributions are, and who are the people in the overlap.
And I’d like to see them replicate the method comparing other groups. They’ve done men/women. They could do young/old, Northerners/Southerners, hi/lo education, Democrats/Republicans, parents/non-parents, etc. These calculations would provide a calibration, a basis of comparison. I’m very supportive of work connecting personality to political attitudes, and if I think there can be big differences between Democrats and Republicans, then it certainly seems plausible that there are big differences between men and women. If, as I expect, the average differences between men and women are much larger than the differences between those other groups, this would strengthen the published findings.
Finally, Echidne asks:
1. Would the same overall findings inevitably be produced by different researchers using the same data and the same method and would it matter if they tried to maximize or minimize gender differences?
2. When the researchers say that only 18% of men and women have the same personalities, what do they mean? That all values on all dimensions are the same or roughly so? And if we move away from this 18%, how large are the differences?
My replies:
1. Yes, I assume that if others tried to replicate, they’d get similar results (with various small differences involving choices of data and model). The nature of this method is that it maximizes differences. As noted by Hyde, the more dimensions you have, the more ways you can find differences.
2. I’m not quite sure how they define overlap. I’m picturing two univariate distributions (little bell-shaped curves) with some area of overlap that is shaded. If the distributions are identical, the overlap is 100%; if they are far apart, the overlap is near 0. Again, these low numbers such as 18% make sense to me if you look at enough dimensions, but I don’t really have much intuition on the survey responses, the statistical models used to estimate underlying traits, or the summary of the multidimensional comparison.
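To make the two-bell-curves picture concrete, here is a minimal sketch of two common ways to turn a distance D into an "overlap" percentage. It assumes two normal curves with variance 1 whose means are D apart along the separating axis. I don't know which definition (if either) the paper uses, and the function names are mine.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def overlap_coefficient(D: float) -> float:
    """Shaded area shared by two unit-variance normal curves D apart.

    Each curve has area 1, so identical curves (D = 0) give 1.0.
    """
    return 2 * _STD_NORMAL.cdf(-D / 2)


def overlap_of_joint(D: float) -> float:
    """Shared area as a fraction of the total area covered by either curve."""
    shared = overlap_coefficient(D)
    return shared / (2 - shared)


if __name__ == "__main__":
    for D in (0.0, 1.1, 1.71, 1.9, 2.71):
        print(
            f"D={D:4.2f}: shared area = {overlap_coefficient(D):5.1%}, "
            f"share of joint area = {overlap_of_joint(D):5.1%}"
        )
```

Under these assumptions, D = 2.71 comes out to about 18% by the first definition and about 10% by the second, and D = 1.71 comes out to about 24% by the second, which may be why both 10% and 18% show up in the discussion above.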