Association for Psychological Science takes a hard stand against criminal justice reform

Here’s the full quote, from an article to be published in one of our favorite academic journals:

The prescriptive values of highly educated groups (such as secularism, but also libertarianism, criminal justice reform, and unrestricted sociosexuality, among others) may work for groups that are highly cognitively sophisticated and self-controlled, but they may be injurious to groups with lower self-control and cognitive ability. Highly educated societies with global esteem have more influence over global trends, and so the prescriptive values promulgated by these groups are likely to influence others who may not share their other cognitive characteristics. Perhaps then highly educated and intelligent groups should be humble about promoting the unique and relatively novel values that thrive among them and perhaps should be cautious about mocking certain cultural narratives and norms that are perceived as having little value in their own society.

I have a horrible feeling that I’m doing something wrong, in that by writing this post I’m mocking certain cultural narratives and norms that are perceived as having little value in my own society.

But, hey, in my own subculture, comedy is considered to be a valid approach to advancing our understanding. So I’ll continue.

The Association for Psychological Science (a “highly educated and intelligent group,” for sure) has decided that they should be humble about promoting the unique and relatively novel values that thrive among them. And these unique and relatively novel values include . . . “criminal justice reform and unrestricted sociosexuality.” If members of the Association for Psychological Science want criminal justice reform and unrestricted sociosexuality for themselves, that’s fine. Members of the APS should be able to steal a loaf of bread without fear of their hands getting cut off, and they should be able to fool around without fear of any other body parts getting cut off . . . but for groups with lower self-control and cognitive ability—I don’t know which groups these might be, you’ll just have to imagine—anyway, for those lesser breeds, no criminal justice reform or unrestricted sociosexuality for you. Gotta bring down the hammer—it’s for your own good!

You might not agree with these positions, but in that case you’re arguing against science. What next: are you going to disbelieve in ESP, air rage, himmicanes, ovulation and voting, ego depletion, and the amazing consequences of having an age ending in 9?

OK, here’s the background. I got this email from Keith Donohue:

I recently came across the paper, “Declines in Religiosity Predicted Increases in Violent Crime—But Not Among Countries with Relatively High Average IQ,” by Clark and colleagues, which is available on ResearchGate and is in press at Psychological Science. Some of the authors have also written about this work in the Boston Globe.

I [Donohue] will detail some of the issues that I think are important below, but first a couple of disclaimers. First, two of the authors, Roy Baumeister and Bo Winegard, are (or were) affiliated with Florida State University (FSU). I got my PhD from FSU, but I don’t have any connection with these authors. Second, the research in this paper uses estimates of intelligence quotients (IQs) for different nations. Psychology has a long and ignoble history of using dubious measures of intellectual ability to make general claims about differences in average intelligence between groups of people – racial/ethnic groups, immigrant groups, national groups, etc. Often, these claims have aligned with prevailing prejudices or supported frankly racist social policies. This history disturbs me, and seeing echoes of it in a flagship journal for my field disturbs me more. I guess what I am trying to say is that my concerns about this paper go beyond its methodological issues, and I need to be honest about that.

With that in mind, I have tried to organize and highlight the methodological issues that might interest you or might be worth commenting on.

1.) Data source for estimates of national IQ and how to impute missing values in large datasets

a. The authors used a national IQ dataset (NIQ: Becker, 2019) that is available online, as well as two other datasets (LV12GeoIQ, NIQ_QNWSAS – I can only guess at what these initialisms stand for). These datasets seem to be based on work by Richard Lynn, Tatu Vanhanen, and (more recently) David Becker for their books IQ and the Wealth of Nations, Intelligence: A Unifying Construct for the Social Sciences, and Intelligence of Nations. They appear to be collections of IQ estimates made from proxies of intelligence, such as school achievement or scores on other tests of mental abilities. Based on my training and experience with intelligence testing, this research decision seems problematic. Also problematic is the decision to impute missing values for some nations based on data from neighboring nations. Lynn and colleagues’ work has received a lot of criticism from within academia, as well as from various online sources. I find this criticism compelling, but I am curious about your thoughts on how researchers ought to impute missing values in large datasets. Suppose a researcher was working on a (somewhat) less controversial topic, like average endorsement of a candidate or agreement with a policy, across different voting areas. If they had incomplete data, what are some concerns that they ought to have when trying to guess at missing values?
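For intuition on that last question, here is a toy simulation (all numbers invented) of what neighbor-based single imputation does to a country-level dataset: the filled-in values shrink the spread and are mechanically dependent on their neighbors, so downstream analyses overstate precision and inherit spatial dependence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (invented numbers): 10 regions with 10 countries each,
# and country-level values that cluster by region.
region_means = rng.normal(100, 10, size=10)
values = region_means[:, None] + rng.normal(0, 10, size=(10, 10))

# Suppose ~30% of countries are unobserved, and each gap is filled with
# the mean of its observed regional neighbors (single imputation).
observed = rng.random((10, 10)) > 0.3
masked = np.where(observed, values, np.nan)
regional_fill = np.nanmean(masked, axis=1, keepdims=True)
imputed = np.where(observed, values, regional_fill)

# Single imputation shrinks the spread, and the filled-in countries are
# deterministic functions of their neighbors, so any downstream regression
# overstates precision and inherits the spatial dependence.
shrinks_variance = imputed.std() < values.std()
print(shrinks_variance)
```

Multiple imputation (drawing several plausible fills and propagating the between-fill variation) is the usual remedy for the understated uncertainty, though it does nothing about the deeper measurement problems.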

2.) Testing hypotheses about interactions, when the main effect for one of the variables isn’t in the model

a. In Study 1, the authors tested their hypothesis that the negative relationship between religiosity and violent crime (over time) is moderated by national IQ, using fixed-effects, within-country linear regression. There are some elements of these analyses that I don’t really understand, such as the treatment of IQ (within each country) as a time-stable predictor variable and religiosity as a time-varying predictor variable (both should change over time, right?), or even the total number of models used. However, I am mostly curious about your thoughts on testing an interaction using a model that does not include the main effects for each of the predictor variables in the interaction (which appears to be what the authors are doing). I don’t have my copy of Cohen, Cohen, West, and Aiken in front of me, but I remember learning to test models with interactions in a hierarchical fashion, by first entering main effects for the predictor variables and then looking for changes in model fit when the interaction effect was added. I can appreciate that the inclusion of time in these analyses (some of the variables – but not IQ? – are supposed to change) makes these analyses more complicated, but I wonder if this is an example of incongruity between research hypotheses and statistical analyses.

b. Also, I wonder if this is an example of over-interpreting the meaning of moderation, in statistical analyses. I think that this happens quite a lot in psychology – researchers probe a dataset for interactions between predictor variables, and when they find them, they make claims about underlying mechanisms that might explain relationships between those predictors or the outcome variable(s). At the risk of caricaturing this practice, it seems a bit like IF [statistically significant interaction is detected] THEN [claims about latent mechanisms or structures are allowed].
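A small simulated sketch of the hierarchical-testing concern (invented numbers, plain least squares): when predictors are not centered, a product term fit without its main effects simply absorbs them and looks “significant” even when the true interaction is exactly zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Invented data: religiosity and IQ are NOT centered, both have main
# effects, and the true interaction is exactly zero.
religiosity = rng.normal(0.5, 1.0, n)
iq = rng.normal(100.0, 15.0, n)
crime = -1.0 * religiosity - 0.02 * iq + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
prod_term = religiosity * iq

# Interaction-only model: the product term proxies for the omitted main
# effects (rel*iq ~ 100*rel + 0.5*iq + ...), so its coefficient is sizable.
b_bad = ols(np.column_stack([ones, prod_term]), crime)

# Hierarchical model: main effects first, then the interaction on top.
b_full = ols(np.column_stack([ones, religiosity, iq, prod_term]), crime)

spurious = abs(b_bad[1]) > 0.006   # interaction-only coefficient is nonzero
honest = abs(b_full[3]) < 0.006    # true interaction ~0 once mains are in
print(spurious, honest)
```

This is exactly why the textbook advice is to enter main effects before testing the product term: the product term has no clean interpretation on its own.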

3.) The use of multiverse analyses to rule-out alternative explanations

a. In Study 2, the authors use a multiverse analysis to further examine the relationship between religiosity, national IQ, and violent crime. I have read the paper that you coauthored on this technique (Steegen, Tuerlinckx, Gelman, & Vanpaemel, 2016) and I followed some of the science blogging around Orben and Przybylski’s (2019) use of it in their study on adolescent well-being and digital technology use. If I understand it correctly, the purpose of a multiverse analysis is to organize the very large number of analyses that could potentially be done with a group of variables, so as to better understand whether or not the researchers’ hypotheses (e.g., fertility affects political attitudes, digital technology use affects well-being) are generally supported (or, to be more Popperian, generally refuted). In writing up the results of a multiverse analysis, it seems like the goal is to detail how different analytic choices (e.g., the inclusion/exclusion of variables, how those variables are operationalized, etc.) influence the results. With that in mind, I wonder if this is a good (or bad?) example of how to conduct and present a multiverse analysis. In reading it, I don’t get much of a sense of the different analytic decisions that the authors considered, and their presentation of their results seems a little hand-wavy: “we conducted a bunch of analyses, but they all came out the same…” But given my discomfort with the research topic (i.e., group IQ differences), and my limited understanding of multiverse analyses, I don’t really trust my judgment.
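For what it’s worth, the bookkeeping of a multiverse analysis can be sketched in a few lines (toy data; the “analytic choices” here are invented stand-ins for the real ones): every combination of choices gets its own estimate, and the write-up should show how the estimate moves across all of them rather than collapsing the grid into one verdict.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
n = 500

# Invented data: one predictor of interest, two optional controls, and an
# outcome that can be analyzed raw or rank-transformed.
x = rng.normal(size=n)
c1 = 0.5 * x + rng.normal(size=n)
c2 = rng.normal(size=n)
y_raw = 0.3 * x + 0.4 * c1 + rng.normal(size=n)
y_rank = y_raw.argsort().argsort().astype(float) / n

def coef_of_x(y, controls):
    # OLS coefficient on x, given a chosen control set.
    X = np.column_stack([np.ones(n), x] + controls)
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# One "universe" per combination of analytic choices.
outcomes = {"raw": y_raw, "ranked": y_rank}
control_sets = {"none": [], "c1": [c1], "c2": [c2], "both": [c1, c2]}
estimates = {
    (o_name, c_name): coef_of_x(y, ctl)
    for (o_name, y), (c_name, ctl) in product(outcomes.items(),
                                              control_sets.items())
}

print(len(estimates))  # 2 outcomes x 4 control sets = 8 specifications
```

The deliverable is the whole table (or plot) of `estimates`, which makes it visible how much the conclusion depends on which universe you happen to live in.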

4.) Dealing with Galton’s Problem and spatial autocorrelation

a. The authors acknowledge that the data for the different nations in their analyses are likely dependent, because of geographic or cultural proximity, an issue that they identify as Galton’s Problem or spatial autocorrelation. This issue seems important, and I appreciate the authors’ attempts to address it (note: it must be interesting to be known as a “Galton’s Problem expert”). I guess that I am just curious as to your thoughts about how to handle this issue. Is this a situation in which multilevel modeling makes sense?
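One standard way to see why Galton’s Problem matters statistically is to compare naive and cluster-aware standard errors on simulated data (regions and effect sizes invented here): ignoring within-region dependence makes the uncertainty look far smaller than it is. Multilevel modeling is one fix; the cluster-robust sandwich estimator below is a simpler one that makes the same point.

```python
import numpy as np

rng = np.random.default_rng(3)
n_regions, per = 20, 10
n = n_regions * per

# Invented data: countries within a region share a common shock in both x
# and y, so the 200 observations are far from independent.
region = np.repeat(np.arange(n_regions), per)
shock = rng.normal(size=n_regions)
x = shock[region] + rng.normal(size=n)
y = shock[region] + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
bread = np.linalg.inv(X.T @ X)

# Naive OLS standard error pretends all countries are independent.
se_naive = np.sqrt((resid @ resid / (n - 2)) * bread[1, 1])

# Cluster-robust (by region) sandwich estimator: aggregate scores per region.
meat = np.zeros((2, 2))
for g in range(n_regions):
    s = X[region == g].T @ resid[region == g]
    meat += np.outer(s, s)
se_cluster = np.sqrt((bread @ meat @ bread)[1, 1])

print(se_cluster > se_naive)  # dependence inflates the honest uncertainty
```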

There are a few issues here.

First, I wouldn’t trust anything by Roy Baumeister, first because he has a track record of hyping problematic research claims, second because he has endorsed the process of extracting publishable findings from noise. Baumeister’s a big fan of research that is “interesting.” As I wrote when this came up a few years ago:

Interesting to whom? Daryl Bem claimed that Cornell students had ESP abilities. If true, this would indeed be interesting, given that it would cause us to overturn so much of what we thought we understood about the world. On the other hand, if false, it’s pretty damn boring, just one more case of a foolish person believing something he wants to believe.

Same with himmicanes, power pose, ovulation and voting, alchemy, Atlantis, and all the rest.

The unimaginative hack might find it “less broadly interesting” to have to abandon beliefs in ghosts, unicorns, ESP, and the correlation between beauty and sex ratio. For the scientists among us, on the other hand, reality is what’s interesting and the bullshit breakthroughs-of-the-week are what’s boring.

To the extent that phenomena such as power pose, embodied cognition, ego depletion, ESP, ovulation and clothing, beauty and sex ratio, Bigfoot, Atlantis, unicorns, etc., are real, then sure, they’re exciting discoveries! A horse-like creature with a big horn coming out of its head—cool, right? But, to the extent that these are errors, nothing more than the spurious discovery of patterns from random noise . . . then they’re just really “boring” (in the words of Baumeister) stories, low-grade fiction.

Also, Baumeister has a political axe to grind. This alone doesn’t mean his work is wrong—someone can have strong political views and do fine research; indeed, sometimes strong views can motivate careful work, if you really care about getting things right. Rather, the issue is that we have good technical reasons not to take Baumeister’s research seriously. (For more on problems with his research agenda, see the comment on p. 37 by Smaldino here.) Given that Baumeister’s work may still have some influence, it’s good to understand his political angle.

And, no, it’s not “bullying” or an “ad hominem attack” to consider someone’s research record when evaluating his published claims.

Second, yeah, it’s my impression that you have to be careful with these cross-national IQ comparisons; see this paper from 2010 by Wicherts, Borsboom, and Dolan. Relatedly, I’m amused by the claim in the abstract by Clark et al. that “Many have argued that religion reduces violent behavior within human social groups.” I guess it depends on the religion, as is illustrated by the graph shown at the top of this page (background here).

Third, ok, sure, the paper will be published in Psychological Science, flagship journal bla bla bla. I’ve written for those journals myself, so I guess I like some of what they do—hey, they published our multiverse paper!—but the Association for Psychological Science is also a bit of a members’ club, run for the benefit of the insiders. In some way, I have to admire that they’d publish a paper on such a politically hot topic as IQ differences between countries. I’d actually have guessed that such a paper would push too many buttons for it to be considered publishable by the APS. But Baumeister is well connected—he’s “one of the world’s most prolific and influential psychologists”—so I guess that in the battle between celebrity and political correctness, celebrity won.

And, yes, that paper is politically incorrect! Check out these quotes:

Educated societies might promote secularization without considering potentially disproportionately negative consequences for more cognitively disadvantaged groups. . . .

We suspect that similar patterns might emerge for numerous cultural narratives. The prescriptive values of highly educated groups (such as secularism, but also libertarianism, criminal justice reform, and unrestricted sociosexuality, among others) may work for groups that are highly cognitively sophisticated and self-controlled, but they may be injurious to groups with lower self-control and cognitive ability.

OK, I got it. We won’t throw you in jail if you’re from a group with higher self-control and cognitive ability. But if you’re from one of the bad groups, you can do what you want: you can handle a bit of “libertarianism, criminal justice reform, and unrestricted sociosexuality, among others.” Hey, it worked for Jeffrey Epstein!

Fourth, there are questions about the statistical model. I won’t lie: I’m happy they did a multiverse analysis and fit some multilevel regressions. I’m not happy with all the p-values and statistical significance, nor am I happy with some of their arbitrary modeling decisions (“the difference led us to create two additional dummy variables, whether a country was majority Christian or not and whether a country was majority Muslim or not, and to test whether either of these dummy variables moderated the nine IQ by religiosity interactions (in the base models, without controls). None of the 18 three-way interactions were statistically significant, and so we do not interpret this possible difference between Christian majority countries and Muslim majority countries”) or new statistical methods they seemed to make up on the spot (“We arbitrarily decided that a semipartial r of .07 or higher for the IQ by religiosity interaction term would be a ‘consistent effect’ . . .”), but, hey, baby steps.

Donohue’s questions above about multiverse analysis and spatial correlations are good questions, but it’s hard for me to answer them in general, or in the context of this sort of study, where there are so many data issues.

To see where I’m coming from, consider an example that I’ve thought about a lot: the relation between income, religious attendance, geography, and vote choice. My colleagues and I wrote a whole book about this!

The red-state-blue-state project and the homicide/religion/IQ project have a lot of similarities, in that we’re understanding social behavior through demographics, and looking at geographic variation in this relationship. We’re interested in individual and average characteristics (individual and state-average incomes in our case; individual and national-average IQ’s in theirs). Clark et al. have data issues with national IQ measurements, but it’s not like our survey measurements of income are super-clean.

So, if we take Red State Blue State as a template for this sort of analysis, how does the Clark et al. paper differ? The biggest difference is that we have individual level data—survey responses on income, religious attendance, and voting—whereas Clark et al. only have averages. So they have a big ecological correlation problem. Indeed, one of the big themes of Red State Blue State is that you can’t directly understand individual correlations by looking at correlations among aggregates. The second difference between the two projects is that we had enough data that we can analyze each election year separately, whereas Clark et al. pool across years, which makes results much harder to understand and interpret. The third difference is that we developed our understanding through lots of graphs. I can’t imagine us figuring out much, had we just looked at tables of regression coefficients, statistical significance, correlations, etc.
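The ecological-correlation point can be made concrete with a toy red-state-blue-state simulation (all parameters invented): the individual-level, within-state relationship and the relationship among state aggregates can have opposite signs, so correlations among national averages tell you little about individuals.

```python
import numpy as np

rng = np.random.default_rng(4)
states, per = 30, 200
n = states * per

# Invented pattern: within each state, richer individuals vote Republican
# MORE; but richer states vote Republican LESS.
state_income = rng.normal(size=states)
s = np.repeat(state_income, per)
indiv_dev = rng.normal(size=n)
income = s + indiv_dev
logit = 0.8 * indiv_dev - 1.5 * s
vote_rep = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

inc = income.reshape(states, per)
vote = vote_rep.reshape(states, per)

# Within-state (individual-level) correlation: positive.
within = np.corrcoef(
    (inc - inc.mean(axis=1, keepdims=True)).ravel(),
    (vote - vote.mean(axis=1, keepdims=True)).ravel(),
)[0, 1]

# Correlation of state aggregates: negative.
aggregate = np.corrcoef(inc.mean(axis=1), vote.mean(axis=1))[0, 1]

print(within > 0, aggregate < 0)  # opposite signs: the ecological trap
```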

This is not to say that their analysis is necessarily wrong, just that it’s hard for me to make sense of this sort of big regression; there are just too many moving parts. I think the first step in trying to understand this sort of data would be some time series plots of trends of religiosity and crime, with a separate graph for each country, ordering the countries by per-capita GDP or some similar measure of wealth. Just to see what’s going on in the data before going forward.

Fifth, I’m kinda running out of energy to keep staring at this paper, but let me point out one more thing: the extreme stretch from the empirical findings of this paper, such as they are, to its sociological and political conclusions. Set aside for a moment the problems with the data and statistical analysis, and suppose the data show exactly what the authors claim: that time trends in religious attendance correlate with time trends in homicide rates in low-IQ countries but not in high-IQ countries. Suppose that’s all as they say. How can you, from that pattern, draw the conclusion that “The prescriptive values of highly educated groups (such as secularism, but also libertarianism, criminal justice reform, and unrestricted sociosexuality, among others) may work for groups that are highly cognitively sophisticated and self-controlled, but they may be injurious to groups with lower self-control and cognitive ability”? You can’t. To make such a claim is not a gap in logic, it’s a chasm. Aristotle is spinning in his goddam grave, and Lewis Carroll, Georg Cantor, and Kurt Gödel ain’t so happy either. This is story time run amok. I’d say it’s potentially dangerous for the sorts of reasons discussed by Angela Saini in her book, but I guess nobody takes the Association for Psychological Science seriously anymore.

I’m surprised Psych Science would publish this paper, given its political content and given that academic psychology is pretty left-wing and consciously anti-racist. I’m guessing that it’s some combination of: (a) for the APS editors, support of the in-group is more important than political ideology, and Baumeister’s in the in-group; (b) nobody from the journal ever went to the trouble of reading the article from beginning to end (I know I didn’t enjoy the task!); or (c) if they did read the paper, they were too clueless to understand its political implications.

But, hey, if the APS wants to take a stance against criminal justice reform, it’s their call. Who am I to complain? I’m not even a member of the organization.

I’d love to see the lords of social psychology be forced to take a position on this one—but I can’t see this ever happening, given that they’ve never taken a position on himmicanes, ESP, air rage, etc. At one point one of those lords tried to take a strong stand in favor of that ovulation-and-voting paper, but then I asked him flat-out if he thought that women were really three times more likely to wear red during certain days of the month, and he started dodging the question. These people pretty much refuse to state a position on any scientific issue, but they very strongly support the principle that anything published in their journals should not be questioned by an outsider. How they feel about scientific racism, we may never know.

P.S. One more thing, kinda separate from everything else but it’s a general point so I wanted to share it here. Clark et al. write:

Note also that noise in the data, if anything, should obscure our hypothesized pattern of results.

No no no no no. Noise can definitely obscure true underlying patterns, but it won’t necessarily obscure your hypothesized patterns. Noise can just give you more opportunities to find spurious, “statistically significant” patterns. The above quote is an example of the “What does not kill my statistical significance makes it stronger” fallacy. It’s an easy mistake to make; famous econometricians have done it. But a mistake it is.
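This fallacy is easy to demonstrate by simulation (numbers invented): among estimates that happen to cross the significance threshold, noisier data produce larger, not smaller, effect estimates, because only wildly exaggerated estimates can clear the higher bar.

```python
import numpy as np

rng = np.random.default_rng(5)
sims, n, true_effect = 5000, 50, 0.1

def mean_significant_estimate(noise_sd):
    # Many replications of a small study estimating the same true effect.
    x = true_effect + rng.normal(0.0, noise_sd, size=(sims, n))
    est = x.mean(axis=1)
    se = x.std(axis=1, ddof=1) / np.sqrt(n)
    sig = np.abs(est / se) > 1.96
    # Average magnitude of the estimates that reach "significance."
    return np.abs(est[sig]).mean()

low_noise = mean_significant_estimate(0.5)
high_noise = mean_significant_estimate(2.0)

# Noise does not just attenuate what survives the filter: conditional on
# significance, noisier data yield BIGGER estimates (type M error).
print(low_noise < high_noise)
```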

141 thoughts on “Association for Psychological Science takes a hard stand against criminal justice reform”

    • IQ-related research, like much of social science, is full of junk. However, it’s intellectually dishonest to put IQ in the same realm as fiction.

      • yyw:

        Intellectually dishonest? IQ testing was invented to help children with learning difficulties, and it has since been transformed into an “objective” measure of human intelligence. Intelligence which, funnily enough, we cannot even define.

        • I’m a bit confused by the objections I see raised to the validity of IQ testing (not just here, but practically everywhere the subject is mentioned).

          There are problems with definitions, and difficulties in testing. But there are clear and unambiguous differences between individuals measured as IQ 80, 100, 120, 140, and 160 (just for instance). It’s clearly a real and impactful concept. A simplified abstraction of what’s really involved in intelligence, no doubt, but more real and measurable than most other mental/psychological attributes.

        • Sameera,

          As the saying goes…Life is a tragedy to those who feel, but a comedy to those who think.

          The circular definition games that psychology researchers like to play with themselves are funny on one level. But they put stuff out there that, every so often, is used to do real-world harm. Take one circularly-defined construct like “IQ” and mix it with a slippery concept like “Race” and you can fashion yourself a very real cudgel created entirely out of silly academic games.

        • One thing that is often missed in discussions about IQ is how cognitive tests are actually used by those with the training to use them. In keeping with the original intentions of early test creators, tests are still used today to help in the identification of learning difficulties and cognitive strengths and weaknesses relevant to education. They are used to diagnose the cognitive sequelae of brain injury and disease, to monitor the impacts of treatment interventions on cognitive symptoms, to inform rehabilitation efforts, and so on. These are important applications that often go unmentioned whenever the topic turns to IQ tests. The public understanding of cognitive assessment tools and their applications is thus impoverished. And it doesn’t help when there are cranks out there—some of whom are even commenting on this post (e.g., Emil)—who try to pass off their junk science as a legitimate application of the tests in question.

        • You need a stable definition of what’s being approximated in order to evaluate if it’s a useful approximation. Otherwise you’re using Wittgenstein’s ruler.

          IQ predicts some stuff. You could say “well clearly that means IQ captures some kind of meaningful variation, it’s a real and meaningful simplification of intelligence.” But you can also predict almost all the same things with parental income. Can you then say that “one’s parental income is a real and meaningful simplification of the concept of intelligence”?

          To be clear, IQ may still be a useful summary statistic for predictive tasks. But “predicts stuff” is distinct from “real concept”. The first principal component of a wide set of predictors is generally a strong predictor, but nobody generally claims it can be interpreted as some concept.
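That point about the first principal component can be illustrated directly (toy data in which a single invented latent factor drives four noisy predictors): PC1 out-predicts every individual variable, but that hardly licenses calling it a real concept.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000

# Invented data: one latent factor drives four noisy predictors and the outcome.
latent = rng.normal(size=n)
preds = np.column_stack([latent + rng.normal(size=n) for _ in range(4)])
outcome = latent + rng.normal(size=n)

# First principal component of the standardized predictors.
z = (preds - preds.mean(axis=0)) / preds.std(axis=0)
_, _, vt = np.linalg.svd(z, full_matrices=False)
pc1 = z @ vt[0]

best_single = max(abs(np.corrcoef(preds[:, j], outcome)[0, 1])
                  for j in range(4))
pc1_corr = abs(np.corrcoef(pc1, outcome)[0, 1])

# PC1 is just a weighted average, yet it predicts better than any one
# variable: predictive strength alone doesn't make it "the" concept.
print(pc1_corr > best_single)
```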

        Ok… is the objection that IQ test scores don’t bring any additional information to the table, that you wouldn’t get from various socioeconomic, familial, etc., predictors? And that using IQ scores in turn to predict any of those things, as is often done, leads you to get causation all screwed up? I can see that it’s been poorly used in many studies.

          I guess I would argue that there is a real concept (albeit complex, poorly defined, and hard to measure) we call intelligence, and that IQ tests (or other types of intelligence testing) can give you a bit of additional evidence about it, beyond those other factors. That seems like the Bayesian approach, philosophically.

          I wouldn’t object to IQ being labeled a flawed or imprecise metric, just that being labeled meaningless or that the presence of it “puts this work into fiction” is too strong of a criticism. But I feel like I’ve walked into a discussion that’s been ongoing, so perhaps I should tentatively back away and lurk more.

        • Keith E said,
          “Ok… is the objection that IQ test scores don’t bring any additional information to the table, that you wouldn’t get from various social-economical, familial, etc., predictors? And that using IQ scores in turn to predict any of those things, as is often done, leads you to get causation all screwed up? I can see that it’s been poorly used in many studies.”

          I think a big part of the problem is the (very common) confusion between “predicts” and “causes” — which is a special case of confusing correlation and causality.

        • > I guess I would argue that there is a real concept (albeit complex, poorly defined, and hard to measure) we call intelligence, and that IQ tests (or other types of intelligence testing) can give you a bit of additional evidence about it, beyond those other factors.

          The issue is: what is “it”? To say that we have this number that is a measure, or even a flawed measure, of a “thing” when we can’t describe what the thing is, is to GUARANTEE abuses of interpretation. If a “correct” interpretation doesn’t exist, then every interpretation is an abuse.

          To draw a chemistry analogy, intelligence is a bit like phlogiston. We knew vaguely that there are common characteristics that make some substances more or less combustible, but we didn’t actually understand why in any generalizable way. IQ is a bit like if people started computing materials’ “phlogiston densities” before Lavoisier. You could call it “the relative quantity of phlogiston in a material”, but what does this actually mean? Trying to do anything other than within-sample interpolation and prediction would have been doomed to failure.

        • Somebody said,
          “The issue is: what is “it”? To say that we have this number that is a measure, or even a flawed measure, of a “thing” when we can’t describe what the thing is, is to GUARANTEE abuses of interpretation. If a “correct” interpretation doesn’t exist, then every interpretation is an abuse.”

          Brings back memories of being given the definition, “Intelligence is that which an intelligence test measures.” Googling this “definition” now, I get a whole slew of hits. Is this definition still used?!?!

          Naturally, IQ researchers are quite good at statistics, counting among their number the inventors of modern statistics, of the correlation and regression concepts, factor analysis, adjustment for unreliability, and so on (Francis Galton, Karl Pearson, Charles Spearman, etc.). So your objection here is that family variables predict stuff as well, and IQ predictions might just be confounded by that. Naturally, there is plenty of research into that, for instance, using siblings. These studies show that the validity of IQ is not mostly due to association with parental social-status traits. Here’s a recent Danish study of 360k Danes and register data.

          https://www.sciencedirect.com/science/article/abs/pii/S0191886919303411

          One can of course also adjust for measured parental traits in a regression using between-family data. Charles Murray and Richard Herrnstein did just that in The Bell Curve, finding the same result: IQ’s validity is not mostly due to association with parental social-status traits. Murray also did a sibling study using the NLSY dataset, with a similar conclusion. Similar studies also exist for the SAT, with the same results.

          https://emilkirkegaard.dk/en/?p=7952

          One can go further and look at what parental income etc. actually causes, using a clever sibling design. The answer is: not much. Amir Sariaslan et al. have conducted a number of such studies using Swedish register data:

          https://onlinelibrary.wiley.com/doi/abs/10.1111/jcpp.12140

          https://www.cambridge.org/core/journals/the-british-journal-of-psychiatry/article/childhood-family-income-adolescent-violent-criminality-and-substance-misuse-quasiexperimental-total-population-study/A5CF371A1776F376ED11FCB5A22305A5

          https://academic.oup.com/ije/article-abstract/42/4/1057/656274

          Also, parental income does not predict the future as well as children’s own IQ, so even if we ignored the above, you would still be wrong.

          You are completely misreading my point. My point is not that IQ does not predict as well as other things or that it is confounded by other things. My point is that predicting things arbitrarily well does not make something a well-defined, interpretable concept. If you took parental income, SES, schooling, and IQ and took the first principal component, that would predict income even better. What’s the interpretation there? It predicts better; is that then the real measure of intelligence? If you put all the variables you could find through a random forest or a multilayer feedforward neural network or a boosted tree, the array of neuron weights or tree splits would be even better. Would that collection of loosely structured data then be a real measure of intelligence?

          Emil, this is not the first time you’ve responded to a discussion of construct validity with within-population predictive performance. Predictive performance =/> construct validity. Based on the list of statisticians you’ve cited, I feel like you might not be understanding that g is not the same thing as IQ. The work of people like Spearman on g-factors provides somewhat convincing evidence (with some disputes) that performance on a variety of tasks is tightly coupled to one “thing”. That’s not the same thing as presuming to know what that “thing” is, or that we can measure it. You seem to be under the impression that I’m attacking the idea of a general intelligence, which I’m not. I’m saying that you cannot establish that IQ is an approximate measure of g without a clearer definition of g, and that evidence about predictive performance is irrelevant.

        • Are you aware of any sizable databases of raw IQ data and predicted outcomes openly available for re-analysis?

        • James Whanger,

          There are many of these, but one has to know where to look. Some quick places you can start: the NLSY (there are 3 of these), Add Health (partially public), and the GSS and ANES, which each have Wordsum (a 10-item vocabulary test), not ideal but not useless either. Project Talent is partially open (only the first wave; the later, follow-up waves are somehow behind application access). I think the Early Childhood Longitudinal Study (ECLS) is also open. One wave of NHANES has an IQ test as well, not a great one. The OKCupid dataset has IQ, but not high quality.

          Just off the top of my head. There are many more with relatively loose access requirements.

        • I think Pinker is wrong:

          https://res.mdpi.com/d_attachment/jintelligence/jintelligence-02-00012/article_deploy/jintelligence-02-00012.pdf

        • And yet, as you can see, one of the leading psychometricians in the world clearly describes the flaws in this perspective. That does not bode well for the rest of psychology.

        • Sorry if it looks like I’m oversimplifying too much, but I don’t think I am.

          Let’s say that you and I make the same IQ test. You get 120 (no offence), I get 100.
          Can you tell me please the objective difference between you and me based on these 2 numbers?

        • From my point of view, unless trained in psychometrics and cognitive ability tests in particular, most people are against IQ tests. But, anecdotally, IQ is one of the best predictors of work performance, so they must be getting something right. I don’t think we should stop using IQ tests in selection and training. As well, IQ is a very important stream of research that intersects work on human rationality. Denying it would partially deny the importance of studying rationality.

          Prof Stanovich by any chance?

  1. The basic idea that the demonstration effects of high SES culture are deleterious for low SES culture is the thesis of Charles Murray’s Coming Apart. The difference is that the notion that the effects of these behaviors on the high SES groups are benign is nowhere in Murray. (Interestingly, for those who evaluate Murray entirely by his past work, IQ is not a variable in his modeling.) The other difference is that Murray’s statistical evidence is really simple, but in this case the simplicity is a virtue: if you can’t avoid an ecological fallacy problem, aren’t you better off analyzing the data using aggregates that are easy for everyone to understand? (IIRC, he uses aggregates at the ZIP-code level, which is miles better than country-level in any case.)

    • Yes. Not much of a fan, but Murray has learned to tread the line carefully: while of course innate characteristics act like a sorting hat, family and community and cultural characteristics are their own sorting functions. The older, really unsuccessful formulation stressed ‘cognitive ability’ and imagined a choice function identifying a cognitive elite using IQ as the governor. The better working model accepts more of reality, that this idealized sorting hat exists only in Harry Potter. But he would have done well in Slytherin.*

      I had an interesting exposure to this in recent years because I gave a kid some help at a charter school. This school has an odd mission: to identify smart kids who have passed through the schools without learning enough, and then give them a year and a half of extra work, sort of a gap year between middle and high school, cramming them with material they should have learned, catching them up so they can apply to better high schools. Some of these kids were really intelligent. All of them were bright. Biggest problem: they needed to have one responsible adult in their lives to be accepted … and that was simply untrue for most applicants. When I went to graduation, there was a crowd of family with each kid, but I knew their home lives had been disasters, sometimes shuffled around from grandparent to aunt to mom or worse. Made me cry, not because of any hint of hypocrisy but because they all really did want better outcomes, and they couldn’t see a way to do that.

      So, Murray removed the absurdity of IQ. Correct choice. The idea that there is a growing ‘cognitive’ divide has proven true with the rise of the knowledge/skill based tech economy. You don’t need IQ to talk about that. You don’t need race. Instead, you focus on what is different over here versus over there. People have a need to identify ‘innate’ measures which they then impose on data. To me, people invoke IQ to address questions about potential, but without recognizing they are judging other societies and groups as having lower absolute human potential.

      *He would have done well in Slytherin because he would still be Harry Potter. The story would come out to the same end, the destruction of Voldemort and his ambitions, but along other pathways. That’s the difference between stories and real lives.

      • “The idea that there is a growing ‘cognitive’ divide has proven true with the rise of the knowledge/skill based tech economy. You don’t need IQ to talk about that. You don’t need race. Instead, you focus on what is different over here versus over there. People have a need to identify ‘innate’ measures which they then impose on data. To me, people invoke IQ to address questions about potential, but without recognizing they are judging other societies and groups as having lower absolute human potential.”

        Good points.

  2. I won’t go anywhere near the central conceptual and methodological issues in this paper, but I do want to bring attention to the question of imputations. This has troubled me for a long time, since I’ve done a lot of UN work where missing observation problems were rife. Imputation was a big part of the overall job.

    When I first started in the early 00’s it was common in a lot of UN work to impute by geographic proximity. I was horrified by this on multiple grounds, one of which is that, by construction, you are constraining the ex post data set to have the same range of values on the missing obs variables as the ex ante. When a big chunk of your data have imputations, this is serious.

    So of course I used a regression approach. The way I usually did it was to come up with a rough-and-ready model of what might influence the variable to be imputed, find the needed data somewhere, and then play around until I found a specific implementation that was relatively high in explanatory power but also had sensible outcomes, to combat overfitting. I would hang on to the rejected models, however, to get a range of possible imputations. Then, in the “sensitivity analysis” appendix to the report, I would use the full set of possible imputations to construct a range of potential outcomes for the variables of interest (for which the imputed variable was an input) and report bookend values. I’m not sure my approach was ideal — I would welcome second-guessing — but it was successful “politically” in the sense that consumers (govts, NGOs) came away thinking that the reports were being honest with them.
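
The bookend workflow described above can be sketched in a few lines (toy data and toy models throughout, not the actual UN procedure): fit several plausible imputation regressions, keep them all, and report the range they imply for the aggregate of interest.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# toy "country-level" data: two covariates and a target with missing values
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=0.8, size=n)
y = 2.0 + 1.5 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=n)
obs = rng.random(n) > 0.3            # roughly 70% of units observed
miss = ~obs

def impute(X):
    """Fit a linear regression on observed rows, predict everywhere."""
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    return A @ beta

# keep several candidate models rather than discarding the rejects
candidates = [impute(x1[:, None]),
              impute(x2[:, None]),
              impute(np.column_stack([x1, x2]))]

# bookend values: per missing unit, min and max across candidate imputations
imp = np.column_stack([c[miss] for c in candidates])
y_lo, y_hi = y.copy(), y.copy()
y_lo[miss], y_hi[miss] = imp.min(axis=1), imp.max(axis=1)

print(f"aggregate of interest (mean y): {y_lo.mean():.2f} to {y_hi.mean():.2f}")
```

Reporting the interval rather than a single imputed point is one way to communicate the "sponginess" that extensive imputation introduces.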

    During those years I would read from time to time in the literature on recommended imputation procedures, but it seemed to me that not much thought had been given to first principles; it was very cookbook-y. I suppose one of the motivating questions for me remains this: in a lot of applied work where there is high public interest in outcome estimates that pertain to units for which observations are missing, imputation can play a large role. Are there boundaries (limits on how much of this to do) that researchers need to respect, even if it means giving bad news to their sponsors? And how do we communicate the sponginess that enters in as a result of extensive imputation?

    I realize I’m off on a tangent here, but I was struck by the adoption by the authors of the Psy Sci piece of the “bad old” geographical approach. (“Also problematic is the decision to impute missing values for some nations based on data from neighboring nations. Lynn and colleagues’ work has received a lot of criticism from within academia, as well as from various online sources. I find this criticism compelling, but I am curious about your thoughts on how researchers ought to impute missing values for large datasets.”)

    • Peter:

      I think part of the problem is that the authors have a racial essentialist attitude, which from the U.S. perspective can manifest itself as thinking of faraway countries, or even continents, as being homogeneous. Impute one African with another African, whatever, that’s the attitude.

      • That’s exactly right, that’s one of the huge issues with this paper. At least one of the authors, Bo Winegard, made a “name” for himself by pushing for the idea that differences in average “IQ” between “races” are due partially (but substantially) to genetic differences. (Of course, naturally he believes that IQ measurement itself is solid and meaningful, both for individuals and the averages for populations/”races”/countries…lots of issues here.) He’s argued this on numerous podcasts (many with the lead author, Cory Clark), on Twitter, in various Quillette articles, etc.

        Until recently he was pre-tenure faculty at a college in Georgia I think, but his contract was apparently not renewed after this term. So of course he’s an alt-right martyr etc. now.

        He’s also a big fan of evolutionary psychology, pushes back on the numerous criticisms of that field, and has uncritically put forward the “Cold Winters Hypothesis” as an explanation for racial IQ differences (basically, this is the idea that Africa is warm and easy living, whereas living up north is hard and requires higher IQ to plan and get through the winter, or something — it’s an idea popular with white supremacists for 100+ years I think).

        Basically, I think the Clark et al. article was/is an attempt to legitimize the “national IQ” dataset they rely on, which is derived from an online compilation by some race-IQ fan that attempts to update Richard Lynn’s total-crap national IQ data from decades past. None of the very obvious criticisms of this dataset that could be made are in the Clark et al. paper.

        Some clarifications to the Opening Post:

        1. This isn’t a “stand” by the Association for Psychological Science on criminal justice reform or anything else. It’s just an article published in their journal.

        2. Here’s the article reference, the article was already out in January 2020:

        Cory J. Clark, Bo M. Winegard, Jordan Beardslee, Roy F. Baumeister, and Azim F. Shariff (2020). Declines in Religiosity Predict Increases in Violent Crime—but Not Among Countries With Relatively High Average IQ. Psychological Science, 1–14. http://dx.doi.org/10.1177/0956797619897915

        • Nick:

          When I said it was a hard stand by the Association for Psychological Science, I was joking. What I really meant was that I’m pretty sure that most of the people who run the Association for Psychological Science are very politically liberal and would not have wanted an article with this political content to have been published in their top journal. So how did the paper get published? I’m guessing some mixture of (a) connections (Baumeister is a big deal in academic psychology, and it’s my impression that the APS has been swayed by such factors in the past), (b) carelessness (maybe nobody really read the paper carefully except the referees, who may have been friends of the authors, people who work in the same subfield, etc.), (c) desire for worldly relevance, and (d) general support for this sort of theory (“psychological” explanations for social phenomena) without realizing exactly what the paper was claiming.

        • Ah! I missed the joke, sorry. I thought maybe it was a summary that your correspondent had said or something.

          I agree with your guesses about how it got published. That plus the generic persistence of IQ & psychometrics in Psychology departments despite the field seeming to be in stasis for decades (my uninformed opinion) and never advancing much beyond linear models as a method.

        • Maybe it was published because it’s, overall, a better explanation for what we’ve been watching on the news every night recently than is the conventional wisdom?

        • Steve:

          If the paper had just presented its data analysis (which I think has weaknesses), then, sure, there might be some value in sharing it, even if flawed (see discussion above). But the leap from the empirics to the political conclusions is just ridiculous. What surprises me is not that a paper published in Psychological Science makes a huge and unsupported leap from shaky empirics to a strong conclusion—that happens all the time! What was notable in this case was that the leap was in a right-wing rather than a left-wing direction.

          And, yes, in all cases the argument is made that the big leap is a better explanation than the conventional wisdom. A bold, controversial theory advertised as being a better explanation than the conventional wisdom: that describes Freudian theory, implicit bias theory, embodied cognition, all sorts of things. It’s the common element in so many theories in psychology.

        • The framework I often find useful for thinking about The Sixties is as the Smart Liberation era. Traditional pre-1960s society came with a lot of rules that generally assumed that most members of society weren’t either very educated or very smart, so rules were phrased in simplistic fashions and justified in simplistic ways such as being the Word of God. For a long time, small numbers of very bright people chafed under The Rules, such as Bertrand Russell’s struggles with the Rule against trading in your wife for a younger model every decade or two.

          The growth of higher education meant that more people were educated, and as Herrnstein and Murray said, the top colleges could now use the SAT test to find the smartest kids from across the country, thus creating more critical masses of the intelligent and educated.

          So, a lot of rules got changed around the 1960s, such as divorce and non-marital activities becoming more accepted. The educated and intelligent who’d been demanding these changes in the Rules, then more or less went back to arranging their own affairs much like people did before the Sixties: two parent married families, high investment in offspring, etc etc. Smart Lib was invented by the smart for the good of the smart, and it has mostly worked out pretty well for the smart.

          For people lower down on the intelligence scale, however, family life got more chaotic than before The Sixties. Smart Lib hasn’t been so good for the non-Smart.

      • Come on, this is an exceedingly weak, but exceedingly telling objection: we are policing for the “right” attitudes.

        Africa can be homogeneous with respect to IQ — and not homogeneous with respect to height or language.

        What positive evidence is there that “a racial essentialist attitude” is actually a problem — in the relevant sense and for answering relevant questions?

        Does it look like a problem?

        Does Africa look like an oh-so-remarkably rich and varied patchwork of cognitive ability?

        Because as everyone knows, you can’t impute one African country for another, right? Some areas are making remarkable strides in analytic number theory. Others sprout semiconductor foundries. Interspersed with a few mired in poverty, violence, and disease, a prosperous number are on the brink of commercial thorium reactors.

        Indeed there is an African Lost Tribe of Israel, numbering about .002% of the population, who routinely produced scientific genius the equal of Schwinger, Von Neumann, and Mandelbrot. They routinely compose about 33% of Africa’s billionaires.

        Look, “racial essentialists” have no problem thinking of faraway variation in height — in Africa or Latin America. Or in IQ between Sephardi and Ashkenazi. Or Indian castes. Weird. Maybe that’s because it actually looks that way and has a prayer of turning out to actually be that way.

        Africa is always too, too Diverse — you’ll never have the permissible computational resources to come to any conclusion about it.

        Yet paradoxically, Africa is homogeneously enduring the wounds of historical racial oppression. Africa and her diaspora are *essentially* oppressed. That’s the only thing we know for certain!

        • > Come on, this is an exceedingly weak, but exceedingly telling objection: we are policing for the “right” attitudes.

          What are you talking about? This is policing for methodological correctness. How is ragging on shaky imputations policing for attitudes? If you just use naive geographic imputations as described, your imputations CANNOT fall outside the range of observed values. The problem with this is obvious: the data are not MCAR (missing completely at random); the data you do have are almost certainly biased toward locations with more resources. To take this naive imputation to an extreme: applied to South America, you would impute that a nation of uncontacted peoples has a population at least as high as that of the least populous South American nation with a census bureau.
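
The range-restriction problem is easy to demonstrate with a simulation (toy numbers, not the actual national-IQ data): when units are missing precisely where the outcome is lowest, donor-based "neighbor" imputation can never reach the true values.

```python
import numpy as np

rng = np.random.default_rng(2)

# outcome rises with a covariate; data are missing exactly where it is lowest,
# i.e. the missingness is not MCAR (all numbers invented)
x = np.linspace(0, 10, 100)
y = 3.0 * x + rng.normal(scale=2.0, size=100)
observed = x > 4                      # low-x units never get measured

obs_idx = np.flatnonzero(observed)

def nearest_donor(i):
    """Copy the value of the nearest observed unit (naive neighbor imputation)."""
    return y[obs_idx[np.argmin(np.abs(obs_idx - i))]]

imputed = np.array([nearest_donor(i) for i in np.flatnonzero(~observed)])

# donor-based imputations are trapped inside the observed range by construction,
# while the true missing values lie well below it
print("lowest imputed value:", imputed.min())
print("lowest observed value:", y[observed].min())
print("lowest true (unobserved) value:", y[~observed].min())
```

Every imputed value is literally one of the observed values, so the imputed distribution cannot extend below the observed minimum no matter how far the truth does.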

          > What positive evidence is there that “a racial essentialist attitude” is actually a problem — in the relevant sense and for answering relevant questions?

          What evidence is there that implicitly making obviously very wrong assumptions in the data analysis distorts your results?

  3. What a great idea, to average over a society! It’s like sitting down to a meal starting with an aperitif, a Caesar salad with a fine vinaigrette sauce, shrimp linguine, sauteed brussels sprouts, garlic roasted potatoes, a fine red Bordeaux, a dessert of ice cream topped with strawberries and chocolate sauce, and some port to top it off — and, instead of eating it course by course, putting it all in a blender and drinking it. After all, it all goes into the same stomach. And then you can compare the caloric value of different meals, controlling for how much alcohol or chocolate is in each meal.

    And for some reason, they left off any religions of the Asian civilizations. China’s goes back a few thousand years earlier than Judaism.

    • As a Chinese person, I have to admit that I don’t know of a Chinese religion (that still exists) that goes back a few thousand years earlier than Judaism.

      • Divination, such as the I Ching, counts as religion in my judgment. Religion and medicine have a powerful overlap, so Yin-Yang, the Five Elements, the School of Naturalists, and practices such as feng shui count as well. They are quite ancient. Daoist thought probably precedes the Dao De Jing (4th century BC), which is close in time to the origins of Judaism as the state religion of Judah. The practices referred to as ancestor worship were popular religion that appears to be very ancient indeed.

        The Babylonian Talmud and the Palestinian, which might be argued to be normative in contemporary Judaism, were assembled in the Common Era. Most religions that do not exalt a founder claim to be ancient, but it is foolhardy I think to accept their claims at face value.

      • Or even: putting ancient Chinese rituals and belief systems in the same category as Judaism is problematic. But this relates to the problem of averaging over a society. We’re pretending our categories are reality and not just convenient ways of summarizing information, which works until we take them too seriously.

    • “What a great idea, to average over a society!”

      Let’s also think of other cases where we average over a group — for example, a clinical trial. Often the statistical analysis compares the *average* of the outcomes in the treatment group with the *average* of the outcomes in the comparison group. If the average for the treatment group is larger than the average for the comparison group, then the treatment is considered “effective” (or “empirically supported”). But this is just “better on average” — and it is possible that when the average outcome in the treatment group is larger than the average outcome in the comparison group, some people in the treatment group might have gotten worse, while some people in the comparison group might have improved in outcome. So what are the ethics of comparing group averages in this situation?
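
The point about averages hiding individual harm is easy to illustrate with simulated change scores (all numbers invented, not from any real trial): the treatment is "effective" on average even though plenty of treated individuals get worse.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# simulated change scores: positive means improvement (toy numbers)
control = rng.normal(loc=0.0, scale=1.0, size=n)
treated = rng.normal(loc=0.5, scale=1.0, size=n)

# the trial verdict rests on a comparison of group averages...
print(f"mean change, treated: {treated.mean():.2f}")
print(f"mean change, control: {control.mean():.2f}")

# ...yet many treated individuals got worse, and many controls improved
print("treated who got worse:", int((treated < 0).sum()))
print("controls who improved:", int((control > 0).sum()))
```

With these toy parameters, roughly a third of the treated group declines even though the group-level comparison clearly favors the treatment.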

  4. > nobody from the journal ever went to the trouble of reading the article from beginning to end (I know I didn’t enjoy the task!)

    As much as we debate interesting technical, logical, and scientific issues (and I, for one, learn much from those debates), I think this is the primary reason why cruddy research gets published.

    My worst experiences as a reviewer have been with manuscripts by big names that are nothing more than clumps of words tenuously held by gossamer strands of illogic that dissolve at the slightest thought.

    My guess is that other reviewers do one of two things when confronted by something like this: First, they skim the paper to ensure that it has the right form and most of the right words to be science and assume because it has a noteworthy author that the content is sound. Second, they do read the paper but assume that the headaches it induces are their own fault for lack of understanding, rather than the result of deep problems with a manuscript with a famous co-author.

    Either way, it’s a “deference trap” that serves to perpetuate crummy work, and it ain’t fun!

    • “Second, they do read the paper but assume that the headaches it induces are their own fault for lack of understanding..”

      When I was younger I had several instances of not being able to figure something out in a paper and presuming it was because I just didn’t get it, only to find out later that I couldn’t figure it out because it was wrong! Now my attitude is that it’s the author’s job to make their point and its relationship to data and previous work expressly clear; if they can’t do that, they probably don’t get it themselves and it’s probably wrong.

      • “Now my attitude is that it’s the author’s job to make their point and its relationship to data and previous work expressly clear; if they can’t do that, they probably don’t get it themselves and it’s probably wrong.”

        May your tribe increase!

    • Sentinel:

      Someone can be a serious scientist and still speak for pay, no?

      What interests me is the attitude of the academic psychology establishment. On the plus side, Baumeister is one of them, he gets the sort of media exposure that they love, he publishes in Psychological Science, he traffics in catchy theories that are supported by the trappings of science, and his popularity is the same sort of popularity that they crave. On the minus side, his theories are kinda right-wing.

      I’m guessing that the way the academic psychology establishment will handle this is to celebrate Baumeister’s worldly success and just look away from the content of his theories.

      • > Someone can be a serious scientist and still speak for pay, no?

        Only someone special like you, Andrew. For the rest of us human scientists, it is likely much more difficult.

        I find it hard to see how taking money from a company to fund your research is very different than actively seeking money for speaking gigs in its potential to compromise one’s work and integrity. The corrupting influence of the profit motive should not be underestimated. Of course, the irony here is that, according to the Association for Psychological Science, a serious social psychologist like Baumeister should be giving psychological science away!

        I wouldn’t be too quick to assume that the field will look the other way. When he first started putting out the ideas that became the basis for his men’s movement masterpiece, ‘Is there anything good about men?’ he did get some pretty pointed blowback from the field. I believe this is why those ideas ended up in a book and not a peer-reviewed journal. But hey, he also wrote a book about masochism, so maybe he enjoys abuse.

        • Sentinel:

          One reason I posted this is that I don’t want the field to look the other way. I want them to own their choices! If they want to publish this sort of work, fine, that’s their choice. They should just acknowledge their choices.

        • P.S. Over the years, I’ve taken and spent many millions of dollars of other people’s money for research, so I’ve had many occasions to reflect on the corrupting power of money. I could imagine a hypothetical world in which I’d never accepted $ for speaking engagements, consulting, or private or government research support (the last of these represents the vast majority of my total outside funding), just taking my salary as a professor and nothing else. I don’t think my work would’ve been much different, but I think I would’ve gotten a lot less done. Most of this difference comes from Stan, which has been funded by several government and private grants. Other than Stan, I don’t know how much effect the funding has had. It’s allowed me to pay many students and postdocs to work with me, and they’ve done great work—but in the absence of the funding, I would’ve done much of this work on my own, and maybe others would’ve worked on these projects without me needing to pay them. Overall, I think it’s still been a plus. For one thing, if these students and postdocs had been on other people’s grants instead of mine, they might have worked on less important projects that they were funded to do. Also, Stan does count; it’s a big deal all on its own. Finally, the general system of funding scientists is a way for the university to be supported: I could decide to refuse grant funding myself, but if nobody at the university took grant funding, maybe they would not have been able to hire me in the first place.

  5. As near as I can tell, the academy as a whole is shifting to the right. Colleges and universities function to reproduce acceptable thought, and what is acceptable is becoming more conservative.

  6. > And, no, it’s not “bullying” or an “ad hominem attack” to consider someone’s research record when evaluating his published claims.

    Narrowly, it is an ad hominem fallacy, isn’t it? I don’t think just saying “no” washes away logical fallacies. However, with that in mind, I think it’s okay to say that a person’s history modifies Bayesian priors about the reliability of their methods, data, etc. Personally, it modifies my probability of wanting to invest time in something.

    I use the ad hominem fallacy with pride to save my precious time and that of others, but I think it’s important to be self-aware of what one is doing when doing so.

    • Country:

      No, I’m not making an ad hominem fallacy.

      From wikipedia, an ad hominem fallacy typically “refers to a fallacious argumentative strategy whereby genuine discussion of the topic at hand is avoided by instead attacking the character, motive, or other attribute of the person making the argument, or persons associated with the argument, rather than attacking the substance of the argument itself. ”

      In the above post, my correspondent and I discussed problems with the published paper and also discussed the background of the author. I don’t think it’s ad hominem if such discussion comes in addition to, rather than instead of, discussion of the particular material at hand.

      • Yes, the non-hominem part of your argument was valid, but what’s the purpose of adding the hominem part if it’s not required for the argument? (As mentioned, I think it’s perfectly fine to use ad hominem outside of a specific counter-argument.)

        In my brain, it’s something like this:

        1. Baumeister’s paper P claims X.
        2. X is false because of Y.
        3. Therefore, P is wrong.
        4. Although this is not needed to show #2, I don’t like Baumeister because of Z.

        Call it argumentum ad-superfluousum if you’d like, but I think the basic heuristic of the phrase “ad hominem” covers it.

        • Country:

          I think it’s useful for readers to see where I’m coming from, also I do think it’s relevant in a Bayesian sense: if someone has a track record of doing bad work, this provides information that can help us judge the new work.

          The flaw in your steps 1, 2, 3, 4 is that it implies that each step is deterministic. In real life, papers rarely make specific claims that are strictly true or false, papers are not simply right or wrong, and my liking or disliking of someone is on a continuous, multidimensional scale. In my Bayesian world, we can get some information from the paper at hand, and we can also get base-rate information from other work by the author.
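
The base-rate idea can be written as a two-line Bayes update (all probabilities invented for illustration): the same evidence from the paper at hand lands on different posteriors depending on the prior established by the author's track record.

```python
def posterior(prior, likelihood_ratio):
    """Bayes update on the odds scale: posterior odds = prior odds * LR."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# invented numbers: the paper's evidence looks weak, so reading it
# carries a likelihood ratio below 1
lr = 0.3

p_neutral_record = 0.5   # prior that a new paper is sound, no track record
p_bad_record = 0.1       # prior after a long record of shaky work

print(f"posterior, neutral record: {posterior(p_neutral_record, lr):.3f}")
print(f"posterior, bad record:     {posterior(p_bad_record, lr):.3f}")
```

Neither posterior is 0 or 1: the track record shifts probabilities rather than delivering a deterministic verdict, which is the point about steps 1 through 4 above.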

    • ACF: It’s ad hominem, just not a fallacy. Ad hominem is “Brian Wansink has published 99 terrible papers, so I’m going to be skeptical of this 100th paper of his”. Fallacious would be “Brian Wansink says that the sun is shining, so I’m going to assume it isn’t”. The fallacy arises from extrapolating beyond the data (in this case, our knowledge that his papers are terrible) to assume that anything else about the person is more likely to be as bad as well. Arguably with certain current world leaders we are close to the point where no ad hominem fallacy is possible…

  7. “National IQ”?!?!?!?!?!?!?!? Really?

    It’s not like the cultural aspects of IQ measures are news. Why, in living memory, psychologists referred to certain groups with relatively low IQ scores as “culturally disadvantaged”. Along those lines, let me report a bit of research that I heard about second hand in the 1960s. Inner city kids were known to score lower than average on analogies tests of the form, “A is to B as C is to __”. However, in the research the form was changed to “A goes with B like C goes with __”. In that form the inner city kids’ scores were no longer distinguishable from those of other kids.

    • It strikes me that the idea of a national IQ is a logical absurdity.

      The reason is that it treats IQ as a cultural artifact.

      Which, if you accept that, of course means that you don’t actually believe in the construct of IQ.

        • Remember, “Ein Volk, Ein Reich!”

          Well, we’re probably both too young to actually remember that, but we’ve heard of it. And that kind of idea was not confined to Nazi Germany. One of the causes of WWI was pan-Slavism, which motivated the Black Hand, the group that assassinated Archduke Franz Ferdinand of the Austro-Hungarian Empire. Even today, what about the Kurds, who live in different nations with boundaries drawn long ago by Europeans? Should there be a Kurdistan?

          And today in the US we have people who talk about “real Americans,” which they understand genetically. Some people even regard “Western Civilization” genetically.

    • Bill said,
      “Along those lines, let me report a bit of research that I heard about second hand in the 1960s. Inner city kids were known to score lower than average on analogies tests of the form, “A is to B as C is to __”. However, in the research the form was changed to “A goes with B like C goes with __”. In that form the inner city kids’ scores were no longer distinguishable from those of other kids.”

      Makes sense — the original “test” was for vocabulary, not for the concept. This is why having a variety of people critiquing tentative questions is important. One case that sticks in my mind involves trying to develop a concise questionnaire to detect psychiatric conditions. The question, “Sometimes I see things that other people don’t see, or hear things that other people don’t hear” (with choice of response “yes” or “no”) was suggested to detect schizophrenia. But when the question was given to “test” subjects, quite a number answered “yes”, and when asked to elaborate, said things like, “I have very good eyesight and often see details that others don’t see,” or “I have really good hearing, and often hear things that others don’t hear”. (So, of course, that question was scratched from the list!)

      • “Surely You’re Joking, Mr. Feynman!” tells of the time when Feynman was asked questions by a psychiatrist when he appeared for the draft. As I recall, he answered that, yes, he had auditory hallucinations. Of course, it was plain that he was talking about hypnagogic hallucinations, which are quite common and not indicative of a mental disorder. He also answered that, yes, he often thought that people were looking at him. Feynman then told the psychiatrist that he believed that some of the other young men waiting against the wall were looking at him. Feynman turned around, and reported that indeed, two or three of them were doing so. The psychiatrist did not look up to check. ;)

  8. It is theoretically possible that something Baumeister says is true, of course. But his record of ridiculous and offensive statements can be held against him, because that is the rule of life. I wrote about an insanely sexist rant he co-authored here:

    https://familyinequality.wordpress.com/2014/02/24/price-of-sex/

    He wrote, with Kathleen Vohs:

    “Why have men acquiesced so much in giving women the upper hand in society’s institutions? It falls to men to create society (because women almost never create large organizations or cultural systems). It seems foolish and self-defeating for men then to meekly surrender advantageous treatment in all these institutions to women. … Because of women’s lesser motivation and ambition, they will likely never equal men in achievement, and their lesser attainment is politically taken as evidence of the need to continue and possibly increase preferential treatment for them.”

    And:

    “We speculate that today’s young men may be exceptionally ill prepared for a lifetime of sexual starvation that is the lot of many modern husbands. The traditional view that a wife should sexually satisfy her husband regardless of her own lack of desire has been eroded if not demolished by feminist ideology that has encouraged wives to expect husbands to wait patiently until the wife actually desires sex, with the result that marriage is a prolonged episode of sexual starvation for the husband. … Today’s young men spend their young adulthood having abundant sex with multiple partners, and that seems to us to be an exceptionally poor preparation for a lifetime of sexual starvation.”

    • Philip:

      Wow! Interesting that the academic psychology establishment, even though they are so overwhelmingly left-wing, is so eager to publish work that is both of low statistical quality and also has right-wing ideology. I guess that the academic psychology establishment’s commitment to a shallow take on evolutionary psychology, along with its valuing of personal connections and fame, is more important to them than politics.

      Next time someone accuses academic psychology of being far left, we can just point to their willingness to publish the scientifically mediocre (at best) work of Baumeister.

    • Philip

      I’m not sure you’re sufficiently aware that empirical statements and claims can never be sexist (or racist, for that matter), because there’s always a chance they turn out to be true, and then you find yourself in the unenviable position of having to call an empirical fact sexist. I guess at that point you’re within shouting distance of those truly insane people that would have you banned from Twitter for saying “Men and women are different.”

      Looking at your quotes of Baumeister & Vohs, I only see empirical claims. They might all be false. They might also be true. Hence, they’re not sexist. A sexist statement would be something like “men are inherently inferior to women” or “men suck qua being men”.

      • Alex:

        See my discussion here. Racism is a framework, not a theory. What is racist is not the empirical fact (or the empirical error) but the perspective. Someone working within a racist perspective can work with a mixture of empirical facts, empirical errors, and non-empirical statements of the form, “It falls to men to create society.”

        • Andrew

          I have to admit that in light of recent world events I increasingly struggle to wrap my head around the concepts of sexism and racism, in particular the issues of how to define them and what precisely their harmfulness is seen to consist of. Even though I gave examples of sexist statements in my own previous comment, I’m not entirely sure what to think of them.

          The way you talked about racism in your critique of Wade’s book doesn’t give me a particularly strong sense that you think racism is morally bad. You call it a “way of thinking about the world”, and admit that you “don’t think racist beliefs are necessarily false”. Can you confirm/clarify?

          Then, does your view imply that someone like Wade does not merely believe honestly that there are facts about genetic differences between races (or “races”) and that these differences cause the superiority in some traits of some racial groups over others? If not, what then does the charge or even just the label of “racism” imply that it does not imply in cases where you wouldn’t use morally charged vocabulary? What, for example, is the categorical difference between the “racism” of Nicholas Wade and the “gene selectionism” of Richard Dawkins?

        • Alex:

          I don’t think my particular take on what is “morally bad” is so relevant. As I wrote before, I don’t think that racist beliefs are necessarily false. Racism is a way of thinking about the world; it’s not a collection of empirical statements, true or false. A racist worldview can mix together various statements that are true, false, and neither true nor false (because they’re poorly defined, or because they’re opinions). For example, Nicholas Wade spends a lot of time explaining, or trying to explain, economic differences between countries whose residents have different ethnic/racial mixes. Economic differences between countries are real.

          Regarding your question about Wade: I have no idea what he honestly believes or doesn’t believe. I can just go by what he wrote. I explained in my review of his book why I characterize it as racist. I’ve not read the work of Dawkins carefully. But, in any case, there are degrees of racism just as there are degrees of ideologies more generally. There’s no bright line. By saying that Wade has racist views, that doesn’t mean I’m equating him with Jesse Helms or Andrew Johnson.

      • Alex –

        > I’m not sure you’re sufficiently aware that empirical statements and claims can never be sexist (or racist, for that matter), because there’s always a chance they turn out to be true, and then you find yourself in the unenviable position of having to call an empirical fact sexist.

        What if they are false, and their falseness is a function of facile reasoning which is “motivated” by sexist or racist prejudices? History is replete with examples.

        • Joshua

          I’ve tried and failed to understand how your statement is a challenge to what I’ve said, but maybe it’s not meant to be?

          Of course, what you describe is something we should want to avoid, but there’s no way around actually making the case that something is empirically false and therefore a prejudice. Certainly, you can’t let your sense of what is morally desirable determine your judgement on the empirical truth or otherwise of a claim.

        • Alex –

          > but there’s no way around actually making the case that something is empirically false and therefore a prejudice.

          But that’s not what I was suggesting. What I was describing is a very common phenomenon: that an empirical conclusion is false because it was rooted in a misperception born of bias, or prejudice, and in some cases racism.

          In that sense it is false because it is racist.

          > Certainly, you can’t let your sense of what is morally desirable determine your judgement on the empirical truth or otherwise of a claim.

          That’s not what I’m suggesting. What I’m describing doesn’t in any way have to interact with one’s interpretation of morality.

  9. Personally, I think the way everyone seems to hit the roof when someone postulates group differences in IQ, let alone tries to draw conclusions from that, tells us a lot more about our society and its value system than about IQ.

    – When exactly did we (we academics?) decide that intelligence is a measure of a person’s intrinsic worth to the extent that any attempt to identify IQ differences between groups is seen as an attempt to dehumanise some group or devalue its members as human beings? Does anyone really think that someone (or a group) with an (average) IQ of 115 is somehow less human than someone (or a group) with an (average) IQ of 120? Because that’s the toxic assumption that seems to underlie much IQ anxiety.

    – When exactly did IQ overtake, say, physical strength as the dominant social status marker in individuals in our society? Presumably, that arrangement benefited some while disadvantaging others.

    – How did we convince ourselves that a society largely stratified by IQ – something that is partly inherited, partly acquired through social background – is more meritocratic than a society based on any other factor? It’s not as if people with a high IQ worked hard to get it.

    – People with higher IQ have higher status in Western societies, but so do taller people. Why is it OK to postulate that there are group differences in average height, but not group differences in average intelligence?

    BTW, anyone seriously interested in the subject of IQ should read Flynn’s brilliant book “What Is Intelligence?”. It’s dense but immensely rewarding.

    • Till:

      I don’t think there is something that “we (we academics?) decide” as a group. Baumeister and the editors of Psychological Science are academics, and they published a paper that is pretty much junk, of interest primarily because of their ridiculous extrapolations to strong political claims. But I’m another academic, and I think this paper is junk. So, even if Baumeister etc. have decided that intelligence is a measure of a person’s intrinsic worth etc., that’s just them (and the editors of Psychological Science), it’s not me or many other academics!

    • Ivy League colleges in the early 20th Century had mixed feelings about admitting students on test scores versus character, bloodlines, athletic ability, etc. By the early 1920s, Jews were coming so strongly to dominate the top test scores that most of the Ivy League put soft caps on the percentage of Jews in the freshman class.

      The demands for cognitive talent of WWII and the Cold War brought the Ivy League around again on this question of IQ vs. other characteristics, with Harvard getting rid of its limits on Jewish students in the 1950s and Yale in 1965.

      Now, however, with Asian test scores pulling away from everybody else, Harvard imposes a soft cap on the percentage of Asian students it admits, much like the WASPs tried to keep down the number of Jews about 95 years ago.

      • This is a very interesting point. Thank you.

        I am also reminded how the introduction of IQ tests in the UK by Sir Cyril Burt dramatically increased the representation of working class boys in preparatory schools.

        If present trends continue, it seems we’re looking at a classic situation of “Those who forget history are doomed to repeat it”.

        • Indeed.

          My contrarian theory of why support for use of the SAT and ACT is collapsing is this:

          While everybody claims they’ve turned against the SAT and ACT because they are so biased against blacks and Hispanics and in favor of whites, in reality, very little has changed over the decades in terms of those particular racial gaps. Instead, the racial gap that has gotten much bigger in the 21st Century is that Asians are leaving whites in the dust. The average Asian advantage on the SAT over whites has expanded from about 10 points in 2000 to about 100 points in 2019, half a standard deviation.

          As upper middle class white families increasingly lose the test score struggle with their Asian rivals, they are increasingly giving up supporting testing in order to take away their Asian rivals’ best weapon in the college admission race.

        • As an Asian 100% in favor of affirmative action, some part of me hopes that the Abigail Fishers of the world get what they think they want, just to see them realize they never wanted it in the first place.

        • I clicked that link, and over on the right side of the page there was a “Featured Book / HTML Books” section, which linked to “The Hoax of the Twentieth Century”.

          From https://en.wikipedia.org/wiki/The_Hoax_of_the_Twentieth_Century: “Butz argues that Nazi Germany did not exterminate millions of Jews using homicidal gas chambers during World War II but that the Holocaust was a propaganda hoax.” I assume since this was published in 1976 there would have been time for someone to add a counter-argument to the Wikipedia article if there was one.

          I guess there’s already a discussion of what an ad hominem attack is or is not above.

          I’m sure the UC administration is totally capable of making the (right/wrong) decisions for the (right/wrong) reasons, but I have a hard time caring about the academic quality of an article you wrote on a website promoting Holocaust denial.

        • Ben:

          The holocaust denial thing is just horrible. It could be instructive to read the sociology or anthropology literature on the topic. One thing that puzzles me is the way that holocaust denial and racism go together. On one hand, it makes sense: the Nazis were the ultimate racist organization, so if you’re a racist, you’ll want to think good things about them. On the other hand, if you look at things from a “race science” perspective then I’d think the reasoning would go as follows: race explains all sorts of things about the world, so of course people are personally racist (I think racists are ok with the idea that most people have an aversion to other races), so from that perspective it shouldn’t seem so surprising that genocide happens from time to time, when a dominant group has the opportunity to do so and when for reasons of war or whatever it’s hard for sane people to get together and stop it. That is: I’d think that if someone held racial-essentialist scientific beliefs and a racist political ideology, that genocide would seem completely natural to them (just an extreme version of other common race-based policies such as discrimination, slavery, and apartheid). So it just seems illogical for these people to go for holocaust denial. It would be like an astrologer also being a flat-earther.

          I mean, sure, from a historical point of view I understand that racism, like other marginalized ideologies, gets mixed with conspiracy theory, and holocaust denial is just one more version of anti-Jewish conspiracy theorizing. Also there’s the idea that holding an extremely unpopular view is a commitment device. If that website advertises holocaust denial, they’re sending a signal to their readers that they’re all-in on racism. Indeed, it’s a costly signal as it also conveys to outsiders the message that the site can’t be trusted. So, yeah, I can understand how race science and holocaust denial can go together. From a logical standpoint, though, it strikes me as a bit off.

        • > That is: I’d think that if someone held racial-essentialist scientific beliefs and a racist political ideology, that genocide would seem completely natural to them (just an extreme version of other common race-based policies such as discrimination, slavery, and apartheid). So it just seems illogical for these people to go for holocaust denial. It would be like an astrologer also being a flat-earther…From a logical standpoint, though, it strikes me as a bit off.

          The logic of fascism and fascism-adjacent viewpoints is not to establish truth, but to build a strong coalition. In that sense, it’s perfectly logical to deny the holocaust because the holocaust, to put it mildly, makes fascism look unattractive to most people. It’s logical from the perspective of recruitment, not history. That’s why you get high profile holocaust deniers who also appeared at Charlottesville chanting “gas the Jews, race war now.”

          It reminds me of the Nazis’ “Aryanized” mathematics. Dismissing a “Jewish” formalism of mathematics is obviously illogical in the most literal possible sense. The only thing that mattered was building a sense of consolidated German science and superiority.

        • Steve –

          > My contrarian theory of why support for use of the SAT and ACT is collapsing is this:

          You paint with far too broad a brush. Opposition to standardized testing exists for a variety of reasons. My opposition primarily rests on educational principles, not merely that it is a poor criterion for college admission, and I am far from alone in that respect.

        • I think the issue he’s pointing out is that the fundamental principle of the objections has been stable over time while pushback against the tests among upper-middle-class white suburbia has grown rapidly, and the timing coincides with a wave of highly educated Asian immigrants and their high-achieving children pushing ahead of the crowd. It’s a deeply cynical take, but it’s not exclusive of genuinely high-minded educational ideals. Maybe the white parents always hated the SAT/ACT, but until it actually started getting white kids pushed out of top schools they didn’t care enough to put in the effort to campaign against it.

          Don’t know if he’s right, but it’s certainly a provocative idea.

        • somebody –

          > while pushback against for the tests amongst upper middle class white suburbia has grown rapidly, and the timing coincides with a wave of highly educated Asian immigrants and their high achieving children pushing ahead of the crowd.

          Perhaps. But I’d need to see actual longitudinal evidence in that regard. I can see a certain logic to the proposition, but I’m also reflexively skeptical of attributions of evil-doing by the powerful elites – especially on the Interwebs – unless it is accompanied by evidence.

          I have seen a lot of pushback against the SAT as a college admission criterion for a very long time. For example, FairTest (which comes from a diametric angle) has been around since 1985.

        • Joshua:

          I’m not sure how important this is, but another factor is that I think it’s easier to cheat on these exams than it used to be.

        • Andrew –

          I think it is related.

          I’m sure that some of the pushback is connected to the test prep industry, which has become a more prominent factor over time, especially in some Asian countries.

          Relatedly (which I guess is what you’re referring to), there is the enormous cheating aspect in China and Korea, which may overlap with the supposed anti-Asian flavor being referenced here in the comments.

          I don’t know how familiar people are with that phenomenon. Students would take the tests and then go online immediately afterwards and recreate the questions in a kind of databank, and then they would crowdsource the answers. I had Chinese students who told me that they knew the answers to a majority of questions after reading only a small part of each question, just enough to recognize it, and that taking the test became little more than a memory exercise.

          My understanding is that schools would discount scores from students from some countries relative to others at least partly for that reason (and because they wanted to balance the demographic composition of incoming classes).

        • Probably more important than cheating in my opinion is the drastic increase in the perceived economic importance of a college education and the proliferation of test prep. The longer the test has been around, the more historical tests there are to serve as prep resources and the more total knowledge exists of test specific strategies. So the incentives have created a situation where every year the system gets gamed a little harder by those with the resources to do so.

          Back when I took it, (when there was still an essay), it had gotten to the point where my high school teachers would tell students not to think about writing a good essay, just handing out the checklist of things to reference and saying “remember, these are tired teachers funneled into a room reading essays for the same prompt for 6 hours, they just want to be done.” I was also probably the only person I knew personally who took the exam only once and didn’t buy a book for it.

          So there’s lots of other things that seem more important than the Asian thing to me–though they aren’t unrelated either! Higher income Asian immigrants will shell out a massive sum on test prep for their kids. I suspect anyone on this blog who lives in the right area can increase their income by quitting their job and running test prep full time. If the SAT is too boring you can do AMC; the going rate for a private tutor around NYC has inched up to $100 at this point.

          Life is terrible and fairness is impossible.

      • I won’t comment on some of the things in this series of posts, as others already have pointed out some of the really objectionable stuff contained in some of the links provided. But having been at Harvard from 1967-1971, and Yale thereafter, the idea that “Harvard got rid of its limits on Jewish students in the 1950s and Yale in 1965” is bunk. Harvard, for example, just found less overt methods of doing the same thing (and not just for Jews, for Blacks also). One such was “geographic distribution”, which was based not on the school you went to, but on where you “lived”. So boarding schools like Exeter and Andover and Choate had students whose “homes” were all over, while schools like Scarsdale High (heavily Jewish) or the elite NYC public high schools (Jews and Blacks) were from the same location. “Legacy” and other such things served the same purpose. My year, there were only 2 accepted from Scarsdale High, and I think it was something like 20-some from Exeter, even though a study I am aware of at that time showed that about the top 10 at Scarsdale had better everything than perhaps the top 1-2 from Exeter. There were similarly low numbers from the elite NYC public schools. Harvard and Yale may have gotten “better” since then, but as we are even seeing now, there are many buzzwords and indirect ways of implementing prejudice.

        • “having been at Harvard from 1967-1971, and Yale thereafter, ” brings back memories of deciding in around 1965 or 1966 what graduate schools to apply to. The Princeton catalogue said explicitly, “Admission is normally limited to adult males.” Scratch that one. I also scratched Yale, since I knew a woman a year ahead of me who hadn’t been accepted at Yale, but was accepted at Chicago and Cornell. So I only applied to Chicago and Cornell.

        • OMG – someone on this blog is even older than I!!!!! But in all seriousness, as I am sure you well recall, back then the discrimination against women was pretty overt – there were barely efforts to hide it. And in just my few years in grad school, I knew of at least two cases (not in my dept. thank god but a related one) of women close to finishing their PhD, who were then drummed out of the dept. when they refused to have sex with their main professor. But let’s go back to the good old days.

          Just as a side quiz for history buffs: first, what was the year that Harvard had its first Jewish full professor, and who was it? And second, how strong was the Bund in the Harvard faculty during the 1930s?

        • “as I am sure you well recall, back then the discrimination against women was pretty overt – there were barely efforts to hide it.”

          The chair of the math department at my undergraduate university (University of Michigan, Ann Arbor) said publicly that he would never consider hiring a woman faculty member.
          There were “nepotism” rules that prevented the university from hiring wives of faculty members (including as a librarian). So a lot of wives of Michigan faculty members worked down the pike at Eastern Michigan University (known as “Ypsi”) in Ypsilanti.

        • I graduated from Harvard in 1966. I am confident in my memory that the two high schools that provided the largest number of my classmates were Phillips and Bronx Science—there may have been low numbers for other NYC elite schools—but not all. I was from a blue-collar public high school on the west coast—with very good (like good for Harvard) SAT scores. So, I think that (1) geographic distribution and (2) reasonable reliance on the SAT were (and may well still be) excellent policies.
          Anon
          PS. After college and graduate school I was unable to find a job similar to taking SAT tests.

        • I’m a mere whippersnapper by local standards (MIT ’76), but MIT was dense with Bronx Science, Stuyvesant, and Boston Latin (from where I graduated) folks. So not much geographical diversity, it seems. Nowadays, unlike Harvard, MIT has simply accepted reality, and something like 40% of the undergraduate student body is Asian-American. Last spring, at an MIT event, a kid I found myself talking to was also a Boston Latin grad, whose parents were Taiwanese. I sensed a bit of a high-academic-pressure background, and asked if his parents were both MDs. They were.

          That Asian-Americans do well on SAT/GRE sorts of things is interesting. My experience was that prep for the reading section required lots and lots and lots of reading. Which I had done. (I had finished all the SF in the Boston Public Library by sixth grade, and went on to the hippy/lefty-approved literature of the time after that. Dead white males, mostly, though.) But, presumably, many of the Asian-American kids are taking it in their second language. I’m currently in the midst of a project to bring my second language (which I worked in as a translator for 25 years) up to local college standards, and it’s not easy…

        • A late friend of mine taught history at Yale in the 1960s. He said the difference in intellectual intensity he observed in the classroom between the freshmen in 1964 (when George W. Bush was admitted) and in 1965 (when the quota on Jews was removed) was striking. Bush, for example, who had felt more or less right at home as a freshman, became increasingly alienated from the student body over his last three years at Yale.

    • anyone seriously interested in the subject of IQ should read Flynn’s brilliant book “What Is Intelligence?”. It’s dense but immensely rewarding.

      I found that book frustrating, not because it was dense but because it was discursive and poorly organized. Still, it would inform the discussion here. Among other things, it has a fairly straightforward definition of intelligence. It also has relatively clear discussions of IQ and the meaning of g.

      It’s probably best at offering an explanation for the Flynn Effect. I suppose that’s no surprise.

      • John:

        I was impressed by Flynn’s argument about the impossibility of meritocracy, which I reviewed back in 2005. That was from an article he wrote, not a book, but I’m assuming his position on this hasn’t changed. Here’s his argument:

        The case against meritocracy can be put psychologically: (a) The abolition of materialist-elitist values is a prerequisite for the abolition of inequality and privilege; (b) the persistence of materialist-elitist values is a prerequisite for class stratification based on wealth and status; (c) therefore, a class-stratified meritocracy is impossible.

        As I summarized, “Meritocracy won’t happen: the problem’s with the ‘ocracy.’” This argument seems solid to me, and I’ve been pushing it hard ever since, but it hasn’t really caught on with the pundit class, I suspect because they’re too comfortable with the “Meritocracy: good or bad?” trope. If I were Flynn, this would irritate me to no end. Some things just need to be rediscovered over and over again, in op-ed after op-ed. Another example would be “the cultural contradictions of capitalism,” which, again, pundits keep rediscovering as if it’s this new thing that came up.

        • I fear that I never found Flynn’s argument about meritocracy clear. (In your post, you quote his “sociological” framing of the argument alongside the “psychological” framing, and that does help.) This isn’t to say that he’s wrong; it’s just to say that there may be other reasons why the pundit class hasn’t taken up his argument.

          But perhaps some members of that class have taken it up. For example, here is a Times columnist:

          the meritocratic ideal ends up being just as undemocratic as the old emphasis on inheritance and tradition, and it forges an elite that has an aristocracy’s vices (privilege, insularity, arrogance) […] the meritocratic elite inevitably tends back toward aristocracy, because any definition of “merit” you choose will be easier for the children of these self-segregated meritocrats to achieve

          That seems to me to be in the spirit of Flynn’s argument.

        • John:

          Yes, people keep rediscovering the argument. I just wish that social scientists and pundits could internalize the argument so that they wouldn’t have to keep bringing it up every few years as if it’s a new idea.

        • Abstract
          In the late sixties the Canadian psychologist Laurence J. Peter advanced an apparently paradoxical principle, named since then after him, which can be summarized as follows: ‘Every new member in a hierarchical organization climbs the hierarchy until he/she reaches his/her level of maximum incompetence’. Despite its apparent unreasonableness, such a principle would realistically act in any organization where the mechanism of promotion rewards the best members and where the competence at their new level in the hierarchical structure does not depend on the competence they had at the previous level, usually because the tasks of the levels are very different to each other. Here we show, by means of agent based simulations, that if the latter two features actually hold in a given model of an organization with a hierarchical structure, then not only is the Peter principle unavoidable, but also it yields in turn a significant reduction of the global efficiency of the organization. Within a game theory-like approach, we explore different promotion strategies and we find, counterintuitively, that in order to avoid such an effect the best ways for improving the efficiency of a given organization are either to promote each time an agent at random or to promote randomly the best and the worst members in terms of competence.
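          The promotion dynamics the abstract describes can be illustrated with a toy agent-based simulation. This is a minimal sketch under assumed parameters (4 levels, 10 agents per level, competence on a 0-10 scale, levels weighted by rank), not the paper's actual model; it encodes the "Peter hypothesis" that competence at a new level is unrelated to competence at the old one:

```python
import random

def simulate(strategy, levels=4, size=10, steps=500, seed=0):
    """Toy hierarchy: global efficiency weights higher levels more."""
    rng = random.Random(seed)
    # each level holds `size` agents with competence drawn uniformly on [0, 10]
    org = [[rng.uniform(0, 10) for _ in range(size)] for _ in range(levels)]
    weights = [i + 1 for i in range(levels)]

    for _ in range(steps):
        lvl = rng.randrange(1, levels)      # a vacancy opens at this level
        slot = rng.randrange(size)
        below = org[lvl - 1]
        if strategy == "best":              # promote the top performer below
            pick = max(range(size), key=lambda j: below[j])
        else:                               # promote an agent at random
            pick = rng.randrange(size)
        # Peter hypothesis: competence at the new level is redrawn,
        # i.e. unrelated to the promoted agent's old competence
        org[lvl][slot] = rng.uniform(0, 10)
        below[pick] = rng.uniform(0, 10)    # a fresh hire fills the gap

    total = sum(w * sum(level) for w, level in zip(weights, org))
    return 100 * total / sum(w * 10 * size for w in weights)  # % of maximum
```

          Under these assumptions, promoting the best repeatedly strips the lower levels of their most competent members without transferring that competence upward, so "promote the best" ends up with lower average global efficiency than random promotion, which is the counterintuitive result the abstract reports.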

        • Joshua:

          To solve the problem of the Peter Principle, why not just demote people when they show incompetence, or give people temporary appointments so they can demonstrate their competence or incompetence?

        • Andrew –

          I think the finding that random promotion produces optimal results is the most elegant part – because of the implications for how people build on their biases to inform their views on meritocracy.

          > To solve the problem of the Peter Principle, why not just demote people when they show incompetence,

          But how did they first rise in the hierarchy to a point where they could be demoted?

          > or give people temporary appointments so they can demonstrate their competence or incompetence?

          I kind of like that idea – except it’s also problematic because of the biases in how people judge competence, and the problematic methods that we use for assessment (such as SATs) that lack real world validity and reliability.

          My own “prior” is that appointments should be temporary on an on-going basis – in other words rotated – where people assume a variety of roles within an organization. I can see many reasons why the experiences gained at multiple organizational levels would have a synergistic effect.

  10. The College Board and the testing organizations have published enormous amounts of data over the decades. The U. of California commissioned an expert panel of faculty members to analyze whether or not to keep the SAT and ACT — the professors agreed strongly that the SAT/ACT was crucial to admissions and must be kept. The politicians then voted to ignore the scientists.

    Perhaps even better are the many longitudinal panels, such as NLSY79 and NLSY97, which gave the Pentagon’s SAT-like AFQT enlistment test to 10,000+ nationally representative young people and have tracked them (and their children) ever since. I’m sure you carefully studied a 1994 book relaying the results of NLSY79: it’s called “The Bell Curve.”

    The point is that hundreds of scholars have reviewed and argued over test results for scores of years, with virtually no breakthroughs at undermining the settled science since Flynn identified the important “Flynn Effect” in the 1980s.

    • Absent a demonstration of an alternate cause of the Flynn effect, the only reasonable, prudent conclusion to draw from its existence is that it proves conclusively a defect in intelligence testing. IQ testing needs to be re-conceptualized at the very least. And all research based on IQ is fundamentally flawed; none of it can be relied on. Currently it is crackpot.

      The fact that it is not so regarded in the academy is a problem for the academy, not reality. At this point, it’s all crackpot, persisting for its sociopolitical usefulness, not scientific merit.

      Not quite my opinion: The Flynn effect proves a defect in the whole system.

      • Flynn offers a speculative but non-crackpot explanation of the Flynn Effect in What Is Intelligence?, especially on pages 23-29 (revised edition).

        He starts from the observation that generation-to-generation increases in IQ scores are not distributed evenly across the various IQ subtests. Instead, they are concentrated in those subtests that most privilege abstract over concrete reasoning. He offers examples on pages 23-29.

        Flynn infers that generation-to-generation IQ-score increases don’t signal increases in intelligence writ large. Instead, he infers that they signal more specific, widespread gains in the ability, or tendency, to think in terms of abstractions. He credits the expansion of formal schooling in the 20th century. (He also credits change in “the nature of our leisure activities,” but I don’t follow his argument on this point.)

        • Re: (He also credits the change in “the nature of our leisure activities,” but I don’t follow his argument on this point.)
          —–
          Hobbies, quite possibly, if Flynn equates hobbies with leisure activities, because indulging in hobbies is born of natural curiosity and facilitates making tacit and meta-analytic connections. This is just a guess, on my part.
          But I hesitate to suggest that current education has led to higher quality conceptual development. It’s a complex subject.

        • Hobbies have become more similar to IQ tests over the decades. Everybody plays around with 2-D surfaces. Nobody whittles anymore, and fewer people have three-dimensional hobbies in general. People do more and more stuff on 2-D surfaces that looks rather like a Raven’s Matrices IQ test question. I call it a superset of Moore’s Law: information has become ever cheaper since Gutenberg, so we’ve gotten better at dealing with it. I don’t think it’s wholly a coincidence that the Father of American IQ Testing (Louis Terman) and the Father of Silicon Valley (Fred Terman) were father and son Stanford professors.

        • The notion that the 2-D format of IQ testing is picking up growing skill with 2-D formats in changing populations, whose material culture is trending toward a certain uniformity for economic and political reasons, is plausible.

          What is not one bit plausible is that this is not an artifact of the IQ testing format. Even less plausible is that this measures anything useful about intelligence of groups in the sense required by the article cited in the post.

          I suggest there is a common pattern here: inconvenient refutations, in the form of measurable effects that disappear, are resolved by inventing another distinction. When the hypothesis of “g” meets difficulties in actual usage in describing the real world, a distinction between fluid and crystallized “g” is invoked. Whether the IQ testing format can measure fluid “g” isn’t an issue that has been explained to the laity.

        • John: Thanks. But, it looks like it only has a total score. That’s not at a low enough level for what I am interested in.

        • I see.

          NLSY79 does contain item-level info for the ASVAB. That’s not an IQ test, though. And I can’t see that the public dataset and documentation tell you what any given item asked. For each subtest, the items seem to be listed just as “Item 01,” “Item 02,” etc.

          More details here on the item-level data in the ASVAB for NLSY79.

          There may be other relevant item-level data in the NLSY studies, perhaps especially if you get a license to view restricted data. But I don’t know these datasets well.

        • I talked to the head of psychometrics for one of the major branches of the military who gave Herrnstein and Murray all the Pentagon’s NLSY/ASVAB data in 1990. He said the only thing wrong with “The Bell Curve” was that the authors pulled their punches.

        • I’d be happy to discuss this paper as a starting point for a discussion if you would like. While I agree with the authors that the malleability of intelligence is far greater than typically acknowledged, I think they are too quick to accept the heritability estimates from the recent literature. I’ve read a lot of the older literature on IQ and individual differences in intelligence and I have never read one that treated “environmental factors” in a way that did not bias the estimate in the genetic direction.

  11. Methods: Bad. Bigotry: Worse. But the main problem is just this: Psychologists are not qualified to study society and culture at the nation-state level. And psych journals aren’t equipped to evaluate it.

    Yes, intelligence and religiosity and aggression are all individual-level characteristics, but the study uses them in aggregate, so this isn’t even social psych. The authors do zero analyses that would permit them to make any empirical inferences about how these variables interact within or between individuals. Imagine if you came across a paper in a sociology journal by a group of sociologists, who use several single-subject case studies to evaluate the effectiveness of talk therapy and conclude it doesn’t work. They then explain their findings in terms of unmeasured changes in social attitudes toward mental health treatment.

    This is cargo-cult science by members of the wrong cult–they’re trying to get a submarine to land on their airstrip!
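    The within/between point can be made concrete with a toy example using entirely made-up numbers (an illustration of the ecological fallacy, not the paper's data): a relationship can be positive for individuals within every group yet reverse sign when you correlate the group aggregates.

    ```python
    # Hypothetical data: within each of three groups, y rises with x,
    # but the group means line up the other way -- so an aggregate-level
    # correlation reverses the individual-level one (ecological fallacy).

    def pearson(xs, ys):
        """Pearson correlation of two equal-length sequences."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    groups = {
        "A": ([1, 2, 3], [10, 11, 12]),   # low x, high y
        "B": ([4, 5, 6], [6, 7, 8]),
        "C": ([7, 8, 9], [2, 3, 4]),      # high x, low y
    }

    # Within-group correlations: all +1 (y rises with x in every group).
    within = [pearson(xs, ys) for xs, ys in groups.values()]

    # Between-group (aggregate) correlation of the group means: -1.
    mean = lambda v: sum(v) / len(v)
    agg_x = [mean(xs) for xs, _ in groups.values()]
    agg_y = [mean(ys) for _, ys in groups.values()]
    between = pearson(agg_x, agg_y)
    ```

    This is why analyses done purely on national aggregates cannot, by themselves, support claims about how the variables interact within or between individuals.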
