How would the election turn out if Biden or Trump were replaced by a different candidate?

Paul Campos points to this post where political analyst Nate Silver writes:

If I’d told you 10 years ago a president would seek re-election at 81 despite a supermajority of Americans having concerns about his age, and then we’d hit 8% inflation for 2 years, you wouldn’t be surprised he was an underdog for reelection. You’d be surprised it was even close! . . .

Trump should drop out! . . . Biden would lose by 7 points, but I agree, the Republican Party and the country would be better served by a different nominee.

Campos points out that the claim that we “hit 8% inflation for 2 years” is untrue—actually, “Inflation on a year over year basis hit 8% or higher for exactly seven months of the Biden presidency, from March through September of 2022, not “two years.” It did not hit 8% in any calendar year”—and I guess that’s part of the issue here. The fact that Silver, who is so statistically aware, made this mistake is an interesting example of something that a lot of people have been talking about lately, the disjunction between economic performance and economic perception. I don’t know how Nate will respond to the “8% inflation for 2 years” thing, but I guess he might say that it feels like 8% to people, and that’s what matters.

But then you’d want to rephrase Nate’s statement slightly, to say something like:

If I’d told you 10 years ago a president would seek re-election at 81 (running against an opponent who is 77) despite a supermajority of Americans having concerns about his age, and with inflation hitting 9% in the president’s second year and then rapidly declining to 3.5% but still a concern in the polls . . .

If Nate had told me that ten years ago, I’m not sure what I’d have thought. I guess if he’d given me that scenario, I would’ve asked about the rate of growth in real per-capita income . . . ummm, here’s something . . . It seems that real per-capita disposable personal income increased by 1.1% during 2023. These sorts of numbers depend on what you count (for example, real per-capita GDP increased by 2.3% during that period) and what is your time window (real per-capita disposable personal income dropped a lot in 2022 and then has gradually increased since then, while the increase in GDP per capita has been more steady).

In any case, economic growth of 1 or 2% is, from the perspective of recent history, neither terrible nor great. Given past data on economic performance and election outcome, I would not be at all surprised to find the election to be close, as can be seen in this graph from Active Statistics:

The other thing is a candidate being 81 years old . . . it’s hard to know what to say about this one. Elections have often featured candidates who have some unprecedented issue that could be a concern to many voters, for example Obama being African-American, Mitt Romney being Mormon, Hillary Clinton being female, Bill Clinton openly having had affairs, George W. Bush being a cheerleader . . . The age issue came up with Reagan; see for example this news article by Lou Cannon from October, 1984, which had this line:

Dr. Richard Greulich, scientific director of the National Institute on Aging, said Reagan is in “extraordinarily good physical shape” for his age.

Looking back, this is kind of an amazing quote, partly because it’s hard to imagine an official NIH scientist issuing this sort of statement—nowadays, we’d hear something from the president’s private doctor and there’d be no reason for an outsider to take it seriously—and partly because of how careful Greulich was to talk about “physical shape” and not mental shape, which is relevant given Reagan’s well-known mental deterioration during his second term.

The 2020 and 2024 elections are a new thing in that both candidates are elderly, and, at least as judged by some of their statements and actions, appear to have diminished mental capacity. When considering the age issue last year (in reaction to earlier posts by Campos and Silver), I ended up with this equivocal conclusion:

Comparing Biden and Trump, it’s not clear what to do with the masses of anecdotal data; on the other hand, it doesn’t seem quite right to toss all that out and just go with the relatively weak information from the base rates. I guess this happens a lot in decision problems. You have some highly relevant information that is hard to quantify, along with some weaker, but quantifiable statistics. . . . I find it very difficult to think about this sort of question where the available data are clearly relevant yet have such huge problems with selection.

Both Biden and Trump were subject to primary challenges this year, and the age criticisms didn’t get much traction for either of them. I’m guessing this is because, fairly or not, there was some perception that the age issue had already been litigated in earlier primary election campaigns where Biden and Trump defeated multiple younger alternatives.

Putting this all together, and in response to Nate’s implicit question, if you had told me 10 years ago that the president would seek re-election at 81 (running against an opponent who is 77) despite a supermajority of Americans having concerns about his age, and with inflation hitting 9% in the president’s second year and then rapidly declining to 3.5% but still a concern in the polls, then I’d probably first ask about recent changes in GDP and per-capita income and then say that I would not be surprised if the election were close, nor for that matter would I be surprised if one of the candidates were leading by a few points in the polls.

What about Nate’s other statement: “Trump should drop out! . . . Biden would lose by 7 points, but I agree, the Republican Party and the country would be better served by a different nominee”?

Would replacing Trump by an alternative candidate increase the Republican party’s share of the two-party vote by 3.5 percentage points?

We can’t ever know this one, but there are some ways to think about the question:

– There’s some political science research on the topic. Steven Rosenstone in his classic 1983 book, Forecasting Presidential Elections, estimates that politically moderate nominees do better than those with more extreme views, but with a small effect of around 1 percentage point. When it comes to policy, Trump is pretty much in the center of his party right now, and it seems doubtful that an alternative Republican candidate would be much closer to the center of national politics. A similar analysis goes for Biden. In theory, either Trump or Biden could be replaced by a more centrist candidate who could do better in the election, but that doesn’t seem to be where either party is going right now.

– Trump has some unique negatives. He lost a previous election as an incumbent, he’s just been convicted of a felony, and he’s elderly and speaks incoherently, which is a minus in its own right and also makes it harder for the Republicans to use the age issue against Biden. Would replacing Trump by a younger candidate with less political baggage gain the party 3.5 percentage points of the vote? I’m inclined to think no, again by analogy to other candidate attributes which, on their own, seemed like potential huge negatives but didn’t seem to have such large impacts on the election outcome. Mitt Romney and Hillary Clinton both performed disappointingly, but I don’t think anyone is saying that Romney’s religion and Clinton’s gender cost them 3.5 percentage points of the vote. Once the candidates are set, voters seem to set aside their concerns about the individual candidate.

– Political polarization just keeps increasing, which leads us to expect less cross-party voting and less short-term impact of the candidate on the party’s vote share. If the effect of changing the nominee was on the order of 1 or 2 percentage points a few decades ago, it’s hard to picture the effect being 3.5 percentage points now.

The other thing is that Trump in 2016 and 2020 performed about as well as might have been expected given the economic and political conditions at the time. I see no reason to think that a Republican alternative would’ve performed 3.5 percentage points better in either of these elections. It’s just hard to say. Trump is arguably a much weaker candidate in 2024 than he was in 2016 and 2020, given his support for insurrection, felony conviction, and increasing incoherence as a speaker. If you want to say that a different Republican candidate would do 3.5 percentage points better in the two-party vote, I think you’d have to make your argument on those grounds.
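The 3.5-point figure used above is just Silver's 7-point margin restated as a share of the two-party vote. Here is a minimal sketch of that conversion, assuming a two-candidate race in which the shares sum to 100% and, purely for illustration, a tied race as the starting point (the function names and the 50/50 baseline are hypothetical, not taken from Silver's or Campos's posts):

```python
def margin_from_share(share: float) -> float:
    """Margin (in points) for a candidate with the given two-party vote share (in %)."""
    return share - (100.0 - share)


def share_shift_for_margin_shift(margin_shift: float) -> float:
    """Change in two-party vote share needed to move the margin by margin_shift points."""
    # Each point gained by one side is a point lost by the other,
    # so the margin moves twice as fast as the share.
    return margin_shift / 2.0


# Hypothetical baseline: a tied race (50% of the two-party vote each).
baseline_share = 50.0
new_share = baseline_share + share_shift_for_margin_shift(7.0)

print(new_share)                     # share after a 7-point swing in the margin
print(margin_from_share(new_share))  # resulting margin in points
```

So a claim that a different nominee would win by 7 in a race that is otherwise about even amounts to a claim that the nominee is worth about 3.5 points of the two-party vote.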

P.S. You might ask why, as a political scientist, I’d be responding to arguments from a law professor and a nonacademic pundit/analyst. The short answer is that these arguments are out there, and social media is a big part of the conversation; the analogy twenty or more years ago would’ve been responding to a news article, magazine story, or TV feature. The longer answer is that academia moves more slowly. There must be a lot of relevant political science literature here that I’m not aware of . . . obviously, given that the last time we carefully looked at these issues was in 1993! I can read Campos and Silver and social media, process my thoughts, and post them here, which is approximately a zillion times faster and less effortful than writing an article on the topic for the APSR or whatever. Back in the day I would’ve posted this on the Monkey Cage, and then maybe another political scientist would’ve followed it up with a more informed perspective.

1. Why so many non-econ papers by economists? 2. What’s on the math GRE and what does this have to do with stat Ph.D. programs? 3. How does modern research on combinatorics relate to statistics?

Someone who would prefer to remain anonymous writes:

A lot of the papers I’ve been reading that sound really interesting don’t seem to involve economics per se, but they usually seem to come out of econ (as opposed to statistics) departments. Why is that? Is it a matter of culture? Or just because there are more economists? Or something else?

And here’s the longer version of my question.

I’ve been reading your blog for a couple of years and this post of yours, “Is an Oxford degree worth the parchment it’s printed on?”, from a month ago got me thinking about studying statistics. My background is mainly in engineering (BS CompE/Math, MS EE). Is it possible to get accepted to a good stats program with my background? I know people who have gone into econ with an engineering background, but not statistics. I’ve also been reading some epidemiology papers that are really cool, so statistics seems ideal, since it’s heavily used in both econ and epidemiology, but I wonder if there’s some domain specific knowledge I’d be missing.

I’ve noticed that a lot of programs “strongly recommend” taking the GRE math subject test; is that pretty much required for someone with an unorthodox background? I’d probably have to read a topology and number theory text, and maybe a couple others to get an acceptable GRE math score, but those don’t seem too relevant to statistics (?). I’ve done that sort of thing before – I read and did all the exercises in a couple of engineering texts when I switched fields within engineering, and I could do it again, but, if given the choice, there are other things I’d rather spend my time on.

Also, I recently ran into my old combinatorics professor, and he mentioned that he knew some people in various math departments who used combinatorics in statistics for things like experimental design. Is that sort of work purely the realm of the math departments, or does that happen in stats departments too? I loved doing combinatorics, and it would be great if I could do something in that area too.

My reply:

1. Here are a few reasons why academic economists do so much work that does not directly involve economics:

a. Economics is a large and growing field in academia, especially if you include business schools. So there are just a lot of economists out there doing work and publishing papers. They will branch out into non-economics topics sometimes.

b. Economics is also pretty open to research on non-academic topics. You don’t always see that in other fields. For example, I’ve been told that in political science, students and young faculty are often advised not to work in policy analysis.

c. Economists learn methodological tools, in particular, time series analysis and observational studies, which are useful in other empirical settings.

d. Economists are plugged in to the news media, so you might be more likely to hear about their work.

2. Here’s the syllabus for the GRE math subject test. I don’t remember any topology or number theory on the exam, but it’s possible they changed the syllabus some time during the past 40 years, also it’s not like my memory is perfect. Topology is cool—everybody should know a little bit of topology, and even though it only very rarely arises directly in statistics, I think the abstractions of topology can help you understand all sorts of things. Number theory, yeah, I think that’s completely useless, although I could see how they’d have it on the test, because being able to answer a GRE math number theory question is probably highly correlated with understanding math more generally.

3. I am not up on the literature for combinatorics for experimental design. I doubt that there’s a lot being done in math departments in this area that has much relevance for applied statistics, but I guess there must be some complicated problems where this comes up. I too think combinatorics is fun. There probably are some interesting connections between combinatorics and statistics which I just haven’t thought about. My quick guess would be that there are connections to probability theory but not much to applied statistics.

P.S. This blog is on a lag, also sometimes we respond to questions from old emails.

Questions and Answers for Applied Statistics and Multilevel Modeling

Last semester, every student taking my course was required to contribute before each class to a shared Google doc by putting in a question about the reading or the homework, or by answering another student’s question. The material on this document helped us guide discussion during the class.

At the end of the semester, the students were required to add one more question, which then I responded to in the document itself.

Here it is!

“‘Pure Craft’ Is a Lie” and other essays by Matthew Salesses

I came across a bunch of online essays—“posts”—by Matthew Salesses, a professor of writing at Columbia:

‘Pure Craft’ Is a Lie

How Do We Teach Revision?

Who’s at the Center of Workshop and Who Should Be?

7 Things I Teach: A Manifesto

Also 22 Revision Prompts, which are so great that I’ll devote a separate post to them.

As a writer and a teacher of creative work (yes, statistical analysis is creative work!), I’m very interested in the above topics, and Salesses has a lot of interesting things to say.

I should warn you that he has a strong political take, and the political perspective is central to his thinking—but I think his advice should be valuable even to readers who don’t share his views on cultural politics. I’d draw the analogy to Tom Wolfe, whose cultural conservatism informs his views on art, views that should be of interest even to people who disagree with him on politics. It’s possible to be a big fan of Tom Wolfe while at the same time thinking it was pitiful for him to take his cultural politics so far as to deny evolution. Anyway, you can think what you want about Salesses’s political views and still appreciate his thoughts on writing and teaching, in the same way as you can still enjoy The Painted Word and From Bauhaus to Our House without having to subscribe to Wolfe’s political views. Pushing a position to its extreme can yield interesting results.

And more!

It seems that Salesses was doing this Pleiades Magazine blog for some period in 2015. Some googling turned up this fun list. Here’s Salesses:

I have been thinking for a while about how our attempts to define craft terms influence our students’ (and our own) aesthetics, and I have wanted to try other definitions. How to define “tone,” for example, seemed especially difficult. Here are some alternate definitions, for now:

Tone: an orientation toward the world

Plot: acceptance or rejection of consequences

Conflict: what gives or takes away the illusion of free will

Character arc: how a character changes or fails to change

Story arc: how the world in which the character lives is changed or fails to be changed

Characterization: what makes the character different from everyone else

Relatability: is it clear how the implied author is presenting the characters

Believability: the differences and similarities between various characters’ expectations

Vulnerability: the real author’s stakes in the implied author

Setting: awareness of the world

Pacing: modulation of breath

Structure: the organization of meaning

I guess that he wrote some other cool posts but I don’t know how many, and I can’t find any link that lets me scroll through them.

A few years after writing those posts, Salesses published a book, Craft in the Real World, which . . . is on sale at the local bookstore. I think I’ll buy it!

According to the publisher’s website, Craft in the Real World is an NPR Best Book of the Year, an Esquire Best Nonfiction Book of the Year, an Electric Literature Best Book of the Year. But I hadn’t heard of it until this recent google, following up on Salesses’s posts. It’s a big world out there, when a book written by a Columbia colleague, on a topic that interests me, which received multiple awards, was unfamiliar to me. The funny thing is, when I read the above-linked posts, I thought, This guy should write a book! And it turns out he did.

I recommend reading all this along with the advice of writing coach Thomas Basbøll.

He has some questions about a career in sports analytics.

Sometimes I get requests from high school students to answer questions. For example, Noah C. sent this list:

I’m writing with some questions for my final economics project. We pick a career aspiration, write about the job market and experience necessary to work in the industry, and conduct an interview with someone who has done relevant work before. I chose sports analytics as my field of choice, and I know you’ve done some statistics work with a professional sports team in the past.

Can you provide a general overview of the work you did with the team?
What was the workload like? How did it compare to the normal amount of work you need to do as a professor?
How did you begin working with the team in the first place? Did they contact you, or was it from your end?
What was the culture like from the organization’s side? Were they excessively demanding?
What advice do you have for someone interested in studying statistics?
What facet of experience is most valuable for someone looking for a job in sports analytics?
If someone wanted to work in the behind-the-scenes aspect of sports, is it best to start from within the world of the particular sport, or come in from outside?
How does the application of statistics in a social science/political context differ from a sporting context?

And some less serious questions:

Python or R, which is more useful to know?
What is your favorite metric for evaluating players?
Favorite athletes currently?
(My teacher is a big mets fan) Why are the Mets so bad historically? (be nasty please)

Before answering the above questions, let me emphasize that I only know about some small corner of sports analytics. That said, here are my responses:

1. I helped the team fit multilevel Bayesian models for various aspects of game play and player evaluation.

2. A couple of us met weekly for an hour and a half with some people from the team’s analytics group. They were the ones who did almost all the work. I think we were helpful because they could bounce ideas off us, and sometimes we had suggestions, and also we helped them build, test, and debug their code.

3. People from the team contacted me. They had read one of my books and found the methods there to be useful, and they wanted to go further.

4. We agreed ahead of time on a certain number of hours per week. On occasion we’d do a bunch of work outside the scheduled meetings, but that didn’t take up too much time, and, in any case, it was fun.

5. There are lots of ways to learn statistics. Ideally you can do it in the context of working on some application that is of interest to you.

6. I’m not sure what is the most valuable experience if you want to go into sports analytics. If I had to guess, I’d say programming with data, being able to manipulate data, make graphs, extract insights.

7. Some of the sports analytics people I’ve met have a strong sports background; others have strong backgrounds in quantitative analysis and are interested in the sport they are working on. I think it would be hard to do this work if you had little to no interest in the sport.

8. There are some similarities between social-science statistics and sports statistics, also some differences. With sports we typically have a lot more data on individual performers. See this post, Minor-league Stats Predict Major-league Performance, Sarah Palin, and Some Differences Between Baseball and Politics.

9. I use R, which is popular in academic statistics, economics, and political science. In the business world, it’s my impression that Python is more popular. I fit my models in Stan, which can be called from R or Python.

10. I don’t have a favorite metric for evaluating players. If you fit an item-response model (see here, for example), then player abilities are parameters in the model and are estimated from data. So you don’t have a metric (in the sense of some summary that is a combination of an individual player’s stats); you have a model that allows you to simultaneously estimate the relative abilities of all the players. It also makes sense to model players multidimensionally: in the general sense you can break skills into offense and defense, or subdivide them further (different sorts of rushing, receiving, or blocking skills in football; strikeouts, walks, home runs, and performance for balls in play in baseball; different sorts of matchups in basketball; etc.).

11. I haven’t been following sports too closely recently, so my favorite athlete depends on what I’ve been watching lately. Shohei is amazing, Mbappé and Messi gave us quite a show last summer, Simone Biles can do incredible things, ya gotta love Patrick Mahomes, . . . we could go on and on.

12. Hey, the Mets just won today. You gotta believe! Regarding their history, I recommend Jimmy Breslin’s classic book, Can’t Anybody Here Play This Game? Breslin also wrote a beautiful biography of Damon Runyon—a great read for anyone who’s ever lived in this city.

I strongly doubt that any human has ever typed the phrase, “torment executioners,” on any keyboard—except, of course, in discussions such as this.

Greg Mayer writes:

Still both appalled and amused by the notion of “torment executioners,” I googled the term and found earlier usages. Several of the earlier usages seem to be associated with scientific journals that are unfamiliar to me, and come from publishers like “Allied Academies” and “Hilaris Publishers.” Other early usages are associated with health and self help websites. Many usages show a decidedly non-idiomatic grasp of English.

Here are two examples. In “Examination Finds Better Approaches to Battle the Narcotic Emergency” by S. Uttam in the Journal of Psychology and Cognition, and submitted in January of 2021, there are multiple terms using torment: “torment prescriptions,” “torment executioners,” and “torment hindering properties.” (I did not click further into the site, as the page did not inspire confidence.)

And, from a chiropractic clinic in Chicago from July 2017, there are also multiple uses of torment in addition to “torment executioners.” The following sentence from the clinic site captures the general style of the writing in sites featuring the term “torment executioners”:

The most serious issue with back and neck torment from auto crashes is that because of the horrible idea of the mischance makes a substance irritation pathway end up noticeably actuated that doesn’t end rapidly.

Overall, it looks like “torment” pops up repeatedly when the intended meaning is “pain,” but instead some other word is used, either because of a deliberate attempt to avoid using “pain” (because it appears in some source material?) or because the writer is unfamiliar with idiomatic English.

I replied that I did some google searching too, and it looks to me like all the references are either fake research or examples of internet spam or link farms or whatever they are called right now: machine-generated pages that exist to pop up on a google search. I think what these pages do is to scrape text from random places on the internet and then run it through some synonym program to make the plagiarism less detectable.

I strongly doubt that any human has ever typed the phrase, “torment executioners,” on any keyboard—except, of course, in discussions such as this.

Mayer followed up:

The case of the piece by Uttam in the Journal of Psychology and Cognition may be similar to the UNR case—in an actual journal of sorts, and using imprecise synonyms.

The phrase also came up in a book from 2018 called Anxiety Disorder. It seems to be self-published, but Amazon has an audiobook of it.

Indeed, apparently the internet is awash in machine-produced books. Presumably with chatbots this will only get worse.

Report of average change from an Alzheimer’s drug: I don’t get the criticism here.

Alexander Trevelyan writes:

I was happy to see you take a moment to point out the issues with the cold water study that was making the rounds recently. I write occasionally about what I consider to be a variety of suspect practices in clinical trial reporting, often dealing with deceptive statistical methods/reporting. I’m a physicist and not a statistician myself—I was in a group that had joint meetings with Raghu Parthasarathy’s lab at Oregon—but I’ve been trying to hone my understanding of clinical trial stats recently.

Last week, the Alzheimer’s drug Leqembi (lecanemab) was approved by the FDA, which overall seems fine, but it rekindled some debate about the characterization of the drug causing a “27% slowing in cognitive decline” over placebo; see here. This 27% figure was touted by, for example, the NIH NIA in a statement about the drug’s promise.

So here’s my issue, which I’d love to hear your thoughts on (since this drug is a fairly big deal in Alzheimer’s and has been quite controversial)—the 27% number is a simple percentage difference that was calculated by first finding the change in baseline for the placebo and treatment groups on the CDR-SB test (see first panel of Figure 2 in the NEJM article), then using the final data point for each group to calculate the relative change between placebo and treatment. Does this seem as crazy to you as it does to me?

First, the absolute difference in the target metric was under 3%. Second, calculating a percentage difference on a quantity that we’ve rescaled to start at zero seems a bit… odd? It came to my attention because a smaller outfit—one currently under investigation by about every three-letter federal agency you can name—just released their most recent clinical trial results, which had very small N and no error bars, but a subgroup that they touted hovered around zero and they claimed a “200% difference!” between the placebo and treatment groups (the raw data points were a +0.6 and -0.6 change).

OK, I’ll click through and take a look . . .

My first reaction is that it’s hard to read a scholarly article from an unfamiliar field! Lots of subject-matter concepts that I’m not familiar with, also the format is different from things I usually read, so it’s hard for me to skim through to get to the key points.

But, OK, this isn’t so hard to read, actually. I’m here in the Methods and Results section of the abstract: They had 1800 Alzheimer’s patients, half got treatment and half got placebo, and their outcome is the change in score in “Clinical Dementia Rating–Sum of Boxes (CDR-SB; range, 0 to 18, with higher scores indicating greater impairment).” I hope they adjust for the pre-test score; otherwise they’re throwing away information, but in this case the sample size is so large that this should be no big deal, we should get approximate balance between the two groups.

In any case, here’s the result: “The mean CDR-SB score at baseline was approximately 3.2 in both groups. The adjusted least-squares mean change from baseline at 18 months was 1.21 with lecanemab and 1.66 with placebo.” So both groups got worse. That’s sad but I guess expected. And I guess this is how they got the 27% slowing thing: Average decline in control group was 1.66, average decline in treatment group is 1.21, you take 1 – 1.21/1.66 = 0.27, so a 27% slowing in cognitive decline.
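Here is the arithmetic behind the 27% figure as a minimal sketch. It uses only the two adjusted mean changes and the 0-to-18 scale range quoted above; the function and variable names are illustrative labels, not anything from the paper:

```python
def relative_slowing(treatment_change: float, placebo_change: float) -> float:
    """Fraction by which the treatment group's decline is smaller than the placebo group's."""
    return 1.0 - treatment_change / placebo_change


# Adjusted mean change from baseline in CDR-SB at 18 months, as quoted in the abstract.
treatment_change = 1.21  # lecanemab
placebo_change = 1.66    # placebo
scale_range = 18.0       # CDR-SB runs from 0 to 18

slowing = relative_slowing(treatment_change, placebo_change)
absolute_difference = placebo_change - treatment_change
share_of_scale = absolute_difference / scale_range

print(f"relative slowing: {slowing:.0%}")
print(f"absolute difference: {absolute_difference:.2f} points")
print(f"as a share of the scale: {share_of_scale:.1%}")
```

The same two numbers give a 27% relative slowing and an absolute difference of 0.45 points, which is about 2.5% of the 18-point scale. That is the "under 3%" number in Trevelyan's email.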

Now moving to the statistical analysis section of the main paper: Lots of horrible stuff with significance testing and alpha values, but I can ignore all this. The pattern in the data seems clear. Figure 2 shows time trends for averages. I’d also like to see trajectories for individuals. Overall, though, saying “an average 27% slowing in cognitive decline” seems reasonable enough, given the data they show in the paper.

I sent the above to Trevelyan, who responded:

Interesting, but now I’m worried that maybe I spent too much time on the background and not enough time in making my main concern more clear. I don’t have any issues with the calculation of the percent difference, per se, but rather what it is meant to represent (i.e., the treatment effect). As you noted, and is unfortunately the state of the field, the curves always go down in Alzheimer’s treatment—but that doesn’t have to be the case! The holy grail is something that makes the treatment curve go up! The main thing that set off alarm bells for me is that the “other company” I referenced claims to have observed an improvement with their drug and an associated 200%(!) slowing in cognitive decline. In their case, the placebo got 0.6 points worse and the treatment 0.6 points better, so 200%! But their treatment could’ve gotten 10 points better and the placebo 10 points worse, and that’s also 200%! Or maybe 0.000001 points better versus 0.000001 points worse—again, 200%.

I think my overall concern is, “why are we using a metric that can break in such an obvious way under perfectly reasonable (if currently aspirational) treatment outcomes?”

See here for data from “other company” if you are curious (scroll down to mild subgroup, ugh).

And here’s a graph made by Matthew Schrag, who is an Alzheimer’s researcher and data sleuth, which rescales the change in the metric and shows the absolute change in the CDR-SB test. The inner plot shows the graph from the original paper; the larger plot is rescaled:

My reply: I’m not sure. I get your general point, but if you have a 0-18 score and it increases from 3.2 to 4.8, that seems like a meaningful change, no? They’re not saying they stopped the cognitive decline, just that they slowed it by 27%.
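Trevelyan's concern about the ratio can be checked directly. Here is a minimal sketch using the same 1 - treatment/placebo calculation and his three hypothetical scenarios. The sign convention (positive change means decline, negative means improvement) and the function name are assumptions for illustration:

```python
def relative_slowing(treatment_change: float, placebo_change: float) -> float:
    """Relative slowing of decline: 1 - (treatment change) / (placebo change)."""
    return 1.0 - treatment_change / placebo_change


# (treatment change, placebo change) for the three scenarios in Trevelyan's email.
# Positive = got worse, negative = got better (assumed sign convention).
scenarios = [
    (-0.6, 0.6),
    (-10.0, 10.0),
    (-0.000001, 0.000001),
]

results = [relative_slowing(t, p) for t, p in scenarios]
for (t, p), r in zip(scenarios, results):
    print(f"treatment {t:+g}, placebo {p:+g}: {r:.0%} slowing")

# For comparison, the lecanemab numbers, where both groups declined.
lecanemab = relative_slowing(1.21, 1.66)
print(f"lecanemab: {lecanemab:.0%} slowing")
```

All three scenarios come out at 200% even though the absolute effects differ by orders of magnitude, because the ratio keeps only the relative sizes of the two changes. With the lecanemab data, both groups declined and the ratio stays between 0 and 1, which is why 27% is easier to interpret there.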

P.S. I talked with someone who works in this area who says that almost everyone in the community is skeptical about the claimed large benefits for lecanemab, and also that there’s general concern that resources spent on this could be better used in direct services. This is not to say the skeptics are necessarily right—I know nothing about all this!—but just to point out that there’s a lot of existing controversy here.

Who is the Stephen Vincent Benet of today?

For some reason the other day I was thinking about summer camp, and in particular the names of some of the campers: Travis Levi, Tony Kiefer, Patrick Amory, Southy Grinalds, Rusty Zorbaugh, . . . I remember very little about the actual kids. Some I liked, some I didn’t. I’m not in touch with any of them. Once on the street several decades ago I saw someone whose face looked familiar, I think my face was familiar to him too, we looked at each other and said hi but then were puzzled and walked away. I think that was Travis Levi but really I have no idea. The names, though, they have an emotional resonance for me. Not because of the people attached to them; it’s more the sound of these names that carries the feeling.

The resonance of these names reminded me of Stephen Vincent Benet’s classic poem, “American Names,” which begins:

I have fallen in love with American names,
The sharp names that never get fat,
The snakeskin-titles of mining-claims,
The plumed war-bonnet of Medicine Hat,
Tucson and Deadwood and Lost Mule Flat.

And ends with the beautiful stanza:

I shall not rest quiet in Montparnasse.
I shall not lie easy at Winchelsea.
You may bury my body in Sussex grass,
You may bury my tongue at Champmédy.
I shall not be there. I shall rise and pass.
Bury my heart at Wounded Knee.

Which made me wonder, who is the Stephen Vincent Benet of today? Back when I would go to used bookstores 40 or 50 years ago, there would often be some dusty hardbound volumes of his poems and stories on the shelves—I guess his books sold a lot of copies in the midcentury period. I think he’d be considered a “middlebrow,” to use the terminology of the time—here’s a good essay on the topic by literary critic Louis Menand. Benet was kinda classy, kinda folksy, took life and literature seriously but with a sense of humor, but lacked some depth; Dwight Macdonald described his book-length poem, John Brown’s Body, as “sometimes solemn, sometimes gay, always straining to put it across, like a night-club violinist.” That’s a problem with book-length poems in general—I’d say the same thing of Vikram Seth’s The Golden Gate, for example. But Seth’s a novelist, not a proclaimer in the mode of Benet.

Here’s a good and unfortunately anonymous mini-biography of Benet, which concludes, “The measure of his achievement, however, is indisputably John Brown’s Body, a poem whose naîveté and conventionality in themes, techniques and viewpoints are raised, by the greatness of its subject and Benét’s devoted craftsmanship, to the level of high folk art.” That seems about right.

I’m thinking that maybe the closest match to Stephen Vincent Benet in recent years is . . . Alice Walker? Successful writer of serious books that are not pulpy but are not quite considered all that as literature, public figure, a conscious representative of America in some way. I’m not sure, but maybe that’s the best fit, recognizing that literature as a whole has a much smaller cultural footprint today than it had a hundred years ago. Another possible match would be Stephen King, who fits into the “folk art” and “Americana” slots but as a massive bestseller has played a different role in our culture.

To what extent is psychology different from other fields regarding fraud and replication problems?

David Budescu read my recent article, How Academic Fraudsters Get Away With It (based on this blog post, in case that first link is paywalled for you), and wrote:

Can’t argue with most of your points and I can’t help but notice that some of them represent potentially testable psychological theories.

The recurrence of these problems in psychology is really painful, especially when some of the people involved are friends and collaborators.

The one point I don’t understand is why people are so eager to highlight the problem in psychology. If you get the daily Retraction Watch email, like I do, or look at their database, it is obvious that the problem is much worse in biomedical research (both in terms of quantity and, probably, potential impact and cost).

I wonder if the obsession with psychology may cause some people to underestimate the magnitude and breadth of the problem. Finally, I am curious how your analysis can explain fraudulent behavior in dentistry, cancer research, etc.

There are two issues here. The first is how the points made in my article, and by others on social media, represent potentially testable psychological theories. I have no idea, but if any psychologists want to look into this one, go for it!

The second issue is to what extent psychology is different from other fields regarding fraud and replication problems. Here are a few things I’ve written on the topic:
Why Does the Replication Crisis Seem Worse in Psychology?
Why Did It Take So Many Decades for the Behavioral Sciences to Develop a Sense of Crisis Around Methodology and Replication?
Biology as a cumulative science, and the relevance of this idea to replication

More on the oldest famous person ever (just considering those who lived to at least 104)

Yesterday the newspaper ran an obituary of Jack Jennings, who was part of the story that inspired The Bridge on the River Kwai:

His family believes that Mr. Jennings was the last survivor of the estimated 85,000 British, Australian and Indian soldiers who were captured when the British colony of Singapore fell to Japanese forces in February 1942. . . .

To build bridges, Mr. Jennings and at least 60,000 P.O.W.s — and thousands more local prisoners — were forced to cut down and debark trees, saw them into half-meter lengths, dig and carry earth to build embankments, and drive piles into the ground.

He died at 104. At first read I thought he was personally responsible for that Kwai story, but it then became clear that he was not himself famous; he was just involved in a famous event. Fair enough. Still worthy of an obituary.

In any case, this made me think about a question we discussed a couple years ago regarding who is the oldest famous person.

In honor of Mr. Jennings, I’ll restrict myself here to people who lived to at least 104. Wikipedia has lists which I assume are pretty comprehensive so I just went there.

Brooke Astor lived to 105 and Rose Kennedy lived to 104.

Marjory Stoneman Douglas lived to 108, but I’d only heard of her because of the horrible crime done at the school that was named after her, so I don’t think this quite counts as being famous for herself. On the other hand, given that shooting, she seems to be the most famous person who’s lived to that age.

The guy who directed and produced Pal Joey, directed On the Town, wrote and directed Damn Yankees, and was involved in a bunch of other Broadway classics lived to 107. He’s named George Abbott, and I’d never heard of him before writing this post, but he seems to be legitimately famous. He wrote the book for a Rodgers and Hart show!

And then there’s Olivia de Havilland, who lived to 104 and was Paul Campos’s choice for longest-living famous person (as always, excluding people who are famous because of their longevity), and I continue to hold out for Beverly Cleary, who lived to a slightly older 104.

Oscar Niemeyer lived to 104. He’s famous!

Vance Trimble lived to 107! I’d never heard of Vance Trimble or anything about him—his name just jumped out at me when I was going through one of the lists of centenarians on wikipedia—but I should’ve heard of him. Listen to this: “He won a Pulitzer Prize for national reporting in recognition of his exposé of nepotism and payroll abuse in the U.S. Congress. . . . He was inducted into the Oklahoma Journalism Hall of Fame in 1974.” And he wrote biographies of Sam Walton, Ronald Reagan, Chris Whittle, and other business and political figures. Vance Trimble. If he’d lived in New York or Los Angeles, he’d be famous. Or maybe if he’d lived in New York or Los Angeles, he’d just have been one of many many reporters and not stood out from the crowd. Who knows?

George Seldes lived to 104 as well. He’s not famous and hasn’t been famous for nearly 100 years, but I once read a book he wrote, so I recognized the name. He was a political journalist.

Bel Kaufman, author of Up the Down Staircase, which I’ve never read but have seen on a shelf—it has a memorable title—lived to 103. But we’re not considering 103-year-olds here. This post is limited to 104’s and up. If we were covering 103-year-olds, I’d mention that she went to Hunter College and Columbia University! Her actual name was Bella, which for professional reasons was “shortened because Esquire only accepted manuscripts from male authors.” At least, that’s what wikipedia says. Perhaps this particular fact or claim will soon appear uncredited in an article by retired statistics professor Ed Wegman.

Herman Wouk lived to 103 also. He really was famous! He wrote The Caine Mutiny and Marjorie Morningstar, which were made into two iconic films of the 1950s. But, no, we’re not doing any 103’s here, so no more on him.

Jacques Barzun lived to 104. He’s a famous name, used to appear in the New York Times book review and places like that. It’s still hard for me to think of him as famous or important. To me, he just seems like someone who was well connected. Nothing like Olivia de Havilland or Beverly Cleary who made enduring cultural artifacts, or even George Abbott who did what it took to make some of those musicals happen. But I’ve heard of Barzun and had a vague idea of what he did, so I guess I’ll have to count him here.

On to wikipedia’s list of centenarians who were engineers, mathematicians, and scientists . . . The only one who lived to at least 104 who resonates at all is Arthur R. von Hippel, listed as “German-American physicist and co-developer of radar.” Co-developer of radar . . . that’s pretty important! If I’m gonna count “one of the 60,000 soldiers who was part of the story that inspired Bridge on the River Kwai” as a famous person, then I’ll have to include “co-developer of radar” for sure. And he lived to 105. He didn’t quite reach the longevity of Vance Trimble, but he also “discovered the ferroelectric and piezoelectric properties of barium titanate.” He’s on the efficient frontier of age and fame, at least by my standards. Much more so than, say, Bernard Holden, “British railway engineer.”

Rush Limbaugh’s grandfather lived to 104. Sorry, but being a relative of a famous person doesn’t make you famous. Sure, I mentioned Rose Kennedy earlier, but she’s different. As the Kennedy matriarch, she was famous in her own right.

Huey Long lived to 105! But of course it was a different Huey Long. This oldster was a jazz singer. He was a member of the Ink Spots for several months in 1945. Sorry, not famous by my standards. Sharing the name of a more-famous person was enough to get my attention but not enough to count.

Hmmm, who else have we got? There’s Edward Fenlon, lived to 105, “American politician, member of the Michigan House of Representatives.” Completely obscure, but he’s from Michigan so maybe he’d be on Paul Campos’s list.

It says on wikipedia that Saint Anthony lived to 105 (born in 251, died in 356). He’s famous! But, hey, what can I say, I have some doubts about his numbers.


So, oldest famous person ever? I’ll have to go with the guy who directed Pal Joey, On the Town, and Damn Yankees and lived to 107.

Again on the role of elite media in spreading UFOs-as-space-aliens and other bad ideas

We’ve talked about this one before.

It came up again today when I came across this post by Palko on the latest publicity on UFOs as space aliens, this time from political performer Tucker Carlson.

I’d always thought of the UFO-space-aliens thing as being neither left nor right, or maybe more left than right in that the space aliens thing doesn’t fit into biblical fundamentalism.

But if the belief has shifted over to the right, that can make it appealing to right-wing pundits such as Carlson, who can present it as an anti-government conspiracy theory; to center-right pundits such as Tyler Cowen, who can support the idea as part of a general trend of suspicion of government and of academic experts; and to center-left pundits such as Ezra Klein and Nate Silver, who can use this as an issue to demonstrate how open-minded they are.

Palko linked to the blog of Jason Colavito, where I came across a 2023 in Review post, which featured a month-by-month, blow-by-blow description of UFO hype on the New York Times, Fox News, Politico, CNN, and congressmembers of both parties (including both my senators from New York! ugh).

Colavito concluded:

As 2023 came to an end, what had initially promised to be ufology’s biggest year turned into something of a Pyrrhic victory. Ufologists became the dog that caught the car. Now what? They got everything they wanted, from massive mainstream media coverage to a shiny new Pentagon UFO office to full government funding to a public Congressional hearing with a UFO whistleblower, and all it managed to do is expose the lack of anything besides stories and stories about stories behind the myth of flying saucers. But what a dismal revelation it nonetheless was for me to be proven right that our leaders are listening to kooks and self-deluded fools and are beholden to fabrications and mythology. In a year when conspiracy theories threatened the very Republic itself and democracy hangs in the balance, it chills the bones to realize that the people who will decide our fate and our future can be swayed by a spook story.

I agree. The only thing that I think was missing in his roundup was the pickup of this stuff by media insiders. As I wrote in an earlier blog comment, I think the substance of the matter is less important to them than what Tyler Cowen would call “mood affiliation,” in this case an opposition to people they view as sanctimonious conformists, even in cases where the conformists pretty much have it right. I think that these and other media insiders like being on the opposition side on the UFOs-as-space-aliens issue. It’s kind of dangerous, mildly edgelordy. Indeed, the entire “edgelord” phenomenon fits somewhere into this discussion. Edgelording can be thought of as a form of trolling, but there’s a thin line etc., and some people can take it all too seriously. It’s the difference between sharing outré theories about Barack Obama’s birth certificate or ridiculous election denial theories—ha ha, just poking fun, can’t you guys take a little ribbing, etc.—and shooting up a pizza parlor.

Selection bias leads to confusion about the relative stability of deterministic and stochastic algorithms

If you do any statistical computing at all, one thing you soon realize is that simple algorithms are more stable than complicated algorithms, and deterministic algorithms are more stable than stochastic algorithms.

Least squares is super-stable (unless you have collinearity or near-collinearity, in which case you don’t want to be fitting unregularized least squares anyway); regularized least squares is even more stable and is no more complicated than least squares; logistic regression with maximum likelihood is super-stable unless you have collinearity or separation; regularized logistic regression is just as simple and is even more stable. Getting to more complicated algorithms: variational inference can run into problems; and Markov chain Monte Carlo is wonderful but it can have trouble mixing and you have to be careful to monitor its convergence.

What I want to argue here is that the relative stability of simple and deterministic methods is not a property of the algorithms but rather a reflection of the problems to which they are applied.

Least squares is not a stable algorithm at all! Try to use it to fit a hierarchical model or a mixture model or any latent-variable model and you will get a disaster. But we don’t use least squares for such problems. Similarly for maximum likelihood logistic regression: you can use it on simple problems but it will fail on more complicated item-response models. We use regularized least squares or regularized maximum likelihood for slightly harder problems but it too will fail on many latent-variable problems. When we apply complicated stochastic algorithms to difficult problems, we get some difficulties. Complicated stochastic algorithms would work just fine on simple linear and logistic regressions; we just usually use simpler algorithms for such problems, at least when they are small problems.

The point is that there’s an “ecological correlation” (as they say in social science) between difficulty of problem and complexity of computation algorithms. More complicated algorithms get applied to harder problems. Conditional on the problem, it’s the complicated stochastic algorithms that are typically more stable. Those steps of iteration help avoid getting stuck in bad places, and the randomness allows us to easily monitor mixing. Deterministic algorithms, when they don’t work, often fail in ungraceful ways.
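Here is a toy calculation of how that ecological correlation can arise. All the numbers are made up for illustration: the failure rates and the shares of hard problems are hypothetical, chosen only so that the complicated stochastic algorithm fails less often on each type of problem but gets applied mostly to hard ones.

```python
# Hypothetical failure probabilities, invented for illustration only:
# fail[algorithm][problem type] = chance that the fit goes badly.
fail = {
    "simple deterministic": {"easy": 0.01, "hard": 0.60},
    "complicated stochastic": {"easy": 0.005, "hard": 0.20},
}

# Hypothetical usage pattern: share of each algorithm's applications
# that are hard problems.
share_hard = {"simple deterministic": 0.02, "complicated stochastic": 0.90}


def observed_failure_rate(algorithm):
    """Failure rate 'in the wild', averaging over the problems it is used on."""
    p_hard = share_hard[algorithm]
    rates = fail[algorithm]
    return (1 - p_hard) * rates["easy"] + p_hard * rates["hard"]


for algorithm in fail:
    print(algorithm, round(observed_failure_rate(algorithm), 4))
```

With these made-up inputs, the simple algorithm fails about 2% of the time in the wild and the complicated one about 18% of the time, even though the complicated one has the lower failure rate on easy problems and on hard problems separately. The comparison flips once you condition on the problem.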

This is not to say you should never use simple deterministic algorithms, just that you should be aware that the reason they typically have such good performance in the wild is that they are typically applied to relatively easy problems.

“Nonreplicable” publications are cited more than “replicable” ones?

Pointing to this recent article, “Nonreplicable publications are cited more than replicable ones” and associated press release, Carol Ting writes:

There seems to be something slightly ironic here. The study would be very useful for educational purposes if the analysis was on solid grounds, but the classification seems to imply that studies with p>0.05 in the OSC study are all false positives, which the OSC warned people against in the press release. The finding is cute, but does it make sense to base the analysis on this dichotomous variable? It also raises the bigger issue of communicating findings of replication projects with the public. Even if authors have good intentions, the message often gets distorted and all the nuances lost after going through the media pipelines. I guess this one might not be really damaging, but as I read through the press coverage I’ve definitely seen articles taking a very cynical view about scientists and jumping to conclusions based on the 36% replication rate. I wonder what you think about this communication problem.

My reply: What caught my eye, and not in a good way, was the very first sentence in the press release:

Papers that cannot be replicated are cited 153 times more because their findings are interesting . . .

I went into the paper and it turns out that the claim is an additive 153, not a multiplicative 153. That is, the total number of citations of the so-called non-replicated papers was 153 more, on average, than that of the so-called replicated papers. Or something like that. They were fitting some regressions too.

I’d rather report this by saying that some types of papers are cited 1.5 times more than others, or 2 times more than others, or whatever. But I guess that “153” (a number which is actually buried pretty deep in the paper) looks better in the press release. Can’t blame the authors for that!

Here’s the relevant graph from the published article, which was reproduced in the press release:

Later in the article, it says,

On average, papers that failed to replicate are cited almost 16 times more per year. . . . This difference of 16 citations more per year can be benchmarked against the 5-year impact factor of the journal in which the original studies were published, which measures the citations of papers published in the previous 5 years. In 2016, the 5-year impact factor of Nature and Science was 44 and 38, respectively, meaning the papers they published in the same time period as the original studies were cited, on average, 38 to 44 times per year. . . .

OK, 153/16 = 9.6, so maybe the papers they’re looking at are, on average, 9.6 years old? I’m not sure, but I get the general point.
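For what it's worth, here is that arithmetic as a short Python sketch, along with a rough stab at the multiplicative version of the claim. The multiplicative part rests on an assumption that is mine and not the paper's: that the comparison papers are cited at about the journal's 5-year impact factor rate of 38 to 44 per year.

```python
extra_total = 153     # extra citations in total, from the press release
extra_per_year = 16   # extra citations per year, from the article

# Implied average age of the papers if both numbers describe the same gap
implied_years = extra_total / extra_per_year
print(round(implied_years, 1))

# Assumption (mine): baseline citations per year equal the quoted
# 5-year impact factors of Science (38) and Nature (44).
for baseline_per_year in (38, 44):
    ratio = (baseline_per_year + extra_per_year) / baseline_per_year
    print(baseline_per_year, round(ratio, 2))
```

Under that assumption the multiplicative gap comes out at roughly 1.4 times, not 153 times. That is a back-of-the-envelope number, not something reported in the paper.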

Getting to the explanations, the paper offers a plausible story:

When the paper is more interesting, the review team may apply lower standards regarding its reproducibility.

I agree with Ting, though, that it is a mistake to characterize a paper as “replicable” or “nonreplicable” based on whether a replication study exceeded some p-value threshold.

Update on “the hat”: It’s “the spectre,” a single shape that can tile the plane aperiodically but not periodically, and doesn’t require flipping

Last year we reported on “The hat”: A single shape that can tile the plane aperiodically but not periodically. A commenter pointed out that the aperiodic “hat” tiling included mirror reflections, which led to the question of whether there’s a single tile that can do the job without flipping.

Bob points us to the answer, pictured above from this source: It’s called “the spectre” and it’s an aperiodic tiling for which “reflected copies of the tile are not needed to form a tiling and no tiling with unreflected copies has a repeating pattern.”

Here’s the research paper, A chiral aperiodic monotile, by David Smith, Joseph Samuel Myers, Craig Kaplan, and Chaim Goodman-Strauss.

So cool!

As before, I haven’t checked this result myself, but I have no reason to doubt it.

When the story becomes the story

I was thinking recently about the popularity of Nudge, despite all its serious flaws, not just in presentation but in substance, not just extolling fraudulent science and then later not coming to terms with it, but also being part of a whole academic movement that relies on junk science even when you eliminate the clearly-identified fraud.

I can kinda see why this stuff would be popular with the Ted/NPR crowd, the kind of people who want to take your organization’s spare cash and spend it on management consultants, motivational speakers, and people who will organize lifeboat activities, which I guess is the modern equivalent of making people go to church every Sunday and mouth the words even if they don’t believe.

But how did it become so influential within academia? How is it that psychologists and economists (not to mention business and law professors) at top universities fell for it all?

Part of it is the whole academic-gold-rush thing: Tversky, Kahneman, and their predecessors and successors in the field of judgment and decision making really did indeed have lots of good ideas (see here, for example), and it made sense for other researchers to follow up and for others to popularize and promote the ideas.

So far, so good. But, then, when it moved from lab experiments and studies of defaults to goofy stuff like power pose and bottomless soup bowls and signing at the bottom and himmicanes and all the rest, why has it taken so long for academic researchers to jump off the train (and, indeed, some are still on it, serenely taking drinks in the club car as it goes off the cliff)?

Again, I’ll start with the charitable explanation, which I do think has a lot of truth to it: judgment and decision making is a real area of research, don’t want to throw out the baby, let’s accentuate the positive, etc etc. This is a strategic argument to keep quiet, keep getting some use of the bathwater as it slowly drains out [ok, sorry for switching metaphors but this is just a blog post, ok? — ed.], basically the idea is to extract what value there is here and kinda keep quiet about the problems.

But . . . I think something else has been going on, not so much now as ten or fifteen years ago when these ideas were at their height, and that’s that the story became the story, which is indeed the subject of this post.

What do I mean by “the story became the story”? I mean that a big part of the appeal of the Nudge phenomenon is not just the lab studies of cognitive biases, not just the real-world studies of big effects (ok, some of these were p-hacked and others were flat-out faked, but people didn’t know that at the time), not just “nudge” as a cool unifying slogan that connected academic research to policy, not just potential dollars that flow toward a business- and government-friendly idea, but also the idea of Nudge as an academic success. The idea is that we should be rooting for Thaler and Sunstein because they’re local boys made good. The success is part of the story, in the same way that in the 1990s, Michael Jordan’s success was part of his story: people were rooting for Michael to break more records because it was all so exciting, the same way people liked to talk about how world-historically rich Bill Gates was, or about the incredible Tiger Woods phenomenon.

Sometimes when something gets big enough, its success becomes part of the story, and I think that’s what happened with Nudge and related intellectual products among much of social-science academia. One of their own had made it big.

Another example comes up in political campaigns and social movements. Brexit, Black Lives Matter, Barack Obama, Donald Trump: sometimes the story becomes the story. Part of the appeal of these movements is the story, that something big is happening.

It doesn’t have to happen that way. Sometimes we see the opposite, which is that someone or something becomes overexposed and then there’s a backlash. I guess that happened to some extent with Gladwell (follow-up here but also see here). So it’s not like I’m postulating any general laws here or anything. I just think it’s interesting how, in some cases, the story becomes the story.

Whassup with those economists who predicted a recession that then didn’t happen?

In a recent column entitled “Recession was inevitable, economists said. Here’s why they were wrong,” Gary Smith writes:

In an August 2022 CNBC interview, Steve H. Hanke, a Johns Hopkins University economics professor, predicted: ‘We’re going to have one whopper of a recession in 2023.’ In April 2023, he repeated the warning: ‘We know the recession is baked in the cake,’ he said. Many other economists also anticipated a recession in 2023. They were wrong.

I am not an expert on monetary policy or economics. Rather, this story interests me as a political scientist, in that policy recommendations sometimes rely on academic arguments, and also as a student of statistical workflow I am interested in how people revise their models when they learn that they have made a mistake.

Along those lines, I sent an email to Hanke asking if he had written anything addressing his error regarding the recession prediction, and how he had revised his understanding of macroeconomics after the predicted outcome did not come to pass.

Hanke replied:

Allow me to first respond to your query of January 23rd. No, I have not written up why my longtime colleague John Greenwood and I changed our forecast concerning the timing of a likely recession. But, given your question, I now plan to do that. More on that below.

In brief, Greenwood and I employ the quantity theory of money to diagnose and predict the course of the economy (both inflation and real GDP growth). That’s the model, if you will, and we did not change our model prior to changing our forecast. So, why was our timing on the likely onset of a recession off? After the onset of the COVID pandemic, the money supply, broadly measured by M2, exploded at an unprecedented rate, resulting in a large quantity of excess money balances (see Table 2, p. 49 of the attached Greenwood-Hanke paper in the Journal of Applied Corporate Finance). We assumed, given historical patterns, etc., that this excess money would be exhausted and that a recession would commence in late 2023. Note that economic activity is typically affected with a lag of between 6 and 18 months after a significant change in the money supply. The lags are long and variable, sometimes even shorter than 6 months and longer than 18 months.

We monitored the data and realized that the excess money exhaustion was taking longer than we had originally assumed. So, we changed our forecast, but not our model. The attached Hanke-Greenwood article contains our new forecast and the reason why we believe a recession is “baked in the cake” in late 2024.

All this is very much in line with John Maynard Keynes’ quip, which has become somewhat of an adage: “When the facts change, I change my mind. What do you do, sir?”

Now, for a little context. After thinking about your question, I will include a more elaborate answer in a chapter in a book on money and banking that I am under contract to deliver by July. That chapter will include an extensive discussion of why the quantity theory of money allowed for an accurate diagnosis of the course of the economy and inflation during the Great Financial Crisis of 2008. In addition, I will include a discussion of how Greenwood and I ended up being almost the only ones that were able to anticipate the course of inflation in the post-pandemic period. Indeed, in 2021, we predicted that U.S. headline CPI would peak at 9% per year. This turned out to be very close to the 9.1% per year CPI peak in June 2022. Then, the Fed flipped the switch on its monetary printing presses. Since March 2022, the U.S. money supply has been falling like a stone. With that, Greenwood and I forecasted that CPI would end 2023 between 2% and 5% per year. With December’s CPI reading coming in at 3.4% per year, we hit the bullseye again. And, in this chapter, I will also elaborate on the details of why our initial prediction of the onset of a recession was too early, and why the data have essentially dictated that we move the onset forward by roughly a full year. In short, we have moved from the typical short end of the lag for the onset of a recession to the long end.

Again, macroeconomics is not my area of expertise. My last economics class was in 11th grade, and I remember our teacher telling us about challenges such as whether checking accounts count as “money.” I’m sure that everything is a zillion times more complicated now. So I’ll just leave the discussion above as is. Make of it what you will.

P.S. Since writing the above I came across a relevant news article by Jeanna Smialek and Ben Casselman entitled, “Economists Predicted a Recession. So Far They’ve Been Wrong: A widely predicted recession never showed up. Now, economists are assessing what the unexpected resilience tells us about the future.”

Simulation from a baseline model as a way to better understand your data: This is what “hypothesis testing” should be.

Clint Stober writes:

I would like to let you know about a paper my colleagues and I recently published in Perspectives on Psych Sci, and here is the preprint. We take a critical look at estimation accuracy across the behavioral sciences, using a hypothetical lab reporting random conclusions as a benchmark. We find that estimation accuracy can be so poor that it’s difficult to tell current practice apart from such a lab. It’s a short, but hopefully thought-provoking, paper that provides a different perspective on calibrating tests and the challenges of interpreting small effects. It certainly relates conceptually to Type S and M errors. Perhaps you and your readers will find it interesting. Links below to the article and the pre-print.

I’ve published in the journal Perspectives on Psychological Science, but more recently I was upset because the journal published a lie about me and refused to correct it. That said, journals can change, so I was willing to look at this new paper.

I like the idea of using “this idea of random conclusions to establish a baseline for interpreting effect size estimates.” This is related to what we call fake-data simulation or simulated-data experimentation.

It’s kinda what “hypothesis testing” should be: The goal is not to “reject the null hypothesis” or to find something “statistically significant” or to make a “discovery” or to get a “p-value” or a “Bayes factor”; it’s to understand the data from the perspective of an understandable baseline model. We already know the baseline model is false, and we’re not trying to “reject” it; we’re just using it as a baseline.
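Here is a minimal sketch of what that can look like in code. This is not the method from the paper. It is a generic fake-data simulation under assumptions I made up for illustration: a two-group comparison with 50 people per group, outcomes with standard deviation 1, a baseline model with zero true difference, and a hypothetical observed estimate of 0.25.

```python
import random
import statistics


def simulate_baseline_estimates(n_per_group, sd, n_sims, seed=1):
    """Difference-in-means estimates from a baseline model with zero true effect."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        estimates.append(statistics.mean(treated) - statistics.mean(control))
    return estimates


# All settings below are hypothetical, chosen only for illustration.
baseline = sorted(simulate_baseline_estimates(n_per_group=50, sd=1.0, n_sims=2000))

# Typical size of an estimate when nothing is going on
typical_size = statistics.mean([abs(e) for e in baseline])

# Central 90% range of baseline estimates
lower = baseline[int(0.05 * len(baseline))]
upper = baseline[int(0.95 * len(baseline)) - 1]

observed_estimate = 0.25  # hypothetical estimate from a real study
print(typical_size, (lower, upper), observed_estimate)
```

The point of the exercise is the comparison, not a verdict: you look at where the observed estimate sits relative to what the baseline model produces on its own, and that gives a sense of scale for interpreting it.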

Three takes on the protests at Columbia University

As you might have heard, we had some turmoil at Columbia University recently, of a sort reminiscent of, but much less than, the events on campus in 1968. I went to the bookshelves and pulled out three books from that era that discussed those past events:

SDS, by Kirkpatrick Sale

We Must March My Darlings, by Diana Trilling

Discriminations, by Dwight Macdonald

The 2024 protests were similar to the 1968 protests in that they represent a challenge from the left to the university administration and the national government. The contexts differ, though: in the late 1960s there were influential revolutionary movements on the left in the United States and around the world: it was leftists who were saying that the entire U.S. system was corrupt, elections didn’t matter, etc. Since the 1990s, the blow-it-all-up energy in this country has come from the far right—literally in the Oklahoma City bombing and more symbolically with the election denial movement. The far right has done a more effective job of taking over the Republican party in recent years than the far left did with the Democrats in the 1960s-1970s, so not a complete symmetry here. On campus, one difference is that in 1968 the protesters shut the university down; in 2024, that was done by the administration.

My point here is not to offer any comments on what has been happening at Columbia recently—I don’t think I have anything to add beyond what’s already out there. I just wanted to share these things written over fifty years ago, when the political situation was so different.


SDS

Kirkpatrick Sale was a radical left-wing journalist, and his book is a detailed, readable, and critical history of the Students for a Democratic Society, an organization that began in the early 1960s and moved further and further left until by 1970 they were setting off bombs and becoming politically irrelevant. The SDS was around the height of its influence when its Columbia chapter was involved in occupying buildings in a long showdown with the administration. Columbia presents the 1968 protests retrospectively in a positive light. Sale devotes 20 pages of his book to the events at Columbia, concluding:

Columbia was a vivid demonstration that (as the general SDS analysis had it) [students], still irrelevant insofar as they pressed for their selfish ends, could be a serious threat to the society when they acted for larger political goals . . . . Moreover, students, through their head-on confrontations with some of the major institutions of the society (universities, police, media) could expose the nature of those institutions, radicalize the community of the young, and create new recruits to the cause of total social transformation.

SDS is right next to The Catcher in the Rye on our bookshelf. Alphabetical order is a funny thing sometimes! Or maybe the point is that if you pull out almost any pair of books, you’ll be able to find some connection. On the other side of SDS is Superior, by Angela Saini, which we discussed in this space a few years ago (I liked the book, Lizzie had problems with it).

We Must March My Darlings

Diana Trilling (next to Calvin Trillin on our bookshelf) was a literary critic and wife of a Columbia professor of English—they lived in the neighborhood—and she wrote about the 1968 events in a long essay for Commentary magazine that is included in the above book. She began by comparing them to the mass rally that had happened the year before in Washington, D.C.:

The march on the Pentagon was organized as a protest of the Vietnam war while the war was all but absent as an issue of the University protest. The Washington occasion, taken as a whole, had also permitted a rather broader representation of political views than was manifest in the early stages of the Columbia uprising . . . But these differences are of secondary importance compared with the similar philosophies and tactics of the two events. Both were acts of civil disobedience initiated by people who regard the law as the instrument of established power, the arm of a particular and despised form of social organization. . . .

Existential the two occasions might be, and morally and politically continuous with each other, but the march on the Pentagon was wholly a symbolic enterprise whereas the University uprising, although not without its large symbolic impulse, was shatteringly actual. The Washington demonstration was a protest of the Vietnam war; as such it logically directed itself against the building which houses the Department of Defense. But no one supposed the Pentagon could be occupied or its work halted. The University made a quite different case. For the present-day revolution, all universities are representative of the society in which they exist. This is why they are under assault—for the revolutionary young their schools are their most immediate symbol of the hated social authority.

Also this:

Columbia, the campus itself and its immediate vicinity where many of the faculty live, has for some years been an island, a white island, constantly shrinking. . . . It is the proximity of Harlem to Columbia that made the student uprising of this spring a great deal more than a mere campus manifestation . . . At no point, however, did the black population outside the University make more than a token contribution to the revolt. But this was through no lack of effort on the part of the revolutionary students who launched the insurrection and who continued to have it largely in their charge. . . .

Trilling expresses unhappiness with the anti-university protests, writing:

And education is still sacred for most of us; for where else in this modern universe of ours, unless to education, are we to look for our continuing civilization, where else do we issue our passports to knowledge and enlightenment?


Discriminations

Dwight Macdonald (neighboring book on shelf: The Valachi Papers by Peter Maas) was another literary critic, perhaps more accurately described as a political and cultural critic. The very last piece in his very last book was an exchange of letters in the New York Review of Books, on the Columbia student strike of 1968. Macdonald supported the strike (“I’ve never been in or even near a revolution before. I guess I like them”); on the other side was Ivan Morris, a professor of East Asian Languages and Cultures at Columbia. Morris seems to have been on the left as well—he was chairman of the American section of Amnesty International—but he drew the line at revolutionaries occupying buildings and university offices.

Here’s Macdonald:

When I first read about it in the press, I was against it on general principles: I don’t approve of “direct action” that interferes with the freedom of others, nor could I see the justification for a minority occupying college buildings and closing down a great university—or even a small, mediocre university. That was in general. But, as often happened in my life, the general yielded to the pressure of the particular. On Friday I went up to Columbia to see for myself . . . There was an atmosphere of exhilaration, excitement—pleasant, friendly, almost joyous excitement. . . . But what really changed my mind about the sit-ins was my own observation of two of the “communes,” as the occupied buildings were ringingly called . . . the atmosphere in both was calm, resolute, serious, and orderly . . . it was, or seemed to be, participatory democracy . . .

Reading all these accounts, and then writing this post, what strikes me is not so much the disagreements on principles as the different functions of the protests themselves.

For Sale, the protests were part of a national revolutionary movement which had achieved some successes and some failures. Sale was interested in understanding what worked and what didn’t work, with the (in retrospect unfounded) hope that future left-wing revolutionary movements could do better.

For Trilling, the protests reflected different groups within Columbia, within the city, and within the country: it was a power struggle that was happening in her neighborhood and her community.

For Macdonald, the salient thing about the student actions was the protest itself, as representing a way of being that was different from the default top-down organization of business, the military and police, civilian government, schools, and other institutions in society.

All these perspectives, and many others, are of interest. Just reading one recounting of the events, or even one debate with two sides, wouldn’t give a full sense of the different ways of thinking about the events.

Dan Luu asks, “Why do people post on [bad platform] instead of [good platform]?”

Good analysis here. Here are Luu’s reasons why people post on twitter or do videos instead of blogging:


Just looking at where people spend their time, short-form platforms like Twitter, Instagram, etc., completely dominate longer form platforms like Medium, Blogspot, etc.; you can see this in the valuations of these companies, in survey data, etc. Substack is the hottest platform for long-form content and its last valuation was ~$600M, basically a rounding error compared to the value of short-form platforms . . . The money is following the people and people have mostly moved on from long-form content. And if you talk to folks using substack about where their readers and growth comes from, that comes from platforms like Twitter, so people doing long-form content who optimize for engagement or revenue will still produce a lot of short-form content.


A lot of people are going to use whatever people around them are using. . . . Today, doing video is natural for folks who are starting to put their thoughts online.


When people talk about [bad platform] being lower friction, it’s usually about the emotional barriers to writing and publishing something, not the literal number of clicks it takes to publish something. We can argue about whether or not this is rational, whether this “objectively” makes sense, etc., but at the end of the day, it is simply true that many people find it mentally easier to write on a platform where you write short chunks of text instead of a single large chunk of text.


And whatever the reason someone has for finding [bad platform] lower friction than [good platform], allowing people to use a platform that works for them means we get more content. When it comes to video, the same thing also applies because video monetizes so much better than text and there’s a lot of content that monetizes well on video that probably wouldn’t monetize well in text.

Luu demonstrates with many examples.

I’m convinced by Luu’s points. They do not contradict my position that Blogs > Twitter (see also here). Luu demonstrates solid reasons for using twitter or video, even if blogging results in higher-quality argumentation and discussion.

Blogging feels like the right way to go for me, but I also like writing articles and books. If I’d been born 50 years earlier, I think I just would’ve ended up writing lots more books, maybe a book a year instead of every two or three years.

As for Luu, he seems to do a lot more twitter posting than blog posting. I went on twitter to take a look, and his twitter posts are pretty good! That won’t get me to be a regular twitter reader, though, as I have my own tastes and time budget. I’ll continue to read his blog, so I hope he keeps posting there.

P.S. I was thinking of scheduling this for 1 Apr and announcing that I’d decided to abandon the blog for twitter, but I was afraid the argument might be so convincing that I’d actually do it!