The Effect of High-Tech Statistical Analysis on the P-values of Top Researchers

Michael Wiebe writes:

My [Wiebe’s] replication of Moretti (2021) is now accepted as a comment at American Economic Review. Here’s the pdf. I show that the event study uses the wrong model, and the instrumental variable has a coding error. Fixing both issues leads to null results, so the positive correlation between cluster size and patenting may not be causal. I also document eight other issues in the appendix.

We mentioned this one a couple years ago. It’s good to hear that the criticism was published at the same journal as the original article. That almost never seems to happen! Credit to the AER for not being defensive. It’s inevitable that a journal that publishes a lot of papers will publish some with fatal errors; the AER just seems to be one of the few journals that can accept the implications of that.

I expect it took a lot of persistence on Wiebe’s part to get his comment through the review process; I usually lose patience with that sort of thing! For example, I don’t know where this one will ever get published.

“The idea of Israel” . . . more generally, The idea of X, for different values of X

We were talking about herrenvolk democracy in class last week, and I happened to come across a book, The Idea of Israel, published in 2014 by Ilan Pappe. He’s an historian from Israel, now working in England, and his book focuses on the period in the 1990s when it became prevalent in Israeli academia, arts, and the news media to challenge the standard patriotic (“Zionist”) view of the history of Israel/Palestine. Pappe argues that before the 1990s there were almost no prominent anti-establishment voices, then there was a blooming in the 1990s (the “post-Zionist moment”) followed after the failure of the Oslo peace process by a “neo-Zionism” in which dissenting voices have been quieted.

It’s an interesting story and it makes me wonder how it would work in other countries.

U.S. academia has its share of opposition to established narratives, with a tradition going back at least to Charles Beard in the early twentieth century and continuing today, and we see some of this in the news media as well, at least in the soft version such as proffered by TV documentarian Ken Burns. Maybe one reason for this is that the centuries-old political divisions between the North and the (white) South have been accompanied by alternative national myths. So, once you get past 1783 or so, you have two established narratives to choose from, which means that in some ways just about everyone is oppositional. That said, as someone who was born and raised in this country, it’s hard for me to put myself in a state of equipoise where I can read about the American revolution, the First World War, the Second World War, or the Cold War, and not root for our side. I wouldn’t deny that atrocities were committed on all sides of these conflicts, so I’m not saying I’d drink the straight-up 200-proof patriotic history here, but I still have that bias. Then again, I’m not a historian. If I were, and given my temperament, I’d probably just kinda get technical and avoid taking a stand on the big issues.

What about other countries? I have no idea what academia and the news media are like in Egypt, say, but my guess is that they depart from the national party line much less than their counterparts in Israel, let alone in the U.S. What about our neighbors? I’d guess that Canadian academia and culture are similar to the U.S. in broadly supporting the national myths while allowing a fair share of dissent. Mexico is like the U.S. in a different way in that revolution and conflict are part of its national culture: unlike in Israel or Egypt, I doubt there’s a single dominant national myth in Mexico, so I imagine there’s space in Mexican academia and culture for a variety of takes on the country’s history.

There have also been changes over time. I’ve been reading the memoirs of Raymond Aron, a French sociologist who was highly politically engaged for much of the twentieth century, and his stories of the tumult in French academia and news media in the late 1950s during the war in Algeria reminded me of Pappe’s discussion of Israel. There were dissenting voices in France at that time, Aron’s among them, but the dissenters got a lot of angry pushback: they were attacked as traitors, and the opposition was not a comfortable place to be. Perhaps if Charles de Gaulle had been assassinated and a hard-line government had taken over, France in the 1960s might have moved to a neo-patriotic period of cultural repression in the mode of Israel’s neo-Zionism.

After finishing a book I like to triangulate by reading some reviews to get other takes. Directly googling Pappe’s The Idea of Israel was not helpful because all I got were brief positive reviews that didn’t add anything to the book. But more searching yielded some harsh negative reviews of some of Pappe’s earlier books. Some of these were just empty political attacks, but I came across two meaty negative reviews by Benny Morris, an Israeli historian roughly the same age as Pappe, who followed a different political trajectory. Like Pappe, Morris wrote from the late 1980s onward about Israeli war crimes (covering the period before, during, and after the 1948 war), but then Morris moved to what Pappe would call a neo-Zionist position. So the two historians are now in political opposition, even while they agree on many of the historical facts.

In his reviews from 2004 and 2011, Morris pulls no punches, using terms like “complete fabrication,” “absurd,” “can almost be called a deliberate system of error,” “one of the world’s sloppiest historians,” “he will also simply and straightforwardly falsify evidence,” etc.

I can also point you to Pappe’s response to the 2004 review. After reading Morris’s reviews and Pappe’s response, I have to say that I don’t know what to think. I’m no expert on the history of the Middle East and am even less of an expert on the historians of the Middle East. Further googling led me to this Reddit thread which makes a pretty convincing case that Pappe has indeed been sloppy in his historical writing. And, yeah, I know, Reddit, but still, my current thinking is that Morris has a point. I’d have more respect for Pappe if, in his reply to Morris, he’d had some “my bads” to acknowledge errors or misleading passages in his books.

That said, I still find the intellectual-history aspect of Pappe’s book to be interesting. In chapter 3, for example, he supplies this quote from a “history book on 1948 that was used as the main professional text in Israel for many years — The Edge of the Sword”: “Israel’s victory in the war was a miracle performed by divine authority, by a God who has not deserted his people in their hour of need. . . .” This is a book from 1961, so I guess it hasn’t been the “main professional text” for many years, but still.

Setting aside disputes over the facts, which don’t seem like much of a dispute here, actually–Pappe and Morris seem to agree on most of the key points, with the disputes being over various minor details–, this really seems like a debate about legitimacy. In Pappe’s view the pro-Palestinian historical perspective was marginalized in Israel for many years, was briefly taken to be a legitimate position (but hardly the dominant position), and then became delegitimized. Morris seems to be arguing that the pro-Palestinian view indeed should be illegitimate within Israel. I don’t think he’d argue that nobody in Israeli academia, arts, or the media should espouse such views, but he doesn’t think they deserve any respect, in the same way that I wouldn’t think that any respect would be due to an American political scientist who argued that we should bring back the monarchy and become subjects of King Charles III. Or how I think that fringe scientific views such as HIV denialism or belief in ghosts belong on the fringe.

The question of professional balance, or whatever you want to call it, is tied in with what views are considered to be acceptable. For example, in a history of the Second World War, you want to give all sides in historical perspective but the Nazi racial theories would be historicized, not taken as serious arguments. I guess there’s room in academia for a few hardcore Nazis–the government and social media seem to be filling up with them now–but I think it’s also ok to disrespect them. And from the perspective of Morris and other neo-Zionists, there’s not much difference between Pappe and a Nazi, so they don’t have much sympathy for his complaint that he and other post-Zionists have been pushed to the edge of the Israeli conversation.

Even in a technical field such as statistics, I think some leading academic figures have been getting things completely wrong, and I don’t think they’re doing it on purpose, I just think they’re missing the point, that there’s a disconnect between their theory, their methods, and the applied problems we’re all trying to solve.

What I’m getting at is that, in any space of discussion, there’s a question of how much bandwidth to give different views, which perspectives should be taken seriously and which should be dismissed. National history is an interesting example because we’re immersed in it from childhood and it has serious consequences.

From time to time a misguided but fashionable idea will take over an academic field or subfield in a country. Examples are Lysenko’s biology in the Soviet Union, junk social psychology in the U.S. in the 2010-2015 period, Chomskyian linguistics in the U.S. for a few decades, string theory in physics (ok, I’ll leave that one to others to decide how misguided it is), etc. When this happens in science, we see it as an aberration. But when it comes to history, I’m guessing that extreme nationalism is the usual dominant position and that it is rare for countries to give serious consideration to alternative views.

So you can take this post as a general comment about how to think about “The idea of X,” in environments where the range of discussion can shift a lot in different decades according to political, economic, and social conditions.

“The inner workings of a scam”

This story from pharmacologist Csaba Szabo is hilarious/horrible. He goes through a long email exchange with a fake-academic-journal scammer, gradually extracting various aspects of their evil business plan.

I get these scam messages all the time and either ignore them or mock them on the blog (as in this notorious example which came from Wolfram Research). Sometimes I try something cute like, they tell me about some service they have to turn my work into a fancy brochure and website for the cost of a mere $3000 or whatever, and I respond: Thanks, but $3000 isn’t enough, if you send me $10,000 I’ll consider. So I admire Szabo for having gone to all this effort to expose this for us.

Hey! What the Froot was up with this Harvard website?

This is probably the least important topic in the history of our blog.

I have this old post, Teaching materials now available for Llaudet and Imai’s Data Analysis for Social Science!, which recently received the following comment from Shriram Krishnamurthi:

For whatever strange reason, the Website link (for “tons of materials”) now redirects to an unrelated(?) professor’s vanity site. The link one wants is here:
https://ellaudet.github.io/dss_instructor_resources/

I checked and Shriram is right, so I fixed the link.

But here’s the weird part.

The old link (now removed from the earlier post) was: https://scholar.harvard.edu/ellaudet/dss-instructor-materials

I get that Llaudet is no longer at Harvard and they might want to keep their website clean by removing links to former affiliates. So it makes sense that the link “https://scholar.harvard.edu/ellaudet/dss-instructor-materials” would no longer work.

But here’s the weird thing. Go to that url and it bumps you over to a website called “https://k-froot.com,” which looks like this:

It’s the homepage of a retired Harvard professor of business administration. I tried plain old “https://scholar.harvard.edu/ellaudet” and this also takes us to k-froot.com.

And then I tried “https://scholar.harvard.edu/abc” and, you guessed it . . . it takes us to k-froot.

Did Froot make some sort of arrangement with his employer so that all non-working links at https://scholar.harvard.edu go to his webpage? This would seem a bit bizarre, as it’s not like he’s getting anything from these links (in case you’re wondering, if you click through to his C.V. you’ll learn various random things such as that in 2016 he received the “Crowell Second Paper Prize, PanAgora Asset Management”).

Or maybe it’s a glitch on the Harvard website, that these nonworking links all get sent to k-froot.com because some accidental bit of code ended up in their html? I have no idea. Nonworking links at the regular Harvard site (e.g., https://www.harvard.edu/abc) just go to your standard 404 page (not the one linked here, unfortunately). Does anyone have any idea?

P.S. I just checked the links and that Froot thing is no longer happening. Maybe it was some temporary glitch that they fixed. Weird. Anyway, I changed “is” to “was” in the title of this post.

Journalists and the people they interview: The individual contract and the social contract

1. Why do journalists interview people? Why do people agree to be interviewed by journalists?

The implicit individual contract is that each side gets something: the journalist gets material, some contribution to an interesting story, and the person interviewed gets to tell his or her story.

But that’s not the whole thing. Journalism isn’t just a way of making money (or, perhaps, of being a loss leader for some other business); it’s also a public trust. I’m not saying there’s something amazing about journalism; lots of jobs are public trusts, including doctors, nurses, teachers, police officers, bus drivers, farmers, etc. We’re all making a living here, but we’re also doing our part to help our civilization run smoothly.

So there’s also an implicit social contract: journalists uncover and share important news (they “comfort the afflicted and afflict the comfortable”), and the rest of us donate our time to help them out.

In my own professional life, I spend lots of time doing research, teaching, and writing, but I also sometimes act in a journalistic role, and other times I help out journalists. I think that when I act as a journalist, I follow these implicit contracts–at least, I try!–but in my experience as a source, sometimes it feels that the news organizations are not holding up their side of the bargain.

2. Agreement or controversy, but no room for uncertainty

A few years ago I was contacted by a newspaper regarding the claim by a couple of economists that mortality rates among middle-aged white people were increasing.

I’d found a problem with the economists’ published paper–they hadn’t done sufficient age adjustment, and it turned out that after adjusting for age, the mortality rate in this demographic category was increasing for women but decreasing for men–and so this was worth correcting.
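To see how skipping the age adjustment can manufacture a trend, here is a minimal sketch of direct age standardization. All numbers are hypothetical and chosen for easy arithmetic; they are not the data from the economists’ paper, and the function names are mine. The age-specific death rates are held fixed across the two periods, and only the age mix within the group shifts toward the older half.

```python
# Hypothetical illustration of direct age standardization.
# None of these numbers come from the paper discussed in the text.

def crude_rate(deaths_by_age, pop_by_age):
    """Unadjusted rate: total deaths divided by total population."""
    return sum(deaths_by_age) / sum(pop_by_age)

def age_adjusted_rate(deaths_by_age, pop_by_age, standard_weights):
    """Weight each age-specific rate by a fixed standard age distribution."""
    rates = [d / n for d, n in zip(deaths_by_age, pop_by_age)]
    return sum(w * r for w, r in zip(standard_weights, rates))

# Two age bins within one "middle-aged" group: younger half, older half.
# Age-specific rates are 300 and 500 per 100,000 in both periods.
pop_early, deaths_early = [600_000, 400_000], [1_800, 2_000]
pop_late, deaths_late = [400_000, 600_000], [1_200, 3_000]

# Assumed standard population: equal weight on each bin.
weights = [0.5, 0.5]

for label, deaths, pop in [("early", deaths_early, pop_early),
                           ("late", deaths_late, pop_late)]:
    crude = crude_rate(deaths, pop) * 100_000
    adjusted = age_adjusted_rate(deaths, pop, weights) * 100_000
    print(f"{label}: crude {crude:.0f}, age-adjusted {adjusted:.0f} per 100,000")
```

In this toy setup the crude rate rises from 380 to 420 per 100,000 purely because the group got older on average, while the age-adjusted rate stays at 400. The choice of standard weights is itself an assumption, and different reasonable choices can shift the adjusted numbers.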

At the same time, this group’s mortality rate, even if not increasing, was still not decreasing in the way that it was in many other rich countries, and so the economists’ general point–that we should be concerned about this trend–was still largely valid.

And here was the problem. If I wanted to chime in and agree with that published paper, that would’ve been fine. Or if I wanted to “debunk” it, that would’ve been newsworthy too. But for me to say: Yeah, their general point is right but, no, mortality rates among men in that group were not actually increasing . . . Nah, that’s too subtle.

What the journalists wanted was either a clean story in which those economists were heroes, or a juicy dispute between two warring factions of academics. My half-assed intermediate position just didn’t work for them.

I felt that, by soliciting and gaining my sincere efforts and then discarding what I had to say, the reporter was violating the individual and social contract.

It would be as if someone asked you to donate something to a bake sale for their good cause, and you offered to bake a banana bread, and they said that would be great and you should leave it at a certain spot, and then they never came to pick it up and the bread went stale. They’re under no obligation to serve your bread, but it’s abusive of your goodwill for them to ask for something they’re not going to use. I think that’s sometimes how people feel about small-dollar campaign contributions, when it doesn’t seem that the candidate spent the money efficiently: the money meant something to you but apparently nothing to them.

Another time I was angry about some shady journalists who pretended to be associated with MIT. In that case, my take on the matter did not fit into the simplistic story they wanted to tell about scientific heroes and villains. Again, no reason they had to agree with my take; what I didn’t like was that they misrepresented the facts. But journalism is all about storytelling, right?

3. Ghosted!

A few weeks ago I was contacted by a major news organization about a story they were running–they wanted my take as a statistician on the matter, as they’d seen a post of mine on the topic in question. I said sure, I’d be happy to talk with them. They followed up with some questions and I responded with a message expressing my uncertainty.

And then . . . crickets!

I sent a couple followup emails to see what was up, and the producer didn’t respond. I had the horrible feeling that they didn’t want me on the show because whatever I would say wouldn’t fit their predetermined storyline. Indeed, my annoyance at that was one reason for writing the present post. I felt . . . exploited, almost: I’d given them my trust and then when I didn’t act like an authoritative expert, they decided they didn’t need me.

But then, after I started writing this post but fortunately before I posted it, they got back to me! They were just super busy. OK, super busy is annoying too, but, no, it’s not at all true that they only wanted me to tell them what they already wanted to hear. Indeed, in this case I was arguably putting their responses into my preconceived storyline! So, a good cautionary tale.

4. Which journalists should we trust to tell our stories to?

Coincidentally, while I was in the middle of writing this post (in which I’m acting both as source in telling my story and as journalist in shaping it for you), I happened to listen to an episode of the 404 Media podcast where Jason Koebler of 404 was interviewing independent journalist Marisa Kabas. One thing that came up in the interview was that, when it came to sharing stories about tech and the government, many of their sources were much more comfortable talking with independent media such as Kabas and 404 than with major news organizations such as the New York Times and the Washington Post.

Now, I have nothing bad to say about the Times or the Post. Or, to be precise, I do have complaints (see here and here), but these are large organizations, and every large organization has its serious flaws, even my own employer!

The question here isn’t whether the Post or the Times are good institutions; the question here is: If you have a story to tell, would you rather tell it to Kabas or 404, or would you rather tell it to some random reporter from one of these major news organizations? (OK, I guess the Post is now a minor news organization, but you get the point.)

Kabas and Koebler said that lots of people in the trenches would trust them more. And so would I. If I wanted to tell some story about bad things going on in my workplace, or in my corner of academia or business, I’d trust Kabas or 404 more than I’d trust someone from a major news organization.

What’s going on here? How can this be?

I’ve read enough from 404 Media to have a sense that they have integrity, will report their stories seriously, and that they’re not tied to simplistic storylines. In contrast, my impressions of major news organizations have been a mix of good and bad. I’m sure that NPR has lots of wonderful reporters, but they also emit a continuing stream of junk-science hype. As to newspaper/magazine/radio/TV reporters: they’re almost always under the gun, under so much pressure to finish their piece that I’ll often find myself giving them tons of background material that they never use.

And then there’s the push toward simple storytelling, which I noted in part 2 of this post above.

There’s also a selection effect. I think I’d have more confidence telling my story to a random NYT reporter than to a random blogger on the internet! So in some sense it’s not so fair to compare 404 Media to the New York Times. It would make more sense to compare 404 Media to some small set of Times reporters who I respect a lot.

And when you get contacted by someone you’ve never heard of, like those fake MIT journalists, be careful!

5. When do the news media show respect to the people who share their stories?

Another thing that came up in that 404 interview was the idea that these particular independent journalists take the stories they’re writing seriously. It’s not all a big joke to them, and they respect the people who share their stories.

That sounds obvious, but in my experience with major media, I don’t always see it. As noted above, sometimes reporters get my cooperation but then withdraw when I don’t want to tell the story their way. Or, more generally, they can be fishing for a quote that will slot into their set narrative. They don’t want my story at all, which in some way is fine–they have every right to report the news however they want–but when they do this, they’re exploiting me and others who are giving them our time and our stories for free. They’re violating the individual contract, and they’re violating the social contract. (That said, I’ve often had excellent experiences with journalists in the organized news media, such as Stephanie Lee, formerly of BuzzFeed and now at the Chronicle of Higher Education, who are open and really listen to what people tell them.)

Another example of the problem of the news media not showing respect, this time not involving me in any way, is discussed in this article by Jessica Olin about Amanda Knox, that young woman who was imprisoned in Italy for four years for a murder that it seems she never committed. The relevance to the present post is this passage, reporting reactions to a book that Knox wrote after being released from prison and returning to the United States:

In the New Yorker, Mark Singer dutifully explained why she had been acquitted but warned that ‘if she now elects to exploit and cash in on her celebrity, it will prove that she hasn’t learned much worth emulating.’ On the LRB blog, Lidija Haas predicted that ‘few will want to read’ the book.

Writers employed arch language – the killing was described as the result of a ‘sex escapade’ and ‘high jinks’ – and framed events as though they were fictional. The playwright John Guare, who called Knox ‘my kind of murderess’, wondered whether she was a heroine in the mould of ‘Daisy Miller, an innocent young girl who goes to Europe for experience? Or is she Louise Brooks, the woman who takes what she wants and destroys everything? Or is she Nancy Drew caught up in Kafka?’ Nathaniel Rich wrote in Rolling Stone: ‘One might expect that the lead role in this blockbuster would be assigned to the victim,’ but ‘the show was stolen by an accidental ingénue.’ . . . In the New York Times, Sam Tanenhaus was dismissive of Knox’s ‘well-orchestrated round of TV appearances’ . . .

And this wasn’t the tabloid press; it was prestige journalism. Now I’m not saying that these writers had an obligation to take Knox’s side, or that her status as a wrongly-imprisoned person should make her writing or her life immune from criticism. Rather, I’m agreeing with Olin that there’s something disturbing in how these writers were treating Knox as a character, not as a person.

6. When I’m acting as a journalist

People send me tips all the time, and sometimes I look into these and blog them. I have to be careful–sometimes I’ll get an email about some purported scandal but it’s not something that seems bothersome at all, but often it really is a thing. Often the things I end up writing about involve malfeasance by highly placed academics at Cornell, Nevada, California, USC, Harvard and Stanford, Freakonomics, etc etc. I don’t do lots of reporting–usually I’m commenting on public documents–but I do get all these tips, often from people who want to remain anonymous, and I’m pretty careful about showing respect to the people who send me things. I can see why they would trust me more than a news organization that will already want to slot their story into some preconceived pattern.

7. Our parasocial relationships

The term “parasocial” has been used to describe the way in which ordinary people can almost feel that they personally understand how some celebrity is thinking. There is fannish behavior (not always so horrible; I don’t mind signing books for people) and sometimes the corresponding disappointment if someone does something disappointing (I guess that’s how some Larry David fans felt after he was shilling crypto).

Many journalists have a personal style, and that goes extra for people who write in the blog format. I’ve given you 20 years of myself, a bit every day, indeed I’d be disappointed if you didn’t have some sense of how I think, and not just on statistical issues. This parasocial relationship can be one reason–not the only reason, but one of them–that strangers email me and trust me with their stories. And it goes the other way too! I listen to a few podcasts from those 404 Media people and I think of them as my friends, kind of. And then there’s Nate Silver: I respect his work, we did some small projects together, I even met him once, and then when he decided to stop engaging, I was really disappointed. But I guess I shouldn’t have been disappointed: Nate has many goals, and worrying about the details of his statistical models has to be low on his priority list.

The dangerous side of parasocial relationships with the news media (setting aside personal dangers such as stalkers, which, yeah, that’s scary) is when people place trust in news influencers who are spreading lies. I’m not thinking so much of politicians and performers such as Alex Jones or Al Sharpton or Tucker Carlson–I assume that even their fans see these guys as distant political actors, not as potential friends–, but rather influencers on social media–bloggers, even!–who come off as ordinary Joes with a story to tell. For example, I could do a lot of damage if I were to persuasively write in support of bad statistical ideas and junk science!

So there are pluses and minuses to institutional “Big Journalism.” On the minus side, as discussed earlier in this post, it often lacks the trustworthiness and commitment I sense from independent journalists, who really do seem to care about the stories people are telling them. On the plus side, major news organizations can institute some quality control (with some exceptions) so that, even if they’re writing cookie-cutter stories, they’re not doing willful manipulation in the manner of propaganda outlets.

I also see the downside of parasocial relationships, in that some people who don’t know me seem to think I’m mean. I’m not mean! This is not to say that I’m always nice, just that I try my best to see things from each person’s perspective. For example, I’ve said a lot of mean things about Cass “Nudgelord” Sunstein (for example here and here), but really am trying to understand where he’s coming from.

8. Journalists are busy!

I sent an earlier version of this post to some journalists I know–a mix of people, including a former magazine editor, a former wire service and newspaper reporter, a former college newspaper editor . . . Lots of “formers” there, which tells you something about the state of the journalism business today.

One of them pointed out a selection issue I hadn’t thought of:

I think there are tons of more indie publications that consistently rise and fall every time they have a big scandal, while the new york times or npr are big enough to withstand all their scandals. So the smaller publications that you hear of end up being more trusted.

That’s an interesting point. By the time I’d heard of 404, they’d already been around for a while. They’re independent, but they’re big enough that I’d heard of them.
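Here’s a toy simulation of that selection effect, under assumptions I’m making up for illustration: every outlet, big or small, has the same yearly chance of a scandal, a small outlet folds at its first scandal, and a big outlet keeps going. The parameter values and function name are hypothetical.

```python
import random

def simulate_survivorship(n_outlets=1000, years=10, p_scandal=0.1, seed=0):
    """Compare scandal records of surviving small outlets and big outlets."""
    rng = random.Random(seed)

    # Small outlets: the first scandal is fatal, so survivors have none.
    survivor_scandals = []
    for _ in range(n_outlets):
        folded = any(rng.random() < p_scandal for _ in range(years))
        if not folded:
            survivor_scandals.append(0)

    # Big outlets: same scandal rate, but they absorb every scandal.
    big_scandals = [
        sum(rng.random() < p_scandal for _ in range(years))
        for _ in range(n_outlets)
    ]

    return {
        "small_surviving_fraction": len(survivor_scandals) / n_outlets,
        "small_survivor_mean_scandals": sum(survivor_scandals) / max(len(survivor_scandals), 1),
        "big_mean_scandals": sum(big_scandals) / n_outlets,
    }

if __name__ == "__main__":
    for name, value in simulate_survivorship().items():
        print(f"{name}: {value:.2f}")
```

By construction, the small outlets you can still hear about have clean records, while the big ones carry about one scandal each on average with these settings, even though the underlying scandal rate is identical. Real outlets don’t fold on a strict one-strike rule, so take this as a cartoon of the argument, not an estimate of anything.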

All my journalistic contacts pretty much agreed with my point that it can be frustrating to tell your story to a news organization and it turns out that they’re just trying to slot it into a preconceived structure: a clean narrative with perhaps space for a contrarian view but not much room for ambiguity. One of the challenges here is that critical journalism can require some subject-matter expertise, or access to people with subject-matter expertise, but it can be hard to find such people. Sara Silver discusses this in this story about failures of the business press in their reporting of the financials of Netflix.

But my colleagues also pointed out that a good journalist might interview lots and lots of people for a story, and there won’t be room to share all their stories in the published article. So, just for statistical reasons, it’s likely your story won’t make it to the final cut, and this isn’t necessarily a failure of the reporting process. This could be different for independent media which is more bloggy and can have more space to present a broader perspective.

Here’s how Stephanie Lee puts it:

Another factor that you allude to, but is more significant than maybe you realize, is that journalism is up against severe financial challenges. There’s been a massive decimation in staff-employed reporters (i.e. the mass layoffs at the Post, the closing of BuzzFeed News), and it’s nearly impossible to make freelance journalism work financially, and as a whole, journalists are under pressure to produce a lot of stories in a short amount of time in order to keep audiences engaged, keep the social media platforms fed, etc. This has always been true to some extent, but it’s especially exacerbated now. I sincerely appreciate your kind words about me, they made my day, but I also want to acknowledge that BuzzFeed and now the Chronicle have given me the time, space, and resources to pursue stories that I think are interesting and important, even if they take a long time, and that don’t even always result in publishable stories, and that most journalists don’t have these luxuries.

If you’re doing journalism as a job, you’re probably overworked and underpaid. Independent journalists can do better in some cases because they may be doing it in transition from one job to another, or because they’re working to build their personal reputation, or because they’re in a small-business structure in which they don’t need to work on too many stories at once. Or, as in my case, because they have a full-time job and they’re doing journalism on the side.

9. Summary

Sometimes I’m contacted by news organizations because they want an interview or a quote or background information, and, with rare exceptions, I feel like the journalists on the other end of the line are just doing their job–they’re not really interested in what I have to say, they just want some material to slot into a narrative they’ve already written. I expect that independent journalists such as those from 404 would be much more respectful of what I, or other sources, had to offer them.

From the other direction, when people contact me with material, I show them respect. I don’t automatically agree with them; what’s relevant here is that I follow the implicit code: they’re giving me material and I’m telling their story.

OK, not always. On occasion I’ve received angry emails from prominent academics objecting to something or other in my posts. I’d rather they just write something in the comment section so we could have an open discussion. But if someone has something reasonable to say, I’m happy to post it, along with my take.

After writing all the above, I’m afraid I’m giving the impression that I hate most journalists. I don’t! They’re doing a job, and when people are doing a job, they typically fall into an “I’m doing my job” mindset. It happens to me too! I love research and teaching, also they’re my job, and there are days when I come into work and I’m not fully committed. Blogging’s easier because I’m doing it as extra. And maybe it feels different if you’re working in independent media, I don’t know. Again we have to watch out for selection bias, as there are tons of independent journalists or opinionators or whatever you want to call them on the internet who are some combination of stupid, corrupt, and irremediably ideological.

And those people working at the Washington Post, or the New York Times, or the BBC, or even NPR . . . they’re doing their best. They’re just unfortunately working within a framework in which their job is to regularly produce coherent narratives on deadline. So when you talk with them, they might not really be wanting your story; rather, they want a piece in a puzzle that’s already laid out on their table.

My purpose in writing this post is to frame this in terms of the implicit individual and social contracts that bind journalists to their sources. Anyone who’s reporting a news story, or is talking to a journalist, is putting some time into what, ultimately, is a collaborative effort.

My online talk Tues 24 Feb, 9am NY time at the Behind-the-Scenes seminar series: Russian Roulette and stochastic potential outcomes

I’m speaking at this online seminar Tues 24 Feb, 9am NY time:

The Behind-the-Scenes Seminar Series is designed to learn about the production process of research papers, offering an opportunity for students and researchers in all fields and at all career stages to engage with the challenges encountered during project development and how they were overcome.

Unlike most research seminars that focus on the research findings, this series will be dedicated to discussing the research process. Not only this, the seminars will also feature a live survey to gauge the audience’s expectations regarding the journey of the paper and compare them with the speaker’s actual experience.

What happened is that a few months ago the seminar organizers (three economists: Vatsal Khandelwal, Séverine Toussaert, and Jasmin Baier) wrote to me:

Speakers not only present their findings but also share the story behind their research, from the initial idea and design choices to data or modelling challenges and unexpected results.

Our aim is to foster openness, reflection, and engagement in the research community by highlighting the often-invisible processes that shape scientific work.

Would you be willing to suggest a paper you could cover? Ideally, it would be something that has already been accepted for publication, so that we can discuss the full journey, including the submission and review process.

I replied:

Here’s a list of our published research from last year.

If you go to that link and scroll down to “The stories behind the papers,” you’ll see where each paper came from.

So, if you want, you can pick one or more papers from that list that have good origin stories.

They responded that, as economists, they were most interested in the Russian roulette project.

It should be fun, to speak not just on the research itself but on where it came from and how it came to be published. It’s a joint paper with Jonas Mikhaeil, and we came up with the idea after hearing from Amanda Kowalski about her recent paper with Neil Christy, which got us thinking about what you can get from stochastic models for potential outcomes.

Here’s our published paper, “Russian roulette: The need for stochastic potential outcomes when utilities depend on counterfactuals,” and here’s the abstract:

It has been proposed in medical decision analysis to express the “first do no harm” principle as an asymmetric utility function in which the loss from killing a patient would count more than the gain from saving a life. Such a utility depends on unrealized potential outcomes, and we show how this yields a paradoxical decision recommendation in a simple hypothetical example involving games of Russian roulette. The problem is resolved if we allow the potential outcomes to be random variables. This leads us to conclude that, if you are interested in this sort of asymmetric utility function, you need to move to the stochastic potential outcome framework. We discuss the implications of the choice of parameterization in this setting.

We learned a lot from writing this paper and we’re continuing to think about the topic.

So, if you want to hear more, you can go to the Behind the Scenes website and sign up to get the zoom link. And here’s our blog discussion of the paper from last year.

Our sanctimonious Epstein-associated overlords (Kenneth Starr edition)

We’ve talked about Kenneth Starr and Jeffrey Epstein before. Another batch of emails turned up, including this one:

I think the messages from Starr got less publicity than those from various other celebrities, because Starr was already known to be a political hack or hired gun. Also Starr is no longer alive.

But I think it’s a mistake to let the Starrs of the world off the hook for this. People like Starr themselves have quite a bit of money and power, which they use to insulate the even more powerful people in our society from any consequences (and, in the case of Epstein associates Lawrence Summers and Peter Thiel, to suppress journalism, which as an institution can shine a light on the misdeeds of the powerful). These sorts of actions by Starr etc. would be a bad thing even if Jeffrey Epstein had never been born, but if this is what it takes for these problems to be brought to attention, that’s still something.

Review of string theory book from 2004 brings up interesting questions regarding age-period-cohort effects in the sociology of science

I don’t agree with everything that Freeman Dyson writes, but this review of a book by Columbia physicist Brian Greene was pretty good. It’s on the topic of string theory, which has come up before on this blog.

Dyson begins his review on a judiciously positive note:

I recommend Greene’s book to any nonexpert reader who wants an up-to-date account of theoretical physics, written in colloquial language that anyone can understand. For the nonexpert reader, my doubts and hesitations are unimportant. It is not important whether Greene’s picture of the universe will turn out to be technically accurate. . . . Even if many of the details later turn out to be wrong, the picture is a big step toward understanding. Progress in science is often built on wrong theories that are later corrected. . . . Greene’s book explains to the nonexpert reader two essential themes of modern science. First it describes the historical path of observation and theory that led from Newton and Galileo in the seventeenth century to Einstein and Stephen Hawking in the twentieth. Then it shows us the style of thinking that led beyond Einstein and Hawking to the fashionable theories of today. The history and the style of thinking are authentic, whether or not the fashionable theories are here to stay.

After quoting from Greene’s description of string theory, Dyson continues:

This is a fine beginning for a theory of the universe, and maybe it is true. To be useful, a scientific theory does not need to be true, but it needs to be testable. My doubts about string theory arise from the fact that it is not at present testable.

I guess not much has changed in the past twenty-one years!

More interesting than the string-bashing or string-skepticism is Dyson’s age-period-cohort take on the sociology of the cutting edge of physics:

In the history of science there is always a tension between revolutionaries and conservatives, between those who build grand castles in the air and those who prefer to lay one brick at a time on solid ground. The normal state of tension is between young revolutionaries and old conservatives. This is the way it is now, and the way it was eighty years ago when the quantum revolution happened. I [Dyson in 2004] am a typical old conservative, out of touch with the new ideas and surrounded by young string theorists whose conversation I do not pretend to understand. In the 1920s, the golden age of quantum theory, the young revolutionaries were Werner Heisenberg and Paul Dirac, making their great discoveries at the age of twenty-five, and the old conservative was Ernest Rutherford . . . a great scientist, left behind by the revolution that he had helped to bring about. That is the normal state of affairs.

Fifty years ago, when I was considerably younger than Greene is now, things were different. The normal state of affairs was inverted. At that time, in the late 1940s and early 1950s, the revolutionaries were old and the conservatives were young. The old revolutionaries were Albert Einstein, Dirac, Heisenberg, Max Born, and Erwin Schrödinger. Every one of them had a crazy theory that he thought would be the key to understanding everything. Einstein had his unified field theory, Heisenberg had his fundamental length theory, Born had a new version of quantum theory that he called reciprocity, Schrödinger had a new version of Einstein’s unified field theory that he called the Final Affine Field Laws, and Dirac had a weird version of quantum theory in which every state had probability either plus two or minus two. . . . Each of the five old men believed that physics needed another revolution as profound as the quantum revolution that they had led twenty-five years earlier. Each of them believed that his pet idea was the crucial first step along a road that would lead to the next big breakthrough.

Young people like me saw all these famous old men making fools of themselves, and so we became conservatives. The chief young players then were Julian Schwinger and Richard Feynman in America and Sin-Itiro Tomonaga in Japan. Anyone who knew Feynman might be surprised to hear him labeled a conservative, but the label is accurate. Feynman’s style was ebullient and wonderfully original, but the substance of his science was conservative. He and Schwinger and Tomonaga understood that the physics they had inherited from the quantum revolution was pretty good. The physical ideas were basically correct. They did not need to start another revolution. They only needed to take the existing physical theories and clean up the details. I helped them with the later stages of the cleanup. The result of our efforts was the modern theory of quantum electrodynamics, the theory that accurately describes the way atoms and radiation behave.

This theory was a triumph of conservatism. We took the theories that Dirac and Heisenberg had invented in the 1920s, and changed as little as possible to make the theories self-consistent and user-friendly. Nature smiled on our efforts. When new experiments were done to test the theory, the results agreed with the theory to eleven decimal places. . . .

This is fascinating. I’d never thought of the history of twentieth-century physics this way, and it leaves me with some thoughts:

1. The age-period-cohort nonidentifiability, something we’ve seen before when studying public opinion. Dyson is talking about the scientific views of leading physicists, but it’s the same general thing, that you can explain the observed data in multiple ways.

One story is that attitudes of the young were driven by the logic of events: in the 1910s-20s, the foundations of physics were in a mess and so the young physicists were radical, recognizing the need for revolutionary science; in the 1940s-50s, the foundations were strong and much progress could be made using normal science, hence the young physicists were conservative; in the 1980s-90s, the advances of conventional methods in fundamental physics had trickled to a halt, hence the young physicists were motivated to be radical. Another logic-of-events thing that Dyson could’ve mentioned, but didn’t, is that the 1940s-50s were special in that a huge amount of effort in theoretical physics was going into the design of atomic bombs: for that, what was needed was creativity in the application of existing fundamental theories, not a fundamental restructuring. After the 1950s, the military remained a major funder of physics, but bomb design was no longer the cutting edge.

A different explanation is based on cohorts. The cohort of Einstein, Dirac, etc., achieved success with new fundamental theories and so they kept wanting to do that–they remained radicals all their lives. Reacting against this, the cohort of Feynman, Dyson, etc., achieved success while working within existing theories, so they remained conservatives all their lives. The cohort of Greene, etc., reacted against their fathers and became radicals.

These two stories can coexist; that’s part of the nonidentifiability.
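
To see the nonidentifiability in a toy setting, here is a minimal sketch in Python with NumPy. Everything in it is made up for illustration: the grid of ages and years, the effect sizes, and the idea of a single numerical "attitude score" are all assumptions, not anything from Dyson's review or from real data. The point is only that, because cohort = period - age, a linear trend can be moved among the three factors without changing a single prediction.

```python
import numpy as np

# Hypothetical grid: physicists aged 25..75, observed each decade 1920..2000.
ages = np.arange(25, 76, 10)
periods = np.arange(1920, 2001, 10)
age, period = [a.ravel() for a in np.meshgrid(ages, periods)]
cohort = period - age  # birth year is fixed once age and period are known

def predict(b_age, b_period, b_cohort):
    """Linear age-period-cohort model for a made-up attitude score."""
    return b_age * age + b_period * period + b_cohort * cohort

# Story A: everything is a period effect (the logic of events).
# Story B: move a trend of size d out of period and into age and cohort.
d = 0.3
story_a = predict(0.0, 0.5, 0.0)
story_b = predict(0.0 + d, 0.5 - d, 0.0 + d)

# The two stories give the same predicted value for every observation.
same_predictions = np.allclose(story_a, story_b)

# Three predictors, but the design matrix only has rank 2.
X = np.column_stack([age, period, cohort])
rank = np.linalg.matrix_rank(X)
print(same_predictions, rank)
```

In this sketch the two stories are indistinguishable from the data alone, and the design matrix has rank 2 rather than 3. Real age-period-cohort analyses get around this only by adding outside assumptions, such as constraints on the effects or nonlinear structure, which is why the choice between the stories can't be settled by the observed pattern itself.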

2. Similar things have gone on in statistics. From the 1950s-1980s there was a conservative movement within academic statistics featuring opposition to Bayesian methods (see discussion here), with some of this attitude lingering even into the 1990s (as discussed here). My generation was more radical, favoring developments in many different directions. In recent decades I and others have become more aware of misuses of statistics but we have not framed this as an anti-modeling stance. My point here is not that I’m right and these other people are wrong, but just that, as with Dyson, I see generational differences. Indeed, I’ve talked with some young statisticians who express to me what seem like naive old-fogey attitudes demanding statistical guarantees. Again, these are legitimate differences of opinion, in the same way that it’s perfectly fine for Dyson and Greene to differ on the value of string theory. It’s just interesting to see these sorts of age-period-cohort effects in these contexts.

The pantheon of celebrity billionaires

In our recent post on rich guys doing stupid things, I quoted Paul Campos, who wrote:

We worship billionaires in this society now like Stalin or Mao or Hitler were worshiped in their societies. Vladimir Putin scores eight goals when he plays hockey against professional Russian hockey players because that’s the way the world works I guess . . .

But the highlights of the thread were two comments.

From James:

I met an English guy on (the original) Ithaca who had set up a cafe there, and he claimed that the locals cheated like hell at backgammon and thought it was better to win by cheating because it required more skill than winning through playing by the rules.

This reminds me of Dan Luu’s explanation of how you can cheat at Codenames.

And from Somebody:

Pretty much nobody worships billionaires as a class. Most people worship at least one billionaire; that’s THEIR billionaire. Think old school paganism. Pantheon of gods, but a tribe will focus on one. A lot of immigrant Chinese Americans worship Elon Musk. Maybe it’s Donald Trump, or Kanye West, or Beyonce, or Taylor Swift, or Charlie Munger, or Warren Buffett, or Steve Jobs, etc.

I like that framing. Except that I’ve never heard of Charlie Munger, but I guess that’s part of the point.

Who’s the billionaire I worship? Bob Dylan. Or, if he doesn’t have an actual billion dollars, I guess I can go with Paul McCartney. Back in the day, I knew a lot of people who worshipped Steve Jobs. Go on the internet, and you’ll see lots of Elon Musk and Donald Trump worshippers.

I mean, sure, I know that all these people are mortal. By “worship,” I mean that my favorite celebrity billionaires have done amazing, wonderful, inimitable things and I’m always rooting for them. Not necessarily in all aspects of their lives–I seem to recall reading that Bobby Z. wasn’t much of a father, for example–but, then again, Zeus and the gang didn’t have such tidy domestic situations either. People can worship Donald Trump in the sense of following him to the ends of the conceptual earth, supporting him on whatever latest policy twist or fake story he comes up with this week, while still finding him somewhat comical and even a bit despicable in his business dealings. Or you can worship Lebron James, admiring him for his amazing basketball skills, his physical conditioning, his Jordanesque will to win, etc., without wanting to look too carefully into his social life or those rumors of performance-enhancing drugs. And so on.

And it does seem that these celebrity billionaires live in their own Olympian plane (on their literal private planes) and only sometimes descend to Earth in order to involve us in their petty battles.

Fifty years ago, this wasn’t the case! There were rock stars, movie stars, sports stars, media stars, political stars, not so many business stars. There was Howard Hughes but he was a weirdo, not a god. Forty years ago there was Lee Iacocca, but he was a self-promoting businessman–a kind of big-budget Ron Popeil or Crazy Eddie–not an independent source of power like the modern celebrated business leaders.

Also, it can be fun to be talked about–or, at least, it can sound like fun until it happens to you. So you get boring billionaires such as investor Bill Ackman who decides he’d like to enter the pantheon as a minor god (ok, in his mind I’m sure he’s a major god), so he starts staging publicity stunts. And . . . it works! Because he’s a billionaire. But he’s gotta keep doing newsworthy things or else he’ll be forgotten. Maybe he could pay to build a really tall building somewhere and put his name on it? Or try his hand coaching an NFL team? I dunno.

Anyway, this “pantheon of celebrity billionaires” idea? I like it. It accurately captures something real about our society, something that’s relatively new.

How the new era of CEO supervillains are trapped in their own ideology

I’m using the word “ideology” here not in the sense of political ideology but rather their view of tech innovation, in which successful innovations lead to big companies which become dinosaurs that get defeated by the next generation of plucky upstart mammals. Rich and powerful tech investors and executives see themselves as being the previous generation of upstarts and they’re painfully aware of the possibility that they’ve become the dinos. Paradoxically, the story they tell about their specialness as founders is embedded in a framework that implies future creative destruction at their expense, leading to an insecurity that drives them to do bad things.

It’s interesting because a standard role of economic ideology is to justify the positions of the wealthy. In this case, yes, the economic ideology justifies their existing position but it also implies future uncertainty. And, in an appointment in Samarra sort of way, every step they take to avoid the future reckoning just makes this uncertainty worse. Leading to tragedy for these tech executives and also for the rest of us.

I thought about all this after reading Careless People, a memoir by Sarah Wynn-Williams, a lawyer from New Zealand, about her several years as a Facebook executive. It follows her trajectory from idealism through enthusiasm, excitement, intensity, disillusionment, resistance, and departure. One thing I appreciated is that Wynn-Williams doesn’t present herself as a victim. In the book she’s a competent and resourceful person who eventually finds herself in over her head. And she’s got great stories. I don’t know what’s true, what’s exaggerated, and what’s left out–I don’t know any of the people involved, but it all sounds plausible. Her superiors in the organization seem like a bunch of liars, but I guess that some of that is helpful for attaining success in this ever changing world in which we live in. There were some scary bits like when they pressure her to take a long-distance flight when she’s in an advanced stage of pregnancy, and a funny bit where Mark Zuckerberg is playing Settlers of Catan with the other senior executives, they all go easy on him to let him win, and Zuckerberg doesn’t realize this is happening. I guess that he outsources his people-reading skills.

One thing that struck me is that, somewhere in the middle of the book, Facebook moves from being a traditional big company that tries to use some mixture of competition-busting tactics and lobbying to maintain its position as market leader, to being a powerful entity in itself that negotiates to keep governments in power.

It’s kind of like, first they’re playing the business game according to its de facto rules, then they’re playing the version of the game of monopoly where if you’re powerful enough, you can try to rewrite the rules. They’re playing the game of meta-monopoly.

And that seems dangerous. It puts the executives in the “fiduciary” position in which they’re expected not just to play hard and not just to push the boundaries of the rules–the usual calculation is that breaking the law is ok as long as the expected benefits exceed expected costs, and when benefits are in the billions and fines are sporadic and in the millions, you can see where this is going–but also to change the rules. This is too much power, and it also seems corrupting. I feel like even the Facebook executives themselves–even the creepy ones who may have enjoyed being able to change the rules–were not served by this.

To use a saloon poker analogy for a moment: you can think of Facebook as a successful poker player with a huge bankroll, playing a largely on-the-level poker game, with some collusion, some stacked decks, etc., but still mostly poker–but now someone comes in and hands Facebook a revolver. Cool! Now they can really make bank. But the poker game won’t last so long anymore. Nobody wants to play a game where they don’t have a chance. They move from lobbying the government, to being partners with the government and getting special advantages, to needing the government to keep the game going, maybe even infusing the game with tax dollars in some way. I’m not saying this has all happened yet, just that it’s the trajectory. I think they’d be better off if they were still playing in a straight-up poker game, but I can see that once they had the opportunity to grab more, it was hard for them to say no.

The ideology’s in trouble

The trajectory of Facebook gives me some insight into the inherent incoherence of the ideology of market-leading tech companies. I have the impression that their ideology has three components:

1. The company was founded from a combination of inspiration, brilliance, hard work, ruthlessness, and luck. The right people at the right time putting in all nighters and refusing to take no for an answer. The startup is the creative mammal thriving beneath the notice of the lumbering dinosaurs.

2. As the company gets big, it needs to avoid becoming one of those dinosaurs. So it should never lose its startup habits: openness to wild new ideas, thinking big, willingness to work long hours, and commitment to the cause. The challenge is to stay young and hungry even while the company is becoming middle-aged and fat.

3. The goal is 20% annual growth forever, or until the heat death of the universe, whichever comes first.

OK, you can see some contradictions here! On one hand, the storyline is that you’re gonna get overtaken by hungry young newcomers; on the other hand, you’re supposed to stay on top forever. The result, at least for Facebook, seems to have been a kind of desperation, a sense that on one hand they are the kings of the world and that on the other hand they are destined to fail and so they have to try harder and harder to grow and grow and preserve a near-monopoly status. And that’s how you get these executives who control unimaginable fortunes and yet are willing to lie and cheat (I wanted to say “lie, cheat, and steal,” but I don’t know if there was any actual stealing reported in that book), indeed they seem to feel that they have to lie and cheat and manipulate the rules and all of this to stay on top.

This is where I feel like their ideology is killing them. Yeah, it’s good that they recognize that as businesspeople they’re nothing special–they just happened to be in the right place at the right time–and it’s good that they recognize that a company has a natural life cycle and you can’t stay on top forever. The bad thing here is that it gives them such a sense of existential insecurity that they feel that they have to keep reinventing themselves and their businesses. They seem to feel a kind of duty to keep the growth going, even while they recognize that there’s no reason they shouldn’t be supplanted by the new generation, and that motivates them to do bad things.

Again, I think they’d be better off if they weren’t able to change the rules in this way–they’d be better off if they were just selling widgets and following the usual corporate playbook. This growth-or-die attitude is just ruining these people.

Machine learning research is not serious research and therefore hallucinated references are not necessarily a big deal, agrees a prestigious group of machine learning researchers

This is Jessica. There’s been some debate among computer scientists about what policies conferences should adopt for papers with hallucinated references. An independent analysis turned up at least 53 NeurIPS 2025 papers that were accepted (and presumably presented) at the conference in December but which had at least one hallucinated reference.

The question is, what should the default policy be if a paper is found to have at least one hallucinated reference? Should we conclude that these papers should have been rejected, and retract them? Should we instead let authors correct them? Going forward, should we desk reject papers with at least one hallucinated reference? What exactly can be concluded about the quality of the rest of the paper if you find at least one hallucinated reference?

The NeurIPS board statement suggests leadership is uncertain what to do about these papers:

“The usage of LLMs in papers at AI conferences is rapidly evolving, and NeurIPS is actively monitoring developments. In previous years, we piloted policies regarding the use of LLMs, and in 2025, reviewers were instructed to flag hallucinations. Regarding the findings of this specific work, we emphasize that significantly more effort is required to determine the implications. Even if 1.1% of the papers have one or more incorrect references due to the use of LLMs, the content of the papers themselves are not necessarily invalidated. For example, authors may have given an LLM a partial description of a citation and asked the LLM to produce bibtex (a formatted reference). As always, NeurIPS is committed to evolving the review and authorship process to best ensure scientific rigor and to identify ways that LLMs can be used to enhance author and reviewer capabilities.”

To make things concrete, consider a hallucinated reference to be a citation listed in the references section of the paper where the average reader cannot (in a reasonable amount of time) determine the identity of the cited paper well enough to track it down. That is, even if the hallucination is a transformation of what was originally a valid citation, the transformation is severe enough that it’s not obvious what the paper is. Hallucinated references are distinguishable from more minor errors like syntax issues or other errors that affect the citation but don’t prevent you from still easily tracking down what was intended. 

I think we should be asking ourselves: What would we do if we found there was hallucinated evidence, such as experiment results? And we should treat these papers with hallucinated references as equivalently problematic. It doesn’t matter how many hallucinated references. It doesn’t matter how “valid” most of the paper is, or the probability that the main conclusions are correct conditional on finding a hallucinated reference. As most of us learn in primary school, a key reason authors cite relevant prior work is to help establish support for claims they make. If we don’t necessarily require that those references link to real research, then what are we even doing? 

For the NeurIPS board to say that “the content of the papers themselves are not necessarily invalidated” suggests that they think some degree of fictionalized evidence is tolerable, if it happened through an honest mistake. A friend recently relayed to me such a horror story, in which, in a last minute rush before a deadline, they gave an LLM the full correct citations for their paper and prompted it to fix some minor formatting issues to conform with the required format. They submitted the results in time, only to find that the model had added a single citation to a non-existent paper, listing them as an author alongside some renowned researchers in the field. Imagine anyone you look up to (much less one of the big wigs you associate yourself with in the fake citation!) reading your paper and discovering this. Yikes. 

So yes, these kinds of mistakes can happen. But I disagree with the board that it matters whether the hallucinations were accidental and most of the paper is ok. Sure, the proofs might be correct, or the paper’s experimental results unaffected. But if authors are using LLMs to help with their citations and are not building time into their process to check the results, it seems fair to conclude that either 1) they don’t understand the errors LLMs tend to make very well, or 2) they don’t consider it a priority to get the facts right. In most cases we can rule out #1 with ML researchers, suggesting that not everyone is on the same page about the importance of not making things up. When you imply that hallucinated references do not necessarily affect the validity of the paper, you signal tolerance for some amount of hallucinated evidence. You tempt authors to keep taking their chances with how much responsibility they can offload to models, rather than encouraging them to retain, regardless of tool use, a sense of personal accountability for the factuality of what they submit. 

Ultimately, I don’t think it matters that much whether NeurIPS allows authors of the affected 2025 papers to correct or retract. It would not surprise me if leadership decides to err on the side of the authors and let them correct, given that policies about LLM usage are evolving rapidly. What does matter is that they signal a lack of tolerance going forward, and this is where they missed an opportunity. 

This paper in Management Science has been cited more than 6,000 times. Wall Street executives, top government officials, and even a former U.S. Vice President have all referenced it. It’s fatally flawed, and the scholarly community refuses to do anything about it.

In a post entitled, “How Institutional Failures Undermine Trust in Science: The Case of a Landmark Study on Sustainability and Stock Returns,” Andy King (my collaborator on the project on scheduled post-publication review) tells a disturbing story of the failure of the scholarly publication process:

For a long time, I [King] resisted the accumulating evidence that our institutions for curating trustworthy science were failing.

I believed our academic gatekeepers–editors, reviewers, and research-integrity officers–were quietly doing their jobs. Overstretched, but nevertheless, curating a trustworthy scientific record and correcting it when problems appeared.

That belief ended when I attempted to replicate an extraordinarily influential article “The Impact of Corporate Sustainability on Organizational Processes and Performance,” by Robert Eccles, Ioannis Ioannou, and George Serafeim. The paper has been cited more than 6,000 times. Wall Street executives, top government officials, and even a former U.S. Vice President have all referenced it.

Uh oh . . . I have a horrible sense that I know what’s coming next:

It contains serious flaws and misrepresentations.

The article appeared in a prestigious journal, Management Science. The authors work at highly reputed institutions. As a result, I thought correcting the record would be straightforward.

I [King] ran into barrier after barrier.

OK, that doesn’t surprise me. I’ve had this sort of experience over and over. As the saying goes, it’s too hard to publish criticisms and obtain data for replication.

King continues:

The authors ignored me, the journal refused to act, and the scholarly community looked the other way. Two universities disregarded evidence of research misconduct–even after the authors admitted publishing a misleading report.

The article remains largely uncorrected–misleading thousands of people each year.

I believe our systems for curating trustworthy science are broken and need reformation.

Yup.

And now for the gory details:

The Authors

On September 11, 2023, I [King] emailed Eccles, Ioannou, and Serafeim to explain that I was attempting to replicate their study and had encountered serious problems:
• The reported method did not work as described.
• A key result seemed to be mislabeled as statistically significant when it was not.
• Some measures defied construction.
• Critical statistical tests appeared to be missing.
• The sample was highly unusual.
I explicitly acknowledged uncertainty and asked for help. Over roughly half a dozen follow-up emails, I shared progress updates and offered to collaborate.

I received no response.

My experience is not unusual. Bloomfield et al. (2018) show that requests from replicators are often ignored, delayed, or deflected. Because published articles frequently omit key details, authors can block replication simply by refusing to engage.

The Community of Scholars

I turned to colleagues and respected scholars for advice. I asked for help encouraging the authors to engage. I emphasized that mistakes happen–my own work is not unblemished–and that correcting errors strengthens, rather than diminishes, scholarly standing. I heard:
• “I can’t do anything–it would cause conflict.”
• “Your email is too long.”
• “I’m underwater for the next month.”
• “I’m too much of a coward.”
The last came from an internationally respected scholar with a chaired position at a top university. [Don’t worry, that wasn’t me — AG] I [King] appreciated the candor. It revealed an uncomfortable truth: much of social science operates on a culture of go-along, get-along.

“Once a paper is published… it is more harmful to one’s career to point out the fraud than to be the one committing it” (a different Bloomfield et al., 2018, link).

The Journal

Having received no response from the authors, I contacted Management Science. After getting advice, I submitted a comment.

It was rejected.

The reviewers did not address the substance of my comment; they objected to my “tone”.

Ahhhh, the tone police!

King continues:

They told me that published authors should be granted “discretion” in conducting their work and that replicators should tread very lightly. One reviewer was “inclined to turn down any invitation to review a revision” unless it was accompanied by a note from the original authors.

Knowing such a note would never come, I appealed. Rejected. I appealed again. Rejected.

The authors did admit to the editor that they had misreported a key finding–labeling it as statistically significant when it was not. The authors claimed the error was a “typo.” They intended to type “not significant” but omitted the word “not.”

Oh, I hate when that happens! So frustrating how the typos always seem to support the overblown claims being made.

King continues:

They did not address the implications of this “typo”–that it misrepresented the evidence for a central claim of the paper, that corporate sustainability increases stock returns.

I asked the journal to correct the record. Rejected.

My experience is not unusual. As one respondent told Bloomfield et al. (2018): “Replication studies don’t get cited, and journals don’t publish them. Nor do people get promoted for replication studies”.

The good news is that King and I are both too old to worry about getting promoted.

King continues:

Help from Outsiders: LinkedIn and an Upstart Replication Journal

I decided I needed to go outside the standard process and post publicly about the “typo” on LinkedIn.

Days later, I heard that the journal would publish a correction.

I was told the authors had submitted the correction before my post, but it had been misplaced and forgotten.

I believe the journal’s new editor found this news to be as incredible as I did. He quickly published an erratum.

I also submitted my replication to the Journal of Management Scientific Reports (JOMSR). This upstart publication was started in 2022 by a small group of courageous scholars who wanted to provide an outlet for replication studies like mine. I was impressed by their thorough reviews and tough guidance.

In spring 2025, JOMSR published my replication study.

Research Integrity Offices (Part 1)

While revising my replication for publication, I became convinced of a more serious issue: the method reported in Eccles, Ioannou, and Serafeim (2014) was not the method actually used. Worse, the true method could not support their “findings”.

I contacted the authors again. No response.

I decided a research integrity complaint was in order.

In July and August 2025, I submitted complaints to Harvard Business School and London Business School. I alleged that the reported method could not have been conducted as described–and that the results were therefore uninterpretable.

(A technical aside describing the study’s method may be useful here. Feel free to skip.)
• The empirical strategy in Eccles, Ioannou, and Serafeim (2014) rests on a demanding requirement: the “treated” and “control” firms must be so closely matched that which firm is treated is essentially random. The authors appear to recognize this, reporting that they used very strict matching criteria “to ensure that none of the matched pairs is materially different.”
• Despite their strict criteria, they also claim to have achieved remarkable success in finding precise matches, reporting that 98% of their “high sustainability” firms could be matched with a near-twin “low sustainability” firm. Yet when I attempted to replicate the study, I achieved a much lower match rate–fewer than 15%. To better understand the discrepancy, I conducted a probability analysis using a Monte Carlo simulation. I determined that the reported matching success was highly unlikely–many, many, many times less than winning the lottery.
• Either their matching process was precise, in which case they would not have enough pairs to run their analysis, or it was loose, in which case their analysis could not be interpreted.
(End of aside.)
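To make the aside concrete, here is a minimal sketch of the kind of Monte Carlo check King describes. It is not his code and not the paper's matching procedure. Every setting is a made-up assumption for illustration: the number of sectors, the log-size distribution, the tolerance, and the pool sizes. The idea is simply to generate hypothetical firms, apply a strict matching rule, and see how often a very high match rate turns up.

```python
import random


def match_rate(n_treated, n_controls, n_sectors, tolerance, rng):
    """Fraction of treated firms that find an unused control firm in the
    same sector whose log size is within `tolerance` (all hypothetical)."""

    def draw_firm():
        # Assumed firm: a random sector and a normally distributed log size.
        return rng.randrange(n_sectors), rng.gauss(0.0, 1.5)

    treated = [draw_firm() for _ in range(n_treated)]
    controls = [draw_firm() for _ in range(n_controls)]
    used = set()  # controls already paired (matching without replacement)
    matched = 0
    for sector, size in treated:
        best = None
        for j, (c_sector, c_size) in enumerate(controls):
            if j in used or c_sector != sector:
                continue
            gap = abs(c_size - size)
            # Keep the closest eligible control inside the tolerance.
            if gap <= tolerance and (best is None or gap < best[0]):
                best = (gap, j)
        if best is not None:
            used.add(best[1])
            matched += 1
    return matched / n_treated


def simulate(n_sims, seed, **settings):
    """Repeat the matching exercise on freshly simulated firms."""
    rng = random.Random(seed)
    return [match_rate(rng=rng, **settings) for _ in range(n_sims)]


# Illustrative settings only; none of these numbers come from the paper.
rates = simulate(n_sims=50, seed=0, n_treated=100, n_controls=300,
                 n_sectors=20, tolerance=0.05)
mean_rate = sum(rates) / len(rates)
share_at_98 = sum(r >= 0.98 for r in rates) / len(rates)
print(f"mean match rate: {mean_rate:.2f}")
print(f"share of simulations reaching a 98% match rate: {share_at_98:.2f}")
```

Tightening `tolerance` or adding matching criteria lowers the simulated match rate, and loosening them raises it, which is the trade-off in the last bullet of the aside. A real check would replace the simulated firms with the actual sample and the matching criteria the paper reports.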

Shortly after I submitted my complaint, the authors acknowledged they had misreported their method.

But they did not ask Management Science to correct the text of their article.

Research Integrity Offices (Part 2)

Eccles, Ioannou, and Serafeim explained that the misreport was an unfortunate accident. There had been two studies, they said, and the false description belonged to an “exploratory” study that was later removed to satisfy length requirements, except the sentences describing its matching process, which were inadvertently left behind. As a result, those sentences now appeared to describe the “main” analysis, but that is not what they had intended. It might look like misrepresentation, but it was just an editing error.

They did not explain that this meant all of their results were uninterpretable.

The explanation also conflicts with the record.
• The incorrect claim appears in the earliest available draft of their article–marked “NEW!” on HBS’s site.
• Over several later drafts, the false claim was retained and even edited, rather than removed.
• The “exploratory study” does not appear in any available draft.

In light of these inconsistencies, I submitted a revised complaint to Harvard Business School and London Business School.

Harvard Business School responded: “Whether or how the School does or does not move forward… will not be communicated to you.”

LBS was more open and responded quickly, concluding that the false claim was not an “intentional falsehood”. Why? Because the LBS professor (Ioannou) “did not have access to the raw data and did not conduct the analyses in question.”

That’s technically known as the “Ariely defense.” You’re the author of the paper but you didn’t touch the data, therefore you couldn’t possibly have cheated.

And then we get something we’ve heard many, many times before:

And in any case, the problem was of a “minor nature”, apparently because it pertained to some other study and thus did “not impact the main text, analyses, or findings.”

It’s funny how removing these fraudulent or erroneous analyses never affects the main conclusions of the study. It kind of makes you wonder why they went to the trouble of gathering and analyzing the data at all!

King continues:

Sadly, LBS’s response is empty.
• Data access is immaterial. I did not allege data fabrication.
• The false claim is not minor. It is the difference between a usable and useless study.
• It does not address the central question: Did the exploratory study ever exist? If not, false statements were published twice–first in the article, and then in the offered explanation.

LBS did conclude that the author engaged in “poor practice”, which they planned to address through “education and training or another non-disciplinary approach.”

I suggest LBS begin by explaining an author’s duty to correct errors in published work.

Where This Leaves Us

Eccles, Ioannou, and Serafeim (2014) remains only partly corrected in the pages of Management Science. Diligent readers may discover the erratum correcting the “NOT significant” finding, but they will not learn of the misreported method in the pages of Management Science. Thus, thousands of readers remain misled.

Our institutions for curating trustworthy social science are not working. They must be changed, reformed, and revitalized.

What you can do

1. Stop citing single studies as definitive. They are not. Check if the ones you are reading or citing have been replicated.
2. If you or someone else finds an error in your published work, publish a correction.
3. If one of your colleagues is behaving unprofessionally, tell them to stop.
4. Support replication. Encourage others to do so. Support the Journal of Management Scientific Reports.
5. Find out about the research integrity policies at your institution. If they are weak, strengthen them.
6. If you know Eccles, Ioannou, and Serafeim, ask them to retract their article, or at least publish another correction.

What else needs to change

For years, I studied industry self-regulation. The evidence is clear: it works only when it is transparent, independently monitored, and supported by graduated sanctions. Applying this to the curation of science:

1. Journals should disclose comments, complaints, corrections, and retraction requests. Universities should report research integrity complaints and outcomes.
2. An independent third-party should audit the process.
3. Penalties should reflect the severity of the violation, not be all-or-nothing.
4. And to ensure the system works, we need what Andrew Gelman and I call FurtherReview.

Let me just add one more thing.

I don’t know any of the authors of the paper under discussion–indeed, I’d never known of them, or their paper, before hearing this story from King–so I’m speaking in general terms:

– Whether or not the authors were lying or intentionally misrepresenting at any point, I agree with King that, based on the evidence above, they did research misconduct.

– This doesn’t mean that the authors of that paper are bad people!

We should distinguish the person from the deed. We all know good people who do bad things, indeed I’ve received some speeding tickets in my time, and there are lots of good people who’ve done worse than that. I’ve been in the car with some drunk drivers, some dangerous drivers, who could easily have killed people: that’s a bad thing to do, but I wouldn’t say these were bad people. They were just in situations where it was easier to do the bad thing than the good thing.

What Eccles, Ioannou, and Serafeim did is much less bad than my friends driving drunk, but it’s still bad, and the same principle applies. They’re living in a world in which doing the bad thing–covering up error, refusing to admit they don’t have the evidence to back up their conclusions–is easy, whereas doing the good thing is hard.

OK, actually doing the good thing is easy. You just admit your error. I’ve done it myself–it’s super-easy, you just contact the journal and write a short, direct, and honest correction, and they’ll publish it. But to lots of people, it seems hard. As researchers they’ve been trained to never back down, to dodge all criticism. I don’t like what they did, but I imagine that they view their actions as something like how I might view a speeding ticket: yeah, I shouldn’t have done it, but it happened in the past.

From that perspective, the real problem is not the sin but rather the mistaken attitude that, in science and scholarship, what’s past is past. There’s a horrible sort of comfort in thinking that whatever you’ve published is already written and can’t be changed. Sometimes this is viewed as a forward-looking stance, but science that can’t be fixed isn’t past science; it’s dead science. And what bothers me about Eccles, Ioannou, and Serafeim, and all the many error-deniers like them, is that they don’t seem to realize this. It’s this fundamental misunderstanding of the scientific and scholarly endeavor, more than the dishonesty or sloppiness or whatever is the specific unethical behavior, that bothers me.

But, yeah, Andy King has a point that when universities, journals, and other institutions support the bad behavior, that’s not good. That doesn’t help at all. In all seriousness, you gotta feel a little sorry for Harvard Business School: they’ve had so many of these scandals now. It’s not like Duke and MIT business schools, which just had one scandal each–actually it was the same scandal for the two of them.

Don’t get any on you

This is Jessica. In “A Glass-Bottomed Cadillac”, David Hickey describes the advice Hank Williams Sr. received from his father and passes on to his son: “Don’t get any on you, pipsqueak.” By which he meant, don’t let the moral fallout of the road permeate your sense of self. Will Oldham describes the same struggle to retain yourself while the world presses in:

I am still what I meant to be
And I’m losing my mind
But our burdens must lessen
Though our enemies thrive

These days it seems there’s plenty of compromising mess to get on a person who isn’t being careful. By using certain services or purchasing from certain companies, you may be implicitly empowering forces you don’t agree with. Take X, fka Twitter, for example. Until a few days ago they were still enabling anyone willing to pay a small fee to generate child sexual abuse material. To Elon Musk, stopping users from subjugating whoever they please, regardless of their age, was unnecessary censorship. And yet, most of the people I know there went about their business on that platform without so much as a peep. By which I mean breathlessly posting about all the other, apparently more intellectually gratifying look-at-what-the-AI-did-now stuff. Watching AI research conversations play out on social media lately is both exhilarating and exhausting, with the volume of news coming out, and the sense of needing to not miss out on the next source of buzz that it fosters. One could even liken being in AI or ML to being a porn or sex addict, with all the urgency and heavy breathing. But that’s an analogy for a different post.

A less experienced version of myself might have felt personally offended by the thought of friends continuing to support a situation that actively disenfranchised people like me. But one eventually learns that going through life taking the things others do personally is no recipe for peace of mind. I also know from experience that the internal calculus does not always feel so easy when it comes to deciding when to vote with one’s feet. In the case of X’s knowing enablement of child porn, the right thing to do may have seemed obvious (like, don’t send your money every month to an entity that is enabling CSAM at scale!). But I get that severing connections with things you perceive as important to your goals can be difficult, and I think many researchers see their social media influence as part of their identity and evidence of success. Still, the whole situation makes me feel hesitant about going back to that platform.

From a practical perspective, part of the problem with getting too offended by people for not standing up to systems in which they are embedded, even if the principle of it really does seem obvious, is that it assumes a level of intentionality that they may not have. To be clear, I do not mean that the choice is not ultimately available to them, or that it’s wrong to hold people accountable for their actions. Rather, I have been thinking about how hard it can be to possess oneself, and how many people do not fully possess themselves. By which I mean, they lack either the imagination or the intrinsic motivation to live by their own values. Or as Gram Parsons once put it,

Some of my friends don’t know who they belong to
Some can’t get a single thing to work inside

It’s possible to shut many things out in the name of “making it.” In my experience what gets shut out, and what you let get to you, is often not so much a choice as an attribute of your level of self-knowledge and self-ownership at that point. Are you capable of imagining a version of yourself that remains true to your values even if it causes friction with other perceived needs? Are you capable of imagining a version of yourself that is sure enough of what you’re doing that the friction disappears? Could you love yourself as much or feel as secure in your position without your Twitter account?

It’s easy to accumulate impurities in the quest to be something greater than your current state. As Rafe Meager writes in their recent essay, “In professional life, one is obligated to traffic in a certain amount of bullshit. We all know that. But it has to be finely calibrated, and that is hard.”

Calibrating our behavior and beliefs is hard in part because we can become conditioned to acting against our sense of self as a normal part of personal and professional growth. Actively moving toward the things that challenge you is not necessarily a flaw, and certain things must be endured to succeed in a new game. And often the things that challenge us are the things that don’t come naturally, that are not, at first glance, clearly aligned with our values.

When I was younger, one of the last things I wanted to be was a computer scientist. This is not to say I’m unhappy, or I don’t enjoy what I do now, because I very much do. I just can’t even imagine having to explain to my high school self how that happened, as what we do in computer science (and to some extent in other disciplines I frequent, like statistics) has never felt like the things that I’m most intrinsically motivated to be good at – I always did very well in math and science classes, but I had no passion for them. But somewhere along the way I got disillusioned with the pursuits that seemed more aligned with my values, like art, and so I started gravitating to the things that seemed most different from what bothered me. After enough years of testing to see how far you can take something, it becomes hard to go back, and it does start to feel like what you’re meant to do. Meanwhile, the Destroyer song loops in my mind,

Don’t become
Don’t become
The thing you hated
The thing you hated
The thing you hated

It’s a dangerous game, getting very good at letting go of things that once seemed important to who you are, in favor of urges to be something else. On the one hand, there’s a real power to be gained in separating yourself from the things you think you need. Sometimes after you take the first step, it’s like something clicks and there’s a high in suddenly realizing that something you were stuck on bothers you no more. Or maybe it still does, but somehow now you can find peace in settling back to watch the tides of your cut-off aspirations and desires continue to pulse back toward what was severed. A way of earning indifference. I believe it’s possible to literally remake yourself this way.

And yet part of me, in looking back, feels like maybe I let myself down. Maybe I would’ve been a better writer or philosopher, both of which I always felt more personal proclivity for. Or maybe I would just feel less disillusioned now if I’d kept up certain hobbies that do matter to me—like writing poetry—more consistently, rather than dropping it for so many years because I couldn’t imagine what it would mean to be a person who did both. It was a failure of imagination: I couldn’t conceive of devoting myself to becoming the new thing while also retaining other aspects of myself.

What’s interesting is that this kind of letting yourself down, by letting yourself drift too far from your original purpose or what you feel like you’re best at, is not necessarily so different from lacking the imagination to do the right thing in the current political moment. There is a loss in both cases, a quiet slipping away of who you really are while you think you’re out there proving it. While you think you’re the one who has your priorities straight, while you’re striving to play the game, or maybe you’re even killing it, comparing how many followers you have or how many papers you wrote this year to those around you. The ignorability of it all is terrifying. Kierkegaard got it right when he wrote that “The greatest hazard of all, losing one’s self, can occur very quietly in the world, as if it were nothing at all. No other loss can occur so quietly; any other loss–an arm, a leg, five dollars, a wife, etc.–is sure to be noticed.”

How do you tell whether letting go is growth or self-betrayal—especially when what we “need” narrows what we can even see? On the one hand, by moving more towards statistics and formalization over time, my thinking has expanded to encompass new forms of rigor. But it’s a kind of rigor along narrow lines. In another sense I lost imagination, in that it’s now harder for me to take seriously things that I can’t fit into my formal frameworks.

Did I lose ownership of myself? When I think about what it means to fully possess oneself, I think of Aristotle’s preoccupation with explanations of the inner principles that determine an entity’s states of change and rest. He distinguished that which exists by nature (physei, φύσει) and that which exists from other causes (di’ allas aitias, δι’ ἄλλας αἰτίας). It’s the first kind that’s self-possessed: it “contains in itself its own archê (ἀρχή),” the principle and origin of its entry into presence; the second “does not have its principle in itself,” but finds it in the productive activity of human beings.

This is why failures of imagination feel so tragic. To lack imagination is not only to fail to picture alternative versions of the self; it is to lose contact with the inner source that could have animated them. You become legible, optimizable, and perhaps successful, but successful in the way an object is successful when it performs its intended function. You can be moving quickly, and still be at rest with respect to yourself.

Researchers are beginning to ask how, and if, generative AI systems can attain something like intrinsic motivation. How do you get it to devote itself to an open-ended goal, like creative expression, that can’t be boiled down into a simple reward model? In a recent paper, Charness and Grieco find that AI outputs outperform human outputs (as determined by other humans) for tasks that are more clearly specified in terms of how to solve (“closed”), like writing short stories using specific required words. But AI outputs consistently underperform human outputs for open-ended tasks, like inventing things, where the participant is required to find, invent, or discover the problems. They propose a model for the utility function the agent faces as depending on three factors: the output they’d get from simply following instructions, deviation from the instruction-following output due to randomness (e.g., model temperature), and the utility of exercising imagination. Models can only follow instructions more or less closely, and obtain more diverse outputs through higher temperature, but humans alone respond to the pleasure of bringing imaginative ideas to life, a proxy for intrinsic motivation.

Is intrinsic motivation partly a matter of temporal depth—the capacity to care about consequences that don’t pay off immediately, or even in any obvious reward currency? What kind of “vision” allows an agent to see farther ahead, and be moved by what it sees? Yeats, in A Vision, has a line that keeps returning to me: “The Spirit … may know the most violent love and hatred possible, for it can see the remote consequences of the most trivial acts of the living, provided those consequences are part of its future life.”

The need for this kind of vision feels politically relevant now, as programs are dismantled and regulations rolled back or rewritten in ways that will reverberate for years. As a professor, most present in my mind are the moves that affect science in this country—visa regulation, institutional acquiescence when under political attack, the seeding of doubt in the goals of science or value of education.

But Yeats’ quote also feels very personally relevant. What would it take for us to be able to see this way in our personal choices? When we ignore the alliances we support through our various choices, we narrow our vision on purpose. It’s a survival mechanism, but it comes at a cost.

Ultimately I don’t think anyone owes me an explanation of their internal calculus. But neither do I owe them the assumption that they are in control of their drives. There was a time I might have tried to convince myself that the right interpretation was to give everyone the maximum benefit of the doubt. But as Andrew emphasizes, steelmanning is its own hang-up.

Meanwhile, there are days now where I feel like I’m just waking up, from a period of my life where I became laser focused on goals and outcomes and forgot about everything else. It’s quite painful sometimes, existing with this new awareness of what was always there, an inner “archê” that I’d let go quiet. More bluntly, it’s a cliche mid-life crisis with everything but the sports car. But the waking up is also very beautiful: realizing, suddenly, how real things are, how alive, and how little use they have for whatever games you’ve been playing. You don’t get infinite chances to notice.

It’s open season on the unabashedly earnest

This is Jessica. In response to my post on slop, Thomas Basbøll shared a 1967 New Yorker essay by Jacob Brackman about the havoc wreaked by the emergence of the “Put-On” in 1960s (and slightly earlier) art and culture. True to its name, the “Put-On” refers to a response that is deliberately outlandish yet ambiguous about intention, confusing the other party and causing them to doubt its sincerity. 

The put-on is perhaps best exemplified by Bob Dylan’s smart alecky style of responding to interviewers, in which he alternates between crazy stories about his past, exasperation with the counter-culture of which he’s part, and pointed questions turned back on the interviewer. Who is left to wonder, How much of this is real? Is he caricaturing himself, or is this actually his personality? But the put-on also appears in art and culture more broadly – e.g., Is John Cage making an important statement or just putting the audience on with these silence performances? Is Andy Warhol out to make fools of his critics with the Brillo boxes? The put-on is unsettling because you cannot resolve whether meaning is intended or still to come, or you are just wasting your time: “put-ons may disguise the fact that someone has nothing of interest to say—may, indeed, give precisely the opposite impression.” 

Today the put-on takes different forms – video shorts of animals doing things that are just beyond the boundary of what seems plausible, enough so that we need to watch a second time to figure out if it’s real. Essays or presentations by our students that exude a little too much confidence given their lack of experience with the topic, but which they deny using generative AI to write. There has always been plenty of bullshit on the internet, and plenty of cheating in classes, but Brackman’s stages of the put-on are especially familiar lately:

  1. You’re sucked in.
  2. You become confused.
  3. You resent (or appreciate) having been tricked. 

Patience games

The problem with the put-on–whether orchestrated by musicians or artists in the 60s or today’s language models and image generators–is that the ambiguity is strategic. You don’t know if it is going somewhere. You’re stuck sitting with your uncertainty, reflecting on how far your good naturedness extends. 

In teaching, when you think you’re facing undisclosed (over)reliance on generative AI for an assignment, do you take the sincere path of asking the student what they did, and trusting their response? Do you try to catch them in a lie? Or do you decide it’s just not worth your time to sleuth and let the students decide for themselves if they will use the course to learn something versus play the game?

We find ourselves facing games we may not want to play, and for which we have no precedent. This year ICML, one of the big machine learning conferences, is offering authors a choice: opt in to a permissive policy about generative AI use in reviewing, or go the purist route, where your reviewers can’t use it at all, and you can’t use it at all for your own reviews either. The reviewer matching process sounds like it could get messy, and ML conferences are already known for their review randomness. Which option is likely to be less noisy?

Not to mention that as a reviewer, one must increasingly wonder whether the paper they are preparing their comments on is an experiment in automated science. Will the authors even read your feedback? Do they care to improve the work? Or have you been inadvertently reduced to a Turing signal? 

It’s not easy for the “unabashedly earnest”, who dislike playing games and want to retain a certain innocence in their encounters with others, but who also want to stay ahead of the curve and not get duped. The put-on depends on the gullibility of its victim, so you face a choice of being ok with continuing as usual but feeling used at times, or becoming more skeptical about people in general. Please don’t make me part of your game is becoming the refrain for a new way of life. 

There’s little reason to think it will get better anytime soon. It’s still early and many people are still playing the old game, or still experimenting with how much they can get generative AI to do. We should be preparing for more disruption. 

From the sacred to the profane probabilistic

I find myself thinking about what kinds of signals I consider more sacred, i.e., that I would most dread seeing lose their meaning. For example, what do you do about undisclosed use of generative AI in close relationships? What if you suspect the friend or romantic partner you are corresponding with is relying on the AI suggested responses to do the thinking? Do you ask them about it, or let it go and risk the uncertainty undermining your ability to trust them?

I would also distinguish feedback on writing that is more personal. I don’t mind an AI-generated review on my research if it’s guided by a human with the right expertise. But if, for example, I were to learn that comments on my posts here were AI-generated, comments that I took seriously as a reflection of engagement with what I wrote or that just gave me a rewarding feeling of connecting with people outside my usual sphere (which blogging is great for), I would feel dumb, and it would probably affect my desire to blog. But this is already happening on social media, with bot accounts jumping in with random, effusive compliments on what you write. 

Another scenario that makes me cringe is the application of generative AI to the kinds of art and literature that I get inspiration from. I can potentially enjoy some AI-generated music or script folded into the mundane background track or sitcom if it’s decent, but I look to art museums for a kind of consolation on what it means to be human, to be vulnerable, to feel forms of loss on a deep level. I don’t doubt that generative AI could occasionally result in experiences that would be hard for me to distinguish from human contemporary art. But I can’t imagine myself ever getting interested in art created by AI the way I’m interested in what other people make, because of the lack of specificity or intention. So if it were to infiltrate that realm, and I could no longer count on there being a human lived experience behind art, it would bother me. 

One thing I feel relatively sure of is that I won’t be wanting an AI guru. I wouldn’t be surprised if generative AI could do a pretty good job of mimicking the kind of capriciousness associated with spiritual guides like Zen masters. But similar to art, there’s something important about the person having experiences in the world that feels essential. 

I would be curious though to hear counterarguments from people who have thought about AI in art or religion or more intimate personal communication. Part of what I find difficult in all this is that I consider myself generally optimistic about new technology, and open to change from it (I am, after all, a computer scientist). So I would also hate to prematurely “close my ears” like a square in the 50s or 60s walking out on Cage’s experiments in sound. And so I expect my patience to remain unstable, and I expect it will remain hard for me to predict what experiences will give me the urge to ditch versus hit rewind.

Institutional unraveling

Returning to the general theme of new decision points as signals erode in value, things are likely to get worse before they get better. Many of our systems are still mostly functioning at this point, because many people are still figuring out how to use generative AI, and where to draw their own line, or they are avoiding it completely. But the seeds for institutional breakdown are all around us. 

According to Zeynep Tufekci’s recent keynote at NeurIPS (which I summarize here), the problem is that society is built on assumptions that certain things will be hard (or “load bearing frictions”), i.e., that only humans can generate outputs with certain properties. LLMs break our ability to conclude there is proof of effort, or of authenticity or sincerity. Gatekeeping is a necessary function, and when the old mechanisms stop working, other measures will step in, like relying on the prestige of the candidate’s institution or their connections to decide who to hire, or what papers to cite or publish. When those things are no longer hard, some mechanism must step in to take their place, and it may not be ideal. The point is that if you break something important, you don’t necessarily get something better unless you build something better. 

I like how this view focuses attention on outcomes within the realm of our ability to predict, like what kinds of gatekeeping will emerge or are already emerging to fill in the holes. We can then try to identify better alternatives to those, rather than trying to predict when “AGI” will happen or what the most destructive thing AI could do is. Though it doesn’t absolve us of the very humanist discomfort of watching our precious tokens of sincerity wash away, and the personal choices that come with that. 

Brackman quotes P. T. Barnum on how “People like to be fooled,” and “There’s a sucker born every minute.” While the put-on has always relied on the victim’s willingness to stay in the conversation, the answer is unlikely to be opting out of dealing with AI output entirely (though there are certainly people in that camp). Some flexibility is warranted while norms are still shifting, and organizations are doing the right thing by experimenting with new policies. But until we have better signals, the burden of the put-on stays where it was: on the person deciding in that moment whether to continue listening.

The combination of originality, ambition, and lack of scruple can take you far in social science.

I happened to come across the above line in this post from a few years ago, about the scholar and Ted talk performer who made the ridiculously innumerate claim that “It’s possible to put actual monetary value on each citation a paper receives. We can, in other words calculate exactly how much a single citation is worth. . . . in the United States each citation is worth a whopping $100,000.”

Being an idiot is part of this guy’s success–but only part of it. The nation’s universities are full of intellectually limited tenured professors, and they don’t all get Ted talks. As I put it earlier, I attribute this guy’s success to his ability to come up with big ideas, along with his willingness to act as if his claims were supported by evidence, when they’re not. The big ideas are important–without them, he’s just one more schlub with a Ph.D. and a Rolodex.

“No one could suspect that times were coming . . . when the man who did not gamble would lose all the time, even more surely than he who gambled.”

In preparation for this new class, I was reading White Collar, the classic 1951 book by sociologist C. Wright Mills. It’s perfect for week 2 of the course because it begins with a discussion of the changes from an American middle class of freeholders and tradesmen to a society of employees. I can only assume that lots of its claims have been disputed and discredited in the past 75 years, but at the very least it gives a window into the urban version of the frontier thesis in American history.

But what I wanted to talk about now is the quote that I put in the title of this post. It’s the epigraph to White Collar, and Mills attributes it to Charles Péguy. A Google search points us to a long poem from 1912 entitled Le porche du mystère de la deuxième vertu, translated as The Portal of the Mystery of Hope. I found part of a translation online here, but this was published in 1996 so I guess Mills was quoting from some earlier translation, or maybe he translated that brief passage himself from the French?

Searching in French leads to a link to the actual published poem from 1912! I didn’t have the patience to read the whole thing. I skimmed through to see if I could see any passage close to “No one could suspect that times were coming … when the man who did not gamble would lose all the time, even more surely than he who gambled,” but no dice. The document also has a search function (“Estimated OCR rate for this document : 99.69%”), and I searched and searched but couldn’t find anything. I searched on “personne,” “soupçonner,” “temps,” “homme,” “parier” (also “pari,” “pariez,” etc.), “jouer” and its variants, “perdre,” even “sûrement,” but none of these led to anything even close to the quoted passage. Adding to the mystery, the only translation I could see of The Portal of the Mystery of Hope was from 1996–obviously not the version that Mills was quoting from back in 1951.

Can any of you track down this quote? Maybe this is a job for the Quote Investigator. I did a quick search and he’s never covered this one, so who knows.

For reasons that should be obvious, I like the quote a lot, but I’m loath to use it until I know its source. Yes, I could cite as “C. Wright Mills (1951), attributing to Charles Péguy,” but I’d like to do better than that!

Slop is not distinguishable by its attributes. It is an attitude of production

Since it’s dictionary week here on the blog, why not discuss Merriam-Webster’s word of the year: slop. They define it as:

digital content of low quality that is produced usually in quantity by means of artificial intelligence.

Max Read discusses conventional associations with slop–qualities like “forgettability, predictability, unoriginality, lifelessness” or “cheap, low-effort, convenient, consumable, interchangeable.” He collects several more pointed definitions from the web:

“a low-to-zero marginal-cost substitute for something valued, or something being aggressively positioned to substitute for craft” from Bluesky

“the negative platonic form: not the ideal that particulars aspire toward, but the silhouette left when you subtract everything that would make a specific instance rather than a thing of a type” from Kevin Baker

He also proposes his own definition:

“slop” is that which is “fully optimized” to its domain to the point of texturelessness or characterlessness. “Slop” in this sense is anything designed to be as easy as possible to produce, sell, and consume, but it’s particularly slop at the point where all or most other players in the same space adopt the same strategies, and the material is no longer individual or differentiated from its competitors.

I enjoyed all of these. They paint slop as a kind of mass-produced shell rushing toward you at the speed of modern silicon chips. 

But these definitions also all miss a defining feature of slop, the thing that makes me feel vaguely repulsed when I see it despite the superficial harmlessness of what is often just some generic message or image or text. Slop is not merely a genre of media, it is an attitude of production, a cynical operating posture that is offensive not just on a surface level of insulting the consumer’s intelligence, though there is that. It is an ethos of resigned instrumentality that disgusts us with its intentional satisficing and lack of effort the way kitsch disgusted some art critics, a refusal of responsibility to authenticity, situatedness, and the risks associated with individualistic expression. A practical nihilism that threatens to engulf our own more sparse yet genuine attempts at production. From this perspective, the act of denial makes slop more like a spiritual threat than a type of content. 

Speaking of kitsch, I think it’s worth distinguishing art from slop. Kevin Baker’s definition of slop as a kind of shell devoid of any individual substance reminds me of certain philosophical arguments about art post-modernity. Various writers have described how after the emergence of the conception of “taste” in art, and the series of events that led up to moments like Duchamp installing a toilet in a gallery and calling it art, great art can no longer have “positive” content. It can only refer to the absence of something. In this sense art is irony. And yet, while I think humans can very much create slop without AI, I don’t think of much art as slop, because whether doomed to be self-referential or not, making art implies belief in something on the part of its creator, a kind of taking of responsibility to interaction. To make art is to anticipate its completion through the viewer.

For example, lately I’ve been thinking about pop art. I was in Pittsburgh and went to the Warhol museum. I was in Copenhagen and went to the Louisiana museum, where they happened to have a Marisol exhibition. Contemporary art has a special place in my heart, but I don’t like pop art. I never really have. However, I don’t think it’s fair to call it slop, even though it would fit many of the definitions above–it’s cheap, low-effort, could be produced in bulk, designed to mimic the predictable and forgettable. I can respect someone like Warhol because at the time, the work expressed a point of view, it contributed to a conversation, and by doing so opened a door to possibility, like all great art aspires to do. 

Slop, on the other hand, is talking when you have nothing to say. Slop is a waste of your time as a consumer, but also a waste of time for the author, who pleads for attention while denying themself a chance at discovering meaning. In this way, one could say slop is a matter of life or death, since after all, every moment is bringing us closer to death.

P.S. Merry Christmas to those who celebrate!

The problems with popular internet heuristics such as “Hanlon’s razor,” “steelmanning,” and “Godwin’s law,” all of which kind of fall apart in the presence of actual malice, actual bad ideas, and actual Nazis.

From my review of Dan Davies’s book on business fraud:

Fraud might be an unusual “tail risk” in business, but in science it’s usual. It happens all the time. Just in my own career, I had a colleague who plagiarized; another one who published a report deliberately leaving out data that contradicted the story he wanted to tell; another who lied, cheated, and stole (I can’t be sure about that one as I didn’t see it personally; the story was told to me by someone who I trust); another who smugly tried to break an agreement; and another who was conned by a coauthor who made up data. That’s a lot! It’s two cases that directly affected me and three that involved people I knew personally. There was also Columbia faking its U.S. News ranking data; I don’t know any of the people involved but, as a Columbia employee, I guess that I indirectly benefited from the fraud while it was happening. I’d guess that dishonesty is widespread in business as well.

This led me to a point that’s important enough that it deserves a post of its own (i.e., this one):

This also reminds me of the problems with popular internet heuristics such as “Hanlon’s razor,” “steelmanning,” and “Godwin’s law,” all of which kind of fall apart in the presence of actual malice, actual bad ideas, and actual Nazis. The challenge is to hold the following two ideas in your head at once:

1. In science, bad work does not require cheating; in science, honesty and transparency are not enough; just cos I say you did bad work it doesn’t mean I’m accusing you of fraud; just cos you followed the rules as you were taught and didn’t cheat it doesn’t mean you made the discovery you thought you did.

2. There are a lot of bad guys and cheaters out there. It’s typically a bad idea to assume that someone is cheating, but it’s also often a mistake to assume that they’re not.

A related point from that post:

Davies refers to “the vital element of time” in perpetuating a fraud. A key point here is that uncovering the fraud is never as high a priority to outsiders as perpetuating the fraud is for the fraudsters. Even when money is at stake, the amount of money lost by each individual investor will be less than what is at stake for the perpetuator of the fraud. What this means is that sometimes the fraudster can stay alive by just dragging things out until the people on the other side get tired. That’s a standard strategy of insurance companies, right? To delay, delay, delay until the policyholder just gives up, making the rational calculation that it’s better to just cut your losses.

I’ve seen this sort of thing before, that cheaters take advantage of other people’s rationality. They play a game of chicken, acting a bit (or a lot) crazier than anyone else. It’s the madman theory of diplomacy. We’ve seen some examples recently of researchers who’ve had to deal with the aftermath of cheating collaborators, and it can be tough! When you realize a collaborator is a cheater, you’re dancing with a tiger. Someone who’s willing to lie and cheat and make up data could be willing to do all sorts of things, for example they could be willing to lie about your collaboration. So all of a sudden you have to be very careful.

P.S. I talked about other problems with “steelmanning” here.

“I think there’s an argument to be made that much meta-scientific work is a kind of mirror image of the empirical work it critiques”

In the context of our recent discussion of the p-curve paper, Richard Morey wrote, “I think there’s an argument to be made that much meta-scientific work is a kind of mirror image of the empirical work it critiques,” and he shared this chart:

I think Morey is on to something here, but, as someone who does a lot of empirical science and a lot of meta-science, I think there’s one big thing he’s missing, one major asymmetry between empirical science and meta-science, and that is that bad empirical science makes strong claims, and the role of meta-science is to question the evidential support behind these claims, not usually to make a positive claim in itself.

The usual pattern goes like this: empirical scientists collect data D, perform analysis A, and use these to make strong general claim X about the world. The meta-scientist then comes along to assess the evidence. A negative meta-science analysis comes to the conclusion that D + A do not provide good evidence for X. The meta-science analysis does not make the strong claim that X is false, let alone the even stronger claim that some preferred alternative Y is true.

This comes up all the time. Some Cornell psychology professor claims to have strong evidence for extra-sensory perception or influence of food labeling on eating or whatever. The meta-scientist comes along and notes irregularities with the data or analysis and provides an alternative story of how these apparently convincing patterns in data could have come to be. The conclusion of the meta-scientific report is not that ESP or large effects of food labels don’t exist but rather that the published record does not provide good evidence of these extraordinary claims. (And indeed the claims are extraordinary, which is how they got so much publicity in the first place.)

It’s the all-important distinction between truth and evidence. I know that Morey understands this distinction and I’m not saying that anything in his above chart is wrong; I’m just trying to put it in the larger perspective of scientific inquiry.

In discussing the above asymmetry between empirical science and meta-science, I’m not saying that meta-science is better. Meta-science is fundamentally parasitic on empirical science, and, sure, empirical science is associated with bold claims, but it’s through making bold leaps–and being willing to retract those leaps as needed–that we make progress. The problem with bad science is not so much the overconfident conjectures–such steps may be psychologically necessary–so much as the unwillingness to reflect on contrary evidence, the unwillingness to admit error, and the practice of not confronting past mistakes.

And also the really stupid things that people say and never apologize for.

What’s your Jordan^3 number?

In the discussion of our post, Who has the lowest Erdos-Bacon-Epstein number? (the winner appears to be the mathematician Daniel Kleitman, my freshman-year academic adviser at MIT!), an anonymous commenter asks:

Is there anyone with a finite Michael Jordan^3 number (acting with Michael B Jordan, coauthoring with Michael I Jordan, playing on a team with Michael J Jordan)?

Good question! In the earlier post we discussed the rules for what counts as being in the acting network (IMDB and with a legitimate acting credit, not just being interviewed) and the academic authorship network (scholarly journals).

What about playing on a team? What would it take to be in the Michael J Jordan network? It would be too much to restrict to players on NBA teams. I’d allow any college team–but only varsity would count, not intramurals–but even that is pretty darn restrictive, so I think I’d count high school varsity as well.

I guess that lots of guys who’ve played high school varsity basketball have some connection to Jordan. You just need to have one player on your team who played in college, then one guy on that player’s college team who ever made it to the NBA, and then the graph must be complete from there. You could also get there through a different sport–for example, maybe you played football, and someone on your football team played basketball, and someone on their team played in college . . . or maybe someone on your football team played college football for a while, and someone else on that team played basketball in high school, and someone else on his high school team played basketball in college, etc.

I’m guessing that somewhere there are people who (a) have acted in at least one movie, (b) have coauthored at least one academic article, and (c) played on a high school varsity team. And if you have all three of these attributes, you have a shot at having a finite Jordan^3 number.
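For anyone who wants to actually compute this, here is a minimal sketch in Python. It rests on assumptions that go beyond anything stated above: I'm treating the Jordan^3 number as the sum of the three shortest-path distances (by analogy with the Erdos-Bacon number), each network is given as a list of groups (casts, author lists, rosters), and the function names `distances` and `jordan3` and all the toy data are made up for illustration.

```python
from collections import deque
import math


def distances(groups, source):
    """Breadth-first search over a co-membership network.

    groups: iterable of collections of names (a cast, an author list, a roster).
    Two people are linked if they appear together in at least one group.
    Returns {person: hops from source}; unreachable people are simply absent.
    """
    groups = [set(g) for g in groups]
    # Index which groups each person belongs to.
    member_of = {}
    for i, g in enumerate(groups):
        for person in g:
            member_of.setdefault(person, []).append(i)

    dist = {source: 0}
    seen_groups = set()
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for i in member_of.get(person, []):
            if i in seen_groups:
                continue  # everyone in this group already has a distance
            seen_groups.add(i)
            for other in groups[i]:
                if other not in dist:
                    dist[other] = dist[person] + 1
                    queue.append(other)
    return dist


def jordan3(person, casts, papers, rosters):
    """Sum of the three distances (an assumed definition); inf if any link is missing."""
    return sum([
        distances(casts, "Michael B Jordan").get(person, math.inf),
        distances(papers, "Michael I Jordan").get(person, math.inf),
        distances(rosters, "Michael J Jordan").get(person, math.inf),
    ])


# Entirely hypothetical toy data: "Pat", "A", "B", "C" are invented people.
casts = [{"Michael B Jordan", "A"}, {"A", "Pat"}]
papers = [{"Michael I Jordan", "Pat"}]
rosters = [{"Michael J Jordan", "B"}, {"B", "C"}, {"C", "Pat"}]

print(jordan3("Pat", casts, papers, rosters))
print(jordan3("Michael J Jordan", casts, papers, rosters))
```

In this toy network, Pat is two hops away in acting, one in authorship, and three in sports, so the sum is finite, while anyone missing from even one network comes out infinite. The hard part in practice is not the search but deciding what counts as a group, which is the question of rules discussed above.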

I can’t do it myself, as I’ve never acted and I’ve never played varsity sports.

I do have a cousin who’s acted on TV, though. This one show he was on has a huge list of famous names, which I guess can happen for a TV show that runs for lots of episodes, but, still, the very, very long list includes the still famous Billy Dee Williams, along with vaguely-familiar faces such as Dennis Christopher, Max Gail, Stuart Margolin, as well as G. Gordon Liddy (!) and someone named Tony W. Randall (no, not the Odd Couple guy) and someone named Robert Axelrod (no, not the political scientist). My cousin also was in the Olympics, and maybe someone on his team also played serious high school sports, so he could well have a finite Michael J Jordan number too. But his Michael I Jordan number is infinite, because he has no academic publications. Just to check this out, I searched for my cousin’s name on Google Scholar, and all I found were two papers by his dad; they’re single-authored, so that wouldn’t work either. My uncle was no academic; he was a doctor who many years ago was enthusiastic about computer touchpad and voice-recognition technology and wrote a couple articles about a system he was trying to sell for computerized medical records.

And then there’s Michael J Jordan, who by definition has a Michael J Jordan number of 0, and he starred in Space Jam, and that movie has a long cast list, so I’m guessing his Michael B Jordan number is no more than 4. But no scholarly publications (no, this namesake doesn’t count), so his Michael I Jordan number is infinity.

I’m guessing, though, that there are some people out there with a finite Jordan^3 number. Any ideas? Someone you know who’s acted in a legit production, coauthored a scholarly publication, and played on a high school sports team? No Jeffrey Epstein connection required.