The Rider

I finally followed Phil’s advice and read The Rider, the 1978 cult classic by Tim Krabbé. It lived up to the hype. The book is the story of a minor-league bicycle race, as told by one of its participants, a journalist and amateur road racer. It’s pretty much a perfect book in form and content.

I want to say that The Rider belongs on a shelf of classic short nonfiction books, along with A Little Book About a Big Memory by A. R. Luria, How Animals Work by Knut Schmidt-Nielsen, The Origins of the Second World War by A. J. P. Taylor, Total Poker by David Spanier, and . . . hmmm, there aren’t really so many classic short nonfiction books, are there?

In this interview, Krabbé characterizes The Rider as “a novel” but also as “90 to 95 percent real.” I wonder if he went back over the course while writing the book so as to jog his memory and help him get the details right.

I kinda wish Krabbé had gone to the trouble to make the book 100% real. It’s not clear to me what the missing 5 to 10% is. Is he just saying his recollection is imperfect so there will be inevitable mistakes? Or did he change the names or combine some racers into composite characters? Did he reorder some events to make a better story? Did he just make up some stories entirely? The book is great, so I’m in no position to question Krabbé’s judgment in introducing fiction to his story. But if it’s 90 to 95% real, couldn’t he have written a short appendix telling us where the made-up stuff was? I feel that would increase my appreciation of the book. Krabbé has no obligation to do anything like that; I just think it would make this great book even better.

Shreddergate! A fascinating investigation into possible dishonesty in a psychology experiment

From A to Z

A couple years ago we discussed a post by Mark Zimbelman expressing skepticism regarding a claim by psychologist (also business school professor, TED talk star, NPR favorite, Edge Foundation associate, retired Wall Street Journal columnist, Founding Partner of Irrational Capital, insurance agent, bestselling author, etc., etc.) Dan Ariely. Zimbelman wrote:

I have had some suspicions about some experiments that Professor Ariely ran using a shredder that was modified so it looked like it was shredding but really wasn’t . . . He claimed it was “quite simple” to convert a shredder by breaking the teeth in the middle with a screwdriver . . . We were unable to break any teeth on the shredders we purchased but ended up finding a way to remove some of the teeth in the center by taking the shredder apart. Unfortunately, when we did this the papers would no longer go through the shredder without getting turned to one side or another and they inevitably got stuck because the shredder no longer had enough teeth to pull them through. We concluded that it was impossible to modify any of the shredders we bought . . .

A couple weeks after his first post, Zimbelman followed up with further investigation:

I [Zimbelman] did an extensive literature search (involving several others who helped out) looking for the research that he claims was done with the modified shredder. The end result is that I can’t find any published paper that discusses using a modified shredder. I even called one of his co-authors and asked him if the experiment that they ran together used a modified shredder. He said the shredder in their study was not modified.

I did find a few papers that used a regular shredder but did not mention any modifications. I also found several statements (including this one and the one linked above) where he claims to use this mysterious modified shredder. Overall then, here’s where we are with the shredder:

1. Dr. Ariely has made numerous claims to use a modified shredder in his matrix experiments.

2. I am unable to find any published papers by Dr. Ariely that use a modified shredder.

3. Modifying a shredder to do what he has claimed appears to be very unlikely.

Zimbelman’s posts were from late 2021, and I reported on them in early 2022. That’s where things stood for me until an anonymous tipster pointed us to this video that Ariely posted to YouTube in September 2023, along with this note:

Over the years I ran many different versions of honesty/ dishonesty experiments. In one of them, I used a shredder. Here is a short piece from the Dishonesty movie in which this shredder is starring and can be seen in action.

Here are some screenshots:

The Dishonesty movie is from 2015. Here’s a press release from 2016 promoting the movie. The press release says that the experiment that used the shredder was performed in 2002.

I assume the scene in the movie showing the experiment was a reconstruction. It just seems doubtful that in 2002 they would’ve taken these videos of the participants in that way. There’s nothing dishonest about reconstructing a past scene—documentaries do that all the time!—I’m just trying to figure out exactly what happened here. Just to be sure, I watched the clip carefully for any clues about when it was shot . . . and I noticed this:

Let’s zoom in on that mobile phone:

I don’t know anything about mobile phones, so I asked people who did, and they assured me that the phones in 2002 didn’t look like that, which again suggests the scene in the documentary was a reconstruction. Which is still fine, no problems yet.

Home Depot . . . or Staples?

Zimbelman’s post links to a voicemail message from Ariely saying they bought the shredder from Home Depot. But in the video, the shredder is labeled Staples:

Wassup with that? In the note accompanying the YouTube video, Ariely says that the shredder in the video is the same as the one they used. (“Over the years I ran many different versions of honesty/ dishonesty experiments. In one of them, I used a shredder. Here is a short piece from the Dishonesty movie in which this shredder is starring and can be seen in action.”) Which makes sense. It’s not like they’d have a whole bunch of modified shredders kicking around. But then where did they buy it? Back in 2002, was Home Depot carrying Staples brand products? I guess a real shredder-head could tell just by looking whether the shredder in the video is vintage 2002 or 2015. The shredder in the video doesn’t look 13 years old to me, but who am I to say? Unfortunately, the image of the label on the back of the machine isn’t readable:

You can make out the Staples brand name but that’s about it.

So, what happened?

There are several possibilities here:

1. The 2002 experiment happened as described: they really modified that shredder as claimed, they kept the shredder around, it was still operating in 2015, they demonstrated it in the video, and it worked as advertised. Also, that shredder in 2002 had actually been bought at Staples, and Ariely just had a lapse in memory when he said they’d bought it at Home Depot.

2. The 2002 experiment happened as described: they really bought that shredder at Home Depot and modified it as claimed, but then the shredder was lost, or discarded, or broke. When they made the movie, they went to Staples and bought a new shredder, modified it (using that approach which Ariely said is simple but which Zimbelman in the above-linked post said is actually difficult or impossible), and it worked just as planned. Under this scenario, Ariely was not telling the truth when he wrote that the shredder is the same as in the original experiment, but, hey, it’s just a movie, right? The Coen brothers had that title card saying that Fargo was based on a true story, and it wasn’t—that doesn’t mean they were “lying,” exactly!

3. The 2002 experiment happened as described: they really bought that shredder at Home Depot and modified it as claimed, but then the shredder was lost, or discarded, or broke. When they made the movie, they went to Staples and bought a new shredder, but at this point they didn’t bother to try to modify it. It was just a dramatization, after all! Why ruin a perfectly good shredder from Staples? Instead they filmed things just as they literally look in the video: they put paper in the shredder, then off screen they mangle the edges of some other sheets of paper, put them in the bin at the bottom of the shredder, and then turn the camera back on and take the partially-mangled papers out. The way I wrote this, it seems kinda complicated, but if you think about it from the standpoint of someone making a video, this is much easier than getting some modified shredder to work! Indeed, if you look at the video, the sheets of paper that went into the shredder at the beginning are not the same as the ones that they took out at the end.

4. The 2002 experiment never happened as described. Ariely or one of his collaborators had that cool idea of modifying the shredder, but it wasn’t so easy to do so they gave up and just faked the study. Then when making the 2015 movie they just said they lost the shredder, and the moviemakers just did the reenactment as described in option 3 above.

5. I’m sure there are other possibilities I didn’t think of!

Is the video “fake”? I doubt it! I assume the video is either a clip from the movie or was filmed at the same time as the movie, and in either case it’s a reenactment of Ariely’s description of what happened in the experiment. In a reenactment, if you show some sheets of paper fed into a shredder and then later you show some sheets of paper removed from the shredder, there’s no reason that they have to be the same sheets of paper. Similarly, if a movie reenacts a flight from New York to Chicago, and it shows a shot of a plane taking off from LaGuardia, followed by a shot of a plane landing at O’Hare, they don’t have to be the same plane. It’s just a reenactment! The fact that someone made a video reenacting a scene with a shredder does not imply that this shredding actually happened in the video—there’s really no reason for them to have gone to the trouble to do this shredding at all.

So, there are several possibilities consistent with the information that is currently available to us. You can use your own judgment to decide what you think might have happened in 2002 and 2015.

“Dishonesty can permeate through a system and show up not because of selfish interest but because of a desire to help.”

In the above-linked press release—the one that said, “we modified the shredder! We only shredded the sides of the page, whereas the body of the page remained intact”—Ariely also said:

There are pressures for funding. Imagine you run a big lab with 20 people and you’re about to run out of funding – what are the pressures of taking care of the people who work with you? And what kind of shortcuts would you be willing to take? Dishonesty can permeate through a system and show up not because of selfish interest but because of a desire to help.

With regards to research, I don’t think that most people think long-term and think to themselves that somebody would try to replicate their results and find that they don’t work. People often tend to “tweak” data and convince themselves that they are simply helping the data show its true nature. There are lots of things in academic publications that are manifestations of our abilities to rationalize. . . .

I think that the pressures of publication, funding, helping the group, and reputation are very much present in academic publications.

Which is interesting given the fraudulent projects he was involved in. Interesting if he was doing fraud and interesting if he was the unfortunate victim of fraud. He’s an admirably tolerant person. Fraud makes me angry; I would not be so quick to refer to it as “not because of selfish interest but because of a desire to help.”

Wassup, NPR? You raised concerns in 2010 but then kept promoting him for years

One of the above sources links to this NPR article from 2010:

Should You Be Suspicious Of Your Dentist Or NPR’s Source? . . .

Last month, Dan Ariely, a behavioral economics professor, talked with All Things Considered host Robert Siegel, about how incredibly loyal, almost irrationally so, people are to their dentists – more so than with other medical professions. . . . when he appeared on NPR’s air, there was every reason to trust him.

Ariely offered information certain to unnerve listeners and anger dentists – information based on a fact that he cannot back up.

If two dentists were asked to identify cavities from the same X-ray of the same tooth, Ariely said they would agree only 50 percent of the time.

Ariely cited Delta Dental insurance as his source. However, Delta spokesman Chris Pyle said there is no data that could lead to that conclusion. . . .

Here is what Ariely said:

Prof. ARIELY: And we asked both dentists to find cavities. And the question is, what would be the match? How many cavities will they find, both people would find in the same teeth? . . . It turns out what Delta Dental tells us is that the probability of this happening is about 50 percent. . . .

It’s really, really low. It’s amazingly low. Now, these are not cavities that the dentist finds by poking in and kind of actually measuring one. It’s from X-rays. Now, why is it so low? It’s not that one dentist find cavities and one doesn’t. They both find cavities, just find them in different teeth. . . .

“According to Dr. Ariely, he was basing his statement on a conversation he said he had with someone at Delta Dental,” said Pyle. “But he cannot cite Delta Dental in making that claim because we don’t collect any data like that which would come to such a conclusion.”

So what happened?

Ariely said he got that 50 percent figure from a Delta source who told him about “some internal analysis they have done and they told me the results. But they didn’t give me the raw data. It’s just something they told me.”

Ariely did not provide the name of the Delta medical officer, whom Ariely said was not interested in talking with me. . . .

Ariely told me he happened upon that figure when he was conducting research analyzing 20 years of raw data on Delta claims. . . . But Ariely did not see or analyze any data that would lead to a conclusion that dentists would agree only 50 percent of the time based on studying an X-ray.

Wow. The NPR report continues:

But what is NPR’s responsibility? . . . NPR can’t re-report and check out every thing that an on-air guest says. . . . In this case, the interview with Ariely was taped ahead of time and edited for air – but no one thought it necessary to challenge his undocumented statement.

ATC executive director Christopher Turpin said NPR had no reason to question Ariely, given his credentials as a tenured professor and an expert on how irrational human beings are. . . .

ATC has other pre-taped segments with Ariely, and those should be double-checked before they are aired. There’s no doubt that Ariely is both entertaining and informative about how irrational we humans are — but he also must be right.

It’s funny that, after all that, they write that there’s “no doubt” that Ariely is informative. I have some doubt on that one!

But here’s the interesting thing. The above warning was from 2010. Not 2022, not 2020. 2010. Fourteen years ago. But if you google *NPR Ariely*, you get a bunch of items since then:

2011: Is Marriage Rational? : Planet Money

2011: For Creative People, Cheating Comes More Easily

2012: TED Radio Hour: Dan Ariely: Why Do We Cheat?

2012: ‘The Honest Truth’ About Why We Lie, Cheat And Steal

2014: Dan Ariely: Where’s The Line Between Cheating A Little and Cheating A Lot?

2014: Rethinking Economic Theory: The Evolutionary Roots Of Irrationality : 13.7: Cosmos And Culture

2015: Dan Ariely: What Pushes Us To Work Hard — Even When We Don’t Have To?

2017: Dan Ariely: When Are Our Decisions Made For Us?

2018: Everybody Lies, And That’s Not Always A Bad Thing

2020: Why Some People Lie More Than Others

And then in the past year or so there have been some skeptical stories, such as this from 2023: Did an honesty researcher fabricate data?

But until recently NPR was running lots and lots of stories quoting Ariely completely without question—for years and years after they’d been explicitly warned back in 2010.

The scientist-as-hero narrative is just so strong that NPR kept going back to that well even after their own warning.

Why post on this?

God is in every leaf of every tree. It’s interesting how the more you look into this shredder story the more you can find. But there’s always some residual uncertainty, because there’s always some elaborate explanation that we haven’t thought of.

In the meantime, I’d recommend following the advice of that 2010 NPR report and asking people for their evidence when they make claims of scientific breakthroughs. There’s nothing you can do to stop people from flat-out lying, but if you can get purported experts to specify their sources, that should help.

If scientists know ahead of time that they’re expected to produce the shredder, as it were, maybe they’d be less likely to make things up in the first place.

Freakonomics asks, “Why is there so much fraud in academia,” but without addressing one big incentive for fraud, which is that, if you make grabby enough claims, you can get featured in . . . Freakonomics!

There was this Freakonomics podcast, “Why Is There So Much Fraud in Academia?” Several people emailed me about it, pointing out the irony that the Freakonomics franchise, which has promoted academic work of such varying quality (some excellent, some dubious, some that’s out-and-out horrible), had a feature on this topic without mentioning all the times that they’ve themselves fallen for bad science.

As Sean Manning puts it, “That sounds like an episode of the Suburban Housecat Podcast called ‘Why are bird populations declining?'”

And Nick Brown writes:

Consider the first study on the first page of the first chapter of the first Freakonomics book (Gneezy & Rustichini, 2000, “A Fine is a Price”, 10.1086/468061), in which, when daycare centres in Israel started “fining” parents for arriving late to pick up their children, the amount of lateness actually went up. I have difficulty in believing that this study took place exactly as described; for example, the number of children in each centre appears to remain exactly the same throughout the 20 weeks of the study, with no mention of any new arrivals, dropouts, or days off due to illness or any other reason. Since noticing this, I have discovered that an Israeli economist named Ariel Rubinstein had similar concerns (https://arielrubinstein.tau.ac.il/papers/76.pdf, pp. 249–251). He contacted the authors, who promised to put him in touch with the staff of the daycare centres, but then sadly lost the list of their names. The paper has over 3,200 citations on Google Scholar.

I replied: Indeed, the Freakonomics team has never backed down from many of the ridiculous claims they have promoted, including the innumerate assertion that beautiful parents are 36% more likely to have girls and some climate change denial. But I’m not criticizing the researchers who participated in this latest Freakonomics show; we have to work with the news media we have, flawed as they are.

And, as I’ve said many times before, Freakonomics has so much good stuff. That’s why I’m disappointed, first when they lower their standards and second when they don’t acknowledge or wrestle with their past mistakes. It’s not too late! They could still do a few shows—or even write a book!—on various erroneous claims they’ve promoted over the years. It would be interesting, it would fit their brand, it could be educational and also lots of fun.

This is similar to something that occurs in the behavioral economics literature: there’s so much research on how people make mistakes, how we’re wired to get the wrong answer, etc., but then not so much about the systematic errors made in the behavioral economics literature itself. As they’d say in Freakonomics, behavior can be driven by incentives.

P.S. Some interesting discussion in comments regarding the Gneezy and Rustichini paper. I’ve not looked into this one in detail, and my concerns with Freakonomics don’t come from that example but from various other cases over the years where they’ve promoted obviously bad science; see above links.

Movements in the prediction markets, and going beyond a black-box view of markets and prediction models

My Columbia econ colleague Rajiv Sethi writes:

The first (and possibly last) debate between the two major party nominees for president of the United States is in the books. . . . movements in prediction markets give us a glimpse of what might be on the horizon.

The figure below shows prices for the Harris contract on PredictIt and Polymarket over a twenty-four hour period that encompasses the debate, adjusted to allow for their interpretation as probabilities and to facilitate comparison with statistical models.

The two markets responded in very similar fashion to the debate—they moved in the same direction to roughly the same degree. One hour into the debate, the likelihood of a Harris victory had risen from 50 to 54 on PredictIt and from 47 to 50 on Polymarket. Prices fluctuated around these higher levels thereafter.

Statistical models such as those published by FiveThirtyEight, Silver Bulletin, and the Economist cannot respond to such events instantaneously—it will take several days for the effect of the debate (if any) to make itself felt in horse-race polls, and the models will respond when the polls do.

This relates to something we’ve discussed before, which is how one might consider improving a forecast such as ours at the Economist magazine so as to make use of available information that’s not in the fundamentals-based model and also hasn’t yet made its way into the polls. Such information includes debate performance, political endorsements, and other recent news items as well as potential ticking time bombs such as unpopular positions that are held by a candidate but of which the public is not yet fully aware.

Pointing to the above graph that shows the different prices in the different markets, Sethi continues:

While the markets responded to the debate in similar fashion, the disagreement between them regarding the election outcome has not narrowed. This raises the question of how such disagreement can be sustained in the face of financial incentives. Couldn’t traders bet against Trump on Polymarket and against Harris on PredictIt, locking in a certain gain of about four percent over two months, or more than twenty-six percent at an annualized rate? And wouldn’t the pursuit of such arbitrage opportunities bring prices across markets into alignment?

There are several obstacles to executing such a strategy. PredictIt is restricted to verified residents of the US who fund accounts with cash, while trading on Polymarket is crypto-based and the exchange does not accept cash deposits from US residents. This leads to market segmentation and limits cross-market arbitrage. In addition, PredictIt has a limit of $850 on position size in any given contract, as well as a punishing fee structure.
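To see where those numbers come from, here is a back-of-the-envelope sketch of the would-be arbitrage. The prices are the approximate post-debate ones quoted above, and it ignores the fees, position limits, and access restrictions that Sethi describes, so treat it as illustrative only:

```python
# Back-of-the-envelope arbitrage arithmetic, using approximate post-debate
# prices from the discussion above. Ignores fees, the $850 position limit,
# and the market-access restrictions, so this is illustrative only.
harris_on_polymarket = 0.50      # buy "Harris yes" where she is cheaper
trump_on_predictit = 1 - 0.54    # buy "Trump yes" where Harris is pricier

cost = harris_on_polymarket + trump_on_predictit  # about 0.96 per pair
payout = 1.00                    # exactly one of the two contracts pays out
two_month_return = payout / cost - 1              # roughly 4 percent
annualized = (1 + two_month_return) ** 6 - 1      # a bit over 26 percent

print(f"{two_month_return:.1%} over two months, {annualized:.1%} annualized")
```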

This is all super interesting. So much of the discussion I’ve seen of prediction markets is flavored by pro- or anti-market ideology, and it’s refreshing to see these thoughts from Sethi, an economist who studies prediction markets and sees both good and bad things about them without blindly promoting or opposing them in an ideological way.

Sethi also discusses public forecasts that use the fundamentals and the polls:

While arbitrage places limits on the extent to which markets can disagree, there is no such constraint on statistical models. Here the disagreement is substantially greater—the probability of a Trump victory ranges from 45 percent on FiveThirtyEight to 49 percent on the Economist and 62 percent on Silver Bulletin.

Why the striking difference across models that use basically the same ingredients? One reason is a questionable “convention bounce adjustment” in the Silver Bulletin model, without which its disagreement with FiveThirtyEight would be negligible.

But there also seem to be some deep differences in the underlying correlation structure in these models that I find extremely puzzling. For example, according to the Silver Bulletin model, Trump is more likely to win New Hampshire (30 percent) than Harris is to win Arizona (23 percent). The other two models rank these two states very differently, with a Harris victory in Arizona being significantly more likely than a Trump victory in New Hampshire. Convention bounce adjustments aside, the correlation structure across states in the Silver Bulletin model just doesn’t seem plausible to me.

I have a few thoughts here:

1. A rule of thumb that I calculated a few years ago in my post, Is it meaningful to talk about a probability of “65.7%” that Obama will win the election?, is that a 10 percentage point shift in win probability corresponds roughly to a four-tenths of a percentage point swing in expected vote share. So the 5 percentage point swings in those markets correspond to something like a two-tenths of a percentage point swing in opinion, which can crudely be thought of as roughly equivalent to an implicit model where the ultimate effect of the debate is somewhere between zero and half a percentage point. (The arithmetic behind this rule of thumb is spelled out in a short sketch at the end of this post.)

2. The rule of thumb gives us a way to roughly calibrate the difference in predictions of different forecasts. A difference between a Trump win probability of 50% in one forecast and 62% in another corresponds to a difference of half a percentage point in predicted national vote share. It doesn’t seem unreasonable for different forecasts to differ by half a percentage point in the vote, given all the judgment calls involved in what polls to include, how to adjust for different polling organizations, how to combine state and national polls, and how you set up the prior or fundamentals-based model.

3. Regarding correlations: I think that Nate Silver’s approach has both the strengths and weaknesses of a highly empirical, non-model-based approach. I’ve never seen a document that describes what he’s done (fair enough, we don’t have such a document for the Economist model either!); my impression based on what I’ve read is that he starts with poll aggregation, then applies some sort of weighting, and then has an uncertainty model based on uncertainty in state forecasts and uncertain demographic swings. I think that some of the counterintuitive behavior in the tails comes from the demographically-driven uncertainties, and also from the fact that, at least when he was working under the FiveThirtyEight banner, he wanted to have wide uncertainties in the national electoral college forecast, and with the method he was using, the most direct way to do this was to give huge uncertainties to the individual states. The result was weird stuff like the prediction that, if Trump were to win New Jersey, his probability of winning Alaska would go down. This makes no sense to anyone other than Nate because, if Trump were to have won in New Jersey, that would’ve represented a total collapse of the Democratic ticket, and it’s hard to see how that would’ve played out as a better chance for Biden in Alaska. The point here is not that Nate made a judgment call about New Jersey and Alaska; rather, a 50-state prediction model is a complicated thing. You build your model and fit it to available data, then you have to check its predictions every which way, and when you come across results that don’t make sense, you need to do some mix of calibrating your intuitions (maybe it is reasonable to suppose that Trump winning New Jersey would be paired with Biden winning Alaska?) and figuring out what went wrong with the model (I suspect some high-variance additive error terms that were not causing problems with the headline national forecast but had undesirable properties in the tail). You can figure some of this out by following up and looking at other aspects of the forecast, as I did in the linked post.

So, yeah, I wouldn’t take the correlations of Nate’s forecast that seriously. That said, I wouldn’t take the correlations of our Economist forecast too seriously either! We tried our best, but, again, many moving parts and lots of ways to go wrong. One thing I like about Rajiv’s post is that he’s willing to do the same critical work on the market-based forecasts, not just treating them as a black box.
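Here is the rule-of-thumb arithmetic from points 1 and 2 spelled out. The 0.04 conversion factor is just the crude approximation from my earlier post, so treat this as a rough translation, not a model:

```python
# Crude rule of thumb: a 10-percentage-point change in win probability
# corresponds to roughly a 0.4-point change in expected national vote share,
# i.e. a conversion factor of about 0.04. Rough approximation only.
VOTE_SHARE_POINTS_PER_WIN_PROB_POINT = 0.04

def implied_vote_swing(win_prob_change_points: float) -> float:
    """Translate a change in win probability (percentage points) into an
    approximate change in expected vote share (percentage points)."""
    return win_prob_change_points * VOTE_SHARE_POINTS_PER_WIN_PROB_POINT

# Point 1: the roughly 5-point post-debate move in the markets.
print(implied_vote_swing(5))    # about 0.2 points of vote share

# Point 2: the gap between a 50% and a 62% Trump win probability.
print(implied_vote_swing(12))   # about 0.5 points of vote share
```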

It’s Stanford time, baby: 8-hour time-restricted press releases linked to a 91% higher risk of hype

Adam Pollack writes:

You and the blog readers might find this interesting: https://newsroom.heart.org/news/8-hour-time-restricted-eating-linked-to-a-91-higher-risk-of-cardiovascular-death.

Yesterday, my friend was very concerned for me after he found out I usually don’t eat breakfast. He told me it’s dangerous. I thought it was as simple as not being hungry for a few hours after I wake up.

He showed me the above press release from the American Heart Association newsroom. I have never seen the results of an abstract for a poster publicized like this. It even made it to CNN (https://www.cnn.com/2024/03/19/health/intermittent-fasting-pros-cons-wellness/index.html). Both the press release and the CNN article emphasize that the findings are preliminary. For example, the press release says “As noted in all American Heart Association scientific meetings news releases, research abstracts are considered preliminary until published in a peer-reviewed scientific journal.”

This doesn’t make me feel better about the situation. Let’s pretend this analysis was conducted perfectly (whatever that means). How would the AHA newsroom & CNN report the results if this was peer-reviewed and published? From the newsroom quote above, I get the sense that if it’s in the peer-reviewed scientific journal the press release wouldn’t have any caveat. Maybe they’ll even recommend people change their lifestyles and diets?

I’m being a little disingenuous because the editor’s note from the day after the first press release tells readers they should always consult with their doctor before making changes to their health regimens. Wait, why is there an editor’s note the day after a press release that provides full poster presentation details?? I’m guessing this caused an uproar to some degree in the community. In general, there’s a lot to unpack from this about science communication and the role of science in informing decisions. I’d be most interested in a discussion on your blog about those points, though I’m sure that the poster could inspire some nice statistical discussion too (https://s3.amazonaws.com/cms.ipressroom.com/67/files/20242/8-h+TREmortality_EPI+poster_updated+032724.pdf). For example, the press release reports the authors “were surprised to find that people who followed an 8-hour, time-restricted eating schedule were more likely to die from cardiovascular disease” and it turns out that’s one of 4 effects (they look like interaction effects) with p < .05 across all the comparisons they make.

The press release refers to “those who followed an 8-hour time-restricted eating schedule, a type of intermittent fasting,” but from the poster, we see that this “eating duration” variable is the average eating duration for the two dietary recall days in the survey. Of the 414 people in the study who reported less than 8 hours averaging those two days, 31 died of cardiovascular disease during the period of the study. In comparison, the reference group is 12-16 hours, which included 11,831 people, of whom 423 died of cardiovascular disease. (31/414)/(423/11831) = 2.09. The reported risk ratio of 1.91 comes from a hazard regression adjusting for a bunch of variables including demographics, smoking, and drinking but also total energy intake, body mass index, and self-reported health condition status.
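For anyone who wants to check the crude calculation from the counts on the poster, here it is; the 1.91 in the press release is the covariate-adjusted hazard ratio, not this unadjusted ratio:

```python
# Crude (unadjusted) risk ratio from the counts reported on the poster.
deaths_short, n_short = 31, 414       # <8-hour average eating window
deaths_ref, n_ref = 423, 11831        # 12-16-hour reference group

crude_risk_ratio = (deaths_short / n_short) / (deaths_ref / n_ref)
print(round(crude_risk_ratio, 2))     # 2.09; the adjusted hazard ratio is 1.91
```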

Looks like noise mining to me, but, hey, all things are possible.

Based on what I see in the paper, the statement, “people who followed an 8-hour, time-restricted eating schedule were more likely to die from cardiovascular disease,” does not seem like an accurate description of the data. How you ate in two days of a survey is hardly an “eating schedule.”

Also they say, “Our study’s findings encourage a more cautious, personalized approach to dietary recommendations, ensuring that they are aligned with an individual’s health status and the latest scientific evidence,” which sounds like gobbledygook. You don’t need a statistical analysis to know that, right?

The press release quotes someone else as saying, “Overall, this study suggests that time-restricted eating may have short-term benefits but long-term adverse effects.” B-b-but . . . if they only asked about how people ate for 2 days, in what sense is this telling us about long-term effects? He does follow up with, “it needs to be emphasized that categorization into the different windows of time-restricted eating was determined on the basis of just two days of dietary intake,” and I’m like, yeah, but then how do you get away with that first statement? OK, he is at Stanford Medical School.

B-school prof data sleuth lawsuit fails

Stephanie Lee tells the story: “She Sued the Sleuths Who Found Fraud in Her Data. A Judge Just Ruled Against Her.” Good touch in the headline to say “Found” rather than “Alleged.”

Further background here (“Ted-talking purveyors of fake data who write books about lying and rule-breaking . . . what’s up with that?”), and lots more links here.

P.S. More here from Gideon Lewis-Kraus.

Decisions of parties to run moderate or extreme candidates

Palko points to this article by political journalist Jonathan Chait, “The Democrats Remembered How Politics Works Again: An end to a decade of magical thinking.” Chait points out that pundits and political professionals on the left, right, and center anticipated a Democratic wipeout in 2022, all for different reasons: The left thought the Democrats were insufficiently progressive and so would not be able to motivate their core voters; the right thought that Biden and the Democratic Congress had alienated voters with their liberal policies; and everybody expected that historical patterns of midterm elections would result in big wins for the out-party. A few months before the election, Chris Wlezien and I argued that these expectations should change in light of the controversial decisions that the Republican-controlled Supreme Court had issued earlier that year.

Chait puts it well:

And indeed, the election results, both in the aggregate and in many of the particulars, vindicate the belief that voters tend to punish rather than reward parties and candidates they associate with radical ideas. To be sure, a tendency is not a rule. The largest factor driving election results is external world events: economic prosperity (or its absence), rallying around the flag in the event of a foreign attack, or widespread disgust with a failed war or major scandal. Midterm elections generally have large swings against the president’s party. One reason 2022 defied the pattern is that the Dobbs decision made Republicans, not Democrats, the party carrying out radical change. Candidates and parties seen as safe and moderate have an advantage — one that may not always override other factors but which matters quite a bit.

This is a fairly uncontroversial finding among political scientists.

I agree that this is a fairly uncontroversial finding among political scientists. See, for example, this unpublished paper with Jonathan Katz from 2007, and there’s a lot more literature on the topic.

Chait continues:

Yet in recent years, many influential figures in the Democratic Party had come to disbelieve it. A series of mistakes followed from this belief that Democrats would pay no penalty or may even benefit from moving farther away from the center.

I guess that some Democrats have this attitude, as do some Republicans. It’s natural to believe that the positions that you deeply hold would be shared by a majority of the population, if they were only given an opportunity to choose them. But I wonder if part of this, on both sides, is the rational calculation that moderation, while beneficial, doesn’t help that much, and so it can make sense in an election where you have a big advantage to run a more extreme candidate in order to get a policy benefit. That would explain the Republicans’ decision to choose extremist candidates in high-profile close races in 2022. Chait’s article has some interesting background on debates within the Democratic party on similar decisions in past elections.

Awesome online graph guessing game. And scatterplot charades.

Julian Gerez points to this awesome time-series guessing game from Ari Jigarjian. The above image gives an example. Stare at the graph for a while and figure out which is the correct option.

I don’t quite know how Jigarjian does this—where he gets the data and the different options in the multiple-choice set. Does he just start with a graph and then come up with a few alternative stories that could fit, or is there some more automatic procedure going on? In any case, it’s a fun game. A new one comes every day. Some are easy, some not so easy. I guess it depends primarily on how closely your background knowledge lines up with the day’s topic, but also, more interestingly, on how much you can work out the solution by thinking things through.

This graph guessing game reminds me of scatterplot charades, a game that we introduce in section 3.3 of Active Statistics:

Students do this activity in pairs. Each student should come to class with a scatterplot on some interesting topic printed on paper or visible on their computer or phone, and then reveal the plot to the other student in the pair, a bit at a time, starting with the dots only and then successively uncovering units, axes and titles. At each stage, the other student should try to guess what is being plotted, with the final graph being the reveal.

In the book we give four examples. Here are two of them:

The time-series guessing game is different than scatterplot charades in being less interactive, but fun in its own way. The interactivity of scatterplot charades makes for a good classroom demonstration; the non-interactivity of the time-series guessing game makes for a good online app.

Bayesian social science conference in Amsterdam! Next month!

E. J. Wagenmakers writes:

This year Maarten Marsman and I are the local coordinators of the workshop Bayesian Methods for the Social Sciences II. We hope to organize this conference every two years, alternating between Paris and Amsterdam.

We have another great line-up of speakers this year, and we’d like to spread the word to a larger audience.

The conference takes place 16-18 October 2024 in Amsterdam.

Here’s the list of scheduled talks:

Merlise Clyde (Duke): Estimating Posterior Model Probabilities via Bayesian Model Based Sampling
Marie Perrot-Dockès (Université Paris Cité): Easily Computed Marginal Likelihoods from Posterior Simulation Using the THAMES Estimator
Joris Mulder (Tilburg): An empirical Bayes factor for testing random effects

Monica Alexander (Toronto): Estimating Childlessness by Age and Race in the United States using a Bayesian Growth Curve Model
Leontine Alkema (University of Massachusetts): A Bayesian approach to modeling demographic transitions with application to subnational estimation and forecasting of family planning and fertility indicators
Douglas Leasure (Oxford): Population nowcasting in a digital world to support humanitarian action and sustainable development

Radu Craiu (Toronto): Bayesian Copula-based Latent Variable Models
Daniel Heck (Marburg): Bayesian Modeling of Uncertainty in Stepwise Estimation Approaches
Riccardo Rastelli (University College Dublin): A latent space model for multivariate time series analysis

Daniele Durante (Bocconi): Bayesian modeling of criminal networks
Nial Friel (University College Dublin): Bayesian stochastic ordered block models
Maarten Marsman (Amsterdam): Bayesian Edge Selection for Psychometric Network (Graphical) Models

Marco Corneli (Université Côte d’Azur): A Bayesian approach for clustering and exact finite-sample model selection in longitudinal data mixtures
Irene Klugkist (Utrecht): Bayesian Evidence Synthesis in the context of informative hypotheses
Eric-Jan Wagenmakers (Amsterdam): Optional Stopping

Herbert Hoijtink (Utrecht): Bayesian evaluation of single case experimental designs
Adrian Raftery (University of Washington): Bayesian climate change assessment
Robin Ryder (Paris-Dauphine and Imperial College London): Can Bayesian methods reconstruct deep language history?

François Caron (Oxford): Sparse Spatial Network Models for the analysis of mobility data
Geoff Nicholls (Oxford): Partial order models for social hierarchies and rank-order data
Amandine Véber (Université Paris Cité): Modelling expanding biological networks

Lots of great stuff here! I could do without the Bayes factor, but everything else looks really cool, an interesting mix of theoretical and applied topics.

The mainstream press is failing America (UK edition)

This is Bob.

I’m in London (about to head to StanCon!), so I saw today’s opinion piece in The Guardian (a UK newspaper not immune to the following criticisms), which I think is a nice summary of the sorry state of the media (and courts) in the United States.

It has a classic strong open,

The first thing to say about the hate and scorn currently directed at the mainstream US media is that they worked hard to earn it. They’ve done so by failing, repeatedly, determinedly, spectacularly to do their job, which is to maintain their independence, inform the electorate, and speak truth to power.

If you haven’t been following Mark Palko’s blog, West Coast Stats Views, or this whole storyline elsewhere, this article does a good job of framing the issue.

In the article, Jeff Jarvis, a former editor and columnist, is quoted as saying (er, tweeting [or do they call it X-ing now?]),

What ‘press’? The broken and vindictive Times? The newly Murdochian Post? Hedge-fund newspaper husks? Rudderless CNN or NPR? Murdoch’s fascist media?

The article reprises what Mitzi Morris has been saying ever since she worked at the New York Times in the 1990s, when it was first going digital. She was appalled by the entire organization’s highly misleading approach to its readership statistics and its focus on clickbait. Her take agrees with the article’s,

In pursuit of clickbait content centered on conflicts and personalities, they follow each other into informational stampedes and confirmation bubbles.

As Palko has been pointing out on his blog, the mainstream media, in part led by the New York Times, no longer seems concerned with candidates’ mental health and age now that the issue can no longer be aimed at Joe Biden. Instead, you get what the article describes this way,

They pursue the appearance of fairness and balance by treating the true and the false, the normal and the outrageous, as equally valid and by normalizing Republicans, especially Donald Trump, whose gibberish gets translated into English and whose past crimes and present-day lies and threats get glossed over.

This whole trend goes back to at least the Clinton/Trump election. The Times’s relentless focus on Clinton’s emails while ignoring all of Trump’s malfeasance led me to stop reading the paper after that. Also, whenever I read anything I know about in the press, like epidemiology or computer science, the coverage is appallingly misleading.

There’s a lot more detail in the article. And a whole lot more on Palko’s blog if you want to do a deeper dive. On a related topic, I’d also recommend Palko’s coverage of Elon Musk’s shenanigans.

Modeling Weights to Generalize (my talk this Wed noon at the Columbia University statistics department)

In the student-organized seminar series, Wed 11 Sep 2024 noon, in room 1025 (moved from room 903), Social Work Bldg:

Modeling Weights to Generalize

A well-known rule in practical survey research is to include weights when estimating a population average but not to use weights when fitting a regression model—as long as the regression includes as predictors all the information that went into the sampling weights. But what if you don’t know where the weights came from? We propose a quasi-Bayesian approach using a joint regression of the outcome and the sampling weight, followed by poststratification on the two variables, thus using design information within a model-based context to obtain inferences for small-area estimates, regressions, and other population quantities of interest. For background, see here: http://www.stat.columbia.edu/~gelman/research/unpublished/weight_regression.pdf

Research!
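For readers who want the flavor of the idea, here is a drastically simplified sketch. It is not the method in the linked paper (no joint model, no Bayes), just a toy illustration of the contrast between plugging weights into a weighted average and modeling the outcome given the weights and then poststratifying on weight strata; all data and variable names are made up:

```python
# Toy illustration, not the method in the linked paper: compare the classical
# weighted average with a crude "model y given the weights, then poststratify
# on weight strata" estimate of the population mean.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
w = np.exp(rng.normal(0, 0.7, size=n))        # survey weights of unknown origin
y = rng.normal(1.0 + 0.5 * np.log(w), 1.0)    # outcome correlated with weights

# 1. Design-based estimate: weighted average.
weighted_mean = np.sum(w * y) / np.sum(w)

# 2. Model-then-poststratify: a piecewise-constant regression of y on weight
#    strata, with stratum population shares estimated from the weights.
edges = np.quantile(w, [0.2, 0.4, 0.6, 0.8])
strata = np.digitize(w, edges)                # stratum labels 0..4
stratum_means = np.array([y[strata == s].mean() for s in range(5)])
stratum_shares = np.array([w[strata == s].sum() for s in range(5)])
stratum_shares /= stratum_shares.sum()
poststrat_mean = np.sum(stratum_means * stratum_shares)

print(weighted_mean, poststrat_mean)          # the two estimates should be close
```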

Here’s a useful response by Christakis to criticisms of the contagion-of-obesity claims

Yesterday I posted an update on citations of the influential paper from 2007 by sociologist Nicholas Christakis and political scientist James Fowler, “The Spread of Obesity in a Large Social Network over 32 Years,” which concluded, “Network phenomena appear to be relevant to the biologic and behavioral trait of obesity, and obesity appears to spread through social ties.”

As I wrote yesterday, several other researchers had criticized that paper on methodological grounds, and in my post I characterized it as being “many times debunked” and expressed distress that the original claim seems to be regularly cited without reference to the published criticisms by economists Jason Fletcher and Ethan Cohen-Cole, mathematician Russ Lyons, political scientists Hans Noel and Brendan Nyhan, and statisticians Cosma Shalizi and Andrew Thomas.

That said, I am not an expert in this field. I have read the articles linked in the above post but have not kept track of the later literature.

In comments, Christakis shares his perspective on all this:

I [Christakis] think this post is an incomplete summary of the very carefully expressed claims in the original 2007 paper, and also an inaccurate summary of the state of the literature. You also may want to look at our original exchanges on this blog from years ago, and to our published responses to prior critiques, including to some of the decades-old critiques (often inaccurate) that you mention.[1]

Many papers have replicated our findings of social contagion with respect to obesity (and the various other phenomena discussed in our original suite of papers), and many papers have evaluated the early methods we used (based on generalized estimating equations) and have supported that approach.

For instance, analyses by various scholars supported the GEE approach, e.g., by estimating how large the effect of unobserved factors would have to be to subvert confidence in the results.[2],[3],[4],[5] Other papers supported the findings in other ways.[6],[7],[8],[9],[10] This does not mean, of course, that this GEE approach does not require various assumptions or is perfectly able to capture causal effects. This is one reason the 2007 paper described exactly what models were implemented, was judicious in its claims, and also proposed certain innovations for causal identification, including the “edge-directionality test.” The strengths and limitations of the edge-directionality test for causal identification have subsequently been explored by computer scientists,[11] econometricians,[12] statisticians,[13] and sociologists.[14]

Work by other investigators with other datasets and approaches has generally confirmed the 2007 findings. Pertinent work regarding weight and related behaviors is quite diverse, including everything from observational studies to experiments.[15],[16],[17],[18],[19],[20],[21],[22],[23] Of course, as expected, work has also confirmed the existence of homophily with respect to weight. Still other studies have used experimental and observational methods to confirm that one mechanism of the interpersonal spread of obesity might indeed be a spread of norms, as speculated in the 2007 paper.[24],[25],[26],[27]

Of course, methods to estimate social contagion with observational data regarding complex networks continue to evolve, and continue to require various assumptions, as ours did in 2007. As before, I would love to see someone offer a superior statistical method for observational data. And I should also clarify that the public-use version of the FHS-Net data (posted at dbGap) is not the same version as the one we based our 2007 analyses on (a constraint imposed by the FHS itself, in ways documented there); however, the original data is available via the FHS itself (or at least was). At times, this difference in datasets explains why other analyses have reached slightly different conclusions than us.

In our 2007 paper, we also documented an association of various ego and alter traits to a geodesic depth of three degrees of separation. We also did this with respect to public goods contributions in a network of hunter-gatherers in Tanzania[28] and smoking in the FHS-Net.[29] Other observational studies have also noted this empirical regularity with respect to information,[30],[31] concert attendance,[32] or even the risk of being murdered.[33] We summarized this topic in 2013.[34]

Moreover, we and others have observed actual contagion up to three degrees of separation in experiments which absolutely excludes homophily or context as an explanation for the clustering.[35],[36],[37] For instance, Moussaid et al documented hyper-dyadic contagion of risk perception experimentally.[38] Another experiment found that the reach of propagation in a subjective judgment task “rarely exceeded a social distance of three to four degrees of separation.”[39] A massive experiment with 61 million people on Facebook documented the spread of voting behavior to two degrees of separation.[40] A large field experiment with 24,702 villagers in Honduras showed that certain maternal and child health behaviors likewise spread to at least two degrees of separation.[41] And a 2023 study involving 2,491 women household heads in 50 poor urban residential units in Mumbai documented social contagion, too.[42]

In addition, my own lab, in the span from as early as 2010 to as recently as 2024 has published many demanding randomized controlled field trials and other experiments documenting social contagion, as noted above. For instance, my group published our first experiment with social contagion in 2010,[43] as well as many other experiments involving social contagion in economic games using online subjects[44],[45],[46] often stimulating still other work.[47],[48],[49]

Many other labs, in part stimulated by our work, have conducted many other experiments documenting social contagion. The idea of using a network-based approach to exploit social contagion to disseminate an intervention – so as to change knowledge, attitudes, or practices at both individual and population levels – has been evaluated in a range of settings.[50],[51],[52],[53],[54],[55],[56],[57],[58]

Finally, other rigorous observational and experimental studies involving large samples and mapped networks have explored diverse outcomes in recent years, beyond the examples reviewed so far. For instance, phenomena mediated by online interactions include phone viruses,[59] diverse kinds of information,[60],[61],[62] voting,[63] and emotions.[64] In face-to-face networks, phenomena as diverse as gun violence in Chicago,[65] microfinance uptake in India,[66] bullying in American schools,[67] chemotherapy use by physicians,[68] agricultural technology in Malawi,[69] and risk perception[70] have been shown to spread by social contagion.

The above, in my view, is a fairer and more complete summary of the impact, relevance, and accuracy of our claims about obesity in particular and social contagion in general. Work in the field of social contagion in complex networks, using observational and experimental studies has exploded since we published our 2007 paper.

The list of references is at the end of Christakis’s comment.

Back in 2010 I wrote that this area is ripe for statistical development and also ripe for development in experimental design and data collection. As of 2024, the area is not just “ripe for development” in experimental design, data collection, and statistical analysis; there have also been many developments in all these areas, by Christakis, his collaborators, and other research groups. My earlier post was misleading: the fact that I was ignorant of that followup literature is no excuse for acting as if it didn’t exist.

One question here is how to think about the original Christakis and Fowler (2007) paper. On one hand, I remain persuaded by the critics that it made strong claims that were not supported by the data at hand. On the other hand, it was studying an evidently real general phenomenon and it motivated tons of interesting and important research.

Whatever its methodological issues, Christakis and Fowler (2007) is not like the ESP paper or the himmicanes paper or the ovulation-and-voting papers, say, whose only useful contributions to science were to make people aware of the replication crisis and motivate some interesting methodological work. One way to say this is that the social contagion of behavior is both real and interesting. I don’t think that’s the most satisfying way to put this—the people who study ESP, social priming, evolutionary psychology, etc., would doubtless say that their subject areas are both real and interesting too!—so consider this paragraph as a placeholder for a fuller investigation of this point (ideally done by someone who can offer a clearer perspective than I can here).

In summary:

1. I remain convinced by the critics that the original Christakis and Fowler paper did not have the evidence to back up its claims.

2. But . . . that doesn’t mean there’s nothing there! In their work, Christakis and Fowler (2007) were not just shooting in the dark. They were studying an interesting and important phenomenon, and the fact that their data were too sparse to answer the questions they were trying to answer, well, that’s what motivates future work.

3. This work does not seem to me to be like various notorious examples of p-hacked literature such as beauty-and-sex-ratio, ovulation-and-clothing, mind-body-healing, etc., and I think a key difference is that the scientific hypotheses involving contagion of behavior are more grounded in reality rather than being anything-goes theories that could be used to explain any pattern in the data.

4. I was wrong to refer to the claim of contagion of obesity as being debunked. That original paper had flaws, and I do think that when it is cited, the papers by its critics should be cited too. But that doesn’t mean the underlying claims are debunked. This one’s tricky—it relates to the distinction between evidence and truth—and that’s why followups such as Christakis’s comment (and the review article that it will be part of) are relevant.

I want to thank Christakis again for his thoughtful and informative response, and I apologize for the inappropriate word “debunked.” I’ve usually been so careful over the years to distinguish evidence and truth, but this time I was sloppy—perhaps in the interest of telling a better story. I’ll try to do better next time.

P.S. I think that one problem here is the common attitude that a single study should be definitive. Christakis and Fowler don’t have that attitude—they’ve done lots of work in this area, not just resting their conclusions on one study—and I don’t have that attitude either. I’m often saying this, that (a) one study is rarely convincing enough to believe on its own, and, conversely, (b) just because a particular study has fatal flaws in its data, that doesn’t mean that nothing is there. We usually criticize the single-study attitude when researchers or journalists take one provisional result and run with it. In this case, though, I fell into the single-study fallacy myself by inappropriately taking the well-documented flaws of that one paper as evidence that nothing was there.

That all said, I’m sure that different social scientists have different views on social contagion, and so I’m not trying to present Christakis’s review as the final word. Nor is he, I assume. It’s just appropriate for me to summarize his views on the matter based on all this followup research he discusses and not have the attitude that everything stopped in 2011.

Sports gambling addiction epidemic fueled by some combination of psychology, economics, and politics

We’ve written about this before, for example:

2012: There are four ways to get fired from Caesars: (1) theft, (2) sexual harassment, (3) running an experiment without a control group, and (4) keeping a gambling addict away from the casino

2022: Again on the problems with technology that makes it more convenient to gamble away your money

2023: There are five ways to get fired from Caesars: (1) theft, (2) sexual harassment, (3) running an experiment without a control group, (4) keeping a gambling addict away from the casino, (5) refusing to promote gambling to college students

Corbin Smith shares some new stories on the unfortunately topical subject of gambling addiction and how it relates to the financing of sports and to the sports media. In his article, Smith implicitly makes a strong case that to understand the problem you need to think about interactions between psychology, economics, and politics. The sports, news, and entertainment media are pushing gambling so hard. I guess that in the future we will look back on the present era and laugh/cringe in the same way that we laugh/cringe at the “Mad Men”-style drinking and smoking culture from the 1950s.

Evil scamming fake publishers

David Weakliem reports:

There is a journal called the EON International Journal of Arts, Humanities & Social Sciences. I recently discovered that I [Weakliem] am listed as the Editor. I am not the editor—I had never even heard of this journal before, and would have declined if they asked me to be involved, since it looks pretty sketchy. I have written to the publisher telling them to remove my name from their site but also wanted to announce it publicly just in case anyone has noticed.

Apparently there was a previous fake editor—when he found out and objected, they put me in.

Here it is:

This is evil, all right! More evil than anything Wolfram Research has ever done, I think.

P.S. Unfortunately this is not new. From Retraction Watch, see here, here, and here.

Remember that paper that reported contagion of obesity? How’s it being cited nowadays?

The original title of this post was, “Remember that many-times-debunked claim of ‘contagion of obesity’? How’s it being cited nowadays?”, but that was misleading—see the followup. What I want to say is not that the claim was “debunked” but that the paper where it first appeared had statistical problems. This post is about how that particular paper is being cited; but, as its coauthor Nicholas Christakis helpfully pointed out in a comment, the post did not engage at all with the later literature on the topic.

I happened to be talking with some students today about social network research—we’re doing some followup on our penumbra paper—and the topic came up of the controversial study by Nicholas Christakis and James Fowler from the 2000s on the contagion of obesity.

We covered the topic in this space back in 2010 and 2011:

Controversy over social contagion

Controversy over the Christakis-Fowler findings on the contagion of obesity

Christakis-Fowler update

There we discussed the work of Christakis and Fowler; criticisms of that work by economists Jason Fletcher and Ethan Cohen-Cole, mathematician Russ Lyons, political scientists Hans Noel and Brendan Nyhan, and statisticians Cosma Shalizi and Andrew Thomas; and a response by the original authors, who wrote:

We do not claim that this work is the final word, but we do believe that it provides some novel, informative, and stimulating evidence regarding social contagion in longitudinally followed networks. Along with other scholars, we are working to develop new methods for identifying causal effects using social network data, and we believe that this area is ripe for statistical development as current methods have known and often unavoidable limitations.

The quick summary is:

1. Christakis and Fowler were doing interesting, innovative social science; they just went too far in the interpretation of their data. You know that saying, “high-risk, high-reward”? That’s what was happening here. There was potential high reward, but this study was ultimately a failure, except to the extent that failures can be useful too in helping us avoid certain dead-end paths in the future.

2. The claim of social contagion of obesity wasn’t supported by the data from the Framingham Heart Study; the critics (Fletcher, Cohen-Cole, Lyons, Noel, Nyhan, Shalizi, and Thomas) were right.

3. There are social effects on attitudes and behavior, and they’re hard to study. As Christakis and Fowler wrote, this area is ripe for statistical development and also ripe for development in experimental design and data collection.

I was curious how this work is being cited, over 15 years later. Google Scholar lists 7000 citations. I searched for citations from this year, and here are the first few:

The first link above is to a book, and here’s the relevant passage:

The next is from a review article, which mentions the Christakis-Fowler paper as reference 31 here:

The next is a review on peer effects in “weight-related behaviours of young people,” which has this wrong summary:

Kinda makes me concerned about the rest of that literature review!

The next is a paper on “Community influence on microfinance loan defaults under crisis conditions,” which cites Christakis and Fowler not for their substantive claims but for a method they used:

It’s reference 40 in this next paper:

And it’s cited just a little bit too credulously as reference 15 here:

And then there’s this one:

They cited the work of Brian “Pizzagate” Wansink! That’s not good. If this blog were a drinking game, everyone would have to take a swig right now.

Summary

The incorrect claim about contagion of obesity is out there, and just about nobody seems to be qualifying it with references to the critics. Sorry, Fletcher, Cohen-Cole, Lyons, Noel, Nyhan, Shalizi, and Thomas. Your hard work has come to (almost) naught.

P.S. In the comment section, Lyons shares some horrible examples of the persistence of the debunked claim, including an uncritical reference to the original, flawed article in an exam for medical students. Your future doctor! This continuing episode says bad things about the scientific, media, and academic establishment, and it demonstrates the challenges of citation and reference: if you want to discuss a topic, you should cite the original paper, but then you should cite the criticism, but then you should cite later literature on the topic, but then that’s a lot to chew on if you’re just trying to write an exam question . . .

“Very interesting failed attempt at manipulation on Polymarket today”

Rajiv Sethi points to this thread and writes:

Very interesting failed attempt at manipulation on Polymarket today (would have been very profitable if successful).

I have to say, this sort of thing creeps me out. Recreational or business-hedging betting on elections doesn’t bother me, but this idea of manipulating sources of information . . . it seems wrong somehow. Not just wrong in the same way that it would be wrong, and possibly illegal, to manipulate sports-betting odds or stock prices or whatever, but more wrong in the sense of interfering with democracy.

I expect many of you will disagree with me and say it’s just a funny story—and it is a funny story—and I can’t offer strong arguments in favor of my reaction to this one, but here it is.

P.S. More here from Rajiv, with lots of details and the conclusion:

Derivative contracts of this kind continue to be listed on Polymarket. It would be a good thing if they were discontinued.

If you want to play women’s tennis at the top level, there’s a huge benefit to being ____. Not just ____, but exceptionally ___, outlier-outlier ___. (And what we can learn about social science from this stylized fact.)

If you want to play basketball at the top level, there’s a huge benefit to being tall. Not just tall, but exceptionally tall, outlier-outlier tall. If you’re an American and at least 7 feet tall and the right age, it’s said that there’s a 1-in-7 chance you’ll play in the NBA (but maybe that’s an overestimate; we’re still looking into that one).
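For what it’s worth, here is a back-of-the-envelope sketch, in Python, of why that 1-in-7 figure is so hard to pin down. Every constant in it is an assumption of mine, not a number from the post, and the normal approximation is known to understate how many people sit in the extreme tail of the height distribution, which is exactly the problem: the answer depends almost entirely on how you model that tail.

```python
from scipy.stats import norm

# All of these numbers are rough assumptions for illustration only.
MEAN_HEIGHT_IN = 69.3   # assumed mean height of adult American men, inches
SD_HEIGHT_IN = 2.9      # assumed standard deviation, inches
N_MEN_NBA_AGE = 40e6    # rough count of American men in an NBA-age window
SEVEN_FEET_IN = 84

# Tail probability of being at least 7 feet tall under a normal model.
p_seven_feet = norm.sf((SEVEN_FEET_IN - MEAN_HEIGHT_IN) / SD_HEIGHT_IN)
implied_seven_footers = N_MEN_NBA_AGE * p_seven_feet

print(f"P(height >= 7 ft), normal model: {p_seven_feet:.1e}")
print(f"implied number of 7-footers:     {implied_seven_footers:.0f}")

# The normal model implies only a handful of 7-footers in the whole
# country, fewer than the number typically listed on NBA rosters, so the
# real tail must be heavier than normal, and the "1 in 7" denominator
# depends entirely on how that tail is estimated.
```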

Here’s another one for you. If you want to play women’s tennis at the top level, there’s a huge benefit to being ____. Not just ____, but exceptionally ___, outlier-outlier ___.

Take a guess and continue reading.

The NYT sinks to a new low in political coverage

OK, this is really embarrassing: “Harris or Trump? The Prophet of Presidential Elections Is Ready to Call the Race.” The video includes a tacky horserace-themed graphic.

No, I do not think the method described at that link is useful, for reasons explained at this post from a few years ago. The short answers are:
(a) Some elections are not close at all and any prediction method will get them right,
(b) Some elections are so close that for a prediction method to pick the winner is just chance, like picking a coin flip,
(c) There is information in the vote margin that is being thrown away if you just try to predict the winner; additional information is being thrown away by using true/false questions (see the little simulation sketch below).
“The Prophet of Presidential Elections” . . . this is just magical thinking, my dude.
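To make those three points concrete, here’s a minimal simulation sketch, with numbers I made up purely for illustration (this is my toy example, not the method described in the article): a forecaster whose vote-margin predictions are quite noisy still calls almost all the landslides correctly, does barely better than a coin flip in the near-ties, and none of that noise shows up in a simple count of winners called.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # simulated elections

# True two-party vote margins (winning share minus 0.5): a mix of
# blowouts and near-ties.
margin = rng.normal(0, 0.05, size=n)

# A mediocre forecaster: predicts the margin with a lot of noise,
# then "calls" whichever side its noisy forecast favors.
forecast = margin + rng.normal(0, 0.04, size=n)
called_right = np.sign(forecast) == np.sign(margin)

not_close = np.abs(margin) > 0.05  # landslide territory (point a)
near_tie = np.abs(margin) < 0.01   # effectively coin flips (point b)

print(f"overall:         {called_right.mean():.1%} of winners called correctly")
print(f"landslides only: {called_right[not_close].mean():.1%}")
print(f"near-ties only:  {called_right[near_tie].mean():.1%}")

# Point (c): the "winners called" score never sees that this forecaster's
# margin predictions are off by roughly 0.03 (about 3 points of vote
# share) on average; scoring the margin itself would.
print(f"mean absolute error on the margin: {np.abs(forecast - margin).mean():.3f}")
```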

It’s bad social science—ok as an amusing feature story, I guess, in the same way that you might get a funny news item about astrologers or Elvis sightings or whatever—but to be featured in this way in the country’s leading newspaper . . . this is really embarrassing, sinking to the level of NPR’s science coverage or stories about aircraft taking off in ferocious tailwinds.

So, yeah, news media outlets can get fooled by promoters, but, jeez, the Times has some serious political reporters—couldn’t someone have run this by them first? And, sure, clickbait works (I linked to them above!), but . . . reputation counts for something, no? This really makes me sad for everyone else who works for this newspaper. They must feel kind of like how I felt after learning about Columbia faking its U.S. News numbers. (You might be interested in this story too.)

And this just took the Times’s reputation down one notch for me. Before, I’d have characterized them as being occasional suckers. Now I’d say they’re active promulgators of junk science.

The sad thing is, I’m pretty sure that most of the other media outlets out there are much worse. Octopuses from outer space on the History Channel, anyone?

P.S. On the plus side, we recently noticed a recipe in the NYT food section that looked pretty good. And their crossword puzzle remains excellent, much better than the crap that they have at the New Yorker.

In search of a theory associating honest citation with a higher/deeper level of understanding than (dishonest) plagiarism

In response to my post, Plagiarism means never having to say you’re clueless, Gur Huberman writes:

Is plagiarism evidence of lack of understanding?

Compare & contrast situations A & B.
In both, somebody develops an argument, supports an assertion, evaluates data, etc.
The following appears in both:
Since X holds, Y is implied. [X & Y can be quite elaborate; in fact they are whole scientific structures.]
In situation A we have, “As Z has shown, since X holds, Y is implied.”
In situation B we find only “Since X holds, Y is implied.”
Situation A is clean; situation B is plagiarism. How/why can you argue that B reflects lack of understanding whereas A doesn’t?

My reply: I think that plagiarism is evidence of lack of understanding, in practice. In theory, sure, you can understand something perfectly and still plagiarize. In real life, plagiarists always seem to have a lack of understanding.

Gur responded:

I would still like to see some theory associating honest citation with a higher/deeper level of understanding than (dishonest) plagiarism.
Perhaps, Clinton-style, it depends what one understands by “understand.”

I replied that I have no theory on this, only empirics, so I would put the topic up for discussion, kinda like Monopoly when nobody has the spare cash to buy Pennsylvania Avenue so it gets put up for auction.

Gur responded:

Good idea, especially if the post mentions a few cases of plagiarism in which the offender didn’t fully understand the material he was plagiarizing. Ideally the examples should be sufficiently different from each other so as to inspire theoretical work on the association between plagiarism & lack of understanding.

OK, just search this blog for discussions of plagiarism or copying without attribution. I suspect that in all cases the offenders did not fully understand the material they were plagiarizing or copying.