This is Jessica. AI-text detectors are coming to play a bigger role in adjudicating what texts are worthy of our attention. There was the surprising case of an apparently AI-generated short story winning the Commonwealth Foundation Short Story Prize: Pangram, the leading detector, rates it as 100% AI generated. Pangram’s false positive rate is reported as roughly 1 in 10,000 in its own audits and near zero on medium-to-long passages in an external audit. Applying Pangram to the other 4 stories that won awards this year suggests two others were heavily AI-assisted. More recently, the NeurIPS Position Paper track announced that it was desk rejecting 18% of submitted papers that were detected by Pangram as fully AI-generated. The authors of another 13% are being contacted to investigate AI use.
We’re having to reconsider what authorship means. Can a person create literature or express their position on a subject without writing a single sentence themselves? When do we really care who strung the words together?
Some people think detection is a waste of our collective time because we will never reach an equilibrium. AI-generated text will keep shifting toward what passes the detector. Human writers will continually update their beliefs about what features are indicative of AI writing. There’s no stable target, just an endless cat-and-mouse game that incentivizes being savvy enough at any given time to avoid getting flagged. Meanwhile people are being morally scorned and suffering reputational damage for being caught on the wrong side of things. This may disproportionately affect some writers (like non-native English speakers) who are finally seeing the playing field leveled a bit.
On the other hand, there are situations where it really is important to know who strung the words together. Education is the most obvious one. It’s just very hard to teach someone to think if they’re not writing down their ideas themselves.
The problem is that outside of select scenarios like teaching, what we really tend to care about is who controlled the ideas, and this is not equivalent to who strung the words together. Some would argue that the latter is becoming increasingly irrelevant given that AI can write more fluently than many people and many people prefer AI-generated text.
Of course the reason we’re seeing detection used to filter paper submissions is that the ideal process, where the content of each paper is carefully considered on its own merits, is increasingly untenable given the huge surge in submissions in some fields. It’s easy to pump out credible-seeming papers with minimal human oversight using AI, and enough people are doing this to create serious problems.
Mostly my response is that if we are going to debate the value of detection we should be willing to make our assumptions explicit. So let’s walk through a toy model to think about what we’re really conjecturing about.
One way to think of the latent state that we actually care about in paper review is the author type. Let’s say type A authors come up with their ideas and do a lot of the writing themselves. Type B authors rely on AI to do much of the thinking for them, and also use AI to do much of the writing. Type C authors come up with their own ideas, but engage in extensive prompting to get AI to write everything they want to say for them.*
For each paper, we choose to either pass or reject, conditional on the output of a Pangram check. Let’s say we only care about whether it flags 100% AI generated or not, so the signal s is binary, where s=1 means AI detected.
Based on available Pangram audits, if a text is actually written heavily by AI there is a very high chance it flags as AI-generated: beta=P(s=1|AI written) with beta very close to 1. If a text is not written by AI, there is a very small chance it flags as AI-generated: alpha=P(s=1|human written). Pangram’s internal audits put alpha around 10^−4 but other audits find essentially zero false positives for medium-to-long passages.
So P(s=1|A) = alpha, and if we assume Types B and C use AI to a similar extent for the writing, then beta = P(s=1|B) = P(s=1|C). The posterior probability that a flagged paper is from a Type B author is then:
P(B|s=1) = (beta × p_B)/(alpha × p_A + beta × p_B + beta × p_C), and since alpha is tiny and beta is close to 1, P(B|s=1) ≈ p_B/(p_B + p_C)
The relevant considerations become what we think the author population looks like, and how costly we think a false positive versus a false negative are.
As a starting point, let’s say that for NeurIPS position papers in 2026, Type C was the rarest, at 20%, and Type A and Type B equally split the remaining mass at 40% each. Let’s also say that we consider the cost of rejecting an acceptable paper, c_FP, to be twice the cost of passing an unacceptable one, c_FN.
The optimal decision rule is to reject if c_FN * P(B|s=1) > c_FP * P(A or C|s=1). Since P(A or C|s=1) = 1 − P(B|s=1), this is equivalent to P(B|s=1) > c_FP/(c_FN+c_FP).
With c_FP=2 and c_FN=1, this means we reject if P(B|s=1) > 2/3.
Under the prevalence assumptions above, P(B|s=1) is approximately 2/3, so we are right on the boundary. From the standpoint of making the right decisions for this particular conference cycle, desk rejecting on the flag is not obviously bad. But if Type C is a little more common, e.g., we shift a little mass from p_A to p_C to make p_C 0.25, then P(B|s=1) is about 0.62, which falls below the 2/3 threshold, so we shouldn’t desk reject based only on the flag. Similarly, if we were to decide that falsely rejecting an acceptable paper is three times as bad as passing an unacceptable one, the threshold rises to 3/4 and we shouldn’t rely on the flag alone.
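For anyone who wants to plug in their own numbers, here is a minimal sketch of the toy model in Python. The function names are made up for illustration, alpha = 1e-4 follows Pangram’s internal audit figure, and beta = 0.99 is an assumed stand-in for “very close to 1” rather than a measured value. The last scenario (p_C = 0.10) is hypothetical and only there to show a case where the flag alone would justify rejection.

```python
def posterior_b(p_a, p_b, p_c, alpha=1e-4, beta=0.99):
    """P(B | s=1) by Bayes' rule.

    alpha = P(s=1 | A): false positive rate on human-written text.
    beta  = P(s=1 | B) = P(s=1 | C): assumed detection rate on AI-written text.
    """
    flagged = alpha * p_a + beta * p_b + beta * p_c
    return beta * p_b / flagged


def reject_threshold(c_fp, c_fn):
    """Posterior above which rejecting has lower expected cost than passing."""
    return c_fp / (c_fp + c_fn)


def should_reject(p_a, p_b, p_c, c_fp, c_fn, alpha=1e-4, beta=0.99):
    """True if a flagged paper should be desk rejected on the flag alone."""
    return posterior_b(p_a, p_b, p_c, alpha, beta) > reject_threshold(c_fp, c_fn)


# (p_A, p_B, p_C, c_FP, c_FN) for each scenario
scenarios = {
    "baseline from the post": (0.40, 0.40, 0.20, 2, 1),
    "more Type C authors": (0.35, 0.40, 0.25, 2, 1),
    "false rejection 3x as costly": (0.40, 0.40, 0.20, 3, 1),
    "hypothetical: few Type C": (0.50, 0.40, 0.10, 2, 1),
}

for name, (p_a, p_b, p_c, c_fp, c_fn) in scenarios.items():
    post = posterior_b(p_a, p_b, p_c)
    thresh = reject_threshold(c_fp, c_fn)
    decision = "reject" if post > thresh else "do not reject on flag alone"
    print(f"{name}: P(B|s=1)={post:.3f}, threshold={thresh:.3f} -> {decision}")
```

With a nonzero alpha the baseline posterior lands a hair under 2/3, so the strict inequality comes out as “do not reject”. That is just what being on the boundary looks like numerically, and it is a reminder of how sensitive the decision is to small changes in the inputs.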
This model is obviously very simple. But it shows us what kinds of things we have to make assumptions about in the most basic case. Obviously I don’t really know how many people are using AI blindly to write papers, nor how many people are relying heavily on AI to write up their own ideas. You should take my numbers with a grain of salt. Personally I can’t imagine how relying on AI to do all the writing when I came up with the ideas would ever feel efficient, because I tend to have strong opinions on how things are said. But I can accept I am probably more of a control freak than many others. And AI overreliance is easy to slip into. Maybe papers chairs from recent ML conferences (or arXiv moderators) have estimates on bad-actor rates based on what they are seeing.
What this exercise can’t tell us is how scientific progress is impacted by the warping of incentives that can happen when we use AI-detection as a filter. Classic principal-agent problems suggest that when we care about something hard to observe (like scientific quality or long-term epistemic value) but must rely on observable proxy signals to judge authors’ outputs, we should expect authors to shift more effort toward improving exclusively on those proxies. Avoiding em dashes and ‘not this, but this’ constructions and whatever else currently ups the posterior probability of AI-generation is orthogonal to the actual thinking that research requires. What if relying more heavily on AI to write up our ideas is a good idea for science in the long run, in terms of more clearly communicating the ideas or saving a lot of time, so that we can get more good ideas out in the same amount of time? Then too much emphasis on detection might slow us down. However, I’m doubtful we are currently anywhere near a state of the world where discouraging writing with AI is as costly for scientific progress as spending time reviewing and reading many more questionable AI-generated papers is. The bigger threat at the moment is the slop overwhelming our ability to find the good stuff.
*We could also posit Type D authors that get AI to generate the ideas, but then write the papers themselves to evade detection, or are extremely good at getting AI-written text to evade detection. But this seems much less likely so I’m ignoring it.
“Meanwhile people are being morally scorned and suffering reputational damage for being caught on the wrong side of things. This may disproportionately affect some writers (like non-native English speakers) who are finally seeing the playing field leveled a bit.”
Excuse me? That’s a mealy-mouthed way of endorsing cheating and lying.
That paragraph is summarizing anti-detection views I’m seeing (note the topic sentence), not necessarily what I think we should be most worried about. Some people feel strongly that AI-detection is just the latest form of academic moral panic.
Maybe a bit tangential, or perhaps not…
I’ve worked with students from other countries who don’t really quite get the American focus on plagiarism. What does it matter, exactly, who authored some thoughts in a paper? Isn’t the truly important thing whether or not the student has identified the correct answers or critical concepts?
Of course the issues you’re addressing are more complicated than that, but there is something important there that overlaps with your post, I think: Who owns ideas anyway? If some traditions treat writing as a craft separate from idea generation, or see (ownership of) idea generation as less important, then the distinction between Type A and Type C authors may itself be somewhat culturally specific.
Is the American (or academic) concept of owning ideas substantially a cultural artifact, perhaps an artifact of Western, or capitalist values?