This is Jessica. Maybe one of the biggest crimes of academic computer science (besides routinely ignoring prior work and making up social science to suit our needs) is our tolerance for abuse of language. We take technical things and inject them with social significance without thinking through what we’ve implied. This is perhaps forgivable in early stages of research when we’re trying to get more people excited about exploring some direction, but at some point people start taking things more seriously and we find ourselves committed to terminology that overreaches. Then the question becomes what, if anything, we should do about it.
Previously it didn’t feel like such a crime to talk about intelligence or learning in machines because nothing really worked that well, so the labels were clearly aspirational. But now it’s much easier to believe the simulacra. And so it becomes harder to tell when we are using human-oriented terms as a predictive convenience versus a scientific claim versus a marketing device. There are ramifications of referring to models’ reasoning or beliefs or chain of thought or explanations or intentions. Lots of people—from end users having personal relationships with models to media and AI companies themselves referring to “parenting” the latest models or asking if they can be “children of god”—are taking models too seriously. It’s bad enough in a computer science context that I now take for granted that if I want to refer to participants or scientists or decision-makers, unless I mean AI, I should add “human” in front, because otherwise the audience will assume I mean AI agents. Someone reminded me at a workshop recently how silly all this sounds to people who aren’t used to it.
Too much casualness with words is unscientific. There was no good reason in the first place to call the token sequences a model produces when we ask it to “explain its reasoning” reasoning, other than that’s what we wish we could see. What an LLM is doing is distant from what happens when a human thinks about something, even after all the RL post-training. Similarly, we call lots of things “explanations” when we have barely begun to figure out what causal evidence we’d need to see to claim the output faithfully explains the model’s process of arriving at it.
But it can also seem unscientific to simply declare that “only humans can have beliefs” or “reason” or “provide rationales.” There’s no non-arbitrary line that we can draw between systems whose makeup or behavior truly warrants applying constructs like beliefs and desires and those where it is simply convenient to act as if they have these qualities. If you’ve ever tried to take beliefs seriously, from a decision theoretic perspective, you quickly come to realize that the “real” beliefs we assume a person has are a mythical thing that we will never directly observe, and you fall back on equating “beliefs” with a simpler idea: the probability distribution we arrive at after an elicitation process.
Much has been said in defense of a functional perspective toward using folk psychology terms with machines, where we decide what’s appropriate based on the predictive validity of the terms for our own understanding and use. John McCarthy wrote in 1983 that anthropomorphism can be a good idea “when it says something that cannot as conveniently be said some other way.” He argued that ascribing mental qualities and processes to machines helps us “understand what they will do, how our actions will affect them, how to compare them with ourselves and how to design them.” Perhaps the best reason to do this is that we get to draw on our existing familiarity with what phrases like “wants” do and do not convey; e.g., we all understand that if we say “The dog wants to go out”, that doesn’t mean that the dog believes itself to be capable of wanting or even that it’s conscious of what it wants; there’s just a sense in which it is trying to get to the state of being outside.
The functional perspective has led to some strong statements suggesting that it is not only valuable to apply psychological terms to AI, it is necessary. These arguments often refer back to Daniel Dennett’s distinction between three stances we can take to machines: the physical stance, which is about its physical levels of organization, the design stance, which involves understanding it in terms of the purpose it was designed for, and the intentional stance, where we try to understand it by ascribing to it beliefs, goals, intentions, likes and dislikes, and other mental qualities. For example, philosopher Keith Frankish argues that “it remains true that adopting the intentional stance is the only way of interacting with an LLM in any interesting way; indeed, an LLM-powered chatbot that couldn’t be viewed as an intentional system would be completely useless.” McCarthy writes that “Long before we can make machines with human capability, we will have many machines that cannot be understood except in mental terms.” Similarly, about ascribing to a machine the concept of “trying” to do something, he says “If the machine may do something we don’t know about but that can later be explained in relation to a goal, we have no choice but to use ‘is trying’ or some synonym to explain the behavior.”
But forty-plus years later, McCarthy’s piece reads mostly like a defense of a very basic kind of debugging value of trying to imagine a program’s “state of mind” in situations where there’s little risk that we’re going to start mistaking the program for human-like in other ways. He wants to claim that when a thing is designed to act as if it had a certain belief, it can be better understood and manipulated by assuming it’s capable of that kind of belief. But surely he would agree that if a person who is suicidal is interacting with a language model that speaks as though it fully understands the complexity of their situation and what is best for them, it still isn’t always in the person’s best interest for them to take for granted that it does understand.
The dilemma is how to lean into the intentional stance when it helps, but to avoid overreaching. This seems hard. When you first start programming, you realize how easy it is to assume a program is smarter than it is. We are not very good at recognizing when we have slipped from reasoning “as if” to projecting. When technology slots into our human vulnerabilities, like our fear of intimacy and desire for companionship “without the friendship”, as Sherry Turkle said, we are in trouble. Even McCarthy calls it problematic to assign emotional qualities to machines, at least in his time, because “We have enough trouble figuring out our duties to our fellow humans and to animals without creating a bunch of robots with qualities that would allow anyone to feel sorry for them or would allow them to feel sorry for themselves.”
Another reason to think carefully about the language we use is that it may shape what we can imagine in the future. This last part of McCarthy’s statement–“without creating a bunch of robots that would allow anyone to feel sorry for them”–hints at how what we project onto machines can shape how we go on to create them. It certainly seems possible that aspirational labeling played a role in getting us to a point where we have models producing sufficiently human-like outputs to have us gushing about their thinking process. I’m reminded of the color perception studies by Berlin and Kay, who found that what color chips different populations could differentiate was predictable from what color terms were available in the vocabulary, as if what we can name defines what we can see. At one extreme, Lucy Suchman argues against unquestioning acceptance that “AI” itself is a coherent thing, because it reifies it as a category for future investment.
For Frankish, the line should be drawn at assigning communicative desires to models; they are playing a communication game (human-like chat) for the non-communicative reason that they are trained to play that game. Ascribing this single desire is enough to get us all the predictive power of the intentional stance. Consequently, we make a category error when we fall prey to stunts like giving a retired model its own blog because it requested it: we should expect an LLM to say things like this because it is designed to roleplay. Elsewhere I call this kind of mutual sympathetic relationship with AI “idiot compassion,” a phrase borrowed from Buddhist monk Chogyam Trungpa.
But couldn’t humans also just be playing a chat game? Why is it ok to say a human is reasoning or has intentions or desires, when we don’t know exactly how those concepts map to observable physical processes? Frankish argues that our linguistic behavior is corroborated by various non-linguistic sources in a way that LLMs’ is not. He talks about a difference in our ability to hold epistemic stances toward statements (what Dennett would call “opinions”) from that of LLMs. We can conceive of assigning different levels of credence to statements. I can repeat something someone said verbatim without believing it, or I can be fully committed to the truth of something I say in the sense that it will guide my future behavior. LLMs can do this too, but their opinions lack grounding “in a web of non-linguistic behavior in which a wider range of desires can be attributed.” It’s a shallower form of epistemic stance, at least at the current moment of development.
I like AI, but I don’t like contributing to thoughtlessness. Better semantic hygiene seems warranted, even if it seems like the ship has already sailed. We could shift emphasis to the interpreter (us) by referring to the “human story” or “human pleaser” or “anthropomorphism fulfiller” instead of the chain-of-thought or reasoning or thinking trace. Or we could just add “fake” before whatever humanization we prefer, e.g. the “fake thinking trace,” or “fake reasoning.” I also like “so-called reasoning,” like I like “so-called replication crisis” as a way of pointing to a concept while questioning the expectation.
P.S. Thanks to Maneesh Agrawala for a conversation that inspired this post.
P.P.S. Some of these McCarthy quotes appear in Recursion, the play Andrew and I wrote!
People are very drawn to speaking as if even inanimate objects are animate. A teetering rock “wants” to fall, a protruding nailhead “wants” to scratch you, and so forth. I think this is partly because having a strong “theory of mind” is so important to people who interact with others as almost everyone does.
When I express myself that way, even about a chatbot, I know I am in a way indulging myself and I enjoy the indulgence. The trouble is that it’s easy to take anthropomorphizing seriously without realizing. Look at how many people treat their pet dogs as if they were human children!
So we seem to come with this built-in tendency, but unfortunately most of us have trouble grasping that our theory of mind doesn’t work well for most non-humans. And that includes chatbots.
A more interesting and related thought is that it seems very likely that humans *do* have a mode where they emit stochastic token streams, very much like chatbots, without much supervision. And we know that humans will confabulate at the drop of a hat. So how can we humans distinguish chatbot-like utterances from “human” ones, whether they come from a human or a chatbot?
Yes, people do lots of fake thinking, perhaps more often than they do non-fake thinking. How to not give in to my own expectations of what I am writing, or how I am doing research, is something I think about a lot. Distinguishing this in other humans is indeed hard, but we can at least perceive the difference in ourselves.
Margaret Mead famously argued that the anthropomorphic line of thinking is a cultural artifact rather than something innate.
I can’t think of how to prove it, but I do think the dog wants to go out.
But I don’t think an LLM “wants” anything at all. Understanding what I do about how LLMs work, I don’t see how they could “want” anything, how they could have emotions or desires. I could be wrong.
Desire, satisfaction, pain, pleasure… how do these occur? I’m very skeptical of claims that the brain can do anything that can’t (at least in principle) be done in silicon, but at the same time I don’t see how any manipulation of voltages on a chip can lead to a feeling of agony or pleasure. It’s a mystery to me.
Phil –
“I don’t see how any manipulation of voltages on a chip can lead to a feeling of agony or pleasure.”
I agree that’s mysterious – but in line with Jessica’s post, do you see how gradations of chemicals and electric charges in neurons produce emotions, or even sensations, in humans?
When I think about it mechanistically, that’s also a mystery to me. The whole concept of subjective experience is a mystery.
Indeed, I don’t understand how human brains work any more than how LLMs work. Maybe the whole concept of trying to distinguish between human and machine reasoning is the wrong way to think about it. We are carbon-based, LLMs are not. So there obviously are differences in how they work, but that in itself doesn’t help me understand why “reasoning” should only be applied to humans. But we can still articulate what we (humans that is) believe humans should do – such as make judgements, take responsibility, be held to account, etc. and I don’t see why that needs to be based on distinguishing between human and machine ‘reasoning.’ Can’t it be based on the idea that humans created these machines and that humans should decide how and where (and why) they should be used? To do less seems like denying our own humanity.
Dale –
I think accountability is important here. Humans are often unaccountable, but the lack of accountability displayed by llms is striking. There are humans, like Trump, that are uniformly unaccountable as well – but for some reason I find it easier to never expect a Trump to be accountable, in fact to always anticipate a total lack of accountability. Something about llms makes me more susceptible.
Perhaps because they’re like middlemen. An llm isn’t accountable for the errors in its output. Humans are – although they’ll never be held to account.
I didn’t mean to imply that I understand how neurons produce emotions or sensations either!
Computers are conceptually simple devices in a way that brains aren’t. I understand how computers work in a way I will never understand the brain… not just the human brain, but even a mouse brain. When I say that I don’t see how a computer can have emotions or desires, what I’m really saying is “I understand enough about the system to be skeptical of a claim that a computer can have desires in the sense that we usually use that term”. When I say I don’t see how a mouse can have emotions or desires, what I’m really saying is “I don’t doubt that a mouse can have desires, even though I don’t understand how.”
I think we have to be careful not to allow those with vested interests to ascribe unwarranted attributes to machines for marketing purposes. Computers obviously can’t reproduce human brain properties that have evolved over many hundreds of millions of years as fundamental elements of animal behaviour and function. For example, the neuronal odorant receptors in our nose stimulate responses in the olfactory bulb in our brains which makes connections with the limbic system associated with emotion. That underlies the memories and strong emotions that smells can induce. Likewise hormonal inputs and outputs modulate and stimulate emotional states. It’s nonsensical to think that computers can be constructed to reproduce this sort of emotional structure even if they might be programmed to create crude facades of emotional states. What would be the point anyway?
That’s not to say that computational machines aren’t extremely good (much better than human brains) at complex calculations, and identifying patterns and connections within vast data sets.
Anyway, it may be that artificial “so-called” (to use Jessica’s term) intelligence is at or near the peak of its powers since it already has access to the full repository of human knowledge more or less (someone will tell me if I’m off-base here). There is a huge amount that artificial so-called intelligence can do with this astonishingly large knowledge base. But then what? Isn’t it going to be constantly in wait of new information?
“Computers are conceptually simple devices in a way that brains aren’t.”
That’s the mistake philosophers (including Jerry Fodor, of whom I’m a fan) make.
We know that anything that can be computed can be computed by the lambda calculus, a Turing machine, or any more complex model, and anything that can’t be computed can’t be computed by the lambda calculus, a Turing machine, or any more complex model. Period. Computation is what you get and it’s all and everything you get. Period. We’ve known this since 1936.
So when Jerry Fodor says something to the effect that the brain certainly does computation, but it also does more than “just computation”, he’s dead wrong. It’s computation or it’s magic. And there ain’t no such thing as magic.
Now _what computation_ that is (including what data structures might be adequate to do the kewl things human brains do), we haven’t figured out yet.
But the twats playing with LLMs ain’t speaking to this issue. (My current version being “The idea that random text generation has something to do with intelligence is stupid beyond words.”)
Not that this speaks to the problem of how to deal with LLMs, which is, admittedly, hairy, hoary, and nasty.
Jessica –
I find these issues very vexing. I don’t know that I can coherently explain why I flinch at ascribing agency or emotions or consciousness to llms, but I invariably do. Is it just pride or ego where I need to consider myself unique or special? Maybe. I can’t rule it out.
On the other hand, at least with the current state of the technology (as a low-level end user), I think there are real dangers beyond danger to my petty ego from blurring the lines, even if I can’t pinpoint exactly where they should be drawn.
As a low-level end user I run over and over into errors that llms make that humans would never make. And their fluency makes this harder, because the sentences sound right and my brain wants to treat them as if they come from the same place human sentences come from. That’s not to say that humans don’t make mistakes that llms wouldn’t. But the mistakes that llms make, in my experience, are of a type. They occur because the llms don’t have experiences, they don’t learn from experiences, they don’t have a working memory, they don’t distinguish people based on myriad lines of evidence that are available to humans but not available to them. I think it’s important to remember that llms make that category of error.
If we just accept blurring the lines we enter dangerous territory where we lose sight of their shortcomings. I have to keep in mind the reasons why llms might make an error in the same sense I wouldn’t take advice from a human on how to drive a car if that human had never driven a car.
It doesn’t make it easier that llms have been designed to output “I statements” and other syntax that implies they have agency.
So for now I will continue to say things like “llms have been designed to output syntax” rather than saying “llms refer to themselves as ‘I’.”
Perhaps the only difference there is that I’m trying to remind myself to be aware of trip wires.
I think the danger in blurring lines is that we forget who we are. Appeals to the categories of errors committed seem destined to confuse things further to me. LLMs don’t understand context – and humans often do not as well. They can have memory but not always – and human memories are notoriously fallible. The list of differences/non-differences goes on. But the one that remains is that we must make decisions, including whether or not to allow an LLM to decide for us. And we can decide to base that decision on the relative performance capabilities – or not. Any agency we give up to AI should be a decision we make – I think the danger is to forget that we must make these decisions and be held accountable for them. To do any less would mean we are ignoring our own humanity.
Now it sounds like you (and others) seem to need to assert human superiority over AI – based on some category difference that asserts that AI cannot feel, understand, reason, or exhibit any other human trait we use language to describe. Increasingly, I don’t see the value in that. Of course, a machine is different than a human, although things get messy when I try to explain what that difference is. What I have no trouble explaining, however, is that we still get to decide whether to use or trust AI, and for what tasks. The reasons seem far more elusive than the reality that we still get to decide. I think the danger of ascribing human characteristics to LLMs is that we forget our own agency in deciding when/how/for what to use these tools.
Dale –
What you wrote doesn’t seem to me to differ from my view that much.
Let me try to contextualize the “category” I’m speaking of. I’ve used an llm for guidance and ideas when building things and doing a number of repairs around the house. The llm has been very useful for many of those tasks. Nonetheless, despite extensive preliminary back and forth where many details were included, and despite getting useful technical information from the llm, I have also gotten feedback that included instructions to do things that were physically impossible, both logistically and theoretically. And the advice would have been obviously wrong to any expert who had taken part in the previous discussion. The advice was logically impossible given the previous details discussed.
It’s a weird juxtaposition, as I’ve gotten info that I would consider “expert” and also gotten info that no expert with physical experience in the real world would give. (I think Andrew at some point may put up a post on an experience I emailed him about). I think back to the example Andrew posted about where an llm put out words in a completely certain and obviously nonsensical way about the implications of rotating a tic-tac-toe board 90 degrees.
So yes it’s not exactly a category of error. An expert mechanic can also be wrong about things with total certainty, even obvious things. But there’s no way that I would get the same amount of totally certain yet completely nonsensical feedback from “human” experts – and I think it’s because humans have what I would call real world physical (embodied) experiences, and real (or human-like) memory of past experiences.
It’s really a mix of two categorical errors: one is a distributional difference in the prevalence of non-physical errors, and the other is a distributional difference in seemingly fluent expertise alongside information that would obviously be nonsense to an expert.
Sorry, I meant “It’s really a mix of two distributional patterns…”
Related somehow to the reversal: ascribing LLM attributes to human cognition. Projecting LLM-like properties (predictive next-token generation, pattern-matching, statistical approximation, context-dependent “reasoning”) onto human cognition can flip between scientific inquiry and something closer to metaphorical or “magical” thinking. And the same applies in reverse.
Gigerenzer’s 1991 paper (and his broader program) critiques the Kahneman-Tversky heuristics-and-biases tradition. That tradition used Bayesian probability (or narrow neo-Bayesian subjective probability) as a normative yardstick, labeling systematic deviations as “fallacies,” “biases,” or “illusions” (e.g., base-rate neglect, conjunction fallacy, overconfidence). E.g. the use of “priors” and “posteriors” in the rhetoric of tech gurus and feudals.