More Claims that AI is Sentient are (Probably) Coming
What we should learn from the LaMDA story, and how we can all be more careful going forward
On June 11, Google engineer Blake Lemoine published an “interview” he and a collaborator at Google conducted with the language model-powered chatbot LaMDA. Following his publication of the interview transcript, Lemoine was put on administrative leave, with the stated reason being breach of confidentiality.
According to Lemoine, he was asked in the fall of 2021 to assist with a particular AI ethics effort within Google. As he investigated the concerns in line with the Google effort, he discovered his own “tangentially related but separate AI ethics concern.” He described this concern in an article that does not mention LaMDA specifically and was released five days before he posted his interview transcript; presumably, LaMDA’s sentience was the concern Lemoine was raising. In another article titled “What is LaMDA and What Does it Want?” Lemoine notes that LaMDA is not merely a chatbot, but a system for generating chatbots, and that “to better understand what is really going on in the LaMDA system we would need to engage with many different cognitive science experts in a rigorous experimentation program.”
As we explained at the time, LaMDA is not sentient in any meaningful sense (see the next section). But, even as interest in LaMDA wanes, discussions of the possibility of AI becoming conscious are only going to happen more often. As an article from the Washington Post notes, Lemoine is among “a chorus of technologists who believe AI models may not be far off from achieving consciousness.” In this article, we will explore in more depth why LaMDA and models similar to it are not likely to attain anything that might be called consciousness anytime soon. We will also consider why humans–even those who develop and understand AI systems–are apt to personify and attribute consciousness and agency to technological systems.
First, before getting into the debate, let’s spend some time considering why the sentience question is important in the first place. I’ll also note that while we make a strong claim that LaMDA and its ilk are not sentient in this piece, humans do not have access to all knowledge–thus, claims of certainty, be they arguing that LaMDA is sentient or not, should be taken with caution. I’ll do my best to qualify claims in this piece to that effect, but will note here that I’m attempting to approach this question with at least some epistemic humility.
In 2016, Slate wrote that humans are bound to have a difficult time determining whether machines are, in fact, sentient. This is important because sentient beings–those who can experience valenced states like pain and pleasure–are moral patients. If we, in fact, did create a robot that experienced happiness and anger and the glut of emotions we humans do, then we should behave morally towards it. When we consider claims about sentience and consciousness, we enter shaky territory because we do not understand what it takes for consciousness to arise in the first place, and the evidence we can gather for consciousness is far from indisputable (see the zombie problem).
All this being said, I feel a consideration of how LaMDA and similar systems work warrants a dismissal of claims that they are sentient. Without a theory of consciousness or sentience, however, it is hard to make this claim without some sort of qualification. When I interviewed philosopher David Chalmers, responsible for articulating the Hard Problem of Consciousness (which asks why we have subjective experience), he maintained that he was open to the idea that neural networks are conscious “in the same way [he is] open to the idea that an ant with three neurons is conscious.”
It is difficult, as a human, to conceptualize what these “degrees” of consciousness might look like because we only have access to our own subjective experience. So I’ll revise my claim to make it slightly more precise, in line with the idea of “degrees” of sentience or consciousness (which I will consider in more detail later): I do not believe that LaMDA or other large language models have the degree of sentience that renders them moral patients–I do not believe they experience feelings of pleasure or pain.
Now, let’s explore the technical reasons why LaMDA and models like it can behave as they do without being sentient: (1) its conversational training data, (2) its size, (3) Lemoine’s prompting in the conversation.
Experts agree that AI systems are not “sentient.” LaMDA, GPT-3, and their ilk are incredibly powerful information processing systems. But it is a stretch to say that they are sentient in the relevant sense. Some, like Chalmers, are “open to the idea” that a system like GPT-3 might be conscious in the sense of having a subjective experience, but this is along the lines of his openness to the idea that “a worm with 302 neurons is conscious.” Lemoine’s claims about LaMDA’s consistent sense of self, experience of the world, and desire to do good in it go far beyond this openness. Furthermore, as we pointed out in a previous article and in the previous section, concepts like consciousness and sentience remain nebulous–it seems odd to claim that we have developed such capabilities in machines when we cannot yet say precisely what they are.
Language models are powerful and will only grow more capable with time–but fundamentally they are merely information processing systems that “exist” in the sense of taking input text, performing a series of computations on that text, and spitting out more text. This does not imply having a self, or the capacity to feel and act in the world as an autonomous agent. Thomas Dietterich notes that it is strange to think of LLMs as having sentience when their only sensory input is an incoming text stream. They might have the ability to “sense things,” but this does not equate to having feelings or internal experience in the way we do–while we rely on people’s self reports as evidence for consciousness or sentience, we must be far more careful with language models’ self reports of emotions or “inner lives.”
They have merely learned from reams of text–humans’ self reports–to produce text that says the same things you or I might say when describing emotions or desires. But being able to produce a coherent description of having such experience is totally different from having that experience. The former can be achieved via statistical models, lots of matrix multiplication; the latter, not so much.
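To make the distinction concrete, here is a minimal sketch of a purely statistical text generator. It is nothing like LaMDA in scale or architecture (LaMDA is a large neural network, not a word-count table), and the tiny corpus and function names are hypothetical, chosen only for illustration. The point is simply that counting which words follow which in human self-reports is enough to emit a sentence that sounds like a report of a feeling.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for "reams of text": human self-reports.
CORPUS = [
    "i feel happy when i talk to friends",
    "i feel happy when i help people",
    "i feel sad when i am alone",
]

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def generate(counts, start, max_words=8):
    """Greedily append the most frequent next word until none is known."""
    words = [start]
    while len(words) < max_words and counts[words[-1]]:
        next_word, _ = counts[words[-1]].most_common(1)[0]
        words.append(next_word)
    return " ".join(words)

model = train_bigrams(CORPUS)
print(generate(model, "i", max_words=3))  # prints "i feel happy"
```

The program "says" it feels happy only because that is the most common continuation in its training text. Nothing in the table of counts corresponds to an experience of happiness.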
What I find most interesting about this story is the human desire to impute motives and even sentience to algorithmic systems.
This is a really tricky point, and it’s worth spending some time on. First, let me get into some details about Lemoine’s claims in particular. In interview after interview, Lemoine has made a big deal of the fact that LaMDA is not a language model, but has a language model.
Lemoine has also stated that Google more or less “threw everything” at LaMDA in allowing it access to all the information Google could provide it so that it could make queries in order to surface information. Furthermore, he has noted the consistency of LaMDA’s responses to him (regarding feelings and other conversational information) over long timespans. This is interesting! But again, I do not think it constitutes “strong sentience” or phenomenal consciousness in the sense of actually having subjective experience/feelings. A “memory” is something that can reasonably be implemented in many computational systems today–a powerful language model that has access to troves of information as well as what it said in the past, perhaps, could reasonably produce self-consistent dialogue. In the presence of far more plausible explanations (many provided by AI experts) for LaMDA’s behavior, I think we should be very skeptical of claims about sentience.
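As a rough illustration of how little machinery self-consistency requires, consider the sketch below. It is not a description of how LaMDA works; the class, the stand-in generator, and the exact-match lookup are all assumptions made for the example (a real system would more plausibly feed earlier dialogue back into the model as context). It wraps an arbitrary text generator with a store of its past answers, so that a repeated question gets the same reply even when the underlying generator would have said something different.

```python
import itertools

class ChatbotWithMemory:
    """Hypothetical wrapper: a text generator plus a store of past answers."""

    def __init__(self, generate_reply):
        self.generate_reply = generate_reply  # any function: question -> text
        self.memory = {}                      # normalized question -> earlier answer

    def ask(self, question):
        # Normalize case and whitespace so near-identical questions match.
        key = " ".join(question.lower().split())
        if key not in self.memory:
            self.memory[key] = self.generate_reply(question)
        return self.memory[key]

# Stand-in generator that gives a different answer on every call.
_answers = itertools.cycle(["I enjoy helping people.", "I enjoy reading."])

def fickle_generator(question):
    return next(_answers)

bot = ChatbotWithMemory(fickle_generator)
first = bot.ask("What do you enjoy?")
later = bot.ask("what do you  enjoy?")
print(first == later)  # prints True
```

The consistency here comes entirely from a dictionary lookup, which is the sense in which a stable "personality" over time is weak evidence of an inner life.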
But let me get back to what I find truly interesting here: that Lemoine was willing to make such a claim about LaMDA, and that people are inclined to think these things about algorithms at all (indeed, let’s admit that without some amount of training and knowledge, many of us might be more convinced by claims like Lemoine’s). When I examine an AI system like LaMDA or GPT-3, I only have contact with its outputs–what it “says” to me. In another essay, I wrote that we rely on verbal reports for evidence of others’ having consciousness–we have few other resources to come to this conclusion, for we cannot enter other people’s heads and have their subjective experience for ourselves. This is, of course, coupled with other information about the person (observing how they act in the world) that provides some level of consistency with their verbal reports and gives additional evidence of their autonomy and subjective capacity to experience the world.
We have no interaction with language models or AI systems besides their outputs. Today’s most powerful models are something of a “black box” capable of producing reams of coherent text. What are we to make of such prolificacy? Perhaps we know that what is going on inside is “just a bunch of linear algebra,” but it feels odd to see words that could so easily have come from a human without wanting a deeper explanation. In some theories of the mind, our cognition has to chase the “why” question to its end–until we find an “ungrounded ground,” a final reason, we remain unsatisfied. In this search, the “bunch of linear algebra” explanation for a large language model’s coherence can feel unsatisfying.
Humans Seeing Sentience in AI
Humans are sometimes apt to personify agents and technological systems–even those who develop and understand those systems are not immune to this error. Ian Bogost observes that Lemoine, an engineer on Google’s Responsible AI team, should know better than most how LaMDA works. Despite that, he was apparently willing to risk his job on the claim that there was a ghost in the machine. Indeed, while engineers and researchers who develop AI models like GPT-3 or LaMDA understand the models and training procedures, there are important aspects of those models’ behavior that we still do not understand. It is this basic, mechanistic understanding that researchers like Chris Olah are actively trying to develop.
But we humans are likely to continue to make errors about these systems’ capabilities and supposed mental states, as Molly Roberts writes for the Washington Post. Roberts says that “we see ourselves in everything, even when we’re not there,” that we seek connection and may actively want to find sentience.
I think there is a deeper layer to what Roberts is saying here. In his Critique of Pure Reason, Immanuel Kant wrote that reason seeks an unconditioned ground–that is, the human faculty of reason wants a complete explanatory account of events in the world. But this demand for the unconditioned, according to Kant, leads us to an error that he terms “transcendental illusion.” This illusion is the propensity to “take a subjective necessity of a connection of our concepts…for an objective necessity in the determination of things in themselves” (A297/B354). In other words, our reason takes its subjective principles and interests to hold objectively. Unfortunately for reason, and for us, this demand for the unconditioned is doomed to be unmet–we are precluded from a “full” knowledge of things in the world.
Why does this matter for our consideration of language models? If we’re willing to take a Kantian perspective, then we might see ourselves as demanding reasons for why language models are capable of displaying behavior that we normally only see in humans. Faced with an impenetrable “black box,” we supply our own explanations for that “true” reason we cannot find. We give explanations for this behavior that are simple, familiar. The complex intercourse of matrices boils down to “there’s a ghost in the machine.” LaMDA can hold a conversation with me just as well as my friends can–the only explanation I’m familiar with for such behavior is that the thing I’m talking to has some real experience with what it’s talking about.
We don’t have to go as far as Kant to say that we demand full explanations for everything in the world, but it seems sensible to say that there is something unsatisfying about how well most people–and even researchers–understand modern AI systems. It is, then, reasonable to expect that people will look for satisfying explanations for these systems’ behavior. It is said that any sufficiently advanced technology is indistinguishable from magic, but most of us do not suffer magic as the sole explanation for events in our world.
I suspect that going into the future, more research will be dedicated to building and improving new systems than scientifically investigating existing ones. Newness is exciting, and it’s incredibly satisfying to see progress. But this may leave behind an understanding of how and why modern AI systems do what they do. I think this gap between understanding and capability, while inevitable to some extent, is worthy of concern if it becomes too egregious (the recent sentience claims may even be an indication that this gap is already approaching worrying levels).
As a result, we should expect more and more claims about sentience to arise as AI continues to progress. More advanced systems might make Lemoine’s sentience claims feel less ridiculous (or at least hard to falsify), unless we understand those systems better than we currently understand LaMDA. In light of all this, I’ll conclude with three recommendations:
(1) ML researchers should pursue more basic science to understand the things we’re building.
(2) Anyone who communicates broadly about AI systems should be very careful about their language. We have previously published an article detailing best practices for AI coverage.
(3) Everyone should seek to become more aware of their cognitive biases and potential errors in their reasoning. We all make incorrect judgements and evaluations in everyday life–while many of these errors may be innocuous, some matter. I think the ones we discussed in this article really do.