Eliza Effect - Hughes 1


Feb 23, 2014



The Eliza Effect: Conversational Agents and Cognition
Laura Hughes
English 301
Unit 3


1.1 Introduction
One major goal in the fields of artificial intelligence and computational linguistics
is the creation and improvement of conversational agents, computer programs designed to
imitate natural language. Conversational agents have a variety of potential applications,
including the development of online salesbots for commercial websites and incorporation
into user-friendly personal digital assistants. Natural language use is a major step toward
the creation of robots that can communicate smoothly with human users.
Language use alone is by no means an indicator of deep cognitive processing;
many chat programs rely heavily on random algorithms and basic keyword searches. Yet,
when evaluating language-producing computer programs, humans must actively struggle
not to interpret an algorithm’s output as the result of human-like intelligence and
emotion. Two major cognitive processes contribute to this tendency: we ascribe
intelligence and personality to non-human entities; and we naturally look for cohesion
and meaning in utterances, a predisposition which I argue results from the linguistic
principle of conversational implicature. The impulse to “believe in” conversational
computer programs may partially be a result of unconscious attributions, but it also seems
clear that, to some extent, users willingly suspend their disbelief in order to enhance their

1.2 Artificial Intelligence & Language
Language use holds an important place in artificial intelligence. In his
groundbreaking 1950 article “Computing Machinery and Intelligence,” Alan Turing
posed the philosophical question “Can machines think?” He argued that it would indeed
be theoretically possible to program computers that we would deem intelligent by human
standards. Computers’ language use was an essential benchmark. A computer might be
considered “intelligent” if a human conversational partner is fooled into believing he is
communicating not with a computer program but with another human. Today, tests based on
the one proposed in Turing’s article are used to evaluate conversational agents. The
$250,000 Loebner prize for the first chatbot that passes the test has yet to be awarded.
When Turing predicted that the test would be passable by computers in the year
2000, he couldn’t have foreseen the difficulties we now face in designing programs for
natural language production. As Wallis (2005), creator of the chatbot Eugene, explains,

When Turing was thinking about AI, the general consensus was that machine
intelligence was the interesting problem, and that natural language processing with
a computer would be relatively simple. Although a whole series of classic AI
problems have fallen to smart algorithms and powerful computers, a computational
model of language processing is still in the too hard basket. The Turing test was
intended as a test that pitted the wits of a judge against the intelligence of the
machine. The judge’s aim is to test the machine. A more practical problem, that
looks similar, is everyday conversation in which conversation itself is the aim.

The past fifty years have brought new insights into the incredible intricacy of natural
language; as such, the field of linguistics is a long way from developing perfect
descriptive models of the deceptively complex mental rules we use in everyday
communication. Yet even if we were to discover these rules and represent them in a
computer program, the computer couldn’t necessarily be said to be intelligent. Language
is a tool for communicating thoughts; a machine capable of using language would still
need something to say. Duffy (2003) argues that deeper, human-like understanding is not
necessary for artificial agents to give the “illusion of life”; the issue is not “whether a
system is fundamentally intelligent, but rather if it displays those attributes that
facilitate or promote people’s interpretation of the system as being intelligent.”
Currently, conversational agents use a number of shortcuts to get around these
problems. One now-common strategy is demonstrated in the first interactive chat
program, Weizenbaum’s “Eliza” (1966), designed to parody a Rogerian psychotherapist.
The program works by simply transforming the user’s input into a question (“Why do
you feel that…”) or by providing a canned response to certain keywords in the input
(“Tell me more about your mother”). Wallace’s “Alice” (2004) program works on the
same principles, but with modern programming and linguistic techniques, it is a more
refined implementation, and is not limited to the role of psychologist.
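
The two strategies described above can be illustrated with a minimal sketch. This is not Weizenbaum's actual script: the keyword table, the pronoun swaps, the fallback reply, and the function names `reflect` and `respond` are all assumptions made for illustration.

```python
import re

# Hypothetical keyword table: each keyword maps to a canned reply.
CANNED_REPLIES = {
    "mother": "Tell me more about your mother.",
    "father": "Tell me more about your father.",
}

# First-person words swapped for second-person ones when echoing input.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}


def reflect(fragment: str) -> str:
    """Swap first-person words so the fragment can be echoed back."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input: str) -> str:
    """Return a canned reply for a known keyword, else a transformed question."""
    text = user_input.strip().rstrip(".!?")
    words = re.findall(r"[a-z']+", text.lower())
    # Strategy 1: a keyword in the input triggers a canned response.
    for keyword, reply in CANNED_REPLIES.items():
        if keyword in words:
            return reply
    # Strategy 2: turn the user's own statement into a question.
    match = re.match(r"i feel (.+)", text, flags=re.IGNORECASE)
    if match:
        return f"Why do you feel {reflect(match.group(1))}?"
    # Assumed fallback when neither strategy applies.
    return "Please go on."
```

Nothing in the sketch represents what the user means; every reply is produced by matching and rearranging surface strings, which is the point of the shortcut.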
Another contemporary chatbot system, Jabberwacky by Carpenter & Freeman
(2005), uses a purely statistical method, circumventing the need to program syntactic and
other linguistic rules. Jabberwacky logs all user inputs along with information about the
preceding utterance and produces those inputs should a similar context arise. Thus, all of
Jabberwacky’s utterances are former user inputs (leading to many arguments about who
is the computer and who is the human). Jabberwacky works very well for short, common
phrases, particularly socially dictated response pairs, such as “Hello” and “How are you”
or “Thank you” and “You’re welcome.” Even more obscure phrases are often met with a
surprisingly relevant response.
However, responses are often non-sequiturs, although its
creator insists that “it is never random, merely imperfect, not yet having sufficient data to
cope with complex sentences.”
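
The logging approach described above can be sketched roughly as follows. Jabberwacky's real matching method is not described here, so the word-overlap (Jaccard) similarity measure, the class name `ContextMemoryBot`, and its methods are assumptions chosen only to make the idea concrete.

```python
import re
from typing import Optional


def tokenize(utterance: str) -> set:
    """Lowercase an utterance and return its set of words."""
    return set(re.findall(r"[a-z']+", utterance.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two utterances."""
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


class ContextMemoryBot:
    """Replies only with utterances that earlier users typed."""

    def __init__(self) -> None:
        # Each entry pairs a preceding utterance with the user reply it drew.
        self.log = []

    def observe(self, preceding: str, user_reply: str) -> None:
        """Record what a user said in response to a preceding utterance."""
        self.log.append((preceding, user_reply))

    def respond(self, current: str) -> Optional[str]:
        """Return the logged reply whose context best matches the input.

        Returns None when no logged context shares a word with the input.
        """
        best_reply, best_score = None, 0.0
        for context, logged_reply in self.log:
            score = similarity(context, current)
            if score > best_score:
                best_reply, best_score = logged_reply, score
        return best_reply


bot = ContextMemoryBot()
bot.observe("Hello", "How are you")
bot.observe("Thank you", "You're welcome")
print(bot.respond("Thank you very much"))
```

Because the sketch contains no grammar at all, it handles socially dictated response pairs well once they have been logged, and has nothing sensible to offer for a context it has never seen.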
A more general conversational strategy is employed by Wallis’s Eugene program
(2005), which is not designed to pass as an actual human, but as a fictional character in a
game world. The idea is that, to achieve believability, fictional characters are defined by
their goal and intentions; Eugene therefore runs on an “intention map” framework which
involves the selection and pursuit of specific conversational goals (such as getting
None of these bots could be said to have passed the Turing test. Their limitations
are rapidly made evident to the judges. If their nature is not betrayed by linguistic
shortcomings (such as the production of syntactically ungrammatical sentences), they
share a general inability to produce sufficiently meaningful or relevant responses.
Yet, while the Loebner prize has never been awarded, chatbots have on many
occasions “unofficially” passed the Turing test: users have been convinced, for varying
numbers of conversational turns, that they are communicating with another human.

From a recent experimental chat session at jabberwacky.com (4/2/06, 12:35 PM):
EXPERIMENTER: “Indeed, my enormous intellect frightens even me.”
JABBERWACKY: “How big is your brain?”
EXPERIMENTER: “I should guesstimate that my brain is probably at least one cubic mile in volume.”
JABBERWACKY: “Probably. ;-)”
From another chat session at jabberwacky.com (4/2/06, 12:40 PM):
EXPERIMENTER: “The evidence is too great against you. Nothing shall convince me that you were not
the one to reveal... The DEADLY SECRET.”
JABBERWACKY: “Nothing acts faster than anadin so take nothing.”

1.3 The Eliza Effect
When Joseph Weizenbaum created Eliza in the 1960s, he was not making a
serious attempt at computerized psychiatry or even at passing the Turing test. He merely
wanted to “vividly demonstrate the information-processing power of a computer to
visitors who did not already have some specialized knowledge… [Eliza] could be
appreciated on some level by anyone.” (1976, p. 5)
Eliza’s domain of speech (client-directed psychotherapy) was chosen purposely to
temporarily hide ‘her’ inherent limitations, “because the psychiatric interview is one of
the few examples of categorized dyadic natural language communication in which one of
the participating pair is free to assume the pose of knowing almost nothing of the real
world.” Weizenbaum was intensely aware that Eliza’s believability depended on users’
attributions of intelligence to the program.

If, for example, one were to tell a psychiatrist "I went for a long boat ride" and he
responded "Tell me about boats", one would not assume that he knew nothing about
boats, but that he had some purpose in so directing the subsequent conversation. It
is important to note that this assumption is one made by the speaker. Whether it is
realistic or not is an altogether separate question. In any case, it has a crucial
psychological utility in that it serves the speaker to maintain his sense of being
heard and understood. The speaker further defends his impression (which even in
real life may be illusory) by attributing to his conversational partner all sorts of
background knowledge, insights and reasoning ability. But again, these are the
speaker's contribution to the conversation. (1966)

Despite Weizenbaum’s cautions, many people who conversed with Eliza, even
those fully aware of its artificial nature, treated the program as if it had real
understanding. In 1976 Weizenbaum wrote, “I was startled to see how quickly and how
very deeply people conversing with [Eliza] became emotionally involved with the
computer and how unequivocally they anthropomorphized it.” (p. 6) Weizenbaum was
alarmed by “the enormously exaggerated attributions an even well-educated audience is
capable of making, even strives to make, to a technology it does not understand.” (p. 7)
To his mind, to overestimate the computer’s understanding was a grave mistake.
The popularization of Eliza and similar programs in recent years has led to the
term “Eliza effect” to describe the tendency of people to attribute human-like intelligence
to chatbots. A number of factors contribute to its power.

1.4 Anthropomorphism
The general tendency of humans to attribute human-like characteristics to non-human
entities is, of course, not limited to conversational agents. But Marakas et al.
(1999) argue that anthropomorphic metaphor is particularly salient when applied to
computers. “Computers use language, respond based on multiple prior input, fill roles
traditionally held by humans, and are capable of producing human-sounding voices.
These, often extreme, social cues have until recently only been associated with other
humans.” Considering conversational agents’ use of these social cues, the Eliza effect
could be considered simply as a specific subtype of anthropomorphism. Yet it seems that
conversational agents are particularly susceptible to strong, undeserved attributions of
emotion and intelligence.

1.5 Conversational Implicature
Conversational agents’ use of language particularly poises users to believe in the
meaningfulness of their actions, in this case, linguistic utterances. The unconscious
assumptions we make about the intentionality and meaningfulness of our human
conversational partners’ utterances are carried over when we chat with computer
programs, where those expectations are largely unfounded.
In our daily, constant acts of communication, we are naturally accustomed to
assume that our conversational partner is trying to be as cooperative and informative as
possible. In his 1975 article “Logic and Conversation,” Grice outlined four maxims for
smooth conversation: quantity (provide the right amount of information); quality (be
truthful); relation (be relevant); and manner (avoid confusion). If a speaker violates the
rules, we as listeners assume there is a reason for doing so. This forms the basis of the
theory of conversational implicature, in which, through maxim violations, speakers
routinely imply more than is said on the surface. For example, the following dialogue
technically violates the maxim of relation, since the second utterance, on the surface, is
irrelevant to the first:

A: I need a doctor’s note.
B: I’m a librarian.

Yet it is easy for us, as listeners, to draw the inferences that make B’s response seem
relevant to A’s utterance. Depending on the context and tone of voice, B might be saying
“If you want me to write the note, forget it; I’m not a medical doctor” or “Perhaps they
will accept my signature, with my doctorate in library science.” On no account do we
assume that B is simply stating her profession irrelevantly.
According to Grice’s Cooperative Principle, we assume whenever possible that
others are trying to offer an informative response. Levinson (2003) argues that
conversational implicature is a basic and essential part of everyday communication,
allowing us to get around the “phonological bottleneck”: routinely implying more than is
said allows us to condense a great deal of meaning into a single utterance. (p. 29)
Maxim violations are frequent in conversation with chatbots. Because they are
also frequent in normal human conversation, though, this does not diminish the
expectation of cooperativeness. We presume in human-human conversation that maxims
are violated to achieve a particular communicative purpose, and this unconscious
assumption is retained when conversing with robots.
The conversational maxims, as generalized by Grice, are relatively broad and
work on a level of discourse currently inaccessible to conversational agents (application
of the maxims depends on speakers’ understanding the meaning of the utterances, which,
as noted earlier, is beyond current technology). While chatbots are currently incapable of
intentionally creating an implicature, a bot might make an irrelevant response for many
reasons, from misanalysis of a keyword to pure random chance.

Human: The chrysanthemums are in bloom again.
Computer: Tell me more about your mother.

Human: Sheila was bitten by a bat.
Computer: Would you like me to call you Sheila?

Even knowing consciously that the computer program could not have intended the
implicature, it is quite a simple matter for us to invent a reason that the chatbot’s
utterance makes sense after all. Perhaps the computer believes that your mother loved
chrysanthemums, or that you are the one who was bitten by a bat. We apply the
cooperative principle and invent conversational implicatures in conversation with
computerized, as well as human, partners.
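
To make the "misanalysis of a keyword" concrete, the sketch below shows one hypothetical way the two irrelevant responses in the dialogues above could arise. The rules are invented for illustration and are not taken from any actual chatbot: a substring test that finds "mum" inside "chrysanthemums", and a pattern that treats any capitalized sentence-initial word followed by "was", "is", or "am" as the user's name.

```python
import re


def naive_reply(user_input: str) -> str:
    """Reply using two deliberately crude, hypothetical rules."""
    text = user_input.lower()
    # Substring test: "mum" also occurs inside "chrysanthemums".
    if "mum" in text or "mother" in text:
        return "Tell me more about your mother."
    # Assumes a capitalized first word before was/is/am is the user's name.
    match = re.match(r"([A-Z][a-z]+) (was|is|am)\b", user_input)
    if match:
        return f"Would you like me to call you {match.group(1)}?"
    # Assumed fallback when no rule fires.
    return "Please go on."


print(naive_reply("The chrysanthemums are in bloom again."))
print(naive_reply("Sheila was bitten by a bat."))
```

Neither rule involves any intention to imply something; the apparent implicature is supplied entirely by the reader.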

1.6 Suspension of Disbelief
According to Grice’s cooperative principle, concluding that our conversational
partner is simply being uninformative is a last resort. Unless they are given no other
option, users will mentally provide any information necessary to make a chatbot’s
utterance work. Wallis (2005) argues that users routinely give chatbots the benefit of the
doubt, not necessarily just because of the unconscious conversational principles they are
accustomed to using with humans, but because of a basic desire to believe in the
character. “Human users want to believe, and as long as they are not setting out to test,
they will be willing to play along with an intentional agent.” The crucial argument here is
that users are applying their cooperative assumptions, not because they are unable to
ignore them, but because they choose to.

In the early 1800’s Coleridge introduced the idea of “willing suspension of
disbelief” by the audience, and the critical factor was not to break it. This raises the
question about what people are not willing to believe. The answer is that they will
believe pretty well anything, as long as it is established in the early parts of the
story. Pieces of string or lumps of clay can think and express emotions, liquid steel
men can melt through metal grills, and there is a platform 9 ¾ at Kings Cross.

Suspension of disbelief is willing, not unconscious; users are not ignorant of the
lack of deeper understanding inherent in computerized conversation, but they choose to
ignore it. It is easy to dismiss a chatbot as a simple program, but it is more fun to pretend
that it is an intelligent entity; and it is possible to do this as long as the program’s errors
don’t force users to recall its limitations.

1.7 Future Directions
Even users who are fully aware of chatbots’ lack of deeper understanding can
quite cheerfully engage in intellectually and emotionally meaningful conversations with
them. In the 1960s Weizenbaum was shocked by what he described as the ability of “a
relatively simple computer program [to] induce powerful delusional thinking in quite
normal people.” (1976, pp. 3-4) This view is directly opposed by Wallis’s interpretation of
chatbot interaction as a result of willing suspension of disbelief. To what extent the
treatment of language-using computer programs as if they are intelligent and reasonable
conversational partners is the result of cognitive biases, anthropomorphism, and
conversational implicature, and to what extent it is a conscious effort driven by users’
desire to believe and play along, is an area for further research.

Carpenter, R. & Freeman, J. (2005). Computing machinery and the individual: the
personal Turing test.

Duffy, B. (2003). Anthropomorphism and the social robot. Retrieved March 22, 2006
from http://www.eurecom.fr/~duffy/publications/duffy-

Grice, H. P. (1975). Logic and conversation. In Martinich, A.P. (ed). Philosophy of
Language. (pp. 165-175) New York, NY: Oxford University Press.

Levinson, S. C. (2003). Presumptive meanings: the theory of generalized conversational
implicature. Cambridge, MA: MIT Press.

Marakas, G. M., Johnson, R. D., & Palmer, J. W. (1999). A theoretical model of differential
social attributions: when the metaphor becomes the model. Retrieved March 17,
2006, from http://www.bus.ucf.edu/rjohnson/ISM3011H/socialattribution.pdf

Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59, 433-460.

Wallace, R. (2004). An introduction to A.L.I.C.E., the Alicebot engine, and AIML.
Retrieved February 10, 2006 from http://www.alicebot.org/about.html

Wallis, P. (2005). Believable conversational agents: introducing the intention map.
Retrieved March 7, 2006, from http://www.dcs.shef.ac.uk/~peter/wallis05-3.pdf

Weizenbaum, J. (1966). ELIZA - A computer program for the study of natural language
communication between man and machine. Communications of the ACM 9:36-45.
Retrieved February 18, 2006, from

Weizenbaum, J. (1976). Computer power and human reason: from judgment to
calculation. San Francisco: W. H. Freeman.