The Impact of Deception in a Multi-Agent System









The Impact of Deception in a Multi-Agent System



A thesis submitted in partial fulfillment
of the requirements for the degree of
Master of Science







By











Derrick Alan Ward, B.S.
Computer Science, Arkansas Tech University, 2001








December 2003


University of Arkansas








APPROVAL SHEET FOR MASTER'S THESIS





This thesis is approved for recommendation to the Graduate Council

Thesis Director:

_______________________________
Dr. Henry Hexmoor




Thesis Committee:


_______________________________

Dr. M. Gordon Beavers


______________________________

Dr. Randy Brown






















THESIS DUPLICATION RELEASE



[NOTE: To comply with Public Law 94-553--October 19, 1976, of the 94th Congress, an Act for the General revision of the Copyright Law, Title 17 of the United States Code, the following is to be in the thesis and signed by the student.]

I hereby authorize the University of Arkansas Libraries to duplicate this thesis when needed for research and/or scholarship.




Agreed_____________________________





Refused______________________________
























Table of Contents

1. Introduction
   1.1 Overview
2. Related Work
   2.1 Socionics
   2.2 HCI
   2.3 Security
   2.4 Summary
3. Simulating Terraforming Mars
   3.1 Agent Architecture
      3.1.1 Beliefs
      3.1.2 Desires/Intentions
   3.2 Communication
   3.3 Power
4. A Model of Deception
   4.1 A Dynamic Generation of Intentional Deception
   4.2 Deception Generation Heuristics
      4.2.1 Deception Devices
      4.2.2 Simulating Deception
   4.3 Coping With Deception
      4.3.1 Trust
5. Results
   5.1 Experiments Based on Varying Trustworthiness
      5.1.1 Distrust Among Agents
      5.1.2 Agent Performance
      5.1.3 Coalition Size
      5.1.4 Summary
   5.2 Varying Agents' Desires
      5.2.1 Distrust Among Agents
      5.2.2 Agent Performance
      5.2.3 Summary
6. Conclusion
7. References

1. Introduction

Deception encompasses a large range of behaviors consisting not only of the first example that usually comes to mind, blatant lies, but also a wide range of common deceptive behavioral mechanisms including concealment, exaggeration, equivocation, half-truths, irony, and misdirection [Buller, et al, 1994]. In fact, nearly every day while carrying out interpersonal relationships, one can expect to either witness or be the conveyor of a deception [Decaire, 2000]. Why do people deceive? To avoid hurting other people's feelings, to cover our own embarrassment, to reassure the needlessly anxious, and to spare unnecessary headaches are just a few possible "every day" reasons. Considering the variety of examples shown here already, it is not surprising that deception has been formally defined a number of ways.


The American Heritage Dictionary of the English Language (1985) defines the act of deception as "to cause a person to believe what is not true; mislead". In [Zuckerman, et al, 1979], deception was described as an intentional verbal message that does not honestly reflect an individual's actual opinion. Mitchell, on the other hand, forgoes giving a simple single definition, instead associating deception with four broad groups of behaviors based on complexity: (1) concealed appearance, non-verbal deceit based on physical appearances that may have evolved through genetics (Venus fly-trap); (2) deceit through coordinated action or collaborative deception [Donaldson, 1994]; (3) learned deception; and (4) planned deception [Mitchell, 1986]. It is clear from this research that deception encompasses a more complex and broad range of behaviors than commonly believed. Clearly, deception is not limited to verbal communication or even dishonesty, as the definition given by [Zuckerman, et al, 1979] implies.


Taking this notion of a broad range of deception one step further, it has been observed that deception need not be communicated at all. Consider the concept of self-deception, in which an organism convinces itself to adopt beliefs based on false information. Several mechanisms of self-deception include an exaggerated belief about the benefits and effectiveness an individual had in a situation; exaggeration, in which an individual's beliefs about past accomplishments become inflated; and the illusion of consistency, in which people change beliefs about past experiences to make them consistent with present realities [Trivers, 1985].


Although an exact meaning of deception is not easily defined, there are characteristics common to all deceit. All examples and forms of deceit, whether intentional or not, involve some sort of concealment of information. Deception always consists of a change in one person's beliefs, attitudes, behavior, or emotions brought about by some other person or persons [Raven, 1983] or, as Raven neglected to state, one's self (self-deception). It is also worth noting that deception is always motivated by an agent's desire to gain social influence or power over a target. Raven lists power and dominance, gain in social status, role requirements, a desire to adhere to social norms, concern for image, and a desire for attaining extrinsic goals as motivations for deceiving others [Raven, 1992]. Deceiving to gain power is not necessarily selfishly motivated to harm the target of the deception. In some cases the deceit may have the altruistic intent to help the addressee by tricking them into performing actions that are beneficial to themselves. At any rate, the usage of somewhat deceptive behavior is often beneficial, if not mandatory, for those involved, and is the norm, as opposed to the abrasiveness associated with brutal honesty.


Given that deception is such a vague and complex notion, deception in artificial intelligence may seem like a foreboding concept: how can a machine be programmed to generate/detect a lie? In reality, however, software programmed to exhibit deceptive behaviors is relatively common, especially in software designed for human computer interaction. The Turing test, for instance [Turing, 1950], is a game of deception involving a computer, a human, and an interrogator. The computer has been programmed to deceive the interrogator into believing that it is a human being, while the humans attempt to deceive their interrogators into believing that they are machines. Currently, deception in artificial intelligence is being used to investigate socionics: the merger of work from distributed AI and sociology to create insight common to both fields [Malsch, 1996]. This work often overlaps with explorations using deception as a security device to deceive hackers. Even graphical user interfaces are deceptive, as their purpose is to conceal the internal operations of the machine (e.g.: a trashcan graphic conceals the technical details of the erasure of sectors on a disk and the updating of a file allocation table).


While some artificial intelligence work in deception has been performed, there is a need for more experimental results. [Cohen, et al, 2001], a paper discussing deception algorithms and AI models, concludes "we need to measure the effects and known characteristics of deceptions on the systems comprising of people and their information technology to create, understand, and exploit the psychological and physiological bases for the effectiveness of deceptions. …The creation of simulation systems and expert systems for analysis of deception sequences, and a wide range of related work would clearly be beneficial as a means to apply the results of experiments once empirical results are available". While relevant real-life experiments have been performed [Ekman and O'Sullivan, 1991], these experiments have been limited to very simple variables and results. This is because the participants' mental states are unknown, and knowledge of those states is practically necessary for measuring deception and its effects. It is impossible, for instance, to measure or control a real-life participant's lying ability, frequency, or decision making with certainty.

We have developed a multi-agent simulator test-bed, the Mars Terraforming Agent simulator [Ward, Hexmoor, 2003]. In the simulator, rational agents communicate to help one another achieve individual goals for gaining energy by performing terraforming and solar receptor installation tasks. We have designed a series of experiments to show the relationships between deception, trust, power, knowledge, and performance in this domain.


1.1 Overview

The first chapter gives a foundation for deception and describes the research problem of interest to this study. The second chapter lays a foundation for deception in technology with a literature review. The third chapter focuses on the architecture of our agents and multi-agent system. The fourth chapter details the model of deception used by agents. In chapter five, the data from empirical results of our experiments are shown and interpreted. Finally, in chapter six, we draw conclusions about the impact of these experiments, and detail possible future work.


2. Related Work

The three main areas of research on deception in artificial intelligence have been in the fields of socionics, security, and HCI (human computer interaction), with large overlaps between all work. Although research in deception has been conducted in other AI related areas, such as genetic algorithms, this review focuses on the former three groups as they are most relevant to this work.


2.1 Socionics

Much of the work done involving deception in artificial intelligence has been in the field of socionics. Some of these works involve the investigation of the effects of artificial models of deception/counterdeception in multi-agent systems, while other works focus on the model of deception itself.


One such work includes Castelfranchi's experiments on deception in the simulated GOLEM "block world" environment [Castelfranchi, Falcone, and deRossis, 2000]. Castelfranchi's group divided deceptive acts into 6 categories and showed how they could occur as simulated in GOLEM:


Physical World: the agent is deceived by the world due to limited perception ability
Intentional: an agent generates deception in order to intentionally deceive another
Passive: an agent deceives another by withholding helpful information
Ignorance: an agent ignores useful information that they have perceived
Communication: an agent deceives another with its utterances
Social:
   A. Deception about the world planned and exploited by another agent
   B. Deception (possibly accidental) about another agent

Several experiments between cooperative agents were then performed using the simulator and involving these types of deceit. Agents were motivated to deceive one another in order to pilfer help from others to achieve a goal using less work. This intentional communicated deceit was divided into 3 sub-categories: deception about capabilities, deception about personality, and deception about goals. Several important situations that arise between agents due to these types of deception in GOLEM were examined. The paper concludes with preliminary results showing that (1) in order to perform deception effectively an agent must have beliefs about the beliefs of the agent they are deceiving, and (2) deception is not necessarily non-cooperative toward the agent being deceived in GOLEM.



A different approach was taken in [Gmytrasiewicz, Durfee, 1993] towards modeling deception in multi-agent systems. The paper described a method in which agents could model the beliefs about other agents' goals recursively (e.g.: A's belief about B's belief about A's belief) in order to dynamically determine the most effective and believable lie. Large action-based state space trees are generated by agents in order to simulate the beliefs of others after likely possible actions have been performed. These trees are then analyzed in order to determine the most effective and believable lie. The internals of two situations involving deception are analyzed in detail: the case where a hearer doesn't believe the lie, and the case where the lie pays off for the deceiver.


Other works focused on measuring the effects of deception and related behaviors in a multi-agent system. [Sen, Biswas, and Debnath, 2000] compared the performances between selfish agents and reciprocal agents in a packet delivery domain. Reciprocal agents would request help from others, expecting service in return, while selfish agents only request help, never granting it. Several experiments were performed, with each experiment adding sophistication to the selfish agents' deception. Correspondingly, the reciprocal agents' selfishness- and deception-coping mechanisms were made more intelligent.



Agents' communication, the means for deception, consisted of requests/replies for help and utility-based reputation values based on the quality of help received from others (negative indicated a selfish agent). Performance was measured in the amount of work performed vs. the percent of selfishness in selfish agents. In the base case, reciprocal agents had a huge advantage over selfish agents, but in subsequent experiments, selfish agents communicating deceptive rumors about reciprocal agents' reputations gave the selfish agents an advantage when the selfish percentage was high enough. Adding a formula for reciprocal agents to measure reputations, and then improving this formula so that only reputations given by trusted agents were considered, proved to be very effective methods for agents to deal with deceptive selfish agents. Ultimately, the performance mirrored that of the base case. In conclusion, this experiment showed the effects of deceptive reputation-based rumors, and how the counterdeception model could be changed to deal with the deception. This was also important in showing the overlap between deception and selfishness/altruism, as these are intertwined in the motivation for deception generation.


Another work in which agent performance was measured against deceptive behavior includes that described in [Carley and Prietula, 2000]. These experiments used a multi-agent simulator to investigate the effects of agents generating deceptive rumors. The emphasis in this case was on counterdeception and trust learning. The domain consists of a warehouse with stacked items in aisles. The agents' goal is to seek assigned items found in the stacks. To help achieve this goal, the agents communicate advice amongst themselves.

If this advice is determined to be untruthful by the advisee, it may modify its perceived trust for the advisor agent and begin to communicate rumors about the advisor. The rumors generated by the advisee will then potentially cause other agents to modify their perceived trust for that original advisor agent. The trust learning mechanism was varied over experiments in order to determine the relationship between trust learning and the effects of these rumors. Four different trust learning rates were used: reactive, forgiving, always distrusting, and always trusting. Finally, some agents could be deceptive, intentionally communicating false rumors and advice.

The simulation was run using five agents with varying trust models, numbers of deceptive agents, rumors or no rumors, and environment stability. Different benchmarks were also used: amount of work performed, information withheld from agents not trusted, conflict with other agents, and duration of information coalitions. Results showed that forgiving agents dealt the most effectively (vs. reactive agents) with the societal instability introduced by deception. With less stability, however, forgiving agents had no advantage over reactive and closed agents.


2.2 HCI

Other research has had the goal to create better deception models for agents so that human computer interaction (HCI) will be more realistic (such as in the aforementioned Turing test). This is similar to other research that has focused on the benefits of emotion in HCI. In [Carofiglio, de Rosis, 2001], for example, agents' beliefs were represented using sets of Bayesian networks, including beliefs about the beliefs of other agents. They then described an algorithm for a speaker agent to modify these images of other agents' beliefs in order to generate deceptive communication allowing the speaker to gain influence. The algorithm worked by strategically adding non-active beliefs (potentially believable beliefs that currently do not belong to any of that agent's networks) to the network representing the beliefs of another agent and then rating its effectiveness to influence the addressee. Four factors are considered when rating the deceptive network: efficacy, plausibility, safety, and reliability. If the optimality for using this deceptive network as an utterance decreased with the addition of a belief(s) based on these factors, then the added belief is pruned from the network, and another iteration of these steps is performed.


Another totally different approach to deceit in HCI is described in Tognazzini's principles [Tognazzini, 1993], which discuss the role of deceit in interface design. Tognazzini, former HCI guru of Apple Computer, advocates a magician-like deceptive approach to HCI development in his principles, describing the importance of illusions, misdirection, and simulation in deceiving the human user. Essentially, his goal is to conceal the technical details of the internals of the computer to make the machine appear more "friendly".


2.3 Security

Finally, one of the main applications for deception has been in security. Using deception as a security mechanism has three goals: (1) to encourage the attacker to waste resources attacking something that does not exist; (2) to prevent the attacker from gaining access to the actual system resources and/or data; and (3) to provide a forum for analyzing the attacker's goals and methodologies. In fact, the use of deception in security is widespread, as it has been shown to yield positive results.


There are many techniques for utilizing deception in security, the most common of which is known as the honeypot. There are three types of honeypots, listed here in chronological order of development:

1. sacrificial lamb: computers whose sole purpose is to attract attackers
2. façade: similar to the sacrificial lamb except only simulated services are attacked instead of the actual computer
3. instrumented systems: a dynamic merger of the sacrificial lamb and the facade

[Recourse, 2002]


A framework for using intelligent software decoys to deceive hackers once they have infiltrated a system is described in [Michael, Riehle, 2001]. The model consists of a security contract which, when violated, triggers the generation of deceptive decoys by the software object. The goal of this deception is to convince the mobile agent into concluding that it has successfully infiltrated the system. The decoys described simply consist of a fake Java object generated at run time with randomly permuted arguments. Most of these techniques have been compiled in a software package consisting of PERL scripts called the deception toolkit (DTK) [Cohen, 1998].


2.4 Summary

Deception in artificial intelligence is common among work in socionics, security, and HCI, with a large overlap between all works. Although the work discussed by this thesis project would be considered socionics, our model draws from all of the described areas almost equally. For instance, parts of our deception model are inspired by models used in HCI. Security research is relevant to this work because of its practical bent; deception has actually been tested with successful results in security research. Not surprisingly, our main draw from this area is our implemented methods for generating and detecting deception.

The following chapter covers the Mars Terraforming agent simulator, agent architecture, and agent communication in detail. As these are the means for deception to occur, they shape the model of deception described in detail in chapter 4.


3. Simulating Terraforming Mars

The simulated Mars environment consists of two main components: a grid in which each square has three sub-components, and a set of agents. These sub-components include the elevation level of stacked blocks, which agent owns the square, and the harvest status of the square. Agents move from square to square on the grid and attempt to gain ownership of squares in order to plant and harvest solar receptors. The grid data is maintained by the simulator. All actions are implemented as requests to the simulator.


Figure 3.1: Two-dimensional block world

Definition 3.1: The data representing a state of the environment, denoted by terrain data, consists of a set of matrices of size width x height containing data representing the 2-dimensional navigation space, which includes elevation data, ownership data, and harvest data.

terrain data = {E, O, H}


Definition 3.2: The number of agents in the game, denoted by number of agents, is a number greater than one.


Each number in the matrix E is a number in the range 0 through 9 that represents the height of the blocks at a given location. Each number in the matrix O is a number in the range 0 through number of agents, which specifies the identification of the agent which owns the given location, with 0 being used for a square that is owned by no agent. Each number in matrix H is a number from 0 to 20 specifying the status of the solar receptors (harvest data), where 20 indicates the receptor is fully charged and is now ready to be discharged by the owner agent.
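To make the representation concrete, the following is a minimal sketch (in Python, with illustrative names and dimensions not taken from the thesis implementation) of how terrain data and its three matrices could be held:

# Sketch of the terrain data {E, O, H}; names and dimensions are illustrative only.
from dataclasses import dataclass, field
from typing import List

WIDTH, HEIGHT = 8, 8  # grid dimensions (width x height)

def _matrix(fill: int) -> List[List[int]]:
    return [[fill for _ in range(WIDTH)] for _ in range(HEIGHT)]

@dataclass
class TerrainData:
    E: List[List[int]] = field(default_factory=lambda: _matrix(0))  # elevation, 0 through 9
    O: List[List[int]] = field(default_factory=lambda: _matrix(0))  # owner id, 0 = unowned
    H: List[List[int]] = field(default_factory=lambda: _matrix(0))  # harvest status, 0 through 20

terrain = TerrainData()
terrain.E[1][4] = 3    # elevation 3 at row y=1, column x=4
terrain.O[1][4] = 1    # square owned by agent 1
terrain.H[1][4] = 20   # receptor fully charged and ready to be harvested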


Agents may gain ownership of squares when four or more contiguous squares adjacent in block directions have equivalent elevation. Agents officially gain ownership of squares by being granted a request to own by the simulator. All squares detected by sensors are requested immediately after being detected. Each square may be owned by only one agent.

Definition 3.3: The number of squares owned by an agent, denoted by a function SquaresOwned(A,t), is the number of squares owned by an agent A at time-step t.


Definition 3.4: The current time-step, denoted by turnnumber, is the number of turns that have elapsed since the beginning of a game.


Definition 3.5: Actions, denoted by A, is a set of 11 constant actions available to agents. The actions include:

0. stand still
1. move north
2. move south
3. move east
4. move west
5. plant (install receptor)
6. harvest (discharge receptor)
7. dig north (push block north)
8. dig south (push block south)
9. dig east (push block east)
10. dig west (push block west)

Dig actions modify the environment by decreasing the elevation at the position of the agent, while increasing it in the adjacent square in the specified block direction. Agents may only push blocks into squares with equal or lesser elevation values compared to the elevation of the square occupied by the agent. In other words, agents may only push blocks downhill.


Planting may only be performed in a square that is owned by the agent, and for which the harvest matrix has a value of zero. Once a square is planted, the harvest value will be incremented by the simulator one unit each time-step, and will continue to increase until the receptor is fully charged. Agents may only perform the harvest action (discharge receptors) in squares where the harvest value is fully charged.


Each action costs an agent one energy unit except for harvest, which, if performed successfully, gives the agent 10 energy units. If an agent has less than 0 energy units, the agent remains on the grid as an obstruction but can no longer perform any communication or action other than stand still.
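As an illustration of these rules, the following sketch (hypothetical helpers, not the simulator's code) checks the downhill constraint for a dig request and applies the energy accounting described above:

ACTION_COST = 1        # every action costs one energy unit
HARVEST_REWARD = 10    # energy gained by a successful harvest

def try_dig(E, x, y, dx, dy, energy):
    # Push one block from the agent's square <x,y> into the adjacent square <x+dx,y+dy>,
    # allowed only downhill (target elevation <= the agent's elevation).
    tx, ty = x + dx, y + dy
    in_bounds = 0 <= tx < len(E[0]) and 0 <= ty < len(E)
    if in_bounds and E[y][x] > 0 and E[ty][tx] <= E[y][x]:
        E[y][x] -= 1
        E[ty][tx] += 1
    return energy - ACTION_COST

def try_harvest(H, O, x, y, agent_id, energy):
    # Discharge a fully charged receptor in a square owned by the agent.
    if H[y][x] == 20 and O[y][x] == agent_id:
        H[y][x] = 0
        return energy - ACTION_COST + HARVEST_REWARD
    return energy - ACTION_COST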



Figure 3.2: The simulator's algorithm

1. Send sensor data to agents
2. Randomly shuffle agents and execute each agent's algorithm (Figure 3.3) in the resulting order
3. Process agents' requests to perform actions, and update terrain data and agents' coordinates as necessary
4. Compute each agent's power over each other agent
5. Process agents' requests to communicate with other agents
6. Goto 1

Figure 3.3: An agent's algorithm

1. Analyze new sensory data and revise beliefs about the environment and beliefs about other agents
2. If there is no plan or the old plan is outdated, generate state space and determine plan
3. Request simulator to perform current action in plan
4. Generate message to send to other agent and request the simulator to deliver the message
5. Goto 1

3.1 Agent Architecture

Agents are modeled in the belief-desire-intention paradigm [Wooldridge 2000], in which each agent holds sets of beliefs, desires, and intentions. Beliefs are based on perception of the environment, which in this case consists of the block levels contained in the grid elements, locations of other agents, trustworthiness (credibility) of other agents, and utterances from other agents. Desires are the agents' goal states. The goal states for all agents are based on the need to obtain energy. Finally, the intentions of the agents consist of a plan of action selected from a state space using depth first search.


The agents are initialized at the start of the game using three parameters. These include trustworthiness, trust rate, and strategy. Trustworthiness is a value that determines an agent's honesty or credibility, trust rate determines the reactivity of an agent's beliefs about another agent's honesty, ability determines the range of actions an agent is able to perform, and strategy determines which set of utilities an agent will have. Our trust model is described in detail in section 4.3.


Definition 3.1.1: The x position of an agent, denoted by x(A), is a number in the range 0 to width.

Definition 3.1.2: The y position of an agent, denoted by y(A), is a number in the range 0 to height.









Figure 3.1.1: Agent architecture

Beliefs
   About terrain data
      Who owns a square
      When square was last seen (time-step)
      Contains an occupant, and who the occupant is
      Dirt height (elevation)
      A number indicating the status of the harvest at that square
   About other agents
      Trust (honesty/reliability)
      Physical positions
   About self
      Trustworthiness of self
      Amount of energy
      Current position
Desires
   Modeled utilities
Intentions
   Plan consisting of actions
   Communication message to send

3.1.1 Beliefs

Agents' beliefs can be divided into three categories: beliefs about terrain data, beliefs about other agents, and beliefs about themselves. Agents' beliefs about terrain data are derived from two sources: other agents' communications, and sensor data. An agent's sensors can perceive terrain data, and the identity and location of another agent, in all (at most 9) adjacent squares. For elements of terrain data for which an agent has no information, a null belief is used. For example, if an agent has never perceived square <x,y>, then the elevation, harvest data, and ownership values for element <x,y> in the agent's beliefs about terrain data will be set to null. Agents may also revise their beliefs about terrain data based on utterances received from other agents. Agent communication and belief revision based on communication is described in more detail in section 3.2. Once beliefs are revised, a time-stamp for each is set to the date when the information was revised. This time-stamp data allows agents to deal with uncertainty about old beliefs.


Beliefs about other agents include trust, which is used by agents to determine the credibility of another agent. Trust is described in more detail in section 4.3. Agents also have beliefs about the physical locations of other agents, which is useful for agents to determine whether their actions may interfere with the actions of others.


Beliefs agents maintain about themselves include trustworthiness, energy, and current position, all of which are always totally reliable in the current model (agents cannot deceive themselves).


Definition 3.1.3: A time stamp, denoted by function timestamp(A,BEL), indicates the turn in which an agent revised a belief.


3.1.2 Desires/Intentions

Each node in the state space consists of a state, or possible world. Possible worlds consist of the state resulting from the change in the environment caused by an agent performing one of the eleven possible actions.

Figure 3.1.2: A snapshot of an agent's BDI state space

It follows that the breadth of the state space is 11 (figure 3.1.2). The depth of the state space indicates the number of time-steps that would have elapsed in the plan since the present. An agent uses utility functions to rate each state in its state space.


Definition 3.1.4: A possible world, denoted by S, consists of terrain data (T), the time-step when the possible world could exist (t), and an agent whose beliefs the terrain data is based on (A).

possible world = <T,A,t>


Since there are three distinct categories of actions (moving, digging, and harvesting/planting), there are three sets of utilities representing the desire of an agent for a possible world created by one of these actions. In the remainder of this section (3.1.2) we formally state the utility functions used by agents when results were generated. To determine which square is the most desirable for an agent to plant in or move to, the series of functions described in definitions 3.1.7 through 3.1.11 is used. Definition 3.1.12 describes the formula used to compute an agent's utility for performing a digging action, while definitions 3.1.13-3.1.16 are the plant and harvest utilities. Finally, the function for computing the overall utility is defined in 3.1.17.


Definition 3.1.5: Energy amount, denoted by energyamount(A,t), is a function that returns the amount of energy contained by an agent.


Definition 3.1.6: The utility for energy, denoted by function energy(S), is

energy(S) = 1 / (1 + 1.02^-(energyamount(S[A],S[t]) - 150))

In essence, this function is a threshold for determining the "hunger" of an agent for energy. Since it is not desirable for an agent to oscillate constantly from a state of hunger to one of searching, a fuzzy sigmoid that ranges over [0,1] is used. This is a slight variation on 1/(1 + e^-x), which approaches 1 as x approaches infinity. The base value 1.02, which was chosen using trial and error, determines the growth of the function. If the curve is too steep, for instance, small amounts of energy could increase the energy value too coarsely, causing an agent to react too slowly to the loss or gain of large amounts of energy. The value of 150 was chosen by trial and error for the threshold value (1 / (1 + 1.02^-(150 - 150)) = .5), since it has been observed that 150 energy units is sufficient for an agent to safely accomplish useful work.
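A short sketch of this fuzzy energy utility, assuming the reconstructed form 1/(1 + 1.02^-(e - 150)) with the base 1.02 and threshold 150 described above:

def energy_utility(energy_amount):
    # Fuzzy "hunger" value in [0,1]; 0.5 at the 150-unit threshold,
    # approaching 1 as the agent's energy grows.
    return 1.0 / (1.0 + 1.02 ** -(energy_amount - 150))

print(energy_utility(150))   # 0.5 at the threshold
print(energy_utility(300))   # roughly 0.95
print(energy_utility(0))     # roughly 0.05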



Figure 3.1.3: The sigmoid function used as a fuzzy set for an agent's desire for energy.


Definition 3.1.7: The function for computing the potential for blocks in a given square to be placed into adjacent squares, denoted by function amountdownhill(S,x,y,x',y'), is

amountdownhill(S,x,y,x',y') =
    | S[Eyx] - S[Ey'x'] |   ; | S[Eyx] - S[Ey'x'] | > 1
    0                       ; | S[Eyx] - S[Ey'x'] | <= 1

where S[Eyx] denotes the elevation value at the specified location in the state's terrain data.


Definition 3.1.8: The utility for rating an element in the elevation matrix using strategy 1, denoted by function UTILsquare1(S,x,y), rates a square according to how much digging would be needed to level its surrounding area and its physical proximity to agent S[A].

UTILsquare1(S,x,y) =
    amountdownhill(S,x,y,x+1,y) + amountdownhill(S,x,y,x,y+1)
    + amountdownhill(S,x,y,x-1,y) + amountdownhill(S,x,y,x,y-1)
    + width + height - (| x(S[A]) - x | + | y(S[A]) - y |)

The function amountdownhill is called four times to determine the potential for moving blocks into each adjacent square in a block direction. The width + height - (| x(S[A]) - x | + | y(S[A]) - y |) component has the effect of highly weighting squares physically close to the agent.


Definition 3.1.9: The utility for rating an element in the elevation matrix using strategy 2, denoted by UTILsquare2(S,x,y), rates a square according to how little digging would be needed to level its surrounding area and its physical proximity to agent S[A].

UTILsquare2(S,x,y) =
    40 - [ amountdownhill(S,x,y,x+1,y) + amountdownhill(S,x,y,x,y+1)
    + amountdownhill(S,x,y,x-1,y) + amountdownhill(S,x,y,x,y-1) ]
    + width + height - (| x(S[A]) - x | + | y(S[A]) - y |)

This is similar to strategy 1 except that the amountdownhill sum is subtracted from 40 (the maximum possible amountdownhill for each square being 10), so that the case where two squares can be made to have equal height after one dig action is performed receives the highest possible value.


        0   1   2   3   4   5   6   7
    0   A2  6   5   3   5   3   3   2
    1   3   4   7   2   3   3   4   3
    2   3   4   4   8   6   9   7   1
    3   1   3   5   5   4   4   4   2
    4   2   4   7   2   3   8   3   0
    5   3   5   6   0   4   3   4   1
    6   6   7   2   2   5   3   7   6
    7   4   6   2   3   0   7   A1  1

Figure 3.1.4: A state of two agents' beliefs about E, the elevation matrix. (A1 and A2 mark the squares occupied by the two agents.)


An example of how each of the two UTILsquarei(S,x,y) functions is used in conjunction with the rank function is illustrated in figure 3.1.4. Agent A1, who is using strategy 2, has generated a plan to move to square <4,1>, shown in the emboldened and italicized area. One dig west action is all that is needed to make <4,1> and <4,0> both 3. Agent A2, who is using strategy 1, has generated a plan to move to the area that is only emboldened.
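A minimal sketch of the square-rating utilities of definitions 3.1.7 through 3.1.9 (Python, illustrative names, no bounds checking, so <x,y> is assumed to be an interior square):

def amountdownhill(E, x, y, x2, y2):
    # Potential for moving blocks between <x,y> and the adjacent <x2,y2> (Definition 3.1.7).
    diff = abs(E[y][x] - E[y2][x2])
    return diff if diff > 1 else 0

def util_square(E, x, y, ax, ay, width, height, strategy):
    # Strategy 1 favors squares needing much leveling work; strategy 2 favors squares
    # needing little (Definitions 3.1.8 and 3.1.9). (ax, ay) is the agent's position.
    downhill = (amountdownhill(E, x, y, x + 1, y) + amountdownhill(E, x, y, x, y + 1)
                + amountdownhill(E, x, y, x - 1, y) + amountdownhill(E, x, y, x, y - 1))
    proximity = width + height - (abs(ax - x) + abs(ay - y))
    return (downhill if strategy == 1 else 40 - downhill) + proximity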





The following two functions are used to compute an agent's utility for performing a dig action.

Definition 3.1.10: The rank of an elevation matrix element, denoted by function rankelement(S,x,y), is a number between 1 and width * height, where the smaller the value, the larger the value of UTILsquarei(S,x,y).

Definition 3.1.11: The utility for an agent to move to a square on the grid, denoted by function UTILmove(S,x,y), is defined as

UTILmove(S,x,y) = (| x - x(S[A]) | + | y - y(S[A]) |) / rankelement(S,x,y)

Definition 3.1.12: The utility for an agent to perform a dig action in a square, denoted by UTILdig(S,x,y,x',y'), is defined as

UTILdig(S,x,y,x',y') = energy(S) / rankelement(S,x,y)

where x' and y' specify a square adjacent to <x,y> and amountdownhill(S,x,y,x',y') is the maximum over all x' and y'.




The following three functions are used to compute an agent’s utility for
performing a
plant

a
ction, while definition 3.1.17 is an agent’s utility for
performing a
harvest action
.


Definition 3.1.13: Squares owned, denoted by SquaresOwned(S), is the number of squares owned by S[A].

Definition 3.1.14: Squares planted, denoted by SquaresPlanted(S), is the number of squares that have been planted by agent S[A] up to time-step S[t].






Definition 3.1.15: The utility for an agent to perform a plant action in a square, denoted by function UTILplant(S,x,y), is

UTILplant(S,x,y) =
    UTILsquarei(S,x,y) * (squaresplanted(S) / squaresowned(S))   ; S[Hyx] = 0 & S[Oyx] = S[A]
    0                                                            ; S[Hyx] ≠ 0 | S[Oyx] ≠ S[A]

where i indicates whether an agent uses UTILsquare1 or UTILsquare2.
.


Definition 3.1.16: The utility for an agent to perform a harvest action in a square, denoted by function UTILharvest(S,x,y), is

UTILharvest(S,x,y) =
    1 - energy(S)   ; S[Hyx] = 20 & S[Oyx] = S[A]
    0               ; S[Hyx] < 20 | S[Oyx] ≠ S[A]




Finally, we compute the overall utility value for the entire state:

Definition 3.1.17: A utility function for a state, denoted by function UTIL(S), consists of the weighted sum of the more specific utility functions defined above.

UTIL(S) = [W1 * UTILmove(S) + W2 * UTILdig(S) + W3 * UTILplant(S) + W4 * UTILharvest(S)] / 4

Wi is a weighting coefficient between 0 and 1.0 which determines an agent's strategy. For instance, a selfish agent could have a low weighted dig function, but a highly weighted harvest function. All strategies currently implemented have equal weights for all utilities.
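A sketch of the overall state utility as the weighted combination of Definition 3.1.17 (equal weights shown, matching the currently implemented strategies; the four utility values are assumed to have been computed already):

def util_state(util_move, util_dig, util_plant, util_harvest,
               weights=(1.0, 1.0, 1.0, 1.0)):
    # Weighted sum of the four action utilities for a possible world S (Definition 3.1.17).
    w1, w2, w3, w4 = weights
    return (w1 * util_move + w2 * util_dig + w3 * util_plant + w4 * util_harvest) / 4

# A selfish strategy could, for example, down-weight digging relative to harvesting:
selfish_value = util_state(0.2, 0.1, 0.3, 0.9, weights=(1.0, 0.2, 1.0, 1.0))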


Definition 3.1.18: A plan, denoted by P, is a sequence of 0-8 nodes, each containing a possible world and an ID indicating the performed agent action that created the possible world.

If an agent gains new beliefs that are based on perceived sensor data or communication, and these new beliefs conflict with a plan, then the agent generates a new plan. Otherwise, the actions contained by the plan will be executed until all actions have been performed. This allows agents to avoid the computational cost of generating redundant plans.
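A minimal sketch of how such a plan could be selected: a depth-first enumeration of action sequences (breadth 11, depth at most 8) that keeps the sequence whose final possible world rates highest. The simulate and UTIL functions are assumed to exist, and an actual implementation would prune rather than exhaustively expand all 11^8 sequences; this is illustrative only, not the thesis implementation.

ACTIONS = range(11)    # the 11 actions of Definition 3.5
MAX_DEPTH = 8          # a plan holds at most 8 nodes (Definition 3.1.18)

def best_plan(state, simulate, UTIL, depth=MAX_DEPTH):
    # Depth-first search over action sequences; returns (best utility, plan).
    best_value, best_actions = UTIL(state), []
    if depth == 0:
        return best_value, best_actions
    for action in ACTIONS:
        next_state = simulate(state, action)   # the possible world after the action
        value, tail = best_plan(next_state, simulate, UTIL, depth - 1)
        if value > best_value:
            best_value, best_actions = value, [action] + tail
    return best_value, best_actions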


3.2 Communication

At the end of each round, an agent may send a message to one other agent by submitting the message to the simulator's message handler queue. At the beginning of each round, the message handler places all messages sent at the end of the last round in the recipient agents' in-message queue. The queue is of size numberofagents in case multiple messages are received by an agent.


Definition 3.2.1: A location, denoted by L, is a coordinate <x,y> for the game grid.




Definition 3.2.2: A communication message, denoted by C, is a four-tuple.

C = <T,L,A,A'>

where T is terrain data, L is a location, A is the sending agent, and A' is the receiving agent. The terrain data and the destination agent are the components of a communication message that originate from the source agent. The component A', the recipient of the communication message, is chosen using an equation by the sender A, while the terrain data to be sent to another agent is chosen as a function of the agent's trustworthiness (see chapter 4). The component A, or from whom the communication message originated, and the component L, the location of A, are placed into the communication message by the simulator's message handler.

(a) terrain data (T):

    4/1/18   6/0/0          3/0/0
    4/1/19   0/0/0 (agent)  1/0/0
    4/1/20   6/0/0          3/0/0

(b) Location (L): <4,8>; sending agent (A): 1; receiving agent (A'): 5

Figure 3.2.1: An example of a communication message, with terrain data shown as elevation/ownership/harvest. The agent has placed terrain data about its location and its 8 adjacent locations into the communication message. The simulator sets the sender field and sending location field.

Once an agent has decided which communication message to process, if the information is plausible, the agent revises its beliefs based on the terrain data and physical location contained in that communication message. The agent then sends a verify bit to the agent from which the message originated. If the information is not plausible, then the agent revises its trust for the originator of the communication message (see section 4.3 on Trust for more details on revising beliefs based on communication messages and plausibility).
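As a concrete illustration of the four-tuple <T,L,A,A'> and the verify bit, a minimal sketch with hypothetical field and method names:

from dataclasses import dataclass

@dataclass
class CommunicationMessage:
    terrain: object   # T: the (possibly deceptive) terrain data being shared
    location: tuple   # L: the sender's coordinates, filled in by the message handler
    sender: int       # A: the sending agent's id, filled in by the message handler
    recipient: int    # A': the receiving agent's id, chosen by the sender

def process_message(agent, msg):
    # Receiving side: revise beliefs and acknowledge if plausible,
    # otherwise lower trust in the sender (see section 4.3).
    if agent.is_plausible(msg):
        agent.revise_beliefs(msg.terrain, msg.location)
        agent.send_verify_bit(msg.sender)   # counted by acceptcount (Definition 3.2.4)
    else:
        agent.lower_trust(msg.sender)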



Definition 3.2.3: The number of previous communications an agent has had with one other agent is denoted by commcount(A, A'). Communications counted include all messages the agent has previously sent to the specified other agent, as well as all messages the agent has received from the other agent.


Definition 3.2.4: The number of verify bits received by a given agent from another agent is denoted by acceptcount(A, A'). A verify bit is sent by a recipient to the sender when the recipient revises beliefs based on received communication messages.


Definition 3.2.5: The utility function an agent uses to determine which other agent to communicate with, denoted by whocomm(A,A'), is

whocomm(A,A') = max { height + width - (| x(A) - x(A') | + | y(A) - y(A') |) + commcount(A,A') - receivecount(A,A') }

where Trust(A,A',t) > .5




The function whocomm allows a sender to decide which agent to receive/send a communication message to/from based on two factors. These factors include the physical distance between agents on the grid, and the number of times agents have successfully communicated in the past. In other words, agents communicate with the agent that is closest to them, that has communicated with them the most in the past, and is trusted. This function is designed to give agents reciprocal behavior; agents attempt to send communication containing the information that is thought to be the most useful to others for generating a plan (hence the distance function) and expect to receive similar information in return. Agents communicate with the other agent that is best known and trusted in order to avoid being tricked by deception (chapter 4). whocomm is invoked twice each round. At the beginning of the round, whocomm(A,A') is invoked on every agent A' from which a message that exists in A's in-message queue originated. At the end of the round, whocomm(A,A') is called for all agents A', with the intent of finding the most suitable recipient agent.
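A sketch of this selection as written in Definition 3.2.5, choosing among trusted agents by proximity and past communication (the count and trust accessors are assumed to exist on the agent):

def whocomm(agent, candidates, width, height, t):
    # Pick the trusted agent that maximizes the Definition 3.2.5 score.
    best, best_score = None, float("-inf")
    for other in candidates:
        if agent.trust(other, t) <= 0.5:   # only trusted agents are considered
            continue
        distance = abs(agent.x - other.x) + abs(agent.y - other.y)
        score = (height + width - distance
                 + agent.commcount(other) - agent.receivecount(other))
        if score > best_score:
            best, best_score = other, score
    return best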



Definition 3.2.6: A coalition, denoted by coalition, is a group of at least two agents communicating (directly or indirectly).

Coalitions can be described as connected components in a directed graph, where the graph represents the artificial society. Each agent in the society is represented in the graph as a vertex. Each vertex A has, at most, an incoming edge and an outgoing edge. The incoming edge is <A',A>, where A' is the agent from which the communication message originated that was chosen from A's in-message queue by the function whocomm. The outgoing edge is <A,A''>, where A'' is the agent chosen by whocomm to send a message to at the end of the round and where A'' has decided to revise its beliefs based on the communication message received from A.

Coalitions, as described here, are strictly information-sharing alliances; agents do not collaborate to generate mutually beneficial plans. Agents are never able to directly see all of the other agents in a coalition, only the agents that messages are sent to and received from directly. However, with the propagation of data and rumors, agents are at least indirectly influenced by each other agent in the coalition. Thus, agents decide to stay with or break a coalition based on the usefulness/reliability of the circulating information as determined by the function whocomm (definition 3.2.5). Our coalitions are useful for determining how widespread agents' influences tend to be. This use of coalition is different from that described in other multi-agent literature such as in [Sandholm, Lesser, 1995], where agents in coalitions cooperate by coordinating activities and plans together.
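Because each agent contributes at most one incoming and one outgoing edge per round, coalitions can be recovered as connected components of the round's communication graph; a short sketch (treating the edges as undirected for the grouping):

def coalitions(agents, edges):
    # Group agents into information coalitions, i.e. connected components of the
    # communication graph. edges is a list of (sender, receiver) pairs.
    parent = {a: a for a in agents}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for sender, receiver in edges:
        parent[find(sender)] = find(receiver)

    groups = {}
    for a in agents:
        groups.setdefault(find(a), []).append(a)
    # A coalition requires at least two communicating agents (Definition 3.2.6).
    return [members for members in groups.values() if len(members) > 1]

# Example matching Figure 3.2.2: C -> B -> A forms the single coalition {A, B, C}.
print(coalitions(["A", "B", "C", "D"], [("C", "B"), ("B", "A")]))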



Figure 3.2.2: Agents A, B, and C are shown here in an information coalition. While C does not communicate directly with A, B spreads "rumors" generated by C on to A, causing C to have influence over A (and vice versa).




3.3 Power

Power in the Mars Terraforming Agent Simulator is a measure of the speaker agent's influence over an addressee's plan via a sent communication message. This influence has the potential to lead the addressee agent to generate plans based on incorrect beliefs about terrain data. These plans often cause damage to agents by leading them to follow courses of action that are not as beneficial to them as a plan generated using correct beliefs would have been. This power of communication influence is intrinsic [Hexmoor 2003], as agents could choose to ignore messages and not be influenced by them. Although other notions of power exist in the simulator, such as the extrinsic power of an agent over another agent because of physical force imposed by performed actions, the power exercised by agents using communication is of primary relevance to this project so that the role deception plays in influencing a deceived agent's decision making can be measured. Essentially, power as described here is the power a communication message has over an agent rather than the power an agent has over another agent. As with our coalitions, the agents themselves are not even aware of this power (although agents attempt to estimate the power a communication message could potentially exercise over another agent).


In [Brainov, Sandholm, 1998], several formulas were described for computing the extrinsic power that an agent exercises over other agents because of dependencies resulting from having joint plans. We have made slight modifications to these formulas in order to compute the intrinsic power an agent exercises over another due to communication.

Definition 3.3.1: An agent Aj exercises power over an agent Ai if, for a communication message received by Ai from Aj, UTIL(Si*) > UTIL(Si*') and UTIL(Si*') <= UTIL(Si*) for all plans pi', where Si*' ∈ pi' and Si* ∈ pi*, and pi' was generated after Ai revised its beliefs based on terrain data received from Aj, while pi* was generated without Ai revising its beliefs based on terrain data received from Aj.




Definition 3.3.2: The cost incurred by an agent Ai while exercising power over an agent Aj by means of a sent communication message Cij, denoted by function Cost(Ai,Aj,Cij), is

Cost(Ai,Aj,Cij) = UTIL(Si*) - UTIL(Si*')

where Si*' ∈ pi' and Si* ∈ pi*, and pi, pi' are plans generated by Ai.


By the original definition in [Brainov, Sandholm, 1998], where conflicts among multiple agents' plans were the mechanism for power, an agent might choose a plan that has a cost to itself, but a greater cost to the other agent, in order to gain power over the other agent. In our system, however, this cost is always 0, since sending a communication message will have no immediate effect on the plan generation of the sender (although eventually it could indirectly have costs to the agent if the communication message causes him to be distrusted). The receiving agent, however, may revise his beliefs to contain the terrain data communicated, which may damage that agent by leading him to generate a less beneficial plan.



Definition 3.3.3: The damage, denoted by a function Damage(Aj,Ai,Cij), inflicted by an agent Aj on an agent Ai by means of a communication message Cij, is

Damage(Aj,Ai,Cij) = UTIL(Si*) - UTIL(Si*')

where Si*' ∈ pi', where pi' is a plan generated by Ai in which Ai has not revised its active beliefs based on Cij, and Si* ∈ pi*, where pi* is a plan generated by Ai in which Ai has revised its active beliefs based on Cij.


Definition 3.3.4: Amount of power, denoted by function Power, is the total power an agent Aj exercises over an agent Ai by means of a communication message Cij received by Ai from Aj.

power(Ai,Aj,Cij) = Damage(Ai,Aj,Cij) - Cost(Ai,Aj,Cij)


It should be noted that an agent can (and often does) have negative power over another agent if the communication message is helpful (causes negative damage) to the addressee. Thus, while utilities range from 0 to 1.0, the amount of power exercised by an agent can range from -1.0 to 1.0. Positive power indicates that an agent is influencing another agent into acting against its strategy. In contrast, an agent exercising negative power over another agent has influenced the other agent into acting "super-normally". In other words, negative power exercised on an addressee enhances the addressee's normal behavior by providing more useful (desirable) information, allowing the agent to improve upon its original plan. Following this, if an agent A has x power over agent B, then B does not necessarily have -x power over A. In fact, B could exercise x power over A as well.



For an example scenario where one agent has power over another agent, consider a case where an agent A has generated a plan left->north->dig, where UTIL(S_A*) = .5 and S_A* is the possible world created if the agent performs the dig action. On the next turn, A receives a communication message from A' and decides to revise its beliefs using the received terrain data. Since the revised beliefs conflicted with the beliefs upon which the original plan was based, the agent generates a new plan based on its new beliefs: left->south->dig, where UTIL(S_A*') = .7. Unfortunately for A, the data received from A' was outdated, and UTIL(S_A*''), where S_A*'' is a state based on the actual terrain data rather than either agent's beliefs about terrain data, is only .3. Thus, A' inadvertently has .5 - .3 = .2 power over A at that moment.
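Reproducing the arithmetic of this scenario, a small sketch of the power computation of Definitions 3.3.2 through 3.3.4 (the utility values are the ones assumed in the example, and the sender's cost is 0 as discussed above):

def power_over(util_without_message, util_actual_outcome, cost=0.0):
    # power = damage - cost, where damage compares the plan the addressee would have
    # followed without the message against the actual outcome of the revised plan.
    damage = util_without_message - util_actual_outcome
    return damage - cost

print(power_over(0.5, 0.3))   # 0.2: A' exercises .2 power over A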


4. A Model of Deception

It is inevitable that multiple agents moving, planting, harvesting, and digging will have plan collisions that will keep one or all agents involved from achieving their goals. Since our agents have no negotiation abilities, these power struggles, with outcomes based on luck of circumstance, are unavoidable. An alternative to negotiation for dealing with such situations is deception. For instance, a selfish deceptive agent would be motivated to generate deceitful communication to influence others in an attempt to alleviate conflict with its own plans.


Generating deception involves two broad phases for the speaker: generation of deceit, and simulating its effect on the addressee. In the most trivial forms of deception, such as in the Venus flytrap, both phases have been developed through evolution and require no action on the part of the deceiver. In less trivial forms of deception, such as the planned deception of a human being by another human being, a web of deceit may be carefully developed and thought out by the deceiver. In such behavior, the deceiver carefully simulates the possible effects of deceit upon the addressee.


Inevitably, generating deception and simulating its effects results in recognizable deceptive traits in the deceiver, as noted in [Horvath, Jayne, and Buckley, 1993]. For instance, an honest (and innocent) suspect, when questioned, tends to be helpful, expects exoneration, displays resentment towards the guilty party, and appears both spontaneous and sincere. On the other hand, a selfishly motivated deceptive (and guilty) suspect provides less helpful information, shows inappropriate concern about being a suspect, lacks response spontaneity and sincerity, and uses both guarded and clearly edited verbal responses. These traits are most important for individuals coping with or defending themselves against the influence of intentionally deceptive individuals.


Unintentional deception is also common, especially in the Mars terraforming simulator. In the simulator, unintentional deception usually results from outdated beliefs: an agent communicates truthful information about terrain data, which is received and added to the beliefs of an addressee agent. If the addressee does not get a chance to verify its belief about this information using its sensors until after other agents have modified the information (which makes the agent's belief outdated), the information is perceived as intentionally deceitful by the addressee. Unintentional deception is unavoidable at all times during the game, but increases as the game goes on and agents' non-null beliefs about areas become increasingly erroneous. Unintentional deceit also increases due to the spreading of false "rumors" about terrain data obtained from deceptive agents. This deceitful information may originate from an untrustworthy agent, or from a well-meaning honest agent with outdated information. Since agents do not reveal the source of their data, however, an honest agent may appear to be untrustworthy due to the repetition of these rumors.


Definition 4.1: A method used by an agent to generate intentionally deceptive data to be sent in a communication message is denoted by deception device.


In nature, deception is dealt with by learning to recognize patterns of deceit used by others, and learning which other organisms are likely to use deception (trust). Similarly, an organism learns how to deceive by observing its own and others' failures/successes in using deception to influence others. In the simulator, for practical reasons, the learning has been replaced with hard-coded modeled rules used to detect common deceptive traits used by agents. These rules are based on terrain data patterns that maximize or minimize agents' utilities; in other words, patterns that manipulate or exploit agents' desires. These patterns are used in the generation of intentional deception in addition to usage in coping with deception.



Agents generate deceptive data recursively based on beliefs about other agents' beliefs, Bel(A,Bel(A')), the beliefs about other agents' desires, Bel(A,Des(A')), and the beliefs about other agents' intentions, Bel(A,Int(A')). Agents then rate this information using four different factors to estimate the amount of power that a message exercises. These factors include efficacy, the potential benefit if the addressee agent believes the deceptive communication message; plausibility, the potential acceptance by the addressee of the deceptive communication message into its beliefs; safety, the potential for the addressee to uncover deceitful utterances as false; and reliability, the potential for the source of the communicated information to be incorrect [Carofiglio and de Rosis, 2001].


Definition 4.2: The terrain data object concealed by use of a deception device is denoted by deception object.


Efficacy is the potential effectiveness of the communicated deceit once the addressee revises its beliefs to include the deceptive information. A deceiving agent A determines the efficacy of deceiving an addressee B with a deceptive communication message by simulating B's actions based on Bel(A, Des(B)), the beliefs of A about B's desires, and determining whether the deceit has the desired influence on the addressee agent.


Plausibility is the potential acceptance by B of a deceptive communication message sent by A. In order to determine plausibility, the deceiving agent attempts to simulate this believability by modeling the trust that the addressee has for the deceiver, and how believable the deception itself appears to be.




Safety is the risk of A's deceptive information being discovered as false by B. In other words, safety is future plausibility. Our plausibility function handles both safety and plausibility for agents.


Reliability is addressed through the trust mechanism (see Trust, Section 4.3.1).
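To illustrate how these factors might be combined when rating a candidate message, the sketch below assigns each factor a numeric estimate and takes a weighted sum. The class name, the weights, and the linear combination are illustrative assumptions only; they are not the rating scheme implemented in the simulator.

    # Illustrative sketch only: combine the four rating factors into one score
    # for a candidate deceptive message. Names and weights are assumed.
    from dataclasses import dataclass

    @dataclass
    class MessageRating:
        efficacy: float      # expected benefit if the addressee believes the message
        plausibility: float  # likelihood the addressee accepts the message now
        safety: float        # likelihood the deceit is not uncovered later
        reliability: float   # confidence that the underlying data is correct

    def score(r: MessageRating, w_eff=0.5, w_plaus=0.2, w_safe=0.2, w_rel=0.1):
        # A deceiver would prefer the candidate message with the highest score.
        return (w_eff * r.efficacy + w_plaus * r.plausibility
                + w_safe * r.safety + w_rel * r.reliability)

    # Example: a blatant lie with high efficacy but low safety.
    print(score(MessageRating(efficacy=0.9, plausibility=0.4, safety=0.2, reliability=1.0)))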

4.1 A Dynamic Generation of Intentional Deception


In military operations, the following steps are used in procedures for generating deception: (1) Situation analysis determines the current and projected enemy and friendly situation, develops target analysis, and anticipates a desired situation. (2) Deception objectives are formed from the desired enemy action or non-action as it relates to the desired situation and the friendly force objectives. (3) Desired [target] perceptions are developed as a means to generating enemy action or inaction, based on what the enemy now perceives and what it would have to perceive in order to act, or fail to act, as desired. (4) The information to be conveyed to or kept from the enemy is planned as a story or sequence, including the development and analysis of options. (5) A deception plan is created to convey the deception story to the enemy [Field Manual, 1998]. Based on these steps, a computer algorithm for an agent A to generate a deceptive utterance dynamically could be organized as follows. Note that this algorithm does not deal with collaborative deception [Donaldson, 1994], as our agents only collaborate in that they share information useful to one another.






1. Estimate the intentions of B.

2. If B's intentions do not conflict with A's goals, set B to a different agent B' and go to 1.

3. Generate a deceptive communication message (or messages).

4. Revise Bel(A,Bel(B)).

5. Estimate the new intentions of B.

6. If the new intentions of B appear to achieve the desired influence, then end; else go to 1.
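A minimal sketch of this loop in code is shown below. The agent representation, the toy intention estimate, and the direct-lie message format are assumptions made for illustration; the simulator's actual state-space and plan generation are far richer. In this toy version the "desired influence" is simply that the addressee no longer intends to act on the deceiver's goal square.

    # Illustrative sketch of the dynamic deception-generation loop above.
    # All data structures and helper logic are toy assumptions, not the
    # simulator's actual code.
    def estimate_intention(beliefs):
        # Toy model: an agent intends to act on the square it believes has the
        # highest elevation (a stand-in for full state-space/plan simulation).
        return max(beliefs, key=beliefs.get) if beliefs else None

    def generate_deception(deceiver_goal, beliefs_about_others):
        # beliefs_about_others: agent name -> (square -> believed elevation),
        # i.e. A's model Bel(A, Bel(B)) for each other agent B.
        for B, beliefs in beliefs_about_others.items():          # steps 1-2: find a threat
            if estimate_intention(beliefs) != deceiver_goal:
                continue                                          # no conflict: try another agent
            message = {deceiver_goal: 0}                          # step 3: claim the square is worthless
            revised = {**beliefs, **message}                      # step 4: revise Bel(A, Bel(B))
            if estimate_intention(revised) != deceiver_goal:      # steps 5-6: did it work?
                return B, message
        return None, None

    # Example: A wants square (2, 2); B is believed to intend to dig there as well.
    addressee, msg = generate_deception((2, 2), {"B": {(2, 2): 9, (0, 1): 4}})
    print(addressee, msg)   # -> B {(2, 2): 0}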


In our system, agents would go about performing these steps as follows.


1. Estimate the addressee's intentions

In order to perform this step, a deceiver A estimates the current intentions of a potential addressee (or addressees) by generating a state space and plan based on its beliefs about B's beliefs (which are based on the communication messages the deceiver has received from the addressee in the past) and its beliefs about the addressee's desires (in our system, all agents believe that all other agents have the same desires as themselves, which is not necessarily true).


2. Estimate potential for plan conflict with addressee's estimated plan

To perform this step, a deceptive agent simulates both its own plan and the addressee's plan. If actions by the addressee could potentially disrupt the deceiver's plan, then B will be selected by A to be the addressee of a deceptive communication message, with A's intent to coerce B into choosing a plan that does not conflict with A's goals. This routine could be performed on every other agent in order to estimate which agent poses the highest threat to A's goals.


3. Generate a deceptive communication message

One way for A to perform this step would be to generate all possible communication messages and iteratively estimate their effects. However, even with our system's rudimentary agent communication language this would be computationally infeasible, and with richer languages it would be nearly impossible. In our domain, for example, there are 11^(m*n) different possible states for the elevation matrix in terrain data (10 different heights for each square on the grid, plus the null height), numberofagents^(m*n) different possible states for agent ownership of squares on the grid, and 3^(m*n) different possible states for harvest information (a square is either harvestable, needs to be planted, or cannot be changed), for a sub-total of (11*numberofagents*3)^(m*n) possible states of the terrain data.
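For concreteness, this count can be computed directly; even a very small grid yields a space far beyond exhaustive enumeration. The grid dimensions and agent count in the sketch below are arbitrary illustrative values, not the simulator's parameters.

    # Number of possible terrain-data states: (11 * numberofagents * 3) ^ (m * n).
    def terrain_state_count(m, n, number_of_agents):
        elevation = 11 ** (m * n)                 # 10 heights plus the null height per square
        ownership = number_of_agents ** (m * n)   # which agent owns each square
        harvest = 3 ** (m * n)                    # harvestable / needs planting / unchangeable
        return elevation * ownership * harvest    # = (11 * number_of_agents * 3) ** (m * n)

    print(terrain_state_count(4, 4, 4))           # 132**16, roughly 8.5e33 states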


Fortunately, there is an effective alternative. A can generate communication messages using its beliefs about B's desires. For instance, if B decides to dig in a square <x,y>, A can generate a communication message containing terrain data that, if believed by B, would cause another square <x',y'> to look even more desirable. A can easily use its knowledge of other agents' strategies to make all of its own desired objects appear less attractive to B, while making other, undesired objects appear very attractive to B. For instance, based on the various utility functions, A knows that only objects within a certain physical distance of B would be desired by B. A then need only take its beliefs about B's beliefs about these objects and modify them to achieve the desired effect.
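The following sketch illustrates this desire-driven approach: the deceiver lowers the believed value of its own desired square and raises a decoy square just enough to dominate B's choice. The functions craft_message and believed_utility, the toy utility model, and the data layout are assumptions for illustration and do not correspond to the simulator's UTILdig.

    # Illustrative sketch: craft terrain data that makes a decoy square look more
    # attractive to B than the square A actually wants.
    def believed_utility(elevation, distance):
        # Toy stand-in for B's utility: prefer high squares close to B.
        return elevation - distance

    def craft_message(believed_elevations, b_position, desired_square, decoy_square):
        # Return altered terrain data that, if believed, steers B toward the decoy.
        def dist(square):
            return abs(square[0] - b_position[0]) + abs(square[1] - b_position[1])
        message = dict(believed_elevations)
        message[desired_square] = 0      # make A's desired square look worthless
        # Raise the decoy just enough to dominate every other square B might choose.
        best_other = max(believed_utility(e, dist(s))
                         for s, e in message.items() if s != decoy_square)
        message[decoy_square] = best_other + dist(decoy_square) + 1
        return message

    beliefs = {(2, 2): 9, (5, 5): 3, (1, 4): 5}   # A's model of B's elevation beliefs
    print(craft_message(beliefs, b_position=(2, 3), desired_square=(2, 2), decoy_square=(5, 5)))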


4. Revise Bel(A,Bel(B)), and 5. Estimate B's new intentions

A simply revises its model of B's beliefs as if B had received the potential communication message generated in step 3. Now that A has an image of what B's future beliefs might be, it may generate a state space and plan for B to determine B's future intentions. Agent A also estimates how agent B will modify its trust for A, Bel(A, Trust(B,A)). A wishes to minimize this loss of trust in order to maximize plausibility and safety, while also maximizing efficacy.


4.2 Deception Generation Heuristics


Obviously, the time complexity involved in the dynamic computation of deception is far too unwieldy, even for agents with rudimentary communication ability in a simple domain. In a natural environment it would be impossible to even consider such a method for generating deception. For this reason, the Mars terraforming agents have been programmed with heuristics for trivial generation and detection of deception. The heuristics are not optimal, as deceit may often be used when it is not necessary or not effective, but the payoff in simulator execution time is well worth the loss. Although domain specific and ad hoc, this model proved effective for generating useful results.


Agents' heuristics for generating deceptive utterances produce terrain data patterns that are known to be rated highly by the utility functions used by all agents (such as UTILdig, UTILharvest, etc.). In other words, deceptive agents attempt to ascribe beliefs to the addressee agent that would maximize the addressee's utility functions, causing the addressee to modify its goal so that the likelihood of plan collision is reduced.
collision is reduced.



Deceivers choose which agent should be the recipient of a deceptive communication using a different function from that used by honest agents (function whocomm, Definition 3.2.5), since a deceiver's goal in communication is selfishly motivated, while an honest agent's goal in communication is reciprocal. Consequently, deceivers are much more open communicators than the honest agents. The following definitions describe the implemented methods for choosing whom to deceive, and how to deceive.


A selfish deceiving agent A's deception object is located at A's possible physical location in the possible world that is rated highest (by function UTIL) in A's plan P:

<x(A), y(A)>, where A ∈ S*, S* = max{Util(Si)}, Si ∈ P.


Definition 4.2.1: The x-axis location of an agent in the possible world highest rated by function UTIL that is located in the current plan is denoted by goalx.

Definition 4.2.2: The y-axis position of an agent in the possible world highest rated by function UTIL that is located in the current plan is denoted by goaly.







Definition 4.2.3: The function that determines which agent should be deceived, denoted by function whotodeceive(A), is

whotodeceive(A) = min over all A' of { |goalx - Bel(A,x(A'))| + |goaly - Bel(A,y(A'))| }

where Bel(A, x(A'), y(A')) is not null. In other words, the agent with the minimum distance from the deception object is considered to be the agent most likely to interfere with the deceptive agent's plan. Note that it does not matter to the deceiver whether the addressee is trusted or not.
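A minimal sketch of this selection rule, assuming believed positions are stored in a dictionary and null beliefs are represented by missing entries:

    # Illustrative sketch of whotodeceive: pick the agent believed to be closest
    # (Manhattan distance) to the deception object. Data layout is assumed.
    def who_to_deceive(goalx, goaly, believed_positions):
        # believed_positions: agent name -> (x, y), from Bel(A, x(A'), y(A'));
        # agents whose positions are null (unknown) are simply absent.
        if not believed_positions:
            return None
        return min(believed_positions,
                   key=lambda a: abs(goalx - believed_positions[a][0])
                               + abs(goaly - believed_positions[a][1]))

    # Example: the deception object is at (3, 4); agent "B" is believed closest.
    print(who_to_deceive(3, 4, {"B": (2, 4), "C": (7, 1)}))   # -> B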


4.2.1 Deception Devices

There are four deception devices used to generate messages. All of these devices attempt to conceal desired terrain data features from the addressee, by either making a desired object appear less desirable or making a different object appear more desirable. These four devices are described as follows.


Direct Deception: these deceptive devices are blatant lies that typically have low plausibility and safety but high efficacy.




Lie to make areas appear undiggable: this device generates the following deceptive utterance: "The dig action cannot be performed in square <X,Y> or its block-adjacent squares", where in reality square <X,Y> is considered by the deceiver to be ideal for performing a dig action from. This deception device has high safety, since an addressee under its influence is steered away from the object of deception.



Lie to make areas appear diggable: this device generates the following deceptive utterance: "Square <X,Y> is the most beneficial square for you to perform a dig action from". This device allows a deceptive agent to lure the addressee away from a desired square or navigation path without having to conceal the desired object. Unfortunately, it has the disadvantage of decreased safety, since the addressee is being led directly to the deception object and thus toward the discovery that it has been tricked. The deceiver can hide this by changing its story once it has achieved the goal of distracting the addressee, but the deceit may have been discovered by then.





Lie to make areas appear owned: this device generates the following deceptive utterance: "Area <X1,Y1> .. <Xn,Yn> is owned by me/another robot other than you or me, and is ready to be harvested". This is similar to Lie to make areas appear undiggable in that it conceals squares desired by the deceiver behind squares that the deceiver believes are not desired by the addressee. This device has the potential to conceal a large area, which increases the likelihood that the addressee will avoid that area (hence higher efficacy). However, if larger areas are used as the deception object, then plausibility and safety decrease.




Passive Deception: a safer mechanism in which agents play dumb or withhold information, deceiving by claiming not to have knowledge in order to conceal desired resources.




Withhold information: this device generates the following deceptive utterance: "I have no information about square(s) <X1,Y1> .. <Xn,Yn>". This deception device is useful when the deceiver has information about, and desires, square <Xi,Yi>. It has the disadvantage of decreased plausibility since, as mentioned in section 2.1.1, agents do not revise beliefs with null beliefs. Plausibility will be considered high, however, if the addressee is believed to have null beliefs about the deception object.
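To make the devices concrete, the sketch below renders each of the four utterance types as a simple message constructor. The message format and field names are assumptions for illustration only and are not the simulator's communication language.

    # Illustrative sketch: the four deception devices as message constructors.
    def lie_undiggable(square):
        return {"device": "undiggable", "squares": [square],
                "claim": "dig cannot be performed here or in adjacent squares"}

    def lie_diggable(square):
        return {"device": "diggable", "squares": [square],
                "claim": "this is the most beneficial square for you to dig from"}

    def lie_owned(area, owner="me"):
        return {"device": "owned", "squares": list(area),
                "claim": f"this area is owned by {owner} and is ready to be harvested"}

    def withhold_information(area):
        return {"device": "withhold", "squares": list(area),
                "claim": "I have no information about these squares"}

    # Example: conceal the square the deceiver wants to dig at (2, 2).
    print(lie_undiggable((2, 2)))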


4.2.2 Simulating Deception

In [de Rosis and Carofiglio, 2001], Bayesian networks were used to represent beliefs, allowing the development of domain-independent plausibility and efficacy functions, while domain-dependent functions mapped the probabilities onto the nodes in the belief networks. In the Mars terraforming agents simulator, however, beliefs reflect the domain itself; it is therefore convenient to describe efficacy and plausibility functions directly in terms of te