Artificial Agents Play the "Mad Mex Trust Game": A Computational Approach¹

Wu, D.J., S. Kimbrough and F. Zhong


Abstract

We investigate the "Mad Mex Trust Game," which cannot easily be represented in strategic form. In outline, the game is as follows. N players of various types are free to negotiate with each other. The players and their types are identified and known to the other players. Players of a given type produce a particular commodity at a certain rate. The well-being of the individual players depends upon having a mixture of commodities; hence the players have an incentive to negotiate trades with players of other types. After arriving at an agreement, there is a fulfillment stage. Players are free to renege on their agreements, and players are able to remember who has reneged and who hasn't. Will cooperative behavior emerge, and under what conditions? What are some of the efficient and effective mechanisms for trust building in electronic markets? How will these mechanisms affect the emergence of trust and cooperative behavior? What are the key ingredients in building distributed trust, and what destroys trust? This game constitutes a more realistic model of negotiation support systems in electronic markets, particularly on the Internet.












¹ This material is based upon work supported by, or in part by, DARPA contract DASW01-97-K-0007. File: Trust_HICSS35_J22.doc. Partial support by a mini-Summer research grant and a research fellowship from the Safeguard Scientifics Center for Electronic Commerce Management, LeBow College of Business at Drexel University is gratefully acknowledged. The corresponding author is D.J. Wu; his current address is: 101 North 33rd Street, Academic Building, Philadelphia, PA 19104. Email: wudj@drexel.edu. The Java code we developed in this study can be downloaded by interested readers from www.lebow.drexel.edu/wu300/trust_game/.




1. Introduction

An important aspect of electronic commerce is that often it is not trusted (Tan and Thoen 2001), since it is often difficult for a user to figure out whom to trust in online communities (Dasgupta 1988; Schillo and Funk 1999). Recently, much interest has developed among researchers in how to build trust in electronic markets operating in such environments as the Internet. The literature has approached the study of trust with various well-defined "trust games."

Basically, there are two versions of the trust game: the classical trust game in economics, or the investment game (Lahno 1995; Berg et al. 1994; Erev and Roth 1998), and the "Electronic Commerce Trust Game" or the "Mad Mex Trust Game"². The former is well studied in the economics literature and is regarded as revealing and of fundamental importance in social interaction and knowledge management, as fundamental as the prisoner's dilemma game (Hardin 1982; Lahno 1995). The latter is due to Kimbrough and Tan, where the players exchange goods such as red sauce or green sauce rather than money. In this paper, we focus on the Mad Mex Trust Game using artificial agents. We leave the Economics Trust Game to a subsequent paper, where we plan to use the agent-based approach as well. Will trust/cooperative behavior emerge? If so, under what conditions (and when and how)? Put another way, what are the conditions that promote trust/distrust? How do we explain/reveal/understand the behavior of agents (what they are doing, and why they are doing what they are doing)? The ultimate goals are to study the effects of markets, characteristics of markets, and market mechanisms associated with systems of artificial agents.





² After its place of conception, a restaurant near the University of Pennsylvania, by Kimbrough and Tan.



The contributions of this paper lie in the integration of several strands of the research literature: the trust literature (we focus here on the computational approach to social trust) and the electronic communities and electronic markets literature (we focus here on what kinds of market mechanisms would facilitate trust and cooperation, and what kinds would disrupt them).



The rest of the paper is organized as follows. Section 2 provides a brief literature review. Section 3 outlines our key research methodologies and implementation details. Section 4 reports our experimental findings for two agents. Section 5 reports further experiments where an additional player, a third agent, has been introduced. Section 6 summarizes and discusses future research.


2. Literature Review

There are roughly two major streams in the trust literature. One stream is interested in developing trust technology (for example, security technology such as password setting or digital watermarking). Representative work can be found in the recent special section of CACM (December 2000). The second stream focuses on social trust (Shapiro 1987) and the work on social capital (e.g., Uslaner 2000). Our interest is in the latter; e.g., an online trader can well have access to the trading system but not cooperate with the other online traders due to self-interest. In particular, we are interested in trust based on cooperation (Güth et al. 1997), i.e., social trust viewed as cooperative behavior.


Methodologically, there are several approaches to the study of trust, illustrating a broad interest from several disciplines in the social sciences. These include the behavioral approach (e.g., Das and Teng 1998; Mayer, Davis and Schoorman 1995); the philosophical and logical approach (e.g., Tan 2000; Tan and Thoen 2001); the computer science approach (e.g., Holland and Lockett 1998; Zacharia, Moukas and Maes 1999); the sociology approach (e.g., Shapiro 1987); the psychology approach (e.g., Güth et al. 1997); the classical economics approach (e.g., Ellison 1993); and the experimental economics approach (e.g., Engle-Warnick 2000; Erev and Roth 1998; Sundali, Israeli and Janicki 2000). In this paper, we use an interdisciplinary approach that integrates the economics and computer science approaches: the computational economics approach. Details of our methodology and framework are provided later, in Section 3. We now define our specific trust game.


As mentioned earlier, there are several versions of the trust game. The following is our version of the game known in the literature as the investment game. There are two players, the principal and the agent. The principal has some amount of money to invest, say x, so he hires an agent (she) to do this for him. The agent, in turn, gives the money to an investor or a broker, who invests the money in the market and truthfully reports to the agent on the return of the investment, say 3x. The agent then decides how to split the profit with the principal. The game is played repeatedly, i.e., the principal has the choice of whether or not to hire the agent again. Under some regularity conditions, it has been shown in the literature that trust can be built if the game is played repeatedly (Lahno 1995).


In the Mad Mex Trust Game, the money is replaced with goods. This game cannot easily be represented in strategic form. In outline, the game is as follows. N players of various types are free to negotiate with each other. The players and their types are identified and known to the other players. Players of a given type produce a particular commodity at a certain rate. The well-being of the individual players depends upon having a mixture of commodities; hence the players have an incentive to negotiate trades with players of other types. After arriving at an agreement, there is a fulfillment stage. Players are free to renege on their agreements, and players are able to remember who has reneged and who hasn't.


We now describe our research framework and methodology in more detail.


3. Methodology and Implementations

In our framework, artificial agents are modeled as finite automata (Hopcroft and Ullman 1979). This framework has been adopted by a number of previous investigations. Among them, Rubinstein (1986), Sandholm and Crites (1995), Miller (1996), and many others used it to study the iterated prisoner's dilemma (IPD). Kimbrough, Wu and Zhong (2001a) used it to study the MIT "Beer Game," where genetic-learning artificial agents played the game and managed a linear supply chain. Wu and Sun (2001a, b) investigated the electronic-market off-equilibrium behavior of artificial agents in a price and capacity bidding game using genetic algorithms (Holland 1992). Arthur et al. (1996) modeled a realistic stock marketplace composed of various genetic-learning agents. Zhong, Kimbrough and Wu (2001) studied the ultimatum game using reinforcement-learning agents. These are merely examples to illustrate the acceptance of this framework in the literature. The reader is referred to Kimbrough and Wu (2001) for a survey.


In this study, we depart from previous research by integrating several strands of approaches. First, we study a different game, namely the Mad Mex Trust Game, rather than games such as the IPD, Beer, Bidding, or Ultimatum games. Second, in studying this game, we use a computational/evolutionary approach with artificial agents, in comparison with classical or behavioral game-theoretic approaches. Third, our agents use a reinforcement learning regime (Sutton and Barto 1998), Q-learning, as the learning mechanism in game playing. Previous studies of the trust game are not computational (with the exception of Zacharia et al., who employed a reputation rating mechanism). Finally, our agents are identity-centric rather than strategy-centric as in previous studies (e.g., Kimbrough, Wu and Zhong 2001a; Wu and Sun 2001a, b). That is, our agents may meaningfully be said to have individual identities and behavior. They are not just naked strategies that play and succeed or fail. Individuals, rather than populations as a whole, learn and adapt over time and with experience. Fielding these kinds of agents, we believe, is needed for e-commerce applications.



We now describe our model and prototype implementations in more detail within the framework of Q-learning. This includes how the rules or state-action pairs are embedded in our artificial agents, how the rewards were set up, what the long-term goal of the game (the returns) is, and finally the specific Q-learning algorithm we designed.


Rules (State-Action pairs):

The Q-learning algorithm estimates the values of state-action pairs Q(s, a). At each decision point, the state of an agent is determined by all the information in its memory history, e.g., its own and its opponent's last trade volumes. The possible action an agent can take at this decision point is any number between zero and its endowment. In this sense, the agent's strategy is the mapping from its memory of the last iteration to its current action. To balance exploration and exploitation, we use the ε-greedy method, which chooses an action at random with small probability ε. The value of ε starts from 0.3 and then decreases to 0.01 in steps of 0.000001.
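As an illustration, the following is a minimal sketch of ε-greedy action selection with this decay schedule. This is our own rendering for exposition, not the authors' released Java code; the class and method names are illustrative assumptions.

    import java.util.Random;

    /** Sketch: epsilon-greedy action selection with a linearly decaying epsilon. */
    public class EpsilonGreedy {
        private double epsilon = 0.3;                    // initial exploration rate (paper's value)
        private static final double EPSILON_MIN = 0.01;  // floor (paper's value)
        private static final double DECAY_STEP = 0.000001;
        private final Random rng = new Random();

        /**
         * Pick an action in {0, ..., endowment}, where qValues[a] is the
         * current estimate Q(s, a) for the agent's current state s.
         */
        public int selectAction(double[] qValues, int endowment) {
            int action;
            if (rng.nextDouble() < epsilon) {
                action = rng.nextInt(endowment + 1);     // explore: random trade volume
            } else {
                action = 0;                              // exploit: greedy action
                for (int a = 1; a <= endowment; a++) {
                    if (qValues[a] > qValues[action]) action = a;
                }
            }
            epsilon = Math.max(EPSILON_MIN, epsilon - DECAY_STEP);  // decay schedule
            return action;
        }
    }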

Rewards:

The instant reward an agent can get is determined by a modified Cobb-Douglas function of the mixture of the amounts of the different types of sauces the agent possesses after each episode:

    U = ∏_i a_i^(1/n),

where n is the number of types of commodities in the market and a_i is the amount of commodity i the agent holds. We chose this Cobb-Douglas utility function for our simulation because commodities A and B have equal weight for the agents.
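For concreteness, here is a minimal sketch of this reward computation, assuming the agent's holdings are passed in as an array; the names are illustrative, not from the authors' code.

    /** Sketch: Cobb-Douglas utility U = prod_i a_i^(1/n) over an agent's holdings. */
    public final class CobbDouglas {
        /**
         * @param amounts amounts[i] is the agent's holding of commodity i after the episode
         * @return the episode reward U
         */
        public static double utility(double[] amounts) {
            int n = amounts.length;         // number of commodity types in the market
            double u = 1.0;
            for (double a : amounts) {
                u *= Math.pow(a, 1.0 / n);  // equal weights: each exponent is 1/n
            }
            return u;                       // zero whenever any commodity is missing
        }

        public static void main(String[] args) {
            // Two commodities, holdings of 3 units each: U = (3 * 3)^(1/2) ≈ 3.0.
            System.out.println(utility(new double[]{3.0, 3.0}));
        }
    }

Note that, since the 1/n root is a monotone transformation, maximizing U is equivalent to maximizing the raw product of holdings; the utility traces reported in Section 4 (e.g., values of 6 and 9) appear to be on that raw-product scale.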


Returns:

The long-run return is simply the total utility an agent has obtained from playing every episode so far:

    R = ∑_i U_i,

where U_i is the utility obtained in episode i. The goal of an agent at any iteration is to select actions that will maximize its discounted long-term return, following certain policies. The use of Q-learning ensures that the artificial agents are non-myopic.

Q-learning:

The learning algorithm used by the artificial agents is one-step Q-learning, described as follows:

    Initialize Q(s, a) to zero
    Repeat
        From the current state s, select an action a using the ε-greedy method
        Take action a; observe an immediate reward r and the next state s'
        Q(s, a) = Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)]
        s ← s'
    Until the end of a trial


The experiment runs with learning rate α = 0.05 and discount factor γ = 0.95. The values of α and γ are chosen to promote the learning of cooperation.
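The core backup step can be sketched compactly as below; this is our illustrative rendering of the update above, not the authors' released code. The 16 states correspond to the two remembered trade volumes (each in {0, ..., 3}) and the 4 actions to the possible trade volumes, giving the 4 * 4 * 4 = 64 state-action pairs of Section 4.

    /** Sketch: the one-step Q-learning backup with the paper's parameter values. */
    public final class QUpdate {
        static final double ALPHA = 0.05;   // learning rate
        static final double GAMMA = 0.95;   // discount factor

        /** Apply Q(s,a) = Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) in place. */
        static void backup(double[][] q, int s, int a, double r, int sNext) {
            double maxNext = q[sNext][0];
            for (double v : q[sNext]) maxNext = Math.max(maxNext, v);
            q[s][a] += ALPHA * (r + GAMMA * maxNext - q[s][a]);
        }

        public static void main(String[] args) {
            double[][] q = new double[16][4];  // 16 states x 4 actions = 64 pairs
            backup(q, 0, 3, 6.0, 1);           // one hypothetical transition with reward 6
            System.out.println(q[0][3]);       // prints 0.3 = 0.05 * 6
        }
    }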

4. Two-Agent Experiment Design and Results

We compare the following trust mechanisms: (1) moving average of the past five observations; (2) exponential smoothing; (3) Zacharia-Moukas-Maes reputation rating; (4) tit-for-tat; and (5) most recent move (or last move). We are interested in seeing whether, under each of the above five mechanisms, agents will trust each other and, if so, whether cooperative behavior will converge. By comparing these five mechanisms, we are interested to see which mechanism does better in terms of building trust and promoting social welfare.


Experiment one: Two Q-learner agents play against each other

Two learning agents play the iterated trust game. In each iteration, both agents start with the same amount of endowment. Player A first offers its commodity to player B; upon receiving the commodity, player B decides how much of his commodity he will trade. Since there is no third-party agent to handle the transaction, the exact trade volume is known to both parties, and thus there is no asymmetric information. The whole experiment is one long trial, i.e., the two artificial agents play the game an indefinite number of times.


To test the validity of the experiment and to analyze the results, the endowment is set rather small, at 3, so that each agent has 4 * 4 * 4 = 64 state-action pairs. Based on the utility function, over long iterations one agent learns to switch its trade volume between 3 and 0, while the other agent learns to switch between 0 and 2 correspondingly. This can be illustrated as:

    Agent A (trade volume): 3 0 3 0 3 0 …
            (utility):      0 6 0 6 0 6 …
    Agent B (trade volume): 0 2 0 2 0 2 …
            (utility):      9 0 9 0 9 0 …



Thus the utility of the first agent alternates between 0 and 6, which gives it an average of 3; the second agent gets either 9 or 0 in turn, which gives it 4.5 on average. Although this is still not as good as the following outcome, which would give both agents a utility of 4.5 on average, it is better than sticking to trading 1 unit or 2 units all the time.

    Agent A (trade volume): 3 0 3 0 3 0 …
            (utility):      0 9 0 9 0 9 …
    Agent B (trade volume): 0 3 0 3 0 3 …
            (utility):      9 0 9 0 9 0 …


Experiment two: Two Q-learner agents with a reputation index play against each other

Experiment two includes a set of sub-experiments to test the efficiency of different reputation mechanisms. At the end of any time period t, both agents rate each other. The rating given to one's opponent, r_i', is simply the ratio of the opponent's trade volume V_i' to the endowment N:

    r_i' = V_i' / N.

The reputation index is then updated based on this rating according to the different mechanisms. The value of the reputation index is normalized in each mechanism, so that 1 is perfect and 0 is terrible. Now the strategies of each agent are the mappings from the reputation information to possible actions. We specifically test the following four reputation mechanisms.


1. Moving Average

The value of the reputation index is simply the arithmetic average of the most recent five ratings.



2. Exponential Smoothing

The reputation index is the weighted average of the past ratings.
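As a sketch, both of these indices (the moving average of mechanism 1 and the smoothing of mechanism 2) can be maintained incrementally, as below. The window of five is from the paper; the smoothing weight w is an illustrative assumption, since the paper does not report the value it used.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** Sketch: two of the reputation indices, fed with ratings r = V'/N in [0, 1]. */
    public class ReputationIndices {
        private final Deque<Double> lastFive = new ArrayDeque<>();  // moving-average window
        private double smoothed = 1.0;   // exponential-smoothing index, started optimistically
        private final double w = 0.2;    // smoothing weight: illustrative, not from the paper

        /** Mechanism 1: arithmetic average of the most recent five ratings. */
        public double movingAverage(double rating) {
            lastFive.addLast(rating);
            if (lastFive.size() > 5) lastFive.removeFirst();
            double sum = 0.0;
            for (double r : lastFive) sum += r;
            return sum / lastFive.size();
        }

        /** Mechanism 2: exponentially weighted average of past ratings. */
        public double exponentialSmoothing(double rating) {
            smoothed = w * rating + (1.0 - w) * smoothed;
            return smoothed;
        }
    }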




3. Reputation Rating (Zacharia-Moukas-Maes)

Introduced by Zacharia, Moukas and Maes (1999): every agent's reputation index is updated after each iteration based on its reputation value in the last iteration, R_{t-1}, its counterpart's reputation, R', and the rating it received for this iteration, W_t. The recursive estimate of the reputation value of an agent at time t can be expressed by:

    R_t = R_{t-1} + (1/θ) Φ(R_{t-1}) R' (W_t − E_t),
    Φ(R) = 1 − 1 / (1 + e^(−(R − D)/σ)),
    E_t = R_{t-1} / D,
    W_t = V_t' / N,

where V_t' is the trade volume of the agent's counterpart and D is the range of the reputation values, which is 1 here.
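The following sketch implements this recursive update. The damping factor θ and sigmoid width σ are free parameters of the mechanism whose values the paper does not report, so the settings below are illustrative assumptions only.

    /** Sketch: Zacharia-Moukas-Maes recursive reputation update. */
    public final class ZmmReputation {
        static final double D = 1.0;        // range of reputation values (paper: D = 1)
        static final double THETA = 10.0;   // damping factor: illustrative assumption
        static final double SIGMA = 0.25;   // sigmoid width: illustrative assumption

        /** Phi(R) = 1 - 1 / (1 + exp(-(R - D) / sigma)). */
        static double phi(double r) {
            return 1.0 - 1.0 / (1.0 + Math.exp(-(r - D) / SIGMA));
        }

        /**
         * One update step: R_t = R_{t-1} + (1/theta) * Phi(R_{t-1}) * R' * (W_t - E_t),
         * with E_t = R_{t-1} / D and W_t = V'_t / N.
         */
        static double update(double rPrev, double rOther, double tradeVolume, double endowment) {
            double w = tradeVolume / endowment;   // rating received this iteration
            double e = rPrev / D;                 // expected rating
            return rPrev + (1.0 / THETA) * phi(rPrev) * rOther * (w - e);
        }

        public static void main(String[] args) {
            // Counterpart with reputation 0.5 trades 2 of 3 units against a prior of 0.4.
            System.out.println(update(0.4, 0.5, 2.0, 3.0));
        }
    }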

4. Tit-for-tat

Under this mechanism, an agent trades the amount that its counterpart traded to it in the last time period, i.e., V_t = V_{t-1}'.


5. Performance Comparison of the Various Mechanisms

The total utility of each agent under the different reputation mechanisms is compared in Figures 1 and 2. Furthermore, the joint utility of both agents under the different reputation mechanisms is compared in Figure 3.

-------------------------------------------
Insert Figure 1 here
-------------------------------------------

-------------------------------------------
Insert Figure 2 here
-------------------------------------------

-------------------------------------------
Insert Figure 3 here
-------------------------------------------

Furthermore, we assign different reputation mechanisms to the two agents and compare the total points after 300,000 iterations. In the following set of figures, the performance of each reputation mechanism vs. the other mechanisms is described.

-------------------------------------------
Insert Figures 4a-4e here
-------------------------------------------




5. Three-Agent Experiment Design and Results

Three agents selling two goods

In this experiment, three agents trade two types of goods, i.e., agents B and C produce the same type of good, while agent A produces a different type. At the beginning of each episode, agent A chooses the agent with the higher reputation value from agents B and C to give his goods to. The reputations of the chosen agent and of agent A are updated after each episode. All three agents are assigned the same reputation mechanism. We test the performance of the different reputation mechanisms: moving average, last move, exponential smoothing, and Zacharia-Moukas-Maes reputation rating. Figure 5 displays the total utility of agent A under these mechanisms; the experiment shows that the "most recent" reputation mechanism quickly wins out against the others, except tit-for-tat. Here, if all agents are using tit-for-tat, then obviously all agents would cooperate and each agent would achieve its best performance.
performance.

-------------------------------------------
Insert Figure 5 here
-------------------------------------------


We now study the impact of an additional trade partner by comparing the total utility of agent A in the two-agent and three-agent contexts. Not surprisingly, agent A benefits from the introduction of the third player, as there is now competition between agents B and C. The results are summarized in Table 1.

Table 1: Total utility of agent A in the 2-agent and 3-agent contexts.

                            Three agents    Two agents
    Moving Average          506829.6        337441
    Most Recent Move        577475.8        279490.9
    Exponential Smoothing   423258.9        449496.1
    Zacharia-Moukas-Maes    452451.9        449606.6



What if A uses a different, fixed reputation mechanism such as tit-for-tat, while agents B and C use another, identical, reputation mechanism? Will agent A benefit from such differentiation? The results are somewhat mixed and show that the performance depends on what the others are using, as summarized in Table 2.

Table 2: Total utility of agent A with the tit-for-tat strategy playing against other reputation mechanisms in two-agent and three-agent environments.

                            Two agents      Three agents
    Moving Average          413824.6        418158.5
    Most Recent Move        423078.2        389489.9
    Exponential Smoothing   394672.8        429542.0
    Zacharia-Moukas-Maes    423752.8        382129.0


Three agents selling three goods

We now let agent C sell a third type of sauce, i.e., we consider the general case where each agent has a different good. The endowment of each agent at each period/episode is set to 3, reflecting a steady-state production rate for each agent (i.e., each agent can produce a fixed amount of goods during a period). We now describe the trade game. At the beginning of each episode, each agent decides simultaneously how many goods to give to the other two agents, expecting an exchange from them. It turns out that the system can quickly become too complicated to be tractable, even in the two-agent or three-agent learning situation (Sandholm and Crites 1995; Marimon, McGrattan and Sargent 1990). In our setting, it would be very difficult for the Q-learning agents to learn the true values of the state-action functions. In this initial exploration, we therefore start with one agent (A) learning while fixing the strategies of the other two agents (B and C). We note here in passing that we leave the case of three agents learning simultaneously for future research.



We have identified from the literature the following heuristics for players B and C, suggested by previous research, including the nice, nasty, fair, and modified tit-for-tat strategies (e.g., Axelrod 1984; Axelrod and Hamilton 1981). For benchmarking purposes, as in the literature, we also add the random strategy. Table 3 describes these five strategies; a sketch of one possible encoding follows the table.

Table 3: Possible strategies of players B and C.

    Random               The agent randomly decides how many goods to give to the other two agents.
    Nasty                IF V'_{t-1} = 0 THEN V_t = 1 ELSE V_t = 0
    Fair                 IF V'_{t-1} = 0 THEN V_t = 0 ELSE V_t = 1
    Nice                 The agent always gives 1 unit of its good to each of the other two agents (V_t = 1).
    Modified Tit-4-Tat   The agent gives the amount its opponents gave to it in the last episode (V_t = V'_{t-1}) when the total amount it got in the last episode does not exceed its endowment. If the total amount exceeds its endowment, the agent gives its good to the other agents in proportion to the amounts it was given in the last episode.
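A minimal sketch of one possible encoding of these fixed strategies follows. The per-opponent treatment of the Random strategy, and all names here, are our own illustrative assumptions rather than the authors' implementation.

    import java.util.Random;

    /** Sketch: the five fixed strategies of Table 3, decided per opponent. */
    public class FixedStrategies {
        enum Kind { RANDOM, NASTY, FAIR, NICE, MODIFIED_TFT }

        private final Random rng = new Random();

        /**
         * @param kind      strategy in play
         * @param received  what each opponent gave this agent last episode
         * @param endowment units the agent can give out this episode (3 in the paper)
         * @return what to give each opponent this episode
         */
        int[] decide(Kind kind, int[] received, int endowment) {
            int[] give = new int[received.length];
            int total = 0;
            for (int r : received) total += r;
            for (int i = 0; i < received.length; i++) {
                switch (kind) {
                    case RANDOM: give[i] = rng.nextInt(endowment + 1); break;
                    case NASTY:  give[i] = (received[i] == 0) ? 1 : 0; break;
                    case FAIR:   give[i] = (received[i] == 0) ? 0 : 1; break;
                    case NICE:   give[i] = 1; break;
                    case MODIFIED_TFT:
                        // Reciprocate exactly unless the total received exceeds the
                        // endowment; then give in proportion to what each opponent gave.
                        give[i] = (total <= endowment)
                                ? received[i]
                                : (int) Math.round((double) endowment * received[i] / total);
                        break;
                }
            }
            return give;
        }
    }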



We experiment with the behavior of our learning agent (A) in the following typical scenarios for agents B and C³. Agents B and C use the following strategies: random and random; tit-for-tat and random; tit-for-tat and nasty; tit-for-tat and fair; tit-for-tat and tit-for-tat; tit-for-tat and nice; and finally fair and nice. Our interest is in investigating the performance of agent A using the above five reputation mechanisms. Figures 6a-6e display the results for the moving average, most recent move, exponential smoothing, Zacharia-Moukas-Maes rating, and tit-for-tat mechanisms, respectively.

-------------------------------------------
Insert Figures 6a-6e here
-------------------------------------------


Can the artificial intelligent agent learn to cooperate? Based on the above experiment, the answer is yes. When using exponential smoothing, agent A seems to learn slowly and its performance is a bit inferior; otherwise it performs well under all other reputation mechanisms. The value of intelligence (Zhong, Kimbrough and Wu 2001) is further confirmed in this study, i.e., intelligence pays. The learning agent can quickly learn how to exploit the other agents' fixed strategies.





³ We note here that it would be straightforward to conduct a statistical significance test over all possible combinations of agent B's and C's strategies. However, we choose not to do so, since we believe such statistical formalism would not add much additional insight to our interest here.



The results show that the emergence of trust depends on the strategies used by B and C, or the "climate." When the climate is nice, the agent learns to cooperate, and the social welfare is maximized and rather fairly distributed (almost equally split).


In terms of the comparison of the five different reputation mechanisms, except for the exponential smoothing mechanism there does not seem to be any significant difference in building trust. This is so due to the commonality of these mechanisms, which is to forgive or discount previous actions taken by other parties. This is interesting, and the role of forgiveness in promoting trust building deserves further investigation. Again, we leave this for a subsequent project.

Overall, this experiment demonstrates the promise of artificial intelligent agents in the Mad Mex Trust Game, and indeed in market negotiation contexts generally.


6. Conclusions and Future Research

It is well known in the literature that trust will emerge in repeated games such as the Mad Mex Trust Game studied here. However, this study deepens previous work by examining when trust will and will not emerge, using a framework that allows parameter settings. The agents here are identity-centric, using reinforcement Q-learning, and their performance has been compared with that of strategy-centric agents.

Artificial agents using Q-learning have been found to be capable of playing the Mad Mex Trust Game efficiently and effectively. Cooperative behaviors have emerged, and the conditions for such cooperation or trust building have been studied experimentally. Several efficient and effective mechanisms for trust building in electronic markets have been tested and compared. The study explores, initially, how these mechanisms affect the emergence of trust and cooperative behavior. Key ingredients in building as well as destroying distributed trust have been examined experimentally. Can we find characteristics of trusting/distrusting systems? Our initial yet original exploration sheds light on this.


We believe our Mad Mex Trust Game constitutes a more realistic model of negotiation support systems in electronic markets, particularly on the Internet. We are actively investigating other forms of trust games, including the classical investment game (or the trust game) as well as the ultimatum game (see Zhong, Kimbrough and Wu 2001 for initial results). Of particular interest, we plan to investigate a closely related game, the Santa Fe Bar Game, first proposed by Arthur (1994). In the long run, we hope to develop computational principles for understanding social trust.



7. References

1. Arthur, B. "Inductive Reasoning and Bounded Rationality," The American Economic Review, Vol. 84, No. 2, pp. 406-411, May 1994.

2. Arthur, B., Holland, J., LeBaron, B., Palmer, R., and Tayler, P. "Asset Pricing Under Endogenous Expectations in an Artificial Stock Market," Working Paper, Santa Fe Institute, December 1996.

3. Axelrod, R. The Evolution of Cooperation. Basic Books, New York, NY, 1984.

4. Axelrod, R., and Hamilton, W. "The Evolution of Cooperation," Science, Vol. 211, No. 27, pp. 1390-1396, March 1981.

5. Berg, J., Dickhaut, J., and McCabe, K. "Trust, Reciprocity, and Social History," Games and Economic Behavior, 10, pp. 122-142, 1994.

6. CACM (Communications of the ACM), Special Section on Trusting Technology, http://www.acm.org/cacm/1200/1200toc.html, Vol. 43, No. 12, December 2000.

7. Das, T.K., and Teng, B.-S. "Between Trust and Control: Developing Confidence in Partner Cooperation in Alliances," Academy of Management Review, Vol. 23, No. 3, pp. 491-512, 1998.

8. Dasgupta, P. "Trust as a Commodity," in D. Gambetta, editor, Trust: Making and Breaking Cooperative Relations, pp. 49-72, Blackwell, Oxford and New York, 1988.

9. Erev, I., and Roth, A. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," The American Economic Review, 88, pp. 848-881, 1998.

10. Ellison, G. "Learning, Local Interaction, and Coordination," Econometrica, Vol. 61, No. 5, pp. 1047-1071, September 1993.

11. Güth, W., Ockenfels, P., and Wendel, M. "Cooperation Based on Trust: An Experimental Investigation," Journal of Economic Psychology, 18, pp. 15-43, 1997.

12. Hardin, R. "Exchange Theory on Strategic Bases," Social Science Information, Vol. 21, No. 2, pp. 251-272, 1982.

13. Holland, C.P., and Lockett, A.G. "Business Trust and the Formation of Virtual Organizations," Proceedings of the 31st Annual Hawaii International Conference on System Sciences (HICSS-31), IEEE Computer Society, 1998.

14. Holland, J. "Artificial Adaptive Agents in Economic Theory," The American Economic Review, 81, pp. 365-370, 1991.

15. Hopcroft, J., and Ullman, J. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, MA, 1979.

16. Kimbrough, S., Wu, D.J., and Zhong, F. "Computers Play the Beer Game: Can Artificial Agents Manage the Supply Chain?" HICSS-34, 2001.

17. Lahno, B. "Trust, Reputation, and Exit in Exchange Relationships," Journal of Conflict Resolution, Vol. 39, No. 3, pp. 495-510, 1995.

18. Marimon, R., McGrattan, E., and Sargent, T. "Money as a Medium of Exchange in an Economy with Artificially Intelligent Agents," Journal of Economic Dynamics and Control, Vol. 14, pp. 329-373, 1990.

19. Mayer, R.C., Davis, J.H., and Schoorman, F.D. "An Integrative Model of Organizational Trust," Academy of Management Review, Vol. 20, No. 3, pp. 709-734, 1995.

20. Miller, J. "The Coevolution of Automata in the Repeated Prisoner's Dilemma," Journal of Economic Behavior and Organization, 29, pp. 87-112, 1996.

21. Rubinstein, A. "Finite Automata Play the Repeated Prisoner's Dilemma," Journal of Economic Theory, 39, pp. 83-96, 1986.

22. Sandholm, T., and Crites, R. "Multiagent Reinforcement Learning in the Iterated Prisoner's Dilemma," Biosystems, 37, pp. 147-166, 1995. Special Issue on the Prisoner's Dilemma.

23. Schillo, M., and Funk, P. "Who Can You Trust: Dealing with Deception," in Proceedings of the Workshop on Deception, Fraud and Trust in Agent Societies at the Autonomous Agents Conference, pp. 95-106, 1999.

24. Shapiro, S.P. "The Social Control of Impersonal Trust," The American Journal of Sociology, Vol. 93, No. 3, pp. 623-658, 1987.

25. Sundali, J., Israeli, A., and Janicki, T. "Reputation and Deterrence: Experimental Evidence from the Chain Store Game," Journal of Business and Economic Studies, Vol. 6, No. 1, pp. 1-19, Spring 2000.

26. Tan, Y.H., and Thoen, W. "Formal Aspects of a Generic Model of Trust for Electronic Commerce," Working Paper, Erasmus University Research Institute for Decision and Information Systems (EURIDIS), Erasmus University Rotterdam, The Netherlands, 2001.

27. Uslaner, E.M. "Social Capital and the Net," Communications of the ACM, http://www.acm.org/cacm/1200/1200toc.html, Vol. 43, No. 12, December 2000.

28. Van der Heijden, E.C.M., Nelissen, J.H.M., Potters, J.J.M., and Verbon, H.A.A. "Simple and Complex Gift Exchange in the Laboratory," Working Paper, Department of Economics and CentER, Tilburg University.

29. Wolfram, S. Cellular Automata and Complexity. Addison-Wesley Publishing Company, Reading, MA, 1994.

30. Zacharia, G., Moukas, A., and Maes, P. "Collaborative Reputation Mechanisms in Electronic Marketplaces," HICSS-32, 1999.

31. Zhong, F., Kimbrough, S., and Wu, D.J. "Cooperative Agent Systems: Artificial Agents Play the Ultimatum Game," Working Paper, The Wharton School, University of Pennsylvania, 2001.


[Figure 1: Total utility of agent A under different mechanisms for 300,000 episodes. Axes: Time vs. Utility; series: Avg., Recency, Smoothing, Maes, Tit-4-Tat.]

[Figure 2: Total utility of agent B under different mechanisms for 300,000 episodes. Axes: Time vs. Utility; series: Avg., Recency, Smoothing, Maes, Tit-4-Tat.]

[Figure 3: Joint utility of both agents under different mechanisms for 300,000 episodes. Axes: Time vs. Utility; series: Avg., Recency, Smoothing, Maes, Tit-4-Tat.]

[Figure 4: The performances of the different reputation mechanisms when playing against each other (a - moving average, r - most recent move, s - exponential smoothing, m - Maes, t - tit-4-tat). Panels 4a-4e plot each mechanism's own, its opponent's, and the joint utility.]

[Figure 5: Total utility of agent A in the three-agent context. Axes: Time vs. Utility; series: avg., recent, smoothing, mae.]

[Figure 6: Total utilities of the three agents under different reputation mechanisms and strategy combinations. Panels 6a-6e; series: Agent A, Agent B, Agent C; strategy pairs: random-random, t-random, t-nasty, t-fair, t-t, t-nice, fair-nice.]