CS510 AI and Games Term Project Design

Juncao Li


Abstract:

Computer games, as entertainment or education media, are becoming more and more popular. The design of AIs in computer games usually focuses on the adversaries that play against the human players. However, the ultimate goal of game design is not to make the AIs win against players, but to entertain or educate the players. As a result, it is important to have AIs that can mimic the behaviors of human players and serve as benchmarks in game design. In this project, we design player AIs for the game Advanced Protection (AP) that try to mimic human players and win the game. We first hardcode a static Finite State Machine (FSM) that mimics players' strategies, and then use the FSM to train a Neural Network (NN) as an adaptive player AI. We also design an approach to evaluate the AIs on both sides, i.e., the Human and the Chaos. The evaluation is based on the win/lose ratio of the AIs given random initial treasuries on a fixed map.


Introduction:
As shown in Fig. 1, the game of Advanced Protection (AP) is played between a human player and a computer opponent (known as Chaos) on a 24 x 24 wraparound grid. The game is split into turns, which are composed of 50 phases. Before each turn begins, the human player is able to buy, distribute, and salvage as many units as his or her treasury allows. When the turn begins, Chaos's minions are randomly placed on squares not occupied by the human's units. During each phase of the turn, every minion is allowed one or two moves and every human farming unit generates money. Minions can 1) Move Forward, 2) Turn Right, 3) Turn Left, and 4) Special Action. (The Special Actions vary between minions. The Special Action of Scouts is to broadcast. The Special Action of Scavengers is to farm. The Special Action of Barbarians is to attack a human unit.) When the turn ends, Chaos's remaining units are removed from the board and salvaged for the units' full value. Both the human player and Chaos create new units between turns by purchasing them using their respective treasuries. Chaos and the human both start the game with $2000. The human player wins when Chaos surrenders (when the human treasury overwhelms the Chaos treasury). Chaos wins when the human has no units and no money left in his or her treasury.


Fig. 1 The UI of Advanced Protection

AP is an adaptive turn-based strategy game that updates its AI strategy each turn according to the human player's performance. The difficulty scales up or down as the player wins or loses. To do this, AP encodes each minion with a brain (namely, an automaton) represented by a 128-bit string. Each minion has 250 candidate brains to use, among which 20 brains are hardcoded and 230 brains are generated by genetic algorithms. During play, brains are rated based on their performance against the player. The brain ratings vary between players because of the differences between players' strategies. AP has a fitness function that dynamically chooses the fittest minion brain to match the player's skill.
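
As a rough illustration of this selection scheme, the following C++ sketch picks the highest-rated brain from the candidate pool. The Brain structure and the max-rating rule are our assumptions for illustration; AP's actual fitness function is not documented here.

    #include <algorithm>
    #include <vector>

    // Hypothetical sketch of the brain pool described above: each brain is a
    // 128-bit automaton string with a rating earned against the current player.
    struct Brain {
        unsigned char genome[16];  // 128 bits = 16 bytes
        double rating;             // performance rating against this player
    };

    // Pick the fittest of the 250 candidate brains; a simple max-rating rule
    // stands in for AP's actual fitness function.
    const Brain& selectBrain(const std::vector<Brain>& pool) {
        return *std::max_element(pool.begin(), pool.end(),
            [](const Brain& a, const Brain& b) { return a.rating < b.rating; });
    }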


AI design details:
We design two player AIs in this project: a Finite State Machine (FSM) that hardcodes the player's strategy, and a Neural Network (NN) that is first trained by the FSM and further improved with randomly generated playing cases. The goal of a player AI is to maximize its treasury income each turn relative to the Chaos' income.


The FSM captures game strategies in AP to play as a human player. It gathers the necessary information to make decisions each turn: (1) the treasuries of both Human and Chaos; (2) the terrain; and (3) the current human units on the map. The output of the FSM includes: (1) where to place the units; and (2) what units to place. We classify the human units into two types: (1) farming units, such as the drone and settler, whose main purpose is to make money; and (2) aggressive units, such as the mine and artillery, whose main purpose is to damage the Chaos' units. The FSM places units following these two categories to maximize its income and minimize the Chaos' income.
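
A minimal sketch of this per-turn interface is shown below; the type and field names are illustrative, not taken from the project source.

    // Hypothetical sketch of the FSM's per-turn inputs and outputs.
    enum class UnitType { Farming, Aggressive };  // drone/settler vs. mine/artillery

    struct FsmInput {
        int  humanTreasury, chaosTreasury;  // (1) both treasuries
        int  terrain[24][24];               // (2) the map terrain
        bool humanUnit[24][24];             // (3) current human units on the map
    };

    struct Placement {
        int      x, y;    // (1) where to place the unit
        UnitType type;    // (2) what unit to place
    };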


Fig. 2 The strategy of the FSM on the first game turn

Fig. 3 The strategy of the FSM on the last game turn (before its victory)

Figure 2 shows the strategy of the FSM on the first game turn, where it tries to make money by selecting most human units from the farming type. Figure 3 shows the strategy of the FSM on the last game turn before its victory, where it has enough treasury to sustain its growth; instead of making money, its top interest is to suppress the Chaos.



The FSM is static: it hardcodes human knowledge and never learns by itself. Furthermore, hardcoding is limited in its ability to consider all the implications that may contribute to the game strategy. For example, some small templates that contain certain unit placement combinations are too complex to be hardcoded; game strategies can also depend on map terrain patterns, where the patterns are dynamically generated during the game. These limitations motivate us to design an adaptive player AI.


Fig. 4 The strategy of the NN on the first game turn

We use a Neural Network (NN) to design an adaptive player AI that can evolve with proper training. The input size of our NN is 578, which contains the map terrain information (24*24) and the initial Human/Chaos treasuries (1 each). The output is the human unit placement on the map (with the map size: 24*24). To simplify the design, we employ one hidden neural layer that contains the same number of nodes as the input. We normalize the inputs to fall in the range [0.0, 1.0], so large-number inputs will not overwhelm other inputs at the starting phase of the training. In development, we borrowed two neural network implementations, one from the "AI Game Engine" book and one from Tim Jones' book. The latter implementation is simple and proved to be efficient in our practice.
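
These dimensions translate directly into code; the sketch below restates them along with the input normalization. It is only a restatement of the sizes above; the actual forward/backward passes live in the borrowed backprop.h/backprop.cpp implementation, whose API we do not reproduce here.

    const int kMapCells = 24 * 24;        // 576 terrain cells
    const int kInputs   = kMapCells + 2;  // + Human and Chaos treasuries = 578
    const int kHidden   = kInputs;        // one hidden layer, same size as input
    const int kOutputs  = kMapCells;      // unit placement over the 24x24 map

    // Scale a raw input into [0.0, 1.0] so large values (e.g. treasuries)
    // do not overwhelm the terrain inputs early in training.
    double normalize(double value, double maxValue) {
        return maxValue > 0.0 ? value / maxValue : 0.0;
    }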



We design two steps to train our NN player AI. First, we use the static FSM to train the NN, where the training inputs are randomly generated maps and random initial Human/Chaos treasuries, and the outputs are generated by the FSM. This first step efficiently helps the NN recognize different maps and map patterns. Second, we use a randomly generated strategy to train the NN whenever that strategy performs better than the NN. By doing this, the NN can keep improving itself against the Chaos' brains.
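
The two steps can be summarized by the following sketch. The Net and Sample types, the stub helpers, and the income-based comparison are stand-ins assumed for illustration, not taken from the project code.

    #include <cstdlib>
    #include <vector>

    struct Sample { std::vector<double> input, target; };

    struct Net {
        std::vector<double> run(const std::vector<double>& in) {
            return std::vector<double>(576, 0.5);      // stub forward pass
        }
        void train(const Sample& s) { /* one backprop step (stub) */ }
    };

    // Stubs standing in for the game: money a placement earns on an input,
    // a random map/treasury case, and the two sources of training targets.
    double income(const std::vector<double>& p, const std::vector<double>& in) {
        return std::rand() % 100;
    }
    Sample randomCase() { return { std::vector<double>(578, 0.0), {} }; }
    std::vector<double> fsmPlacement(const std::vector<double>& in) {
        return std::vector<double>(576, 0.0);
    }
    std::vector<double> randomPlacement() {
        return std::vector<double>(576, 1.0);
    }

    void trainTwoSteps(Net& nn, int fsmIters, int randIters) {
        // Step 1: imitate the static FSM on random maps and treasuries.
        for (int i = 0; i < fsmIters; ++i) {
            Sample s = randomCase();
            s.target = fsmPlacement(s.input);
            nn.train(s);
        }
        // Step 2: train on a random strategy only when it beats the NN.
        for (int i = 0; i < randIters; ++i) {
            Sample s = randomCase();
            std::vector<double> candidate = randomPlacement();
            if (income(candidate, s.input) > income(nn.run(s.input), s.input)) {
                s.target = candidate;
                nn.train(s);
            }
        }
    }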


During the training, the NN player AI adapts to recognize different maps. It is often the case that the outputs of the NN are not exactly what we expect, so we need to interpret the outputs. We first normalize the outputs to the range [0.0, 1.0]. Then we match the outputs to the most similar answers. Figure 4 shows the NN player AI's first-turn output on the same map as the FSM used. The NN has been trained for about 4 hours over 1 million iterations. The picture shows that the NN is adapting to recognize maps.
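
A minimal sketch of this interpretation step, assuming a min-max rescaling and a nearest-answer snap (the report does not spell out the similarity measure):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Rescale raw NN outputs into [0.0, 1.0], then snap each cell to the
    // most similar legal answer; rounding to 0/1 stands in for the matching.
    std::vector<double> interpret(std::vector<double> out) {
        double lo = *std::min_element(out.begin(), out.end());
        double hi = *std::max_element(out.begin(), out.end());
        for (double& v : out) {
            v = (hi > lo) ? (v - lo) / (hi - lo) : 0.0;  // normalize
            v = std::round(v);                           // snap to 0 or 1
        }
        return out;
    }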


Evaluations:
We evaluate the player AIs and the Chaos AI on single turns, because the Chaos' brain is only certain within a single turn. We fix the test map in order to reduce the uncertainty caused by maps. Given a certain amount of initial treasury, the player and Chaos AIs are run 50 times to gather statistics, in order to minimize the influence of the fuzzy logic encoded in the AIs. We consider our AIs to perform well, or win, iff: given a random initial treasury and map, they have an advantage against most Chaos brains (out of 250) in terms of the money earned. This comes from a simple observation that good performance on each turn can lead to the final win of the game.
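
The evaluation loop then looks roughly as follows; playTurn is a hypothetical stand-in for a single-turn game between the player AI and one Chaos brain, and the majority rule over the 50 runs is our reading of the statistics above.

    #include <cstdlib>

    // Stub: true if the player AI out-earns this Chaos brain in one turn.
    bool playTurn(int brain, int initialTreasury) { return std::rand() % 2; }

    // Count how many of the 250 Chaos brains the player AI beats on the
    // fixed map, averaging over 50 runs per brain to smooth the fuzzy logic.
    int countWins(int initialTreasury) {
        const int kBrains = 250, kRuns = 50;
        int wins = 0;
        for (int brain = 0; brain < kBrains; ++brain) {
            int playerWins = 0;
            for (int run = 0; run < kRuns; ++run)
                if (playTurn(brain, initialTreasury)) ++playerWins;
            if (playerWins > kRuns / 2) ++wins;  // majority over the 50 runs
        }
        return wins;
    }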


Fig. 5 Statistics on the FSM: wins out of the 250 brains for given initial treasury

Figure 5 shows the statistics of the hardcoded FSM against the 250 Chaos brains. We can infer from the figure that the FSM is not smart in dealing with treasuries between 4000 and 10000, where it can never win against any Chaos brain.

Figure 6 shows the statistics of the NN player AI after about 1 million training iterations. Although trained by the FSM, the NN performs better than the FSM in many cases, especially when the treasury is between 4000 and 10000. The NN AI does not perform better than the FSM when the initial treasury is larger than 10000, because the selection of the training set does not favor cases with a large initial treasury.



Fig. 6 Statistics on the NN AI: wins out of the 250 brains for given initial treasury

Conclusion and Discussion:

In this project, we studied the turn-based game Advanced Protection (AP), focusing on its adaptive AI strategy. We developed a static FSM to encode the human player's game strategy against the AP AI brains. The FSM deals well with certain brains under certain initial settings, but it does not perform well in general cases, especially when the initial treasury is large. We designed a Neural Network (NN) that we hope can adapt to perform better than the FSM. We did not have enough time to train the NN on the randomly generated cases, but the results of training the NN with the FSM already showed a potentially promising outcome for the second training step.


This project shows us an approach to memorizing and mimicking game players' behaviors, which could potentially help game companies improve their games after release: player AIs can be created by users for free during their game plays and used as training cases to improve the game AIs.


Developing NNs can be hard because the unpredictability of AIs hides bugs deeply. We ran into several bugs that made the NN fail to evolve properly. We found those bugs mostly through breakpoint checks on data status and through code review. An efficient way to check whether the NN is implemented correctly is to train it on one and the same training pair, and see if it adapts as expected.

Link to the source code, this report, and associated presentation slides:

http://web.cecs.pdx.edu/~juncao/links/src/


Most of my code is in the files listed below:

The NN player AI class: NNPlayer.h, NNPlayer.cpp
The FSM player AI class: Player.h, Player.cpp
The NN code I borrowed and modified: backprop.h, backprop.cpp
My code of learning: JLLearning.h, JLLearning.cpp

Although I have code in other files, I don't think it's interesting. Please search "JL" for my comments and code.