Othello Artificial Intelligence With Machine Learning

unknownlippsAI and Robotics

Oct 16, 2013 (3 years and 9 months ago)

69 views

Othello Artificial Intelligence

With

Machine Learning


Computer Systems TJHSST

Nick Sidawy

Introduction


Machine

learning

is

an

extensive

filed

of

study
.

Most

of

what

is

done

with

machine

learning

is

tied

to

artificial

intelligence

which

is

why

Othello

seemed

to

be

a

good

vessel

for

my

research
.

It

is

a

simple

enough

game

for

me

to

work

on,

yet

difficult

enough

to

keep

me

working

throughout

the

quarters
.

Purpose


The

purpose

of

this

research

project

is

to

implement

machine

learning

with

artificial

intelligence
.

T h e

r e a s o n

f o r

t h i s

is

two
-
fold
:



First,

to

create

a

very

effective

Othello

AI
.



Second,

and

more

academically

oriented,

is

to

gain

a

deeper

understanding

of

machine

learning
.

Goal Breakdown


The

first

goal

I

would

like

to

achieve

with

this

project

is

to

program

an

effective

forward
-
checker

for

the

AI
.


The

second

goal

is

to

use

a

genetic

algorithm

to

formulate

the

best

evaluation

function

for

the

AI

to

use
.


My

last

goal

is

to

have

the

AI

learn

from

each

move

it

makes

so

that

the

more

it

plays,

the

faster

and

better

it

will

perform
.

Development


The

project

will

be

coded

in

java
.


I

have

separated

it

into

7

different

programs
:


The

driver



The

panel

(which

controls

the

game)



The

AI


The

game

functions

(such

as

capturing

pieces)



A

class

called

MyButton


A

program

to

run

the

Genetic

Algorithm



A

program

for

the

game

functions

that

will

be

used

in

the

Genetic

Algorithm
.


A

program

to

the

MachineLearning

functions
.

Construction (First Iteration)


There are two algorithms that the first quarter was
focused on:



The forward
-
checking algorithm (Minimax)




The evaluation function



A MyButton class was also created

MyButton Class

The

MyButton

class

extends

Jbutton

and

has

the

following

variations
:



It

has

a

paint

component



It

can

store

values

which

indicate

if

the

spot

on

the

board

is

occupied

by

a

piece

and

will

draw

images

accordingly

Forward
-
Checker



The

goal

of

a

forward

Checker

is

to

traverse

a

tree

of

possible

moves

and

picking

the

move

that

will

lead

to

the

best

scenario

down

the

line
.

The

ply

determines

how

many

levels

of

the

tree

it

goes

through
.



The

trick

is

that

at

each

level

of

the

three

it

picks

the

move

that

is

best

for

the

player

it

is

simulating

for
.

Therefore,

the

computer

assumes

the

opponent

will

play

perfectly
.


It

is

nicknamed

the

Minimax

algorithm

for

this

reason
.

(See

Diagram

A)


Evaluation Function


This

function

returns

a

number

rating

how

good

a

particular

board

is

for

a

player
.

It

does

this

based

on

the

positions

of

the

pieces

and

amount

of

available

moves

for

each

player
.

For

example,

pieces

in

the

corners

are

very

valuable

and

will

add

many

more

points

to

the

rating

than

a

piece

near

the

center
.

Construction (Second Iteration)



Recreating the GUI that the AI would run in so
different‏AI’s‏and‏simulations‏could‏be‏used‏
when running the program just once, instead of
having to restart it.


Creating a program to run the Genetic
Algorithm.


Creating a program that would contain the
moves necessary for the Genetic Algorithm to
run.

The New GUI

Genetic Algorithm


The values used to evaluate a given Othello board
are very important for picking the best move and I
hope to use the Genetic Algorithm to find the best
values for evaluation.


The main components:


The population (A set of different evaluation values)



The fitness evaluation (The way of testing how well a
set of evaluation values performs)



Splicing and Offspring (The production of a new,
improved set of evaluation values)


Genetic‏Algorithm‏(Cont’d)



First, 8 sets of evaluation values are created
that will make up the population.


Second, each set is given a score based on
how it does with the fitness function, which in
this case is a game played against a different,
constant set of values.


Third, 8 new sets of values are produced by
splicing the 8 sets of values in the original
generation.

Splicing


The reproduction takes a Darwinian approach.


Depending on how well a particular set does against the
fitness function determines the likelihood it will be chosen
for reproduction.


Two sets (Set A and B) are chosen based on the
probabilities and a crossover point is chosen at random for
the sets.


Next, the two sets will splice by taking all the values before
the crossover point of Set A and combining them with the
values after the crossover point of Set B. Then, the
opposite is performed.

Splicing‏(Cont’d)



This process is done 4 times so the offspring (next
generation) is created and can be tested against the
fitness function.


There is also a predetermined chance that a mutation may
occur. A mutation is when a value on a specific set is put
to a random number and helps prevent the evaluation sets
from all reaching the same values (plateau).


In the event a plateau is reached and all the sets in a
generation are the same, then the fitness function will
become the one of the sets of the generation and 8 new
sets will be created by random.


In theory this will create better and better evaluation sets
over time.


Construction (Third Iteration)


Creating a program (MachineLearning) which
does the following:


Saving information from each move the AI
makes.


Storing the data in a file so it can be used game
after game.


Loading data from the file and putting it into a
HashMap for quick access during a game.

Data Collecting Procedure


Games will be run between the AI and a
random opponent over and over.


Data from each move made will be stored in
several different variations. (For example, the
board that is given, a reflection of that board,
etc.)


Data Storage

After each move the AI makes, four pieces of
data will be stored:


Player's piece color


Boardstate (before the move is made)



Move chosen


Evaluation

During a game, this data will be stored in a
HashMap with the player's piece color and
boardstate as the key and the chosen move
and evaluation as the values stored.

Implementation of Data


By storing all this data about moves the AI has
chosen, the FowardChecker should be more efficient
and, possibly, more effective.


This will prevent the need to traverse through a tree of
possible moves if a board situation found anywhere in
the tree has been encountered.


It should make it possible to search deeper into a tree
of moves by using the stored information only after the
ForwardChecker has moved three or four levels deep.

Conclusions


I

have

learned

the

importance

of

the

evaluation

function
.

A

s t r o n g

e v a l u a t i o n

f u n c t i o n

h a s

mu c h

mo r e

we i g h t

t h a n

a

f o r w a r d
-
c h e c k e r

wi t h

a

h i g h
-
p l y
.


C o l l e c t i n g

l o a d s

of

data

and

putting

it

is

useful

when

you

have

enough

of

it,

probably,

but

as

far

as

I

can

tell

I

will

need

thousands

of

times

more

than

I

have

for

any

real

improvement

to

take

place
.