Genetic Algorithms

Uploaded by weaverchurch on 15 Aug 2012 (category: Software & software engineering)

Laboratory Four: Introduction to ECJ, A GA Toolkit

In the previous lab we implemented a simple genetic algorithm on a number of problems involving bitstrings. Once the GA was built it was quite straightforward to apply it to a range of bitstring-based problems. One simply has to write a new Problem class and change the setFitness method of the BitStringSolution class to evaluate on the new problem. The simple GA code also contains a number of choices. We can select parameters such as population size, mutation rate and crossover rate. We can also choose different selection methods and different genetic operators by editing the code.

In general there are many different kinds of encoding and many different possible parameter settings for a GA. Many of these components are so standard that we would wish to re-use them. When applying a GA to a new problem, we should not need to write the entire code for the GA from scratch. A GA Toolkit is a software system for developing GAs that provides most of the standard components you need. We will be studying a particular GA Toolkit called ECJ. ECJ assists in the development of GAs in Java.

1) ECJ is being developed in Java by Sean Luke and others in the Evolutionary Computation Laboratory at George Mason University, USA. ECJ documentation can be downloaded from http://cs.gmu.edu/~eclab/projects/ecj/

For our purposes we will be using our own version of ECJ, not the version available on the ECJ website. ECJ has been repackaged as a JAR file [1], enabling development of new GAs in a NetBeans project decoupled from the ECJ hierarchy.

2) This lab sheet provides instructions on how to use the repackaged ECJ in NetBeans. Start up NetBeans and create a new project, myGAs, as follows.

Start NetBeans from the Taskbar.

Select File | New Project.





[1] Many thanks to Deryck Brown for doing this.

This gets you into a Wizard. You should use the default category, namely General, and the default project type, namely Java Application, as shown. Click on Next.

In the next panel, enter myGAs in the Project Name field. Then browse through the directory structure until you get to the folder where you wish the project to be stored. Highlight it and then click Open. The path to your folder should now appear in the Project Location field. Ensure that the box Create Main Class is not checked. Click Finish and NetBeans will create a project folder called myGAs in your folder. You may view the contents of this folder using the Files panel. It contains some subdirectories such as Source, Test Packages, Libraries and Test Libraries.

3) Download the archive, ECJ.jar, from the Laboratories section of the Module Website. For convenience, it is a good idea to place ECJ.jar where it will be easy to refer to in your parameter files. I recommend placing it in the dist subfolder of the myGAs project.

4) Make the ECJ classes available to your project by importing the ECJ.jar archive as a library as follows.

In the Project Explorer Window, right-click on the Libraries folder and select Add JAR/Folder. In the pop-up window that follows, browse to the location of the ECJ.jar file and select it as shown.


The ECJ archive will now appear in the project library folder. You may now use
ECJ classes to develop GAs.

5) To work with the ECJ JAR file, you set up sub-packages in your project for each separate problem. We will now set up ECJ to run the OneMax problem. ECJ needs you to specify two things in order to build a GA: a problem class that defines how to evaluate the fitness function; and a parameter file that configures all other aspects of the GA. These are then controlled via a Main class that runs the GA. First we will create a parameter file for the OneMax problem.

a) Click File | New File… and select Other | Empty File as shown:

Click Next and in the following dialogue window set the file name to tutorial1.params. Leave the Folder field blank as shown.

The default working directory for a NetBeans project is the top level folder containing the build.xml file. You should see tutorial1.params appearing in the file window after you have created it. At the moment it is a blank text file.

b) You will find a text file called tutorial1.params in the Lab4Resources.zip archive on the module web pages. Copy the text from this file into your blank tutorial1.params file. This file now contains lots of parameter settings that configure a Genetic Algorithm in ECJ. We will explain most of these in subsequent labs. For the moment, it is enough to know that the file specifies that the GA is working with a bitstring encoding of length 100, using tournament selection, 1-point crossover with crossover rate 1.0 and uniform mutation with mutation rate 0.01.

We will change one parameter, which tells ECJ where to look to find out how to evaluate a chromosome. Change the very last line of tutorial1.params to the following:

    eval.problem = mygas.tutorial1.MaxOnes

This tells ECJ that, to evaluate a bitstring chromosome, it should use the MaxOnes class in the mygas.tutorial1 package.
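To make the settings described above more concrete, the following sketch parses a hypothetical excerpt of such a parameter file with java.util.Properties and prints a few values. The excerpt is an assumption modelled on the stock ECJ tutorial parameters: the crossover, mutation and tournament keys are not named in this lab sheet and may differ in the repackaged version. The class name ParamsPeek is invented for illustration, and ECJ itself reads the file through its own parameter database rather than through code like this.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of ECJ: reads a params excerpt as plain "key = value" pairs.
final class ParamsPeek {

    // Assumed excerpt, modelled on the stock ECJ tutorial parameters (may differ in your file).
    static final String EXCERPT = String.join("\n",
        "generations = 200",
        "pop.subpop.0.size = 100",
        "pop.subpop.0.species.genome-size = 100",
        "pop.subpop.0.species.crossover-type = one",
        "pop.subpop.0.species.mutation-prob = 0.01",
        "select.tournament.size = 2",
        "eval.problem = mygas.tutorial1.MaxOnes");

    // Parses the text into a Properties object; key and value are split at the first '='.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = parse(EXCERPT);
        System.out.println("genome size:   " + p.getProperty("pop.subpop.0.species.genome-size"));
        System.out.println("mutation prob: " + p.getProperty("pop.subpop.0.species.mutation-prob"));
        System.out.println("problem class: " + p.getProperty("eval.problem"));
    }
}
```

The point of the sketch is only that each line of the file is a dotted parameter name followed by a value, so changing the GA usually means editing one line rather than any Java code.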

c) The next step is to specify the MaxOnes class that evaluates the bitstring chromosome [2]. Create a new Java class by choosing File | New File from the NetBeans menu. This puts you in a Wizard as shown:

Select Java Class as shown and click Next. The next window looks like this:

Set the Class Name to MaxOnes and the package to mygas.tutorial1 as shown, then click Finish.

Your project file structure should now look like this:





[2] This plays the same role as the MaxOneProblem class in the simple GA from Lab 3.

d) You will find a file called MaxOnes.java in the Lab4Resources.zip archive on the module web pages. Copy the text from this file into your MaxOnes.java file.

Let’s look at this line by line:

    package mygas.tutorial1;

    import ec.EvolutionState;
    import ec.Individual;
    import ec.Problem;
    import ec.simple.SimpleFitness;
    import ec.simple.SimpleProblemForm;
    import ec.vector.BitVectorIndividual;

This first section of code sets up the class to be in the mygas.tutorial1 package (where ECJ will look for it) and imports the ECJ classes that this class needs. All of these classes are contained in the ECJ.jar file and should be accepted happily by the NetBeans compiler.

    public class MaxOnes extends Problem implements SimpleProblemForm {

The MaxOnes class is a subclass of the ECJ class Problem and implements the SimpleProblemForm interface. This means that it is the sort of class ECJ needs to evaluate a chromosome as a solution to a particular problem. The SimpleProblemForm interface consists of two methods, evaluate and describe. We can ignore describe and simply leave it empty. The important method we need to define is evaluate. This method tells ECJ how to evaluate a chromosome.

    public void evaluate(final EvolutionState state,
                         final Individual ind,
                         final int threadnum) {

The evaluate method takes three parameters. ECJ calls this method automatically when it wants to evaluate a chromosome and will supply all of the parameters itself. We can ignore state and threadnum. The parameter we are concerned with is ind, which represents the chromosome [3].

ECJ contains a number of classes that represent chromosomes, the most general of which is Individual. The next part of the code deals with checking whether ind has already been evaluated, then extracting ind and type-casting it as a bitstring, which is the type of chromosome we wish to evaluate.

    if (ind.evaluated) return;  // don't evaluate the individual if it's already evaluated

    if (!(ind instanceof BitVectorIndividual))
        state.output.fatal("Whoa! It's not a BitVectorIndividual!!!", null);

We now come to the important part:

    BitVectorIndividual ind2 = (BitVectorIndividual) ind;

    int sum = 0;
    for (int x = 0; x < ind2.genome.length; x++)
        sum += (ind2.genome[x] ? 1 : 0);

[3] In ECJ terminology, a chromosome is referred to as an individual.

First we type-cast ind as a BitVectorIndividual, ind2. This ind2, as a BitVectorIndividual, gives us access to the values of the bits, so we can calculate the fitness of the chromosome. We set a variable sum to 0. sum will be used to store the fitness of the chromosome. The next bit of code is a for loop that performs the OneMax fitness calculation with which you are already familiar. BitVectorIndividual has a property called genome which is an array of Boolean values. We use the length property of this array to loop through all the values. At each position x in the array, we use the Boolean value of genome[x] to decide whether to add 1 or 0 to the fitness.
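The counting loop can be tried out on its own, without any ECJ classes. The sketch below is a stand-alone illustration: the class name OneMaxCount and the method countOnes are invented here, and a plain boolean array stands in for the genome property.

```java
// Stand-alone illustration: a plain boolean[] plays the role of ind2.genome.
final class OneMaxCount {

    // Counts the true entries of the genome, exactly as the for loop in evaluate does.
    static int countOnes(boolean[] genome) {
        int sum = 0;
        for (int x = 0; x < genome.length; x++) {
            sum += genome[x] ? 1 : 0;
        }
        return sum;
    }

    public static void main(String[] args) {
        boolean[] genome = {true, false, true, true, false}; // the bitstring 10110
        System.out.println(countOnes(genome)); // prints 3
    }
}
```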

OK, so we have calculated the fitness of the chromosome. The final thing the evaluate method does is to pass this fitness back to ECJ. The code is as follows:

    if (!(ind2.fitness instanceof SimpleFitness))
        state.output.fatal("Whoa! It's not a SimpleFitness!!!", null);

This is some boilerplate code to trap errors. You just put this code into your problem classes at this point without change.

    ((SimpleFitness) ind2.fitness).setFitness(state,
        // ...the fitness...
        (float) (((double) sum) / ind2.genome.length),
        // ...is the individual ideal? Indicate here...
        sum == ind2.genome.length);

    ind2.evaluated = true;

What is happening in this complex line of code is that you are running the setFitness method of ind2 [4]. Since ind2 is just a reference to the object ind, this sets the fitness for the chromosome that was originally passed in to the evaluate method. The setFitness method takes three parameters, which you have to define in your code. The first parameter is easy: state. Here you are just passing back the state parameter that was originally passed into the evaluate method. So you just always put state in as the first parameter.

The second parameter is the fitness value itself. This must always be of type float. Note that sum was an int for the OneMax calculation so it has to be typecast into float. This parameter could have been written simply as:

    (float) sum

In fact the code divides sum by the length of the bitstring. The effect of this is that the fitness is always a number between 0 and 1, and the optimal string consisting of all 1s has fitness 1.0 instead of fitness equal to the chromosome length. This simply illustrates that there is more than one way of calculating fitness for the same problem. The important thing here is that, whatever you calculate fitness to be, the result must go in to this second parameter as a value of type float.

[4] Strictly speaking, you are running the setFitness method of the fitness of ind2 but the additional complexity is unimportant here. You are setting the fitness of the chromosome in this part.

The third parameter is a Boolean value that indicates whether the chromosome is ideal, i.e. optimal. ECJ uses this information to stop the GA when an optimum chromosome has been found. In this case the condition is

    sum == ind2.genome.length

That is, the chromosome is optimal if all bits are set to 1 and so the sum is equal to the length of the chromosome. Of course for many problems we do not know whether an optimum has been reached. In this case we may simply put false as the third parameter. If we were solving a satisfaction problem, we would put our satisfaction criterion in this third parameter, for example:

    sum >= 90.0

The final line of code sets a Boolean property evaluated of the chromosome to true. This ensures that we only incur the computational cost of evaluating a particular chromosome once. So if the chromosome is passed unchanged to a successor population, ECJ will not re-evaluate it. This again is standard boilerplate code and you should always include it in your problem classes.
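The three ideas discussed above (a fitness scaled to lie between 0 and 1, the ideal flag, and the evaluated flag that prevents repeated work) can be seen together in a small stand-alone sketch. ToyIndividual and ToyOneMax are hypothetical stand-ins written for this illustration and are not ECJ classes; the sketch also assumes a non-empty genome.

```java
// Hypothetical stand-in for an ECJ individual; only the pieces discussed above are modelled.
final class ToyIndividual {
    final boolean[] genome;
    float fitness;      // plays the role of the second setFitness parameter
    boolean ideal;      // plays the role of the third setFitness parameter
    boolean evaluated;  // set once the fitness has been computed

    ToyIndividual(boolean[] genome) {
        this.genome = genome;
    }
}

final class ToyOneMax {
    // Counts how many times the fitness calculation has actually been carried out.
    static int evaluations = 0;

    static void evaluate(ToyIndividual ind) {
        if (ind.evaluated) return; // skip individuals that already have a fitness
        evaluations++;

        int sum = 0;
        for (boolean bit : ind.genome) {
            sum += bit ? 1 : 0;
        }

        // Fraction of bits set to 1, so the value always lies between 0 and 1.
        ind.fitness = (float) (((double) sum) / ind.genome.length);
        // Ideal only when every bit is 1.
        ind.ideal = (sum == ind.genome.length);
        ind.evaluated = true;
    }

    public static void main(String[] args) {
        ToyIndividual ind = new ToyIndividual(new boolean[] {true, true, false, true});
        evaluate(ind);
        evaluate(ind); // returns immediately because evaluated is already true
        System.out.println(ind.fitness + " " + ind.ideal + " " + evaluations); // prints 0.75 false 1
    }
}
```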

e) The final step is to create a class that will run the GA. Create a new Main class for your project by selecting File | New File… as shown:

Select Java Main Class and click Next. In the next dialogue window, set up a class called Main and set the package to mygas.tutorial1 as shown:

The browser window in NetBeans will now contain a folder for the tutorial1 package containing the Main class. You should now see the following in the NetBeans file hierarchy:


6) Write the following code in the main method of the Main class:
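A minimal sketch of such a main method, assuming the repackaged JAR keeps the stock ECJ entry point ec.Evolve, which takes the parameter file through a -file argument. Both the entry point and the reflection-based call are assumptions made here so that the sketch compiles even without ECJ.jar; with the library on the classpath, the direct call ec.Evolve.main(ecjArgs) would do the same thing. The class is named Tutorial1Main only to keep the sketch self-contained; in the lab project it would be the Main class and start with the line package mygas.tutorial1;

```java
import java.lang.reflect.Method;

// Illustrative launcher; in the lab project this would be mygas.tutorial1.Main.
final class Tutorial1Main {

    // Builds the argument list that tells ECJ which parameter file to load.
    static String[] ecjArgs(String paramsFile) {
        return new String[] {"-file", paramsFile};
    }

    public static void main(String[] args) throws Exception {
        // tutorial1.params sits in the project's top level folder, the default working directory.
        String[] ecjArgs = ecjArgs("tutorial1.params");
        try {
            // Assumed entry point: equivalent to calling ec.Evolve.main(ecjArgs) directly.
            Method evolveMain = Class.forName("ec.Evolve").getMethod("main", String[].class);
            evolveMain.invoke(null, (Object) ecjArgs);
        } catch (ClassNotFoundException e) {
            System.out.println("ECJ.jar is not on the classpath; would have run ec.Evolve "
                    + String.join(" ", ecjArgs));
        }
    }
}
```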


Now build and run the project. When you run the project, NetBeans will ask you which Main class you want to run. At the moment there is only one choice. Select tutorial1.Main. You will see something like the following output:

This is the default output for ECJ showing a sequence of completed generations. The system by default outputs data to a file called out.stat in the main project folder.

Your project file hierarchy should now look like this:



Open out.stat. You will see output of the following form:

The output records, for each generation, the genes of the best individual in the population in that generation and its fitness. At the end of the run, ECJ also outputs the genes of the best individual found during the whole run, along with its fitness. Parameters can be set to specify what data is written out during a GA run.

7) Currently, the GA is solving a length 100 OneMax problem. Change the length to 150 by changing the following parameter in the params file:

    pop.subpop.0.species.genome-size = 150

What happens? Do you find the optimum? Perhaps you need to increase the number of generations. Try:

    generations = 1000

8) Now try varying the population size. Try:

    pop.subpop.0.size = 200

9) Experiment with various combinations of these parameters, running the GA several times for each combination of parameters.
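One way to organise such experiments is to list the combinations before running anything. The sketch below only builds and prints argument lists; it does not launch the GA. It assumes the stock ECJ conventions that ec.Evolve accepts -p name=value overrides after -file and that the random seed is set by a parameter called seed.0, neither of which is confirmed for the repackaged version. The class and method names are invented for illustration. A different seed is used for each repeat because runs with the same seed and parameters would be expected to give the same result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner: enumerates parameter combinations, one argument list per run.
final class SweepPlan {

    static List<String[]> plan(String paramsFile, int[] popSizes, int[] generations, int repeats) {
        List<String[]> runs = new ArrayList<>();
        for (int size : popSizes) {
            for (int gens : generations) {
                for (int r = 0; r < repeats; r++) {
                    runs.add(new String[] {
                        "-file", paramsFile,
                        "-p", "pop.subpop.0.size=" + size,
                        "-p", "generations=" + gens,
                        "-p", "seed.0=" + (r + 1) // assumed seed parameter, distinct per repeat
                    });
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<String[]> runs = plan("tutorial1.params", new int[] {100, 200}, new int[] {200, 1000}, 3);
        for (String[] run : runs) {
            System.out.println(String.join(" ", run));
        }
    }
}
```

The same combinations can of course be tried by editing the params file by hand, as described in the steps above; the list simply makes it harder to lose track of which settings have been run.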

10) The Alternating Sum Problem: This is defined on bitstrings of length n and can be stated as:

For $x_0 x_1 \ldots x_{n-1}$, maximise $\sum_{i=0}^{n-1} (-1)^i x_i$, i.e. maximise $x_0 - x_1 + \ldots + (-1)^{n-1} x_{n-1}$.

The optimum solution has a 1 in each odd position and a 0 in each even position, counting positions from 1.

For the bitstring 10110, of length 5, the evaluation would be $1 - 0 + 1 - 1 + 0 = 1$.

The optimum bitstring of length 5 would be 10101 with value 3.
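The arithmetic in these two examples can be checked with a short stand-alone sketch. It uses a plain boolean array rather than any ECJ class, and the names AlternatingSum, alternatingSum and parse are invented for illustration. How the value is turned into an ECJ fitness, and what the ideal test should be, is left to the exercise.

```java
// Stand-alone check of the alternating sum; no ECJ classes involved.
final class AlternatingSum {

    // Computes x_0 - x_1 + x_2 - ... : bits at even indices add 1, bits at odd indices subtract 1.
    static int alternatingSum(boolean[] genome) {
        int sum = 0;
        for (int i = 0; i < genome.length; i++) {
            if (genome[i]) {
                sum += (i % 2 == 0) ? 1 : -1;
            }
        }
        return sum;
    }

    // Converts a string such as "10110" into a boolean genome.
    static boolean[] parse(String bits) {
        boolean[] genome = new boolean[bits.length()];
        for (int i = 0; i < bits.length(); i++) {
            genome[i] = bits.charAt(i) == '1';
        }
        return genome;
    }

    public static void main(String[] args) {
        System.out.println(alternatingSum(parse("10110"))); // prints 1
        System.out.println(alternatingSum(parse("10101"))); // prints 3
    }
}
```

Note that, unlike the OneMax count, this value can be negative (for example, 01010 evaluates to -2).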

Using ECJ, set up a GA to solve the Alternating Sum problem. Do not alter your OneMax code. Create a completely new params file and a completely new problem and Main class in a mygas.altsum package. Note that you will need to change the Main class used by the project. This can be done by right-clicking on the myGAs project and selecting the properties wizard. Change the Main class as shown in the screenshot below:


Congratulations! You are now able to use ECJ to build a genetic algorithm.

11) Now try running ECJ on the other problems you implemented in Lab 3, RoyalRoadProblem and TrapProblem.