# CSM6120 Introduction to Intelligent Systems

Oct 24, 2013

rkj@aber.ac.uk

Evolutionary and Genetic Algorithms

Informal biological terminology

Genes

Encoding rules that describe how an organism is built up from
the tiny building blocks of life

Chromosomes

Long strings formed by connecting genes together

Recombination

Process of two organisms mating, producing offspring that may
end up sharing genes of their parents

Basic ideas of EAs

An EA is an iterative procedure which evolves a
population of individuals

Each individual is a candidate solution to a given problem

Each individual is evaluated by a fitness function, which
measures the quality of its candidate solution

At each iteration (generation):

The best individuals are selected

Genetic operators are applied to selected individuals in order
to produce new individuals (offspring)

New individuals are evaluated by fitness function

Taxonomy

Search Techniques

Informed

Uninformed

BFS

DFS

A*

Hill Climbing

Simulated Annealing

Evolutionary Algorithms

Genetic Programming

Genetic Algorithms

Swarm Intelligence

Evolutionary Strategies

The Genetic Algorithm

Directed search algorithms based on the mechanics of
biological evolution

Developed by John Holland, University of Michigan
(1970s)

To understand the adaptive processes of natural systems

To design artificial systems software that retains the robustness
of natural systems

Provide efficient, effective techniques for optimization and
machine learning applications

Some GA applications

Application: function optimisation (1)

(Figure: plots of three example functions)

$f(x) = x^2$

$g(x) = \sin(x) - 0.1x + 2$

$h(x, y) = x \sin(4\pi x) - y \sin(4\pi y + \pi) + 1$
Application: function optimisation (2)

Conventional approaches:

Often require knowledge of derivatives or other specific
mathematical techniques

Evolutionary algorithm approach:

Requires only a measure of solution quality (fitness function)

Components of a GA

A problem to solve, and ...

Encoding technique (gene, chromosome)

Initialization procedure (creation)

Evaluation function (environment)

Selection of parents (reproduction)

Genetic operators (mutation, recombination)

Parameter settings (practice and art)

GA terminology

Population

The collection of potential solutions (i.e. all the chromosomes)

Parents/Children

Both are chromosomes

Children are generated from the parent chromosomes

Generations

Number of iterations/cycles through the GA process

Simple GA

initialize population;
evaluate population;
while TerminationCriteriaNotSatisfied
{
    select parents for reproduction;
    perform recombination and mutation;
    evaluate population;
}
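The loop above can be sketched in Python. This is only an illustration under stated assumptions, not the course's reference implementation: the toy fitness function (OneMax, counting 1 bits), the use of tournament selection, one-point crossover, full generational replacement, and every parameter value are choices made here for brevity.

```python
import random

def fitness(chrom):
    """Assumed toy problem (OneMax): count the 1 bits."""
    return sum(chrom)

def select(pop, k=3):
    """Tournament selection: best of k randomly chosen individuals."""
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    """One-point crossover producing a single child."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:]

def mutate(chrom, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if random.random() < rate else g for g in chrom]

def simple_ga(n_bits=20, pop_size=30, max_gens=200, seed=0):
    random.seed(seed)
    # initialize population
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    # evaluate population
    best = max(pop, key=fitness)
    gens = 0
    # stop on a perfect solution or after max_gens generations
    while fitness(best) < n_bits and gens < max_gens:
        # select parents, then perform recombination and mutation
        pop = [mutate(crossover(select(pop), select(pop))) for _ in range(pop_size)]
        # evaluate population, remembering the best individual seen so far
        best = max(pop + [best], key=fitness)
        gens += 1
    return best, gens

best, gens = simple_ga()
print(fitness(best), "ones after", gens, "generations")
```

Swapping in a different problem only requires changing `fitness` (and, for non-binary encodings, the operators).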

The GA cycle

(Diagram of the cycle: population → selection → parents → recombination of chosen parents → children → modification → modified children → evaluation → evaluated children → back into the population, which discards deleted members)

Population

Chromosomes could be:

Bit strings (0101 ... 1100)

Real numbers (43.2 -33.1 ... 0.0 89.2)

Permutations of elements (E11 E3 E7 ... E1 E15)

Lists of rules (R1 R2 R3 ... R22 R23)

Program elements (genetic programming)

... any data structure ...

An individual can be represented using discrete values
(binary, integer, or any other system with a discrete set of
values)

Example: Discrete representation

The following is an example of binary representation, where the CHROMOSOME is the whole string and each bit is a GENE:

1 0 1 0 0 0 1 1

Genotype: 8 bits

Phenotype: Integer, Real Number, Schedule, ... Anything?

Example: Discrete representation

Phenotype could be integer numbers

Genotype: 1 0 1 0 0 0 1 1

Phenotype: $1 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 128 + 32 + 2 + 1 = 163$

Example: Discrete representation

Phenotype could be real numbers

e.g. a number between 2.5 and 20.5 using 8 binary digits

Genotype: 1 0 1 0 0 0 1 1

Phenotype: $x = 2.5 + \frac{163}{256}(20.5 - 2.5) = 13.9609$
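The integer and real-number decodings can be checked with a few lines of Python. This is a sketch: the function names are made up for illustration, the bit string is the one whose integer value is 163 in the worked sum, and the divisor $2^8 = 256$ is inferred from the quoted result 13.9609 (dividing by 255 instead would give a different value).

```python
def bits_to_int(bits):
    """Interpret a list of bits as an unsigned integer, most significant bit first."""
    value = 0
    for b in bits:
        value = value * 2 + b
    return value

def bits_to_real(bits, low, high):
    """Map the integer value onto [low, high) in steps of (high - low) / 2**n."""
    return low + bits_to_int(bits) / 2 ** len(bits) * (high - low)

genotype = [1, 0, 1, 0, 0, 0, 1, 1]
print(bits_to_int(genotype))                        # 163
print(round(bits_to_real(genotype, 2.5, 20.5), 4))  # 13.9609
```

With this scaling the upper bound 20.5 itself is never produced; the largest 8-bit value maps to just below it.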

Example: Discrete representation

Phenotype could be a schedule

e.g. 8 jobs, 2 time steps

Genotype: 1 0 1 0 0 0 1 1

Phenotype:

Job        1 2 3 4 5 6 7 8
Time Step  2 1 2 1 1 1 2 2

Example: Real-valued representation

A very natural encoding: if the solution we are looking for is a list of real-valued numbers, then encode it as a list of real-valued numbers! (i.e. not as a string of 1s and 0s)

Lots of applications, e.g. parameter optimisation

Representation

How to represent the travelling salesman problem (TSP)?

Find a tour of a given set of cities so that

Each city is visited only once

The total distance travelled is minimised

Representation

One possibility: an ordered list of city numbers (this is known as an order-based GA)

1) London 3) Dunedin 5) Beijing 7) Tokyo

2) Venice 4) Singapore 6) Phoenix 8) Victoria

Chromosome 1: (3 5 7 2 1 6 4 8)

Chromosome 2: (2 5 7 6 8 1 3 4)

Selection

(Diagram: population → selection → parents)

Need to choose which chromosomes to use based on their ‘fitness’

Why not choose the best chromosomes?

We want a balance between exploration and exploitation

Roulette wheel selection

Rank-based selection

1st step

Sort (rank) individuals according to fitness

Ascending or descending order (minimization or maximization)

2nd step

Select individuals with probability proportional to their rank only
(ignoring the fitness value)

The better the rank, the higher the probability of being selected

It avoids most of the problems associated with roulette-wheel
selection, but still requires global sorting of individuals,
reducing potential for parallel processing
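The two steps of rank-based selection might look like this in Python. It is a sketch under assumptions: linear ranking (worst individual has weight 1, best has weight N) is one of several possible rank-to-probability mappings, and the function name and parameters are illustrative.

```python
import random

def rank_select(population, fitness, n, maximise=True, rng=random):
    """Pick n individuals with probability proportional to rank, ignoring raw fitness."""
    # 1st step: sort worst-first so the best individual receives the largest rank.
    ranked = sorted(population, key=fitness, reverse=not maximise)
    # 2nd step: sample using the ranks (worst = 1, best = N) as weights.
    ranks = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=ranks, k=n)

pop = [3, 10, 1, 7]
picked = rank_select(pop, fitness=lambda x: x, n=1000, rng=random.Random(1))
print({x: picked.count(x) for x in pop})
```

Because only the ordering matters, an individual with an enormous fitness value gets the same selection probability as one that is only marginally the best.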

Tournament selection

A number of “tournaments” are run

Several chromosomes chosen at random

The chromosome with the highest fitness is selected each time

Larger tournament size means that weak chromosomes are
less likely to be selected

It is efficient to code

It works on parallel architectures
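A minimal Python sketch of tournament selection follows. The function names, the default tournament size of 3, and sampling contestants without replacement are assumptions made for illustration.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Return the fittest of k individuals drawn at random (without replacement)."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

def select_parents(population, fitness, n, k=3, rng=random):
    """Run n independent tournaments, one per parent needed."""
    return [tournament_select(population, fitness, k, rng) for _ in range(n)]

pop = list(range(10))          # toy individuals whose fitness is their own value
parents = select_parents(pop, fitness=lambda x: x, n=5, k=2)
print(parents)
```

Each tournament only compares its own k contestants, so no global sort is needed and tournaments can run independently of one another.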

Crossover: recombination

Parents: P1 (0 1 1 0 1 0 1 1), P2 (1 1 0 1 1 0 0 1)

Children: C1 (1 1 0 1 1 0 1 1), C2 (0 1 1 0 1 0 0 1)

Crossover is a critical feature of GAs:

It greatly accelerates search early in evolution of a population

It leads to effective combination of sub-solutions on different
chromosomes

Several methods for crossover exist…
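One of those methods, one-point crossover, reproduces the P1/P2 example. The sketch below is illustrative: the function name is made up, and the cut position 6 is an assumption that happens to be consistent with the children shown (a cut at 5 gives the same result for these parents).

```python
import random

def one_point_crossover(p1, p2, cut=None, rng=random):
    """Swap the tails of two equal-length parents at position `cut`."""
    if cut is None:
        cut = rng.randint(1, len(p1) - 1)   # never cut at the very ends
    c1 = p2[:cut] + p1[cut:]
    c2 = p1[:cut] + p2[cut:]
    return c1, c2

P1 = [0, 1, 1, 0, 1, 0, 1, 1]
P2 = [1, 1, 0, 1, 1, 0, 0, 1]
C1, C2 = one_point_crossover(P1, P2, cut=6)
print(C1)  # [1, 1, 0, 1, 1, 0, 1, 1]
print(C2)  # [0, 1, 1, 0, 1, 0, 0, 1]
```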

Crossover

How would we implement crossover for TSPs?

Parent 1

(3 5 7 2 1 6 4 8)

Parent 2

(2 5 7 6 8 1 3 4)

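Crossover

Parent 1 (3 5 7 2 1 6 4 8)

Parent 2 (2 5 7 6 8 1 3 4)

Child 1 (3 5 7 6 8 1 3 4)

Child 2 (2 5 7 2 1 6 4 8)

Simply swapping tails gives invalid tours: city 3 appears twice in Child 1 and city 2 appears twice in Child 2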
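Both children repeat a city, so plain tail-swapping does not produce valid tours. One common remedy, which is not necessarily the operator used in this course, is an order-based crossover: keep a slice of one parent and fill the remaining positions with the missing cities in the order they occur in the other parent. The sketch below is a simplified variant (it fills gaps from left to right rather than wrapping around from the second cut point), and the slice bounds in the example are arbitrary.

```python
import random

def order_crossover(p1, p2, start=None, end=None, rng=random):
    """Copy p1[start:end] into the child, then fill the gaps with the
    remaining cities in the order they appear in p2."""
    n = len(p1)
    if start is None or end is None:
        start, end = sorted(rng.sample(range(n + 1), 2))
    kept = set(p1[start:end])
    filler = iter([city for city in p2 if city not in kept])
    return [p1[i] if start <= i < end else next(filler) for i in range(n)]

parent1 = [3, 5, 7, 2, 1, 6, 4, 8]
parent2 = [2, 5, 7, 6, 8, 1, 3, 4]
child = order_crossover(parent1, parent2, start=3, end=6)
print(child)  # [5, 7, 8, 2, 1, 6, 3, 4]
```

The child keeps cities 2, 1, 6 in Parent 1's positions and visits every city exactly once.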

Mutation: local modification

Before: (1 0 1 1 0 1 1 0)

After: (0 1 1 0 0 1 1 0)

Before: (1.38 -69.4 326.44 0.1)

After: (1.38 -67.5 326.44 0.1)

Causes movement in the search space (local or global)

Restores lost information to the population
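Both kinds of mutation can be sketched in a few lines of Python. The names and parameters are illustrative, and using Gaussian noise for real-valued genes is an assumption: the example above only shows that one gene changed, not how the new value was drawn.

```python
import random

def bit_flip_mutation(chrom, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chrom]

def gaussian_mutation(chrom, rate, sigma, rng=random):
    """With probability `rate` per gene, add zero-mean Gaussian noise (std. dev. `sigma`)."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g for g in chrom]

print(bit_flip_mutation([1, 0, 1, 1, 0, 1, 1, 0], rate=0.1))
print(gaussian_mutation([1.38, -69.4, 326.44, 0.1], rate=0.25, sigma=2.0))
```

A small `sigma` gives local moves in the search space; a large one allows occasional long jumps.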

Mutation

Given the representation for TSPs, how could we achieve
mutation?

Mutation involves reordering of the list:

Before: (5 8 7 2 1 6 3 4)

After: (5 8 6 2 1 7 3 4)

(the two positions marked * on the slide, holding cities 7 and 6, are swapped)
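The reordering in this example is a swap of two positions, which always yields another valid tour. A minimal sketch (the function name and zero-based position arguments are choices made here):

```python
import random

def swap_mutation(tour, i=None, j=None, rng=random):
    """Return a copy of `tour` with the cities at positions i and j exchanged."""
    if i is None or j is None:
        i, j = rng.sample(range(len(tour)), 2)
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

before = [5, 8, 7, 2, 1, 6, 3, 4]
print(swap_mutation(before, 2, 5))  # [5, 8, 6, 2, 1, 7, 3, 4]
```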

Mutation

Note

Both mutation and crossover are applied based on user-supplied probabilities

We usually use a fairly high crossover rate and fairly low mutation rate

Why do you think this is?

Evaluation of fitness

The evaluator decodes a chromosome and assigns it a fitness
measure

The evaluator is the only link between a classical GA and the
problem it is solving

(Diagram: modified children → evaluation → evaluated children)

Fitness functions

Evaluate the ‘goodness’ of chromosomes

(How well they solve the problem)

Critical to the success of the GA

Often difficult to define well

Must be fairly fast, as each chromosome must be
evaluated each generation (iteration)

Fitness functions

Fitness function for the TSP?

(3 5 7 2 1 6 4 8)

As we’re minimizing the distance travelled, the fitness is
the total distance travelled in the journey defined by the
chromosome
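A tour-length evaluator could be sketched as follows. The coordinates are hypothetical (no distances are given here), straight-line distance is assumed, and the tour is assumed to return to its starting city.

```python
import math

def tour_length(tour, coords):
    """Total length of the closed tour that visits cities in the given order."""
    total = 0.0
    for a, b in zip(tour, tour[1:] + tour[:1]):   # wrap around to the start
        total += math.dist(coords[a], coords[b])
    return total

# Hypothetical coordinates: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
print(tour_length([1, 2, 3, 4], coords))  # 4.0
print(tour_length([1, 3, 2, 4], coords))  # longer: this order uses both diagonals
```

Since lower is better here, selection has to prefer smaller values, or the length can be transformed (for example negated) before a maximising selection scheme is applied.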

Deletion

Generational GA: entire population replaced with each iteration

Steady-state GA: a few members replaced each generation

(Diagram: population → deleted members)
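The two deletion policies differ only in how many members are replaced. In the sketch below, deleting the worst members is an assumed policy (random or age-based deletion are alternatives), fitness is assumed to be maximised, and the function names are illustrative.

```python
def generational_replacement(population, children):
    """Generational GA: the children replace the whole population."""
    return list(children)

def steady_state_replacement(population, children, fitness):
    """Steady-state GA: the children replace the current worst members."""
    survivors = sorted(population, key=fitness, reverse=True)  # best first
    survivors = survivors[:len(population) - len(children)]
    return survivors + list(children)

pop = [5, 1, 9, 3]   # toy individuals whose fitness is their own value
print(steady_state_replacement(pop, [7, 8], fitness=lambda x: x))  # [9, 5, 7, 8]
```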

Stopping!

The GA cycle continues until

The system has ‘converged’; or

A specified number of iterations (‘generations’) has been
performed

An abstract example

Distribution of Individuals in Generation 0

Distribution of Individuals in Generation N

Good demo of the GA components

http://www.obitko.com/tutorials/genetic-algorithms/example-function-minimum.php

TSP example: 30 cities

Overview of performance

Example: n-queens

Put n queens on an n × n board with no two queens on the same row, column, or diagonal
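One possible way to score a candidate board, assuming an encoding with exactly one queen per column (so column clashes cannot occur), is to count the pairs of queens that attack each other and minimise that count. The encoding and the function name are assumptions, not something the slide prescribes.

```python
from itertools import combinations

def attacking_pairs(board):
    """board[c] is the row of the queen in column c; count pairs that attack each other."""
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(board), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )

print(attacking_pairs([1, 3, 0, 2]))  # 0: a solution for n = 4
print(attacking_pairs([0, 1, 2, 3]))  # 6: all queens on one diagonal
```

If each chromosome is additionally a permutation of the row numbers, row clashes are ruled out as well and only diagonals need checking.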

Examples

Eaters

http://math.hws.edu/xJava/GA/

TSP

http://www.heatonresearch.com/articles/65/page1.html

Exercise: The Card Problem

You have 10 cards numbered from 1 to 10. You have to choose a way of
dividing them into 2 piles, so that the cards in Pile0 *sum* to a number as
close as possible to 36, and the remaining cards in Pile1 *multiply* to a
number as close as possible to 360

Encoding

Each card can be in Pile0 or Pile1, there are 1024 possible ways of
sorting them into 2 piles, and you have to find the best. Think of a
sensible way of encoding any possible solution.

Fitness

Some of these chromosomes will be closer to the target than others.
Think of a sensible way of evaluating any chromosome and scoring it
with a fitness measure.
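As a starting point for the exercise, here is one possible encoding and scoring scheme, offered as a sketch rather than the intended answer. It assumes a 10-bit chromosome (bit i gives the pile of card i+1) and sums the relative errors of the two piles so that neither target dominates; other weightings are equally defensible.

```python
from math import prod

def decode(chrom):
    """Bit i says which pile card i+1 is in: 0 -> Pile0 (sum), 1 -> Pile1 (product)."""
    pile0 = [card for card, bit in enumerate(chrom, start=1) if bit == 0]
    pile1 = [card for card, bit in enumerate(chrom, start=1) if bit == 1]
    return pile0, pile1

def error(chrom, sum_target=36, prod_target=360):
    """Combined relative error of both piles; 0.0 means both targets are hit exactly."""
    pile0, pile1 = decode(chrom)
    sum_err = abs(sum(pile0) - sum_target) / sum_target
    prod_err = abs(prod(pile1) - prod_target) / prod_target
    return sum_err + prod_err

chrom = [0, 1] * 5            # odd cards in Pile0, even cards in Pile1
print(decode(chrom), error(chrom))
```

Lower error is better, so a GA would minimise `error` or maximise a transformed value of it.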

Issues for GA practitioners

Choosing basic implementation issues:

Representation

Population size, mutation rate, ...

Selection, deletion policies

Crossover, mutation operators

Termination criteria

Performance, scalability

Solution is only as good as the fitness function (often hardest
part)

Your assignment will be to code a GA for a given task! Be
aware of the above issues…

Concept is easy to understand

Supports multi-objective optimization

Good for “noisy” environments