Genetic Algorithms

croissantwilderness, Artificial Intelligence and Robotics

24 Oct 2013


Exploiting Randomness

Evolution and Genetic Algorithms

Genetic Algorithms

Building blocks and recombination

A Biological Metaphor

Genetic Algorithms

Operations

- Crossover: exchange genetic material between two individuals
- Mutation: randomly change part of the genetic material
- Selection: the fittest individuals have the best chance of reproducing

ACTGCCGTCGTCGAAACGCGTAATTTCCG

Operations

- Selection (reproduction): favors the fittest strings
- Crossover: provides a way to simultaneously explore and exploit
- Mutation: helps prevent the development of a uniform population that stagnates at less than optimum fitness

Strengths of Evolutionary Approaches to Problem Solving

- massive parallelism
- adaptability
- innovation
- skirting complex algorithms
- balancing exploitation and exploration

(cf. Mitchell)

Luger: Artificial Intelligence, 5th edition. © Pearson Education Limited, 2005

Genetic algorithms visualized as parallel hill climbing, adapted from Holland (1986).

A Simple GA

- Initialize the population with n randomly generated chromosomes of size m.
- Repeat until fitness reaches a specified value or a specified number of generations has been completed:
  - Compute the fitness of each member of the population.
  - Repeat until n offspring have been created:
    - Probabilistically select two parents from the current population (based on fitness).
    - Probabilistically perform crossover at random locations (generating two new individuals).
    - Probabilistically mutate each offspring at random locations.
  - Replace the current population with the new population.
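
The loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the demo code from the lecture: the one-max fitness function, single-point crossover, the default rates, and all function names are chosen only for the example. The sketch also remembers the best chromosome seen so far, which the outline does not mention.

```python
import random


def one_max(chromosome):
    """Toy fitness (an assumption for this sketch): the number of 1 bits."""
    return sum(chromosome)


def select_parent(population, fitnesses, rng):
    """Pick one parent with probability proportional to its fitness."""
    if sum(fitnesses) == 0:
        return rng.choice(population)  # no fitness signal yet: pick uniformly
    return rng.choices(population, weights=fitnesses, k=1)[0]


def crossover(a, b, rate, rng):
    """With probability `rate`, cut both parents at one random point and swap tails."""
    if len(a) > 1 and rng.random() < rate:
        point = rng.randrange(1, len(a))
        return a[:point] + b[point:], b[:point] + a[point:]
    return a[:], b[:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def simple_ga(fitness, n=20, m=16, target=None, max_generations=100,
              crossover_rate=0.7, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Initialize the population with n random chromosomes of size m.
    population = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        # Compute the fitness of each member of the population.
        fitnesses = [fitness(c) for c in population]
        best = max(population + [best], key=fitness)
        if target is not None and fitness(best) >= target:
            break
        # Create n offspring by selection, crossover, and mutation.
        offspring = []
        while len(offspring) < n:
            p1 = select_parent(population, fitnesses, rng)
            p2 = select_parent(population, fitnesses, rng)
            c1, c2 = crossover(p1, p2, crossover_rate, rng)
            offspring.append(mutate(c1, mutation_rate, rng))
            if len(offspring) < n:
                offspring.append(mutate(c2, mutation_rate, rng))
        # Replace the current population with the new population.
        population = offspring
    return max(population + [best], key=fitness)


if __name__ == "__main__":
    winner = simple_ga(one_max, target=16)
    print(winner, one_max(winner))
```

Passing `target=None` makes the run stop only on the generation limit, which corresponds to the second stopping rule in the outline.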

Fitness-Proportionate Selection

- An individual with fitness z should have a reproductive probability of z / t, where t is the total fitness of all members of the current population.
- Can be implemented via roulette-wheel sampling:
  - Compute the normalized fitness ($FN_i$) for each member of the population and put the results in a fitness array.
    $FN_i = F_i / SF$ (where $SF$ is the sum of all fitness values)
  - Compute the cumulative normalized fitness ($FC_i$) for each member of the population and put the results in a fitness array.
    $FC_0 = FN_0$
    $FC_i = FC_{i-1} + FN_i$ (for $i > 0$)
  - Select members to be reproduced, proportional to their fitness.
    - Generate P random numbers $R_k$.
    - For each random number $R_k$, find the member $i$ such that $FC_{i-1} < R_k \le FC_i$.


cf. Kennedy and Eberhart, 2001
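
A roulette-wheel sampler along these lines might look like the sketch below. The function names are illustrative, and two details are assumptions that the slide leaves open: each $R_k$ is drawn uniformly from (0, 1], and $FC_{-1}$ is taken to be 0 so that the first member can be selected.

```python
import bisect
import itertools
import random


def cumulative_normalized_fitness(fitnesses):
    """Return the FC array: running sums of FN_i = F_i / SF."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("total fitness must be positive")
    normalized = [f / total for f in fitnesses]
    return list(itertools.accumulate(normalized))


def roulette_select(fitnesses, p, rng=None):
    """Return the indices of p members chosen in proportion to fitness."""
    rng = rng or random.Random()
    fc = cumulative_normalized_fitness(fitnesses)
    fc[-1] = 1.0  # guard against floating-point rounding in the last sum
    picks = []
    for _ in range(p):
        r = 1.0 - rng.random()  # uniform in (0, 1]
        # bisect_left finds the first i with fc[i] >= r,
        # i.e. the i satisfying FC_{i-1} < r <= FC_i.
        picks.append(bisect.bisect_left(fc, r))
    return picks


if __name__ == "__main__":
    print(cumulative_normalized_fitness([1, 1, 2]))
    print(roulette_select([1, 1, 2], 5, random.Random(0)))
```

Because the search is a binary search over the cumulative array, each draw costs O(log n) rather than a linear scan; members with zero fitness are never selected.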

GA Example

An airline company has a certain number of items
to fly from Jakarta to Singapore. Each item has a
specific weight and value (corresponding to the
profit it will bring when it reaches its
destination). Unfortunately, the airplane being
used can only carry a limited amount of cargo (by
weight). The airline must, therefore, determine
which items to carry so as not to exceed the
airplane’s capacity but also so as to maximize its
profit. Design and implement a genetic algorithm
solution to this problem.

Airline Problem

Consider a collection of items from which to choose (for a shipment not to exceed W pounds):

Item    Weight    Value
1          5.0     15.0
2         10.0     20.0
3          8.0     12.0
4          7.0     13.0
5         12.0      8.0
6         15.0     17.0
7          4.0     10.0
8         11.0      5.0
9          3.0      8.0
10         9.0      9.0
Total     84.0    117.0

The 0-1 Knapsack Problem

(a naïve solution: select items on the basis of value/weight)

http://www-cse.uta.edu/~holder/courses/cse2320/lectures/l15/node11.html

http://xkcd.com/287/
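
For comparison with the GA, the naïve value/weight heuristic can be sketched as below, using the item table from the Airline Problem. The capacity of 40 pounds is a hypothetical value, since the slides leave W unspecified, and the function name is illustrative. For the 0-1 knapsack problem this greedy rule is not guaranteed to find the best shipment.

```python
# (item number, weight, value) rows from the Airline Problem table.
ITEMS = [
    (1, 5.0, 15.0), (2, 10.0, 20.0), (3, 8.0, 12.0), (4, 7.0, 13.0),
    (5, 12.0, 8.0), (6, 15.0, 17.0), (7, 4.0, 10.0), (8, 11.0, 5.0),
    (9, 3.0, 8.0), (10, 9.0, 9.0),
]


def greedy_shipment(items, capacity):
    """Take items in decreasing value/weight order while they still fit."""
    chosen, weight, value = [], 0.0, 0.0
    by_ratio = sorted(items, key=lambda item: item[2] / item[1], reverse=True)
    for number, w, v in by_ratio:
        if weight + w <= capacity:
            chosen.append(number)
            weight += w
            value += v
    return sorted(chosen), weight, value


if __name__ == "__main__":
    # W = 40 is an assumed capacity, not a figure from the slides.
    print(greedy_shipment(ITEMS, 40.0))
```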

What factors must be considered when
designing a GA approach to this problem?

Some factors to consider

- Data representation (encoding: what, how, binary or decimal, etc.)
- How to do crossover
- Whether to do mutation (and if so, how to do it)
- Fitness evaluation
- Population size and limit
- Number of births per generation
- How to determine when to quit (# generations; fitness ceiling; etc.)
- Control parameters
- Processing logic



Airline Problem via Genetic Algorithm

- Encode a possible shipment of N items as an N-bit string.
- Bit position i represents the selection (1) or omission (0) of item i.
- Example: for 10 items, the string 0110000010 indicates that the second, third, and ninth items are being considered for shipment.
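
The encoding can be paired with a fitness function roughly as follows. The slides do not say how overweight shipments are scored, so this sketch assumes they simply receive a fitness of 0; the capacity of 40 pounds and the function names are likewise hypothetical. The weights and values come from the Airline Problem table.

```python
# Weights and values of items 1..10 from the Airline Problem table.
WEIGHTS = [5.0, 10.0, 8.0, 7.0, 12.0, 15.0, 4.0, 11.0, 3.0, 9.0]
VALUES = [15.0, 20.0, 12.0, 13.0, 8.0, 17.0, 10.0, 5.0, 8.0, 9.0]


def decode(bits):
    """Return the 1-based item numbers selected by an N-bit string."""
    return [i + 1 for i, bit in enumerate(bits) if bit == "1"]


def shipment_totals(bits):
    """Return (total weight, total value) of the selected items."""
    selected = decode(bits)
    weight = sum(WEIGHTS[i - 1] for i in selected)
    value = sum(VALUES[i - 1] for i in selected)
    return weight, value


def fitness(bits, capacity):
    """Total value if the shipment fits, otherwise 0 (an assumed penalty rule)."""
    weight, value = shipment_totals(bits)
    return value if weight <= capacity else 0.0


if __name__ == "__main__":
    print(decode("0110000010"))
    print(shipment_totals("0110000010"))
    print(fitness("0110000010", capacity=40.0))  # 40.0 is an assumed W
```

A softer penalty, such as subtracting an amount proportional to the excess weight, is another common design choice; it lets the search move through slightly overweight shipments instead of treating them all as equally bad.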


DEMO

The Traveling Salesman Problem


What is the shortest route a salesman can
take to visit all of the cities in his territory
and return home?


The (Real) Traveling Salesman “Problem”

- As the number of cities increases, the time it takes to find an exact solution by exhaustive search grows faster than exponentially: n cities admit (n - 1)!/2 distinct round trips.
- Example:

Number of cities    # paths                    Time to solve (on a fast PC)
8                   2,520                      almost instantaneously
10                  181,440                    1 second
12                  20 million                 20 seconds
20                  60,800,000,000,000,000     ?
100                 4.67 x 10^155              ?
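
The path counts in the table follow from fixing the start city and counting each round trip once regardless of direction, which gives (n - 1)!/2 tours. A short check, with an illustrative function name:

```python
import math


def count_tours(n):
    """Number of distinct round trips through n cities: (n - 1)! / 2."""
    if n < 3:
        raise ValueError("need at least 3 cities for distinct round trips")
    return math.factorial(n - 1) // 2


if __name__ == "__main__":
    for n in (8, 10, 12, 20, 100):
        print(n, f"{count_tours(n):.3g}")
```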


Applying a Genetic Algorithm to the Traveling Salesman Problem

- Encode a population of paths
    AFDHBGEJCI    fitness = .8
    EBDGAFJICH    fitness = .4
    DGEIAFBJHC    fitness = .6
- Select (based on fitness)
    CAGBJEIDHF
    IDJGEBFHAC
- Crossover
    CA GBJE IDHF  →  CA JGEB IDHF
    ID JGEB FHAC  →  ID GBJE FHAC
- Mutate
    I D GBJEF H AC  →  I H GBJEF D AC
- Repeat for many “generations”
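
The crossover and mutation steps shown above can be reproduced with the sketch below, where a tour is a string of city letters. The function names are illustrative. Exchanging a segment between two parents yields valid tours only when both segments contain the same cities, as they do in the slide's example; `order_crossover` is a swapped-in, simplified order-preserving variant (not taken from the slides) that always returns a valid tour.

```python
def is_valid_tour(tour, cities):
    """A valid tour visits every city exactly once."""
    return sorted(tour) == sorted(cities)


def exchange_segment(p1, p2, start, end):
    """Swap the slice [start:end] between two parents, as in the slide."""
    c1 = p1[:start] + p2[start:end] + p1[end:]
    c2 = p2[:start] + p1[start:end] + p2[end:]
    return c1, c2


def swap_mutation(tour, i, j):
    """Exchange the cities at positions i and j."""
    cities = list(tour)
    cities[i], cities[j] = cities[j], cities[i]
    return "".join(cities)


def order_crossover(p1, p2, start, end):
    """Keep p1[start:end] in place; fill the other positions in p2's order."""
    segment = p1[start:end]
    rest = [city for city in p2 if city not in segment]
    return "".join(rest[:start]) + segment + "".join(rest[start:])


if __name__ == "__main__":
    c1, c2 = exchange_segment("CAGBJEIDHF", "IDJGEBFHAC", 2, 6)
    print(c1, c2)
    print(swap_mutation(c2, 1, 7))
    print(order_crossover("ABCDEFGHIJ", "JIHGFEDCBA", 3, 6))
```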

Applying a Genetic Algorithm to the Traveling Salesman Problem

Solution is the most fit individual after some specified number of generations: e.g., ACDEFIJGHB (i.e., the GA's estimate of the best route, which is not guaranteed to be optimal)

Key: The solution is obtained relatively quickly.

Demo

Determinism and randomness can interact productively

Genetic algorithms illustrate the interplay between determinism and randomness during search (exploration and exploitation)

Insights from the TSP

- We used a biological metaphor (natural selection) to construct the algorithm to solve this problem.
- Determinism and randomness can interact productively.
- The TSP is itself a metaphor for many other problems of interest.


Insights from the TSP (continued)

A non-rooted phylogenetic tree

Multiple Sequence Alignment

     A   B   C   D
A    0   2   1   3
B    2   0   5   3
C    1   5   0   1
D    3   3   1   0

Inter-sequence distances

(one approach)

Initialization
  1. create G_0

Evaluation
  2. evaluate the population of generation n (G_n)
  3. if the population is stabilized then END
  4. select the individuals to replace
  5. evaluate the expected offspring (EO)

Breeding
  6. select the parent(s) from G_n
  7. select the operator
  8. generate the new child
  9. keep or discard the new child in G_{n+1}
  10. goto 6 until all the children have been successfully put into G_{n+1}
  11. n = n+1
  12. goto EVALUATION

End
  13. end

SAGA: Sequence Alignment by Genetic Algorithm

(Notredame and Higgins, 1996)

http://www.tcoffee.org/Publications/Ps_pdf/saga_paper.pdf
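
The thirteen steps can be traced in a generic sketch such as the one below. It is not SAGA itself: it evolves bit tuples on a toy problem rather than alignments, and several choices are assumptions made only for illustration (the worst half of the population is replaced, expected offspring are fitness shifted to be positive, operators are picked uniformly at random, "stabilized" means no improvement for `patience` generations, and a generation cap is added). All names are hypothetical.

```python
import random


def evolve(fitness, new_individual, operators, size=20, patience=10,
           max_generations=200, seed=0):
    rng = random.Random(seed)
    population = [new_individual(rng) for _ in range(size)]   # 1. create G_0
    best, stale = None, 0
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness, reverse=True)  # 2. evaluate G_n
        if best is not None and fitness(ranked[0]) <= fitness(best):
            stale += 1
        else:
            best, stale = ranked[0], 0
        if stale >= patience:                # 3. population stabilized: END
            break
        next_gen = ranked[: size // 2]       # 4. the worst half is replaced
        scores = [fitness(ind) for ind in ranked]
        low = min(scores)
        expected = [s - low + 1.0 for s in scores]   # 5. expected offspring (EO)
        attempts = 0
        while len(next_gen) < size:          # 10. until G_{n+1} is full
            mum, dad = rng.choices(ranked, weights=expected, k=2)  # 6. parents
            operator = rng.choice(operators)                       # 7. operator
            child = operator(mum, dad, rng)                        # 8. new child
            attempts += 1
            # 9. keep the child unless it duplicates a member of G_{n+1};
            #    duplicates are accepted after many failed attempts.
            if child not in next_gen or attempts > 50 * size:
                next_gen.append(child)
        population = next_gen                # 11. n = n + 1, 12. goto EVALUATION
    return best                              # 13. end


def random_bits(rng, length=12):
    return tuple(rng.randint(0, 1) for _ in range(length))


def one_point(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def bit_flip(a, _b, rng):
    i = rng.randrange(len(a))
    return a[:i] + (1 - a[i],) + a[i + 1:]


if __name__ == "__main__":
    print(evolve(sum, random_bits, [one_point, bit_flip]))
```

Unlike the simple GA, this loop carries part of each generation over unchanged, so the best fitness in the population never decreases from one generation to the next.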

SAGA (continued)

Notredame and Higgins

backwards

Mount
