Genetic Algorithms

pocketsore, Artificial Intelligence and Robotics

24 Oct 2013


Overview


Genetic Algorithms: a gentle introduction


What are GAs


How do they work, and why?


Critical issues



Use in Data Mining


GAs and statistics


decile performance maximization


multi-objective models

Natural Genetics to AI


Computational models inspired by biological evolution


survival of the fittest


reproduction through cross-breeding

Genetic Algorithms


Population-based search (parallel)


simultaneous search from multiple points in search space



useful in complex, unstructured search spaces (less prone to getting trapped in local optima)



Population members: potential solutions



The population of solutions evolves from one generation to the next



Genetic Algorithms


Search objective


Fitness score for population members (fitness function)


Survival of the fittest


selection


Generating new solutions


“Mating” and reproduction of individuals (crossover, mutation)


Basic Operation

[Figure: Generation t -> Selection -> Recombination (Crossover, Mutation) -> Generation t+1]
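The cycle from generation t to generation t+1 can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own implementation: the fitness function `one_max`, the use of tournament selection as a simple stand-in for the selection step, one-point crossover, and all parameter defaults are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """Hypothetical fitness function: the number of 1-bits in the chromosome."""
    return sum(bits)

def tournament(population, fitness, rng, k=2):
    """Selection stand-in: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)

def one_point_crossover(a, b, rng):
    """Recombination: swap the tails of two parents at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, rate, rng):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in bits]

def run_ga(length=20, pop_size=30, generations=40, rate=0.02, seed=0):
    """Evolve a population of bit strings and return the best final individual."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = []
        while len(next_gen) < pop_size:
            p1 = tournament(population, one_max, rng)   # selection
            p2 = tournament(population, one_max, rng)
            c1, c2 = one_point_crossover(p1, p2, rng)   # recombination
            next_gen += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        population = next_gen[:pop_size]                # generation t+1
    return max(population, key=one_max)
```

Calling `run_ga(seed=1)` returns one 20-bit individual; a different `seed` gives a different random stream and possibly a different result.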

GAs: Parallel Search

[Figure: fitness plotted against x, contrasting a single hill climber with GA search points (marked X)]

GAs: Basic Principles


Representation of individuals


String of parameters (genes): chromosome

e.g. optimize a function F(p,q,r,s,t)

Population members: p q r s t

genotype and phenotype

Binary representation?


Population members as bit strings


F(p,q,r,s,t) as:

1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 1 0 1 0

p q r s t



early theory in terms of binary strings (schema theorem)


unnecessary perversity?
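To make the genotype/phenotype distinction concrete, the bit string above can be decoded into the five parameters. The slide does not state the field widths; the sketch below assumes four bits per parameter, read as unsigned integers, and the helper name `decode` is hypothetical.

```python
def decode(bits, n_params, bits_per_param):
    """Split a bit string into fixed-width fields and read each as an unsigned integer."""
    if len(bits) != n_params * bits_per_param:
        raise ValueError("bit string length does not match the assumed layout")
    values = []
    for i in range(n_params):
        field = bits[i * bits_per_param:(i + 1) * bits_per_param]
        values.append(int("".join(str(b) for b in field), 2))
    return values

# Genotype: the 20-bit string from the slide (assumed layout: 5 fields of 4 bits).
chromosome = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]

# Phenotype: the parameter values passed to F.
p, q, r, s, t = decode(chromosome, n_params=5, bits_per_param=4)
# p, q, r, s, t == 9, 10, 13, 9, 10
```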



GAs: Basic Principles


Survival of the fittest (fitness function)



numerical “figure of merit”/utility measure of an individual


tradeoff among multiple evaluation criteria


efficient evaluation

GAs: Basic Principles


Iterative search


population evolves over generations



Convergence


progression towards uniformity in population


premature convergence? (local optima)
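One simple way to watch the progression towards uniformity is to measure, for each gene position, how dominant the most common value is. The measure below is only one possible choice, and the name `gene_uniformity` is hypothetical.

```python
def gene_uniformity(population):
    """Average, over gene positions, of the share held by the most common value.

    Returns 1.0 when every individual is identical (full convergence).
    """
    n = len(population)
    length = len(population[0])
    total = 0.0
    for pos in range(length):
        column = [individual[pos] for individual in population]
        most_common = max(column.count(value) for value in set(column))
        total += most_common / n
    return total / length
```

A value close to 1.0 early in a run, while fitness is still poor, is one possible sign of premature convergence.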


Typical GA Run

[Figure: fitness plotted against generations, with curves for best and average fitness]

Operators: Selection


Fitness-proportionate selection (f_i / f̄)

expected number of reproductive trials for each individual

Selection


Roulette-wheel selection (stochastic sampling with replacement)

wheel slots sized in proportion to fitness values


N (pop size) spins of the wheel



Stochastic universal sampling


N equally spaced pins on wheel


single turn of the wheel
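Both sampling schemes can be sketched over a list of non-negative fitness values. The function names and the use of cumulative sums with `bisect_right` are illustrative choices, not something the slides prescribe.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel(fitnesses, n, rng):
    """N independent spins: each spin picks an index with probability f_i / sum(f)."""
    cumulative = list(accumulate(fitnesses))
    total = cumulative[-1]
    picks = []
    for _ in range(n):
        pointer = rng.random() * total
        # Index of the first slot whose upper edge lies beyond the pointer.
        picks.append(min(bisect_right(cumulative, pointer), len(cumulative) - 1))
    return picks

def stochastic_universal_sampling(fitnesses, n, rng):
    """One spin with N equally spaced pointers on the same wheel."""
    cumulative = list(accumulate(fitnesses))
    step = cumulative[-1] / n
    start = rng.random() * step          # the single random turn of the wheel
    picks = []
    for k in range(n):
        pointer = start + k * step
        picks.append(min(bisect_right(cumulative, pointer), len(cumulative) - 1))
    return picks

rng = random.Random(0)
sus_picks = stochastic_universal_sampling([1, 1, 2], 4, rng)
# sus_picks == [0, 1, 2, 2] for any random start: each individual receives
# exactly its expected number of trials (1, 1 and 2).
```

With roulette-wheel selection the number of copies an individual receives can drift far from its expected value; stochastic universal sampling keeps it within the floor and ceiling of that value.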


Selection


Premature convergence


Fitness scaling


f' = f - (2*avg - max)



Ranked fitness


Elitism


Steady-state selection


Demetic grouping
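Two of the remedies listed above, fitness scaling and ranked fitness, can be sketched as follows. Subtracting (2*avg - max) shifts the values so that the best individual scores twice the population average. Clamping negative results to zero is an added assumption (it slightly changes the average when it applies), and both function names are hypothetical.

```python
def scale_fitness(fitnesses):
    """Shift fitness by (2*avg - max) so that, before clamping, max' = 2 * avg'."""
    avg = sum(fitnesses) / len(fitnesses)
    offset = 2 * avg - max(fitnesses)
    # Assumption: negative scaled values are clamped to zero.
    return [max(f - offset, 0.0) for f in fitnesses]

def rank_fitness(fitnesses):
    """Replace raw fitness by rank (1 = worst), which limits the pull of outliers."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = [0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

scaled = scale_fitness([10, 12, 14])   # [0.0, 2.0, 4.0]: max 4.0 is twice the mean 2.0
ranked = rank_fitness([3.0, 100.0, 1.0])   # [2, 3, 1]
```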

Operators: Crossover


Parent 1:    axpsqvqbtpihd

Parent 2:    qzxxaycgbtphw

Offspring 1: azpsavcbtpphd

Offspring 2: qxxxqyqgbtihw

(Uniform crossover; crossover sites at positions 2, 5, 7 and 11, where the parents exchange genes)


combining good building blocks
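The crossover example can be reproduced with a swap mask. The mask below is inferred by comparing the slide's parents with its offspring (genes are exchanged at 1-based positions 2, 5, 7 and 11); the function names are hypothetical, and in practice the mask would be drawn at random.

```python
import random

def uniform_crossover(parent1, parent2, swap_mask):
    """Exchange the genes of two equal-length strings wherever swap_mask is True."""
    child1, child2 = [], []
    for g1, g2, swap in zip(parent1, parent2, swap_mask):
        if swap:
            g1, g2 = g2, g1
        child1.append(g1)
        child2.append(g2)
    return "".join(child1), "".join(child2)

def random_mask(length, rng, p_swap=0.5):
    """Draw an independent swap decision for every gene position."""
    return [rng.random() < p_swap for _ in range(length)]

p1, p2 = "axpsqvqbtpihd", "qzxxaycgbtphw"
sites = {2, 5, 7, 11}                      # 1-based crossover sites inferred from the slide
mask = [i in sites for i in range(1, len(p1) + 1)]
o1, o2 = uniform_crossover(p1, p2, mask)
# o1 == "azpsavcbtpphd", o2 == "qxxxqyqgbtihw"
```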

Operators: Mutation


alters each gene with small probability



Before: x 1 y x 0 y 0 y y 0 x y x y

After:  x 1 y x 0 y 1 y y 0 x x x y
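A per-gene mutation operator can be sketched as below. The alphabet, the rate, and the rule that a mutated gene always changes to a different symbol are assumptions made for illustration.

```python
import random

def mutate(chromosome, alphabet, rate, rng):
    """Replace each gene, with probability `rate`, by a different symbol from `alphabet`."""
    mutated = []
    for gene in chromosome:
        if rng.random() < rate:
            gene = rng.choice([symbol for symbol in alphabet if symbol != gene])
        mutated.append(gene)
    return mutated

rng = random.Random(0)
child = mutate([0, 1, 1, 0, 1, 0, 0, 1], alphabet=(0, 1), rate=0.05, rng=rng)
```

Rates are normally kept small so that mutation perturbs solutions without destroying what crossover and selection have assembled.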


Non-Binary Representations

Integer, real-number, order-based, rules, ...

Binary or real-valued?

real-valued representations give faster, more consistent, more accurate results

High-level representation

intuitive, can utilize specialized operators


effective search over complex spaces

Real-valued representation

Parent 1:    3.45  0.56  6.78  0.976  2.5

Parent 2:    0.98  1.06  4.20  0.34   1.8

Offspring 1: 3.22  0.56  6.78  0.65   2.12

Offspring 2: 1.43  1.06  4.20  0.41   1.93

(Arithmetic crossover)
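A common form of arithmetic crossover blends the two parents gene by gene with a weight alpha. The sketch below applies one alpha to every gene, which is an assumption: the slide's example blends some genes and copies others unchanged, so this code is not expected to reproduce those exact numbers.

```python
def arithmetic_crossover(parent1, parent2, alpha):
    """Blend two real-valued parents: child1 leans towards parent1, child2 towards parent2."""
    child1 = [alpha * a + (1 - alpha) * b for a, b in zip(parent1, parent2)]
    child2 = [(1 - alpha) * a + alpha * b for a, b in zip(parent1, parent2)]
    return child1, child2

parent1 = [3.45, 0.56, 6.78, 0.976, 2.5]
parent2 = [0.98, 1.06, 4.20, 0.34, 1.8]
child1, child2 = arithmetic_crossover(parent1, parent2, alpha=0.7)
```

With this form each pair of offspring genes has the same sum as the corresponding parent genes, and every offspring value lies between the two parent values when alpha is in [0, 1].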


High-level representation

[Figure: two parents recombined into two offspring]

High-level representation


Generalize/Specialize



Tree-structured representation (GP)

[Figure: parse tree for (x * log(y)) / 5, with root /, whose children are * and 5; * has children x and log; log has child y]


Automated learning of programs (originally)


parse tree expressions



Non-linear interaction terms

Function set: internal nodes {+, -, *, /, log}

Terminal set: leaf nodes {constants, variables}
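A parse tree built from this function set and terminal set can be represented with nested tuples and evaluated recursively. The tuple encoding and the names `FUNCTIONS` and `evaluate` are assumptions for illustration; the tree encodes the expression (x * log(y)) / 5 from the earlier slide.

```python
import math

# Function set: internal nodes.
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "log": lambda a: math.log(a),
}

def evaluate(node, variables):
    """Evaluate a parse tree: tuples are function nodes, strings are variables, numbers are constants."""
    if isinstance(node, tuple):
        op, *args = node
        return FUNCTIONS[op](*(evaluate(arg, variables) for arg in args))
    if isinstance(node, str):
        return variables[node]
    return node

# (x * log(y)) / 5
tree = ("/", ("*", "x", ("log", "y")), 5)
value = evaluate(tree, {"x": 10, "y": 1})   # log(1) is 0, so the result is 0.0
```

Genetic programming systems usually use protected versions of division and log so that randomly generated trees never raise errors; this sketch leaves them unprotected.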


Tree-structured representation

Representing complex patterns

[Figure: parse tree with root "if"; condition AND(<(y, 7), >(x, 2)); then-branch 0; else-branch +(*(2, x), y)]

If (y < 7) and (x > 2) then 0 else 2x + y
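The rule above can be written as a nested-tuple parse tree and evaluated directly. The tuple encoding, the operator table `OPS`, and the special handling of the "if" node are illustrative assumptions.

```python
OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
    "AND": lambda a, b: a and b,
}

def evaluate(node, env):
    """Evaluate a rule tree; an 'if' node only evaluates the branch that is selected."""
    if isinstance(node, tuple):
        op, *args = node
        if op == "if":
            condition, then_branch, else_branch = args
            chosen = then_branch if evaluate(condition, env) else else_branch
            return evaluate(chosen, env)
        return OPS[op](*(evaluate(arg, env) for arg in args))
    if isinstance(node, str):
        return env[node]
    return node

# If (y < 7) and (x > 2) then 0 else 2x + y
rule = ("if",
        ("AND", ("<", "y", 7), (">", "x", 2)),
        0,
        ("+", ("*", 2, "x"), "y"))

inside = evaluate(rule, {"x": 3, "y": 5})    # condition holds, result 0
outside = evaluate(rule, {"x": 1, "y": 5})   # condition fails, result 2*1 + 5 = 7
```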

Genetic search: Issues


Coding scheme and fitness function are critical

the “art” in GA design!

General mechanism is so robust that, within reasonable margins, parameter settings are not critical.



Representation to match problem, domain


utilizing domain knowledge


problem-specific crossover, mutation, selection



Flexibility in fitness function formulation


modeling business objectives


Genetic search: Issues


Stochastic search


initial populations, probabilistic operators


multiple runs with different random streams


Initializing population with known solutions


seeding initial population with solutions from multiple, independent runs
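Both ideas, independent random streams and seeded initial populations, can be sketched with Python's `random.Random`. The function names, the binary encoding, and the example sizes are assumptions.

```python
import random

def initial_population(pop_size, length, rng, known_solutions=()):
    """Start from any known solutions, then fill the rest with random bit strings."""
    population = [list(solution) for solution in known_solutions][:pop_size]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(length)])
    return population

def populations_for_runs(seeds, pop_size, length, known_solutions=()):
    """Build one initial population per run, each from its own random stream."""
    return {seed: initial_population(pop_size, length, random.Random(seed), known_solutions)
            for seed in seeds}

# Hypothetical best individuals carried over from earlier, independent runs.
earlier_bests = [[1] * 16, [1, 0] * 8]
starts = populations_for_runs(seeds=[1, 2, 3], pop_size=10, length=16,
                              known_solutions=earlier_bests)
```

Fixing the seed makes each run repeatable, so differences between runs can be attributed to the random stream rather than to changes in the setup.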

Genetic search: Issues


Guarantees optimality?


But...



GAs and traditional techniques


especially useful where traditional approaches fail


in conjunction with traditional techniques



Parallelizable for large data


multi-processor, networked machines


Using GAs?


When to use a GA?


GA and traditional techniques


How long does it take?


Will it perform better?

Using GAs


population size


mutation, crossover rates


how many generations


multiple runs


Is it a “black box”?


Data characteristics


Fitness function


GA parameters

GA Application Examples


Function optimizers


difficult, discontinuous, multi-modal, noisy functions


Combinatorial optimization


layout of VLSI circuits, factory scheduling, traveling salesman problem


Design and Control


design of bridge structures, neural networks, communication networks; control of chemical plants, pipelines


GA Application Examples


Machine learning


classification rules, economic modeling, scheduling strategies



Portfolio design, optimized trading models, direct marketing models, sequencing of TV advertisements, adaptive agents, data mining, etc.