MULTI-OBJECTIVE LEARNING VIA GENETIC ALGORITHMS

J. David Schaffer
Department of Electrical Engineering
John J. Grefenstette
Department of Computer Science
Vanderbilt University
Nashville, TN 37235
ABSTRACT
Genetic algorithms (GAs) are powerful, general purpose adaptive
search techniques which have been used successfully in a variety of
learning systems. In the standard formulation, GAs maintain a set of
alternative knowledge structures for the task to be learned, and
improved knowledge structures are formed through a combination of
competition and knowledge sharing among the alternative knowledge
structures. In this paper, we extend the GA paradigm by allowing
multidimensional feedback concerning the performance of the
alternative structures. The modified GA is shown to solve a
multiclass pattern discrimination task which could not be solved by
the unmodified GA.

1. Introduction
Pattern classification is a central task in flexible systems which
incorporate an arsenal of problem solving techniques. For example, a
given problem instance may need to be classified in order to decide
which problem solving method should be applied. This paper concerns
the task of multi-class pattern discrimination in a learning system.
As in Mitchell [8], we view learning as a search process. But rather
than searching a space of concepts, we consider a learning system
which searches a space of production system programs for programs
which adequately accomplish the desired pattern classification. The
search is accomplished by means of a genetic algorithm (GA). GAs are
powerful adaptive search techniques which have been used successfully
in a variety of learning systems [3,4,5,12]. In this paper, GAs are
extended in order to perform multi-objective learning in a pattern
classification domain.

2. Learning via Genetic Search
This section contains a brief description of GAs. More detailed
descriptions are available in the literature [3,6,7,12]. Briefly, GAs
may be viewed as adaptive generate-and-test procedures. GAs are
adaptive in the sense that the candidate solutions generated reflect
and exploit information obtained by earlier tests. A GA maintains a
population of knowledge structures (e.g. alternative sets of
production rules for a given task) and repeatedly (1) selects
structures on the basis of observed performance, and (2) applies
idealized genetic operators to the selected structures to construct
new structures. For example, one important genetic operator is
crossover, by which subsets of rules may be exchanged between two
alternative knowledge structures. This results in a sophisticated
search in which subsets of rules which contribute to good performance
are propagated through the population. Other genetic operators are
described in Smith [12]. In this paper, we concentrate on the
influence of the performance feedback on step (1) above, the
selection of knowledge structures for reproduction.
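
The two-step cycle described above can be sketched in a few lines of
Python. This is a minimal illustration and not the LS-1 code: the
function names, the fitness-proportional selection rule, the single
cut point, and the representation of a knowledge structure as a list
of rules are all assumptions made for the example. It further assumes
non-negative fitness values and rule sets of equal length with at
least two rules.

```python
import random

def select(population, fitnesses, rng):
    """Step (1): draw parents with probability proportional to fitness."""
    # A tiny offset keeps the weights valid when every fitness is zero.
    weights = [f + 1e-9 for f in fitnesses]
    return rng.choices(population, weights=weights, k=len(population))

def crossover(rules_a, rules_b, rng):
    """Step (2): exchange the rules after a random cut point."""
    cut = rng.randrange(1, len(rules_a))
    return rules_a[:cut] + rules_b[cut:], rules_b[:cut] + rules_a[cut:]

def next_generation(population, evaluate, rng):
    """One GA cycle: evaluate every structure, select, then recombine."""
    fitnesses = [evaluate(structure) for structure in population]
    parents = select(population, fitnesses, rng)
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        children.extend(crossover(a, b, rng))
    if len(parents) % 2:
        # An unpaired parent is copied unchanged (an assumption).
        children.append(parents[-1])
    return children
```

Because crossover cuts between rules rather than inside them, a
subset of rules that works well together can be passed intact from a
parent to its offspring.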

The characterization of the power and limitations of GAs is an active
research area, but preliminary theoretical results are available. For
example, the number of structures in the population which contain a
given subset of rules can be expected to increase or decrease over
time at a rate proportional to the average observed performance of
all knowledge structures which contain that set of rules [11]. Thus,
all subsets of rules appearing in the population of knowledge
structures are explored simultaneously in a near-optimal fashion, a
phenomenon which is called implicit parallelism by Holland [7].
Bethke [2] describes some properties of search spaces which may be
especially hard for GAs.
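
The growth result mentioned above can be written compactly. The
notation below is an assumption introduced for illustration and does
not appear in this paper: m(H,t) is the number of structures at
generation t that contain the rule subset H, f(H,t) is the average
observed performance of those structures, and \bar{f}(t) is the
average performance of the whole population. Under
fitness-proportional selection alone, ignoring any disruption of H by
the genetic operators, the expected count in the next generation is:

```latex
\mathbb{E}\bigl[m(H,\,t+1)\bigr] \;=\; m(H,\,t)\,\frac{f(H,\,t)}{\bar{f}(t)}
```

A rule subset whose carriers perform above the population average is
therefore expected to appear in more structures, and one whose
carriers perform below average in fewer.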

Smith [12] implemented a GA-based machine learning system called
LS-1. In LS-1, each structure maintained by the GA represents a
production system (PS) program. Each PS program is evaluated on the
learning task by a critic which assigns a numerical measure of
fitness to the evaluated program. When all of the PS programs in the
current population have been evaluated, the GA is invoked to
construct a new population of PS programs, and the cycle is repeated.
(See Figure 1.) LS-1 successfully learned PS programs for maze tasks
and for draw poker. Our work extends the LS-1 system in order to
achieve multi-objective learning.

3. The Task Domain
Multi-class pattern discrimination was selected as a representative
multi-objective learning task. The specific task under investigation
was to classify muscle activity patterns for five human gait classes,
representing one normal and four abnormal gait types. A training set
of 11 test cases was obtained from the literature [1], each training
case consisting of a 12-bit string derived from EMG signals from leg
muscles while walking. As in LS-1, we used a GA to search a space of
knowledge structures (i.e., rule sets). The goal of the learning
system was to find a knowledge structure which correctly classifies
the 11 training cases. The performance of the learning system was
measured by the number of knowledge structures tested before
obtaining a solution. By choosing subsets of the training cases, we
derived 2-class, 3-class, 4-class, and 5-class discrimination
problems. Each experiment was started with an initial population of
randomly generated knowledge structures, in order to test the power
of the GA with no initial knowledge. (Our implementation system [9]
does allow the incorporation of heuristic knowledge into the initial
population of knowledge structures.)

4. The Need for Multidimensional Feedback
A series of experiments was performed in which a GA using a scalar
critic was applied to multi-class discrimination problems. Although
the GA could solve 2-class problems, it could not solve the full
five-class problem. An analysis of the individual PS programs
generated by the GA at various times during search revealed a common
pattern. Knowledge of how to recognize a particular class was
frequently absent from later PS programs, even when such knowledge
was present in earlier programs. The problem is that structures which
contain complementary knowledge are forced to compete by a GA using a
scalar critic. Consider this simple example: Suppose that the critic
measures the fitness of each structure by counting the number of
training cases which were correctly classified. Suppose program P1
contains rules which correctly classify classes A and B, and program
P2 correctly classifies only instances in class C. If all classes are
equally represented in the training set, then program P1 appears to
be twice as "fit" as program P2. Since the GA selects programs for
reproduction on the basis of the fitness assigned by the critic, P1
will tend to contribute rule subsets to twice as many new programs in
the next population as will P2. As the number of classes increases,
specialized knowledge (like that in P2) may tend to suffer
extinction, finally resulting in suboptimal performance for the
learning system as a whole.
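
The P1 and P2 example can be checked with a short calculation. The
sketch below assumes fitness-proportional selection and two training
cases per class; both are illustrative choices, and the helper name
expected_offspring is hypothetical.

```python
def expected_offspring(fitnesses):
    """Expected number of selections per program under
    fitness-proportional selection with a constant population size."""
    total = sum(fitnesses.values())
    size = len(fitnesses)
    return {name: size * f / total for name, f in fitnesses.items()}

# Assumed setup: classes A, B and C, each with two training cases.
cases_per_class = 2
scalar_fitness = {
    "P1": 2 * cases_per_class,  # classifies classes A and B correctly
    "P2": 1 * cases_per_class,  # classifies only class C correctly
}

shares = expected_offspring(scalar_fitness)
print(shares["P1"] / shares["P2"])  # ratio of expected selections
```

The ratio is 2 regardless of the number of cases per class, so the
rules that recognize class C are at a standing disadvantage even
though no other program in this population can recognize that class.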

Our solution was to modify the critic so that a vector of performance
measures was computed for each structure, with one slot in the
fitness vector for each class represented in the training set. We
then modified the GA so that complementary knowledge structures would
share knowledge (through the genetic operators) rather than compete
directly.

An interesting question emerges at this point which was never an
issue with scalar critics. Where should any punishment for incorrect
behavior be applied? Consider a knowledge structure which incorrectly
classifies a class A case as class B. By applying the penalty to the
A slot of the reward vector, we are punishing the failure to do the
right thing. By applying it to the B slot, we are punishing doing the
wrong thing. The former strategy was adopted for all subsequent
experiments, arguing that class X training cases should contribute,
positively or negatively, only to the X slot of the reward vector.
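
A vector critic following the adopted strategy might look like the
sketch below. The function name, the unit reward and penalty, and the
convention that a program returns None when it produces no
classification are assumptions for illustration; the paper does not
specify these details.

```python
def vector_critic(classify, training_cases, classes,
                  reward=1.0, penalty=1.0):
    """Return one score per class. Each training case affects only
    the slot of its own (true) class."""
    scores = {c: 0.0 for c in classes}
    for pattern, true_class in training_cases:
        predicted = classify(pattern)
        if predicted == true_class:
            scores[true_class] += reward
        elif predicted is not None:
            # Misclassifying a class A case as B penalizes the A slot
            # (failure to do the right thing), not the B slot.
            scores[true_class] -= penalty
        # No classification (None) leaves the slot unchanged.
    return scores

# A program that answers "B" for every pattern.
cases = [("000000000000", "A"), ("111111111111", "B")]
print(vector_critic(lambda pattern: "B", cases, ["A", "B"]))
```

Here the program is rewarded in the B slot for the class B case and
penalized in the A slot for the class A case it mislabeled.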

The modification to the basic GA is as follows: instead of selecting
structures for reproduction on the basis of a single fitness measure,
a portion of each new population is selected on the basis of each
slot in the fitness vector. Note that if a structure scores well on
several measures, then it will tend to be chosen in several selection
phases. However, structures which perform well on even one measure
will be given the opportunity to pass along their specialized
knowledge. After the selection phases, structures are combined via
genetic operators just as in LS-1. As a result, our modified GA
performs multi-objective optimization in the space of knowledge
structures [10].
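
The per-slot selection phases can be sketched as follows. This is an
illustration under stated assumptions, not the authors' procedure:
the equal share per slot, the shift that makes weights non-negative,
the final shuffle, and the name vector_selection are all choices made
for the example.

```python
import random

def vector_selection(population, fitness_vectors, rng):
    """Fill the parent pool in phases, one phase per fitness slot."""
    n_slots = len(fitness_vectors[0])
    base, extra = divmod(len(population), n_slots)
    selected = []
    for slot in range(n_slots):
        column = [vector[slot] for vector in fitness_vectors]
        low = min(column)
        # Shift so that penalized (negative) scores give valid weights.
        weights = [score - low + 1e-9 for score in column]
        share = base + (1 if slot < extra else 0)
        selected.extend(rng.choices(population, weights=weights, k=share))
    # Mix the phases so crossover can pair parents chosen on
    # different slots.
    rng.shuffle(selected)
    return selected
```

A program that excels in a single slot is favored in the phase for
that slot no matter how it scores elsewhere, which is what lets
specialized knowledge survive long enough to be recombined.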

5. The Critic
One important property of the critic in a GA-based system is that the
fitness reported by the critic must reflect more than just success or
failure on the task. Otherwise, the GA is unable to identify
promising programs in the early stages when successes are rare, and
cannot discriminate the better programs in the later stages when they
are plentiful. One source of information of this type is the amount
of uncertainty exhibited by a PS program while attempting the task,
where uncertainty is defined as the extent to which conflict
resolution is required in the decision process. A good critic should
not discourage this uncertainty in the early stages but should
discourage it in the later stages. Some experiments with different
critics of this sort revealed something of the power of the GA to
exploit subtle features in the critic. For example, a critic which
applied a desirable uncertainty correction, but only rewarded
success, was found to yield populations rich in programs we called
specialists. A specialist was a program which achieved a maximum
score in one slot of the fitness vector, but zero in all others. It
appeared to "know" only one aspect of the task. Unfortunately, these
specialists were actually PS programs which were wildly guessing in
the sense that they contained an overly general rule which suggested
the same classification for every training case. A modified critic,
designed to punish this indiscriminate guessing, was found to be too
strict. By making risk-taking behavior too dangerous, it led the GA
to evolve programs which produced no classification. This strategy at
least scores zero, which is better than being excessively punished.
Finally, a judicious balance of reward and punishment was achieved by
incorporating in the critic a scoring scheme inspired by the
Scholastic Aptitude Test (SAT scoring). This scheme led the GA to
consistent success on problems involving 2, 3, 4 and 5 classes.
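
The paper does not give the exact scoring scheme. One plausible
reading, sketched below purely as an assumption, is classic formula
scoring of the kind once used on multiple-choice tests: a correct
answer earns one point, a wrong answer costs 1/(k-1) points when
there are k classes, and no answer scores zero. The function name and
the convention that None means no classification are hypothetical.

```python
def formula_score(predicted, true_class, n_classes):
    """Formula scoring: +1 if right, -1/(k-1) if wrong, 0 if silent."""
    if predicted is None:
        return 0.0
    if predicted == true_class:
        return 1.0
    return -1.0 / (n_classes - 1)

# A program that answers "A" for every case, scored on a training
# set with one case per class.
classes = ["A", "B", "C", "D", "E"]
total = sum(formula_score("A", c, len(classes)) for c in classes)
print(total)
```

Under this scheme indiscriminate guessing earns the same total as
producing no classification at all, so neither degenerate strategy is
favored, while a program that is right more often than chance scores
above both.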

6. Results
The results of the experiments with 2, 3, 4 and 5-class
discrimination learning are summarized in Table 1. The figures
reported for number-of-evaluations-to-solution are averages from
several trials.

By successfully learning to solve the 5-class problem, the value of
the vector-valued feedback to the learning component was
demonstrated. It may be of interest that the only attempt to solve
this problem by hand required a non-trivial effort and produced a
solution program containing 16 rules. Although all solutions evolved
by the GA used the full allotment of 11 rules, in all cases many of
these rules were not used in the solution of the learning task. (They
never fired.) They represented a kind of unexpressed genetic
material. The average number of functional rules in the solutions to
the 5-class task was 7.5.

TABLE 1. RESULTS OF EXPERIMENTS IN MULTICLASS
PATTERN DISCRIMINATION LEARNING
WITH LS-2

Number of      Number of      Maximum     Length of
classes to     evaluations    number      knowledge
discriminate   to learn       of rules    structures
               solution                   (bits)
     2            1440            4          136
     3            5647            6          228
     4           15938            8          328
     5           44309           11          462

7. Conclusions
The use of a scalar critic limits the usefulness of GAs for
multi-objective learning. The scalar critic forces competition
between knowledge structures which contain rules for complementary
aspects of the learning task. We have shown that GAs can be extended
to perform multicriteria learning through the use of a
multidimensional feedback mechanism from the critic and a
modification of the GA selection procedure. A series of experiments
with an implemented system shows that GAs are opportunistic learning
algorithms, requiring careful design of the critic. In fact, the GA
responded in a reasonable manner to the various critics. When the
critic is too lax, only rewarding successes, the GA evolves programs
which guess wildly to maximize the possibility of success. When the
critic is too strict, the GA quickly learns that doing nothing is a
good strategy. Only judicious balancing of reward and punishment
leads to effective learning.

REFERENCES
1.  A. B. Bekey, C. Chang, J. Perry, and M. M. Hoffer, "Pattern
    recognition of multiple EMG signals applied to the description of
    human gait," Proc. IEEE Vol. 65(5) (May 1977).
2.  A. D. Bethke, Genetic algorithms as function optimizers, Ph. D.
    Thesis, Dept. Computer and Communication Sciences, Univ. of
    Michigan (1981).
3.  K. A. DeJong, "Adaptive system design: a genetic approach," IEEE
    Trans. Syst., Man, and Cyber. Vol. SMC-10(9), pp.566-574 (Sept.
    1980).
4.  J. M. Fitzpatrick, J. J. Grefenstette, and D. Van-Gucht, "Image
    registration by genetic search," Proceedings of IEEE Southeastcon
    '84, pp.460-464 (April 1984).
5.  D. Goldberg, Computer aided gas pipeline operation using genetic
    algorithms and rule learning, Ph. D. Thesis, Dept. Civil Eng.,
    Univ. of Michigan (1983).
6.  J. H. Holland, Adaptation in Natural and Artificial Systems,
    Univ. Michigan Press, Ann Arbor (1975).
7.  M. L. Mauldin, "Maintaining diversity in genetic search," Proc.
    National Conf. on AI, AAAI 84, pp.247-250 (Aug. 1984).
8.  T. M. Mitchell, "Generalization as Search," Artificial
    Intelligence Vol. 18 (1982).
9.  J. D. Schaffer, Some experiments in machine learning using vector
    evaluated genetic algorithms, Ph. D. Thesis, Dept. of Electrical
    Engineering, Vanderbilt University (Dec. 1984).
10. J. D. Schaffer, "Multiple objective optimization with
    vector-evaluated genetic algorithms," submitted to Int'l Conf. on
    Genetic Algorithms and their Applications, Carnegie-Mellon Univ.,
    Pittsburgh (July 1985).
11. S. F. Smith, A learning system based on genetic adaptive
    algorithms, Ph. D. Thesis, Dept. Computer Science, Univ. of
    Pittsburgh (1980).
12. S. F. Smith, "Flexible learning of problem solving heuristics
    through adaptive search," Proc. of 8th IJCAI (Aug. 1983).