OPTIMIZING HIDDEN MARKOV MODELS USING GENETIC ALGORITHMS AND ARTIFICIAL IMMUNE SYSTEMS



Mohamed Korayem, Amr Badr, Ibrahim Farag

Department of Computer Science

Faculty of Computers and Information

Cairo University



ABSTRACT


Hidden Markov Models are widely used in speech recognition and bioinformatics systems. Conventional methods are usually used in the parameter estimation process of Hidden Markov Models (HMM). These methods are based on iterative procedures, like the Baum-Welch method or gradient based methods. However, these methods can converge to locally optimal parameter values. In this work, we use artificial techniques such as Artificial Immune Systems (AIS) and Genetic Algorithms (GA) to estimate HMM parameters. These techniques are global search optimization techniques inspired by biological systems. Also, a hybrid of genetic algorithms and artificial immune systems is used to optimize HMM parameters.


Keywords: Artificial Immune Systems; Genetic Algorithm; Clonal Selection Algorithm; Hybrid Genetic Immune System; Hidden Markov Models (HMM); Baum-Welch (BW)


1. INTRODUCTION


Hidden Markov Models (HMM) have many applications in signal processing, pattern recognition, and speech recognition (Rabiner, 1993). The HMM is considered a basic component of speech recognition systems. The estimation of good model parameters affects the performance of the recognition process, so the values of these parameters need to be estimated such that the recognition error is minimized.

HMM parameters are determined during an iterative process called the "training process". One of the conventional methods applied in setting the values of the HMM model parameters is the Baum-Welch algorithm. One drawback of this method is that it converges to a local optimum.


Global search techniques can be used to optimize HMM parameters. In this paper, the performance of two global optimization techniques is compared with the Baum-Welch algorithm, one of the traditional techniques used to estimate HMM parameters. These techniques are Genetic Algorithms and the Clonal Selection Algorithm, which is inspired by the artificial immune system. Also, a hybrid genetic immune method is proposed to optimize HMM parameters and is then compared with the above methods.


The natural immune system uses a variety of evolutionary and adaptive mechanisms to protect organisms from foreign pathogens and misbehaving cells in the body (Forrest, 1997) (De Castro, 2005). Artificial immune systems (AIS) (Somayaji, 1998) (Hofmeyr, 2000) seek to capture some aspects of the natural immune system in a computational framework, either for the purpose of modeling the natural immune system or for solving engineering problems (Glickman, 2005). The clonal selection algorithm, an aspect of the immune system, is used here to optimize HMM parameters. The clonal selection algorithm (De Castro, 2000) (De Castro, 2002) is a special kind of artificial immune system algorithm that uses the clonal expansion principle and affinity maturation as the main forces of the evolutionary process (De Castro, 2002b). The Genetic Algorithm is another global optimization technique used to optimize HMM parameters. The main forces of the evolutionary process for the GA, namely the crossover and mutation operators, can be merged with the clonal selection principle to optimize HMM parameters; hence, a hybrid genetic immune technique is proposed.


2. HIDDEN MARKOV MODELS (HMM)


HMMs are probabilistic models useful for modeling stochastic sequences with an underlying finite state structure. Stochastic sequences in speech recognition are called observation sequences O = o_1 o_2 ... o_T, where T is the length of the sequence. An HMM with n states (S_1, S_2, ..., S_n) can be characterized by a set of parameters \lambda = (A, B, \pi), where \pi is the initial distribution probability that describes the probability distribution over the states at the initial moment, with \sum_i \pi_i = 1 and \pi_i >= 0.

A is the transition probability matrix {a_ij | i, j = 1, 2, ..., n}, where a_ij is the probability of a transition from state i to state j, with \sum_j a_ij = 1 and a_ij >= 0.

B is the observation matrix {b_ik | i = 1, 2, ..., n; k = 1, 2, ..., m}, where n is the number of states and m is the number of observation symbols, with \sum_k b_ik = 1 and b_ik >= 0; b_ik is the probability that the observation symbol with index k is emitted by the current state i.
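As a minimal illustration (a sketch assuming numpy; the sizes and values below are toy choices, not the paper's configuration), the parameter set \lambda = (A, B, \pi) can be held as three row-stochastic arrays:

    import numpy as np

    n, m = 3, 4                         # toy sizes: 3 hidden states, 4 observation symbols
    pi = np.array([0.6, 0.3, 0.1])      # initial state distribution
    A = np.array([[0.7, 0.2, 0.1],      # A[i, j] = P(next state j | current state i)
                  [0.0, 0.6, 0.4],
                  [0.0, 0.0, 1.0]])
    B = np.full((n, m), 1.0 / m)        # B[i, k] = P(symbol k | state i), uniform here

    # pi and every row of A and B must be non-negative and sum to one.
    assert np.isclose(pi.sum(), 1.0)
    assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)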


The main problems of HMM are the evaluation, decoding, and learning problems.


Evaluation problem
Given the HMM \lambda and the observation sequence O = o_1 o_2 ... o_T, the probability that model \lambda has generated sequence O is calculated. Often this problem is solved by the Forward-Backward algorithm (Rabiner, 1989) (Rabiner, 1993).
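A minimal sketch of the forward recursion (assuming numpy and the discrete-output parameterization above; the scaling or log arithmetic that a practical implementation needs for long sequences is omitted):

    import numpy as np

    def forward_probability(pi, A, B, obs):
        """P(O | lambda) via the forward recursion for a discrete-output HMM."""
        alpha = pi * B[:, obs[0]]              # alpha_1(i) = pi_i * b_i(o_1)
        for o in obs[1:]:                      # alpha_{t+1}(j) = (sum_i alpha_t(i) a_ij) b_j(o)
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()                     # P(O | lambda) = sum_i alpha_T(i)

For instance, forward_probability(pi, A, B, [0, 2, 1]) returns the likelihood of that three-symbol sequence under the model.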

Decoding problem
Given the HMM \lambda and the observation sequence O = o_1 o_2 ... o_T, calculate the most likely sequence of hidden states that produced this observation sequence O. Usually this problem is handled by the Viterbi algorithm (Rabiner, 1989) (Rabiner, 1993).
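A corresponding sketch of the Viterbi recursion (assuming numpy; it works in log space, so zero-probability transitions become -inf and are never chosen):

    import numpy as np

    def viterbi(pi, A, B, obs):
        """Most likely hidden-state sequence for a discrete-output HMM."""
        with np.errstate(divide="ignore"):           # log(0) -> -inf is intended here
            log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
        delta = log_pi + log_B[:, obs[0]]
        back = []
        for o in obs[1:]:
            trans = delta[:, None] + log_A           # trans[i, j]: best score ending i -> j
            back.append(trans.argmax(axis=0))
            delta = trans.max(axis=0) + log_B[:, o]
        path = [int(delta.argmax())]
        for ptr in reversed(back):                   # follow back-pointers from the end
            path.append(int(ptr[path[-1]]))
        return path[::-1]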

Learning problem
Given some training observation sequences O = o_1 o_2 ... o_T and the general structure of the HMM (numbers of hidden and visible states), determine the HMM parameters that best fit the training data. The most common solution for this problem is the Baum-Welch algorithm (Rabiner, 1989) (Rabiner, 1993), which is considered the traditional method for training HMMs.
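For reference, a compact single-sequence Baum-Welch re-estimation step is sketched below (assuming numpy and integer-coded observations; numerical scaling, multiple training sequences, and convergence checks, which the full algorithm requires, are left out):

    import numpy as np

    def baum_welch_step(pi, A, B, obs):
        """One EM re-estimation of (pi, A, B) from a single observation sequence."""
        obs = np.asarray(obs)
        T, n = len(obs), len(pi)
        alpha = np.zeros((T, n))
        beta = np.zeros((T, n))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):                             # forward pass
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):                    # backward pass
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        gamma = alpha * beta                              # state posteriors
        gamma /= gamma.sum(axis=1, keepdims=True)
        # xi[t, i, j] ~ alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        xi = alpha[:-1, :, None] * A[None, :, :] * (B[:, obs[1:]].T * beta[1:])[:, None, :]
        xi /= xi.sum(axis=(1, 2), keepdims=True)
        new_pi = gamma[0]
        new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        new_B = np.array([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])]).T
        new_B /= gamma.sum(axis=0)[:, None]
        return new_pi, new_A, new_B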

In this work, the third problem is solved by using three global optimization techniques. The results from these techniques are compared with those of the traditional technique.


Figure 1: Six-state left-right HMM model

In this paper, a six-state left-right HMM model is used, as shown in Figure 1. The HMM parameters are estimated using the three global optimization techniques mentioned above and the traditional method. The speech vectors are vector quantized into a codebook with a size of 32. The transition matrix A is a 6 x 6 matrix and the observation matrix B is of size 6 x 32. According to this configuration (LR HMM model, see Figure 1), some transitions of the matrix are constantly zero.

3. OPTIMIZING HMM TRAINING


3.1 Genetic Algorithm (GA)

The genetic algorithm is a robust general purpose optimization technique which evolves a population of solutions (Goldberg, 1989).


The GA is a search technique that has a representation of the problem states and also a set of operations to move through the search space. The states in the GA are represented using a set of chromosomes. Each chromosome represents a candidate solution to the problem. The set of candidate solutions forms a population. In essence, the GA produces successive generations of this population, hoping to reach a good solution to the problem. Members (candidate solutions) of the population are improved across generations through a set of operations that the GA uses during the search process. The GA has three basic operations to expand a candidate solution into other candidate solutions. These basic operations are:



- Selection: In this operation, an objective function (called the fitness function) is used to assess the quality of each solution. The fittest solutions from each generation are kept.

- Crossover: This operation generates new solutions given a set of selected members of the current population; this set is the outcome of the selection operation. Crossover exchanges genetic material between two single-chromosome parents.

- Mutation: Biological organisms are often subject to sudden, unexpected changes in their chromosomes. Such a sudden change is simulated in the GA by the mutation operation. This operation is a clever way to escape the local optima traps into which state-space search algorithms may fall. In this operation, some values of a chromosome are changed by adding random values to the current values. This action changes the member values and hence produces a different solution.







Genetic Algorithm pseudo code (a minimal sketch of this loop follows the list):

1. Generate an initial random population of chromosomes.

2. Compute the fitness of each chromosome in the current population.

3. Make an intermediate population from the current population using the reproduction operator.

4. Using the intermediate population, generate a new population by applying the crossover and mutation operators.

5. If a member of the population satisfies the requirements, stop; otherwise go to step 2.
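The sketch below (assuming numpy; fitness, crossover, and mutate stand for the operators described in this section, and the elitist replacement scheme is an illustrative choice, not the paper's stated one) shows the loop:

    import numpy as np

    def genetic_algorithm(fitness, crossover, mutate, pop_size, chrom_len,
                          n_generations, low=0.0, high=1.0):
        """Generic GA loop following steps 1-5 above."""
        pop = np.random.uniform(low, high, (pop_size, chrom_len))      # step 1
        for _ in range(n_generations):
            scores = np.array([fitness(c) for c in pop])               # step 2
            parents = pop[scores.argsort()[::-1][:pop_size // 2]]      # step 3: keep the fittest
            children = []
            while len(children) < pop_size - len(parents):             # step 4
                i, j = np.random.choice(len(parents), 2, replace=False)
                children.extend(mutate(c) for c in crossover(parents[i], parents[j]))
            pop = np.vstack([parents, children[:pop_size - len(parents)]])
        scores = np.array([fitness(c) for c in pop])
        return pop[scores.argmax()]                                    # best solution found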


In this work, the GA is applied to estimate the HMM model parameters. This parameter estimation problem is represented as shown in Figure 2, as follows: each member (chromosome) of the generation represents the A matrix and the B matrix jointly. Each row of the A matrix is encoded into an array, and all the arrays are concatenated to constitute one array, where the first row is followed by the second row, then the third row, and so on. Then, the B matrix is encoded row by row in the same way.



Figure 2: Representation of a chromosome
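A sketch of this encoding (assuming numpy; encode and decode are illustrative names, and the structural zeros of the left-right A matrix simply ride along inside the chromosome):

    import numpy as np

    def encode(A, B):
        """Concatenate A row by row, then B row by row, into one chromosome (Figure 2)."""
        return np.concatenate([A.ravel(), B.ravel()])

    def decode(chrom, n_states=6, n_symbols=32):
        """Split a chromosome back into the 6 x 6 A matrix and the 6 x 32 B matrix."""
        cut = n_states * n_states
        A = chrom[:cut].reshape(n_states, n_states)
        B = chrom[cut:].reshape(n_states, n_symbols)
        return A, B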


The members of a given generation are selected for the reproduction phase. The fitness function used depends on the average of the log likelihoods over all utterances of a word, as described in section 4. Arithmetic crossover is applied to the population. Arithmetic crossover generates two children from two parents: the values of one child are set to the averages of the values of the parents, and the values of the other child are set using the equation (3*p_1 - p_2)/2, where p_1 is the first parent and p_2 is the other parent. When applying arithmetic crossover, the resulting values must lie within the range of allowed values for each parameter.

For the mutation operation, the following method is applied: since the chromosomes consist of real values, the creeping operator is used as the mutation operator; it adds randomly generated values (Gaussian values) to the original values. The resulting values must be within the defined limits.
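A sketch of both operators (assuming numpy; the limits and the noise scale sigma are illustrative, and a row renormalization, which the paper does not describe, would be needed to keep A and B strictly stochastic after clipping):

    import numpy as np

    def arithmetic_crossover(p1, p2, low=0.0, high=1.0):
        """Two children: the parents' average and (3*p1 - p2)/2, clipped to the limits."""
        child1 = (p1 + p2) / 2.0
        child2 = (3.0 * p1 - p2) / 2.0
        return np.clip(child1, low, high), np.clip(child2, low, high)

    def creeping_mutation(chrom, sigma=0.05, low=0.0, high=1.0):
        """Add Gaussian noise to every gene, keeping the values within the limits."""
        return np.clip(chrom + np.random.normal(0.0, sigma, chrom.shape), low, high)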


3.2 Clonal Selection Algorithm

Artificial immune systems (AIS) are adaptive systems, inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving (De Castro, 2002c). Clonal selection algorithms are a special kind of immune algorithm that uses clonal expansion and affinity maturation as the main forces of the evolutionary process.


The clonal selection algorithm is described as follows (a sketch of the loop follows the list):

1. Generate initial antibodies (each antibody represents a solution; in our case, the parameters of the HMM, i.e. the A and B matrices).

2. Compute the fitness of each antibody. The fitness function used computes the average log probability over the training data.

3. Select antibodies from the population which will be used to generate new antibodies (the selection can be random or according to the fitness rank). The antibodies with the highest fitness are selected such that they are sufficiently different, as described later.

4. For each antibody, generate clones and mutate each clone according to fitness.

5. Delete the antibodies with the lowest fitness from the population, then add the new antibodies to the population.

6. Repeat steps 2-5 until the stop criterion is met. The number of iterations can be used as the stop criterion.
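A compact sketch of this loop (assuming numpy; the selection and clone counts are illustrative, the pooled worst-replacement approximates step 5, and the fitness-dependent mutation step uses equation (3) below with fitness normalized to [0, 1]):

    import numpy as np

    def clonal_selection(fitness, pop_size, ab_len, n_generations,
                         n_select=10, n_clones=5, rho=2.0, low=0.0, high=1.0):
        """Clonal selection loop following steps 1-6 above."""
        pop = np.random.uniform(low, high, (pop_size, ab_len))            # step 1
        for _ in range(n_generations):
            scores = np.array([fitness(ab) for ab in pop])                # step 2
            best_idx = scores.argsort()[::-1][:n_select]                  # step 3
            f = scores[best_idx]
            f_norm = (f - f.min()) / (f.max() - f.min() + 1e-12)          # fitness in [0, 1]
            clones = []
            for ab, fn in zip(pop[best_idx], f_norm):                     # step 4
                alpha = (1.0 / rho) * np.exp(-fn)                         # eq. (3): fitter -> smaller step
                clones += [np.clip(ab + alpha * np.random.normal(size=ab_len), low, high)
                           for _ in range(n_clones)]
            fresh = np.random.uniform(low, high, (pop_size // 10, ab_len))  # 10% new antibodies
            pool = np.vstack([pop, clones, fresh])                        # step 5
            pool_scores = np.array([fitness(ab) for ab in pool])
            pop = pool[pool_scores.argsort()[::-1][:pop_size]]            # drop the least fit
        return pop[0]                                                     # fittest antibody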


Antibodies represent the parameters of the HMM. Each antibody represents a candidate solution: each member (antibody) of the generation represents the A matrix and the B matrix jointly, like a chromosome in the GA (see Figure 2).

The fitness value for each antibody is computed as follows:

F = (1/n) \sum_{i=1}^{n} L_i ,    (1)

where L_i is the log likelihood of utterance i and n is the number of utterances for the word.
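A sketch of this fitness computation (assuming numpy; decode is the chromosome-to-matrices helper sketched in section 3.1, and forward_log_prob stands for a log-domain version of the forward recursion from section 2, both illustrative names):

    import numpy as np

    def antibody_fitness(antibody, utterances, decode, forward_log_prob):
        """Eq. (1): F = (1/n) * sum_i L_i over a word's training utterances."""
        A, B = decode(antibody)                 # antibody -> HMM matrices
        pi = np.eye(A.shape[0])[0]              # left-right models start in the first state
        return np.mean([forward_log_prob(pi, A, B, obs) for obs in utterances])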

Selection in the clonal selection algorithm depends on the fitness values of the antibodies; the antibodies with the highest fitness are selected such that they are sufficiently different: the Euclidean distance between any two selected antibodies must be greater than a threshold. The Euclidean distance is used to measure the difference between two antibodies u and v:

d(u, v) = \sqrt{\sum_k (u_k - v_k)^2}    (2)
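A sketch of this diversity-constrained selection (assuming numpy; greedily scanning from the fittest antibody downwards is an illustrative way to enforce the threshold):

    import numpy as np

    def select_diverse(pop, scores, n_select, threshold):
        """Fittest antibodies whose pairwise Euclidean distance exceeds the threshold."""
        chosen = []
        for idx in scores.argsort()[::-1]:                     # best first
            if all(np.linalg.norm(pop[idx] - pop[j]) > threshold for j in chosen):
                chosen.append(idx)
            if len(chosen) == n_select:
                break
        return pop[chosen]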

For all antibodies, a fixed number of clones is generated. In each cycle of the algorithm, some new antibodies are added to the population; the percentage of these new antibodies is equal to 10% of the population size. For mutation, a value is added to each element of the antibody; this value is generated as (\alpha * Gaussian value), where \alpha is computed according to the following equation:

\alpha = (1/\rho) \exp(-F*) ,    (3)

where F* is the normalized fitness value of the antibody and \rho is a decaying factor.


3.3 Hybrid Genetic-Immune System Method

The proposed hybrid method depends on genetic algorithms and the immune system. The main forces of the evolutionary process for the GA are the crossover and mutation operators. For the clonal selection algorithm, the main force of the evolutionary process is the idea of clonal selection, in which new clones are generated; these new clones are then mutated, the best of them are added to the population, and newly generated members are added to the population as well. The hybrid method takes the main forces of the evolutionary process from both systems.


The hybrid method is described as follows (a sketch of one generation follows the list):

1. Generate the initial population (candidate solutions).

2. Select the N best items from the population.

3. For each selected item, generate a number of clones (Nc) and mutate each item of the Nc.

4. Select the best mutated item from each group (Nc) and add it to the population.

5. Select from the population the items on which crossover will be applied. We select them randomly in our system, but any selection method can be used.

6. After selection, perform crossover and add the new items (the items after crossover) to the population by replacing the low-fitness items with the new ones.

7. Add to the population a group of newly generated random items.

8. Repeat steps 2-7 until the stopping criterion is met.

Steps 2-5 are repeated a number of times before adding a new group of randomly generated items.
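One generation of the hybrid method might look like the sketch below (assuming numpy; fitness, crossover, and mutate are the operators from sections 3.1 and 3.2, and replacing the current worst member is an illustrative way to "add" items to a fixed-size population):

    import numpy as np

    def hybrid_generation(pop, fitness, crossover, mutate, n_best, n_clones, n_fresh,
                          low=0.0, high=1.0):
        """Steps 2-7 of the hybrid method for one generation."""
        def replace_worst(item):
            scores = np.array([fitness(x) for x in pop])
            pop[scores.argmin()] = item                            # drop the least fit member

        scores = np.array([fitness(x) for x in pop])
        best = pop[scores.argsort()[::-1][:n_best]].copy()         # step 2
        for ab in best:                                            # steps 3-4
            clones = [mutate(ab.copy()) for _ in range(n_clones)]
            replace_worst(clones[np.argmax([fitness(c) for c in clones])])
        i, j = np.random.choice(len(pop), 2, replace=False)        # step 5: random choice
        for child in crossover(pop[i], pop[j]):                    # step 6
            replace_worst(child)
        for item in np.random.uniform(low, high, (n_fresh, pop.shape[1])):
            replace_worst(item)                                    # step 7
        return pop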


4. EXPERIMENTS


Dataset description
The data used was recorded for a speech recognition task. Thirty samples were collected for each of 9 words; these words represent the digits from 1 to 9 spoken in the Arabic language. As a standard procedure in evaluating machine learning techniques, the dataset is split into a training set and a test set. The training set is composed of 15 x 9 utterances, and the same size is used for the test set. HMM models are trained using the above three methods. Then, the performance of each model is tested on the test dataset. Models are compared according to the average log likelihood over all utterances for each word. Moreover, an HMM model is trained using the traditional method (the Baum-Welch algorithm). The results are reported in Table 1.

The objective of these experiments is to determine which of the four methods yields the better model in terms of the maximum likelihood estimation (MLE) of the training and testing data.


5. RESULTS



Table 1: Average log likelihood for Genetic Algorithms, Clonal Selection, and the Hybrid Method vs. Baum-Welch

Experiment | Genetic Algorithms        | Clonal Selection          | Hybrid Genetic Immune     | Baum-Welch
           | Training     Testing      | Training     Testing      | Training     Testing      | Training     Testing
Word 1     | -87.678826   -120.456613  | -87.571879   -122.846582  | -79.952704   -112.972627  | -100.256415  -132.043827
Word 2     | -112.853853  -129.668485  | -109.478160  -124.280939  | -99.285726   -116.882284  | -126.857669  -144.217638
Word 3     | -120.233247  -136.341848  | -117.629193  -135.939801  | -110.821281  -127.068196  | -127.729650  -144.002114
Word 4     | -99.422550   -106.937365  | -97.869824   -105.532062  | -90.419239   -98.660715   | -100.071147  -109.651458
Word 5     | -114.924569  -135.706757  | -107.990526  -127.026957  | -100.729004  -120.318206  | -119.703586  -139.742821
Word 6     | -111.930453  -128.168076  | -105.539754  -123.945254  | -97.301037   -114.680868  | -118.338618  -135.820565
Word 7     | -103.412966  -113.111271  | -100.211900  -109.174607  | -94.428044   -106.360118  | -112.493737  -120.044928
Word 8     | -91.442692   -121.272038  | -91.163196   -122.357082  | -80.717414   -111.834818  | -92.313749   -123.730211
Word 9     | -87.677962   -112.124334  | -82.390500   -105.931426  | -76.158748   -98.281587   | -95.080516   -117.038164


Table 1 shows the average log likelihood for each word resulting from applying the GA, clonal selection, hybrid genetic immune, and Baum-Welch algorithms. Figures 3 and 4 present a comparison of the four techniques.




GA, Clonal Selection, Hybrid Method vs. Baum-Welch



It is clear from Table 1 that the GA, the Clonal Selection Algorithm, and the Hybrid Method optimize HMM parameters better than the Baum-Welch algorithm for all words: they maximize the likelihood better over the training data (see Figure 3) and the testing data (see Figure 4). We note that for experiment eight, Baum-Welch, the GA, and the Clonal Selection Algorithm yield almost the same results, but the Hybrid Method gives better results.



GA vs. Clonal Selection Algorithm
The immune clonal selection gives better results than the GA for all words. We also note that for experiments one, four, and eight the two algorithms yield almost the same results.


Figures 5a and 5b show that the fitness function in clonal selection is better than that of the genetic algorithm, which leads to better results when optimizing HMM parameters. The figures also show that the clonal selection fitness function increases faster than that of the genetic algorithm, especially in the early iterations.



Hybrid Method vs. GA and Clonal Selection Algorithm
We note from the results above that the Hybrid Method gives better results than the GA and the Clonal Selection Algorithm for all experiments, over both the training data and the testing data. It is also clear from Figures 5a, 5b, and 5c that the fitness function of the Hybrid Method is better at every point in the graph than those of the Genetic Algorithm and the Clonal Selection Algorithm.


Figure 3: Log likelihood for training data




Figure 4: Log likelihood for testing data




Figure 5 (a, b, c): Fitness functions of the clonal selection algorithm, the genetic algorithm, and the hybrid method for word 9


6. CONCLUSION

In this paper, we presented the results of using the Genetic Algorithm and the clonal selection algorithm to optimize HMM parameters. We also proposed a hybrid immune genetic algorithm for optimizing HMM parameters. It takes into account the main immune aspects: selection and cloning of the most stimulated cells, death of non-stimulated cells, affinity maturation and reselection of the clones with higher affinity, and generation and maintenance of diversity. It also takes into account the main forces of the evolutionary process for the GA, namely the crossover and mutation operators. The results show that the global optimization techniques used produce better results than the traditional Baum-Welch algorithm. Moreover, the proposed hybrid algorithm produced the best results over all tested techniques. The global search algorithms generated better results because they do not get trapped in local optima the way the Baum-Welch algorithm does.



REFERENCES

De Castro, L. N. & Von Zuben, F. J. (2000) "The Clonal Selection Algorithm with Engineering Applications", GECCO'00 Workshop Proceedings, pp. 36-37.

De Castro, L. N. and Timmis, J. (2002) "Artificial Immune Systems: A New Computational Intelligence Approach", Springer-Verlag New York, Inc.

De Castro, L. N. & Von Zuben, F. J. (2002b) "Learning and Optimization Using the Clonal Selection Principle", IEEE Transactions on Evolutionary Computation, Special Issue on Artificial Immune Systems, 6(3), pp. 239-251.

De Castro, L. N. and Timmis, J. (2002c) "Artificial Immune Systems: A Novel Paradigm to Pattern Recognition", in Artificial Neural Networks in Pattern Recognition, J. M. Corchado, L. Alonso, and C. Fyfe (eds.), SOCO-2002, University of Paisley, UK, pp. 67-84.

De Castro, L. N. and Von Zuben, F. J. (2005) Recent Developments in Biologically Inspired Computing, Idea Group Inc. (IGI) Publishing.

Forrest, S., Hofmeyr, S., and Somayaji, A. (1997) "Computer Immunology", Communications of the ACM, Vol. 40, No. 10, pp. 88-96.

Goldberg, D. E. (1989) Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley.

Glickman, M., Balthrop, J., and Forrest, S. (2005) "A Machine Learning Evaluation of an Artificial Immune System", Evolutionary Computation Journal, Vol. 13, No. 2, pp. 179-212.

Hofmeyr, S. and Forrest, S. (2000) "Architecture for an Artificial Immune System", Evolutionary Computation 7(1), Morgan-Kaufmann, San Francisco, CA, pp. 1289-1296.

Rabiner, L. R. (1989) "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, Vol. 77, No. 2, pp. 257-286.

Rabiner, L. and Juang, B. (1993) Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, NJ.

Somayaji, A., Hofmeyr, S., and Forrest, S. (1998) "Principles of a Computer Immune System", New Security Paradigms Workshop, pp. 75-82, ACM.