Neural Networks: Week 3

Artificial Intelligence and Robotics


Reading Assignment

Chapter 1 in text
Chapter 2, Sections 2.1, 2.2

Other References:

"Emergent Neural Computational Architectures Based on Neuroscience", Wermter, Austin and Willshaw (Eds.)
"Computational Explorations in Cognitive Neuroscience", O'Reilly and Munakata

Neural Networks

Classification


Classification is a common application for neural networks; other applications include:

  Prediction ( e.g., the stock market )
  Control ( chemical plants, airplanes )

Before a classifier can be designed, several options must be decided:

  Neural network architecture
  Features
  Training method


Apples, Oranges and Pears

We have a digital image of a collection of fruit.

Objective: classify each piece of fruit.

Which features could we use?

Features

  Color
  Shape: roundness

The feature measurements form the feature vector, f(i) = { c(i), s(i) }

The objects form the class vector, v(i) = { n(i) }, where n(i) = apple, orange or pear

Procedure

To simplify the example, consider only apples and pears.

From a collection of apples and pears, measure the color and roundness of each piece of fruit. The measurements will represent 'typical' fruits.

Plot the data on a two-dimensional plane ( one dimension for each measurement ), as sketched below.
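As a minimal sketch of this procedure (assuming NumPy and Matplotlib; every measurement value below is invented for illustration):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical feature measurements f(i) = { c(i), s(i) } per fruit:
# column 0 = roundness, column 1 = color (both made up).
apples = np.array([[0.90, 0.20], [0.85, 0.45], [0.95, 0.30]])
pears  = np.array([[0.35, 0.70], [0.30, 0.80], [0.45, 0.65]])

plt.scatter(apples[:, 0], apples[:, 1], marker='o', label='apples (A)')
plt.scatter(pears[:, 0], pears[:, 1], marker='^', label='pears (P)')
plt.xlabel('Roundness')
plt.ylabel('Color')
plt.legend()
plt.show()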


Feature Space and Decision Boundary

[Figure: two-dimensional feature space with Roundness (NOT ROUND to ROUND) on one axis and Color (GREEN, YELLOW, RED) on the other; the apples (A) and pears (P) form two groups separated by a decision boundary.]

Two-Dimensional Feature Space and Decision Boundary

[Figure: points from CLASS 1 and CLASS 2 plotted against Feature 1 and Feature 2, separated by a linear decision boundary. An unknown object X falls on the CLASS 2 side.]

Measure the features of the unknown object. Unknown object X is classified as a member of Class 2.

Three-Dimensional Feature Space

Feature vector F = { f1, f2, f3 }

[Figure: two classes of points, X and Y, in a three-dimensional feature space. The decision boundary is a plane.]

More Features, Higher Dimensional Feature Space

Consider an Optical Character Recognition system:

  Each character is divided into a 16x20 matrix of pixel values.
  The matrix is transformed into a vector ( the gray-level values will actually be the features ).
  The dimension of the feature space is 16 x 20 = 320.
  Each character will be represented by a point in a 320-dimensional space ( recall: the apple was a point in a two-dimensional space ).

Optical Character Recognition System

[Figure: a character image on a 16x20 pixel grid.]

This is a point in a 320-dimensional space.
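A minimal sketch of the flattening step (assuming NumPy; the image values are randomly generated stand-ins):

import numpy as np

# Hypothetical 16x20 gray-level character image (values 0-255).
image = np.random.randint(0, 256, size=(16, 20))

# Flatten the matrix row by row into the 320-dimensional feature vector.
feature_vector = image.reshape(-1)
print(feature_vector.shape)  # (320,)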

How Do We Find the Decision Boundary?

Apple / Pear Problem

Is the object an apple? That is, does it or doesn't it belong to the apple category?

Response:

  +1 if it does belong
  -1 if it does not belong

Linear Classifier Architecture

[Figure: input neurons X1, X2, ..., Xn feed an output neuron Y through weights w1, w2, ..., wn; a bias neuron with constant input 1 connects to Y with weight b.]

Input neurons: the number of neurons depends on the length of the input vector.
Output neuron: yes/no, -1/+1.
Bias neuron: allows classification of an n-tuple vector into one category only.

Activation Function

net is defined as the input to the Y neuron:

  net = b + Σi xi*wi

Activation function:

  f(net) = +1 if net >= 0
           -1 if net <  0
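A minimal sketch of this unit in Python (assuming NumPy; the names f and classify are our own):

import numpy as np

def f(net):
    # Threshold activation: +1 if net >= 0, otherwise -1.
    return 1 if net >= 0 else -1

def classify(x, w, b):
    # net = b + sum over i of xi*wi, then apply the threshold.
    net = b + np.dot(x, w)
    return f(net)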

Boundary Decision

Consider two inputs with a bias:

  net = b + Σi xi*wi = 0   ( zero is the boundary )

  b + x1*w1 + x2*w2 = 0

Solve for x2:

  x2 = -(w1/w2) x1 - b/w2

The values of w1, w2 and b are determined during training.

Example

Assume w1 = 1, w2 = 1, b = -1.

From x2 = -(w1/w2) x1 - b/w2:

  x2 = -x1 + 1

Therefore:

  when x1 = 0, x2 = 1
  when x1 = 1, x2 = 0

[Figure: the boundary x2 = -x1 + 1 in the ( x1, x2 ) plane; points above the line are Category +1, points below are Category -1.]

  net = -1 + x1 + x2

Feature vector: { x1, x2 }

  {1, 1}:   f(net) = +1
  {.5, 0}:  f(net) = -1
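Continuing the classify sketch from above, the two feature vectors can be checked directly:

import numpy as np

w = np.array([1.0, 1.0])
b = -1.0
print(classify(np.array([1.0, 1.0]), w, b))   # +1 ( net = 1 )
print(classify(np.array([0.5, 0.0]), w, b))   # -1 ( net = -0.5 )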

Consider Bias Term b

What happens if we eliminate the bias term b? The net equation reduces to:

  net = Σi xi*wi = 0

  x1*w1 + x2*w2 = 0

  x2 = -(w1/w2) x1



No Bias Term

  x2 = -(w1/w2) x1

  w1 = 1,  w2 = 1:  x2 = -x1
  w1 = -1, w2 = 1:  x2 = x1

[Figure: both boundary lines pass through the origin of the ( x1, x2 ) plane. Without a bias term, the boundary cannot be shifted away from the origin.]

Linear Separability

If a set of weights can be obtained from the training vectors so that the correct response of +1 lies on one side of the boundary, and the correct response of -1 lies on the other side, the problem is "linearly separable".

Recall Apple / Pear

[Figure: the same Roundness vs. Color feature space, but with the apple (A) and pear (P) points intermixed.]

We cannot separate the pears from the apples with a straight ( linear ) line.

Ex-OR Problem

Feature vector: { x1, x2 }

  {1, 1}:    f(net) = -1
  {1, -1}:   f(net) = +1
  {-1, 1}:   f(net) = +1
  {-1, -1}:  f(net) = -1

[Figure: the four points in the ( x1, x2 ) plane.]

Although there are only two classes, they cannot be separated with only one linear boundary.

Ex-OR Problem ( continued )

[Figure: the same four points, now separated by a non-linear boundary. A non-linear boundary can separate the two classes.]

McCulloch-Pitts

The basis for most neurons used today:

  Activation is binary ( output either 1 or 0 ).
  Each neuron has a fixed threshold.
  Positive weights excite the neuron; negative weights inhibit it.
  It takes one 'time step' to pass a signal over one connection link.

How does this model compare to the biological neurons we have previously studied?

Biological Neurons

Neurotransmitters can either inhibit or excite a neuron.

The output is a train of pulses: can a train of pulses be modeled by a 0/1 level?

General McCulloch-Pitts Neuron

[Figure: excitatory inputs X1, ..., Xn connect to the output neuron Y with weight +w; inhibitory inputs Xn+1, ..., Xn+m connect with weight -p, where p > 0.]

  +w excites, -p inhibits

Activation Function

  f(y_in) = 1 if y_in >= θ
          = 0 if y_in <  θ

where θ is the threshold.
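A minimal sketch of this neuron (assuming NumPy; the name mp_neuron is our own):

import numpy as np

def mp_neuron(x, w, theta):
    # McCulloch-Pitts neuron: fire (1) if the weighted input sum
    # reaches the threshold theta, otherwise stay off (0).
    return 1 if np.dot(x, w) >= theta else 0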



AND

Uses analysis instead of learning to determine the weights.

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1

[Figure: X1 and X2 each connect to Y with weight 1; threshold θ = 2.]

  y_in = x1*1 + x2*1 = x1 + x2

  y = f(y_in) = 1 if y_in >= θ, i.e. y_in >= 2



OR

Uses analysis instead of learning to determine the weights.

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 1

[Figure: X1 and X2 each connect to Y with weight 2; threshold θ = 2.]

  y_in = x1*2 + x2*2

  y = f(y_in) = 1 if y_in >= θ; in this example θ = 2



AND-NOT

A non-symmetric function ( y = x1 AND NOT x2 ).

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 1
   1   1 | 0

[Figure: X1 connects to Y with weight 2, X2 with weight -1; threshold θ = 2.]

  y_in = x1*2 - x2

  y = f(y_in) = 1 if y_in >= θ; in this example θ = 2
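Continuing the mp_neuron sketch from above, the weights and thresholds of the last three slides can be checked against their truth tables:

import numpy as np

for x1 in (0, 1):
    for x2 in (0, 1):
        x = np.array([x1, x2])
        print(x1, x2,
              mp_neuron(x, np.array([1, 1]), 2),    # AND
              mp_neuron(x, np.array([2, 2]), 2),    # OR
              mp_neuron(x, np.array([2, -1]), 2))   # AND-NOT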



XOR Function

  x1 XOR x2 = ( x1 AND NOT x2 ) OR ( x2 AND NOT x1 )
            = Z1 OR Z2

[Figure: a two-layer network. X1 connects to Z1 with weight 2 and to Z2 with weight -1; X2 connects to Z2 with weight 2 and to Z1 with weight -1; Z1 and Z2 each connect to Y with weight 2. All thresholds = 2.]
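Continuing the same sketch, the two-layer XOR network from this slide:

import numpy as np

def xor(x1, x2):
    # First layer: z1 = x1 AND NOT x2, z2 = x2 AND NOT x1.
    z1 = mp_neuron(np.array([x1, x2]), np.array([2, -1]), 2)
    z2 = mp_neuron(np.array([x1, x2]), np.array([-1, 2]), 2)
    # Second layer: y = z1 OR z2. All thresholds = 2.
    return mp_neuron(np.array([z1, z2]), np.array([2, 2]), 2)

print([xor(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])  # [0, 1, 1, 0]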

Training Algorithms for Single-Layer Neural Networks

  Hebb's ( the most fundamental )
  Perceptron Learning
  Delta Rule

Hebb Net

Learning occurs by modifying the weights so that the weight between two neurons that are both 'on' is increased.

Modified Hebb learning increases the strength of the weight when the two neurons are either both on or both off. This is more powerful than the original Hebb rule.

Hebb Learning

[Figure: inputs X1 and X2 with weights w1 and w2, a bias input ( constant 1 ) with weight b, and output neuron Y.]

Bipolar data: +1 or -1.

We need training data for learning ( s : t ):

  Training vector s
  Target vector t

Hebb Learning Algorithm

Initialize the weights to 0: wi = 0.

For each training vector and target pair si : ti ( i = 1, n ):

  Set the activations of the input neurons: xi = si
  Set the activation of the output neuron: y = ti
  Adjust the weights: wi(new) = wi(old) + xi*y
  Adjust the bias: b(new) = b(old) + y

Use only one pass through the training data.
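A minimal sketch of this one-pass algorithm (assuming NumPy), run on the bipolar AND data from the example that follows:

import numpy as np

# Bipolar AND training pairs ( s : t ) from the table below.
S = np.array([[ 1,  1],
              [ 1, -1],
              [-1,  1],
              [-1, -1]])
T = np.array([1, -1, -1, -1])

w = np.zeros(2)   # weights initialized to 0
b = 0.0           # bias

# One pass through the data: wi += xi*y, b += y.
for x, y in zip(S, T):
    w += x * y
    b += y

print(w, b)  # [2. 2.] -2.0, matching the final row of the worked example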

Hebb Learning Example ( AND Logic Gate ):

  x1  x2  bias | target y
   1   1    1  |   1
   1  -1    1  |  -1
  -1   1    1  |  -1
  -1  -1    1  |  -1

Initialize the weights to zero, then calculate the change in the weights and bias.

Recall:

  wi(new) = wi(old) + xi*y
  b(new) = b(old) + y

So, define:

  Δw1 = x1*y,  Δw2 = x2*y,  Δb = y

First training pair:

  x1  x2   b |  y | Δw1  Δw2  Δb | new w1  w2   b
   1   1   1 |  1 |   1    1   1 |      1   1   1

Since the initial weights = 0:

  wi(new) = wi(old) + xi*y = xi*y



Current Decision Boundary

  y = b + Σi xi*wi = 0   ( recall zero is the boundary )

  0 = b + x1*w1 + x2*w2

Solve for x2:

  x2 = -(w1/w2) x1 - b/w2

With the current weights ( w1 = 1, w2 = 1, b = 1 ):

  x2 = -x1 - 1

  x2 = 0:  x1 = -1
  x2 = -1: x1 = 0

[Figure: the boundary x2 = -x1 - 1 plotted with the training points, one + and three -.]


Using:

  wi(new) = wi(old) + xi*y
  b(new) = b(old) + y

and:

  Δw1 = x1*y,  Δw2 = x2*y,  Δb = y

Next data set ( second training pair ); since the previous weights are no longer 0, wi(new) = wi(old) + xi*y:

  x1  x2   b |  y | Δw1  Δw2  Δb | new w1  w2   b
   1  -1   1 | -1 |  -1    1  -1 |      0   2   0

Current Decision Boundary

  x2 = -(w1/w2) x1 - b/w2

With the current weights: x2 = 0

[Figure: the boundary x2 = 0 plotted with the training points, one + and three -.]



Using the same update rules, the next data set ( third training pair ):

  x1  x2   b |  y | Δw1  Δw2  Δb | new w1  w2   b
  -1   1   1 | -1 |   1   -1  -1 |      1   1  -1

Current Decision Boundary

  x2 = -(w1/w2) x1 - b/w2

With the current weights: x2 = -x1 + 1

[Figure: the boundary x2 = -x1 + 1 plotted with the training points, one + and three -.]

The boundary is now in the correct position, but there is one more data set to process.


Using the same update rules, the last data set ( fourth training pair ):

  x1  x2   b |  y | Δw1  Δw2  Δb | new w1  w2   b
  -1  -1   1 | -1 |   1    1  -1 |      2   2  -2

Final Decision Boundary

  x2 = -(w1/w2) x1 - b/w2

With the final weights: x2 = -x1 + 1

[Figure: the final boundary x2 = -x1 + 1 separates the + point from the three - points.]


Observations for Hebb's Learning

Weights only change for active input neurons ( xi ≠ 0 ).

Hebb learning will not always find the correct weights, even if they exist.


Perceptron Learning Algorithm

Developed by Frank Rosenblatt.

Will always converge to correct weights if they exist.

Incorporates the concept of a learning rate: you can control how fast the neuron learns.

Perceptron Learning Algorithm

Initialize the weights and bias to 0.

Set the learning rate α ( 0 < α <= 1 ).

Continue the process as long as the weights change:

  For each training pair, xi = si:

    Compute the response of the output neuron:

      y_in = b + Σi xi*wi

    Output of the neuron with input y_in:

      y = +1 if y_in > θ
           0 if -θ <= y_in <= θ
          -1 if y_in < -θ

( continued )

Perceptron Learning Algorithm ( continued )

If y ≠ t ( there is an error, so update the weights ):

  wi(new) = wi(old) + α*t*xi    ( note: if xi = 0, the weight does not change )
  b(new) = b(old) + α*t

ELSE:

  wi(new) = wi(old), b(new) = b(old)


Perceptron Learning Algorithm

As long as the weights or bias change at least once while processing the complete set of data, continue repeating the algorithm.

STOP when no weight or bias changes during a complete pass through the data.
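A minimal sketch of the full algorithm (assuming NumPy; the function name and the reuse of the bipolar AND data are our own choices):

import numpy as np

def train_perceptron(S, T, alpha=1.0, theta=0.0, max_epochs=100):
    # Perceptron learning: repeat passes over the data until a
    # complete pass produces no weight or bias change.
    w = np.zeros(S.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        changed = False
        for x, t in zip(S, T):
            y_in = b + np.dot(x, w)
            # Three-valued output around the threshold theta.
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                 # error: update weights and bias
                w += alpha * t * x
                b += alpha * t
                changed = True
        if not changed:
            break                      # STOP: no changes in a complete pass
    return w, b

# Bipolar AND data again.
S = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])
print(train_perceptron(S, T))  # converges to a separating w and b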