# Feed-Forward Neural Networks

AI and Robotics

Oct 19, 2013

Introduction to Radial Basis Function Networks

Content:

Overview

The Model of a Function Approximator

RBFN's for Function Approximation

Learning the Kernels

Model Selection

Introduction to Radial Basis Function Networks

Overview

Typical Applications of NN:

Pattern Classification

Function Approximation

Time-Series Forecasting

Function Approximation

[Figure: an unknown function $f$ and a neural network trained as its approximator $\hat{f}$; under supervised learning, the network's output is compared with the unknown function's output and the error drives the updates.]

Neural Networks as Universal Approximators

Feedforward neural networks with a single hidden layer of sigmoidal units are capable of approximating uniformly any continuous multivariate function, to any desired degree of accuracy.

Hornik, K., Stinchcombe, M., and White, H. (1989). "Multilayer Feedforward Networks are Universal Approximators," Neural Networks, 2(5), 359–366.

Like feedforward neural networks with a single hidden layer of sigmoidal units, RBF networks can be shown to be universal approximators.

Park, J. and Sandberg, I. W. (1991). "Universal Approximation Using Radial-Basis-Function Networks," Neural Computation, 3(2), 246–257.

Park, J. and Sandberg, I. W. (1993). "Approximation and Radial-Basis-Function Networks," Neural Computation, 5(2), 305–316.

Statistics vs. Neural Networks

| Statistics | Neural Networks |
|---|---|
| model | network |
| estimation | learning |
| regression | supervised learning |
| interpolation | generalization |
| observations | training set |
| parameters | (synaptic) weights |
| independent variables | inputs |
| dependent variables | outputs |
| ridge regression | weight decay |

Introduction to Radial Basis Function Networks

The Model of a Function Approximator

Linear Models

$$y(\mathbf{x}) = \sum_{j=1}^{m} w_j\,\varphi_j(\mathbf{x})$$

Fixed basis functions $\varphi_j$; adjustable weights $w_j$.
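As a concrete illustration (not from the slides), a minimal NumPy sketch of a linear model over fixed basis functions; the function names and the choice of basis in the example are assumptions:

```python
import numpy as np

def design_matrix(X, basis_fns):
    """Phi[k, j] = phi_j(x_k): evaluate every fixed basis function on every input."""
    return np.column_stack([phi(X) for phi in basis_fns])

def linear_model(X, weights, basis_fns):
    """y(x) = sum_j w_j * phi_j(x): linear in the weights, not in x."""
    return design_matrix(X, basis_fns) @ weights

# Example: three fixed Gaussian bumps on the real line (illustrative choice).
bases = [lambda x, c=c: np.exp(-(x - c) ** 2) for c in (-1.0, 0.0, 1.0)]
y = linear_model(np.linspace(-2, 2, 5), np.array([0.5, 1.0, 0.5]), bases)
```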

Linear Models

[Figure: a linear model drawn as a network. The inputs $x_1, \dots, x_n$ form the feature vector $\mathbf{x}$; the hidden units $\varphi_1, \dots, \varphi_m$ perform feature extraction (a fixed transformation of $\mathbf{x}$); the output unit forms the linearly weighted sum $y = \sum_j w_j \varphi_j(\mathbf{x})$ with weights $w_1, \dots, w_m$.]

Decomposition: feature extraction (transformation), followed by a linearly weighted output.


Example Linear Models

Polynomial: $\varphi_j(x) = x^{j}$, giving $y(x) = \sum_{j=0}^{m} w_j x^{j}$.

Fourier series: $\varphi_j(x) \in \{\cos(j\omega_0 x), \sin(j\omega_0 x)\}$, giving a weighted sum of harmonics.
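A short sketch of these two bases as design-matrix builders; the helper names and the fundamental frequency `omega0` are illustrative assumptions:

```python
import numpy as np

def polynomial_design(x, degree):
    """Columns are the fixed polynomial basis 1, x, x^2, ..., x^degree."""
    return np.column_stack([x ** j for j in range(degree + 1)])

def fourier_design(x, order, omega0=1.0):
    """Columns are the fixed Fourier basis 1, cos(j*w0*x), sin(j*w0*x)."""
    cols = [np.ones_like(x)]
    for j in range(1, order + 1):
        cols += [np.cos(j * omega0 * x), np.sin(j * omega0 * x)]
    return np.column_stack(cols)
```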

Single-Layer Perceptrons as Universal Approximators

Here the hidden units are sigmoidal, e.g., $\varphi_j(\mathbf{x}) = \sigma(\mathbf{w}_j^{\top}\mathbf{x} + b_j)$ with $\sigma(t) = 1/(1 + e^{-t})$.

With a sufficient number of sigmoidal units, such a network is a universal approximator.
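A minimal sketch of the forward pass of such a network, assuming a single linear output unit (the names are illustrative):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def slp_forward(X, W, b, w_out):
    """y(x) = sum_j w_j * sigmoid(w_j . x + b_j): one hidden layer of
    m sigmoidal units followed by a linear output unit."""
    H = sigmoid(X @ W + b)   # hidden activations, shape (N, m)
    return H @ w_out         # linearly weighted output, shape (N,)
```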

Universal Approximators

Likewise, with a sufficient number of radial-basis-function units, such a network is also a universal approximator.

Non-Linear Models

In a non-linear model the basis functions themselves carry adjustable parameters, so the learning process must determine the basis functions along with the weights.

Introduction to Radial Basis Function Networks

Radial Basis Function Networks

A radial basis function has three parameters:

Center $\mathbf{x}_i$

Distance measure $r = \|\mathbf{x} - \mathbf{x}_i\|$

Shape $\varphi$

$$\varphi_i(\mathbf{x}) = \varphi(\|\mathbf{x} - \mathbf{x}_i\|)$$

Gaussian Basis Function

$$\varphi_i(\mathbf{x}) = \exp\!\left(-\frac{\|\mathbf{x} - \mathbf{x}_i\|^2}{2\sigma_i^2}\right)$$

[Figure: Gaussian basis functions with centers $c = 1, \dots, 5$ and widths $\sigma = 0.5, 1.0, 1.5$.]
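A one-function sketch of the Gaussian basis above, vectorized over a batch of inputs (the function name is mine):

```python
import numpy as np

def gaussian_rbf(X, center, width):
    """phi(x) = exp(-||x - c||^2 / (2 sigma^2)), for a batch of inputs X."""
    sq_dist = np.sum((X - center) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * width ** 2))
```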

Most General RBF

$$\varphi_i(\mathbf{x}) = \varphi\big((\mathbf{x} - \boldsymbol{\mu}_i)^{\top}\boldsymbol{\Sigma}_i^{-1}(\mathbf{x} - \boldsymbol{\mu}_i)\big)$$

The basis $\{\varphi_i : i = 1, 2, \dots\}$ is 'nearly' orthogonal.

Properties of RBF's

On-center, off-surround response.

Analogies with localized receptive fields found in several biological structures, e.g., the visual cortex and retinal ganglion cells.

The Topology of RBF

[Figure: inputs $x_1, \dots, x_n$, a layer of hidden units, and outputs $y_1, \dots, y_m$; the hidden layer performs a projection of the feature vector, the output layer an interpolation.]

As a function approximator.

The Topology of RBF

[Figure: the same topology; here each hidden unit responds to a subclass and each output unit to a class.]

As a pattern classifier.

Introduction to Radial Basis Function Networks

RBFN's for Function Approximation

The idea

[Figure sequence: training data $(x, y)$ drawn from an unknown function to approximate; basis functions (kernels) placed along the input axis; the function learned as a weighted sum of those kernels; finally, the learned function evaluated at non-training samples.]

Universal Approximators

[Figure: a single-output RBFN with inputs $x_1, \dots, x_n$ and weights $w_1, \dots, w_m$.]

$$y(\mathbf{x}) = \sum_{j=1}^{m} w_j\,\varphi_j(\mathbf{x})$$

Training set: $\mathcal{T} = \{(\mathbf{x}^{(k)}, d^{(k)})\}_{k=1}^{p}$

Goal: $y(\mathbf{x}^{(k)}) = d^{(k)}$ for all $k$.

Learn the Optimal Weight Vector

Given the training set $\{(\mathbf{x}^{(k)}, d^{(k)})\}_{k=1}^{p}$ and the goal $y(\mathbf{x}^{(k)}) = d^{(k)}$ for all $k$, find the weight vector $\mathbf{w} = (w_1, \dots, w_m)^{\top}$.

Regularization

$$E(\mathbf{w}) = \sum_{k=1}^{p}\big(d^{(k)} - y(\mathbf{x}^{(k)})\big)^2 + \lambda\sum_{j=1}^{m} w_j^2$$

If regularization is unneeded, set $\lambda = 0$.

Learn the Optimal Weight Vector

Minimize

$$E(\mathbf{w}) = \sum_{k=1}^{p}\Big(d^{(k)} - \sum_{j=1}^{m} w_j\,\varphi_j(\mathbf{x}^{(k)})\Big)^2 + \lambda\sum_{j=1}^{m} w_j^2$$

Define the design matrix $\boldsymbol{\Phi}$ with entries $\Phi_{kj} = \varphi_j(\mathbf{x}^{(k)})$ and the target vector $\mathbf{d} = (d^{(1)}, \dots, d^{(p)})^{\top}$. Setting $\partial E / \partial \mathbf{w} = 0$ gives the normal equations, whose solution is

$$\hat{\mathbf{w}} = \mathbf{A}^{-1}\boldsymbol{\Phi}^{\top}\mathbf{d}, \qquad \mathbf{A} = \boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} + \lambda\mathbf{I}$$

where $\mathbf{A}^{-1}$ is the variance matrix and $\boldsymbol{\Phi}$ the design matrix.
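A direct NumPy transcription of the closed-form solution above; the function name is mine, and `np.linalg.solve` is used rather than an explicit inverse for numerical stability:

```python
import numpy as np

def fit_rbf_weights(Phi, d, lam=0.0):
    """w_hat = (Phi^T Phi + lam I)^(-1) Phi^T d, via a linear solve."""
    m = Phi.shape[1]
    A = Phi.T @ Phi + lam * np.eye(m)   # the matrix A from the slide
    return np.linalg.solve(A, Phi.T @ d)
```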

Summary

Given the training set $\{(\mathbf{x}^{(k)}, d^{(k)})\}_{k=1}^{p}$ and fixed kernels $\varphi_j$, form the design matrix $\boldsymbol{\Phi}$ and compute $\hat{\mathbf{w}} = (\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} + \lambda\mathbf{I})^{-1}\boldsymbol{\Phi}^{\top}\mathbf{d}$.

Introduction to Radial Basis Function Networks

Learning the Kernels

RBFN's as Universal Approximators

[Figure: a multi-output RBFN with inputs $x_1, \dots, x_n$, kernels $\varphi_1, \dots, \varphi_l$, weights $w_{11}, \dots, w_{ml}$, and outputs $y_1, \dots, y_m$.]

$$y_i(\mathbf{x}) = \sum_{j=1}^{l} w_{ij}\,\varphi_j(\mathbf{x}), \qquad i = 1, \dots, m$$

Training set: $\{(\mathbf{x}^{(k)}, \mathbf{d}^{(k)})\}_{k=1}^{p}$. Kernels: e.g., Gaussian, $\varphi_j(\mathbf{x}) = \exp\!\big(-\|\mathbf{x} - \boldsymbol{\mu}_j\|^2 / 2\sigma_j^2\big)$.

What to Learn?

Weights $w_{ij}$'s

Centers $\boldsymbol{\mu}_j$'s of the $\varphi_j$'s

Widths $\sigma_j$'s of the $\varphi_j$'s

Number of $\varphi_j$'s → Model Selection


One-Stage Learning

All three sets of parameters (the weights, the kernel centers, and the kernel widths) are updated simultaneously, e.g., by gradient descent on the training error, as sketched below.

The simultaneous update of all three sets of parameters may be suitable for non-stationary environments or an on-line setting.
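A sketch of one such simultaneous update for a single-output Gaussian RBFN, assuming a squared-error loss and plain gradient descent; all names and the learning rate are illustrative:

```python
import numpy as np

def one_stage_step(X, d, centers, widths, w, lr=0.01):
    """One gradient step updating weights, centers, and widths together.
    Single-output RBFN with Gaussian kernels; loss E = 0.5 * sum(err^2)."""
    diff = X[:, None, :] - centers[None, :, :]        # (N, l, n): x_k - mu_j
    sq = np.sum(diff ** 2, axis=2)                    # (N, l): squared distances
    Phi = np.exp(-sq / (2 * widths ** 2))             # (N, l): kernel outputs
    err = Phi @ w - d                                 # (N,): y(x_k) - d_k
    grad_w = Phi.T @ err                              # dE/dw_j
    common = (err[:, None] * Phi) * w[None, :]        # shared factor err * w_j * phi_j
    grad_c = np.einsum('nl,nld->ld', common / widths ** 2, diff)   # dE/dmu_j
    grad_s = np.sum(common * sq / widths ** 3, axis=0)             # dE/dsigma_j
    return (w - lr * grad_w,
            centers - lr * grad_c,
            widths - lr * grad_s)
```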

Two-Stage Training

Step 1 determines the kernels (unsupervised): the centers $\boldsymbol{\mu}_j$'s of the $\varphi_j$'s, the widths $\sigma_j$'s of the $\varphi_j$'s, and the number of $\varphi_j$'s.

Step 2 determines the weights $w_{ij}$'s, e.g., using batch learning.

Train the Kernels

Unsupervised training, using one of:

Random subset selection

Clustering algorithms

Mixture models
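A sketch of the two stages, assuming k-means for the centers and a nearest-center spacing heuristic for the widths; both are common choices, not the only ones prescribed here:

```python
import numpy as np

def kmeans(X, l, iters=100, seed=0):
    """Plain k-means to place l kernel centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), l, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(2), axis=1)
        for j in range(l):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def two_stage_train(X, d, l, lam=1e-6):
    # Stage 1 (unsupervised): centers by clustering, widths from nearest-center spacing.
    centers = kmeans(X, l)
    dists = np.sqrt(((centers[:, None] - centers[None]) ** 2).sum(2))
    np.fill_diagonal(dists, np.inf)
    widths = dists.min(axis=1)
    # Stage 2 (supervised): weights by regularized least squares (batch learning).
    Phi = np.exp(-((X[:, None] - centers[None]) ** 2).sum(2) / (2 * widths ** 2))
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(l), Phi.T @ d)
    return centers, widths, w
```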

The Projection Matrix

With design matrix $\boldsymbol{\Phi}$ and $\mathbf{A} = \boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} + \lambda\mathbf{I}$, the model's fitted values at the training points of the unknown function are $\hat{\mathbf{y}} = \boldsymbol{\Phi}\hat{\mathbf{w}} = \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^{\top}\mathbf{d}$. The projection matrix

$$\mathbf{P} = \mathbf{I} - \boldsymbol{\Phi}\mathbf{A}^{-1}\boldsymbol{\Phi}^{\top}$$

maps the targets to the error vector $\mathbf{e} = \mathbf{P}\mathbf{d}$, the residuals at the training points.
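A minimal sketch of the projection matrix under the definitions above (the helper name is mine):

```python
import numpy as np

def projection_matrix(Phi, lam=0.0):
    """P = I - Phi (Phi^T Phi + lam I)^(-1) Phi^T.
    e = P @ d is the error vector: the residuals at the training points."""
    p, m = Phi.shape
    A = Phi.T @ Phi + lam * np.eye(m)
    return np.eye(p) - Phi @ np.linalg.solve(A, Phi.T)
```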