Application of Back-Propagation neural network in data forecasting
Le Hai Khoi, Tran Duc Minh
Institute Of Information Technology - VAST
Ha Noi - Viet Nam
Acknowledgement
The authors want to express their thanks to Prof. Junzo WATADA, who read this work and gave us valuable comments.
Authors
CONTENT
Introduction
Steps in data forecasting modeling using neural network
Determine network’s topology
Application
Concluding remarks
Introduction
• Neural networks are “Universal Approximators”.
• Finding a suitable model for a data forecasting problem is very difficult; in practice, it may be possible only by trial and error.
• We may treat the data forecasting problem as a kind of data processing problem.
[Figure 1: Data Processing. Blocks: data collecting and analyzing, pre-processing, neural networks, post-processing.]
Steps in data forecasting modeling using neural network
The work involved is:
* Data pre-processing: determining the data interval (daily, weekly, monthly or quarterly), the data type (technical index or basic index), and the method used to normalize the data (max/min or mean/standard deviation).
* Training: determining the learning rate, momentum coefficient, stop condition, maximum number of cycles, weight randomization, and the sizes of the training set, test set and verification set.
* Network’s topology: determining the number of inputs, the number of hidden layers, the number of neurons in each layer, the number of neurons in the output layer, the transformation functions for the layers, and the error function.
Steps in data forecasting modeling using neural network
The major steps in designing the data forecasting model are as follows:
1. Choosing variables
2. Data collection
3. Data pre-processing
4. Dividing the data set into smaller sets: training, test and verification
5. Determining the network’s topology: number of hidden layers, number of neurons in each layer, number of neurons in the output layer, and the transformation function
6. Determining the error function
7. Training
8. Implementation
These steps need not be performed strictly in sequence. We may return to earlier steps, especially the training and variable-choosing steps: if the chosen variables give unexpected results during the design period, we need to choose another set of variables and repeat the training step.
Choosing variables and Data collection
Determine which variables are related, directly or indirectly, to the data that we need to forecast.
• If a variable has no effect on the value of the data that we need to forecast, we should remove it from consideration.
• Otherwise, if the variable is related directly or indirectly, we should take it into consideration.
Collect the data for the variables that are chosen.
Data pre-processing
Analyze and transform the values of the input and output data to emphasize the important features and to detect the trends and the distribution of the data.
Normalize the real input and output values into the interval between the max and min of the transformation function (usually the [0, 1] or [-1, 1] interval).
The most popular methods are the following:
SV = ((0.9 - 0.1) / (MAX_VAL - MIN_VAL)) * (OV - MIN_VAL)
Or:
SV = TFmin + ((TFmax - TFmin) / (MAX_VAL - MIN_VAL)) * (OV - MIN_VAL)
where:
SV: Scaled Value
MAX_VAL: Max value of data
MIN_VAL: Min value of data
TFmax: Max of transformation function
TFmin: Min of transformation function
OV: Original Value
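The general scaling formula (the one with TFmin and TFmax) can be sketched in Python as below. The function names `scale` and `unscale`, the sample series, and the choice of the [-1, 1] target interval are assumptions made for illustration; the inverse mapping is added because forecasts usually have to be converted back to original units in post-processing.

```python
def scale(ov, min_val, max_val, tf_min=0.0, tf_max=1.0):
    """Map an original value OV from [MIN_VAL, MAX_VAL] to [TFmin, TFmax]."""
    if max_val == min_val:
        raise ValueError("MAX_VAL and MIN_VAL must differ")
    return tf_min + (tf_max - tf_min) / (max_val - min_val) * (ov - min_val)


def unscale(sv, min_val, max_val, tf_min=0.0, tf_max=1.0):
    """Invert scale(): recover the original value from a scaled value."""
    return min_val + (sv - tf_min) * (max_val - min_val) / (tf_max - tf_min)


series = [12.0, 15.0, 20.0, 18.0]   # made-up sample data
lo, hi = min(series), max(series)
scaled = [scale(v, lo, hi, -1.0, 1.0) for v in series]
restored = [unscale(s, lo, hi, -1.0, 1.0) for s in scaled]
```

MIN_VAL and MAX_VAL should be kept with the trained network, since the same values are needed to scale new inputs and to unscale the outputs.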
Dividing patterns set
Divide the whole pattern set into smaller sets:
(1) Training set
(2) Test set
(3) Verification set.
The training set is usually the biggest set and is employed in training the network. The test set, often 10% to 30% of the size of the training set, is used to test generalization. The size of the verification set should balance the need for enough patterns for verification, training, and testing.
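One possible way to carry out this division is sketched below. It assumes the patterns are ordered in time and keeps that order, with the most recent patterns reserved for verification; the function name `split_patterns`, the 20% test ratio (relative to the training set) and the 10% verification fraction (relative to the whole set) are illustrative choices, not values given in the slides.

```python
def split_patterns(patterns, test_ratio=0.2, verification_fraction=0.1):
    """Split an ordered pattern list into training, test and verification sets.

    test_ratio is the size of the test set relative to the training set;
    verification_fraction is relative to the whole pattern set.
    """
    n = len(patterns)
    n_verification = round(n * verification_fraction)
    remaining = n - n_verification
    n_training = round(remaining / (1.0 + test_ratio))
    training = patterns[:n_training]
    test = patterns[n_training:remaining]
    verification = patterns[remaining:]
    return training, test, verification


patterns = list(range(100))   # stand-in for 100 time-ordered patterns
training, test, verification = split_patterns(patterns)
```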
Determining network’s topology
This step determines the links between neurons, the number of hidden layers, and the number of neurons in each layer.
1. How the neurons in the network are connected to each other.
2. The number of hidden layers should not exceed two.
3. There is no method to find the optimal number of neurons to use in the hidden layers.
=> Issues 2 and 3 can only be settled by trial and error, since they depend on the problem that we are dealing with.
Determining the error function
• The error function is used to estimate the network’s performance before and after the training process.
• The function used in evaluation is usually the mean squared error. Other functions may be: least absolute deviation, percentage differences, asymmetric least squares, etc.
Performance index
$F(x) = E[e^T e] = E[(t - a)^T (t - a)]$
Approximate performance index
$\hat{F}(x) = e^T(k)\, e(k) = (t(k) - a(k))^T (t(k) - a(k))$
• The final measure of forecast quality is usually the Mean Absolute Percentage Error (MAPE).
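As a small worked example, the mean squared error and MAPE can be computed as follows. The function names and the sample targets and outputs are made up for illustration; MAPE is written in its usual form, which assumes that no target value is zero.

```python
def mean_squared_error(targets, outputs):
    """Average of the squared errors (t - a)^2 over all patterns."""
    return sum((t - a) ** 2 for t, a in zip(targets, outputs)) / len(targets)


def mape(targets, outputs):
    """Mean Absolute Percentage Error, in percent; targets must be non-zero."""
    return 100.0 * sum(abs((t - a) / t)
                       for t, a in zip(targets, outputs)) / len(targets)


targets = [100.0, 200.0, 400.0]   # made-up actual values
outputs = [110.0, 190.0, 400.0]   # made-up forecasts
mse = mean_squared_error(targets, outputs)   # (100 + 100 + 0) / 3
error_pct = mape(targets, outputs)           # (10% + 5% + 0%) / 3
```

Because MAPE is relative, it allows forecasts on series of different magnitudes to be compared, which the squared error does not.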
Training
Training tunes a neural network by adjusting its weights and biases, with the aim of reaching the global minimum of the performance index or error function.
When to stop the training process?
1. Should it stop only when there is no noticeable progress of the error function on the data, starting from a randomly chosen parameter set?
2. Should it regularly examine the generalization ability of the network by checking the network after a pre-determined number of cycles?
3. A hybrid solution is to have a monitoring tool, so that we can stop the training process or let it run until there is no noticeable progress.
4. The result on the verification set is the most convincing, since it is obtained directly from the network after training.
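The hybrid stopping rule can be sketched as a monitoring loop like the one below. This is only an illustration: `train_one_cycle` and `test_error` are hypothetical callbacks standing in for a real network, and the thresholds (`check_every`, `patience`, `min_improvement`) are assumed values that would need tuning for a given problem.

```python
def train_with_monitoring(train_one_cycle, test_error, max_cycles=1000,
                          check_every=10, patience=5, min_improvement=1e-4):
    """Train, checking the test-set error every `check_every` cycles.

    Stops when the error has not improved by `min_improvement` for
    `patience` consecutive checks, or when `max_cycles` is reached.
    Returns (cycles_run, best_error).
    """
    best_error = float("inf")
    checks_without_progress = 0
    cycle = 0
    for cycle in range(1, max_cycles + 1):
        train_one_cycle()
        if cycle % check_every != 0:
            continue
        error = test_error()
        if best_error - error > min_improvement:
            best_error = error
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress >= patience:
                break
    return cycle, best_error


# Toy stand-in for a network whose error decays and then levels off at 0.05.
state = {"error": 1.0}


def fake_cycle():
    state["error"] = max(0.05, state["error"] * 0.9)


cycles_run, best = train_with_monitoring(fake_cycle, lambda: state["error"])
```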
Implementation
This is the last step, after we have determined the factors related to the network’s topology, the choice of variables, etc.
1. Which environment: electronic circuits or a PC.
2. The interval at which to re-train the network: this may depend on time and on other factors related to our problem.
Determine network’s topology
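Multi-layer feed-forward neural networks
[Figure 2: Multi-layer feed-forward neural networks. A two-layer network: input P ($R^1 \times 1$); layer 1 with $W^1$ ($S^1 \times R^1$), $b^1$ ($S^1 \times 1$), $n^1$ ($S^1 \times 1$), $f^1$ and $a^1$ ($S^1 \times 1$); layer 2 with $W^2$ ($S^2 \times S^1$), $b^2$ ($S^2 \times 1$), $n^2$ ($S^2 \times 1$), $f^2$ and $a^2$ ($S^2 \times 1$).]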
where:
P: input vector (column vector)
$W^i$: weight matrix of the neurons in layer i ($S^i \times R^i$: $S^i$ rows (neurons), $R^i$ columns (number of inputs))
$b^i$: bias vector of layer i ($S^i \times 1$, for $S^i$ neurons)
$n^i$: net input ($S^i \times 1$)
$f^i$: transformation function (activation function)
$a^i$: net output ($S^i \times 1$)
Summation node: SUM function
i = 1 .. N, where N is the total number of layers.
$a^2 = f^2(W^2 f^1(W^1 p + b^1) + b^2)$
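The two-layer output expression can be evaluated directly, as in the sketch below. The weights, biases and input are arbitrary illustrative numbers, and the choice of a sigmoid for f1 and a linear function for f2 is an assumption; the slides leave the transformation functions open.

```python
import math


def logsig(n):
    """Sigmoid transformation function."""
    return 1.0 / (1.0 + math.exp(-n))


def layer_output(W, b, p, f):
    """Compute a = f(W p + b) for one layer; W is given as a list of rows."""
    return [f(sum(w * x for w, x in zip(row, p)) + bias)
            for row, bias in zip(W, b)]


def forward(p, W1, b1, W2, b2, f1=logsig, f2=lambda n: n):
    """a2 = f2(W2 f1(W1 p + b1) + b2) for a two-layer network."""
    a1 = layer_output(W1, b1, p, f1)
    return layer_output(W2, b2, a1, f2)


# Hypothetical network: R1 = 2 inputs, S1 = 2 hidden neurons, S2 = 1 output.
W1 = [[0.5, -0.5], [1.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [[2.0, -2.0]]
b2 = [0.5]
a2 = forward([1.0, 1.0], W1, b1, W2, b2)
```

Here the net inputs of layer 1 are 0 and 1, so a1 is [0.5, logsig(1)] and the single output is 1.5 - 2 * logsig(1).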
Determine training algorithm and network’s topology
[Figure 3: Multi-layered feed-forward neural network layout. Labels: inputs $x_1, x_2, \ldots, x_n$; bias units; weights $w_{ij}$, $w_{jk}$, $w_{kl}$; input layer, hidden layers, output layer; output.]
The transfer function is a sigmoid or any squashing function that is differentiable:
$f(x) = \dfrac{1}{1 + e^{-\delta x}}$ and $f'(x) = \delta f(x)\{1 - f(x)\}$, which for $\delta = 1$ is $f'(x) = f(x)\{1 - f(x)\}$.
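The sigmoid and its derivative can be checked numerically with a short sketch. For a steepness parameter δ the chain rule gives a derivative of δ f(x){1 - f(x)}, which reduces to f(x){1 - f(x)} when δ = 1. The function names and the test point x = 0.3 are illustrative assumptions.

```python
import math


def sigmoid(x, delta=1.0):
    """f(x) = 1 / (1 + exp(-delta * x))."""
    return 1.0 / (1.0 + math.exp(-delta * x))


def sigmoid_derivative(x, delta=1.0):
    """f'(x) = delta * f(x) * (1 - f(x))."""
    fx = sigmoid(x, delta)
    return delta * fx * (1.0 - fx)


# Compare the analytic derivative with a central finite difference.
x, h = 0.3, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic = sigmoid_derivative(x)
numeric_steep = (sigmoid(x + h, 2.0) - sigmoid(x - h, 2.0)) / (2 * h)
analytic_steep = sigmoid_derivative(x, 2.0)
```

Expressing the derivative through f(x) itself is what makes the sigmoid convenient for back-propagation: the value already computed in the forward pass is reused.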
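Back-propagation algorithm
Step 1: Feed the inputs forward through the network:
$a^0 = p$
$a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})$, where m = 0, 1, ..., M - 1.
$a = a^M$
Step 2: Back-propagate the sensitivities (errors):
$s^M = -2 \dot{F}^M(n^M)(t - a)$ at the output layer
$s^m = \dot{F}^m(n^m)(W^{m+1})^T s^{m+1}$ at the hidden layers
where m = M - 1, ..., 2, 1, and $\dot{F}^m(n^m)$ is the diagonal matrix of transfer-function derivatives of layer m.
Step 3: Finally, the weights and biases are updated by the following formulas, where $\alpha$ is the learning rate:
$W^m(k+1) = W^m(k) - \alpha s^m (a^{m-1})^T$
$b^m(k+1) = b^m(k) - \alpha s^m$
(Details on constructing the algorithm and other related issues can be found in the textbook Neural Network Design.)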
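The three steps can be sketched in Python for a small two-layer network, as below. The sensitivity and update formulas used here follow the standard presentation in Neural Network Design for a squared-error performance index. The network shape, the sigmoid hidden layer, the linear output layer, the learning rate, the toy data (t = p squared) and all function names are assumptions made for illustration.

```python
import math
import random


def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))


def matvec(W, v):
    """Matrix-vector product; W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def backprop_step(W1, b1, W2, b2, p, t, alpha=0.1):
    """One update on a single pattern (p, t); returns the squared error."""
    # Step 1: feed forward (sigmoid hidden layer, linear output layer).
    a1 = [logsig(n + b) for n, b in zip(matvec(W1, p), b1)]
    a2 = [n + b for n, b in zip(matvec(W2, a1), b2)]
    e = [ti - ai for ti, ai in zip(t, a2)]
    # Step 2: sensitivities. The derivative of the linear output is 1,
    # and the derivative of the sigmoid is a * (1 - a).
    s2 = [-2.0 * ei for ei in e]
    s1 = [a * (1.0 - a) * sum(W2[k][j] * s2[k] for k in range(len(s2)))
          for j, a in enumerate(a1)]
    # Step 3: update weights and biases (in place).
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] -= alpha * s2[k] * a1[j]
        b2[k] -= alpha * s2[k]
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] -= alpha * s1[j] * p[i]
        b1[j] -= alpha * s1[j]
    return sum(ei * ei for ei in e)


random.seed(0)
W1 = [[random.uniform(-0.5, 0.5)] for _ in range(3)]      # 1 input, 3 hidden
b1 = [random.uniform(-0.5, 0.5) for _ in range(3)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(3)]]      # 1 output
b2 = [0.0]
patterns = [([x / 4.0], [(x / 4.0) ** 2]) for x in range(5)]  # toy data
errors = []
for epoch in range(200):
    errors.append(sum(backprop_step(W1, b1, W2, b2, p, t)
                      for p, t in patterns))
```

The hidden-layer sensitivities are computed before any weight is changed, so the update of layer 1 uses the same W2 that produced the forward pass.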
Using Momentum
This is a heuristic method based on the observation of training results.
The standard back-propagation algorithm adds the following terms to the weights and biases at each update, where $\alpha$ is the learning rate:
$\Delta W^m(k) = -\alpha s^m (a^{m-1})^T$, $\Delta b^m(k) = -\alpha s^m$.
When a momentum coefficient $\gamma$ is used, these equations become:
$\Delta W^m(k) = \gamma \Delta W^m(k-1) - (1 - \gamma)\alpha s^m (a^{m-1})^T$, $\Delta b^m(k) = \gamma \Delta b^m(k-1) - (1 - \gamma)\alpha s^m$.
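A minimal sketch of the momentum update follows. It writes the learning rate as `alpha` and the momentum coefficient as `gamma`; those names, their values (0.1 and 0.8) and the sample gradient terms are assumptions for illustration. The list `g` stands for the flattened term s^m (a^{m-1})^T for weights, or s^m for biases.

```python
def momentum_update(prev_delta, g, alpha=0.1, gamma=0.8):
    """Delta(k) = gamma * Delta(k-1) - (1 - gamma) * alpha * g."""
    return [gamma * d - (1.0 - gamma) * alpha * gi
            for d, gi in zip(prev_delta, g)]


delta = [0.0, 0.0]
# Made-up gradient terms; the second component flips sign on the last step.
gradient_terms = [[1.0, -2.0], [1.0, -2.0], [1.0, 2.0]]
history = []
for g in gradient_terms:
    delta = momentum_update(delta, g)
    history.append(delta)
```

In this example the second component of the change stays positive on the third step even though its gradient term changed sign, which illustrates how momentum smooths out oscillations in the weight trajectory.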
Application
[Class diagram. Elements: NEURAL NET class, Output layer, Hidden layer, LAYER class; one relation is marked friend. Legend: a plain arrow denotes an inheritance relation; a diamond-headed arrow denotes an aggregation relation.]
The NEURAL NET class includes components that are instances of Output Layer and Hidden Layer. Input Layer is not implemented here since it does not do any calculation on the input data.
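The class structure can be sketched in Python roughly as below. This is not the authors' implementation: it assumes that the hidden and output layers inherit from the LAYER class and that the NEURAL NET class aggregates one of each, and it uses a sigmoid hidden layer and a linear output layer as illustrative choices. All class names, method names and sizes are hypothetical, and the friend relation is omitted because Python has no direct equivalent.

```python
import math
import random


class Layer:
    """Base class: a weight matrix, a bias vector and a transfer function."""

    def __init__(self, n_inputs, n_neurons, rng):
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                        for _ in range(n_neurons)]
        self.biases = [rng.uniform(-0.5, 0.5) for _ in range(n_neurons)]

    def transfer(self, n):
        raise NotImplementedError

    def forward(self, inputs):
        return [self.transfer(sum(w * x for w, x in zip(row, inputs)) + b)
                for row, b in zip(self.weights, self.biases)]


class HiddenLayer(Layer):
    def transfer(self, n):
        return 1.0 / (1.0 + math.exp(-n))   # sigmoid


class OutputLayer(Layer):
    def transfer(self, n):
        return n                            # linear output (an assumption)


class NeuralNet:
    """Aggregates a hidden layer and an output layer; no input-layer object,
    since the input layer performs no calculation on the data."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = random.Random(seed)
        self.hidden = HiddenLayer(n_inputs, n_hidden, rng)
        self.output = OutputLayer(n_hidden, n_outputs, rng)

    def forward(self, inputs):
        return self.output.forward(self.hidden.forward(inputs))


net = NeuralNet(n_inputs=3, n_hidden=4, n_outputs=1)
prediction = net.forward([0.1, 0.5, 0.9])
```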
Concluding remarks
Determining the major tasks is important and realistic. It will help develop more accurate data forecasting systems and give researchers a deeper look at implementing solutions with neural networks.
In fact, successfully applying a neural network depends on three major factors:
First, the time spent choosing the variables from a large quantity of data and pre-processing those data;
Second, the software should provide functions to examine the generalization ability, help find the optimal number of neurons for the hidden layer, and verify with many input sets;
Third, the developers need to consider and examine all the possibilities each time they check the network’s operation with various input sets and network topologies, so that the chosen solution describes the problem exactly and gives the most accurate forecasts.
THANK YOU FOR ATTENDING!
Authors
Kitakyushu 03/2004