Predictive Modeling Using Neural Network - UUM

apricotpiglet, Artificial Intelligence and Robotics

19 Oct 2013

PREDICTIVE MODELING USING NEURAL NETWORK

Organic Neural Network


The human brain has tens of billions of highly interconnected neurons
acting in parallel.


Each neuron may receive electrochemical
signals from other neurons.


If the right signal is received by the inputs,
the neuron is activated and sends signals to
other neurons.

Artificial Neural Network (ANN)


An ANN offers a mathematical model that attempts to
mimic the human brain.

Knowledge is often represented as a layered set
of interconnected processors, also known as nodes or neurodes.


Each node has a weighted connection to
several other nodes in adjacent layers.

Each individual node takes the input received
from connected nodes and uses the weights
together with a simple function to compute
output values.

Artificial vs Organic NN

[Figure: a biological neuron alongside an artificial hidden unit]

Artificial Neural Network (ANN)


Several ANN architectures exist.


We limit our discussion to the most popular
structures in supervised classification by
examining the
Multilayer Perceptron (MLP)


MLP is a feed-forward network composed of
an input layer, hidden layers (composed of
hidden units) and an output layer.


Multilayer Perceptron

[Figure 1: A fully connected feed-forward neural network, showing the input layer, the hidden layers (made up of hidden units) and the output layer]

Multilayer Perceptron (MLP)


Figure 1 shows a fully connected feed-forward neural network
structure together with a single input instance [1.0, 0.4, 0.7].

Arrows indicate the direction of flow for each new instance as it
passes through the network.


The number of input variables within individual instances determines
the number of input layer nodes.


The user specifies the number of hidden layers as well as the
number of nodes within a specific hidden layer.


Each hidden unit receives a linear combination of input variables.
The coefficients are called the weights.


An activation function transforms the linear combination and then
outputs the result to another unit (as input).

Determining the best choice for the number of hidden layers and
hidden units is a matter of experimentation.


In practice, the total number of hidden layers is usually restricted to
two.


The output layer of the neural network may contain one or several
nodes (depending on the application).

Neural Network Input Format


Input to individual nodes must be numeric
and fall in the closed interval [0,1].

- How do we transform categorical data?
- How do we convert numerical data falling outside the range?


Neural Network Input Format


Transforming categorical data:

Eg: Attribute Color = {Red, Green, Blue, Yellow}

Method 1:

- Divide the interval range into equal-size units.
- Red = 0.00, Green = 0.33, Blue = 0.67, Yellow = 1.00
- Weakness: the modification incorporates a measure of distance not
  seen prior to the conversion. The distance between red & green is
  less than the distance between red & yellow, so it appears as though
  the color red is more similar to green than to yellow.

Neural Network Input Format


Transforming categorical data:

Eg: Attribute Color = {Red, Green, Blue, Yellow}

Method 2:

- Use additional input nodes to represent the colors.
- Red = [0,0], Green = [0,1], Blue = [1,0], Yellow = [1,1]
- Reduces the bias of the previous method: the colors no longer lie
  along a single ordered scale, although some pairs (e.g. red and
  yellow) still differ in both positions.
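
The effect of the two color encodings on distance can be checked with a short sketch. The dictionaries and the `distance` helper are illustrative names that do not appear in the original notes, and Euclidean distance is assumed as the measure of similarity.

```python
import math

# Method 1: a single input node, colors spread evenly over [0, 1].
method1 = {"Red": [0.00], "Green": [0.33], "Blue": [0.67], "Yellow": [1.00]}

# Method 2: two input nodes, one binary code per color.
method2 = {"Red": [0, 0], "Green": [0, 1], "Blue": [1, 0], "Yellow": [1, 1]}

def distance(a, b):
    """Euclidean distance between two encoded values (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Compare red against every other color under each encoding.
for name, enc in (("Method 1", method1), ("Method 2", method2)):
    for other in ("Green", "Blue", "Yellow"):
        d = distance(enc["Red"], enc[other])
        print(f"{name}: Red vs {other} = {d:.2f}")
```

Under Method 1, red sits at three different distances from the other colors. Under Method 2, green and blue are equally far from red, although yellow is still somewhat farther (about 1.41).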

Neural Network Input Format


Converting numerical data to the range [0,1]:

Eg: Attribute Income = {1000, 2000, 3000, 4000}

Method 1:

- Divide all attribute values by the largest attribute value (4000).
- 1000 → 0.25, 2000 → 0.5, 3000 → 0.75 and 4000 → 1
- Weakness: we cannot take advantage of the entire interval
  range unless we have some values close to zero.

Neural Network Input Format


Converting numerical data to the range [0,1]:

Eg: Attribute Income = {1000, 2000, 3000, 4000}

Method 2:

- Use the formula:

$$\text{newValue} = \frac{\text{originalValue} - \text{minimumValue}}{\text{maximumValue} - \text{minimumValue}}$$

- 1000 → 0.0, 2000 → 0.33, 3000 → 0.67 and 4000 → 1.0
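
Both numeric conversions can be written in a few lines. This is a minimal sketch; the function names `scale_by_max` and `min_max_scale` are made up for illustration, and the guard against a zero range is an added assumption not discussed in the notes.

```python
def scale_by_max(values):
    """Method 1: divide every value by the largest value."""
    largest = max(values)
    return [v / largest for v in values]

def min_max_scale(values):
    """Method 2: (value - min) / (max - min), so min maps to 0 and max to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling: the formula is undefined when all values are equal.
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]

income = [1000, 2000, 3000, 4000]
print(scale_by_max(income))                           # [0.25, 0.5, 0.75, 1.0]
print([round(v, 2) for v in min_max_scale(income)])   # [0.0, 0.33, 0.67, 1.0]
```

Method 2 uses the whole [0,1] interval regardless of how far the smallest value is from zero, which is the weakness noted for Method 1.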

Sigmoid Function


The purpose of each node within a feed-forward NN
is to accept input values and pass an output value to
the next higher network layer.

A hidden or output layer node n takes input from the
connected nodes of the previous layer, combines
the previous layer node values into a single value,
and uses the new value as input to an evaluation
function.

The output of the evaluation function is a number in
the closed interval [0,1].

This value represents the output of node n.

Initial weight values for the Neural Network

W1j     W1i     W2j     W2i     W3j     W3i     Wjk     Wik
0.20    0.10    0.30    -0.10   -0.10   0.20    0.10    0.50



Consider node j. To compute the input to node j, we sum the
products of each input weight and its corresponding input
layer node value:

(0.2)(1.0) + (0.3)(0.4) + (-0.1)(0.7) = 0.25

0.25 is the input value for node j's evaluation function.


Evaluation function:

- The output value must be in the [0,1] interval range.
- It should output a value close to 1 when sufficiently excited.



Sigmoid function:

$$f(x) = \frac{1}{1 + e^{-x}}$$

f(0.25) = 0.562, which represents the output of node j.
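
The node j calculation can be extended into a full forward pass for the input instance [1.0, 0.4, 0.7]. This is only a sketch under an assumed reading of the weight names: W1j is taken to connect input node 1 to hidden node j, W1i to connect input node 1 to hidden node i, and Wjk and Wik to connect the two hidden nodes to a single output node k. The helper names are illustrative.

```python
import math

def sigmoid(x):
    """Sigmoid evaluation function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def weighted_sum(values, weights):
    """Sum of each value multiplied by its connection weight."""
    return sum(v * w for v, w in zip(values, weights))

inputs = [1.0, 0.4, 0.7]        # input instance from Figure 1

# Assumed layout: index n holds the weight from input node n+1.
w_to_j = [0.20, 0.30, -0.10]    # W1j, W2j, W3j
w_to_i = [0.10, -0.10, 0.20]    # W1i, W2i, W3i
w_jk, w_ik = 0.10, 0.50         # hidden-to-output weights

in_j = weighted_sum(inputs, w_to_j)
out_j = sigmoid(in_j)
out_i = sigmoid(weighted_sum(inputs, w_to_i))
out_k = sigmoid(out_j * w_jk + out_i * w_ik)

print(f"input to j = {in_j:.2f}, output of j = {out_j:.3f}")
print(f"output of i = {out_i:.3f}, output of k = {out_k:.3f}")
```

The first line reproduces the values worked out by hand above (0.25 and 0.562); the outputs of nodes i and k depend on the assumed network layout.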

Neural Network Output Format


Output nodes of a neural network represent continuous
values in the range [0,1].

However, the output can be transformed to accommodate
categorical class values.

Example: suppose we wish to train a NN to recognize
new credit card customers likely to take advantage of a
special promotion. We design our network architecture
with two output layer nodes, node 1 and node 2. During
training, we indicate the correct output for
customers who have taken advantage of previous
promotions as 1 for the first output node and 0 for the
second output node, and vice versa for customers who
traditionally do not take advantage of promotions.

Neural Network Output Format

Output Layer                          Node 1    Node 2
Take advantage of promotion           1         0
Do not take advantage of promotion    0         1



Eg: A customer with the output combination node 1 = 0.9 and
node 2 = 0.2 is likely to take advantage of a promotion.

Weakness: What about a customer with an output combination
of 0.2, 0.3? Such output values are hard to interpret.

Alternative: build a NN with a single output layer node even when the output
is categorical.

Neural Network Output Format


Using a NN with a single output layer node, we can be
confident about classifying an output value of 0.8 as a
customer likely to take advantage of the promotion.

What if the value is 0.45? Present a special test
dataset to the trained network and record the output value for
each instance. Then apply the network to the unknown
instance. When an unknown instance x shows an uncertain
output value v, we classify x with the category shown by the
majority of test set instances clustering at or near v.
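
The majority rule for uncertain outputs can be sketched as follows. The function name, the `radius` parameter that defines "at or near v", and the sample test-set data are all assumptions made for illustration; the notes do not specify how near is near enough.

```python
from collections import Counter

def classify_uncertain(v, test_outputs, test_labels, radius=0.05):
    """Label v by majority vote among test instances whose
    recorded output lies within `radius` of v."""
    nearby = [label for out, label in zip(test_outputs, test_labels)
              if abs(out - v) <= radius]
    if not nearby:
        return None  # no test instance is close enough to decide
    return Counter(nearby).most_common(1)[0][0]

# Made-up test-set outputs and their known classes, for illustration only.
test_outputs = [0.41, 0.44, 0.46, 0.48, 0.52, 0.90]
test_labels = ["no", "no", "yes", "no", "yes", "yes"]

print(classify_uncertain(0.45, test_outputs, test_labels))  # no
```

Here four test instances fall within 0.05 of the uncertain value 0.45, and three of them belong to the "no" class, so the unknown instance is classified as "no".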

Neural Network Output Format


Since the output of a neural network is a value between 0 and 1,
how do we convert the result to predict the actual value
of the target?

Eg: Assume that a NN has been trained to predict the future
price of our favorite stock. Suppose the output value is 0.35.
We need to convert this output to determine the future stock
price.


Method:

$$\text{Predicted value} = (\text{maxValue} - \text{minValue}) \times \text{Output} + \text{minValue}$$

If the training data price range is $10.00 to $100.00, the
predicted value would be:

$$(100 - 10) \times 0.35 + 10 = 41.50$$

that is, $41.50.
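
The conversion back to the original scale is the inverse of min-max scaling. A minimal sketch, with a hypothetical function name `to_actual_value`:

```python
def to_actual_value(output, min_value, max_value):
    """Map a network output in [0, 1] back to the original value range."""
    return (max_value - min_value) * output + min_value

# Stock price example: training prices ranged from $10.00 to $100.00.
predicted = to_actual_value(0.35, 10.00, 100.00)
print(f"${predicted:.2f}")  # $41.50
```

An output of 0 maps to the minimum training value and an output of 1 to the maximum, so predictions cannot fall outside the range seen in training.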

Neural Network Training

Supervised: Backpropagation learning, Genetic learning

Unsupervised: Kohonen self-organizing map

This section is not discussed in detail. Please refer to hardcopy notes.

NN : General considerations


What input attributes will be used to build the
networks?


How will the network output be represented?


How many hidden layers should the network
contain?


How many nodes should there be in each hidden
layer?


What condition will terminate network training?

There is no right answer to these questions.

However, we can use the experimental process to help
us achieve the desired results.

NN : Strengths


Work well with datasets containing large amounts of
noisy input data. NN evaluation functions such as
the sigmoid function naturally smooth input data
variation caused by outliers and random error.

Can process and predict numeric as well as
categorical outcomes. However, categorical data
conversion can be tricky.

Perform consistently well in several domains.


Can be used for both supervised learning and
unsupervised clustering.

NN : Weaknesses


Probably the biggest criticism of NNs is that
they lack the ability to explain their behavior.

NN learning algorithms are not guaranteed to
converge to an optimal solution.


NN can easily be overtrained to the point of
working well on the training data but poorly
on test data.

Neural Networks Application


Pattern recognition ~ recognizing handwriting,
signatures, biometrics, texture analysis, signal processing
or even human emotions.

Control ~ autonomous robots, self-guided underwater
rovers, manufacturing, autonomous vehicles, washing
machines, smart homes.

Medical applications ~ automated heart attack
detection, cancer cell analysis, disease diagnosis,
outbreak analysis.

Business applications ~ fraud detection, market
segmentation, sales (revenue) forecasting, stock
exchange.