# Exercise 1. Function Approximation

Your task is to create and train a neural network that solves the XOR problem. XOR is a function that returns 1 when its two inputs are not equal, as the table below shows:



The XOR problem:

| A | B | A XOR B |
|---|---|---------|
| 1 | 1 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 0 | 0 | 0 |

To solve this we need a feedforward neural network with two input neurons and one output neuron. Because the problem is not linearly separable, it also needs a hidden layer with two neurons; one way two hidden units can realize XOR is sketched below.
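As an illustration (not part of the exercise), here is a minimal hand-built sketch of why two hidden units suffice: XOR can be decomposed as XOR(A,B) = AND(OR(A,B), NAND(A,B)), with one hidden unit per inner operation.

```matlab
% Hand-built XOR with two "hidden units" (a sketch in plain Matlab):
% XOR(A,B) = AND( OR(A,B), NAND(A,B) )
A  = [1 1 0 0];  B = [1 0 1 0];
h1 = double(A + B >= 1);    % OR:   1 when at least one input is 1
h2 = double(A + B <= 1);    % NAND: 1 unless both inputs are 1
y  = double(h1 + h2 >= 2)   % AND of the hidden units -> 0 1 1 0
```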

Now we know what our network should look like, but how do we create it?

To create a new feedforward neural network, use the command newff. You have to enter the min and max of the input values, the number of neurons in each layer and, optionally, the activation functions.

>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})

The variable net now contains an untrained feedforward neural network with two neurons in the input layer, two neurons in the hidden layer and one output neuron, exactly as we want it. The [0 1; 0 1] tells Matlab that the input values range between 0 and 1. The {'logsig','logsig'} tells Matlab that we want to use the logsig function as activation function in all layers. The first parameter tells the network how many nodes there should be in the input layer, so you do not have to specify this in the second parameter. You have to specify at least as many transfer functions as there are layers, not counting the input layer. If you do not specify any transfer functions, Matlab will use the default settings.
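For example, leaving out the cell array creates the same architecture with the toolbox defaults (a sketch; in this toolbox version the defaults are 'tansig' for hidden layers and 'purelin' for the output layer):

```matlab
% A sketch: the same 2-2-1 architecture with default transfer functions.
net_default = newff([0 1; 0 1],[2 1]);
```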

The logsig activation function
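You can reproduce the plot yourself: logsig is the logistic sigmoid, logsig(x) = 1/(1+exp(-x)), which squashes any input into the range (0, 1).

```matlab
% A sketch: plot the logsig activation function over [-5, 5].
x = -5:0.1:5;
plot(x, logsig(x))
title('logsig(x) = 1/(1+exp(-x))')
```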

Now we want to test how good our untrained network is on the XOR problem. First we construct a matrix of the inputs. The inputs to the network always go in the columns of the matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we enter:

>> input = [1 1 0 0; 1 0 1 0]

Now we have constructed inputs to our network. Let us push these into the network to see what it produces as output. The command sim is used to simulate the network and calculate the outputs; for more info on how to use the command, type helpwin sim. The simplest way to use it is to enter the name of the neural network and the input matrix; it returns an output matrix.

>> output = sim(net,input)

output =

    0.5923    0.0335    0.9445    0.3937

(not unique)

The output was not exactly what we wanted! We wanted (0 1 1 0) but got roughly (0.59 0.03 0.94 0.39). (Note that your network might give a different result, because the network's weights are given random values at initialization.)

You can now plot the output and the targets; the targets are the values that we want the network to generate. Construct the target vector:

>> target = [0 1 1 0]

To plot points we use the command "plot". We want the targets to be drawn as small circles, so we use the command:

>> plot(target, 'o')

We want to plot the output in the same window. Normally the contents of a window are erased when you plot something new in it. In this case we want the targets to remain in the picture, so we use the command hold on. The output is plotted as +'s.

>> hold on

>> plot(output, '+')

In the resulting figure below it is easy to see that the network does not give the wanted results. To change this we have to train it. First we will train it by hand.

Manually set weights

The network we have constructed so far does not really behave as it should. To correct this, the weights will be adjusted.
All the weights are stored in the net structure that was created with newff. The weights are numbered by the layers they connect and the neurons within those layers. To get the values of the weights between the input layer and the first hidden layer we type:

>> net.IW

ans =

    [2x2 double]
    []

>> net.IW{1,1}

ans =

    5.5008   -5.6975
    2.5404   -7.5011

This means that the weight from the second neuron in the input layer to the first neuron in the first hidden layer is -5.6975. To change it to 1, enter:

>> net.IW{1,1}(1,2)=1;

>> net.IW{1,1}

ans =

    5.5008    1.0000
    2.5404   -7.5011

The weights between the hidden layer and the output layer are stored in the .LW component, which can be used in the same manner as .IW.

>> net.LW

ans =

    []              []
    [1x2 double]    []

>> net.LW{2,1}

ans =

   -3.5779   -4.3080

The change we made to the weight makes our network give a different output when we simulate it; try it by entering:

>> output = sim(net,input)

output =

    0.8574    0.0336    0.9445    0.3937

>> plot(output,'g*');

Now the new output will appear as green stars in your picture.
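To see what sim actually computes, you can reproduce an output by hand from the weight matrices and the biases (a sketch; the bias vectors net.b{1} and net.b{2} take whatever values your own run produced):

```matlab
% A sketch of the computation sim performs for the input pair [1; 0]:
x = [1; 0];
h = logsig(net.IW{1,1}*x + net.b{1});   % hidden layer activations
y = logsig(net.LW{2,1}*h + net.b{2})    % network output, matches sim(net,x)
```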

Training Algorithms

In the neural network toolbox there are several training algorithms already implemented. That is good, because they can do the heavy work of training much more smoothly and faster than we can by adjusting the weights manually. Now let us apply the default training algorithm to our network. The Matlab command to use is train; it takes the network, the input matrix and the target matrix as input. The train command returns a new, trained network; for more information type helpwin train. In this example we do not need all the information that the training algorithm shows, so we turn it off by entering:

>> net.trainParam.show = NaN;

The most important training parameters are .epochs, which determines the maximum number of epochs to train, and .show, the interval between each presentation of training progress. If the gradient of the performance falls below .min_grad, the training is ended. The .time component determines the maximum time to train.
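For example, the parameters could be set like this (illustrative values, not the toolbox defaults):

```matlab
% A sketch with illustrative values:
net.trainParam.epochs = 100;   % train for at most 100 epochs
net.trainParam.show   = 10;    % report progress every 10 epochs
net.trainParam.time   = 60;    % stop after at most 60 seconds
```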

And to train the network enter:

>> net = train(net,input,target);

Because of the small size of the network, the training is done in only a second or two. Now we simulate the network again, to see how it reacts to the inputs:

>> output = sim(net,input)

output =

0.0000 1.0000 1.0000 0.0000

That was exactly what we wanted the network to output! You may now plot the output and see that the +'s fall on the o's. Now examine the weights that the training algorithm has set; do they look like the weights that you found?

>> net.IW{1,1}

ans =

   11.0358   -9.5595
   16.8909  -17.5570

>> net.LW{2,1}

ans =

   25.9797  -25.7624
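As a final check, you can round the outputs and compare them with the targets (a sketch):

```matlab
% A sketch: the rounded network outputs should match the targets exactly.
isequal(round(sim(net,input)), target)   % returns 1 (true)
```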

# Exercise 2. Prediction (Timeseries, stock …)

Solution to the timeseries competition using an MLP network.

Import the timeseries.txt file and transpose it to get the variable data (1 x 1000):

>> data = transpose(timeseries)

Reset the random generators to their initial state; these are used in the network creation, so resetting them makes the run reproducible:

>> randn('state',0);

>> rand('state',0);

The idea is to use the timeseries we have to set up a network, train it and test it.

We will give the network 3 inputs, using 997 data points, so we define a 3x997 matrix from the loaded timeseries. In the same way, we define our output matrix, which is composed of the fourth element through the end:

>> in_data = [data(1:end-3); data(2:end-2); data(3:end-1)];

>> out_data = data(4:end);

Now, split the data into training and test sets:

>> in_tr=in_data(:,1:900);

>> out_tr=out_data(:,1:900);

>> in_tst=in_data(:,901:end);

>> out_tst=out_data(:,901:end);
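A quick sanity check of the split (a sketch; in_data has 997 columns, so the test set gets the last 97):

```matlab
% A sketch: verify the dimensions of the training and test sets.
size(in_tr)    % ans = 3   900
size(in_tst)   % ans = 3    97
```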

Let N be the number of neurons in the hidden layer:

>> N = 7;

Now, to create a feed-forward network with 3 inputs, N hidden neurons with tanh nonlinearity and one output with a linear activation function, we use the same function as in Exercise 1, newff:

>> net = newff([min(in_tr,[],2) max(in_tr,[],2)],[N 1],{'tansig' 'purelin'});

>> net.trainParam.epochs = 1000;

>> V.P=in_tst;

>> V.T=out_tst;

Let's train the network; the default training method is Levenberg-Marquardt. The empty matrices are placeholders for unused arguments, and V supplies the test set for validation during training:

>> [net,tr] = train(net,in_tr,out_tr,[],[],V);

The structure of the network can be optimized by monitoring the error on the test set. Here is a list of test set errors as a function of the number of hidden neurons:

N=2:  1.6889e-03
N=3:  1.5500e-03
N=4:  6.0775e-10
N=5:  5.2791e-08
N=6:  3.2476e-08
N=7:  3.2816e-10
N=8:  6.3030e-10
N=9:  5.8722e-10
N=10: 2.7228e-09
N=15: 5.16033e-08
N=30: 2.3727e-11
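Each entry can be reproduced along these lines (a sketch; the error reported above is taken to be the mean squared error on the test set):

```matlab
% A sketch: mean squared error of the trained network on the test set.
out_pred = sim(net,in_tst);
mse_tst  = mean((out_pred - out_tst).^2)
```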

The number of delays could also be optimized in the same way; different random initializations would also give different values.
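For example, with four delays instead of three, the input and output matrices would be built like this (a sketch):

```matlab
% A sketch: input/output matrices for four delays instead of three.
in_data4  = [data(1:end-4); data(2:end-3); data(3:end-2); data(4:end-1)];
out_data4 = data(5:end);
```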

Now, the trained network can be used for predicting new datapoints recursively (ten steps ahead):

>> for i=1:10
>> data(end+1)=sim(net,data(end-2:end)');
>> end

>> data(end-9:end)
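The predicted continuation can be visualized against the original series (a sketch, assuming data was 1 x 1000 before the loop, so its last ten entries are the predictions):

```matlab
% A sketch: plot the original series and the ten predicted points.
plot(1:1000, data(1:1000), 'b'); hold on
plot(1001:1010, data(1001:1010), 'r+')
```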