Artificial Neural Network Overview - GK-12 Outreach

AI and Robotics

Oct 20, 2013




An artificial neural network is a collection of connected model neurons. Taken one at a time, each neuron is rather simple. As a collection, however, a group of neurons is capable of producing complex results. In the following sections I will briefly summarize a mathematical model of a neuron, a neuron layer, and a neural network before discussing the types of behavior achievable from a neural network. Finally, I will conclude with a short description of the program included in this lesson so you can form networks that are tailored to your class.


Models


The models presented in this section appear fairly difficult mathematically. However, they eventually boil down to just multiplication and addition. The use of matrices and vectors simplifies the notation but is not absolutely required for this application.

Neuron Model


A model of a neuron has three basic parts: input weights, a summer, and an output function. The input weights scale values used as inputs to the neuron, the summer adds all the scaled values together, and the output function produces the final output of the neuron. Often, one additional input, known as the bias, is added to the system. If a bias is used, it can be represented by a weight with a constant input of one. This description is laid out visually below.



[Figure: neuron model. Inputs I1, I2, and I3 are scaled by weights W1, W2, and W3 and summed together with a bias B (a weight with a constant input of 1) to produce the intermediate value x; the output function f(x) then produces the final output a.]


Where I1, I2, and I3 are the inputs, W1, W2, and W3 are the weights, B is the bias, x is an intermediate output, and a is the final output. The equation for a is given by

a = f(x) = f(W1*I1 + W2*I2 + W3*I3 + B)

where f could be any function. Most often, f is the sign of the argument (i.e. 1 if the argument is positive and -1 if the argument is negative), linear (i.e. the output is simply the input times some constant factor), or some complex curve used in function matching (not needed here). For this model we will use the first case, where f is the sign of the argument, for two reasons: it closely matches the 'all or nothing' property seen in biological neurons, and it is fairly easy to implement.


When artificial neurons are implemented, vectors are commonly used to represent the inputs and the weights, so the first of two brief reviews of linear algebra is appropriate here. The dot product of two vectors W = [W1 W2 W3] and I = [I1 I2 I3] is given by W.I = W1*I1 + W2*I2 + W3*I3. Using this notation the output is simplified to

a = f(W.I + B)

where all the inputs are contained in I and all the weights are contained in W.
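As a concrete illustration, the neuron model above can be computed in a few lines. This is a minimal sketch (the lesson's own code is in Matlab, but plain Python is used here so the arithmetic is easy to follow; the weight and input values are made up):

```python
def sign(x):
    # 'all or nothing' output: 1 for a positive argument, -1 otherwise
    return 1 if x > 0 else -1

def neuron(inputs, weights, bias):
    # dot product of the weights and inputs, plus the bias,
    # passed through the output function
    x = sum(w * i for w, i in zip(weights, inputs)) + bias
    return sign(x)

# Three inputs, three weights, and a bias:
# x = 0.5*1 + (-0.2)*0 + 0.3*1 + (-0.1) = 0.7, so the output is 1
print(neuron([1, 0, 1], [0.5, -0.2, 0.3], -0.1))
```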


Neuron Layer


In a neuron layer each input is tied to every neuron, and each neuron produces its own output. This can be represented mathematically by the following series of equations:

a1 = f1(W1,1*I1 + W1,2*I2 + W1,3*I3 + B1)
a2 = f2(W2,1*I1 + W2,2*I2 + W2,3*I3 + B2)
. . .
am = fm(Wm,1*I1 + Wm,2*I2 + Wm,3*I3 + Bm)

NOTE: In general these functions may be different; however, I will take them to be the sign of the argument from now on.


And we will take our second digression into linear algebra. We need to recall that to perform the operation of matrix multiplication, you take each column of the second matrix and perform the dot product operation with each row of the first matrix to produce each element in the result. For example, the dot product of the ith column of the second matrix and the jth row of the first matrix results in the (j, i) element of the result. If the second matrix is only one column, then the result is also one column.


Keeping matrix multiplication in mind, we stack the weights so that each row of a matrix represents the weights of one neuron. Now, representing the input vector and the biases as one-column matrices, we can simplify the above notation to:

a = f(W.I + B)

which is the final form of the mathematical representation of one layer of artificial neurons.
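The layer equation can be sketched in the same plain-Python style, with the weight matrix stored as one list of weights per neuron (the values here are illustrative):

```python
def sign(x):
    return 1 if x > 0 else -1

def layer(inputs, weight_matrix, biases):
    # each row of the weight matrix holds the weights of one neuron;
    # the layer produces one output per neuron
    outputs = []
    for row, b in zip(weight_matrix, biases):
        x = sum(w * i for w, i in zip(row, inputs)) + b
        outputs.append(sign(x))
    return outputs

W = [[0.5, -0.2, 0.3],   # weights of neuron 1
     [-0.4, 0.1, 0.2]]   # weights of neuron 2
B = [-0.1, 0.0]          # one bias per neuron
print(layer([1, 0, 1], W, B))
```

Each row of W is dotted with the same input vector, which is exactly the matrix-times-column multiplication described in the linear algebra review.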


Neural Network


A neural network is simply a collection of neuron layers where the output of each previous layer becomes the input to the next layer. So, for example, the inputs to layer two are the outputs of layer one. In this exercise we are keeping it relatively simple by not having feedback (i.e. output from layer n being input for some previous layer). To mathematically represent the neural network we only have to chain together the equations. The finished equation for the three-layer network used in this lesson is given by:

a3 = f3(W3.f2(W2.f1(W1.p + B1) + B2) + B3)

where p is the network's input vector and Wn and Bn are the weight matrix and bias column of layer n.
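Chaining layers in code is just feeding one layer's output into the next. A minimal Python sketch (the layer sizes and weight values below are made up for illustration):

```python
def sign(x):
    return 1 if x > 0 else -1

def layer(inputs, W, B):
    return [sign(sum(w * i for w, i in zip(row, inputs)) + b)
            for row, b in zip(W, B)]

def network(p, layers):
    # layers is a list of (weight_matrix, biases) pairs;
    # each layer's output becomes the next layer's input
    a = p
    for W, B in layers:
        a = layer(a, W, B)
    return a

# A three-layer network on a three-element input
layers = [([[0.5, -0.2, 0.3], [-0.4, 0.1, 0.2]], [-0.1, 0.0]),  # layer 1: 3 in -> 2 out
          ([[1.0, -1.0], [0.5, 0.5]],            [0.0, 0.0]),   # layer 2: 2 in -> 2 out
          ([[1.0, 1.0]],                         [0.5])]        # layer 3: 2 in -> 1 out
print(network([1, 0, 1], layers))
```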

Neural Network Behavior


Although transistors now switch in as little as 0.000000000001 seconds and biological neurons take about 0.001 seconds to respond, we have not been able to approach the complexity or the overall speed of the brain because, in part, of the large number of neurons (approximately 100,000,000,000) that are highly connected (approximately 10,000 connections per neuron). Although not as advanced as biological brains, artificial neural networks still perform many important functions in a wide range of applications, including sensing, controls, pattern recognition, and categorization. Generally, networks (including our brains) are trained to achieve a desired result. The training mechanisms and rules are beyond the scope of this paper; however, it is worth mentioning that generally good behavior is rewarded while bad behavior is punished. That is to say that when a network performs well it is modified only slightly (if at all), and when it performs poorly larger modifications are made. As a final thought on neural network behavior, it is worth noting that if the output functions of the neurons are all linear, the network is reducible to a one-layer network: two linear layers W2(W1p + B1) + B2 collapse into a single layer with weights W2W1 and bias W2B1 + B2. In other words, to have a useful network of more than one layer we must use a function like the sigmoid (an s-shaped curve), the sign function we used above, a linear function that saturates, or any other nonlinear curve.
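The collapse of purely linear layers can be checked numerically. A quick Python check (the 2x2 weights and biases are arbitrary example values):

```python
def matvec(M, v):
    # multiply a matrix by a column vector
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmat(A, B):
    # the (i, j) element is the dot product of row i of A with column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def vecadd(u, v):
    return [a + b for a, b in zip(u, v)]

W1, B1 = [[1.0, 2.0], [0.0, 1.0]], [0.5, -0.5]
W2, B2 = [[2.0, 0.0], [1.0, 1.0]], [0.0, 1.0]
p = [1.0, 3.0]

# two linear layers (f(x) = x) applied in sequence...
two_layer = vecadd(matvec(W2, vecadd(matvec(W1, p), B1)), B2)
# ...equal one layer with weights W2*W1 and bias W2*B1 + B2
one_layer = vecadd(matvec(matmat(W2, W1), p), vecadd(matvec(W2, B1), B2))
print(two_layer == one_layer)  # the two forms agree
```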


Matlab Code


This section covers the parameters in my Matlab code that you might choose to modify if you decide to create a network with inputs and outputs other than what has already been documented in this lesson. Before using my code you should be aware that it was not written to solve general neural network problems, but rather to find a network by randomly trying values. This means that it could loop forever even if a solution to your inputs and outputs exists. If you do not get a good result after a few minutes you may want to stop the execution and change your parameters. Finally, I will not claim that I have worked all bugs out of this program, so you should check your results carefully before executing them in a classroom setting.


p1, p2, and p3 are input patterns for three different inputs. Each input pattern consists of three elements pertaining to different attributes of the input. For example, in my lesson I used redness, roundness, and softness. Here, for instance, a one in the first position means that an object is red while a zero indicates that it is not red.


a1, a2, and a3 are output patterns. They need to be initialized to be incorrect (that way the program enters the loop rather than bypasses it). The second argument of the conditionals for the loop should be the desired results. In my case, I chose to have one neuron in the last layer be an indicator for each object. When that object was used as an input for the network, that neuron would end up being a one while the other neurons in the last layer would be negative one (if everybody did their math correctly). More explicitly, when the first element of a1 is not a positive one then it is wrong and I want to do the loop again. In a similar manner, when the second element of a1 is not a negative one it is wrong and I want to do the loop again. And the same goes for the rest of the outputs.
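The random search described above can be sketched as follows. This is a Python sketch of the same idea, not the lesson's Matlab code: a single layer is used for brevity, and the patterns, targets, and safety cap on the number of tries are illustrative:

```python
import random

def sign(x):
    return 1 if x > 0 else -1

def layer(inputs, W, B):
    return [sign(sum(w * i for w, i in zip(row, inputs)) + b)
            for row, b in zip(W, B)]

def random_layer(n_out, n_in):
    # draw every weight and bias uniformly at random
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    B = [random.uniform(-1, 1) for _ in range(n_out)]
    return W, B

# Input patterns (redness, roundness, softness) and desired indicator outputs:
# one neuron per object should be +1 for its object and -1 for the others.
patterns = [[1, 1, 0], [0, 1, 1], [1, 0, 0]]
targets  = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]

random.seed(0)
outputs = [[0, 0, 0]] * 3   # deliberately wrong so the loop is entered
tries = 0
while outputs != targets and tries < 200000:
    W, B = random_layer(3, 3)
    outputs = [layer(p, W, B) for p in patterns]
    tries += 1
print("matched all targets after", tries, "random tries")
```

As the lesson warns, a loop like this may run for a long time even when a solution exists, which is why a cap on the number of tries is included here.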


Note that there is one known bug involving non-terminating decimals (in binary, 0.1 is non-terminating). It is possible that a 0.0000 is taken to be positive rather than zero.