# Neural Networks - KBS

Artificial Intelligence and Robotics

19 Oct 2013

Neural Networks

Slides from: Doug Gray, David Poole

What is a Neural Network?

inspired by the way biological nervous
systems, such as the brain, process
information

A method of computing, based on the
interaction of multiple connected
processing elements

What can a Neural Net do?

Compute a known function

Approximate an unknown function

Pattern Recognition

Signal Processing

Learn to do any of the above

Basic Concepts

A Neural Network generally
maps a set of inputs to a set
of outputs

Number of inputs/outputs is
variable

The Network itself is
composed of an arbitrary
number of nodes with an
arbitrary topology

Basic Concepts

Definition of a node:

A node is an element
which performs the
function

y = f_H( Σ (w_i · x_i) + W_b )


Properties

Inputs are flexible

any real values

Highly correlated or independent

Target function may be discrete-valued, real-valued, or vectors of discrete or real values

Outputs are real numbers between 0 and 1

Resistant to errors in the training data

Long training time

Fast evaluation

The function produced can be difficult for
humans to interpret

Perceptrons

Basic unit in a neural network

Linear separator

Parts

N inputs, x1 … xn

Weights for each input, w1 … wn

A bias input x0 (constant) and associated weight w0

Weighted sum of inputs, y = w0x0 + w1x1 + … + wnxn

A threshold (activation) function, e.g. output 1 if y > 0, −1 if y ≤ 0
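As a sketch, the perceptron computation described above could be written as follows (illustrative Python, not from the slides):

```python
def perceptron_output(weights, inputs):
    """weights = [w0, w1, ..., wn]; inputs = [x1, ..., xn].

    The bias input x0 is the constant 1, with associated weight w0.
    """
    y = weights[0] * 1  # bias term: w0 * x0
    for w, x in zip(weights[1:], inputs):
        y += w * x      # weighted sum of the remaining inputs
    return 1 if y > 0 else -1  # threshold (activation) function
```

For example, `perceptron_output([-1.0, 2.0], [1.0])` computes −1 + 2 = 1 > 0 and returns 1.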

Diagram

[Figure: inputs x0, x1, x2, …, xn with weights w0, w1, w2, …, wn feed a summation unit computing y = Σ wixi, followed by a threshold that outputs 1 if y > 0 and −1 otherwise]

Typical Activation Functions

F(x) = 1 / (1 + e^(−x))

Using a nonlinear function which approximates a linear threshold allows a network to approximate nonlinear functions
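The sigmoid above is a one-liner; a small sketch (assuming the standard logistic function):

```python
import math

def sigmoid(x):
    """Logistic activation F(x) = 1 / (1 + e^(-x)).

    A smooth approximation of a hard 0/1 threshold: large positive
    inputs map near 1, large negative inputs map near 0.
    """
    return 1.0 / (1.0 + math.exp(-x))
```

Because it is differentiable everywhere, unlike the hard threshold, it is the usual choice for gradient-based training.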

Simple Perceptron

Binary logic application

f_H(x) = u(x) [linear threshold]

W_i = random(−1, 1)

Y = u(W_0·X_0 + W_1·X_1 + W_b)

Now how do we train it?

Basic Training

Perceptron learning rule:

ΔW_i = η · (D − Y) · X_i

η = learning rate

D = desired output

Y = actual output

Adjust weights based on how well the current weights match the objective

Logic Training

Expose the network to the logical
OR operation

Update the weights after each
epoch

As the output approaches the desired output for all cases, ΔW_i will
approach 0

| X0 | X1 | D |
|----|----|---|
| 0  | 0  | 0 |
| 0  | 1  | 1 |
| 1  | 0  | 1 |
| 1  | 1  | 1 |
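The OR training procedure above can be sketched as follows (Python; the seed, learning rate, and epoch count are illustrative choices, not from the slides):

```python
import random

def u(x):
    """Linear threshold: 1 if x > 0, else 0 (matches the 0/1 targets D)."""
    return 1 if x > 0 else 0

# Logical OR truth table: (X0, X1) -> D
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

random.seed(0)
w0, w1, wb = (random.uniform(-1, 1) for _ in range(3))  # W_i = random(-1, 1)
eta = 0.1                                               # learning rate

for epoch in range(100):
    for (x0, x1), d in data:
        y = u(w0 * x0 + w1 * x1 + wb)
        # Perceptron learning rule: dW_i = eta * (D - Y) * X_i
        w0 += eta * (d - y) * x0
        w1 += eta * (d - y) * x1
        wb += eta * (d - y) * 1  # bias input is the constant 1
```

Because OR is linearly separable, the updates shrink to 0 once every case is classified correctly, as the slide notes.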

Results

[Plot: weight values W0, W1, Wb over the training epochs]

Details

Network converges on a hyperplane decision
surface

X_1 = −(W_0/W_1)·X_0 − (W_b/W_1)

[Plot: the decision boundary in the (X0, X1) plane]

Feed-forward neural networks

Feed-forward neural networks are the most common
models.

These are directed acyclic graphs.

Neural Network for the news
example

Axiomatizing the Network

The values of the attributes are real numbers.

Thirteen parameters w0, …, w12 are real numbers.

The attributes h1 and h2 correspond to the values of the
hidden units.

There are 13 real numbers to be learned. The
hypothesis space is thus a 13-dimensional real space.

Each point in this 13-dimensional space corresponds
to a particular logic program that predicts a value for
the output given known, new, short, and home.

Prediction Error

Neural Network Learning

Aim of neural network learning: given a set
of examples, find parameter settings that
minimize the error.

Back-propagation learning is gradient-descent search
through the parameter space to minimize the
sum-of-squares error.

Backpropagation Learning

Inputs:

A network, including all units and their
connections

Stopping Criteria

Learning Rate (constant of proportionality of
gradient-descent search)

Initial values for the parameters

A set of classified training data

Output: Updated values for the parameters

Backpropagation Learning
Algorithm

Repeat

evaluate the network on each example given
the
current parameter settings

determine the derivative of the error for each
parameter

change each parameter in proportion to its
derivative

until the stopping criterion is met
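The loop above can be sketched on a hypothetical toy network with two hidden units. For simplicity this sketch estimates each derivative by finite differences and uses a fixed epoch count as its stopping criterion; real back-propagation computes the same derivatives analytically by propagating the error backward through the network:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(w, x0, x1):
    # Toy network: two inputs, two hidden units, one output (9 parameters).
    h1 = sigmoid(w[0] + w[1] * x0 + w[2] * x1)
    h2 = sigmoid(w[3] + w[4] * x0 + w[5] * x1)
    return sigmoid(w[6] + w[7] * h1 + w[8] * h2)

def error(w, data):
    # Sum-of-squares error over the training examples.
    return sum((predict(w, x0, x1) - d) ** 2 for (x0, x1), d in data)

def train(data, eta=0.5, epochs=2000, eps=1e-5):
    random.seed(1)
    w = [random.uniform(-1, 1) for _ in range(9)]  # initial parameter values
    for _ in range(epochs):                        # repeat ...
        for i in range(len(w)):
            w_step = list(w)
            w_step[i] += eps
            # ... determine the derivative of the error for each parameter
            deriv = (error(w_step, data) - error(w, data)) / eps
            # ... change each parameter in proportion to its derivative
            w[i] -= eta * deriv
    return w

# Small example: learn logical AND.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train(AND)
```

After training, the network's output on (1, 1) is close to 1 and its output on the other cases is close to 0.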

Learning

Bias in neural networks and
decision trees

It’s easy for a neural network to represent “at least
two of I_1, …, I_k are true”:

w_0 = −15, w_1 = … = w_k = 10
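A quick check of those weights as a single threshold unit (Python; the function name is illustrative):

```python
def at_least_two(inputs):
    """Linear threshold unit with bias weight w0 = -15 and
    input weights of 10, as on the slide; inputs are 0/1 truth values."""
    total = -15 + sum(10 * i for i in inputs)
    return 1 if total > 0 else 0
```

With one true input the sum is 10 − 15 < 0; with two it is 20 − 15 > 0, so the unit fires exactly when at least two inputs are true.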

This concept forms a large decision tree.

Consider representing a conditional: “if c then a
else b”:

Simple in a decision tree.

Needs a complicated neural network to represent
(c ∧ a) ∨ (¬c ∧ b).

Neural Networks and Logic

Meaning is attached to the input and
output units.

There is no a priori meaning associated
with the hidden units.

What the hidden units actually represent is
something that’s learned.