Topic_4B_ANN

Oct 19, 2013

ICT619 Intelligent Systems

Topic 4: Artificial Neural Networks

Artificial Neural Networks

PART A


Introduction


An overview of the biological neuron


The synthetic neuron


Structure and operation of an ANN


Problem solving by an ANN


Learning in ANNs


ANN models


Applications

PART B



Developing neural network applications



Design of the network



Training issues



A comparison of ANN and ES



Hybrid ANN systems



Case Studies


Developing neural network
applications

Neural Network Implementations



Three possible practical implementations of ANNs are:

1. A software simulation program running on a digital computer

2. A hardware emulator connected to a host computer - called a neurocomputer

3. True electronic circuits

Software Simulations of ANN


Currently the cheapest and simplest implementation method for ANNs - at least for general-purpose use.

Simulates parallel processing on a conventional sequential digital computer

Replicates temporal behaviour of the network by updating the activation level and output of each node for successive time steps

These steps are represented by iterations or loops

Within each loop, the updates for all nodes in a layer are performed.
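The update loop described above can be sketched as follows (an illustration only, not part of the unit materials; the layer sizes, sigmoid activation and small random weights are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_step(weights, biases, inputs):
    """One time step: update activation level and output of every node,
    layer by layer - a sequential loop standing in for parallel updates."""
    activation = inputs
    for W, b in zip(weights, biases):
        # Within each loop, the updates for all nodes in a layer are performed
        activation = sigmoid(W @ activation + b)
    return activation

rng = np.random.default_rng(0)
# A 3-input, 4-hidden, 2-output network with small random weights
weights = [rng.normal(0, 0.1, (4, 3)), rng.normal(0, 0.1, (2, 4))]
biases = [np.zeros(4), np.zeros(2)]
out = simulate_step(weights, biases, np.array([0.5, -0.2, 0.9]))
```

Each pass through the `for` loop completes one layer's processing before its output is used for the following layer, as described on the next slide.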


Software simulations of ANN
(cont’d)


In multilayer ANNs, processing for a layer is
completed and its output used to calculate states of
the nodes in the following layer



Typical additional features of ANN simulators

1.
Configuring the net according to a chosen architecture and
node operational characteristic

2.
Implementation of training phase using a chosen training
algorithm

3.
Tools for visualising and analysing behaviour of nets



ANN simulators are written in high-level languages such as C, C++ and Java.

Advantages and possible problems with software simulators

Main attraction of ANN simulators is the relatively low cost and wide availability of ready-made commercial packages


They are also compact, flexible and highly portable.



Writing your own simulator requires programming skills
and would be time consuming (except that you don't
have to now!)


Training of ANNs using software simulators can be slow for larger networks (more than a few hundred nodes)

Commercially available neural net
packages



Prewritten shells with convenient user interfaces


Cost a few hundred to tens of thousands of dollars



Allow users to specify the ANN design and training
parameters


Usually provide graphic interfaces to enable monitoring
of the net’s training and operation


Likely to provide interfacing with other software
systems such as spreadsheets and databases.


Neurocomputers


Dedicated special-purpose digital computer (aka accelerator board)

Optimised to perform operations common in neural network simulation

Acts as a coprocessor to a host computer and is controlled by a program running on the host.

Can be tens to thousands of times faster than simulators

Systems are available with approx. 1,000 million connection updates per second for networks with 8,192 neurons, eg the ACC Neural Network Processor

Neurocomputers

Genobyte's CAM-Brain Machine was developed between 1997 and 2000

True Networks in Hardware


Closer to biological neural networks than simulations




Consist of synthetic neurons actually fabricated on
silicon chips



Commercially available hardwired ANNs are limited to a few thousand neurons per chip [1].

Chips connected in parallel to achieve larger networks.

Problems: interconnection and interference, fixed-valued weights - work progressing on modifiable synapses.

[1] Figures more than five years old.

Neural Network Development
Methodology


Aims to add structure and organisation to ANN application development, reducing cost and increasing accuracy, consistency, user-friendliness and user confidence

Split development into the following phases:


The Concept Phase


The Design Phase


The Implementation Phase


The Maintenance Phase


Neural Network Development Methodology - the Concept Phase


Involves


Validating the proposed application


Selecting an appropriate neural paradigm.


Application validation

Problem characteristics suitable for neural network
application are:


Data intensive


Multiple interacting parameters


Incomplete, erroneous, noisy data


Solution function unknown or expensive


Requires flexibility, generalisation, fault-tolerance, speed

ANN Development Methodology - the Concept Phase (cont’d)



Common examples of applications with above
attributes are


pattern recognition (eg, printed or handwritten characters, consumer behaviour, risk patterns),


forecasting (eg, stock market), signal (audio, video, ultrasound)
processing






Problems not suitable for ANN-based solutions include:


A mathematically accurate and precise solution is available


Solution involving deduction and step
-
wise logic appropriate


Applications involving explanation or reporting



One application area that is unsuitable for ANNs is resource management, eg, inventory, accounts, sales data analysis

Selecting an ANN paradigm



Decision based on comparison of application requirements
to capabilities of different paradigms


eg, the multilayer perceptron is well known for its pattern recognition capabilities,

the Kohonen net is more suited for applications involving data clustering



Choice of paradigm also influenced by the training method
that can be employed


eg, supervised training must have an adequate number of input-correct output pairs available, and training may take a relatively long time



Technical and economic feasibility assessments should be
carried out to complete the concept phase


The Design Phase


The design phase specifies initial values and
conditions at the node, network and training levels



Decisions to be made at the node level include:

Types of input - binary (0,1), bipolar (-1,+1), trivalent (-1,0,+1), discrete, continuous-valued

Transfer function - step or threshold, hyperbolic tangent, sigmoid; consider possible use of lookup tables for speeding up calculations



Decisions to be made at the network architecture level include:

The number and size of layers and their connectivity (fully interconnected or sparsely interconnected, feedforward or recurrent, other?)
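The node-level options above (step/threshold, hyperbolic tangent, sigmoid, and a lookup table for speeding up calculations) can be sketched as follows (an illustration; the table range and resolution are arbitrary assumptions):

```python
import numpy as np

def step(x):
    """Step/threshold function: binary 0/1 output."""
    return np.where(x >= 0.0, 1.0, 0.0)

def sigmoid(x):
    """Sigmoid: smooth output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: bipolar output in (-1, +1)."""
    return np.tanh(x)

# Lookup table: precompute the sigmoid on a grid and interpolate,
# trading a little accuracy for a cheaper per-node evaluation
grid = np.linspace(-8.0, 8.0, 4097)
table = sigmoid(grid)

def sigmoid_lut(x):
    return np.interp(x, grid, table)
```

The lookup-table variant replaces the exponential with a table read plus linear interpolation, which is the kind of speed-up the slide alludes to.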


The Design Phase (cont’d)


'Size' of a layer is the number of nodes in the layer


For the input layer, size is determined by number of data
sources (input vector components) and possibly the
mathematical transformations done



The number of nodes in the output layer is determined
by the number of classes or decision values to be output



Finding optimal size of the hidden layer needs some
experimentation


Too few nodes will produce inadequate mapping, while
too many may result in inadequate generalisation



The Design Phase (cont’d)

Connectivity


Connectivity determines the flow of signals between
neurons in the same or different layers



Some ANN models, such as the multilayer perceptron, have only interlayer connections - there is no intralayer connection



The Hopfield net is an example of a model with
intralayer connections


The Design Phase (cont’d)

Feedback


There may be no feedback of output values, eg, the
multilayer perceptron


or


There may be feedback, as in a recurrent network, eg, the Hopfield net



Other design questions include


Setting of parameters for the learning phase


eg,
stopping criterion, learning rate.


Possible addition of noise to speed up training.

The Implementation phase

Typical steps:


Gathering the training set


Selecting the development environment


Implementing the neural network


Testing and debugging the network



Gathering the training set



Aims to get the right type of data, in adequate amount and in the right format

Gathering training data (cont’d)


How much data to gather?


Increasing data amount increases training time but may
help earlier convergence


Quality more important than quantity



Collection of data


Potential sources - historical records, instrument readings, simulation results



Preparation of data


Involves preprocessing including scaling, normalisation,
binarisation, mapping to logarithmic scale, etc.
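The preprocessing steps listed above might look like this in practice (a sketch on invented readings; the threshold and scale choices are assumptions):

```python
import numpy as np

raw = np.array([12.0, 340.0, 7.5, 98.0, 1500.0])  # invented instrument readings

# Scaling into [0, 1] (min-max)
scaled = (raw - raw.min()) / (raw.max() - raw.min())

# Normalisation to zero mean and unit variance
normalised = (raw - raw.mean()) / raw.std()

# Binarisation against a threshold (here: the median)
binary = (raw > np.median(raw)).astype(float)

# Mapping to a logarithmic scale for widely ranging values
log_mapped = np.log10(raw)
```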




Gathering training data (cont’d)


Type of data to collect should be representative of
given problem including routine, unusual and
boundary
-
condition cases



Mix of good as well as imperfect data but not
ambiguous or too erroneous.





Selecting the development
environment

Hardware and software aspects


Hardware requirements based on


speed of operation


memory and storage capacity


software availability


cost


compatibility



The most popular platforms are workstations and high-end PCs (with accelerator board option)

Selecting the development
environment

Two options in choosing software:

1. Custom-coded simulators - which require more expertise on the part of the user but provide maximum flexibility

2. Commercial development packages - which are usually easy to use because of a more sophisticated interface

Selecting the development
environment (cont’d)


Selection of hardware and software environment is usually based on the following considerations:


ANN paradigm to be implemented


Speed in training and recall


Transportability


Vendor support


Extensibility


Price



Implementing the neural network


Common steps involved are:


Selection of appropriate neural paradigm


Setting network size


Deciding on the learning algorithm


Creation of screen displays


Determining the halting criteria


Collecting data for training and testing


Data preparation including preprocessing



Organising data into training and test sets
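The last step above - organising data into training and test sets - can be sketched as follows (an illustration; the 80/20 split and shuffling are assumed conventions, not prescribed by the methodology):

```python
import numpy as np

def split_data(X, y, test_fraction=0.2, seed=0):
    """Shuffle the examples, then hold out a fraction as the test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

X = np.arange(20.0).reshape(10, 2)   # 10 examples, 2 features each
y = np.arange(10)
X_train, y_train, X_test, y_test = split_data(X, y)
```

Shuffling before splitting helps both sets contain routine as well as unusual cases, in line with the data-gathering advice earlier.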


Implementation - Training



Training the net, which consists of


Loading the training set


Initialisation of network weights - usually to small random values


Starting the training process


Monitoring the training process until training
is completed


Saving of weight values in a file for use
during operation mode
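The training steps above can be sketched with a single-node perceptron (an illustration only; the AND training set, the perceptron update rule, the 0.5 threshold and the file name are all assumptions, not part of the unit materials):

```python
import numpy as np

# Load the training set (here: the AND function as input/correct-output pairs)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 0.0, 0.0, 1.0])

# Initialise network weights - usually to small random values
rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.1, 2)
b = 0.0
rate = 0.1

# Start training, monitoring total error until training is completed
for epoch in range(200):
    total_error = 0.0
    for x, target in zip(X, t):
        y = 1.0 if x @ w + b >= 0.5 else 0.0   # step activation
        err = target - y
        w += rate * err * x                     # perceptron weight adjustment
        b += rate * err
        total_error += abs(err)
    if total_error == 0.0:                      # stopping criterion
        break

# Save weight values in a file for use during operation mode
np.save("trained_weights.npy", np.append(w, b))
```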


Implementation - Training (cont’d)

Possible problems arising during training


Failure to converge to a set of optimal weight values


Further weight adjustments fail to reduce output error - stuck in a local minimum


Remedied by resetting the learning parameters and
reinitialising the weights



Overtraining


Net fails to generalise, ie, fails to classify less-than-perfect patterns


Mix of good and imperfect patterns for training helps


Implementation - Training (cont’d)


Training results may be affected by the method
of presenting data set to the network.



Adjustments may be made by varying the layer sizes and fine-tuning the learning parameters.



To ensure optimal results, several variations of
a neural network may be trained and each
tested for accuracy


Implementation - Testing and Debugging

Testing can be done by:

1. Observing operational behaviour of the net
2. Analysing actual weights
3. Studying network behaviour under specific conditions


Observing operational behaviour


Network treated as a black box and its response to a series
of test cases is evaluated


Test data


Should contain training cases as well as new cases


Routine, unusual as well as boundary condition cases
should be tried


Implementation - Testing and Debugging (cont’d)

Testing by weight analysis


Weights entering and exiting nodes analysed for
relatively small and large values



In case of significant errors detected in testing,
debugging would involve examining


the training cases for representativeness, accuracy and
adequacy of number


learning algorithm parameters such as the rate at which
weights are adjusted


neural network architecture, node characteristics, and
connectivity


training set-network interface, user-network interface

The Maintenance Phase

Consists of


placing the neural network in an operational
environment with possible integration


periodic performance evaluation, and maintenance



Although often designed as stand-alone systems, some neural network systems are integrated with other information systems using:


Loose coupling - preprocessor, postprocessor, distributed component

Tight coupling or full integration as an embedded component

The Maintenance Phase


Possible ANN operational environments: [diagram not reproduced]

System evaluation



Continual evaluation is necessary to


ensure satisfactory performance in solving dynamic
problems


check for damaged or retrained networks.



Evaluation can be carried out by reusing
original test procedures with current data.


ANN Maintenance

Involves modification necessitated by


Decreasing accuracy


Enhancements


System modification falls into two categories
involving either data or software.


Data modification steps:


Training data is modified or replaced


Network retrained and re-evaluated.

ANN Maintenance (cont’d)


Software changes include changes in


Interfaces


cooperating programs


the structure of the network.



If the network is changed, part of the design and most
of the implementation phase may have to be repeated.



Backup copies should be used for maintenance and
research.


A comparison of ANN and ES

Similarities between ES and ANN


Both aim to create intelligent computer systems by
mimicking human intelligence, although at different
levels




Design process of neither ES nor ANN is automatic


Knowledge extraction in ES is a time and labour
intensive process


ANNs are capable of learning but selection and
preprocessing of data have to be done carefully.


A comparison of ANN and ES
(cont’d)

Differences between ANN and ES


Differ in aspects of design, operation and use



Logic vs. brain


ES simulate the human reasoning process based on
formal logic


ANNs are based on modelling the brain, both in structure
and operation




Sequential vs. parallel


The nature of processing in ES is sequential


ANNs are inherently parallel



A comparison of ANN and ES
(cont’d)

External and static vs. internal and dynamic


Learning is performed external to the ES


ANN itself is responsible for its knowledge acquisition
during the training phase.


Learning is always off-line in ES - knowledge remains static during operation

Learning in ANNs, although mostly off-line, can be on-line



Deductive vs. inductive inferencing


Knowledge in an ES always used in a deductive
reasoning process


An ANN constructs its knowledge base inductively from examples, and uses it to produce decisions through generalisation

A comparison of ANN and ES
(cont’d)

Knowledge representation: explicit vs. implicit


ES store knowledge in explicit form - possible to inspect and modify individual rules

ANN knowledge is stored implicitly in the interconnection weight values



Design issues: simple vs. complex


Technical side of ES development relatively simple
without difficult design choices.


ANN design process often one of trial and error


A comparison of ANN and ES
(cont’d)


User interface: white box vs. black box


ES have explanation capability


Difficulty in interpreting an ANN's knowledge base effectively makes it a black box to the user



State of maturity and recognition: well-established vs. early


ES already well established as a methodology in
commercial applications


ANN recognition and development tools at a
relatively early stage.



Hybrid systems


Neuro-symbolic computing utilises the complementary nature of computing in neural networks (numerical) and expert systems (symbolic).


Neuro-fuzzy systems combine neural networks with fuzzy logic


ANNs can also be combined with genetic algorithm
methodology


Hybrid ES-ANN systems


The strengths of the ES can be utilised to overcome
the weaknesses of an ANN based system and vice
versa.


For example, ANN’s extraction of knowledge from data


ES’s explanation capability



Hybrid ES-ANN systems


Rule extraction by inference justification in an ANN


MACIE, an ANN-based decision support system described in Gallant (1993)


Extracts a single rule that justifies an inference in an
ANN



Inference in an ANN is represented by output of a
single node


This output is based upon incomplete input values fed
from a number of nodes as shown in the diagram
below.


Hybrid ES-ANN systems (cont’d)


A node u_i is defined to be a contributing node to node u_j if w_ij·u_i > 0.

Hybrid ES-ANN systems (cont’d)


In this example, the contributing variables are {u_2, u_3, u_5, u_6}.

The rule produced in this example is:

IF u_6 = Unknown
AND u_2 = TRUE
AND u_3 = FALSE
AND u_5 = TRUE
THEN conclude u_7 = TRUE.
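The contributing-node test and the rule above can be sketched as follows (a simplified illustration of the idea in Gallant (1993); the node values and weights are invented so that the output matches the slide's example, and Unknown inputs are emitted as "Unknown" conditions as in that example):

```python
# Node values: +1 = TRUE, -1 = FALSE, 0 = Unknown (bipolar convention assumed)
values = {"u1": -1, "u2": 1, "u3": -1, "u4": 1, "u5": 1, "u6": 0}

# Invented weights on the connections into the inference node u7
w_u7 = {"u1": 0.5, "u2": 2.0, "u3": -1.5, "u4": -0.5, "u5": 1.0, "u6": 1.0}

# A known node u_i contributes to u7 if its weighted value pushes
# the sum toward the conclusion, i.e. w_u7[u_i] * u_i > 0
contributing = [u for u, v in values.items() if v != 0 and w_u7[u] * v > 0]
unknown = [u for u, v in values.items() if v == 0]

label = {1: "TRUE", -1: "FALSE"}
conditions = [f"{u} = Unknown" for u in unknown] + \
             [f"{u} = {label[values[u]]}" for u in contributing]
rule = "IF " + " AND ".join(conditions) + " THEN conclude u7 = TRUE"
print(rule)
```

With these invented numbers the extracted rule reads: IF u6 = Unknown AND u2 = TRUE AND u3 = FALSE AND u5 = TRUE THEN conclude u7 = TRUE.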


Hybrid ES-ANN systems (cont’d)


One approach to hybrid systems divides a problem into tasks suitable for either ES or ANN


These tasks are then performed by the appropriate
methodology



One example of such a system (Caudill 1991) is an
intelligent system for delivering packages


ES performs the task of producing the best loading
strategy for packages into trucks


ANN works out best route for delivering the packages
efficiently.


Hybrid ES-ANN systems (cont’d)


Hybrid ES-ANN systems with ANNs embedded within expert systems


ANN used to determine which rule to fire, given
the current state of facts.



Another approach to hybrid ES-ANN uses an ANN as a preprocessor


One or more ANNs produce classifications.


Numerical outputs produced by ANN are
interpreted symbolically by an ES as facts


ES applies the facts for deductive reasoning


Case Study

Case: Application of ANNs in bankruptcy prediction
(Coleman et al,
AI Review
, Summer 1991, in Zahedi
1993)


Predicts banks that were certain to fail within a year

Prediction certainty is given to the bank examiners dealing with the bank in question.





Developed by NeuralWare’s Application Development
Services and Support Group (ADSS)


Software used - the NeuralWorks Professional neural network development system.


Uses the standard backpropagation (multilayer perceptron) network.



Case Study (cont’d)


ANN has 11 inputs, each a ratio developed by Peat
Marwick.


Inputs connected to a single hidden layer, which in turn is
connected to a single node in the output layer.


Network outputs a single value denoting whether the bank
would or would not fail within that calendar year



Employed the hyperbolic-tangent transfer function and a proprietary error function created by the ADSS staff.


Trained on a set of 1,000 examples, 900 of which were
viable banks and 100 of which were banks that had actually
gone bankrupt


Training consisted of about 50,000 iterations of the training
set.


Correctly predicted 50% of the banks that were viable, and 99% of the banks that actually failed.

REFERENCES



AI Expert (special issue on ANN), June 1990.


BYTE (special issue on ANN), Aug. 1989.


Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.




Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.


Lippman, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.


Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.


Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.

REFERENCES (cont’d)



Bailey, D., & Thompson, D., "How to Develop Neural Network Applications", AI Expert, June 1990, pp. 38-47.


Caudill & Butler, Naturally Intelligent Systems, MIT Press, 1989, pp. 227-240.


Caudill, M., "Expert Networks", BYTE, October 1991, pp. 109-116.


Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.


Gallant, S., Neural Network Learning and Expert Systems,
MIT Press 1993.


Medsker, L., Hybrid Intelligent Systems, Kluwer Academic Press, Boston, 1995.


Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.