
Training and Testing Neural Networks

Seoul National University

Department of Industrial Engineering

Production Information Systems Laboratory

이상진

Contents


Introduction


When Is the Neural Network Trained?


Controlling the Training Process with Learning
Parameters


Iterative Development Process


Avoiding Over-training


Automating the Process

Introduction (1)


Training a neural network


to perform a specific processing function

1) Which parameters are involved?

2) How are they used to control the training process?

3) How does the management of the training data affect the training process?


Development Process


1) Data preparation


2) Select the neural network model & architecture


3) Train the neural network


determined by the structure and function of the neural network


Application


use the “trained” network

Introduction (2)


Learning Parameters for Neural Networks



Disciplined approach to iterative neural network
development


Introduction (3)


When Is the Neural Network Trained?


When is the network trained? It depends on:


the type of neural network


the function being performed


classification


clustering data


building a model or a time-series forecast


the acceptance criteria


once the network meets the specified accuracy, the connection weights are “locked”


they can no longer be adjusted


When Is the Neural Network Trained?


Classification (1)


Measure of success: the percentage of correct classifications


percentage of incorrect classifications


no classification: unknown / undecided patterns


a threshold limit on the output decides when a pattern is left unclassified (see the sketch below)
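To make the three outcomes concrete, here is a minimal Python sketch (not from the original slides) that scores a batch of network outputs; the 0.5 threshold is a hypothetical value below which a pattern is left unclassified:

```python
import numpy as np

def classify_with_threshold(outputs, labels, threshold=0.5):
    """Score a batch of predictions, treating low-confidence outputs
    as 'no classification' (unknown / undecided).

    outputs:   (n_patterns, n_classes) network output activations
    labels:    (n_patterns,) true class indices
    threshold: hypothetical cutoff below which we refuse to classify
    """
    confidence = outputs.max(axis=1)      # strength of the winning output unit
    predicted = outputs.argmax(axis=1)    # winning class for each pattern
    decided = confidence >= threshold     # patterns the network will classify

    n = len(labels)
    correct = np.sum(decided & (predicted == labels))
    incorrect = np.sum(decided & (predicted != labels))
    return {"correct": correct / n,
            "incorrect": incorrect / n,
            "no_classification": np.sum(~decided) / n}

# Toy usage: three patterns, two classes.
outs = np.array([[0.9, 0.1], [0.4, 0.45], [0.2, 0.8]])
print(classify_with_threshold(outs, np.array([0, 1, 0])))
# {'correct': 0.33..., 'incorrect': 0.33..., 'no_classification': 0.33...}
```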


When Is the Neural Network Trained?



Classification (2)


Confusion matrix: lists the possible output categories and the corresponding percentage of correct and incorrect classifications
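A minimal sketch of building such a matrix in Python; the `actual`/`predicted` toy data is made up for illustration:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Entry [i, j] counts patterns of actual class i classified as class j.
    Row-normalizing turns the counts into per-category percentages."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

actual    = [0, 0, 1, 1, 2, 2, 2]   # made-up labels for illustration
predicted = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(actual, predicted, n_classes=3)
print(cm)                                    # raw counts
print(cm / cm.sum(axis=1, keepdims=True))    # fraction correct on the diagonal
```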

When Is the Neural Network Trained?


Clustering (1)


The output of a clustering network


is open to analysis by the user


The training regimen is determined by:


the number of times the data is presented to the neural network


how fast the learning rate and the neighborhood decay


Adaptive resonance theory (ART) network training


vigilance training parameter


learning rate

When Is the Neural Network Trained?


Clustering (2)


Lock the ART network weights


disadvantage: gives up online learning


ART networks are sensitive to the order of the training data




When Is the Neural Network Trained?


Modeling (1)


Modeling or regression problems


Usual error measure


RMSE (root mean squared error)


Measures of prediction accuracy


average error


MSE (mean squared error)


RMSE (root mean squared error)


The expected behavior


the initial RMSE is very high, but it gradually settles toward a stable minimum
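For reference, a small Python sketch of the two error measures named above (RMSE is just the square root of MSE, so it is expressed in the same units as the output variable):

```python
import numpy as np

def mse(desired, predicted):
    """Mean squared error over all patterns."""
    err = np.asarray(desired) - np.asarray(predicted)
    return np.mean(err ** 2)

def rmse(desired, predicted):
    """Root mean squared error: the square root of MSE."""
    return np.sqrt(mse(desired, predicted))

print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))   # ~0.141
```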

When Is the Neural Network Trained?


Modeling (2)


When Is the Neural Network Trained?


Modeling (3)


When the error does not stabilize


the network falls into a local minimum


the prediction error doesn't fall


it oscillates up and down


Remedies (see the sketch below)


reset (randomize) the weights and start again


adjust the training parameters


change the data representation


change the model architecture
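The first remedy, resetting the weights and starting again, might be automated along these lines; `train_fn` and `init_fn` are hypothetical caller-supplied hooks, and the target error is an assumed value:

```python
def train_with_restarts(train_fn, init_fn, max_restarts=5, target_rmse=0.1):
    """Re-randomize the weights and train again whenever training stalls
    above the target error -- one standard escape from a local minimum.

    train_fn(weights) -> (weights, rmse) and init_fn() -> random weights
    are hypothetical hooks supplied by the caller.
    """
    best_weights, best_err = None, float("inf")
    for _ in range(max_restarts):
        weights, err = train_fn(init_fn())   # fresh random starting point
        if err < best_err:
            best_weights, best_err = weights, err
        if err <= target_rmse:               # converged well enough, stop early
            break
    return best_weights, best_err
```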


When Is the Neural Network Trained?


Forecasting (1)


Forecasting


a prediction problem


RMSE (root mean squared error)


visualize: a time plot of the actual and desired network output


Time-series forecasting (see the sketch below)


long-term trend


influenced by cyclical factors, etc.


random component


variability and uncertainty


neural networks are excellent tools for modeling complex time-series problems


recurrent neural networks: nonlinear dynamic systems


no self-feedback loop & no hidden neurons
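One common way to set up time-series forecasting for a neural network (a sketch of a standard technique, not necessarily the approach described above) is to slice the series into fixed-size input windows with the next value as the desired output:

```python
import numpy as np

def make_windows(series, window=4):
    """Slice a univariate series into (input window, next value) pairs,
    so a network can be trained for one-step-ahead forecasting."""
    X, y = [], []
    for t in range(len(series) - window):
        X.append(series[t:t + window])   # the last `window` observations
        y.append(series[t + window])     # the value to forecast
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 6 * np.pi, 100))   # toy cyclical series
X, y = make_windows(series, window=4)
print(X.shape, y.shape)                           # (96, 4) (96,)
```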


When Is the Neural Network Trained?


Forecasting (2)


Controlling the Training Process with
Learning Parameters (1)


Learning parameters depend on


Type of learning algorithm


Type of neural network



Controlling the Training Process with
Learning Parameters (2)

- Supervised training

[Figure: a training pattern is fed to the neural network and its prediction is compared with the desired output]

1) How the error is computed

2) How big a step we take when adjusting the
connection weights
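As an illustration of both points, here is a minimal delta-rule sketch for a single linear unit: the error is the gap between desired and predicted output, and the learning rate scales the size of the weight step. Real networks compute their errors differently, so treat this only as a schematic:

```python
import numpy as np

def delta_rule_step(weights, x, desired, learning_rate=0.1):
    """One supervised update for a single linear unit.

    The error is the gap between the desired output and the prediction
    for the current pattern; the learning rate scales how big a step
    the weights take in the direction that reduces that error."""
    prediction = weights @ x
    error = desired - prediction
    weights = weights + learning_rate * error * x
    return weights, error

w = np.zeros(2)
w, err = delta_rule_step(w, x=np.array([1.0, 0.5]), desired=1.0)
print(w, err)   # weights nudged toward the target; err = 1.0 initially
```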

Controlling the Training Process with
Learning Parameters (3)

- Supervised training


Learning rate


magnitude of the change when adjusting the connection
weights


based on the current training pattern and desired output


a large rate


giant oscillations


a small rate


takes longer to learn the major features of the problem


helps the network generalize to new patterns

Controlling the Training Process with
Learning Parameters (4)

- Supervised training


Momentum


filters out high-frequency changes in the weight values


prevents oscillation around a set of values


an error


continues to influence the weight updates for a long time


Error tolerance


how close is close enough?


in many cases, 0.1


why it is needed


reaching the exact target outputs would require the net input to be quite large
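A sketch of how both parameters typically enter the update loop; the 0.9 momentum and 0.1 tolerance values are illustrative defaults, not prescriptions from the slides:

```python
def momentum_step(weights, velocity, gradient, lr=0.1, momentum=0.9):
    """Weight update with momentum: the previous weight change keeps
    influencing the current one, filtering out high-frequency
    oscillations in the weight values."""
    velocity = momentum * velocity - lr * gradient
    return weights + velocity, velocity

def within_tolerance(error, tol=0.1):
    """Error tolerance: an output within `tol` of its target counts as
    correct, so the net input never has to be driven to extreme values."""
    return abs(error) <= tol
```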

Controlling the Training Process with
Learning Parameters (5)

- Unsupervised learning


Parameters


selection of the number of outputs


granularity of the segmentation (clustering, segmentation)


learning parameters (once the architecture is set)


neighborhood parameter : Kohonen maps


vigilance parameter : ART


Controlling the Training Process with
Learning Parameters (6)

- Unsupervised learning


Neighborhood


the area around the winning unit, where the non-winning units will also be modified


roughly half the size of the maximum dimension of the output layer


two methods for controlling it (see the sketch below)


square neighborhood function, linear decrease in the learning rate


Gaussian-shaped neighborhood, exponential decay of the learning rate


the number-of-epochs parameter


important in keeping the locality of the topographic maps
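The two control methods might look like this in Python; the initial rate `lr0` and the decay constant in `exponential_lr` are assumed values:

```python
import numpy as np

def square_neighborhood(dist, radius):
    """Square (step) function: every unit within `radius` of the winner
    is updated at full strength; everything outside is untouched."""
    return np.where(dist <= radius, 1.0, 0.0)

def gaussian_neighborhood(dist, radius):
    """Gaussian shape: update strength falls off smoothly with distance
    from the winning unit."""
    return np.exp(-(dist ** 2) / (2 * radius ** 2))

def linear_lr(epoch, n_epochs, lr0=0.5):
    """Linear decrease of the learning rate over the training run."""
    return lr0 * (1 - epoch / n_epochs)

def exponential_lr(epoch, n_epochs, lr0=0.5):
    """Exponential decay of the learning rate (decay constant assumed)."""
    return lr0 * np.exp(-3.0 * epoch / n_epochs)
```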

Controlling the Training Process with
Learning Parameters (7)

- Unsupervised learning


Vigilance


controls how picky the neural network is going to be when clustering data


how discriminating it is when evaluating the differences between two patterns


close-enough


too-high vigilance


can use up all of the output units
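A simplified vigilance test for binary patterns (a sketch of the idea, not a full ART implementation): the higher the vigilance, the more alike two patterns must be to share a cluster:

```python
import numpy as np

def passes_vigilance(pattern, prototype, vigilance=0.8):
    """Simplified ART-style vigilance test for binary patterns: the match
    ratio |pattern AND prototype| / |pattern| must reach the vigilance
    level, or the pattern is rejected and sent to a new output unit."""
    pattern = np.asarray(pattern, dtype=bool)
    prototype = np.asarray(prototype, dtype=bool)
    match = np.sum(pattern & prototype) / max(np.sum(pattern), 1)
    return match >= vigilance

print(passes_vigilance([1, 1, 0, 1], [1, 1, 1, 1], vigilance=0.8))   # True
print(passes_vigilance([1, 1, 0, 1], [0, 1, 1, 0], vigilance=0.8))   # False
```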


Iterative Development Process (1)


Network convergence issues


the error falls quickly and then stays flat / reaches the global minimum


it oscillates up and down / is trapped in a local minimum


Remedies


add some random noise


reset the network weights and start all over again


revisit the design decisions




Iterative Development Process (2)


Iterative Development Process (3)


Model selection


inappropriate neural network model for the function to
perform


add hidden units or another layer of hidden units


a strong temporal or time element embedded in the data


recurrent backpropagation


radial basis function networks


Data representation


a key parameter is not scaled or coded appropriately


a key parameter is missing from the training data


experience

Iterative Development Process (4)


Model architecture


does not converge: the problem is too complex for the architecture


adding some additional hidden units: good


adding many more?


the network just memorizes the training patterns


keeping the hidden layers as thin as possible gets the best results



Avoiding Over-training


Over-training


the same patterns are trained on repeatedly


the network cannot generalize


it cannot handle new patterns


remedy: switch between training and testing data (see the sketch below)
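Switching between training and testing data is usually automated as early stopping. A sketch, with hypothetical `train_epoch`/`eval_error` hooks and an assumed patience of 10 epochs:

```python
def train_with_early_stopping(train_epoch, eval_error,
                              max_epochs=1000, patience=10):
    """Alternate between training and testing: stop once the test-set
    error has not improved for `patience` epochs, i.e. before the
    network starts memorizing the training patterns.

    train_epoch() runs one pass over the training data; eval_error()
    returns the current error on the held-out test data (both are
    hypothetical hooks supplied by the caller)."""
    best_err, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):
        train_epoch()
        err = eval_error()
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:   # test error stopped improving
            break
    return best_err, best_epoch
```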


Automating the Process


Automate the selection of the appropriate number
of hidden layers and hidden units


pruning out nodes and connections (see the sketch after this list)


genetic algorithms


the opposite approach to pruning


the use of intelligent agents
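As a concrete (if simplified) picture of pruning, one common heuristic removes the smallest-magnitude connections; the 20% default fraction below is an arbitrary choice:

```python
import numpy as np

def prune_small_weights(weights, fraction=0.2):
    """Magnitude-based pruning: zero out the smallest `fraction` of the
    connection weights, assuming near-zero connections contribute little
    to the trained network's output."""
    cutoff = np.quantile(np.abs(weights), fraction)
    return np.where(np.abs(weights) < cutoff, 0.0, weights)

W = np.array([[0.8, -0.01], [0.05, -1.2]])
print(prune_small_weights(W, fraction=0.5))
# [[ 0.8  0. ]
#  [ 0.  -1.2]]
```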