Performance Evaluation of BPNN and Genetic Algorithm


Available online: www.vsrdjournals.com

VSRD-IJCSIT (VSRD International Journal of CS & IT), Vol. 2 (7), 2012, 1-5



1,2 Research Scholar, Department of Information Technology, Shri Vaishnav Institute of Technology & Science, Indore, Madhya Pradesh, INDIA.
*Correspondence: ankurshrivastav@live.com

RESEARCH COMMUNICATION



Performance Evaluation of BPNN and Genetic Algorithm

1 Ankur Shrivastava* and 2 Surbhi Hardikar

ABSTRACT

Security threats have always been around. Intrusion detection systems are an important component of network security [1] and one of the important ways to solve network security problems. Detection precision and detection stability are the two key indicators for evaluating intrusion detection systems [2]. The protocol acknowledgement module includes packet filtering and state protocol analysis techniques. Protocol analysis is the technology of analyzing individual data packets; state protocol analysis not only examines a single connection request or response, but also considers all the traffic of a session as a whole. In this paper we propose a new IDS scheme based on packet filtering that will reduce the false positive and false negative ratios and maximize the automatic response to attacks. We also provide a performance study of two classification algorithms, the back propagation neural network and the genetic algorithm, and compare their performance parameters for the purpose of improving the performance of the IDS.

Keywords: IDS, Packet Filter, Protocol Analysis, Back Propagation Neural Network, Genetic Algorithm.

1. INTRODUCTION

Intrusion detection systems (IDS) have had their problems, such as false positives, operational issues in high-speed environments, and the difficulty of detecting unknown threats. In addition, intrusion prevention is still in its infancy. Most of the problems with intrusion detection are caused by improper implementation and misunderstanding of what the technology can and cannot do.

The main purpose of an IDS is to find intrusions among normal audit data, which can be considered a classification problem. An IDS is an effective security technology that can detect, prevent, and possibly react to an attack. It monitors target sources of activity, such as audit and network traffic data in computer or network systems that require security measures, and employs various techniques for providing security services. With the tremendous growth of network-based services and sensitive information on networks, network security is becoming more important than ever before. It has become increasingly important to make our information systems, especially those used for critical functions in the military and commercial sectors, resistant to and tolerant of such attacks. Intrusion detection includes identifying a set of malicious actions by intruders or hackers who try to gain access to important documents, file servers, and other valuable information, and thereby compromise the integrity and confidentiality of a nation's, organization's, or individual's data.

2. PACKET FILTERS [3]

Firewalls make a simple decision: accept or discard a communication. There are two distinct types of firewalls: packet filters and proxy servers. The difference between them lies in what information the firewall uses to make the accept-or-discard decision. The packet filter is the simpler of the two. It makes its decision using network information, by which we mean the information contained in the TCP, UDP, IP, and other protocol headers. The packet filter does not examine the data section of a packet. A proxy server, on the other hand, operates at the application level. For instance, an HTTP proxy server firewall can decide to accept or discard a communication based on the web page content. Packet filters are fast, easy to implement and maintain, and inexpensive. Proxy servers can make more informed decisions, but they are slower, more difficult to implement and maintain, and more expensive, so they are used less often than packet filters. Packet Filter (PF) is the default packet filtering program for OpenBSD [4]. Packets are matched against rules on a last-match basis, as with IP Filter, and can match more than one rule.
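The last-match evaluation described above can be sketched in a few lines of Python. This is a minimal illustration and not PF's actual implementation: the `Packet` and `Rule` types, their field names, and the default action are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    """Header fields only; a packet filter never inspects the payload."""
    protocol: str                    # e.g. "tcp", "udp", "icmp"
    src_ip: str
    dst_ip: str
    dst_port: Optional[int] = None


@dataclass
class Rule:
    """Hypothetical rule: a field left as None matches any value."""
    action: str                      # "accept" or "discard"
    protocol: Optional[str] = None
    src_ip: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        return ((self.protocol is None or self.protocol == pkt.protocol)
                and (self.src_ip is None or self.src_ip == pkt.src_ip)
                and (self.dst_port is None or self.dst_port == pkt.dst_port))


def decide(rules, pkt, default="accept"):
    """Evaluate every rule in order; the last matching rule wins."""
    action = default
    for rule in rules:
        if rule.matches(pkt):
            action = rule.action     # later matches override earlier ones
    return action


rules = [
    Rule(action="discard"),                              # block everything
    Rule(action="accept", protocol="tcp", dst_port=80),  # then allow web traffic
]
web = Packet("tcp", "10.0.0.5", "192.168.1.1", 80)
ssh = Packet("tcp", "10.0.0.5", "192.168.1.1", 22)
print(decide(rules, web))  # accept
print(decide(rules, ssh))  # discard
```

The web packet matches both rules, and the second (last) one decides the outcome, which is why general rules are written first and specific exceptions after them under this scheme.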

Basically, packet filters are of three types:

• Address filters
• Protocol filters
• Data set filters


Address Filters: These are mainly used to capture the desired traffic on the basis of the IP or IPX source or destination address.

Protocol Filters: Protocol filters help to reduce the traffic based on the protocol or the operation of a program, for example when you want to capture all OSPF, ICMP, or DNS traffic. A large number of protocols are available today.

Data Set Filters: These filters are also called advanced filters. Data set filters provide a unique technique that allows us to define interesting traffic based on a specified value at a specified offset within a packet.
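A data set filter of this kind can be sketched as a byte comparison at a given offset. The function name and the sample header below are hypothetical; the example assumes the packet starts directly at the IPv4 header, where byte 9 carries the protocol number.

```python
def data_set_filter(packet: bytes, offset: int, value: bytes) -> bool:
    """Return True if `value` appears at byte `offset` of the raw packet."""
    end = offset + len(value)
    if offset < 0 or end > len(packet):
        return False                 # packet too short to contain the pattern
    return packet[offset:end] == value


# Sample 20-byte IPv4 header (checksum left as zero for brevity).
# Byte 9 holds the protocol number: 1 = ICMP, 6 = TCP, 17 = UDP.
ipv4_header = bytes([0x45, 0x00, 0x00, 0x54, 0x00, 0x00, 0x40, 0x00,
                     0x40, 0x01, 0x00, 0x00,
                     10, 0, 0, 5,        # source address 10.0.0.5
                     192, 168, 1, 1])    # destination address 192.168.1.1

is_icmp = data_set_filter(ipv4_header, offset=9, value=b"\x01")
is_tcp = data_set_filter(ipv4_header, offset=9, value=b"\x06")
print(is_icmp, is_tcp)  # True False
```

The same function also covers address filtering on this header, for example by comparing the four bytes at offset 12 with a source address.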











[Figure: block diagram with the labels Process Monitor, Add Process ID, Read Process ID, Delete Process ID, Windows, Process ID set, and Packet filtering]

Fig. 1: Basic Packet Filter Model

3. BACKGROUND

As attacks are constantly changing, a flexible IDS is required to analyze the enormous amount of network traffic in a manner that is less structured than the traditional rule-based system. In [5] and [6], neural networks were proposed as alternatives to the statistical analysis component of anomaly detection systems. Neural networks and genetic algorithms [7] have been used to analyze program behavior profiles for both anomaly detection and misuse detection in order to identify normal system behavior.

4. BACK PROPAGATION NEURAL NETWORK

Back propagation is a common method for training an artificial neural network so as to minimize the objective function. It is a supervised learning method and a generalization of the delta rule. It requires a dataset of the desired output for many inputs, which makes up the training set. It is most useful for feed-forward networks. Back propagation requires that the activation function used by the artificial neurons be differentiable. To understand how the back propagation learning algorithm works, it can be divided into two phases: propagation and weight update.

4.1. Propagation

Each propagation involves the following steps:

• Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

4.2. Weight Update

The following steps are applied to each weight:

• Multiply its input activation by its output delta to obtain the gradient of the weight.
• Move the weight in the opposite direction of the gradient by subtracting a ratio of the gradient from the weight.

This ratio determines the quality and speed of learning; it is known as the learning rate. The sign of the gradient of a weight shows the direction in which the error increases, which is why the weight is always updated in the opposite direction.

Repeat the two phases (propagation and weight update) until the performance of the network settles at an acceptable level.
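The two weight-update steps can be written out for a single connection as a minimal sketch. The function name and the numeric values are illustrative assumptions, not values from the experiments in this paper.

```python
def update_weight(weight, input_activation, output_delta, learning_rate):
    """One gradient-descent step for a single connection weight."""
    gradient = input_activation * output_delta   # step 1: gradient of the weight
    return weight - learning_rate * gradient     # step 2: move against the gradient


w = 0.50
w_new = update_weight(w, input_activation=0.8, output_delta=0.25, learning_rate=0.1)
print(w_new)   # about 0.48: a positive gradient lowers the weight
```

With a gradient of 0.8 × 0.25 = 0.2 and a learning rate of 0.1, the weight moves by 0.02 against the gradient. A larger learning rate takes bigger steps and learns faster, at the risk of overshooting the minimum.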

5. MODES OF LEARNING

In back propagation there are two modes of learning: on-line learning and batch learning. In on-line learning, each propagation is followed immediately by a weight update. In batch learning, many propagations occur before the weights are updated. Batch learning requires more memory capacity, whereas on-line learning requires more updates.
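The difference between the two modes can be illustrated with a single linear neuron trained by the delta rule. This is a simplified sketch: the one-weight model, the squared-error objective, and the toy data are assumptions chosen to keep the example short.

```python
def gradient(weight, x, target):
    """Gradient of 0.5 * (weight * x - target)**2 with respect to weight."""
    return (weight * x - target) * x


def train_online(weight, examples, lr):
    """On-line mode: update the weight after every example."""
    updates = 0
    for x, target in examples:
        weight -= lr * gradient(weight, x, target)
        updates += 1
    return weight, updates


def train_batch(weight, examples, lr):
    """Batch mode: store all gradients first, then update once."""
    stored = [gradient(weight, x, target) for x, target in examples]  # extra memory
    weight -= lr * sum(stored) / len(stored)
    return weight, 1


examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # target = 2 * x
w_online, n_online = train_online(0.0, examples, lr=0.05)
w_batch, n_batch = train_batch(0.0, examples, lr=0.05)
print(n_online, n_batch)  # 3 1
```

One pass over three examples costs three updates in on-line mode and one in batch mode, while batch mode has to hold the per-example gradients until the pass is finished.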

5.1. Basic Algorithm

The algorithm for a 3-layer network (only one hidden layer) is:

Initialize the weights in the network
Do
    For each example in the training set
        Compute the neural-net output for the example; forward pass
        Get the teacher output for the example
        Calculate the error at the output units
        Compute delta_wh for all weights from hidden layer to output layer; backward pass
        Compute delta_wi for all weights from input layer to hidden layer; backward pass continued
        Update the weights in the network
Until all examples are classified correctly or the stopping criterion is satisfied
Return the network
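The algorithm above can be turned into a small runnable sketch. It is not the implementation evaluated in this paper: the sigmoid activation, the absence of bias terms, the squared-error stopping criterion, the 2-2-1 layer sizes, and the fixed starting weights are all assumptions made to keep the example short and reproducible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_ih, w_ho, x):
    """Forward pass: returns the hidden activations and the single output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_ih]
    output = sigmoid(sum(w * h for w, h in zip(w_ho, hidden)))
    return hidden, output


def train_step(w_ih, w_ho, x, target, lr):
    """Forward pass, backward pass and weight update for one example (in place)."""
    hidden, output = forward(w_ih, w_ho, x)
    error = output - target                      # error at the output unit
    delta_out = error * output * (1.0 - output)
    # Hidden deltas use the hidden-to-output weights before they are changed.
    delta_hidden = [delta_out * w_ho[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j in range(len(w_ho)):                   # hidden -> output weights
        w_ho[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_ih):               # input -> hidden weights
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * error ** 2


def train(w_ih, w_ho, examples, lr=0.5, max_epochs=1000, tolerance=1e-3):
    """Loop over the training set until the summed error is small enough."""
    for _ in range(max_epochs):
        total = sum(train_step(w_ih, w_ho, x, t, lr) for x, t in examples)
        if total < tolerance:                    # stopping criterion
            break
    return w_ih, w_ho


# Fixed small starting weights keep the run reproducible.
w_ih = [[0.1, -0.2], [0.3, 0.4]]     # w_ih[j][i]: input i -> hidden j
w_ho = [0.2, -0.1]                   # w_ho[j]: hidden j -> output
_, before = forward(w_ih, w_ho, [1.0, 0.0])
train_step(w_ih, w_ho, [1.0, 0.0], target=1.0, lr=0.5)
_, after = forward(w_ih, w_ho, [1.0, 0.0])
print(before, after)   # the output moves toward the target of 1.0
```

A single call to `train_step` is one on-line update; `train` repeats it over the whole training set, which corresponds to the Do ... Until loop of the algorithm.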

5.2. Genetic Algorithm

Genetic algorithms were developed by John Holland at the University of Michigan in the early 1970s [8]. They have been theoretically and empirically shown to provide robust search in complex spaces (Goldberg, 1989) [9]. Genetic algorithms are stochastic search methods that mimic natural biological evolution. They operate on a population (a group of individuals) of potential solutions, applying the principle of survival of the fittest given by Charles Darwin [10] to generate improved estimates of a solution. At each generation, a new set of approximations is created by selecting individuals according to their level of fitness and breeding them together using genetic operators inspired by natural genetics. This process leads to the evolution of populations that are better than the previous ones [11].

Reproduction: Reproduction (or selection) is an operator [12] that makes more copies of the better strings in a new population. It is usually the first operator applied to a population. Reproduction selects good strings in a population and forms a mating pool, which is one reason why it is sometimes known as the selection operator. To sustain the generation of a new population, the individuals in the current population must reproduce. To obtain better individuals, the parents should be the fittest individuals of the previous population.

Crossover: A crossover operator [12] is used to recombine two strings to get a better string. In the crossover operation, the recombination process creates different individuals in successive generations by combining material from two individuals of the previous generation. In reproduction, good strings in a population are probabilistically assigned a larger number of copies and a mating pool is formed. It is important to note that no new strings are formed in the reproduction phase. In the crossover operator, new strings are created by exchanging information among strings of the mating pool. The effect of crossover may be detrimental or beneficial. Thus, in order to preserve some of the good strings that are already present in the mating pool, not all strings in the mating pool are used in crossover.
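As an illustration, the sketch below uses single-point crossover, one common way of exchanging information between two strings, together with a crossover probability that leaves some pairs of the mating pool untouched. Both choices, and all function names, are assumptions; the text above does not fix a particular crossover scheme.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    point = rng.randrange(1, len(parent_a))      # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def crossover_pool(pool, crossover_prob, rng):
    """Cross consecutive pairs with probability crossover_prob; copy the rest."""
    next_pool = []
    for a, b in zip(pool[0::2], pool[1::2]):
        if rng.random() < crossover_prob:
            a, b = single_point_crossover(a, b, rng)
        next_pool.extend([a, b])
    if len(pool) % 2:                            # an odd string out is copied as is
        next_pool.append(pool[-1])
    return next_pool


rng = random.Random(0)
c1, c2 = single_point_crossover("11111111", "00000000", rng)
print(c1, c2)
```

Each child starts with one parent's bits and ends with the other's, so no bit is lost or invented: crossover only rearranges material already present in the mating pool.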

Mutation: Mutation [12] adds new information in a random way to the genetic search process and ultimately helps to avoid getting trapped at local optima. It is an operator that introduces diversity into the population whenever the population tends to become homogeneous due to repeated use of the reproduction and crossover operators. Mutation may cause the chromosomes of individuals to differ from those of their parents. In a way, mutation is the process of randomly disturbing genetic information. It operates at the bit level: when the bits are being copied from the current string to the new string, there is a probability that each bit may become mutated. This probability, called the mutation probability pm, is usually quite small.
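Bit-level mutation with probability pm can be sketched as follows. The function name, the sample string, and the value pm = 0.05 are illustrative assumptions.

```python
import random


def mutate(bits, pm, rng):
    """Copy a bit string, flipping each bit independently with probability pm."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < pm else b for b in bits)


rng = random.Random(1)
parent = "1111000011110000"
child = mutate(parent, pm=0.05, rng=rng)
print(parent)
print(child)
```

On average pm × (string length) bits change per copy, so a small pm disturbs only a few bits and keeps most of the parent's information intact.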

Pseudo-code of the standard genetic algorithm:

BEGIN GA
    gen := 0 { generation counter }
    Initialize population P(gen)
    Evaluate population P(gen)
    done := false
    WHILE not done DO
        gen := gen + 1
        Select P(gen) from P(gen-1)
        Crossover P(gen)
        Mutate P(gen)
        Evaluate P(gen)
        done := Optimization criteria met?
    END WHILE
    Output best solution
END GA
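The pseudo-code can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the configuration used in this paper: the fitness function is a toy objective (the number of 1-bits), selection is fitness-proportionate, crossover is single-point, and the parameter values are arbitrary defaults.

```python
import random


def fitness(bits):
    """Toy objective (an assumption for the demo): count the 1-bits."""
    return sum(bits)


def select(population, rng):
    """Fitness-proportionate reproduction: fitter strings get more copies."""
    weights = [fitness(ind) + 1 for ind in population]   # +1 avoids all-zero weights
    chosen = rng.choices(population, weights=weights, k=len(population))
    return [list(ind) for ind in chosen]                 # independent copies


def crossover(pool, pc, rng):
    """Single-point crossover on consecutive pairs, each with probability pc."""
    for i in range(0, len(pool) - 1, 2):
        if rng.random() < pc:
            point = rng.randrange(1, len(pool[i]))
            pool[i][point:], pool[i + 1][point:] = pool[i + 1][point:], pool[i][point:]
    return pool


def mutate(pool, pm, rng):
    """Flip every bit independently with probability pm."""
    for ind in pool:
        for k in range(len(ind)):
            if rng.random() < pm:
                ind[k] = 1 - ind[k]
    return pool


def genetic_algorithm(n_bits=20, pop_size=30, pc=0.8, pm=0.01, max_gen=50, seed=0):
    rng = random.Random(seed)
    gen = 0                                               # generation counter
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)[:]                # evaluate P(0)
    done = fitness(best) == n_bits
    while not done:
        gen += 1
        population = select(population, rng)              # select P(gen) from P(gen-1)
        population = crossover(population, pc, rng)
        population = mutate(population, pm, rng)
        candidate = max(population, key=fitness)          # evaluate P(gen)
        if fitness(candidate) > fitness(best):
            best = candidate[:]
        done = fitness(best) == n_bits or gen >= max_gen  # optimization criteria met?
    return best, gen


best, generations = genetic_algorithm()
print(fitness(best), generations)
```

The loop stops when the best string found so far is all ones or when `max_gen` generations have passed. Keeping the best string seen so far in a separate variable is a design choice of this sketch, so that a good solution is not lost when crossover or mutation disturbs it.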

6. COMPARATIVE ANALYSIS

Accuracy: In science, engineering, industry, and security applications, the accuracy of a measurement system is the degree of closeness of measurements of a quantity to that quantity's actual value. For a classifier, accuracy is the proportion of true results (true positives and true negatives) in the population.





$$\text{Accuracy} = \frac{\text{No. of true positives} + \text{No. of true negatives}}{\text{No. of true positives} + \text{false positives} + \text{false negatives} + \text{true negatives}}$$
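The accuracy formula can be evaluated directly from the four counts. The counts below are hypothetical and are not taken from the experiments reported in this paper.

```python
def accuracy(tp, tn, fp, fn):
    """Proportion of correct decisions among all decisions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


# Hypothetical counts for 100 connection records.
acc = accuracy(tp=40, tn=45, fp=5, fn=10)
print(f"{acc:.2%}")  # 85.00%
```

Here 85 of the 100 records are classified correctly; the remaining 15 are split between false alarms (false positives) and missed intrusions (false negatives).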

Accuracy (in percentage)

Table 1: Accuracy Comparison Between BPN and GA Algorithm (in Percentage)

Data set size    BPN       GA
50               86.26%    72.34%
100              82.31%    74.62%
150              85.44%    70.52%
200              89.73%    72.42%
250              86.38%    69.23%



Graph 1: Accuracy Graph Between BPN and GA Algorithm (in Percentage)

7. MEMORY USED

Memory used is the amount of memory that a particular program needs to build and execute successfully under different conditions.
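The paper does not state how memory use was measured. As one possible approach, Python's standard `tracemalloc` module can report the peak memory allocated while a function runs; the helper name and the stand-in workload below are hypothetical.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated by Python while it ran)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_table(n):
    """Stand-in workload (hypothetical): allocate a list of n floats."""
    return [float(i) for i in range(n)]


table, peak = peak_memory_bytes(build_table, 10_000)
print(f"peak memory: {peak} bytes ({peak / 1024:.1f} KiB)")
```

Such a figure covers only memory allocated through the Python allocator, so it is not directly comparable with numbers obtained from an operating system tool.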

Memory used (in kilobytes)

Table 2: Memory Comparison Between BPN and GA Algorithm (in Bytes)

Data set size    BPN      GA
50               36432    32339
100              36842    33564
150              35263    32873
200              36271    33615
250              36813    33274

Graph 2: Memory Used Graph Between BPN and GA Algorithm (in Bytes)



8. TRAINING TIME

Training time is the time taken to build the data model from the data set. It depends on the number of layers and the number of epochs used in training. The accuracy of the program also depends on the number of layers: the greater the number of layers, the higher the accuracy.
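The paper does not state how training time was measured. A minimal sketch of one way to time a training routine in milliseconds is shown below; the helper name and the dummy workload, whose cost grows with the number of epochs and layers, are hypothetical.

```python
import time


def time_training_ms(train_fn, *args, **kwargs):
    """Return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = train_fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def dummy_training(epochs, layers):
    """Stand-in workload (hypothetical): cost grows with epochs and layers."""
    total = 0.0
    for _epoch in range(epochs):
        for _layer in range(layers):
            total += sum(0.001 * k for k in range(100))
    return total


_, t_small = time_training_ms(dummy_training, epochs=10, layers=2)
_, t_large = time_training_ms(dummy_training, epochs=1000, layers=4)
print(f"{t_small:.3f} ms vs {t_large:.3f} ms")
```

Wall-clock measurements of a few hundred milliseconds vary from run to run, so repeating each measurement and averaging gives a more stable comparison.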

Training time (in milliseconds)

Table 3: Training Time Comparison Between BPN and GA Algorithm (in Milliseconds)

Data set size    BPN (msec)    GA (msec)
50               165           254
100              284           564
150              546           582
200              345           458
250              549           455

Graph 3: Training Time Graph Between BPN and GA Algorithm (in Milliseconds)

9. CONCLUSION AND FUTURE WORK

An important question is how well the performance of NNID (a neural network intrusion detector) scales with the number of users. The rate of detecting anomalies may not change much, as long as the network can learn the user patterns well: any activity that differs from the user's normal behavior would still be detected as an anomaly. NNID is easy to train and inexpensive to run.

Many IDS methods have evolved, but most of them are not as accurate as desired. We want to enhance the protocol acknowledgement [13] technique, making it faster by applying a new machine learning approach that works in real time and prevents attacks on the network. Our main goal is to reduce the false positive and false negative ratios in intrusion detection systems. On the basis of the comparison made in this paper, the back propagation neural network algorithm shows the best overall result: it is more accurate at every data set size (Table 1) and trains faster at four of the five sizes (Table 3), although it uses slightly more memory (Table 2). It is therefore the algorithm implemented in the IDS.


10. REFERENCES

[1] Xuejun Ding, Guiling Zhang, Yongzhen Ke, Baolin Ma and Zhichao Li, "High Efficient Intrusion Detection Methodology with Twin Support Vector Machines", IEEE, 2008.

[2] H. Debar, M. Becker, D. Siboni, "A Neural Network Component for an Intrusion Detection System", IEEE Symposium on Research in Security and Privacy, 1992.

[3] E. Eiben, R. Hinterding, and Z. Michalewicz, "Parameter Control in Evolutionary Algorithms", IEEE Transactions on Evolutionary Computation, 3(2), 1999, pp. 124-141.

[4] Chundong Wang, Quancai Deng, Qing Chang, Hua Zhang and Huaibin Wang, "A New Intrusion Detection System Based on Protocol Acknowledgement", IEEE, 2010.



[5] H. Debar, B. Dorizzi, "An Application of a Recurrent Network to an Intrusion Detection System", International Joint Conference on Neural Networks, 1992, pp. (II) 478-483.

[6] A. Ghosh, A. Schwartzbard, "A Study in Using Neural Networks for Anomaly and Misuse Detection", Proceedings of the 8th USENIX Security Symposium, 1999.

[7] Larson, Edward J. (2004). Evolution: The Remarkable History of a Scientific Theory. Modern Library. ISBN 0-679-64288-9.

[8] Anderson, J., "An Introduction to Neural Networks", Cambridge: MIT Press, 1995.

[9] Holland, J. H., Adaptation in Natural and Artificial Systems. Ann Arbor: The University of Michigan Press, 1975.

[10] Goldberg, D. E., Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, Mass.: Addison-Wesley, 1989.

[11] Mathew, Tom V., Assistant Professor, "Genetic Algorithm", Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai. http://www.civil.iitb.ac.in/tvm/2701_dga/2701-ga-notes/gadoc/gadoc.html

[12] Basic Packet Filtering, by Laura Chappelle, Protocol Analysis Institute, LLC. www.packet-level.com

[13] Nick Holland and Joel Knight, PF: The OpenBSD Packet Filter [online]. http://www.openbsd.org/faq/pf. Last visited 23/05/2005.

