Better Performance of Priori Weight on Fuzzy Neural Network for Classification of Criminal Law


Janeela Theresa 1 and Joseph Raj 2

1 Department of MCA, St. Xavier’s Catholic College of Engineering, Anna University, Chennai, India.
theresajaneela@yahoo.com

2 Department of Computer Sciences, Kamaraj College, Manonmaniam Sundaranar University, Tirunelveli, India.
v.jose08@gmail.com


Abstract

The paper proposes a Fuzzy Neural Network (FNN) classifier using Priori and Random weights in Criminal law, using Lagrange Interpolation and Gaussian membership functions. Since the initial weights of Fuzzy Neural Networks affect the performance of the classification models, the implementation of Priori weights improves the classification results. The classifier model is implemented in C++, tested on real data sets and found to perform well. Experimental results show that the FNN using Priori weights outperforms the FNN using Random weights.

Keywords

Criminal Law, Fuzzy Neural Network (FNN), Gaussian membership function, Priori weights, Lagrange
Interpolation membership function.

1. INTRODUCTION

Criminal law is the body of law that relates to crime. It can be defined as the body of rules that defines conduct that is not allowed because it is held to threaten, harm or endanger the safety and welfare of people, and that sets out the punishment to be imposed on people who do not obey these laws. The court's judgments are based on the provisions of codes and statutes, from which solutions in particular cases are to be derived. Courts thus have to reason extensively on the basis of general rules and principles of the code [1]. Most researchers in artificial intelligence and law should examine how legal theorists view judicial adjudication, and particularly the degree to which judges are constrained by the law in the decisions they make. Naturally this involves a subsidiary analysis of the way in which lawyers predict the outcome of cases, based upon their understanding of the adjudicatory models adopted by a given judge or judges [2]. A judge must decide on legally relevant situations, which can often be described only in indeterminate terms. The decisions must be determinate, however, and are often even expressed as a numerical quantity [3]. The purpose of a fuzzy logic application in legal science is to assist lawyers and judges in forming a judgement on facts given by the computer [4].

When modelling real-world problems one encounters vagueness in some of the parameters, and hence fuzzy numbers are used rather than crisp numbers. It is believed that fuzzy systems theory is better suited to solving complex systems, especially humanistic systems. Fuzzy systems theory was developed based on fuzzy logic and some other related disciplines in such a way that the relationships among system variables are expressed via fuzzy logic [5]. In these systems, the parameters of the fuzzy system are determined by means of the learning algorithms used in neural networks. Researchers have proposed different models and algorithms based on neural networks, fuzzy systems or their combinations.

Feng et al. [6] proposed a training algorithm for hierarchical hybrid fuzzy neural networks based on the Gaussian Membership Function (MF) that adjusts the widths of the Gaussian MFs of the IF parts of fuzzy rules in the lower fuzzy sub-systems, and updates the weights and bias terms of the neural network by a gradient-descent method, which results in fewer parameters. Garg et al. [7] presented a study examining the suitability of fuzzy numbers having a Gaussian Membership Function and its implementation in finding the solution of fuzzy linear and non-linear systems. Giovanna et al. [8] proposed an approach to extract fuzzy rules automatically by learning from data, with the main objective of obtaining a human-readable fuzzy knowledge base. A neuro-fuzzy model and its learning algorithm are developed that work in a parameter space with reduced dimensionality with respect to the space of all the free parameters of the model. Lee et al. [9] deal with the function approximation problem by analyzing the relationship between membership functions and approximation accuracy in FNN systems. Their objective is to find a functional expansion of the Gaussian function and to tune the weights so as to modify its shape. Li et al. [10] presented a theoretical tool or basis for fuzzy logic systems or neural networks, especially when combining both of them. Nyongesa [11] reports on studies to overcome difficulties associated with setting the learning rates of backpropagation neural networks by using fuzzy logic.

A preliminary attempt made by the authors was to develop Analogy Making in Criminal Law with a Neural Network [12]. Unfortunately, neural networks are not able to learn or represent knowledge explicitly, which a fuzzy system can do through fuzzy if-then rules. Thus an integration of neural networks and fuzzy systems can yield systems which are capable of both learning and decision making. The layers of the network learn the rules required for the classification task. The motivation of this paper is to construct a more effective training algorithm with higher classification accuracy using priori weights. The database consists of 399 court decisions. The proposed system of Priori and Random Weights Fuzzy Neural Network classifiers in Criminal Law can serve as an aid for judges about to pass sentence in criminal law and as a learning tool for law students.

This paper is organized as follows: Section 2 describes Criminal law, Section 3 describes the Methodology and Simulation Design for classification and also the network architecture, Section 4 gives the experimental results and discussion on a real data set, and finally Section 5 concludes the paper.

2. CRIMINAL LAW

At court level, one of the most challenging and complex legal activities is judicial reasoning. The activity requires skilful consideration and examination of the credibility of witnesses and evidence, the meanings and effects of precedents, and consistency with legal principles. In most countries, judges and lawyers are required to justify these factors in text documents with no semantics, or in written hard copies.

Researchers have investigated the advantages of computer systems, artificial intelligence and expert systems in case law reasoning, and they found that there are various ways to implement intelligent systems in different legal domains such as case reasoning, case classification, rule interpretation, etc. Most of the researchers investigated legal reasoning in the common law model, which starts by learning or building legal rules from cases [13].

A criminal act or omission is considered an offence against the state, so criminal acts are punishable. The Indian Penal Code consists of 511 sections and starts with its applicability and jurisdiction. There are nine defences which are treated as exceptions to crime, and under these the offender is not punishable.

3. METHODOLOGY AND SIMULATION DESIGN

A fuzzy logic system is unique because it is able to handle numerical data and linguistic knowledge simultaneously. It is a non-linear mapping of an input data vector into a scalar output. The architecture of the FNN system is extended from the multilayer feedforward neural network (see Figure 1).













Figure 1. Fuzzy Neural Network Architecture (input layer X1 ... XL, fuzzification layer, hidden layer, output layer, and a selection layer with a mechanism for generation of priori weights)


The FNN system consists of the components of a conventional fuzzy system, except that the computation of the degree of membership for each rule is performed by the fuzzification layer and the neural network's learning capacity is provided to enhance the system knowledge.

3.1. Generating Degree of Membership

There are several types of Membership Functions (MFs) for representing fuzzy phenomena, including the Gaussian and Lagrange Interpolation Membership Functions.

The degree of membership of the Gaussian MF for the ith pattern with respect to the jth class can be generated as follows:

\mu(x_i) = \exp\left( -\frac{(x_i - c)^2}{2\sigma^2} \right)    (1)

where c and \sigma are the centre and width of the set of patterns for that particular feature across the M classes.
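For illustration, a minimal C++ sketch of equation (1) is given below. The helper names and the way the centre and width are estimated (mean and standard deviation of the training patterns of a feature) are assumptions made for the sketch, not the authors' published implementation.

#include <cmath>
#include <numeric>
#include <vector>

// Degree of membership of input x in a Gaussian MF with centre c and width sigma,
// as in equation (1). Names are illustrative only.
double gaussianMembership(double x, double c, double sigma) {
    double d = (x - c) / sigma;
    return std::exp(-0.5 * d * d);
}

// Estimate the centre and width from the patterns of one feature
// (assumed here to be their mean and standard deviation).
void estimateCentreWidth(const std::vector<double>& patterns, double& c, double& sigma) {
    c = std::accumulate(patterns.begin(), patterns.end(), 0.0) / patterns.size();
    double var = 0.0;
    for (double p : patterns) var += (p - c) * (p - c);
    sigma = std::sqrt(var / patterns.size());
    if (sigma == 0.0) sigma = 1e-6;   // guard against constant features
}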

The method of Lagrange interpolation is a curve-fitting method in which the constructed MF is assumed to be expressed by a suitable polynomial form. Given the function f and n+1 distinct points x_0, x_1, ..., x_n, let P_n denote the polynomial of degree at most n that interpolates f at the given points. In other words, P_n is such that

P_n(x) = \sum_{i=0}^{n} L_i(x) \, f(x_i)    (2)

where L_i are the Lagrange coefficient polynomials given by

L_i(x) = \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}
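A short C++ sketch of the Lagrange form in equation (2) follows; the node points and function values would come from the sample points used to construct the MF, and the function name is illustrative only.

#include <cstddef>
#include <vector>

// Evaluate the Lagrange interpolating polynomial P_n(x) = sum_i L_i(x) * f(x_i)
// through the n+1 points (xs[i], fs[i]), as in equation (2).
double lagrangeInterpolate(const std::vector<double>& xs,
                           const std::vector<double>& fs,
                           double x) {
    double result = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        double Li = 1.0;   // Lagrange coefficient polynomial L_i(x)
        for (std::size_t j = 0; j < xs.size(); ++j) {
            if (j != i) Li *= (x - xs[j]) / (xs[i] - xs[j]);
        }
        result += Li * fs[i];
    }
    return result;
}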




3.2. Fuzzy Neural Network Algorithm

Set the initial weights to be priori weights and small random weights for the Input-Hidden layer and the Hidden-Output layer. Let the learning rate (η) = 0.9, the momentum factor (α) = 0.00001 and L = number of input features. Choose an acceptable error value. The basic equations of this algorithm are outlined as follows.

The activation function of each node with its inputs and outputs is discussed next, layer by layer. The training phase uses the concept of backpropagation to minimize the error function E defined in (8).


Layer 1: Each node in layer 1 represents an input linguistic variable of the network and is used as a buffer to transmit the input to the next layer, that is, to the MF nodes of its linguistic values. Thus the number of nodes in this layer is equal to L. Let x_i be the input to the ith node in layer 1; then the output of the node will be

o_i^{(1)} = x_i, \quad i = 1, 2, \ldots, L    (3)



Layer 2: This is the fuzzification layer. Each node in this layer corresponds to one linguistic label of one of the input variables from layer 1. In other words, the output link represents the membership value that specifies the degree to which an input value belongs to a fuzzy set [14]. The output of a layer 2 node represents the MF grade of the input with respect to a linguistic value. Each neuron performs a Gaussian or Lagrange Interpolation MF. The output of a node in layer 2 is computed by

o_i^{(2)} = \mu_i\big(o_i^{(1)}\big)    (4)



Layer 3: This level performs the defuzzification. The activation and output values for the neurons of the hidden layer can be written as

net_j = \sum_{i=1}^{L} w_{ij} \, o_i^{(2)}    (5)

o_j^{(3)} = f(net_j)    (6)

where f(·) is the activation function of the hidden neurons and w_{ij} is the connection weight between the ith node of the second level and the jth neuron of the third level.



Layer 4: This is the output layer and each node in this layer represents a class. Thus, if there are M classes then there will be M nodes in layer 4. The nodes in layers 3 and 4 are fully connected. The output of node k in layer 4 is computed as

o_k^{(4)} = f\left( \sum_{j} w_{jk} \, o_j^{(3)} \right)    (7)

where w_{jk} is the connection weight between the jth node of the third level and the kth neuron of the fourth level.
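To make the layer-by-layer computation concrete, the following C++ sketch runs one forward pass through equations (3)-(7). It assumes a sigmoid activation and one Gaussian MF node per input feature; the class and member names are illustrative and are not taken from the authors' implementation.

#include <cmath>
#include <cstddef>
#include <vector>

// Forward-pass sketch. Sizes: L inputs and fuzzification nodes, H hidden neurons, M classes.
struct FnnSketch {
    std::vector<double> centre, width;              // Gaussian MF parameters (layer 2)
    std::vector<std::vector<double>> w23;           // weights layer 2 -> layer 3  (H x L)
    std::vector<std::vector<double>> w34;           // weights layer 3 -> layer 4  (M x H)

    static double sigmoid(double a) { return 1.0 / (1.0 + std::exp(-a)); }

    std::vector<double> forward(const std::vector<double>& x,
                                std::vector<double>& o2,
                                std::vector<double>& o3) const {
        // Layer 1 (eq. 3): buffer, o1 = x.
        // Layer 2 (eq. 4): fuzzification with the Gaussian MF.
        o2.resize(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            double d = (x[i] - centre[i]) / width[i];
            o2[i] = std::exp(-0.5 * d * d);
        }
        // Layer 3 (eqs. 5-6): hidden neurons.
        o3.resize(w23.size());
        for (std::size_t j = 0; j < w23.size(); ++j) {
            double net = 0.0;
            for (std::size_t i = 0; i < o2.size(); ++i) net += w23[j][i] * o2[i];
            o3[j] = sigmoid(net);
        }
        // Layer 4 (eq. 7): one output node per class.
        std::vector<double> o4(w34.size());
        for (std::size_t k = 0; k < w34.size(); ++k) {
            double net = 0.0;
            for (std::size_t j = 0; j < o3.size(); ++j) net += w34[k][j] * o3[j];
            o4[k] = sigmoid(net);
        }
        return o4;
    }
};

Training would alternate this forward pass with the backpropagation update outlined in equations (8)-(14).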


The error E in the network can be calculated as

E = \frac{1}{2} \sum_{k=1}^{M} \big(t_k - o_k^{(4)}\big)^2    (8)

where (t_k - o_k^{(4)}) is the difference between the target output value and the actual output value of the output layer. To minimize the error signal, the coupling strength is updated by an amount proportional to the partial derivative of E with respect to w_{jk} (the weight between hidden and output layer units):

\frac{\partial E}{\partial w_{jk}} = -\,\delta_k \, o_j^{(3)}    (9)

where

\delta_k = \big(t_k - o_k^{(4)}\big) \, f'(net_k)    (10)

Similarly the partial derivative of E with respect to w_{ij} (the weight between layer 2 and layer 3 units) can be derived as follows:

\frac{\partial E}{\partial w_{ij}} = -\,\delta_j \, o_i^{(2)}    (11)

where

\delta_j = f'(net_j) \sum_{k} \delta_k \, w_{jk}    (12)

This is the error signal propagated back to the hidden layer.

The change in weights in the network can be calculated as

\Delta w_{jk}(n) = \eta \, \delta_k \, o_j^{(3)} + \alpha \, \Delta w_{jk}(n-1)    (13)

\Delta w_{ij}(n) = \eta \, \delta_j \, o_i^{(2)} + \alpha \, \Delta w_{ij}(n-1)    (14)

α and η are learning coefficients, which are usually chosen by trial and error. The network learns the weights of the links connecting layers 3 and 4 and also the parameters associated with the nodes in layer 2, which perform the feature selection. The initial values of the weights can be selected so that no feature enters the network in the beginning, and the learning algorithm will pass through the features that are important, i.e., the features that can reduce the error rapidly.
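The weight update of equations (8)-(14) can be sketched in C++ as below. It assumes the sigmoid activation used in the forward-pass sketch above (so f'(net) = o(1 - o)); the structure and names mirror that sketch and are illustrative, not the authors' code.

#include <cstddef>
#include <vector>

// One backpropagation update with momentum, following equations (8)-(14).
// eta = learning rate, alpha = momentum factor.
void updateWeights(std::vector<std::vector<double>>& w23,
                   std::vector<std::vector<double>>& w34,
                   std::vector<std::vector<double>>& dw23,   // previous deltas (momentum)
                   std::vector<std::vector<double>>& dw34,
                   const std::vector<double>& o2,
                   const std::vector<double>& o3,
                   const std::vector<double>& o4,
                   const std::vector<double>& target,
                   double eta, double alpha) {
    // delta_k for the output layer (eq. 10), with f'(net) = o*(1-o) for the sigmoid.
    std::vector<double> deltaK(o4.size());
    for (std::size_t k = 0; k < o4.size(); ++k)
        deltaK[k] = (target[k] - o4[k]) * o4[k] * (1.0 - o4[k]);

    // delta_j for the hidden layer (eq. 12).
    std::vector<double> deltaJ(o3.size(), 0.0);
    for (std::size_t j = 0; j < o3.size(); ++j) {
        double sum = 0.0;
        for (std::size_t k = 0; k < o4.size(); ++k) sum += deltaK[k] * w34[k][j];
        deltaJ[j] = o3[j] * (1.0 - o3[j]) * sum;
    }

    // Weight changes with momentum (eqs. 13-14).
    for (std::size_t k = 0; k < o4.size(); ++k)
        for (std::size_t j = 0; j < o3.size(); ++j) {
            dw34[k][j] = eta * deltaK[k] * o3[j] + alpha * dw34[k][j];
            w34[k][j] += dw34[k][j];
        }
    for (std::size_t j = 0; j < o3.size(); ++j)
        for (std::size_t i = 0; i < o2.size(); ++i) {
            dw23[j][i] = eta * deltaJ[j] * o2[i] + alpha * dw23[j][i];
            w23[j][i] += dw23[j][i];
        }
}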

3.3. Priori Knowledge

One way to set the weights explicitly is to use a priori knowledge. The selection layer in Figure 1 generates priori weights based on the impact of the input features on the classification in criminal law. A priori knowledge in the design of neural networks helps to solve some basic difficulties encountered in practice: inefficient training and poor generalization. The main advantages of using a priori knowledge in the design of neural networks are a smaller need for learning data, improved training efficiency and better generalization. Since any information that is built directly into a network does not have to be learned any more, less learning data is needed and generalization is improved. Network training becomes more efficient because smaller networks can be used to learn the remaining learning data. A priori knowledge in the design of networks generates consistent benefits for a wide range of different applications. After the initial design, the FNN can learn from training data sets [15]. This allows an optimal integration of both a priori knowledge and learning data into FNNs.
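Based on the description above and the ranges given in Table 1, the priori weight generation can be sketched in C++ as follows: weights emerging from the most important features are drawn from [0.5, 1] and weights from the least important features from [-0.5, 0]. How importance is decided (here a simple boolean flag per feature) is an assumption standing in for the domain knowledge supplied by the selection layer.

#include <cstddef>
#include <random>
#include <vector>

// Generate priori weights for the input-hidden connections of one hidden neuron.
// important[i] marks whether input feature i has a strong impact on the class.
std::vector<double> generatePrioriWeights(const std::vector<bool>& important,
                                          std::mt19937& rng) {
    std::uniform_real_distribution<double> high(0.5, 1.0);   // most important features
    std::uniform_real_distribution<double> low(-0.5, 0.0);   // least important features
    std::vector<double> w(important.size());
    for (std::size_t i = 0; i < important.size(); ++i)
        w[i] = important[i] ? high(rng) : low(rng);
    return w;
}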

4. EXPERIMENTAL RESULTS AND DISCUSSION

The methodology is tested on real-world data sets with 300 samples for training and 99 samples for testing to find the optimum performance of the Artificial Neural Network (ANN) and of the FNN with Lagrange Interpolation and Gaussian MFs in criminal law. The classifier model is tested for 3 to 30 neurons in the hidden layer using priori weights and random weights. The average and optimum numbers of neurons in the hidden layer are reported in terms of number of iterations, training time and classification accuracy. The parameters used in all the models are listed in Table 1, and the performance of the networks is evaluated using both the training and testing data sets. In this application, the identification of precedents in the area of criminal law is examined. It is known that all features that characterize a data point may not have the same impact with regard to its classification. Priori weights are assigned based on the impact of the features on the classification in criminal law. The data set comprises 27 input features and 3 output features. The input features considered are: committing murder, imprisonment for life for committing murder, inhuman acts, the accused acted in a cruel manner, death due to homicide, use of a deadly weapon with the intention of causing grievous injuries, trustworthy evidence by an eye witness, abetment of suicide of a child or an insane person, kidnapping for ransom or murder, rape, death of a woman caused by burns or bodily injury within seven years of her marriage, being subjected to cruelty or harassment by the husband for dowry before the death, the injured person's dying declaration, circumstantial evidence, medical evidence, membership of an unlawful assembly, number of persons killed, strangulation, grave and sudden provocation, house trespass and theft, causing miscarriage, death of a quick unborn child, and causing miscarriage without the woman's consent. The punishment for the offence is given as the output. In the experimental evaluation of the algorithm on the set of real-world data of murder cases, the total number of iterations, the total running time in seconds and the classification accuracy are taken as the performance metrics.
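As a simple illustration of the test-set accuracy metric, the following C++ fragment counts correct classifications by taking the output node with the largest activation as the predicted class; the data structures are placeholders and not the authors' evaluation code.

#include <algorithm>
#include <cstddef>
#include <vector>

// Classification accuracy (%) over a test set: a sample is correct when the index of the
// maximal output activation matches its labelled class.
double testSetAccuracy(const std::vector<std::vector<double>>& outputs,  // network outputs
                       const std::vector<int>& labels) {                 // true class indices
    if (outputs.empty()) return 0.0;
    int correct = 0;
    for (std::size_t n = 0; n < outputs.size(); ++n) {
        int predicted = static_cast<int>(
            std::max_element(outputs[n].begin(), outputs[n].end()) - outputs[n].begin());
        if (predicted == labels[n]) ++correct;
    }
    return 100.0 * correct / outputs.size();
}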

Table 1. Parameters used for FNN

Parameter | Number/Value
Input nodes | 27
Output nodes | 3
Learning rate (η) | 0.9
Momentum Term (α) | 0.00001
Network Error Tolerance (τ) | 0.01
Initial weights and bias term values for Random weights, and initial weights between hidden-output layer | Randomly generated values between -1 and 1
Initial weights between input-hidden layer for Priori weights | Values generated between 0.5 and 1 for weights emerging from the most important features, and values generated between 0 and -0.5 for weights emerging from the least important features

Table 2. Results of classification model.

Priori Weights, optimum number of neurons of hidden layer
Classifier Model | No. of iterations | Training time (Sec) | Test set accuracy rate (%)
ANN | 168 | 1.208791 | 97.98
FNN with Gaussian MF | 189 | 0.879121 | 98.99
FNN with Lagrange interpolation MF | 147 | 0.879121 | 98.99

Priori Weights, average number of neurons of hidden layer
Classifier Model | No. of iterations | Training time (Sec) | Test set accuracy rate (%)
ANN | 143.21 | 1.226452 | 95.24
FNN with Gaussian MF | 131.79 | 1.642465 | 95.28
FNN with Lagrange interpolation MF | 140.96 | 1.774522 | 95.71

Random Weights, optimum number of neurons of hidden layer
Classifier Model | No. of iterations | Training time (Sec) | Test set accuracy rate (%)
ANN | 289 | 1.208791 | 96.97
FNN with Gaussian MF | 164 | 0.934066 | 97.98
FNN with Lagrange interpolation MF | 152 | 0.894176 | 97.98

Random Weights, average number of neurons of hidden layer
Classifier Model | No. of iterations | Training time (Sec) | Test set accuracy rate (%)
ANN | 522.71 | 4.515306 | 93.90
FNN with Gaussian MF | 161.30 | 2.032967 | 95.21
FNN with Lagrange interpolation MF | 154.19 | 1.929739 | 95.70

Table 2 shows the results of the classifier in criminal law for all possible numbers of hidden neurons for ANN and FNN using Gaussian and Lagrange Interpolation MFs with Priori weights and Random weights.

Figure 2. Comparison chart for optimum training time of criminal law between ANN and FNN

Fig. 2 and Fig. 3 show the comparative performance of the classifiers in criminal law in terms of training time and accuracy rate on the test data set, respectively. As shown in Fig. 2 and Fig. 3, the experimental results clearly reveal that the ANN and FNN using Priori weights outperform the ANN and FNN using Random weights. As seen in Fig. 3, the FNN with Priori weights classifies with an accuracy of 98.99%.



Figure 3. Comparison chart for optimum classification accuracy of criminal law between ANN and FNN


5. CONCLUSION

The authors have proposed a Fuzzy Neural Network (FNN) classifier using Priori and Random weights in Criminal law, using Lagrange Interpolation and Gaussian membership functions. The performance for the optimum and average numbers of neurons in the hidden layer is reported. The proposed system was trained and tested on a sufficient number of court decisions. The results show that the networks using priori weights have lower training time and higher classification accuracy on the test data set than the networks using random weights.


REFERENCES


[1] Sotarat Thammaboosadee and Udom Silparcha, A GUI Prototype for the Framework of Criminal Judicial Reasoning System, Journal of International Commercial Law and Technology, Vol. 4, Issue 3, pp. 224-230, 2009.

[2] Dan Hunter, Out of their minds: Legal theory in neural networks, Artificial Intelligence and Law, 7, pp. 129-151, 1999.

[3] Lothar Philipps and Giovanni Sartor, Introduction: From legal theories to neural networks and fuzzy reasoning, Artificial Intelligence and Law, 7, pp. 115-128, 1999.

[4] Jurgen Hollatz, A Fuzzy Advisory System for Estimating the Waiting Period after Traffic Accidents, IEEE Fuzzy Systems, 3, pp. 1475-1479, 1997.

[5] R. Saneifard, Some properties of neural networks in designing fuzzy systems, Neural Computing & Applications, DOI 10.1007/s00521-011-0777-1, 2011.

[6] Shuang Feng, Hongxing Li and Dan Hu, A new training algorithm for HHFNN based on Gaussian membership function for approximations, Neurocomputing, Vol. 72, pp. 1631-1638, 2009.

[7] Anjeli Garg and S. R. Singh, Solving Fuzzy System of Equations Using Gaussian Membership Function, International Journal of Computational Cognition, Vol. 7, No. 4, pp. 25-32, December 2009.

[8] Giovanna Castellano, Anna Maria Fanelli and Corrado Mencar, A neuro-fuzzy network to generate human-understandable knowledge from data, Cognitive Systems Research, Vol. 3, pp. 125-144, 2002.

[9] Ching-Hung Lee and Ching-Cheng Teng, Fine Tuning of Membership Functions for Fuzzy Neural Systems, Asian Journal of Control, Vol. 3, No. 3, pp. 216-225, 2001.

[10] Hong-Xing Li and C. L. Philip Chen, The Equivalence Between Fuzzy Logic Systems and Feedforward Neural Networks, IEEE Transactions on Neural Networks, Vol. 11, No. 3, pp. 356-365, 2000.

[11] H. O. Nyongesa, Fuzzy Assisted Learning in Back Propagation Neural Networks, Neural Computing & Applications, Springer-Verlag London Limited, pp. 424-428, 1997.

[12] M. M. Janeela Theresa and V. Joseph Raj, Analogy Making in Criminal Law with Neural Network, Proceedings of ICETECT 2011, IEEE Xplore, pp. 772-775, 2011.

[13] Sotarat Thammaboosadee and Udom Silparcha, A Framework for Criminal Judicial Reasoning System using Data Mining Techniques, FUZZ-IEEE'97, pp. 1475-1479.

[14] M. Tabesh and M. Dini, Fuzzy and Neuro-Fuzzy Models for Short-Term Water Demand Forecasting in Tehran, Iranian Journal of Science & Technology, Transaction B, Engineering, Vol. 33, No. B1, pp. 61-77.

[15] Rico A. Cozzio-Bueler, The Design of Neural Networks using A Priori Knowledge, Ph.D. Dissertation, Swiss Federal Institute of Technology, Zurich, 1995.

Authors


M. M. Janeela Theresa received her MPhil degree from Mother Teresa Women’s University, Kodaikanal, India and her Postgraduate degree from Bharathidasan University, Tiruchirapalli, India. She is presently working as Assistant Professor at St. Xavier’s Catholic College of Engineering, Anna University, Chunkankadai, India. She has a vast teaching experience of about 18 years and research experience of about 12 years. Her research interests include neural networks, data mining, expert systems and soft computing.






V. Joseph Raj received his PhD degree from Manonmaniam Sundaranar University, Tirunelveli, India and his Postgraduate degree from Anna University, Chennai, India. He is presently working as Professor and Head of the Department of Computer Science, Kamaraj College, Manonmaniam Sundaranar University, Thoothukudi, India. He has also worked as an Associate Professor in Computer Engineering at the European University of Lefke, North Cyprus. He is guiding PhD scholars of various Indian universities. He has a vast teaching experience of about 25 years and research experience of about 18 years. His research interests include neural networks, biometrics, network security and operations research.