Better Performance of Priori Weight on Fuzzy Neural
Network for Classification of Criminal Law
Janeela Theresa¹ and Joseph Raj²
¹ Department of MCA, St. Xavier’s Catholic College of Engineering, Anna University, Chennai, India.
theresajaneela@yahoo.com
² Department of Computer Sciences, Kamaraj College, Manonmaniam Sundaranar University, Tirunelveli, India
v.jose08@gmail.com
Abstract
The paper proposes a Fuzzy Neural Network (FNN) classifier for Criminal law that uses Priori and Random weights with Lagrange Interpolation and Gaussian membership functions. Since the initial weights of a Fuzzy Neural Network affect the performance of the classification model, the use of Priori weights improves the classification results. The classifier model is implemented in C++ and tested on real data sets, where it is found to perform well. Experimental results show that the FNN using Priori weights outperforms the FNN using Random weights.
Keywords
Criminal Law, Fuzzy Neural Network (FNN), Gaussian membership function, Priori weights, Lagrange
Interpolation membership function.
1. INTRODUCTION
Criminal law is the body of law that relates to crime. It can be defined as the body of rules that defines conduct that is not allowed because it is held to threaten, harm or endanger the safety and welfare of people, and that sets out the punishment to be imposed on people who do not obey these laws. The court’s judgments are based on the provisions of codes and statutes, from which solutions in particular cases are to be derived. Courts thus have to reason extensively on the basis of general rules and principles of the code [1]. Most researchers in artificial intelligence and law should examine how legal theorists view judicial adjudication, and particularly the degree to which judges are constrained by the law in the decisions they make. Naturally this involves a subsidiary analysis of the way in which lawyers predict the outcome of cases, based upon their understanding of the adjudicatory models adopted by a given judge or judges [2]. A judge must decide on legally relevant situations, which can often be described only in indeterminate terms. The decisions must be determinate, however, and are often even expressed as a numerical quantity [3]. The purpose of a fuzzy logic application in legal science is to assist lawyers and judges in forming a judgement on facts given by the computer [4].
When modeling a real-world problem, one encounters vagueness in some of the parameters, and hence fuzzy numbers are used rather than crisp numbers. Fuzzy systems theory is believed to be better suited to complex systems, especially humanistic systems. Fuzzy systems theory was developed based on fuzzy logic and some other related disciplines in such a way that the relationships among system variables are expressed via fuzzy logic [5]. In fuzzy neural systems, the parameters of the fuzzy system are determined by means of the learning algorithms used in neural networks. Researchers have proposed different models and algorithms based on neural networks, fuzzy systems or their combinations.
Feng et al. [6] proposed a training algorithm for hierarchical hybrid fuzzy-neural networks based on the Gaussian Membership Function (MF) that adjusts the widths of the Gaussian MFs of the IF parts of the fuzzy rules in the lower fuzzy sub-systems, and updates the weights and bias terms of the neural network by the gradient-descent method, which results in fewer parameters. Garg et al. [7] examined the suitability of fuzzy numbers having a Gaussian Membership Function and their use in finding the solution of fuzzy linear and non-linear systems. Giovanna et al. [8] proposed an approach to extract fuzzy rules automatically by learning from data, with the main objective of obtaining a human-readable fuzzy knowledge base. A neuro-fuzzy model and its learning algorithm are developed that work in a parameter space with reduced dimensionality with respect to the space of all the free parameters of the model. Lee et al. [9] deal with the function approximation problem by analyzing the relationship between membership functions and approximation accuracy in FNN systems. Their objective is to find a functional expansion of the Gaussian function and to tune the weights so as to modify its shape. Li et al. [10] presented a theoretical tool or basis for fuzzy logic systems or neural networks, especially when combining both of them. Nyongesa [11] reports on studies to overcome difficulties associated with setting the learning rates of backpropagation neural networks by using fuzzy logic.
A preliminary attempt made by the authors was to develop Analogy Making in Criminal Law with Neural Network [12]. Unfortunately, neural networks are able to learn but cannot represent knowledge explicitly, which a fuzzy system can do through fuzzy if-then rules. Thus an integration of neural networks and fuzzy systems can yield systems that are capable of both learning and decision making. The layers of the network learn the rules required for the classification task. The motivation of this paper is to construct a more effective training algorithm with higher classification accuracy using priori weights. The database consists of 399 court decisions. The proposed Priori and Random Weights Fuzzy Neural Network classifier in Criminal Law can serve as an aid for judges about to pass sentence in criminal law and as a learning tool for law students.
This paper is organized as follows: Section 2 describes Criminal law. Section 3 describes the Methodology and Simulation Design for classification and also the network architecture. Section 4 gives the Experimental results and discussion on a real data set. Finally, the paper concludes with Section 5.
2. CRIMINAL LAW
At court level, one of the most challenging and complex legal activities is judicial reasoning. The activity requires skillful consideration and examination of the credibility of witnesses and evidence, the meanings and effects of precedents, and consistency with legal principles. In most countries, judges and lawyers are required to justify these factors in text documents with no semantics, or in written hard copies.
Researchers have investigated the advantages of computer systems, artificial intelligence and expert systems in case law reasoning, and have found that there are various ways to implement an intelligent system in different legal domains such as case reasoning, case classification, rule interpretation, etc. Most of the researchers investigated legal reasoning in the common law model, which starts by learning or building legal rules from cases [13].
A criminal act or omission is considered an offence against the state, so criminal acts are punishable. The Indian Penal Code consists of 511 sections and starts with its applicability and jurisdiction. There are 9 defenses which are treated as exceptions to crime, and under which the offender is not punishable.
3. METHODOLOGY AND SIMULATION DESIGN
A fuzzy logic system is unique because it is able to handle numerical data and linguistic knowledge simultaneously. It is a non-linear mapping of an input data vector into a scalar output. The architecture of the FNN system is extended from the multilayer feedforward neural network (see Figure 1).
Figure 1. Fuzzy Neural Network Architecture
[Figure: inputs X1 ... XL enter the Input Layer, followed by the Fuzzification Layer, the Hidden Layer and the Output Layer; a Selection layer provides the mechanism for generation of priori weights.]
The FNN system consists of the components of a conventional fuzzy system, except that the computation of the degree of membership for each rule is performed by the Fuzzification layer and the neural network’s learning capacity is provided to enhance the system knowledge.
3.1. Generating Degree of Membership
There are several types of Membership Functions (MF) for representing fuzzy phenomena, including the Gaussian and Lagrange Interpolation Membership Functions.
The degree of membership of the Gaussian MF for the $i$th pattern with the $j$th class can be generated as follows:

$$\mu_{ij}(x_i) = \exp\left(-\frac{(x_i - c_{ij})^2}{2\sigma_{ij}^2}\right) \qquad (1)$$

where $c_{ij}$ and $\sigma_{ij}$ are the centre and width of the set of patterns for that particular feature across $M$ classes.
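As an illustration, a Gaussian membership grade can be computed as in the following Python sketch. It assumes the common form exp(-(x - c)^2 / (2 sigma^2)); the function names and the per-class lists `centres` and `widths` are hypothetical and are not taken from the authors' C++ implementation.

```python
import math


def gaussian_mf(x, centre, width):
    """Membership grade of x in a Gaussian fuzzy set (assumed standard form)."""
    if width <= 0:
        raise ValueError("width must be positive")
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def class_memberships(x, centres, widths):
    """Grades of one feature value against the Gaussian set of each class."""
    return [gaussian_mf(x, c, s) for c, s in zip(centres, widths)]
```

A value equal to the centre receives grade 1, and the grade decays towards 0 as the value moves away from the centre at a rate set by the width.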
The method of Lagrange interpolation is a curve-fitting method in which the constructed MF is assumed to be expressed by a suitable polynomial form. Given the function $f$ and $n+1$ distinct points $x_0, x_1, \ldots, x_n$ in $[a, b]$, let $P_n$ denote the polynomial of degree at most $n$ that interpolates $f$ at the given points. In other words, $P_n$ is such that

$$P_n(x) = \sum_{i=0}^{n} f(x_i)\, L_i(x) \qquad (2)$$

where $L_i$ are the Lagrange coefficient polynomials given by

$$L_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^{n} \frac{x - x_j}{x_i - x_j}$$
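The interpolating polynomial can be evaluated directly from its definition, as in the Python sketch below. The helper names are hypothetical, and clipping the result to [0, 1] in `lagrange_mf` is an added assumption (a polynomial can overshoot the valid range of a membership grade); the paper does not say how the authors handle this.

```python
def lagrange_basis(i, x, nodes):
    """Lagrange coefficient polynomial L_i evaluated at x."""
    result = 1.0
    for j, xj in enumerate(nodes):
        if j != i:
            result *= (x - xj) / (nodes[i] - xj)
    return result


def lagrange_interpolate(x, nodes, values):
    """Polynomial of degree at most n through the points (nodes[i], values[i])."""
    if len(nodes) != len(values):
        raise ValueError("nodes and values must have the same length")
    if len(set(nodes)) != len(nodes):
        raise ValueError("interpolation nodes must be distinct")
    return sum(values[i] * lagrange_basis(i, x, nodes) for i in range(len(nodes)))


def lagrange_mf(x, nodes, grades):
    """Membership grade from the interpolant, clipped to [0, 1] (assumption)."""
    return min(1.0, max(0.0, lagrange_interpolate(x, nodes, grades)))
```

For example, the three points (0, 0), (0.5, 1) and (1, 0) give the parabola 4x(1 - x), so `lagrange_mf(0.25, [0.0, 0.5, 1.0], [0.0, 1.0, 0.0])` evaluates to 0.75.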
3.2. Fuzzy Neural Network Algorithm
Set the initial weights of the Input-Hidden layer and the Hidden-Output layer to priori weights or to small random weights. Let learning rate (η) = 0.9, momentum factor (α) = 0.00001 and $L$ = number of input features. Choose an acceptable error value $\tau$. The basic equations of this algorithm are outlined as follows.
The activation function of each node, with its inputs and outputs, is discussed next, layer by layer. The training phase uses the concept of backpropagation to minimize the error function.
Layer 1: Each node in layer 1 represents an input linguistic variable of the network and is used as a buffer to transmit the input to the next layer, that is, to the MF nodes of its linguistic values. Thus the number of nodes in this layer is equal to $L$. Let $x_i$ be the input to the $i$th node in layer 1; then the output of the node will be

$$O_i^{(1)} = x_i \qquad (3)$$
Layer 2: This is the fuzzification layer. Each node in this layer corresponds to one linguistic label of one of the input variables in layer 1. In other words, the output link represents the membership value that specifies the degree to which an input value belongs to a fuzzy set [14]. The output of a layer 2 node represents the MF grade of the input with respect to a linguistic value. Each neuron performs a Gaussian or Lagrange Interpolation MF. The output of a node in layer 2 is computed by

$$O_i^{(2)} = \mu\left(x_i\right) \qquad (4)$$

where $x_i$ is the value passed on by layer 1 and $\mu$ is the Gaussian or Lagrange Interpolation MF of that node.
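Layer 3: This level performs the defuzzification. The activation and output value for the neurons of the hidden layer can be written as

$$net_j = \sum_{i} w_{ij}\, O_i^{(2)} \qquad (5)$$

$$O_j^{(3)} = f\left(net_j\right) \qquad (6)$$

where $O_i^{(2)}$ is the output of the $i$th node of the second level, $f$ is the activation function of the neuron, and $w_{ij}$ is the connection weight between the $i$th node of the second level and the $j$th neuron of the third level.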
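Layer 4: This is the output layer and each node in this layer represents a class. Thus, if there are $M$ classes then there will be $M$ nodes in layer 4. The nodes in layers 3 and 4 are fully connected. The output of node $k$ in layer 4 is computed as

$$O_k^{(4)} = f\left(net_k\right), \qquad net_k = \sum_{j} w_{jk}\, O_j^{(3)} \qquad (7)$$

where $O_j^{(3)}$ is the output of the $j$th node of the third level, $f$ is the activation function, and $w_{jk}$ is the connection weight between the $j$th node of the third level and the $k$th neuron of the fourth level.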
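The four layers described above can be sketched as a single forward pass in Python. This is a minimal illustration under stated assumptions, not the authors' C++ implementation: it assumes one Gaussian MF node per input feature, a sigmoid activation in layers 3 and 4, and no bias terms, none of which is confirmed by the text. All names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation (assumed; the paper does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-z))


def gaussian_mf(x, centre, width):
    """Gaussian membership grade (assumed standard form)."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def forward(x, centres, widths, w_hidden, w_output):
    """Return the outputs of layers 2, 3 and 4 for one input pattern.

    w_hidden[i][j] links layer-2 node i to hidden neuron j;
    w_output[j][k] links hidden neuron j to output (class) node k.
    """
    layer1 = list(x)  # layer 1 only buffers the inputs
    layer2 = [gaussian_mf(v, c, s) for v, c, s in zip(layer1, centres, widths)]
    n_hidden = len(w_hidden[0])
    layer3 = [
        sigmoid(sum(layer2[i] * w_hidden[i][j] for i in range(len(layer2))))
        for j in range(n_hidden)
    ]
    n_out = len(w_output[0])
    layer4 = [
        sigmoid(sum(layer3[j] * w_output[j][k] for j in range(n_hidden)))
        for k in range(n_out)
    ]
    return layer2, layer3, layer4
```

With the paper's dimensions, `x` would hold 27 feature values and `w_output` would have 3 columns, one per output node.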
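The error in the network can be calculated as

$$E = \frac{1}{2} \sum_{k=1}^{M} e_k^2 \qquad (8)$$

where $e_k = t_k - O_k^{(4)}$ is the difference between the target output value $t_k$ and the actual output value $O_k^{(4)}$ of the output layer. To minimize the error signal, the coupling strength is updated by an amount proportional to the partial derivative of $E$ with respect to $w_{jk}$ (weight between hidden and output layer units):

$$\frac{\partial E}{\partial w_{jk}} = -\delta_k\, O_j^{(3)} \qquad (9)$$

where

$$\delta_k = e_k\, f'\left(net_k\right) \qquad (10)$$

with $O_j^{(3)}$ the output of hidden node $j$, $net_k$ the weighted input of output node $k$, and $f'$ the derivative of the activation function. Similarly, the partial derivative of $E$ with respect to $w_{ij}$ (weight between layer 2 and layer 3 units) can be derived as follows:

$$\frac{\partial E}{\partial w_{ij}} = -\delta_j\, O_i^{(2)} \qquad (11)$$

where

$$\delta_j = f'\left(net_j\right) \sum_{k=1}^{M} \delta_k\, w_{jk} \qquad (12)$$

This is the error propagated back to the hidden layer, with $O_i^{(2)}$ the output of layer 2 node $i$ and $net_j$ the weighted input of hidden node $j$.
The change in weights in the network can be calculated as

$$\Delta w_{jk}^{\text{new}} = \eta\, \delta_k\, O_j^{(3)} + \alpha\, \Delta w_{jk}^{\text{old}} \qquad (13)$$

$$\Delta w_{ij}^{\text{new}} = \eta\, \delta_j\, O_i^{(2)} + \alpha\, \Delta w_{ij}^{\text{old}} \qquad (14)$$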
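One training step of backpropagation with learning rate η and momentum α can be sketched in Python as follows. The sketch assumes a half squared-error loss and a sigmoid activation (so the derivative is o(1 - o)); neither is stated explicitly in the text, and the function and argument names are hypothetical. The defaults η = 0.9 and α = 0.00001 are the values quoted in the paper.

```python
def backprop_step(layer2, layer3, layer4, target, w_hidden, w_output,
                  prev_dw_hidden, prev_dw_output, eta=0.9, alpha=0.00001):
    """Update both weight matrices in place; return the error before the update.

    layer2, layer3, layer4 are the outputs of the fuzzification, hidden and
    output layers for one pattern; prev_dw_* hold the previous weight changes.
    """
    # Output-layer error terms: (target - output) times the sigmoid derivative.
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, layer4)]
    # Hidden-layer error terms, computed with the weights before the update.
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_output[j][k]
                            for k in range(len(delta_out)))
        for j, h in enumerate(layer3)
    ]
    for j in range(len(layer3)):
        for k in range(len(delta_out)):
            dw = eta * delta_out[k] * layer3[j] + alpha * prev_dw_output[j][k]
            w_output[j][k] += dw
            prev_dw_output[j][k] = dw
    for i in range(len(layer2)):
        for j in range(len(layer3)):
            dw = eta * delta_hid[j] * layer2[i] + alpha * prev_dw_hidden[i][j]
            w_hidden[i][j] += dw
            prev_dw_hidden[i][j] = dw
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, layer4))
```

Training would repeat this step over the patterns until the returned error falls below the chosen tolerance.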
α and η are learning coefficients, which are usually chosen by trial and error. The network learns the weights of the links connecting layers 3 and 4 and also the parameters associated with the nodes in layer 2, which do the feature selection. The initial values of the weights can be selected so that no feature gets into the network in the beginning; the learning algorithm will then pass the features that are important, i.e., the features that can reduce the error rapidly.
3.3. Priori Knowledge
One way to set the weights explicitly is to use a priori knowledge. The Selection layer in Figure 1 generates priori weights based on the impact of the input features on the classification in criminal law. A priori knowledge for the design of neural networks helps to solve some basic difficulties encountered in practice: inefficient training and bad generalization of neural networks. The main advantages of using a priori knowledge for the design of neural networks should be smaller needs for learning data, improved training efficiency and better generalization. Since any information that is built directly into a network does not have to be learned any more, less learning data is needed and generalization is improved. Network training becomes more efficient because smaller networks can be used to learn the remaining learning data. A priori knowledge for the design of networks generates consistent benefits for a wide range of different applications. After initial design, the FNN can learn from training data sets [15]. This allows an optimal integration of both a priori knowledge and learning data into FNNs.
4. EXPERIMENTAL RESULTS AND DISCUSSION
The methodology is tested on real-world data sets, with 300 samples for training and 99 samples for testing, to find the optimum performance of the Artificial Neural Network (ANN) and of the FNN with Lagrange Interpolation and Gaussian MFs in criminal law. The classifier model is tested for 3 to 30 hidden neurons in the hidden layer using priori weights and random weights. The results for the average and optimum number of neurons in the hidden layer are reported in terms of number of iterations, training time and classification accuracy. The parameters used in all the models are listed in Table 1, and the performance of the networks is evaluated using both the training and testing data sets. In this application, the identification of precedents in the area of criminal law is examined.
It is known that not all features that characterize a data point have the same impact with regard to its classification. Priori weights are assigned based on the impact of the features on the classification in criminal law. The data set comprises 27 input features and 3 output features. The input features considered are committing murder, imprisonment for life for committing murder, inhuman acts, the accused acted in a cruel manner, death due to homicide, use of deadly weapon, with the intention of causing grievous injuries, trustworthy evidence by an eye witness, abetment of suicide of a child or an insane person, kidnapping for ransom or murder, rape, death of a woman caused by burns or bodily injury, within seven years of her marriage, subject to cruelty or harassment by husband for dowry, before the death, the injured person’s dying declaration, circumstantial evidence, medical evidence, member of unlawful assembly, number of persons killed, strangulation, grave and sudden provocation, house trespass and theft, causing miscarriage, death of a quick unborn child and causing miscarriage without the woman’s consent. The punishment for the offence is given as output. In the experimental evaluation of the algorithm on the set of real-world data of murder cases, the total number of iterations, the total running time in seconds and the classification accuracy are taken as the performance metrics.
Table 1. Parameters used for FNN

Parameter                                          Number/Value
Input nodes                                        27
Output nodes                                       3
Learning rate (η)                                  0.9
Momentum term (α)                                  0.00001
Network error tolerance (τ)                        0.01
Initial weights and biased term values for         Randomly generated values between -1 and 1
  Random weights and for initial weights
  between hidden-output layer
Initial weights between input-hidden layer         Values generated between 0.5 to 1 for weights
  for Priori weights                                 emerging from most important features and
                                                     values generated between 0 to -0.5 for weights
                                                     emerging from least important features
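The weight ranges in Table 1 suggest an initialisation along the following lines. This Python sketch is illustrative only: the boolean list `important` is a hypothetical stand-in for the Selection layer, whose actual criterion for ranking features is not given, and the lower range is taken as [-0.5, 0] as it reads in Table 1 (pass a different `low_range` if 0 to 0.5 is intended).

```python
import random


def priori_input_hidden_weights(important, n_hidden, rng=None,
                                high_range=(0.5, 1.0), low_range=(-0.5, 0.0)):
    """One row of weights per input feature, drawn from a range that depends
    on whether the feature is flagged as important."""
    rng = rng or random.Random()
    weights = []
    for is_important in important:
        lo, hi = high_range if is_important else low_range
        weights.append([rng.uniform(lo, hi) for _ in range(n_hidden)])
    return weights


def random_weights(n_rows, n_cols, rng=None):
    """Random initial weights between -1 and 1, as listed in Table 1."""
    rng = rng or random.Random()
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
            for _ in range(n_rows)]
```

For the configuration in the paper, `important` would have 27 entries and `n_hidden` would range from 3 to 30.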
Table 2 shows the results of the classifier in criminal law for all possible hidden neurons for ANN and FNN using Gaussian and Lagrange Interpolation MFs, for Priori weights and Random weights.

Table 2. Results of classification model.

Weights  Hidden-layer neurons  Classifier model                   No. of iterations  Training time (sec)  Test set accuracy rate (%)
Priori   Optimum number        ANN                                168                1.208791             97.98
Priori   Optimum number        FNN with Gaussian MF               189                0.879121             98.99
Priori   Optimum number        FNN with Lagrange interpolation MF 147                0.879121             98.99
Priori   Average number        ANN                                143.21             1.226452             95.24
Priori   Average number        FNN with Gaussian MF               131.79             1.642465             95.28
Priori   Average number        FNN with Lagrange interpolation MF 140.96             1.774522             95.71
Random   Optimum number        ANN                                289                1.208791             96.97
Random   Optimum number        FNN with Gaussian MF               164                0.934066             97.98
Random   Optimum number        FNN with Lagrange interpolation MF 152                0.894176             97.98
Random   Average number        ANN                                522.71             4.515306             93.90
Random   Average number        FNN with Gaussian MF               161.30             2.032967             95.21
Random   Average number        FNN with Lagrange interpolation MF 154.19             1.929739             95.70

Figure 2. The Comparison chart for Optimum training time of criminal law between ANN and FNN
Fig. 2 and Fig. 3 show the comparative performance of the classifiers in criminal law in terms of training time and accuracy rate for the test data set, respectively. As shown in Fig. 2 and Fig. 3, the experimental results clearly reveal that ANN and FNN using Priori weights outperform ANN and FNN using Random weights. As seen in Fig. 3, the FNN with priori weights classifies the test set with an accuracy of 98.99%.
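As a quick arithmetic check, the optimum accuracy rates reported above are consistent with whole numbers of correctly classified cases out of the 99 test samples. The Python sketch below shows the calculation; reading 98.99% as 98 of 99 is an inference from the reported figures, not something the paper states.

```python
TEST_SAMPLES = 99  # size of the test set reported in the paper


def accuracy_percent(correct, total=TEST_SAMPLES):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


for correct in (98, 97, 96):
    print(correct, accuracy_percent(correct))
# 98 -> 98.99, 97 -> 97.98, 96 -> 96.97
```

On this reading, the best FNN configurations misclassify a single test case, while the ANN with random weights misclassifies three.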
Figure 3. The Comparison chart for Optimum Classification Accuracy of criminal law between ANN and FNN
5. CONCLUSION
The authors have proposed a Fuzzy Neural Network (FNN) classifier using Priori and Random weights in Criminal law using Lagrange Interpolation and Gaussian membership functions. The performance for the optimum and average numbers of neurons in the hidden layer is reported. The proposed system was trained and tested on a sufficient number of court decisions. The results show that the networks using priori weights have lower training time and higher classification accuracy on the test data set than the networks using random weights.
REFERENCES
[1] Sotarat Thammaboosadee and Udom Silparcha, A GUI Prototype for the Framework of Criminal Judicial Reasoning System, Journal of International Commercial Law and Technology, Vol. 4, Issue 3, pp. 224-230, 2009.
[2] Dan Hunter, Out of their minds: Legal theory in neural networks, Artificial Intelligence and Law, 7, 129-151, 1999.
[3] Lothar Philipps, Giovanni Sartor, Introduction: From legal theories to neural networks and fuzzy reasoning, Artificial Intelligence and Law, 7, 115-128, 1999.
[4] Jurgen Hollatz, A Fuzzy Advisory System for Estimating the Waiting Period after Traffic Accidents, IEEE Fuzzy Systems, 3, 1475-1479, 1997.
[5] R. Saneifard, Some properties of neural networks in designing fuzzy systems, Neural Computing & Applications, DOI 10.1007/s00521-011-0777-1, 2011.
[6] Shuang Feng, Hongxing Li, Dan Hu, A new training algorithm for HHFNN based on Gaussian membership function for approximations, Neurocomputing, Vol. 72, pp. 1631-1638, 2009.
[7] Anjeli Garg and S. R. Singh, Solving Fuzzy System of Equations Using Gaussian Membership Function, International Journal of Computational Cognition, Vol. 7, No. 4, pp. 25-32, December 2009.
[8] Giovanna Castellano, Anna Maria Fanelli, Corrado Mencar, A neuro-fuzzy network to generate human-understandable knowledge from data, Cognitive Systems Research, Vol. 3, pp. 125-144, 2002.
[9] Ching-Hung Lee and Ching-Cheng Teng, Fine Tuning of Membership Functions for Fuzzy Neural Systems, Asian Journal of Control, Vol. 3, No. 3, pp. 216-225, 2001.
[10] Hong-Xing Li and C. L. Philip Chen, The Equivalence Between Fuzzy Logic Systems and Feedforward Neural Networks, IEEE Transactions on Neural Networks, Vol. 11, No. 3, pp. 356-365, 2000.
[11] H. O. Nyongesa, Fuzzy Assisted Learning in Back Propagation Neural Networks, Neural Computing & Applications, Springer-Verlag London Limited, pp. 424-428, 1997.
[12] M. M. Janeela Theresa, V. Joseph Raj, Analogy Making in Criminal Law with Neural Network, Proceedings of ICETECT 2011, IEEE Xplore, pp. 772-775.
[13] Sotarat Thammaboosadee, Udom Silparcha, A Framework for Criminal Judicial Reasoning System using Data Mining Techniques, FUZZ-IEEE’97, pp. 1475-1479.
[14] M. Tabesh, M. Dini, Fuzzy and Neuro-Fuzzy Models for Short-Term Water Demand Forecasting in Tehran, Iranian Journal of Science & Technology, Transaction B, Engineering, Vol. 33, No. B1, pp. 61-77.
[15] Rico A. Cozzio-Bueler, The Design of Neural Networks using A Priori Knowledge, Ph.D. Dissertation, Swiss Federal Institute of Technology, Zurich, 1995.
Authors
M. M. Janeela Theresa received her MPhil degree from Mother Teresa Women’s University, Kodaikanal, India and her Post Graduate degree from Barathidasan University, Tiruchirapalli, India. She is presently working as Assistant Professor at St. Xavier’s Catholic College of Engineering, Anna University, Chunkankadai, India. She has teaching experience of about 18 years and research experience of about 12 years. Her research interests include neural networks, data mining, expert systems and soft computing.
V. Joseph Raj received his PhD degree from Manonmaniam Sundaranar University, Tirunelveli, India and his Post Graduate degree from Anna University, Chennai, India. He is presently working as Professor and Head of the Department of Computer Science, Kamaraj College, Manonmaniam Sundaranar University, Thoothukudi, India. He has also worked as an Associate Professor in Computer Engineering at the European University of Lefke, North Cyprus. He is guiding PhD scholars of various Indian universities. He has teaching experience of about 25 years and research experience of about 18 years. His research interests include neural networks, biometrics, network security and operations research.