SICE-ICASE International Joint Conference 2006
Oct. 18-21, 2006, Bexco, Busan, Korea

Fuzzy Decision-making SVM with an Offset for Real-world Lopsided Data Classification

Boyang LI, Jinglu HU and Kotaro HIRASAWA
Graduate School of Information, Production and Systems, Waseda University, Hibikino 2-7, Wakamatsu-ku, Kitakyushu-shi, Fukuoka-ken, Japan
(Tel/fax: (+81)93-692-5271; E-mail: liboyang@akane.waseda.jp, jinglu@waseda.jp, hirasawa@waseda.jp)
Abstract: An improved support vector machine (SVM) classifier model for classifying real-world lopsided data is proposed. The most obvious differences between the proposed model and conventional SVM classifiers are the design of the decision-making function and the introduction of an offset parameter. Considering the vagueness of real-world data sets, a fuzzy decision-making function is designed to take the place of the traditional sign function in the prediction part of the SVM classifier. Because interaction and noise influence the region around the boundary between different clusters, this flexible decision-making model, which is closer to real-world situations, can deliver better performance. In addition, this paper mainly discusses an offset parameter introduced to correct the boundary excursion caused by the imbalance of real-world datasets. Because real-world noise can also influence the separation boundary, a weighted harmonic mean (WHM) method is used to compute the offset parameter. Due to these improvements, more robust performance is demonstrated in our simulations.
Keywords: SVM, Fuzzy decision-making function, WHM offset, Real-world lopsided dataset, Classification
1. INTRODUCTION
SVM is an algorithm based on statistical learning theory, so it generally performs well on unseen data. Heretofore, SVM has been applied to many practical problems, especially various classification problems, but classification and SVM itself still have some problems to be resolved [2][3].
For most real-world classification problems, databases are usually affected by interaction and noise between different classes, so real-world classification problems are mostly non-separable. To deal with these non-separable cases, the SVM algorithm uses a regularization parameter C in the training part, which weighs the tolerance of the SVM. Because this parameter is the only adjustable parameter in SVM that controls the choice of support vectors, changing C always affects the performance of the SVM noticeably. In order to reduce the influence of an improper choice of C and deal with misclassification caused by interaction and noise, a fuzzy decision-making model is proposed to replace the traditional one in the prediction part of SVM classifiers. In this way, the hard-shell boundary between neighboring classes is transformed into a flexible one, and misclassifications caused by interaction and noise can be reduced [1].
In addition, the number of samples in one of the classes of a real-world dataset is usually much larger than in the others. This imbalance is the reason for the excursion of the boundary, which is another frequent problem in practical classification. An offset parameter was introduced to correct this excursion in our model [1]. In SVM, the support vectors are the samples nearest to the separation boundary, so their prediction values can be used to calculate this offset. However, experimental results show that noise in real-world cases not only turns the separation boundary into a gray zone but also increases the difficulty of computing a proper offset parameter. In other words, because SVM admits violations in non-separable cases, support vectors are identified from the bounding planes belonging to the different subsets. If these support vectors are strongly disturbed by interaction or noise, it is difficult to obtain a correct separation boundary. Based on this consideration, in this paper a series of weights β_1, ..., β_n is introduced to build a Weighted Harmonic Mean (WHM) offset, which balances the influences of all support vectors and makes deviant decision values of support vectors invalid. Using this WHM offset we can obtain a new separation boundary, and the performance of this model is studied in simulations with different kinds of real-world datasets.
This paper is organized as follows: the next section provides an overview of SVM and its application to nonlinear non-separable classification problems. Section 3 discusses the decision-making part of SVM, and a fuzzy decision-making method is proposed to fit real-world datasets. In Section 4, an offset parameter is introduced to correct the excursion of the boundary, calculated as a weighted harmonic mean of the decision values of the support vectors. Then, results comparing different kernels and different classification decision-making functions, with details on classification accuracy, are presented in Section 5. Finally, concluding remarks are given in Section 6.
2. SVM FOR CLASSIFICATION
In recent years, SVM has revealed its prominent capability in many practical applications, especially classification problems. In the elementary design of an SVM classifier, bounding planes of the subsets are considered and the distance between these bounding planes is defined as the margin. Maximizing the margin is equivalent to finding the optimal separation boundary. Real-world classification problems are usually non-separable and non-linear, so violations are accepted in non-separable cases. For nonlinear cases, the input vectors are first mapped into a high-dimensional feature space, in which a separating hyperplane is found by solving a quadratic programming (QP) problem in its dual form.

Fig. 1 Support Vectors in Non-separable Classification
2.1 Basic Problem of SVM Classifier
Any complex classification problem can be divided into several binary ones, so we only discuss binary cases in this paper.
Suppose we have a training data set denoted as {x_i, y_i}, where x_i ∈ R^n, i = 1, 2, ..., N; x_i is the i-th input vector and y_i is its class label (+1 or -1). The training data set can be divided into two different sets A and B, which have labels +1 and -1 respectively. As discussed above, the distance between the two sets' bounding planes is called the margin. It is obvious that maximizing this margin generally improves the ability of the classifier model [4].
2.2 SVM for Non-separable Classification
In the case where the training data are non-separable, one should attempt to minimize the separation error and maximize the margin simultaneously.
The support vector machine classifier is obtained by solving an optimization problem whose objective function balances a term forcing separation between A and B against a term maximizing the margin of separation, so tolerance is accepted in these cases [5].
As shown in Fig. 1, support vectors from A are those points A_i in the halfspace {x ∈ R^n | w^T x ≤ b + 1} (i.e. those points of A 'on' or 'below' the bounding plane w^T x = b + 1), where w and b are the weight and bias. Support vectors from B are those points B_i in the halfspace {x ∈ R^n | w^T x ≥ b − 1} (i.e. those points of B 'on' or 'above' the plane w^T x = b − 1). These points are the only data points relevant to determining the optimal separating plane. The number of support vectors is usually small and is also proportional to a bound on the generalization error of the classifier. But there is a problem in many practical applications: if these support vectors are influenced by noise, some of them may have large absolute decision values, and because noise is usually uncertain, we cannot obtain the correct separating hyperplane.
2.3 SVM for Nonlinear Classification
Because the common cases in real-world classification are nonlinear and non-separable, in the primal space we transform the low-dimensional input data into a high-dimensional feature space using a mapping function φ(x). For non-separable cases in the feature space, the boundary condition has nonnegative slack variables ξ_i that let the margin accept violations, and then we have a separating plane condition

    y_i [w^T φ(x_i) + b] ≥ 1 − ξ_i,  ∀i    (1)

The optimal hyperplane problem becomes finding the solution of the following optimization problem,

    min_{w,b,ξ} J(w, ξ) = (1/2) w^T w + C Σ_{i=1}^{N} ξ_i    (2)

    s.t.  y_i [w^T φ(x_i) + b] ≥ 1 − ξ_i,
          ξ_i ≥ 0,  i = 1, ..., N.

where the parameter C is used to control the degree of tolerance; it is the only adjustable parameter in SVM.
By introducing the vector of Lagrange multipliers α = (α_1, ..., α_N), the problem (Eq. 2) can be rewritten as a QP problem in dual space [6]:

    max_α Q(α) = −(1/2) Σ_{i,j=1}^{N} y_i y_j K(x_i, x_j) α_i α_j + Σ_{j=1}^{N} α_j    (3)

    s.t.  Σ_{i=1}^{N} α_i y_i = 0,
          0 ≤ α_i ≤ C,  ∀i
where K(x_i, x_j) = φ(x_i)^T φ(x_j) is the kernel function [7]. In our experiments, the Polynomial kernel and the Gaussian RBF kernel are taken into account. Polynomial mapping is a common method for non-linear modeling, shown as follows:

    K(x_i, x_j) = (⟨x_i, x_j⟩ + 1)^d    (4)

where d is the degree of the polynomial. The RBF kernel has outstanding performance in applications:

    K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))    (5)

where σ² is the common width.
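The two kernels translate directly into code. The following is a minimal NumPy sketch of Eqs. (4) and (5); the function names are ours, and the 2σ² denominator follows the Gaussian-width convention used above.

```python
import numpy as np

def poly_kernel(xi, xj, d=3):
    """Polynomial kernel, Eq. (4): (<xi, xj> + 1)^d."""
    return (np.dot(xi, xj) + 1.0) ** d

def rbf_kernel(xi, xj, sigma2=4.0):
    """Gaussian RBF kernel, Eq. (5), with common width sigma^2."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return np.exp(-np.sum(diff ** 2) / (2.0 * sigma2))
```

For example, poly_kernel([1, 0], [1, 0]) gives (1 + 1)^3 = 8, and the RBF kernel of any point with itself is 1.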
The decision-making function can then be obtained,

    y(x) = sign[ Σ_{i=1}^{N} α_i y_i K(x, x_i) + b ]    (6)

where y(x) is the predicted output label of input vector x. Although the sign function can divide the test data set into two classes by checking the signs of the decision values Σ_{i=1}^{N} α_i y_i K(x, x_i) + b, it is so hard-shelled that it makes mistakes when the decision value is close to zero.
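Eq. (6) can be sketched as follows. The trained quantities (multipliers, labels, support vectors and bias) are assumed given, e.g. from any QP solver, and the kernel is passed in as a function; the names are illustrative, not from the paper.

```python
import numpy as np

def decision_value(x, alphas, labels, support_vectors, b, kernel):
    """Raw decision value of Eq. (6): sum_i alpha_i * y_i * K(x, x_i) + b."""
    return sum(a * yi * kernel(x, xi)
               for a, yi, xi in zip(alphas, labels, support_vectors)) + b

def sign_predict(x, alphas, labels, support_vectors, b, kernel):
    """Traditional hard decision: the sign of the decision value."""
    return 1 if decision_value(x, alphas, labels, support_vectors, b, kernel) >= 0 else -1
```

With a linear kernel lambda u, v: float(np.dot(u, v)), a single support vector at 1 with alpha = 1, y = +1 and b = 0 labels positive inputs +1 and negative inputs -1, as expected.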
3. FUZZY DECISION-MAKING SVM MODEL
3.1 Fuzzy Decision-making SVM Process
We propose a fuzzy decision-making SVM process, a model based on the fuzzy method, the SVM algorithm and analysis of a mass of databases. As an extension of traditional methods, the proposed model is more suitable for practical applications.
In the training part, we still use the same method as the traditional one to train the SVM classifier [8]. As discussed before, the support vectors, which form a subset extracted from the training data and describe the separation boundary, can be found. Using the trained SVM we can calculate the decision value of the input data. But differently from the traditional method, the decision value from the conventional sign decision-making function is used as the independent variable of a fuzzy decision-making function to measure the belief degree of each input point. The general structure of the whole process (Fig. 2) can be divided into three main stages: SVM training, decision value prediction, and fuzzy decision-making.

Fig. 2 fuzzy decision-making SVM process

In the decision-making part, a fuzzy model is constructed to replace the sign function of the conventional model. This is because interaction usually exists in many real conditions, especially around the boundary between classes, so misclassifications occur in the neighborhood of the threshold. Differently from conventional methods, the fuzzy boundary in our model makes zero no longer the only important value as a threshold; instead, the clusters are considered as fuzzy sets.
3.2 Fuzzy Decision-making Functions
Assume that the two fuzzy sets are called A (the values in this set are deemed to be predicted as -1) and B (the values in this set are deemed to be predicted as +1). A gray zone should be built to take the place of the hard boundary. The values of the functions indicate reliability, expressing a belief degree between 1 (completely believable) and 0 (completely false). Through several experiments, the arctangent function was confirmed for constructing the curvilinear fuzzy functions. The boundary functions of set A and set B can be written as:

    f_A(v) = arctan(−v·s − d·s)/π + 0.5    (7)

    f_B(v) = arctan(v·s − d·s)/π + 0.5    (8)
Fig. 3 curvilinear fuzzy decision model (believable value vs. decision value; boundary functions of fuzzy sets A and B, with the parameters d and s marked)
where d indicates the discerption degree, s is the scale factor, and the decision value is denoted as v. The cross-section of the fuzzy sets is shown in Fig. 3.
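Eqs. (7) and (8) translate directly into code; a minimal sketch (the d and s values shown are placeholders, not the tuned parameters from Section 5):

```python
import math

def f_A(v, s, d):
    """Membership in fuzzy set A (predicted -1), Eq. (7)."""
    return math.atan(-v * s - d * s) / math.pi + 0.5

def f_B(v, s, d):
    """Membership in fuzzy set B (predicted +1), Eq. (8)."""
    return math.atan(v * s - d * s) / math.pi + 0.5
```

Far on the negative side f_A approaches 1 and f_B approaches 0 (and vice versa), while the two curves cross at v = 0, so comparing f_B(v) with f_A(v) reproduces the sign decision there but grades the confidence near the boundary.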
4. WEIGHTED HARMONIC MEAN OFFSET
4.1 Introduction of WHM offset
Because most real cases are lopsided, the borderline cannot be described by formulas (7) and (8) accurately. As a result of dataset imbalance, the midpoint of the gray boundary zone is generally not equal to zero. So an offset constant δ needs to be introduced to denote the distance between the real borderline and the theoretical one.
Because support vectors are the samples nearest to the boundary, one way to calculate the offset δ is to compute the mean of the decision values of the support vectors, as proposed in our former models [1]. The formula can be written as follows:

    δ = (Σ_{i=1}^{n} S_i) / n    (9)

where S_1, ..., S_n are the decision values of the support vectors. This mean value can move the separation boundary to a better position than the conventional one, but if some support vectors are also influenced by noise in real-world datasets, this offset will be unreliable.
To find a more proper offset, it is necessary to suppress these false support vectors, so we introduce a weighted harmonic mean of the decision values of the support vectors (SVs). As with the offset parameter we proposed previously, the SVs are used as test data to obtain their decision values S_1, ..., S_n, where n is the number of support vectors. Suppose the corresponding weights are β_1, ..., β_n; then the offset takes the following form:

    δ = (Σ_{i=1}^{n} β_i) / (Σ_{i=1}^{n} β_i / S_i)    (10)
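Eqs. (9) and (10) side by side, as a minimal sketch; note the harmonic form of Eq. (10) assumes no decision value S_i is exactly zero.

```python
def mean_offset(S):
    """Plain mean of support-vector decision values, Eq. (9)."""
    return sum(S) / len(S)

def whm_offset(S, beta):
    """Weighted harmonic mean offset, Eq. (10); assumes every S_i is nonzero."""
    return sum(beta) / sum(b / s for b, s in zip(beta, S))
```

With equal weights, the WHM reduces to the ordinary harmonic mean; for example whm_offset([1, 2], [1, 1]) = 2 / (1 + 0.5) = 4/3.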
As mentioned in subsection 2.2, the support vectors from subset A are the points of A 'on' or 'below' the bounding plane w^T x = b + 1, and similarly the support vectors from B are those points of B 'on' or 'above' the plane w^T x = b − 1. So if some support vectors are strongly influenced by interaction and noise, these samples may lie far from the separation boundary. Therefore, we need to give these support vectors smaller weights so that they become invalid in the calculation of the offset. Based on this consideration, we employ the Blackman equation to calculate the weights of the support vectors, as shown in the following:

    β_i = 0.42 − 0.5 cos(π(S_i + S_max)/S_max) + 0.08 cos(2π(S_i + S_max)/S_max)    (11)

where S_max is the largest absolute value of all the support vectors' decision values, S_i is a certain support vector's decision value, and β_i denotes its corresponding weight.
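Eq. (11) in code. With the argument mapped as t = (S_i + S_max)/S_max ∈ [0, 2], the Blackman window assigns weight ≈ 1 to support vectors whose decision value is near zero (closest to the boundary) and weight ≈ 0 at ±S_max, which is exactly the suppression of deviant support vectors described above.

```python
import math

def blackman_weight(S_i, S_max):
    """Blackman-window weight of a support vector, Eq. (11)."""
    t = (S_i + S_max) / S_max          # maps [-S_max, S_max] onto [0, 2]
    return 0.42 - 0.5 * math.cos(math.pi * t) + 0.08 * math.cos(2.0 * math.pi * t)
```

At S_i = 0 the weight is 0.42 + 0.5 + 0.08 = 1; at S_i = ±S_max it is 0.42 − 0.5 + 0.08 = 0.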
4.2 Decision-making model with WHM offset

Fig. 4 fuzzy decision model with an offset

Introducing the proposed WHM offset into the fuzzy boundary SVM classifier model described above, a modified fuzzy boundary can be obtained, and formulas (7) and (8) can be rewritten as:

    f′_A(v) = arctan(−v·s − d·s + δ·s)/π + 0.5    (12)

    f′_B(v) = arctan(v·s − d·s − δ·s)/π + 0.5    (13)

where d and s are chosen from a great deal of experiments. We can use the values f′_A(v) and f′_B(v) to estimate whether an input vector should be labeled as +1 or -1. Based on formulas (12) and (13), the bounding curves of the model with WHM offset are drawn in Fig. 4.
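Eqs. (12) and (13) simply shift the gray zone by δ; a minimal sketch:

```python
import math

def f_A_shift(v, s, d, delta):
    """Offset-corrected membership of set A, Eq. (12)."""
    return math.atan(-v * s - d * s + delta * s) / math.pi + 0.5

def f_B_shift(v, s, d, delta):
    """Offset-corrected membership of set B, Eq. (13)."""
    return math.atan(v * s - d * s - delta * s) / math.pi + 0.5
```

The crossing point of the two curves moves from v = 0 to v = δ, and with δ = 0 the formulas reduce back to Eqs. (7) and (8).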
The whole process of the fuzzy decision-making SVM with WHM offset can thus be described as shown in Fig. 5. From the figure we can see that the procedure consists of four steps:
Step 1: SVM training process. Find the support vectors and compute the parameters for building the SVM classifier model.
Step 2: Prediction process for support vectors. Calculate the decision values of the support vectors and their corresponding weights.

Fig. 5 model with weighted harmonic mean offset

Step 3: Prediction process for the test datasets. Obtain the decision values of the test input vectors.
Step 4: Final decision-making process. Use the decision values of the support vectors and the weights from Step 2 to calculate the offset, then use the fuzzy decision-making method to predict the final output labels for the input vectors.
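The four steps above can be sketched end to end. This is only an illustration: it uses scikit-learn's SVC as the trainer (an assumption; the paper names no library), a toy imbalanced dataset, gamma = 1/(2σ²) to match the RBF width convention of Eq. (5), and placeholder values for s and d rather than the tuned ones from Section 5.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy lopsided two-class data standing in for a real dataset (assumption)
X = np.vstack([rng.normal(1.5, 1.0, (20, 2)), rng.normal(-1.5, 1.0, (120, 2))])
y = np.array([1] * 20 + [-1] * 120)

# Step 1: SVM training (sigma^2 = 4  ->  gamma = 1/(2*sigma^2) = 0.125)
clf = SVC(kernel="rbf", C=10.0, gamma=0.125).fit(X, y)

# Step 2: decision values of the support vectors and their Blackman weights, Eq. (11)
S = clf.decision_function(clf.support_vectors_)
S_max = np.max(np.abs(S))
t = (S + S_max) / S_max
beta = 0.42 - 0.5 * np.cos(np.pi * t) + 0.08 * np.cos(2.0 * np.pi * t)

# Offset part of Step 4: WHM offset, Eq. (10)
delta = beta.sum() / np.sum(beta / S)

# Steps 3-4: decision values of the inputs, then fuzzy decision via Eqs. (12)-(13)
s_scale, d = 8.0, 0.05                       # placeholder parameters
v = clf.decision_function(X)
fA = np.arctan(-v * s_scale - d * s_scale + delta * s_scale) / np.pi + 0.5
fB = np.arctan(v * s_scale - d * s_scale - delta * s_scale) / np.pi + 0.5
labels = np.where(fB >= fA, 1, -1)
```

The final comparison fB ≥ fA plays the role of the relabeling decision; everything upstream of it is exactly Steps 1-3.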
5. SIMULATION RESULTS
5.1 Simulation 1: Heart disease detection problem
5.1.1 Description of problem
The first problem we used to test our models is heart disease detection, from the Statlog datasets. The whole database consists of two classes: an absence class and a presence class. The total number of examples is 200, of which 150 samples are in the absence class and 50 in the presence class. Each sample in the dataset has 13 main attributes, extracted from a larger set of 75. Denote each pair of input vector and output label as {X(n), Y(n)}, where X(n) is the input vector with 13 attributes and Y(n) is the output label with two poles: -1 (absence)
Table 1 Comparison of two offsets in simulation 1

  C        |    1     |   10     |   100
  model A  | 94.615%  | 95.38%   | 96.15%
  model B  | 88.46%   | 91.54%   | 92.3%
  model C  | 95.38%   | 96.15%   | 96.15%
  model D  | 90%      | 92.3%    | 93.08%
or +1 (presence), n = 1, 2, ..., 200. Assume the vector X(n) = (x_1(n), x_2(n), ..., x_13(n)); these 13 elements signify: age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiographic results, maximum heart rate, exercise-induced angina, oldpeak, the slope of the peak exercise ST segment, number of major vessels, and thal (3 = normal; 6 = fixed defect; 7 = reversible defect).
In brief, our purpose is to predict the bipolar output Y(n), with as few sign errors as possible, from the given input vectors X(n).
5.1.2 Results of Classification
Because the database used in our simulations comes from the real world, it is apparently non-separable. In a non-separable SVM classifier, not only the kind of kernel but also the regularization parameter C affects the accuracy, the value used to evaluate classifiers. C denotes the tolerance of the SVM; as it changes, both the training results and the predictions change. To confirm the wide applicability of our method, we let C equal 1, 10 and 100 in turn, as common settings, and use the two kinds of kernels presented in the previous section: the RBF kernel and the Polynomial kernel. Using the fuzzy decision-making function and WHM offset in the prediction part of SVM classifiers based on these two kinds of kernels, we form two classification models: a fuzzy decision-making RBF-SVM model with WHM offset and a fuzzy decision-making Polynomial-SVM model with WHM offset.
Using the first 70 samples as the training data and the remainder as the test data, the accuracies of the two proposed models are obtained. The experimental results, compared with our previous models with mean offsets, are shown in Table 1. Comparing model A (fuzzy decision-making RBF SVM with mean offset) with model C (fuzzy decision-making RBF SVM with WHM offset), and model B (fuzzy decision-making Poly SVM with mean offset) with model D (fuzzy decision-making Poly SVM with WHM offset), we find that the new models with WHM offsets perform better.
Compared with the traditional RBF-SVM classifier and the traditional Polynomial-SVM classifier, the accuracies (y-axis) for three different values of the parameter C (x-axis) are shown in Fig. 6.
In the aspect of parameter selection, we choose a standard RBF kernel with common width σ² = 4 and a Polynomial kernel with d = 3. Using different combinations of kernel and C value to train the SVM classifiers, we obtain different models. Then the boundary offset δ can be worked out from the decision values of the support vectors and their weights. We set the scale coefficient s = 256 for the classifiers using the Polynomial kernel, and s = 512 for the classifiers using the RBF kernel.

Fig. 6 Accuracy Curves for heart disease detection (accuracy vs. C for the fuzzy decision RBF/Poly SVMs with WHM offset and the traditional RBF/Poly SVMs)
5.2 Simulation 2: Misfire detection problem
5.2.1 Description of problem
The second database we used also comes from the real world: a misfire detection problem in internal combustion engines. The whole database is divided into two subsets, one used as the training data and the other as the test data. These data contain time series information of 50000 samples produced by a physical system (a 10-cylinder internal combustion engine). Similarly to simulation 1, each sample k of the time series consists of four input elements and one output label; each pair of input vector and output label can be written as {X(k), Y(k)}, where X(k) is the input vector and Y(k) is the output label, k = 1, 2, ..., 50000.
The four elements of the input vectors, x_1(k), x_2(k), x_3(k) and x_4(k), represent the cylinder identifier (first), engine crankshaft speed in Revolutions Per Minute (RPM) (second), load (third) and crankshaft acceleration (fourth) respectively. Y(k) may take two values: -1 (normal firing) or +1 (misfire). The numbers of normal cases in the training data and the test data are 45093 and 45395, and the numbers of misfire samples in the training and test datasets are 4907 and 4605, so this database is also imbalanced. In this problem, our purpose is again to predict the value of the output Y(k) for a given input vector X(k).
5.2.2 Results of Classification
As described in simulation 1, we again let C equal 1, 10 and 100, and use the RBF kernel and the Polynomial kernel in our experiments respectively. The comparison of models with different offsets is shown in Table 2; models A, B, C and D have the same definitions as in Table 1. In this problem we also obtain better performance from the new model with a WHM offset.

Table 2 Comparison of two offsets in simulation 2

  C        |    1     |   10     |   100
  model A  | 95.34%   | 95.654%  | 95.42%
  model B  | 92.656%  | 93.858%  | 95.008%
  model C  | 95.74%   | 96.02%   | 95.82%
  model D  | 92.96%   | 94.27%   | 95.01%

The accuracies of the traditional classifiers and the proposed models with WHM offset are shown in Fig. 7.

Fig. 7 Accuracy Curves for misfire detection problem (accuracy vs. C for the fuzzy decision RBF/Poly SVMs with WHM offset and the traditional RBF/Poly SVMs)

In the aspect of parameter selection, we again set σ² = 4 for the RBF kernel and d = 3 for the Polynomial kernel. As in simulation 1, we set s = 256 for the classifiers using the Polynomial kernel and s = 512 for the classifiers using the RBF kernel.
5.3 Comparison among Different Classifiers
As shown in Fig. 6 and Fig. 7, for different values of the parameter C the accuracies of the models using the RBF kernel and those using the Polynomial kernel also differ. For different problems, one classifier may present different performance, so how to choose a proper kernel is still an open problem in the SVM method. In other words, for the dataset in simulation 1 the RBF kernel is better than the Polynomial kernel, but for the second problem the Polynomial kernel is more suitable.
In particular, from the figures we can also see that the method proposed in this paper presents better capability than the traditional SVM classifier and our former models for different regularization parameters C, and the performance of this model is more robust as well.
As discussed, a more proper decision-making function, a more valid offset, a more appropriate width of the gray zone and a more suitable kernel would make the classifier more effective.
6. CONCLUSIONS
We have presented a fuzzy decision-making algorithm for building SVM classifiers and introduced a WHM offset to correct the excursion of the boundary in real-world datasets.
The model proposed in this paper improves the performance of SVM in classification problems. By using a fuzzy decision-making function, the prediction boundary is transformed from a straitlaced configuration into a more flexible structure, so that the influence between data sets can be reduced and many points misclassified in the prediction part of the traditional SVM classifier have a chance to be relabeled.
In addition, we also construct a WHM offset δ. The introduction of this offset moves the separation boundary between each pair of sets to a more appropriate position, so that the error caused by the imbalance of the data sets can be overcome. In particular, with the weighted harmonic mean method, an anti-jamming offset can be calculated without being distorted by support vectors influenced by noise, since the distribution of the weight values controls the influence of the SVs on the offset parameter.
Although the proposed model performs well in simulations, some problems remain for the future. How to choose a proper kernel for a certain database, how to build a more robust kernel function and how to determine the parameters of the fuzzy decision-making part automatically will be the main directions of our future research.
REFERENCES
[1] Boyang Li, Jinglu Hu, Kotaro Hirasawa, Pu Sun and Kenneth Marko. "Support Vector Machine with Fuzzy Decision-Making for Real-world Data Classification". In IEEE World Congress on Computational Intelligence 2006, International Joint Conference on Neural Networks, Canada, 2006.
[2] N. Cristianini and J. Shawe-Taylor. "An Introduction to Support Vector Machines". Cambridge, UK: Cambridge Univ. Press, 2000.
[3] P. Bartlett and J. Shawe-Taylor. "Generalization performance of support vector machines and other pattern classifiers". In Advances in Kernel Methods: Support Vector Learning, Cambridge, MA: MIT Press, 1998.
[4] O. Chapelle and V. Vapnik. "Model selection for Support Vector Machines". In S. Solla, T. Leen, and K.-R. Müller, editors, Adv. Neural Inf. Proc. Syst. 12, Cambridge, MA: MIT Press, 2000.
[5] Steve R. Gunn. "Support Vector Machines for Classification and Regression". Technical Report, Faculty of Engineering, Science and Mathematics, School of Electronics and Computer Science, 10 May 1998.
[6] Johan Suykens. "Least Squares Support Vector Machines". Tutorial, IJCNN, 2003.
[7] Alex J. Smola and Bernhard Schölkopf. "On a kernel-based method for pattern recognition, regression, approximation and operator inversion". Algorithmica, 22:211-231, 1998. Also Technical Report 1064, GMD FIRST, April 1997.
[8] E. Osuna, R. Freund and F. Girosi. "Support Vector Machines: Training and Applications". A.I. Memo 1602, MIT A.I. Lab., 1997.