Twin Support Vector Machine

Faculty of Science, Alexandria University
Department of Mathematics and Computer Science

By: Ahmed Ali
Supervised by: Assistant Prof. Yasser Fouad


Contents:

1. Mathematical Introduction.
2. Introduction.
3. Support Vector Machine.
4. Linear Twin Support Vector Machine.
5. Non-linear Twin Support Vector Machine.
6. Practical results.
7. Conclusion.
8. References.


1 Mathematical Introduction:

What is a hyperplane:

A line equation is a relation between the two variables \(x_1\) and \(x_2\):

\[ w_1 x_1 + w_2 x_2 + b = 0 \]

It can be written as

\[ \begin{bmatrix} w_1 & w_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + b = 0 \]

or, for simplicity, \( \langle w, x \rangle + b = 0 \).

If there are three variables (three dimensions), the equation

\[ \begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + b = 0 \]

gives a plane. In n dimensions the equation becomes

\[ \langle w, x \rangle + b = 0 \]

where \( w = [w_1, w_2, \dots, w_n]^T \) and \( x = [x_1, x_2, \dots, x_n]^T \), which is a hyperplane equation.
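As a small illustration, a point can be tested against a hyperplane by evaluating \( \langle w, x \rangle + b \); the vector and offset below are made-up example values:

```python
import numpy as np

# Example hyperplane <w, x> + b = 0 in R^3; w and b are illustrative.
w = np.array([1.0, -2.0, 0.5])   # normal vector w = [w1, ..., wn]^T
b = 0.25

def side_of_hyperplane(x):
    """Return the signed value <w, x> + b for a point x."""
    return np.dot(w, x) + b

print(side_of_hyperplane(np.array([1.0, 0.0, 0.0])))  # 1.25
```

The sign of the returned value tells on which side of the hyperplane the point lies, which is exactly how the classifiers below assign classes.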

What is Quadratic Programming:

It is a special kind of optimization problem: minimizing or maximizing a quadratic function subject to linear constraints.
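For example, a tiny quadratic program can be solved numerically with a general-purpose solver; the problem below is invented for illustration (its exact solution is x = y = 1/2):

```python
import numpy as np
from scipy.optimize import minimize

# Minimize the quadratic f(x, y) = x^2 + y^2 subject to x + y >= 1.
objective = lambda v: v[0] ** 2 + v[1] ** 2
constraint = {"type": "ineq", "fun": lambda v: v[0] + v[1] - 1.0}

res = minimize(objective, x0=[0.0, 0.0], constraints=[constraint],
               method="SLSQP")
print(np.round(res.x, 3))  # [0.5 0.5]
```

Dedicated QP solvers are used in practice; SLSQP is shown here only because it handles this small convex problem directly.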


2 Introduction:

The twin support vector machine (TWSVM) is a binary classifier based on the standard support vector machine (SVM) classifier. TWSVM solves two smaller quadratic programming problems (QPPs) instead of one large QPP. In SVM all data points appear in the constraints, but in TWSVM they are distributed so that the patterns of one class determine the constraints of the other class's QPP and vice versa, which makes TWSVM roughly four times faster than SVM in the training phase. TWSVM determines two non-parallel hyperplanes by solving two related SVM-type problems, where each hyperplane is closer to one class and as far as possible from the other, and new patterns are assigned to the class of the closer hyperplane.

3 Support Vector Machine:

Assume the patterns to be classified are a set of m row vectors in the n-dimensional real space \(\mathbb{R}^n\), i.e. the matrix

\[
A = \begin{bmatrix}
A_{11} & A_{12} & \dots & A_{1n} \\
\vdots & \vdots & & \vdots \\
A_{i1} & A_{i2} & \dots & A_{in} \\
\vdots & \vdots & & \vdots \\
A_{m1} & A_{m2} & \dots & A_{mn}
\end{bmatrix}
\]

holds the patterns to be classified, such that each row is a single pattern. Also assume that \(y_i \in \{1, -1\}\) denotes the class to which the \(i\)-th pattern belongs.

First consider data that are strictly linearly separable; then they can be separated by the hyperplane

\[ \langle w, x \rangle + b = 0 \qquad (1) \]

which lies in the middle between the two hyperplanes

\[ \langle w, x \rangle + b = 1 \qquad \text{and} \qquad \langle w, x \rangle + b = -1 \qquad (2) \]

and separates the data of each class by a margin of \(1 / \|w\|_2\) on each side, so the margin of separation between the classes is given by \(2 / \|w\|_2\). To obtain equation (1), \(w\) must be determined, as it is the unknown in the equation; it can be obtained by solving the following optimization problem:

\[
\begin{aligned}
\min_{w, b} \quad & \tfrac{1}{2} w^T w \\
\text{subject to:} \quad & A_i w \ge 1 - b && \text{for } y_i = 1 \\
& A_i w \le -1 - b && \text{for } y_i = -1
\end{aligned}
\qquad (3)
\]
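To make the margin concrete: for a given separating \(w\), the width between the two bounding hyperplanes in (2) is \(2 / \|w\|_2\). The numbers below are illustrative:

```python
import numpy as np

w = np.array([3.0, 4.0])          # example weight vector, ||w||_2 = 5
margin = 2.0 / np.linalg.norm(w)  # distance between <w,x>+b = 1 and <w,x>+b = -1
print(margin)  # 0.4
```

This is why minimizing \(\tfrac{1}{2} w^T w\) in (3) maximizes the margin.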

But if the two classes are not strictly linearly separable, there will be an error in satisfying the constraints of (3) for some patterns, i.e. some patterns will be misclassified, so the problem can be modified to:

\[
\begin{aligned}
\min_{w, b} \quad & \tfrac{1}{2} w^T w + c\, e^T q \\
\text{subject to:} \quad & A_i w + q_i \ge 1 - b && \text{for } y_i = 1 \\
& A_i w - q_i \le -1 - b && \text{for } y_i = -1 \\
& q_i \ge 0 \quad \forall i \in [1, m]
\end{aligned}
\qquad (4)
\]

where \(c\) is a scalar whose value sets the trade-off between the classification error and the margin: a large value of \(c\) emphasizes the error, while a small value of \(c\) emphasizes the classification margin. In practice, rather than solving (4) directly, its dual problem is solved to obtain the classifier.
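As a rough numerical illustration (not the dual formulation used in practice), the soft-margin primal (4) can be solved directly on a tiny dataset; the data points and the value of c below are invented for the example:

```python
import numpy as np
from scipy.optimize import minimize

# Toy 2-D data: class +1 on the right, class -1 on the left (illustrative).
A = np.array([[2.0, 0.0], [2.5, 1.0], [3.0, -0.5],      # y_i = +1
              [-2.0, 0.0], [-2.5, 1.0], [-3.0, -0.5]])  # y_i = -1
y = np.array([1, 1, 1, -1, -1, -1])
m, n = A.shape
c = 1.0  # error/margin trade-off from (4)

# Decision variables z = [w (n entries), b (1 entry), q (m entries)].
def objective(z):
    w, q = z[:n], z[n + 1:]
    return 0.5 * w @ w + c * q.sum()

constraints = [
    # y_i (A_i w + b) + q_i >= 1 expresses both constraint cases of (4)
    {"type": "ineq",
     "fun": lambda z: y * (A @ z[:n] + z[n]) + z[n + 1:] - 1.0},
    {"type": "ineq", "fun": lambda z: z[n + 1:]},  # q_i >= 0
]
res = minimize(objective, np.zeros(n + 1 + m),
               constraints=constraints, method="SLSQP")
w, b = res.x[:n], res.x[n]
pred = np.sign(A @ w + b)
print(pred)  # should match y on this separable toy set
```

On this separable toy set the slack variables go to zero and every training point is classified correctly.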

4 Linear Twin Support Vector Machine:

The idea of TWSVM is to create two non-parallel hyperplanes, a positive and a negative one, such that each is as close as possible to one class and as far as possible from the other class, and vice versa.

New patterns are assigned to one of the classes depending on their distance to the two hyperplanes.

Each of the two QPPs in the TWSVM pair has the typical formulation of SVM, but not all data patterns appear in the constraints of both problems at the same time.

Assume the data belonging to classes 1 and -1 are represented by matrices A and B respectively, and the numbers of patterns in the two classes are \(m_1\) and \(m_2\) respectively, so the sizes of the matrices are \((m_1 \times n)\) and \((m_2 \times n)\). The TWSVM classifier is obtained by solving the following pair of QPPs:


\[
\begin{aligned}
\min_{w^{(1)}, b^{(1)}, q} \quad & \tfrac{1}{2} \left( A w^{(1)} + e_1 b^{(1)} \right)^T \left( A w^{(1)} + e_1 b^{(1)} \right) + c_1 e_2^T q \\
\text{subject to:} \quad & -\left( B w^{(1)} + e_2 b^{(1)} \right) + q \ge e_2, \quad q \ge 0
\end{aligned}
\qquad (5)
\]

and

\[
\begin{aligned}
\min_{w^{(2)}, b^{(2)}, q} \quad & \tfrac{1}{2} \left( B w^{(2)} + e_2 b^{(2)} \right)^T \left( B w^{(2)} + e_2 b^{(2)} \right) + c_2 e_1^T q \\
\text{subject to:} \quad & -\left( A w^{(2)} + e_1 b^{(2)} \right) + q \ge e_1, \quad q \ge 0
\end{aligned}
\qquad (6)
\]

where \(c_1, c_2\) are parameters and \(e_1, e_2\) are vectors of ones of appropriate dimensions.

The first term of the objective function of (5) (and analogously of (6)) is

\[
\left( A w^{(1)} + e_1 b^{(1)} \right)^T \left( A w^{(1)} + e_1 b^{(1)} \right)
= \left\| A w^{(1)} + e_1 b^{(1)} \right\|^2
= \sum_{i=1}^{m_1} \left( \sum_{j=1}^{n} A_{ij}\, w_j^{(1)} + b^{(1)} \right)^2
\]

which is (up to normalization by \(\|w^{(1)}\|\)) the sum of squared distances from the hyperplane to the points of one class, so minimizing it tends to keep the hyperplane close to one class, say class +1. The constraints keep the hyperplane at a distance of at least 1 from the other class, say class -1.

The error variable \(q\) measures the violation whenever the hyperplane is closer to the other class than this minimum distance of 1.

TWSVM thus consists of a pair of QPPs in which each objective function is determined by one class and the constraints of each problem are determined by the other class: in equation (5) the objective function is determined by class +1 while the constraints are determined by the patterns of class -1, and vice versa in equation (6), where the objective function is determined by class -1 and the constraints by class +1.

Solving the two smaller-sized QPPs yields hyperplanes such that the patterns of class +1 are clustered around the hyperplane

\[ x^T w^{(1)} + b^{(1)} = 0 \]

and the patterns of class -1 are clustered around the hyperplane

\[ x^T w^{(2)} + b^{(2)} = 0. \]

TWSVM is approximately four times faster than SVM because the complexity of SVM is of order \(m^3\), while TWSVM solves two QPPs, each over roughly \(m/2\) patterns and hence of complexity \((m/2)^3\), so the ratio of runtimes is approximately

\[ \frac{m^3}{2\,(m/2)^3} = \frac{m^3}{m^3/4} = 4. \]
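The pair (5)-(6) can be sketched numerically; this is a minimal illustration that solves each QPP with a general-purpose solver rather than the dual used in practice, and the toy data and parameter values are invented for the example:

```python
import numpy as np
from scipy.optimize import minimize

def twsvm_plane(P, Q, c):
    """Solve one QPP of the form (5): keep the plane close to the rows of P
    and at least at (un-normalized) distance 1 from the rows of Q."""
    m2, n = Q.shape[0], P.shape[1]
    def objective(z):                       # z = [w (n), b (1), q (m2)]
        w, b, q = z[:n], z[n], z[n + 1:]
        r = P @ w + b                       # P w + e1 b
        return 0.5 * r @ r + c * q.sum()
    constraints = [
        {"type": "ineq",                    # -(Q w + e2 b) + q >= e2
         "fun": lambda z: -(Q @ z[:n] + z[n]) + z[n + 1:] - 1.0},
        {"type": "ineq", "fun": lambda z: z[n + 1:]},   # q >= 0
    ]
    res = minimize(objective, np.zeros(n + 1 + m2),
                   constraints=constraints, method="SLSQP")
    return res.x[:n], res.x[n]

# Toy data: class +1 near the origin, class -1 near (4, 4).
A = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5]])   # class +1
B = np.array([[4.0, 4.0], [4.5, 4.0], [4.0, 4.5], [4.5, 4.5]])   # class -1

w1, b1 = twsvm_plane(A, B, c=1.0)   # plane hugging class +1, problem (5)
w2, b2 = twsvm_plane(B, A, c=1.0)   # plane hugging class -1, problem (6)

def predict(X):
    """Assign each row of X to the class of the nearer hyperplane."""
    d1 = np.abs(X @ w1 + b1) / (np.linalg.norm(w1) + 1e-12)
    d2 = np.abs(X @ w2 + b2) / (np.linalg.norm(w2) + 1e-12)
    return np.where(d1 <= d2, 1, -1)

print(predict(np.vstack([A, B])))   # +1 for rows of A, -1 for rows of B
```

Each call to `twsvm_plane` only sees the other class in its constraints, mirroring how (5) and (6) split the data between objective and constraints.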

5 Non-linear Twin Support Vector Machine:

To extend TWSVM to non-linear data, the following kernel-generated surfaces are used instead of planes:

\[
K(x^T, C^T)\, u^{(1)} + b^{(1)} = 0
\qquad \text{and} \qquad
K(x^T, C^T)\, u^{(2)} + b^{(2)} = 0
\qquad (7)
\]

where

\[
C^T = [A \;\; B]^T, \qquad
u = -\left( H^T H \right)^{-1} G^T \alpha, \qquad
H = [K(A, C^T) \;\; e_1], \quad
G = [K(B, C^T) \;\; e_2],
\]

\(\alpha\) is a vector of Lagrange multipliers, and \(K\) is an appropriately chosen kernel.
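The kernel matrix \(K(A, C^T)\) can be formed with, for example, a Gaussian (RBF) kernel; the kernel choice, the width parameter, and the data below are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(X, C, gamma=0.5):
    """K[i, j] = exp(-gamma * ||X_i - C_j||^2); rows of X and C are patterns."""
    # Squared pairwise distances via ||x||^2 + ||c||^2 - 2 <x, c>.
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(C ** 2, axis=1)[None, :]
          - 2.0 * X @ C.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

A = np.array([[0.0, 0.0], [1.0, 0.0]])   # class +1 patterns
B = np.array([[3.0, 3.0]])               # class -1 patterns
C = np.vstack([A, B])                    # C^T = [A B]^T stacks both classes
K = rbf_kernel(A, C)
print(K.shape)  # (2, 3): one row per pattern of A, one column per row of C
```

Replacing the raw patterns with kernel values of the form K(x, Cᵀ) is what turns the linear hyperplanes into the kernel-generated surfaces of (7).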

so the following optimization problems are constructed:

\[
\begin{aligned}
\min_{u^{(1)}, b^{(1)}, q} \quad & \tfrac{1}{2} \left\| K(A, C^T)\, u^{(1)} + e_1 b^{(1)} \right\|^2 + c_1 e_2^T q \\
\text{subject to:} \quad & -\left( K(B, C^T)\, u^{(1)} + e_2 b^{(1)} \right) + q \ge e_2, \quad q \ge 0
\end{aligned}
\qquad (8)
\]

and

\[
\begin{aligned}
\min_{u^{(2)}, b^{(2)}, q} \quad & \tfrac{1}{2} \left\| K(B, C^T)\, u^{(2)} + e_2 b^{(2)} \right\|^2 + c_2 e_1^T q \\
\text{subject to:} \quad & -\left( K(A, C^T)\, u^{(2)} + e_1 b^{(2)} \right) + q \ge e_1, \quad q \ge 0
\end{aligned}
\qquad (9)
\]


6 Practical results:

TWSVM was implemented using MATLAB 7, running on a PC with an Intel P4 processor (3 GHz) and 1 GB RAM. The methods were evaluated on data sets from the UCI Machine Learning Repository, and the following results were obtained.

Table 1: time in seconds


7 Conclusion:

TWSVM is an effective classifier for large datasets, with high accuracy and a smaller training time than SVM. It still needs further improvement, such as extending the idea to multi-class classification. It also has an advantage on unbalanced data sets, such as medical databases, where the number of patterns in one class is much greater than in the other.

8 References:

[1] Jayadeva, R. Khemchandani, and S. Chandra, "Twin Support Vector Machines for Pattern Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 5, May 2007.

[2] S. Ding, J. Yu, B. Qi, and H. Huang, "An overview on twin support vector machines," Springer, 13 March 2012.
