Support Vector Machine(SVM)

zoomzurich, Artificial Intelligence and Robotics

16 Oct 2013


Student: Jia-Hau Shiu
Advisor: Sheng-Jyh Wang
Outline

Linear Classifiers

Binary Classification

Perceptron Classifier

Support Vector Machine(SVM) Classifier

Normalization

Lagrange Theorem

Primal and Dual Formulations

Kernel Function
Binary Classification
Linear Separability
[Figure: two example datasets, one linearly separable and one not linearly separable]
Linear Classifiers

A linear classifier has the form $f(x) = w^T x + b$

In 2D the discriminant is a line

w is the normal to the line, and b is the bias
[Figure: axes X1 and X2; the line f(x) = 0 separates the region f(x) > 0 from the region f(x) < 0]
Linear Classifiers

A linear classifier has the form $f(x) = w^T x + b$

In 3D the discriminant is a plane

Training data is used to learn w

New data is then classified using w
[Figure: axes X1, X2, and X3; the plane f(x) = 0 and a new data point]
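As a minimal sketch of how a learned w is used on new data, the snippet below evaluates f(x) = w^T x + b and classifies by its sign. The values of `w`, `b`, and `new_points` are hypothetical and chosen only for illustration; assigning f(x) = 0 to the positive class is also an assumption.

```python
def decision_function(w, b, x):
    """Evaluate f(x) = w^T x + b for one point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 if f(x) >= 0, otherwise -1 (tie-breaking is an assumption)."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical weights, as if already learned from training data.
w = [1.0, -1.0]
b = 0.5

# Hypothetical new data points in 2D.
new_points = [(2.0, 0.0), (0.0, 3.0)]
predictions = [classify(w, b, x) for x in new_points]
```

Only `w` and `b` are needed at prediction time; the training data itself is not consulted.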
Perceptron Classifier
For example, in 2D
[Figure: two panels over axes X1 and X2, showing the weight vector w and a training point Xi before the update and after the update]

If the data is linearly separable, then the algorithm will converge

The separating line may end up close to the training data

We would prefer a larger margin
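The convergence behaviour can be sketched in a few lines. The update rule used here (add y_i * x_i to w whenever point i is misclassified) is the standard perceptron rule and is an assumption, since the slides as extracted do not state it; the toy data and the `max_epochs` cap are hypothetical.

```python
def perceptron_train(points, labels, max_epochs=100):
    """Standard perceptron: on each mistake, w += y * x and b += y."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass without mistakes: converged
            break
    return w, b


# Hypothetical linearly separable toy data in 2D, labels in {-1, +1}.
points = [(2.0, 2.0), (3.0, 1.0), (-1.0, -2.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
w, b = perceptron_train(points, labels)
```

The loop stops at the first separating line it finds, which is why that line can sit close to the training data rather than in the middle of the gap.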
What is the best w?
Maximum margin solution: most stable
Support Vector Machine
[Figure: separating hyperplane $w^T x + b = 0$ with normal vector w; two training points are marked as support vectors]
Normalization
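[Figure: separating hyperplane $w^T x + b = 0$ with normal vector w, flanked by the margin boundaries $w^T x + b = -1$ and $w^T x + b = 1$; a support vector is marked]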
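A short derivation shows why this normalization is convenient. It assumes $x_+$ and $x_-$ denote points lying on the two margin boundaries; these symbols are introduced for this sketch and do not appear in the slides.

```latex
\[
w^T x_+ + b = 1, \qquad w^T x_- + b = -1
\quad \Longrightarrow \quad w^T (x_+ - x_-) = 2
\]
\[
\text{margin} \;=\; \frac{w^T}{\|w\|} \, (x_+ - x_-) \;=\; \frac{2}{\|w\|}
\]
```

Under this scaling, maximizing the margin amounts to making $\|w\|$ as small as possible.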
Equivalent Equation

SVM can be formulated as :

Equivalently
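The two formulations can be sketched in standard notation. This assumes training points $x_i$ with labels $y_i \in \{-1, +1\}$ for $i = 1, \dots, N$ (notation introduced for this sketch) and the normalization $w^T x + b = \pm 1$ on the margin boundaries. The factor $\frac{1}{2}$ is a common convention and may differ from the original slide.

```latex
\[
\max_{w, b} \; \frac{2}{\|w\|}
\quad \text{subject to} \quad
\begin{cases}
w^T x_i + b \ge 1 & \text{if } y_i = +1 \\
w^T x_i + b \le -1 & \text{if } y_i = -1
\end{cases}
\qquad i = 1, \dots, N
\]
\[
\min_{w, b} \; \frac{1}{2} \|w\|^2
\quad \text{subject to} \quad
y_i \, (w^T x_i + b) \ge 1, \qquad i = 1, \dots, N
\]
```

The second form merges the two constraint cases into one by multiplying with the label $y_i$.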
Lagrange Theorem
Dual Problem
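A sketch of how the dual problem is usually obtained, assuming the hard-margin primal $\min_{w,b} \frac{1}{2}\|w\|^2$ subject to $y_i (w^T x_i + b) \ge 1$, with one Lagrange multiplier $\alpha_i \ge 0$ per constraint (the notation is an assumption, not taken from the slides):

```latex
\[
L(w, b, \alpha) = \frac{1}{2} \|w\|^2
- \sum_{i=1}^{N} \alpha_i \left[ y_i \, (w^T x_i + b) - 1 \right],
\qquad \alpha_i \ge 0
\]
\[
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0
\]
\[
\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i
- \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j \, y_i y_j \, x_i^T x_j
\quad \text{subject to} \quad
\alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i y_i = 0
\]
```

The training points enter the dual only through the inner products $x_i^T x_j$.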
Kernel function
Primal and dual formulations

Primal version of classifier:

Dual version of classifier:
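A minimal sketch of the two versions, assuming the usual forms $f(x) = w^T x + b$ for the primal classifier and $f(x) = \sum_i \alpha_i y_i (x_i^T x) + b$ for the dual one, linked by $w = \sum_i \alpha_i y_i x_i$. The support vectors, multipliers `alphas`, and query point are hypothetical toy values.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def primal_decision(w, b, x):
    """Primal classifier: f(x) = w^T x + b."""
    return dot(w, x) + b


def dual_decision(alphas, labels, support_vectors, b, x):
    """Dual classifier: f(x) = sum_i alpha_i * y_i * (x_i^T x) + b."""
    return sum(a * y * dot(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b


# Hypothetical toy problem: one support vector per class.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]  # satisfies sum_i alpha_i * y_i = 0
b = 0.0

# Recover w from the dual variables: w = sum_i alpha_i * y_i * x_i.
w = [sum(a * y * sv[k] for a, y, sv in zip(alphas, labels, support_vectors))
     for k in range(2)]

x_new = (2.0, 0.5)
f_primal = primal_decision(w, b, x_new)
f_dual = dual_decision(alphas, labels, support_vectors, b, x_new)
```

Both versions give the same value; the dual one never forms w explicitly and needs only inner products with the support vectors.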
Feature Space

The input space can be mapped to some higher-dimensional feature space by an appropriate function so that the training set becomes linearly separable:
$\Phi: x \to \Phi(x)$
[Figure: training points before and after the mapping $x \to \Phi(x)$]
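A small sketch of the idea, assuming a hypothetical feature map `phi(x) = (x, x^2)` and hypothetical 1D toy data. No single threshold on x separates the two classes, but a line in the 2D feature space does.

```python
def phi(x):
    """Hypothetical feature map from 1D input space to 2D feature space."""
    return (x, x * x)


# Toy 1D data: the outer points are positive, the inner points negative,
# so no single threshold on x separates the classes.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, -1, -1, 1]

# In feature space the horizontal line x^2 = 2.5 separates the classes,
# i.e. a linear classifier with w = (0, 1) and b = -2.5 (values chosen by hand).
w, b = (0.0, 1.0), -2.5
features = [phi(x) for x in xs]
predictions = [1 if w[0] * f[0] + w[1] * f[1] + b > 0 else -1 for f in features]
```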
Kernel function

A kernel function is a function that corresponds to an inner (dot) product in some feature space: $K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$

Linear kernel:

Polynomial kernel of order p:

Radial Basis Function (RBF) kernel:
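The three kernels can be sketched as plain functions. The exact parametrizations are assumptions based on common textbook forms: $x^T z$ for the linear kernel, $(1 + x^T z)^p$ for the polynomial kernel, and $\exp(-\|x - z\|^2 / (2\sigma^2))$ for the RBF kernel. The original slide may have used a different variant.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = x^T z."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, p=2):
    """K(x, z) = (1 + x^T z)^p (one common form; an assumption here)."""
    return (1.0 + linear_kernel(x, z)) ** p


def rbf_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2)) (one common form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical example points.
x, z = (1.0, 2.0), (3.0, 4.0)
k_lin = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z, p=2)
k_rbf = rbf_kernel(x, z, sigma=1.0)
```

Each function returns a single number that plays the role of the inner product $x_i^T x_j$ in the dual formulation.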
Polynomial Kernel Example
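One such example, as a sketch: assume 2D inputs $x = (x_1, x_2)$ and $z = (z_1, z_2)$ and the homogeneous polynomial kernel of order 2, $K(x, z) = (x^T z)^2$. The choice of kernel and of the feature map $\Phi$ below is an assumption made for illustration.

```latex
\[
K(x, z) = (x^T z)^2 = (x_1 z_1 + x_2 z_2)^2
= x_1^2 z_1^2 + 2 \, x_1 x_2 \, z_1 z_2 + x_2^2 z_2^2
\]
\[
\Phi(x) = \left( x_1^2, \; \sqrt{2} \, x_1 x_2, \; x_2^2 \right)
\quad \Longrightarrow \quad
\Phi(x)^T \Phi(z) = x_1^2 z_1^2 + 2 \, x_1 x_2 \, z_1 z_2 + x_2^2 z_2^2 = K(x, z)
\]
```

The kernel therefore computes an inner product in a 3D feature space without ever constructing $\Phi(x)$.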
What is the best w?
Maximum margin solution: most stable
“Soft” margin solution

Every constraint can be satisfied if the slack variable $\xi_i$ is sufficiently large

C is a regularization parameter

Small C allows constraints to be easily ignored => large margin

Large C makes constraints hard to ignore => narrow margin

$C = \infty$ => hard margin
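The role of C can be read off the usual soft-margin problem, sketched here under the assumption of labels $y_i \in \{-1, +1\}$ and one slack variable $\xi_i \ge 0$ per training point (the factor $\frac{1}{2}$ is a common convention, not taken from the slides):

```latex
\[
\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{subject to} \quad
y_i \, (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \qquad i = 1, \dots, N
\]
```

C weights the total slack against the margin term: as C grows, any nonzero $\xi_i$ becomes more expensive, and in the limit the constraints $y_i (w^T x_i + b) \ge 1$ must all hold.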
Loss function
[Figure: plot of the “hinge” loss]
Convex function for “hinge” loss
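A minimal sketch of the hinge loss, assuming the usual definition $\max(0, 1 - y f(x))$ with labels $y \in \{-1, +1\}$, together with an objective of the form $\frac{1}{2}\|w\|^2 + C \sum_i \text{hinge}_i$. The function names and toy data are hypothetical.

```python
def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * f(x)) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def soft_margin_objective(w, b, C, points, labels):
    """0.5 * ||w||^2 + C * sum of hinge losses over the training set."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    total_loss = sum(
        hinge_loss(y, sum(wi * xi for wi, xi in zip(w, x)) + b)
        for x, y in zip(points, labels)
    )
    return regularizer + C * total_loss


# Hypothetical toy data: the second point lies inside the margin.
points = [(2.0, 0.0), (0.5, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1]
objective = soft_margin_objective([1.0, 0.0], 0.0, 2.0, points, labels)
```

Points classified correctly with $y f(x) \ge 1$ contribute nothing; points inside the margin or on the wrong side are penalized linearly.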
Reference

http://en.wikipedia.org/wiki/Support_vector_machine (Wikipedia)

Christopher J. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2:121-167, 1998

www.kernel-machines.org (general information and collection of research papers)

www.support-vector-machines.org (Literature, Review, Software, Links related to Support Vector Machines, Academic Site)

Animation clip: SVM with polynomial kernel visualization.

A very basic SVM tutorial for complete beginners by Tristan Fletcher.

libsvm: a library of SVMs which is actively patched

http://www.cs.caltech.edu/courses/cs253/slides/cs253-14-GPs.pdf (loss function)

http://www.stanford.edu/class/cs229/notes/cs229-notes3.pdf (lecture notes for SVM, same author as the Stanford course video)
Reference: Video

videolectures.net (SVM-related video lectures)

http://academicearth.org/courses/machine-learning (machine-learning, the course is offered by Stanford)