# Lecture 20: Support Vector Machines

Artificial Intelligence and Robotics (Τεχνητή Νοημοσύνη και Ρομποτική)

16 Oct 2013

## Outline

- Discriminative learning of classifiers.
- Issue: generalization.
- Linear Support Vector Machine (SVM) classifier.
- Margin and generalization.
- Training of linear SVM.
## Linear Classification

- Binary classification problem: we assign labels $y \in \{-1, +1\}$ to input data $x$.
- Linear classifier: $f(x) = \operatorname{sign}(w^\top x + b)$, and its decision surface is a hyperplane defined by $w^\top x + b = 0$.
- Linearly separable: we can find a linear classifier so that all the training examples are classified correctly, i.e.
  $$y_i (w^\top x_i + b) > 0, \quad i = 1, \dots, n.$$
## Perceptrons

- Find a line that separates the input patterns so that the output is $+1$ on one side and $-1$ on the other, and these outputs match the target values $y_i \in \{-1, +1\}$.
- Equivalent rewrite: for every training example $i$, require $y_i (w^\top x_i + b) > 0$. Such a $(w, b)$ can be found by the Perceptron learning rule, which is guaranteed to converge to a correct solution in the linearly separable case.
- Problem: which solution will have the best generalization?
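As a concrete illustration of the rule just described, here is a minimal Perceptron trainer in plain Python. The data set is hypothetical, chosen to be linearly separable so the convergence guarantee applies:

```python
# Perceptron learning rule on a toy linearly separable data set.
# On a mistake (y_i (w.x_i + b) <= 0), update w += y_i * x_i and b += y_i.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_perceptron(X, y, max_epochs=100):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) <= 0:   # misclassified (or on the boundary)
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:                     # converged: every example correct
            break
    return w, b

# Hypothetical separable data: class +1 up-right, class -1 down-left.
X = [(2.0, 2.0), (3.0, 1.5), (-1.0, -1.0), (-2.0, -0.5)]
y = [+1, +1, -1, -1]
w, b = train_perceptron(X, y)
assert all(yi * (dot(w, xi) + b) > 0 for xi, yi in zip(X, y))
```

Note that the rule stops at *some* separating hyperplane, not a particularly good one, which is exactly the generalization concern raised above.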
## Geometrical View of Linear Classifiers

- Margin: the minimal gap between the classes and the decision boundary.
## Geometric Margin

- Some vector algebra (figure: a hyperplane $L$ with unit normal $w^*$, a point $x_0$ on $L$, and an arbitrary point $x$):
- For any two points $x_1$ and $x_2$ lying in $L$, we have $w^\top (x_1 - x_2) = 0$, which implies that $w^* = w / \|w\|$ is the unit vector normal to the surface of $L$.
- For any point $x_0$ in $L$, $w^\top x_0 = -b$.
- The signed distance of $x$ to $L$ is given by
  $$w^{*\top} (x - x_0) = \frac{w^\top x + b}{\|w\|}.$$
- Geometric margin of $(x_i, y_i)$ w.r.t. $L$:
  $$\gamma_i = \frac{y_i (w^\top x_i + b)}{\|w\|}.$$
- Geometric margin of $\{(x_i, y_i)\}_{i=1}^{n}$ w.r.t. $L$: $\gamma = \min_i \gamma_i$.
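The signed-distance and margin formulas above can be checked numerically. The hyperplane and points below are hypothetical numbers picked so the arithmetic is easy to follow:

```python
import math

# Signed distance of x to the hyperplane w.x + b = 0, and the geometric
# margin of a labelled set, following the formulas above.

def signed_distance(w, b, x):
    norm = math.sqrt(sum(wj * wj for wj in w))
    return (sum(wj * xj for wj, xj in zip(w, x)) + b) / norm

def geometric_margin(w, b, X, y):
    # min over examples of y_i * (w.x_i + b) / ||w||
    return min(yi * signed_distance(w, b, xi) for xi, yi in zip(X, y))

w, b = [3.0, 4.0], -5.0                      # ||w|| = 5
print(signed_distance(w, b, [3.0, 4.0]))     # (9 + 16 - 5) / 5 = 4.0
X = [(3.0, 4.0), (0.0, 0.0)]
y = [+1, -1]
print(geometric_margin(w, b, X, y))          # min(4.0, 1.0) = 1.0
```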
## Linear SVM Classifier

- Linear SVM maximizes the geometric margin of the training data set:
  $$\max_{w, b} \; \gamma \qquad (1)$$
  $$\text{s.t.} \quad \frac{y_i (w^\top x_i + b)}{\|w\|} \ge \gamma, \quad i = 1, \dots, n.$$
- For any solution satisfying the constraints, any positively scaled multiple satisfies them too. So, arbitrarily setting $\|w\| = 1/\gamma$, we can formulate linear SVM as (maximizing $\gamma$ then amounts to minimizing $\|w\|^2$):
  $$\min_{w, b} \; \frac{1}{2} \|w\|^2 \qquad (2)$$
  $$\text{s.t.} \quad y_i (w^\top x_i + b) \ge 1, \quad i = 1, \dots, n.$$
- With this setting, we define a margin around the linear decision boundary with thickness $2/\|w\|$ (a width of $1/\|w\|$ on each side).
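The normalization step can be illustrated directly: scaling $(w, b)$ so the tightest constraint holds with equality at $1$ leaves the decision boundary unchanged, and the geometric margin then equals $1/\|w\|$. A small sketch with made-up numbers:

```python
import math

# Scale invariance behind formulation (2): rescale (w, b) so that
# min_i y_i (w.x_i + b) = 1; the boundary is unchanged and the
# geometric margin becomes exactly 1 / ||w||.

def functional_margin(w, b, X, y):
    return min(yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
               for xi, yi in zip(X, y))

# Hypothetical separating hyperplane and data.
w, b = [2.0, 0.0], 0.0
X = [(3.0, 1.0), (-1.0, 2.0)]
y = [+1, -1]

m = functional_margin(w, b, X, y)        # min(6.0, 2.0) = 2.0
w_s = [wj / m for wj in w]               # rescaled weights
b_s = b / m
assert abs(functional_margin(w_s, b_s, X, y) - 1.0) < 1e-12

# Geometric margin is now exactly 1 / ||w_s||:
norm = math.sqrt(sum(v * v for v in w_s))
print(1.0 / norm)                        # 1.0: closest point is 1 unit away
```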
## Solution to Linear SVM

- We can convert the constrained minimization to an unconstrained optimization problem by representing the constraints as penalty terms:
  $$\min_{w, b} \; \frac{1}{2} \|w\|^2 + \text{penalty term}.$$
- For data $(x_i, y_i)$, use the following penalty term:
  $$\max_{\alpha_i \ge 0} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] =
  \begin{cases} 0 & \text{if } y_i (w^\top x_i + b) \ge 1, \\ \infty & \text{otherwise.} \end{cases}$$
- Rewrite the minimization problem:
  $$\min_{w, b} \left\{ \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \max_{\alpha_i \ge 0} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] \right\} \qquad (3)$$
  $$= \min_{w, b} \max_{\alpha \ge 0} \left\{ \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] \right\},$$
  where the $\alpha_i$'s are called the Lagrange multipliers.
## Solution to Linear SVM (cont'd)

- We can swap 'max' and 'min':
  $$\min_{w, b} \max_{\alpha \ge 0} \left\{ \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] \right\}$$
  $$\ge \max_{\alpha \ge 0} \min_{w, b} \left\{ \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] \right\} = \max_{\alpha \ge 0} \min_{w, b} J(w, b; \alpha). \qquad (4)$$
- We first minimize $J(w, b; \alpha)$ w.r.t. $(w, b)$ for any fixed setting of the Lagrange multipliers:
  $$\frac{\partial}{\partial w} J(w, b; \alpha) = w - \sum_{i=1}^{n} \alpha_i y_i x_i = 0, \qquad (5)$$
  $$\frac{\partial}{\partial b} J(w, b; \alpha) = -\sum_{i=1}^{n} \alpha_i y_i = 0. \qquad (6)$$
## Solution to Linear SVM (cont'd)

- Substitute (5) and (6) back into $J(w, b; \alpha)$:
  $$\max_{\alpha \ge 0} \min_{w, b} \left\{ \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^\top x_i + b) \right] \right\}
  = \max_{\alpha \ge 0} \left\{ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j \right\} \qquad (7)$$
  subject to $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$.
- This is a quadratic programming problem (7), which has a unique optimal solution.
- We can find the optimal setting of the Lagrange multipliers $\hat{\alpha}_i$, then solve for the optimal weights $(\hat{w}, \hat{b})$.
- Essentially, we transform the primal problem to its dual form. Why should we do this?
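One way to see the dual at work is a two-point problem, where the constraint $\sum_i \alpha_i y_i = 0$ forces $\alpha_1 = \alpha_2$ and (7) collapses to a one-dimensional maximization solvable by brute force. A sketch with hypothetical data:

```python
# Dual (7) on a tiny two-point problem: x1 = (1,0) with y1 = +1 and
# x2 = (-1,0) with y2 = -1. The equality constraint forces
# alpha_1 = alpha_2 = a, so we grid-search over a single scalar a >= 0.

X = [(1.0, 0.0), (-1.0, 0.0)]
y = [+1, -1]

def dual_objective(alpha):
    # sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j (x_i . x_j)
    n = len(X)
    lin = sum(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] *
               sum(a * b for a, b in zip(X[i], X[j]))
               for i in range(n) for j in range(n))
    return lin - 0.5 * quad

best_a = max((a / 1000.0 for a in range(2001)),
             key=lambda a: dual_objective([a, a]))
alpha = [best_a, best_a]                       # maximum at a = 0.5

# Recover w = sum_i alpha_i y_i x_i, and b from an active (support-vector)
# constraint y_i (w.x_i + b) = 1, i.e. b = y_i - w.x_i (since y_i = +/-1).
w = [sum(alpha[i] * y[i] * X[i][d] for i in range(len(X))) for d in range(2)]
b = y[0] - sum(wd * xd for wd, xd in zip(w, X[0]))
print(w, b)                                    # [1.0, 0.0] 0.0
```

Both points end up with nonzero multipliers, so both are support vectors, and the recovered boundary $x_1 = 0$ sits midway between them, as the maximum-margin picture predicts.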
## Summary of Linear SVM

- Binary and linearly separable classification.
- Linear classifier with maximal margin.
- Training SVM by maximizing
  $$\sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j$$
  subject to $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$.
- Weights $\hat{w} = \sum_{i=1}^{n} \hat{\alpha}_i y_i x_i$.
- Only a small subset of the $\alpha_i$'s will be nonzero, and the corresponding data points $x_i$ are called support vectors.
- Prediction on a new example $x$ is the sign of
  $$\hat{b} + x^\top \hat{w} = \hat{b} + x^\top \sum_{i \in \mathrm{SV}} \hat{\alpha}_i y_i x_i = \hat{b} + \sum_{i \in \mathrm{SV}} \hat{\alpha}_i y_i \, (x^\top x_i).$$
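The support-vector form of the prediction rule can be sketched as follows; the support vectors, multipliers, and bias are hypothetical values taken from a solved toy problem (support vectors at $(\pm 1, 0)$ with $\hat{\alpha}_i = 0.5$, $\hat{b} = 0$):

```python
# Prediction using only the support vectors, per the last bullet:
# sign(b + sum_{i in SV} alpha_i y_i (x . x_i)).

sv_x = [(1.0, 0.0), (-1.0, 0.0)]   # support vectors (hypothetical)
sv_y = [+1, -1]                    # their labels
sv_alpha = [0.5, 0.5]              # their Lagrange multipliers
b = 0.0                            # bias

def predict(x):
    score = b + sum(a * yi * sum(p * q for p, q in zip(x, xi))
                    for a, yi, xi in zip(sv_alpha, sv_y, sv_x))
    return 1 if score >= 0 else -1

print(predict((2.0, 3.0)))     # 1
print(predict((-0.5, 1.0)))    # -1
```

Only the support vectors appear in the sum, which is why the rest of the training set can be discarded after training.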