Lecture 20: Support Vector Machines

Artificial Intelligence and Robotics

16 Oct 2013

Outline

- Discriminative learning of classifiers.
- Learning a decision boundary.
- Issue: generalization.
- Linear Support Vector Machine (SVM) classifier.
- Margin and generalization.
- Training of linear SVM.
Linear Classification

- Binary classification problem: we assign labels $y \in \{-1, +1\}$ to input data $x \in \mathbb{R}^d$.
- Linear classifier: $f(x) = \mathrm{sign}(w^T x + b)$, and its decision surface is a hyperplane defined by $w^T x + b = 0$.
- Linearly separable: we can find a linear classifier so that all the training examples are classified correctly, i.e.
  $$y_i (w^T x_i + b) > 0, \quad i = 1, \dots, n$$
  (a short NumPy sketch of these definitions follows below).
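The slides contain no code, but the definitions above fit in a few lines of NumPy. This is a minimal sketch; the data, weights, and function names are illustrative assumptions, not from the lecture.

```python
import numpy as np

def predict(w, b, X):
    """Linear classifier: sign(w^T x + b) for each row x of X."""
    return np.sign(X @ w + b)

def is_linearly_separated(w, b, X, y):
    """True iff y_i (w^T x_i + b) > 0 for every training example."""
    return bool(np.all(y * (X @ w + b) > 0))

# Toy 2-D data, two points per class (made-up values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = np.array([1.0, 1.0]), 0.0
print(predict(w, b, X))                   # [ 1.  1. -1. -1.]
print(is_linearly_separated(w, b, X, y))  # True
```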
Perceptrons

- Find a line that separates the input patterns so that the output is $+1$ on one side and $-1$ on the other, and these match the target values $y_i \in \{-1, +1\}$. Equivalently, for every training example $i$:
  $$y_i (w^T x_i + b) > 0.$$
- We can adjust the weights $(w, b)$ by the Perceptron learning rule (see the sketch below), which is guaranteed to converge to a correct solution in the linearly separable case.
- Problem: which solution will have the best generalization?
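A minimal sketch of the Perceptron learning rule mentioned above, assuming NumPy; the epoch cap and function name are my own additions, not from the lecture.

```python
import numpy as np

def perceptron_train(X, y, max_epochs=100):
    """Perceptron learning rule: for each misclassified example
    (y_i (w^T x_i + b) <= 0), update w += y_i x_i and b += y_i.
    In the linearly separable case this converges to a separator."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # wrong side (or on the boundary)
                w += yi * xi
                b += yi
                errors += 1
        if errors == 0:                 # a full clean pass: done
            break
    return w, b
```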
Geometrical View of Linear Classifiers

- Margin: the minimal gap between the classes and the decision boundary.
- Answer: the linear decision surface with the maximal margin.
Geometric Margin

- Some vector algebra:

  [Figure: a hyperplane L with unit normal w*, a point x0 on L, and a point x off L.]

- For any two points $x_1$ and $x_2$ lying in $L$, we have $w^T (x_1 - x_2) = 0$, which implies that $w^* = w / \|w\|$ is the unit vector normal to the surface of $L$.
- For any point $x_0$ in $L$, $w^T x_0 = -b$.
- The signed distance of $x$ to $L$ is given by
  $$w^{*T} (x - x_0) = \frac{w^T x + b}{\|w\|}.$$
- Geometric margin of $(x_i, y_i)$ w.r.t. $L$:
  $$\gamma_i = \frac{y_i (w^T x_i + b)}{\|w\|}.$$
- Geometric margin of $\{(x_i, y_i)\}_{i=1}^{n}$ w.r.t. $L$ (a small numeric sketch follows):
  $$\gamma = \min_i \gamma_i.$$
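The definitions above translate directly into code. A small sketch on assumed toy data: it computes the per-example margins and the dataset margin, and checks that the geometric margin is invariant to positive rescaling of $(w, b)$.

```python
import numpy as np

def geometric_margin(w, b, X, y):
    """Per-example margins gamma_i = y_i (w^T x_i + b) / ||w||
    and the dataset margin gamma = min_i gamma_i."""
    gammas = y * (X @ w + b) / np.linalg.norm(w)
    return gammas, gammas.min()

X = np.array([[2.0, 2.0], [-1.0, -1.0]])
y = np.array([1, -1])
_, g1 = geometric_margin(np.array([1.0, 1.0]), 0.0, X, y)
_, g2 = geometric_margin(np.array([10.0, 10.0]), 0.0, X, y)
print(g1, g2)  # both sqrt(2): scaling (w, b) by 10 changes nothing
```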
Linear SVM Classifier

- Linear SVM maximizes the geometric margin of the training data set:
  $$\max_{w, b} \; \gamma \tag{1}$$
  $$\text{s.t.} \quad \frac{y_i (w^T x_i + b)}{\|w\|} \ge \gamma, \quad i = 1, \dots, n.$$
- For any solution satisfying the constraints, any positively scaled multiple satisfies them too. So, arbitrarily setting $\|w\| \gamma = 1$, we can formulate the linear SVM as (minimizing $\|w\|$ is equivalent to minimizing $\frac{1}{2} \|w\|^2$):
  $$\min_{w, b} \; \frac{1}{2} \|w\|^2 \tag{2}$$
  $$\text{s.t.} \quad y_i (w^T x_i + b) \ge 1, \quad i = 1, \dots, n.$$
- With this setting, we define a margin around the linear decision boundary with thickness $2 / \|w\|$ (a solver sketch follows this slide).
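Problem (2) is a small convex quadratic program, so a generic QP solver suffices. The sketch below uses the cvxpy library on made-up toy data; neither the library choice nor the data comes from the lecture.

```python
import cvxpy as cp
import numpy as np

# Toy linearly separable data (illustrative values only).
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

n, d = X.shape
w = cp.Variable(d)
b = cp.Variable()

# Hard-margin primal SVM, problem (2):
#   minimize (1/2)||w||^2  s.t.  y_i (w^T x_i + b) >= 1 for all i.
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w)),
                  [cp.multiply(y, X @ w + b) >= 1])
prob.solve()

print("w =", w.value, " b =", b.value,
      " margin =", 1 / np.linalg.norm(w.value))
```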
Solution to Linear SVM

- We can convert the constrained minimization to an unconstrained optimization problem by representing the constraints as penalty terms:
  $$\min_{w, b} \; \frac{1}{2} \|w\|^2 + \text{penalty terms}.$$
- For data $(x_i, y_i)$, use the following penalty term:
  $$\max_{\alpha_i \ge 0} \; \alpha_i \left[ 1 - y_i (w^T x_i + b) \right] = \begin{cases} 0 & \text{if } y_i (w^T x_i + b) \ge 1, \\ \infty & \text{otherwise.} \end{cases}$$
- Rewrite the minimization problem:
  $$\min_{w, b} \; \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \max_{\alpha_i \ge 0} \; \alpha_i \left[ 1 - y_i (w^T x_i + b) \right]$$
  $$= \min_{w, b} \; \max_{\alpha \ge 0} \; \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^T x_i + b) \right], \tag{3}$$
  where the $\alpha_i$'s are called the Lagrange multipliers.
Solution to Linear SVM (cont'd)

- We can swap 'max' and 'min':
  $$\max_{\alpha \ge 0} \; \min_{w, b} \; \frac{1}{2} \|w\|^2 + \sum_{i=1}^{n} \alpha_i \left[ 1 - y_i (w^T x_i + b) \right] = \max_{\alpha \ge 0} \; \min_{w, b} \; J(w, b; \alpha) \tag{4}$$
  (the problem is a convex quadratic program, so strong duality holds and the swap does not change the optimal value).
- We first minimize $J(w, b; \alpha)$ w.r.t. $(w, b)$ for any fixed setting of the Lagrange multipliers:
  $$\frac{\partial}{\partial w} J(w, b; \alpha) = w - \sum_{i=1}^{n} \alpha_i y_i x_i = 0, \tag{5}$$
  $$\frac{\partial}{\partial b} J(w, b; \alpha) = -\sum_{i=1}^{n} \alpha_i y_i = 0. \tag{6}$$
Solution to Linear SVM (cont'd)

- Substitute (5) and (6) back into $J(w, b; \alpha)$:
  $$\max_{\alpha \ge 0} \; \min_{w, b} \; J(w, b; \alpha) = \max_{\alpha \ge 0} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j \tag{7}$$
  $$\text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad \alpha_i \ge 0, \; i = 1, \dots, n.$$
- Finally, we transform the original linear SVM training into the quadratic programming problem (7), which has a unique optimal solution.
- We can find the optimal setting of the Lagrange multipliers $\hat{\alpha}_i$, and then solve for the optimal weights $(\hat{w}, \hat{b})$ (a worked sketch follows).
- Essentially, we transform the primal problem into its dual form. Why should we do this?
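As a concrete companion to (5)-(7), here is a sketch that solves the dual with cvxpy (again a tooling assumption, not something the lecture specifies) and recovers $\hat{w}$ via (5) and $\hat{b}$ from a support vector, using the identity $\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j = \frac{1}{2} \|\sum_i \alpha_i y_i x_i\|^2$.

```python
import cvxpy as cp
import numpy as np

def train_dual_svm(X, y):
    """Solve the dual QP (7):
        max  sum_i a_i - (1/2) sum_{i,j} a_i a_j y_i y_j x_i^T x_j
        s.t. a_i >= 0,  sum_i a_i y_i = 0,
    then recover w from (5) and b from a support vector."""
    n = X.shape[0]
    a = cp.Variable(n)
    # The quadratic term is rewritten as (1/2)||sum_i a_i y_i x_i||^2.
    obj = cp.Maximize(cp.sum(a) - 0.5 * cp.sum_squares(cp.multiply(a, y) @ X))
    cp.Problem(obj, [a >= 0, y @ a == 0]).solve()
    alpha = a.value
    w = (alpha * y) @ X          # w_hat = sum_i alpha_i y_i x_i, eq. (5)
    s = int(np.argmax(alpha))    # a support vector: the largest alpha_i
    b = y[s] - X[s] @ w          # from y_s (w^T x_s + b) = 1, since y_s = +/-1
    return w, b, alpha
```

Applied to the toy data from the primal sketch above, this should recover the same $(\hat{w}, \hat{b})$.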
Summary of Linear SVM

- Binary and linearly separable classification.
- Linear classifier with maximal margin.
- Training SVM by maximizing
  $$\sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j$$
  subject to $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$.
- Weights $\hat{w} = \sum_{i=1}^{n} \hat{\alpha}_i y_i x_i$.
- Only a small subset of the $\hat{\alpha}_i$'s will be nonzero, and the corresponding data points $x_i$ are called the support vectors.
- Prediction on a new example $x$ is the sign of
  $$\hat{b} + x^T \hat{w} = \hat{b} + \sum_{i \in \mathrm{SV}} \hat{\alpha}_i y_i (x_i^T x).$$
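In practice one rarely solves this QP by hand. As one option (not mentioned in the lecture), scikit-learn's SVC with a linear kernel exposes exactly the quantities in this summary; a large C approximates the hard-margin setting used here.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin

print(clf.support_vectors_)       # the x_i with nonzero alpha_i
print(clf.dual_coef_)             # the products alpha_i * y_i
print(clf.coef_, clf.intercept_)  # w_hat and b_hat

# Prediction = sign(b_hat + sum_{i in SV} alpha_i y_i (x_i^T x)).
print(clf.predict(np.array([[1.0, 1.0]])))  # -> array([1])
```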