Jun Wang Adam Woznica Alexandros Kalousis

University of Geneva

University of Applied Sciences Western Switzerland

Outline

Metric learning
Related works
Our approach
Experiments
Conclusion


Computing the distance between instances is a fundamental problem in machine learning.







[Figure: example instance pairs labeled "small distance" and "large distance"]

A standard distance metric often doesn't have such discriminative power.




Metric Learning


Metric learning learns a distance function to reflect the given supervised information.

The most popular metric learning approach learns a Mahalanobis distance metric.



The given supervised information is often represented as constraints involved in the learning process:

a) Large margin triplets: a same-class target neighbor should be closer than any different-class instance by a margin.

b) (Dis-)similar pairs: similar pairs should lie within a small distance, dissimilar pairs beyond a large one.
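To make the Mahalanobis metric and the two constraint types concrete, here is a small Python sketch (illustrative only; the matrix M, the unit margin, and the thresholds u and l are placeholder choices, not values from this work):

import numpy as np

def mahalanobis_sq(x, y, M):
    # Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y),
    # with M a symmetric positive semi-definite matrix.
    d = x - y
    return float(d @ M @ d)

M = np.eye(2)                      # Euclidean metric as a starting point
xi = np.array([0.0, 0.0])          # anchor instance
xj = np.array([1.0, 0.0])          # same-class instance
xk = np.array([0.0, 3.0])          # different-class instance

# a) Large margin triplet: xj should be closer to xi than xk is, by a unit margin.
triplet_satisfied = mahalanobis_sq(xi, xj, M) + 1 <= mahalanobis_sq(xi, xk, M)

# b) (Dis-)similar pairs: similar pairs within threshold u, dissimilar pairs beyond l.
u, l = 2.0, 5.0
pairs_satisfied = mahalanobis_sq(xi, xj, M) <= u and mahalanobis_sq(xi, xk, M) >= l

print(triplet_satisfied, pairs_satisfied)   # True True under this toy setup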


Metric Learning

From the given supervised information, through its representation as constraints, to the learned metric, three questions arise:

Which type of constraints to use?

How to extract these constraints?

What kind of objective to optimize for these constraints?


Related Works


Here we only focus on supervised metric learning for nearest neighbor classification.



Define target
neighbors
for each
instance by
k
same class nearest
neighbor in Euclidean
space


Minimize distance from each
point to its target neighbors .


Push away imposters.


Maintain a local margin between
target neighbors and impostors.


State of the art predictive
performance
.

Large Margin Metric Learning

8

Target neighbor

Large margin loss

R
egularization

Target neighbors are
predefined and are not
changed in the learning
process.

LMNN [Weinberger et al. 05],BoostMetric [Shen et al. 09], SML [Ying et al. 09
]
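For reference, the objective of LMNN (Weinberger et al. 05), the canonical instance of this family; the notation below is mine:

\[
\min_{M \succeq 0} \; \sum_{i} \sum_{j \rightsquigarrow i} d_M(x_i, x_j)
\; + \; c \sum_{i} \sum_{j \rightsquigarrow i} \sum_{l} (1 - y_{il}) \,
\big[\, 1 + d_M(x_i, x_j) - d_M(x_i, x_l) \,\big]_+
\]

where j ⇝ i means x_j is a predefined target neighbor of x_i, y_il = 1 iff x_i and x_l share a class, d_M(x, x') = (x − x')ᵀ M (x − x'), and [·]_+ is the hinge. The first term pulls target neighbors close, the hinge term pushes impostors beyond a unit margin, and variants such as BoostMetric and SML add explicit regularization on M.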

Information-Theoretic Metric Learning

The Mahalanobis metric is related to the inverse covariance matrix of a multivariate Gaussian distribution.

Metric learning is formulated as minimizing the differential relative entropy between two multivariate Gaussians, one induced by the learned metric and one by a prior metric, under pairwise constraints.

Efficient learning algorithm.

As with LMNN, the pairwise constraints are randomly predefined and not changed in the learning process.

ITML [Davis et al. 07], Bk-means [Wu et al. 09]
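For concreteness, the usual ITML formulation (Davis et al. 07; notation mine) minimizes the LogDet divergence to a prior metric M_0, which equals, up to constants, the differential relative entropy between the corresponding Gaussians:

\[
\min_{M \succeq 0} \; D_{\ell d}(M, M_0)
\quad \text{s.t.} \quad d_M(x_i, x_j) \le u \;\; \text{for similar pairs}, \quad
d_M(x_i, x_j) \ge l \;\; \text{for dissimilar pairs}
\]

with D_ld(M, M_0) = tr(M M_0^{-1}) − log det(M M_0^{-1}) − d.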

Stochastic Metric Learning

Stochastic nearest neighbor: each point picks its nearest neighbor with a probability that decays with the learned distance (see below).

NCA: minimize the leave-one-out nearest neighbor error under this stochastic neighbor model.

MCML: collapse same-class samples into one point and push different-class samples far away from each other, using a KL divergence objective.

The stochastic nearest neighbor model is computationally expensive.

Target neighbors and impostors are learned implicitly via LOO error minimization.

NCA [Goldberger et al. 04], MCML [Globerson and Roweis 05]
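The stochastic nearest neighbor model used by NCA (Goldberger et al. 04; notation mine) selects neighbor j for point i with a softmax probability over the learned distances:

\[
p_{ij} = \frac{\exp\!\big(-d_M(x_i, x_j)\big)}{\sum_{k \ne i} \exp\!\big(-d_M(x_i, x_k)\big)}, \qquad p_{ii} = 0
\]

NCA maximizes the expected number of correctly classified points, \(\sum_i \sum_{j: y_j = y_i} p_{ij}\), i.e. it minimizes a soft leave-one-out nearest neighbor error. The normalization over all pairs is what makes this model expensive for large datasets.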

Motivation


Except for NCA, no method learns the target neighbors.

Target neighbors are crucial for metric learning.

Can we learn the target neighbors?

Yes.


Reformulation

Large Margin Metric Learning can be rewritten so that an indicator of the predefined target neighbor relation weights, for each same-class instance pair, the loss that the pair induces (see the sketch below).

This is a general formulation of many metric learning methods.
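A sketch of this general form, in my own notation (P for the target-neighbor indicator and F_ij for the pair loss; not necessarily the paper's exact symbols):

\[
\min_{M \succeq 0} \; \sum_{i,j} P_{ij} \, F_{ij}(M)
\]

where P_ij ∈ {0,1} indicates that x_j is a predefined target neighbor of x_i, and F_ij(M) is the loss induced by the same-class pair (x_i, x_j); for LMNN, F_ij(M) would be the pull term d_M(x_i, x_j) plus the hinge losses over x_i's impostors.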

Methodology

Learn the target neighbors together with the distance metric.

It makes sense to minimize over the target-neighbor assignment here, since each weighted term represents the loss induced by a same-class instance pair.

Minimizing over P favors local target neighbors.

The difference between the two neighborhood-size parameters makes it possible to assign fewer target neighbors to instances in sparse regions and more target neighbors to instances in dense regions.
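One plausible shape of the joint problem, under assumed constraints on P (a per-instance lower bound and a global budget are my guesses at the two parameters mentioned above; the paper's exact constraint set may differ):

\[
\min_{M \succeq 0, \, P} \; \sum_{i,j} P_{ij} \, F_{ij}(M)
\quad \text{s.t.} \quad P_{ij} \in \{0, 1\}, \;\;
P_{ij} = 0 \text{ if } y_i \ne y_j, \;\;
k_1 \le \sum_j P_{ij}, \;\;
\sum_{i,j} P_{ij} = K
\]

With a per-instance lower bound k_1 and a global budget K, instances in dense regions can absorb more of the budget, while instances in sparse regions keep only the minimum.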

Optimization


Alternating Optimization

Fixing the target neighbors P, learning the metric is a standard metric learning problem.

Fixing the metric, learning P is a linear programming problem with integer optimal solutions.

Proof: show that the constraint matrix is totally unimodular.

Complexity: the alternation often converges in 5-10 iterations (a sketch of the loop follows below).
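A rough, hypothetical sketch of the alternating scheme in Python (not the authors' code: learn_metric is a stand-in for any standard metric learner, and the P-step is written as a greedy per-instance choice instead of the linear program, purely to illustrate the idea):

import numpy as np

def pair_loss(M, X, i, j):
    # Illustrative same-class pair loss F_ij(M): squared Mahalanobis distance.
    d = X[i] - X[j]
    return float(d @ M @ d)

def learn_metric(X, y, P):
    # M-step placeholder: run any standard metric learner on the currently
    # selected target-neighbor pairs in P. Here it just returns the identity.
    return np.eye(X.shape[1])

def learn_neighbors(M, X, y, k):
    # P-step: with M fixed, give each instance its k cheapest same-class pairs.
    # (The paper solves a linear program whose optimum is integral; this greedy
    # per-row selection only illustrates the flavor of that step.)
    n = len(X)
    P = np.zeros((n, n), dtype=int)
    for i in range(n):
        candidates = [j for j in range(n) if j != i and y[j] == y[i]]
        best = sorted(candidates, key=lambda j: pair_loss(M, X, i, j))[:k]
        P[i, best] = 1
    return P

def alternate(X, y, k=3, iters=10):
    # Alternate the two steps; in practice a handful of iterations suffices.
    M = np.eye(X.shape[1])
    P = None
    for _ in range(iters):
        P = learn_neighbors(M, X, y, k)   # fix the metric, learn P
        M = learn_metric(X, y, P)         # fix P, learn the metric
    return M, P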


Experiments






Does learning neighborhoods improve the classification accuracy?

Setup


Comparison methods:

LMNN: predefined neighborhoods.

NCA: implicit neighborhood learning.

LN-LMNN: explicit neighborhood learning with tuning.

1-NN classifier.

Parameter setting:

LMNN: 3 predefined target neighbors for each instance.

LN-LMNN: 2-fold cross-validation to select the neighborhood parameters.

Examined Datasets


5 small and 7 large datasets.

On the small datasets, LN-LMNN achieves better accuracy than LMNN (NCA) in 4 (3) out of 5 datasets.

On the large datasets, LN-LMNN achieves better accuracy than LMNN in 6 out of 7 datasets.

NCA cannot be scaled up to these datasets.

More experimental results and comparison methods are described in the paper.

Conclusion and future work

In this work, we present a simple, general neighborhood learning method for metric learning.

Learning neighborhoods does improve the predictive performance.

Future work: a more theoretically motivated problem formulation.

Future work: learning neighborhoods in the semi-supervised setting, e.g. graph-based semi-supervised learning.

Thank you for your attention!