Jun Wang, Adam Woznica, Alexandros Kalousis
University of Geneva
University of Applied Sciences Western Switzerland
Outline
Metric learning
Related works
Our approach
Experiments
Conclusion
Computing the distance between instances is a fundamental problem in machine learning.
[Figure: instance pairs illustrating small vs. large distances]
The standard distance metric often does not have such discriminative power.
Metric Learning
Metric learning learns a distance function that reflects the given supervised information. The most popular approach learns a Mahalanobis distance metric, $d_M(x_i, x_j) = \sqrt{(x_i - x_j)^\top M (x_i - x_j)}$ with $M \succeq 0$. The given information is often represented as constraints that enter the learning process:
a) Large margin triplets: $d_M(x_i, x_j) + \gamma \le d_M(x_i, x_k)$, where $x_j$ is similar and $x_k$ dissimilar to $x_i$.
b) (Dis)similar pairs: $d_M(x_i, x_j) \le u$ for similar pairs, $d_M(x_i, x_j) \ge l$ for dissimilar pairs.
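Concretely, the Mahalanobis distance can be sketched in a few lines. This is a minimal illustration; the matrix `M` below is a hand-picked example, not a learned metric:

```python
import numpy as np

def mahalanobis_dist(x, y, M):
    """Mahalanobis distance d_M(x, y) = sqrt((x - y)^T M (x - y)).

    M must be symmetric positive semi-definite for d_M to be a valid
    pseudo-metric; M = I recovers the Euclidean distance.
    """
    d = x - y
    return float(np.sqrt(d @ M @ d))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

# With M = I the distance is Euclidean: sqrt(2).
print(mahalanobis_dist(x, y, np.eye(2)))   # ~1.4142

# A learned M can stretch discriminative directions:
M = np.array([[4.0, 0.0], [0.0, 1.0]])
print(mahalanobis_dist(x, y, M))           # sqrt(5) ~ 2.2361
```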
Metric Learning
Given supervised information and a choice of information representation, the design questions are:
• Which type of constraints to use?
• How to extract these constraints?
• What kind of objective to optimize for these constraints?
Related Works
Here we focus only on supervised metric learning for nearest-neighbor classification.
Large Margin Metric Learning
LMNN [Weinberger et al. 05], BoostMetric [Shen et al. 09], SML [Ying et al. 09]
• Define the target neighbors of each instance as its k same-class nearest neighbors in the Euclidean space.
• Minimize the distance from each point to its target neighbors.
• Push away impostors.
• Maintain a local margin between target neighbors and impostors.
• State-of-the-art predictive performance.
A standard instance is the LMNN objective, whose pull term over the target neighbors acts as the regularization and whose hinge term is the large-margin loss:
$\min_{M \succeq 0}\; (1 - \mu) \sum_{j \rightsquigarrow i} d^2_M(x_i, x_j) + \mu \sum_{j \rightsquigarrow i,\; l: y_l \ne y_i} \left[1 + d^2_M(x_i, x_j) - d^2_M(x_i, x_l)\right]_+$
Target neighbors are predefined and are not changed in the learning process.
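As a rough sketch, the objective above can be evaluated for a fixed metric `M` and predefined target-neighbor lists. The trade-off `mu` and the unit margin are illustrative choices, not values from the slides:

```python
import numpy as np

def lmnn_loss(X, y, M, target, mu=0.5, margin=1.0):
    """Sketch of the LMNN-style objective for a fixed metric M.

    target[i] lists the (predefined) target-neighbor indices of x_i.
    Pull term: sum of squared distances to target neighbors.
    Push term: hinge loss over impostors l (different class) that
    violate the local margin w.r.t. a target neighbor j.
    """
    def d2(i, j):
        diff = X[i] - X[j]
        return diff @ M @ diff

    pull, push = 0.0, 0.0
    n = len(X)
    for i in range(n):
        for j in target[i]:
            pull += d2(i, j)
            for l in range(n):
                if y[l] != y[i]:
                    push += max(0.0, margin + d2(i, j) - d2(i, l))
    return (1 - mu) * pull + mu * push

X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0]])
y = np.array([0, 0, 1])
target = [[1], [0], []]   # predefined target-neighbor lists
print(lmnn_loss(X, y, np.eye(2), target))  # 1.0: pull = 2, no margin violations
```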
Information-Theoretic Metric Learning
ITML [Davis et al. 07], Bk-means [Wu et al. 09]
• The Mahalanobis metric is related to the inverse covariance matrix of a multivariate Gaussian distribution.
• Metric learning is formulated as minimizing the differential relative entropy between two multivariate Gaussians, subject to pairwise constraints.
• Efficient learning algorithm.
Inputs: pairwise constraints and a prior metric. The pairwise constraints are randomly predefined and, again, not changed in the learning process.
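The closeness to the prior metric is measured by the LogDet divergence, which is (up to scaling) the relative entropy between the two equal-mean Gaussians. A minimal sketch of that divergence:

```python
import numpy as np

def logdet_div(M, M0):
    """LogDet (Burg) divergence D_ld(M, M0) = tr(M M0^-1) - logdet(M M0^-1) - n.

    Up to a factor of 2, this equals the differential relative entropy
    between two equal-mean Gaussians whose covariances correspond to M
    and the prior M0 -- the quantity ITML minimizes subject to the
    pairwise distance constraints.
    """
    n = M.shape[0]
    A = M @ np.linalg.inv(M0)
    sign, logdet = np.linalg.slogdet(A)
    return float(np.trace(A) - logdet - n)

# The divergence is zero iff M equals the prior:
print(logdet_div(np.eye(3), np.eye(3)))      # 0.0
print(logdet_div(2 * np.eye(3), np.eye(3)))  # 3*(2 - ln 2 - 1) ~ 0.9206
```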
Stochastic Metric Learning
NCA [Goldberger et al. 04], MCML [Globerson and Roweis 05]
• Stochastic nearest neighbor: $p_{ij} = \frac{\exp(-d^2_M(x_i, x_j))}{\sum_{k \ne i} \exp(-d^2_M(x_i, x_k))}$, with $p_{ii} = 0$.
NCA:
• Minimize the leave-one-out nearest-neighbor error.
MCML:
• Collapse same-class samples into one point.
• Put different-class samples far away from each other (via a KL divergence).
• The stochastic nearest neighbor is computationally expensive.
• Target neighbors and impostors are learned implicitly via the LOO error minimization.
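The stochastic neighbor probabilities above can be sketched as follows, assuming the metric is parameterized as $M = L^\top L$ so distances are computed in the linearly mapped space:

```python
import numpy as np

def nca_probabilities(X, L):
    """Stochastic nearest-neighbor probabilities used by NCA.

    Under the linear map L (so that M = L^T L), point i selects point j
    as its neighbor with probability
        p_ij = exp(-||Lx_i - Lx_j||^2) / sum_{k != i} exp(-||Lx_i - Lx_k||^2),
    and p_ii = 0.  NCA maximizes sum_i sum_{j: y_j = y_i} p_ij, the
    expected leave-one-out accuracy; computing all p_ij costs O(n^2 d),
    which is why the stochastic neighbor is expensive.
    """
    Z = X @ L.T
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)  # pairwise ||.||^2
    np.fill_diagonal(sq, np.inf)                         # exclude self: p_ii = 0
    E = np.exp(-sq)
    return E / E.sum(axis=1, keepdims=True)

X = np.array([[0.0], [0.1], [5.0]])
P = nca_probabilities(X, np.eye(1))
# Nearby points dominate the neighbor distribution:
print(P[0])  # p_01 close to 1, p_02 close to 0
```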
Motivation
Except for NCA, no existing method learns the target neighbors, yet the target neighbors are crucial for metric learning.
Can we learn the target neighbors? Yes.
Reformulation
• Large Margin Metric Learning can be written as $\min_{M \succeq 0} \sum_{i,j} P_{ij} L_{ij}(M)$, where $P_{ij} \in \{0, 1\}$ indicates the predefined target-neighbor relation and $L_{ij}(M)$ represents the loss induced by the same-class pair $(x_i, x_j)$.
This is a general formulation of many metric learning methods.
Methodology
• Learn the target neighbors together with the distance metric: $\min_{M \succeq 0,\, P} \sum_{i,j} P_{ij} L_{ij}(M)$.
• It makes sense to minimize here, as $L_{ij}(M)$ represents the loss induced by the same-class pair $(x_i, x_j)$.
• Minimizing over $P$ favors local target neighbors.
• The gap between the per-instance lower bound and the total budget of target neighbors allows assigning fewer target neighbors to instances in sparse regions and more target neighbors to instances in dense regions.
Optimization
• Alternating optimization:
• Fixing $P$, learning $M$ is a standard metric learning problem.
• Fixing $M$, learning $P$ is a linear programming problem with integer optimal solutions.
Proof idea: show that the constraint matrix is totally unimodular.
Complexity: the alternation often converges in 5–10 iterations.
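The alternation can be sketched as below. This is a simplified illustration, not the paper's algorithm: it assumes exactly `k` target neighbors per instance (under which the LP's integer optimum reduces to a per-instance k-nearest-same-class selection) and uses only the pull term of the loss, with a single projected-gradient step in the metric phase:

```python
import numpy as np

def learn_P(X, y, M, k=2):
    """Fix M, learn P: with exactly k target neighbors per instance,
    the integer optimum picks, for each x_i, the k same-class points
    with smallest d_M(x_i, .)."""
    n = len(X)
    P = np.zeros((n, n), dtype=int)
    for i in range(n):
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        d = [(X[i] - X[j]) @ M @ (X[i] - X[j]) for j in same]
        for j in np.array(same)[np.argsort(d)[:k]]:
            P[i, j] = 1
    return P

def learn_M_step(X, M, P, lr=0.01):
    """Fix P, take one gradient step on the pull term
    sum_ij P_ij d_M^2(x_i, x_j) (the push/hinge term is omitted here),
    then project back onto the PSD cone."""
    G = np.zeros_like(M)
    for i, j in zip(*np.nonzero(P)):
        diff = (X[i] - X[j])[:, None]
        G += diff @ diff.T          # gradient of the pull term w.r.t. M
    M = M - lr * G
    w, V = np.linalg.eigh(M)        # PSD projection via eigendecomposition
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

# Alternate the two steps for a few iterations (often 5-10 suffice):
X = np.random.RandomState(0).randn(20, 3)
y = np.array([0] * 10 + [1] * 10)
M = np.eye(3)
for _ in range(5):
    P = learn_P(X, y, M, k=2)
    M = learn_M_step(X, M, P)
```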
Experiments
Does learning neighborhoods improve the classification
accuracy?
Setup
Comparison methods:
• LMNN: predefined neighborhoods.
• NCA: implicit neighborhood learning.
• LN-LMNN: explicit neighborhood learning with tuning.
Classifier: 1-NN.
Parameter settings:
• LMNN: 3 predefined target neighbors for each instance.
• LN-LMNN: 2-fold cross-validation for parameter selection.
Examined Datasets
5 small and 7 large datasets.
On the small datasets, LN-LMNN achieves better accuracy than LMNN (NCA) on 4 (3) out of the 5 datasets.
22
•
LN

LMNN achieves better
accuracy than LMNN in 6
out of 7 datasets.
•
NCA cannot be scaled up.
More experimental results
and comparison methods
are described in the
paper.
Conclusion and Future Works
In this work, we presented a simple, general neighborhood-learning method for metric learning.
Learning neighborhoods does improve the predictive performance.
Future work:
• A more theoretically motivated problem formulation.
• Learning neighborhoods in the semi-supervised setting, e.g. graph-based semi-supervised learning.
Thank you for your attention!