Exploring Intrinsic Structures from Samples:
Supervised, Unsupervised, and Semisupervised Frameworks

Supervised by Prof. Xiaoou Tang & Prof. Jianzhuang Liu
Outline

• Trace Ratio Optimization
• Tensor Subspace Learning: preserve sample feature structures
• Correspondence Propagation: explore the geometric structures and feature domain relations concurrently

Notations & Introductions. Dimensionality Reduction
Concept. Tensor

• Tensor: multi-dimensional (or multi-way) arrays of components
Concept. Tensor: Application

• Real-world data are affected by multifarious factors. For person identification, we may have facial images of different
► views and poses
► lighting conditions
► expressions
• The observed data evolve differently along the variation of the different factors
► image columns and rows
Concept. Tensor: Application

• It is desirable to dig through the intrinsic connections among the different factors affecting the data.
• Tensor provides a concise and effective representation.

(Figure: an image ensemble organized as a tensor, with modes for images, image rows, image columns, illumination, pose, and expression.)
Introduction

Concept. Dimensionality Reduction

• Preserve sample feature structures
• Enhance classification capability
• Reduce the computational complexity
Trace Ratio Optimization. Definition

$$W^* = \arg\max_{W} \frac{\operatorname{tr}(W^\top S_p W)}{\operatorname{tr}(W^\top S_l W)} \quad \text{w.r.t. } W^\top W = I$$

• $S_p$ and $S_l$ are positive semidefinite
• Homogeneous property: the objective is invariant to $W \to WQ$ for any orthogonal $Q$, so it depends only on the subspace spanned by the columns of $W$
• Special case: when $W$ is a vector, the objective reduces to the Generalized Rayleigh Quotient, solvable by GEVD
• Orthogonality constraint: the general problem is an optimization over the Grassmann manifold
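A minimal numpy sketch (names hypothetical) of the objective and of the homogeneous property, i.e. invariance to rotations within the subspace, which is why the optimization lives on the Grassmann manifold:

```python
import numpy as np

def trace_ratio(W, Sp, Sl):
    """Trace ratio objective: tr(W^T Sp W) / tr(W^T Sl W)."""
    return np.trace(W.T @ Sp @ W) / np.trace(W.T @ Sl @ W)

# Random PSD matrices standing in for the scatter matrices.
rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 10, 10))
Sp, Sl = A @ A.T, B @ B.T + 1e-3 * np.eye(10)

# A column-orthogonal W and a rotated copy W @ Q.
W, _ = np.linalg.qr(rng.standard_normal((10, 3)))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# The objective depends only on span(W): J(WQ) == J(W).
print(np.isclose(trace_ratio(W, Sp, Sl), trace_ratio(W @ Q, Sp, Sl)))  # True
```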
Trace Ratio Formulation

• Linear Discriminant Analysis

$$W^* = \arg\max_{W^\top W = I} \frac{\operatorname{tr}(W^\top S_b W)}{\operatorname{tr}(W^\top S_w W)}$$

where $S_b$ is the between-class scatter matrix and $S_w$ is the within-class scatter matrix.
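For concreteness, a minimal sketch of the two scatter matrices that instantiate $S_p$ and $S_l$ for LDA (function name hypothetical):

```python
import numpy as np

def lda_scatters(X, y):
    """Between-class (Sb) and within-class (Sw) scatter matrices.

    X: (n_samples, n_features), y: (n_samples,) integer class labels.
    """
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)   # spread of class means
        Sw += (Xc - mc).T @ (Xc - mc)                # spread inside each class
    return Sb, Sw
```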
Trace Ratio Formulation

• Kernel Discriminant Analysis: map the samples into a kernel feature space $\phi(\cdot)$ and maximize the trace ratio there, w.r.t. $W^\top W = I$.

Decompose $W = \phi(X) A$: each projection direction is a linear combination of the mapped samples. Let $K = \phi(X)^\top \phi(X)$ be the Gram matrix; the objective then becomes a trace ratio in the coefficient matrix $A$, w.r.t. the corresponding orthogonality constraint on $A$.
Trace Ratio Formulation

• Marginal Fisher Analysis

Intra-class graph (intrinsic graph): connects each sample to its nearest neighbors of the same class.
Inter-class graph (penalty graph): connects marginal sample pairs from different classes.
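A sketch of the two graph constructions, using a simplified per-sample variant of MFA's neighbor selection (names and the exact neighbor rule are assumptions, not Yan et al.'s exact recipe):

```python
import numpy as np

def mfa_graphs(X, y, k1=5, k2=20):
    """Adjacency matrices of MFA-style intrinsic and penalty graphs.

    Intrinsic graph: link each sample to its k1 nearest same-class
    neighbors. Penalty graph: link each sample to its k2 nearest
    neighbors from other classes.
    """
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # pairwise distances
    W_intra = np.zeros((n, n))
    W_inter = np.zeros((n, n))
    for i in range(n):
        same = np.where(y == y[i])[0]
        same = same[same != i]
        diff = np.where(y != y[i])[0]
        for j in same[np.argsort(D[i, same])][:k1]:
            W_intra[i, j] = W_intra[j, i] = 1.0
        for j in diff[np.argsort(D[i, diff])][:k2]:
            W_inter[i, j] = W_inter[j, i] = 1.0
    return W_intra, W_inter
```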
Trace Ratio Formulation

• Kernel Marginal Fisher Analysis: the same kernel trick applies. Decompose $W = \phi(X) A$, let $K$ be the Gram matrix, and the MFA trace ratio objective becomes a trace ratio in $A$, w.r.t. the corresponding orthogonality constraint.
Trace Ratio Formulation

• 2-D Linear Discriminant Analysis: left projection & right projection; fix one projection matrix & optimize the other.
• Discriminant Analysis with Tensor Representation: the same alternating scheme generalized to higher-order tensors.
Trace Ratio Formulation

• Tensor Subspace Analysis

Conventional Solution: GEVD

• Singularity problem of the denominator scatter matrix, traditionally handled by Nullspace LDA and Dualspace LDA.
From Trace Ratio to Trace Difference

Preprocessing: remove the null space of $S_p + S_l$ with Principal Component Analysis.

What will we do?

Objective: $\max_{W^\top W = I} \operatorname{tr}(W^\top S_p W) / \operatorname{tr}(W^\top S_l W)$.

Define

$$\lambda_t = \frac{\operatorname{tr}(W_t^\top S_p W_t)}{\operatorname{tr}(W_t^\top S_l W_t)}.$$

Then the trace ratio problem converts into a trace difference problem: find

$$W_{t+1} = \arg\max_{W^\top W = I} \operatorname{tr}\big(W^\top (S_p - \lambda_t S_l) W\big),$$

so that the objective keeps growing.

Constraint: $W^\top W = I$. Let $g(\lambda, W) = \operatorname{tr}(W^\top (S_p - \lambda S_l) W)$. We have $g(\lambda_t, W_t) = 0$ by the definition of $\lambda_t$, and $g(\lambda_t, W_{t+1}) \ge g(\lambda_t, W_t) = 0$ since $W_{t+1}$ is the maximizer. Thus $\lambda_{t+1} \ge \lambda_t$: the objective rises monotonically!

$W_{t+1} = [v_1, v_2, \ldots, v_d]$, where $v_1, \ldots, v_d$ are the leading eigenvectors of $S_p - \lambda_t S_l$.
Main Algorithm Process

Main Algorithm

1: Initialization. Initialize $W_0$ as an arbitrary column-orthogonal matrix.

2: Iterative optimization. For $t = 1, 2, \ldots, T_{max}$, do:
   1. Set $\lambda_t = \operatorname{tr}(W_{t-1}^\top S_p W_{t-1}) / \operatorname{tr}(W_{t-1}^\top S_l W_{t-1})$.
   2. Conduct eigenvalue decomposition: $(S_p - \lambda_t S_l)\, v_j = \mu_j v_j$.
   3. Reshape the projection directions: $W_t = [v_1, \ldots, v_d]$, the $d$ leading eigenvectors.
   4. Break if $W_t$ has converged.

3: Output the projection matrix $W$.
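To make the iteration concrete, a minimal numpy sketch of the procedure above; the function and variable names are hypothetical, not from the paper:

```python
import numpy as np

def iterative_trace_ratio(Sp, Sl, d, t_max=100, tol=1e-8):
    """Maximize tr(W^T Sp W) / tr(W^T Sl W) s.t. W^T W = I.

    Each step converts the trace ratio into a trace difference
    max tr(W^T (Sp - lam * Sl) W), solved exactly by the d leading
    eigenvectors, so the ratio increases monotonically.
    """
    n = Sp.shape[0]
    W = np.linalg.qr(np.random.randn(n, d))[0]  # arbitrary column-orthogonal init
    lam = -np.inf
    for _ in range(t_max):
        lam_new = np.trace(W.T @ Sp @ W) / np.trace(W.T @ Sl @ W)
        if lam_new - lam < tol:                 # objective has stopped rising
            break
        lam = lam_new
        evals, evecs = np.linalg.eigh(Sp - lam * Sl)
        W = evecs[:, -d:]                       # leading eigenvectors (eigh is ascending)
    return W, lam
```

Combined with the `lda_scatters` sketch earlier, `iterative_trace_ratio(Sb, Sw, d)` would give a trace-ratio LDA projection.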
Tensor Subspace Learning Algorithms

Traditional Tensor Discriminant Algorithms

• Tensor Subspace Analysis (He et al.)
• Two-dimensional Linear Discriminant Analysis (Ye et al.)
• Discriminant Analysis with Tensor Representation (Yan et al.)

Common traits:
• project the tensor along its different dimensions (ways)
• projection matrices for the different dimensions are derived iteratively
• solve a trace ratio optimization problem
• DO NOT CONVERGE!
Discriminant Analysis Objective

Solve the projection matrices iteratively: leave one projection matrix as the variable while keeping the others constant.

• No closed-form solution

Mode-k unfolding of the tensor: rearrange the tensor into a matrix whose rows are indexed by the k-th mode, as sketched below.
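A small sketch of mode-k unfolding in numpy (column-ordering conventions differ across papers; this is one common choice):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-k unfolding: move axis `mode` to the front and flatten the
    remaining modes into columns, giving an (I_k x prod(I_other)) matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Example: a 3rd-order tensor (e.g. rows x columns x images).
X = np.arange(24).reshape(2, 3, 4)
print(unfold(X, 0).shape, unfold(X, 1).shape, unfold(X, 2).shape)
# (2, 12) (3, 8) (4, 6)
```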
Objective Deduction

Discriminant Analysis Objective

Trace Ratio: a general formulation for the objectives of the discriminant-analysis-based algorithms. Writing $U_k$ for the mode-k projection matrix,

$$\max_{U_k} \frac{\operatorname{tr}(U_k^\top S_p^k U_k)}{\operatorname{tr}(U_k^\top S_l^k U_k)}.$$

• DATER: $S_p^k$ and $S_l^k$ are the between-class and within-class scatter matrices of the mode-k unfolded data.
• TSA: the scatters are built with diagonal weight matrices constructed from the image manifold.
Disagreement between the Objective and the Optimization Process

Why do previous algorithms not converge?

They solve each mode's subproblem by GEVD, i.e. as a ratio trace problem. The conversion from trace ratio to ratio trace induces an inconsistency among the objectives of the different dimensions!
From Trace Ratio to Trace Difference

What will we do?

Objective for mode k: $\max_{U_k^\top U_k = I} \operatorname{tr}(U_k^\top S_p^k U_k) / \operatorname{tr}(U_k^\top S_l^k U_k)$.

Define $\lambda = \operatorname{tr}(U_k^\top S_p^k U_k) / \operatorname{tr}(U_k^\top S_l^k U_k)$ from the current projections. Then convert the trace ratio into a trace difference: find $U_k$ maximizing $\operatorname{tr}\big(U_k^\top (S_p^k - \lambda S_l^k) U_k\big)$ under $U_k^\top U_k = I$.

By the same argument as in the linear case, the objective rises monotonically, and the projection matrices of the different dimensions share the same objective.

The update is $U_k = [v_1, \ldots, v_{d_k}]$, where $v_1, \ldots, v_{d_k}$ are the leading eigenvectors of $S_p^k - \lambda S_l^k$.
Main Algorithm Process

Main Algorithm

1: Initialization. Initialize $U_1^0, \ldots, U_n^0$ as arbitrary column-orthogonal matrices.

2: Iterative optimization. For $t = 1, 2, \ldots, T_{max}$, do:
   For $k = 1, 2, \ldots, n$, do:
   1. Set $\lambda$ to the current trace ratio value.
   2. Compute $S_p^k$ and $S_l^k$ from the mode-k unfolded, partially projected data.
   3. Conduct eigenvalue decomposition of $S_p^k - \lambda S_l^k$.
   4. Reshape the projection directions: set $U_k^t$ to the $d_k$ leading eigenvectors.
   5. Proceed to the next mode.

3: Output the projection matrices $U_1, \ldots, U_n$.
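A condensed illustrative sketch for 2nd-order tensors (images), following the alternating trace-difference scheme above; the names and exact scatter computations are assumptions in the DATER style, not the authors' exact implementation:

```python
import numpy as np

def tensor_trace_ratio_2d(X, y, d1, d2, t_max=20):
    """Alternating trace-ratio optimization for 2nd-order tensors.
    X: (n, I1, I2) stack of images, y: class labels."""
    n, I1, I2 = X.shape
    U1 = np.linalg.qr(np.random.randn(I1, d1))[0]
    U2 = np.linalg.qr(np.random.randn(I2, d2))[0]
    classes = np.unique(y)

    def scatters(Z):
        """Between/within scatters of matrix samples Z_i, accumulated
        as (Z_i - mean)(Z_i - mean)^T over the active (row) mode."""
        mu = Z.mean(axis=0)
        Sb = np.zeros((Z.shape[1], Z.shape[1]))
        Sw = np.zeros_like(Sb)
        for c in classes:
            Zc = Z[y == c]
            mc = Zc.mean(axis=0)
            Sb += len(Zc) * (mc - mu) @ (mc - mu).T
            for Zi in Zc:
                Sw += (Zi - mc) @ (Zi - mc).T
        return Sb, Sw

    for _ in range(t_max):
        for mode in (1, 2):
            if mode == 1:
                Z = np.einsum('nij,jb->nib', X, U2)   # project mode 2 first
            else:
                Z = np.einsum('ia,nij->naj', U1, X)   # project mode 1 first
                Z = Z.transpose(0, 2, 1)              # make mode 2 the rows
            Sb, Sw = scatters(Z)
            U = U1 if mode == 1 else U2
            lam = np.trace(U.T @ Sb @ U) / np.trace(U.T @ Sw @ U)
            evals, evecs = np.linalg.eigh(Sb - lam * Sw)
            if mode == 1:
                U1 = evecs[:, -d1:]
            else:
                U2 = evecs[:, -d2:]
    return U1, U2
```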
Highlights of the Trace Ratio Based Algorithm

Highlights of our algorithm

• The objective value is guaranteed to increase monotonically, and the multiple projection matrices are proved to converge.
• Only eigenvalue decomposition is required in each iteration, which makes the algorithm extremely efficient.
• Enhanced potential classification capability of the low-dimensional representation derived from the subspace learning algorithms.
• The first work to give a convergent solution for general tensor-based subspace learning.
Projection Visualization
Experimental Results
Visualization of the projection matrix W of PCA, ratio trace based LDA, and trace
ratio based LDA (ITR) on the FERET database.
Face Recognition Results: Linear
Experimental Results
Comparison: Trace Ratio Based LDA vs. the Ratio Trace based LDA (PCA+LDA)
Comparison: Trace Ratio Based MFA vs. the Ratio Trace based MFA (PCA+MFA)
Face Recognition Results: Kernelization
Experimental Results
Trace Ratio Based KDA vs. the Ratio Trace based KDA
Trace Ratio Based KMFA vs. the Ratio Trace based KMFA
Results on UCI Datasets

Experimental Results

Testing classification errors on three UCI databases for both linear and kernel-based algorithms. Results are obtained from 100 realizations of randomly generated 70/30 splits of data.
Monotonicity of the Objective & Projection Matrix Convergence
Experimental Results
Face Recognition Results

Experimental Results

1. TMFA TR mostly outperforms all the other methods considered in this work, with only one exception for the case G5P5 on the CMU PIE database.
2. For vector-based algorithms, the trace ratio based formulation is consistently superior to the ratio trace based one for subspace learning.
3. Tensor representation has the potential to improve the classification performance for both trace ratio and ratio trace formulations of subspace learning.
Correspondence Propagation

Geometric Structures & Feature Structures

Explore the geometric structures and feature domain consistency for object registration

Aim

• Exploit the geometric structures of sample features
• Introduce human interaction for correspondence guidance
• Seek a mapping between feature sets of different cardinalities
• Objects are represented as sets of feature points
Graph Construction

• Spatial graph: connects spatially neighboring feature points within each object.
• Similarity graph: connects feature pairs across the two objects with high feature similarity.

From Spatial Graph to Categorical Product Graph

Assignment Neighborhood Definition: suppose $\{x_i\}$ and $\{y_j\}$ are the vertices of graphs $G^x$ and $G^y$ respectively. Two assignments $a_{ij} = (x_i, y_j)$ and $a_{kl} = (x_k, y_l)$ are neighbors iff both pairs $(x_i, x_k)$ and $(y_j, y_l)$ are neighbors in $G^x$ and $G^y$ respectively, namely

$$a_{ij} \sim a_{kl} \iff (x_i \sim x_k) \wedge (y_j \sim y_l),$$

where $\sim$ means the two vertices are neighbors.
From Spatial Graph to Categorical Product Graph

The adjacency matrix $A^\pi$ of the product graph can be derived from

$$A^\pi = A^x \otimes A^y,$$

where $\otimes$ is the matrix Kronecker product operator.
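A toy illustration of the Kronecker construction (the two graphs are made up):

```python
import numpy as np

# Spatial graphs: a 3-node path in one object, a 2-node edge in the other.
Ax = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]])
Ay = np.array([[0, 1],
               [1, 0]])

# Adjacency of the categorical product graph over all 3*2 = 6 assignments:
# assignments (i, j) and (k, l) are adjacent iff Ax[i, k] = Ay[j, l] = 1.
A_pi = np.kron(Ax, Ay)
print(A_pi.shape)  # (6, 6)
```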
Smoothness along the spatial distribution: neighboring assignments in the product graph should take similar assignment scores.

Feature Domain Consistency & Soft Constraints

• Similarity measure: the feature-domain similarity score of each candidate pair.
• One-to-one correspondence penalty: discourages one feature from being matched to many, expressed with a term of the form $e(T \circ P)$, where $\circ$ is the matrix Hadamard (element-wise) product and $e(\cdot)$ returns the sum of all elements of its argument.
Assignment Labeling

Labeled assignments: reliable correspondences & inhomogeneous pairs.

• Inhomogeneous pair labeling: assign zeros to those pairs with extremely low similarity scores.
• Reliable pair labeling: assign ones to those reliable pairs.
Reliable Correspondence Propagation

Arrangement: the assignment variables are arranged into a single vector, and the coefficient matrices and spatial adjacency matrices are arranged accordingly.

Objective: propagate the labeled assignments to the unlabeled ones by optimizing a combination of
• Feature domain agreement: assignments should agree with the feature similarity scores.
• Geometric smoothness regularization: neighboring assignments in the product graph should vary smoothly.
• One-to-one correspondence penalty.
Solution

Reliable Correspondence Propagation

Relax the assignment variables to the real domain; the objective then admits a closed-form solution for the unlabeled assignment scores in terms of the labeled ones.
Rearrangement and Discretization

Inverse process of the element arrangement: reshape the assignment vector into a matrix. Then either
• Thresholding: assignments larger than a threshold are regarded as correspondences, or
• Eliciting: sequentially pick the assignments with the largest assignment scores.
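A small sketch of both post-processing options; greedy one-to-one eliciting is one reasonable reading of the slide, and all names are hypothetical:

```python
import numpy as np

def discretize(scores, n1, n2, threshold=None, top_k=None):
    """Turn a relaxed assignment vector back into correspondences.

    scores: flat vector of n1*n2 real-valued assignment scores.
    Either keep everything above `threshold`, or greedily elicit the
    `top_k` highest-scoring one-to-one assignments.
    """
    S = scores.reshape(n1, n2).astype(float)  # inverse of the arrangement
    if threshold is not None:
        return list(zip(*np.where(S > threshold)))
    matches = []
    for _ in range(top_k):
        i, j = np.unravel_index(np.argmax(S), S.shape)
        matches.append((i, j))
        S[i, :] = S[:, j] = -np.inf           # enforce one-to-one matching
    return matches
```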
Semisupervised & Automatic Systems

Semi-supervised & Unsupervised Frameworks

• Exact pairwise correspondence labeling: users give exact correspondence guidance.
• Obscure correspondence guidance: rough correspondence of image parts.
Experimental Results. Demonstration

Experiment. Dataset

Experimental Results. Details

Automatic feature matching score on the Oxford real image transformation dataset. The transformations include viewpoint change ((a) Graffiti and (b) Wall sequences), image blur ((c) Bikes and (d) Trees sequences), zoom and rotation ((e) Bark and (f) Boat sequences), illumination variation ((g) Leuven) and JPEG compression ((h) UBC).
Summary

Future Works

• From point-to-point correspondence to set-to-set correspondence.
• Multi-scale correspondence searching.
• Combine object segmentation and registration.
Publications

[1] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "A Convergent Solution to Tensor Subspace Learning", International Joint Conference on Artificial Intelligence (IJCAI 07, regular paper), Jan. 2007.
[2] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Trace Ratio vs. Ratio Trace for Dimensionality Reduction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Jun. 2007.
[3] Huan Wang, Shuicheng Yan, Thomas Huang, Jianzhuang Liu and Xiaoou Tang, "Transductive Regression Piloted by Inter-Manifold Relations", International Conference on Machine Learning (ICML 07), Jun. 2007.
[4] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Maximum Unfolded Embedding: Formulation, Solution, and Application for Image Clustering", ACM International Conference on Multimedia (ACM MM 06), Oct. 2006.
[5] Shuicheng Yan, Huan Wang, Thomas Huang and Xiaoou Tang, "Ranking with Uncertain Labels", IEEE International Conference on Multimedia & Expo (ICME 07), May 2007.
[6] Shuicheng Yan, Huan Wang, Xiaoou Tang and Thomas Huang, "Exploring Feature Descriptors for Face Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 07, oral), Apr. 2007.
Thank You!
Transductive Regression on Multi-Class Data

Explore the intrinsic feature structures w.r.t. different classes for regression
Regression Algorithms. Review

Exploit the manifold structure to guide the regression:

• Belkin et al., "Regularization and semi-supervised learning on large graphs": transduces the function values from the labeled data to the unlabeled data utilizing local neighborhood relations; global optimization for a robust prediction.
• Cortes et al., "On transductive regression": Tikhonov regularization on the Reproducing Kernel Hilbert Space (RKHS).

Classification can be regarded as a special case of regression:

• Fei Wang et al., "Label Propagation Through Linear Neighborhoods": an iterative procedure is deduced to propagate the class labels within local neighborhoods and has been proven to converge. Regression values are constrained to 0 and 1 (binary): samples belonging to the corresponding class map to 1, otherwise to 0. The convergence point can be derived from the regularization framework.
The Problem We Are Facing

• Age estimation w.r.t. different genders (FG-NET Aging Database).
• Pose estimation w.r.t. different persons, illuminations, and expressions (CMU-PIE Dataset).
The Problem We Are Facing

Regression on Multi-Class Samples. Traditional Algorithms

• All samples are considered as belonging to the same class.
• Samples close in the data space X are assumed to have similar function values (smoothness along the manifold).

Our setting:
• The class information is easy to obtain for the training data, so utilize it in the training process to boost the performance.
• For the incoming sample, no class information is given.
TRIM. Intra-Manifold Regularization

• Respective intrinsic graphs are built for the different sample classes.
• Correspondingly, the intra-manifold regularization terms for the different classes are calculated separately from each intrinsic graph.
• The regularization takes the form $f^\top L_c^p f$ on each class's intrinsic graph: the graph Laplacian penalty when p = 1, the iterated Laplacian penalty when p = 2. (A sketch of the per-class construction follows below.)
• It may not be proper to preserve smoothness between samples from different classes.
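A minimal sketch of the per-class intrinsic-graph Laplacians, assuming a k-NN graph with binary weights (names and the weighting scheme are assumptions):

```python
import numpy as np

def intra_manifold_laplacians(X, y, k=5, p=1):
    """Build one intrinsic-graph Laplacian per class.

    Each class gets its own k-NN graph; the regularizer for class c
    is then f_c^T (L_c)^p f_c over that class's samples only.
    """
    laplacians = {}
    for c in np.unique(y):
        Xc = X[y == c]
        n = len(Xc)
        D = np.linalg.norm(Xc[:, None] - Xc[None, :], axis=-1)
        W = np.zeros((n, n))
        for i in range(n):
            for j in np.argsort(D[i])[1:k + 1]:   # skip self at position 0
                W[i, j] = W[j, i] = 1.0
        L = np.diag(W.sum(axis=1)) - W            # unnormalized graph Laplacian
        laplacians[c] = np.linalg.matrix_power(L, p)
    return laplacians
```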
The Algorithm

TRIM. Inter-Manifold Regularization

• Assumption: samples with similar labels generally lie in similar relative positions on the corresponding sub-manifolds.
• Motivation:
  1. Align the sub-manifolds of the different class samples according to the labeled points and graph structures.
  2. Derive the correspondence in the aligned space using the nearest neighbor technique.
The Algorithm

TRIM. Manifold Alignment

• Minimize the correspondence error on the landmark points.
• Hold the intra-manifold structures.
• A global compactness regularization term is added, built on the Laplacian matrix of the graph whose weight between two samples is 1 if they are of different classes and 0 otherwise.
TRIM. Inter-Manifold Regularization

• Concatenate the derived inter-manifold graphs to form the overall inter-manifold graph.
• Apply Laplacian regularization on this graph.
Objective Deduction

TRIM. Objective

The objective sums four terms:
• Fitness term: squared error on the labeled samples.
• RKHS norm: controls the complexity of the regressor.
• Intra-manifold regularization.
• Inter-manifold regularization.
Solution

TRIM. Solution

• The solution to the minimization of the objective admits an expansion

$$f^*(x) = \sum_{i=1}^{N} \alpha_i k(x_i, x)$$

(Generalized Representer Theorem). Thus the minimization over the Hilbert space boils down to minimizing over the coefficient vector $\alpha \in \mathbb{R}^N$. The minimizer is given in closed form by a linear system in $\alpha$, where $K$ is the $N \times N$ Gram matrix of the labeled and unlabeled points over all the sample classes.
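A sketch of such a closed-form, representer-theorem solution, following the standard Laplacian-regularized least squares recipe rather than TRIM's exact formula (all names hypothetical; the Laplacian argument would combine the intra- and inter-manifold regularizers):

```python
import numpy as np

def laplacian_rls(K, y_labeled, L, gamma_k=1e-2, gamma_m=1e-2):
    """Closed-form Laplacian-regularized kernel regression.

    K: (N, N) Gram matrix over labeled + unlabeled points.
    y_labeled: (l,) targets for the first l points.
    L: (N, N) graph Laplacian used as the manifold regularizer.
    Minimizes ||J(K a - y)||^2 + gamma_k a^T K a + gamma_m a^T K L K a.
    """
    N, l = K.shape[0], len(y_labeled)
    J = np.zeros((N, N))
    J[:l, :l] = np.eye(l)          # selects the labeled points
    y = np.zeros(N)
    y[:l] = y_labeled
    alpha = np.linalg.solve(J @ K + gamma_k * np.eye(N) + gamma_m * L @ K, y)
    return alpha

def predict(alpha, k_new):
    """Out-of-sample estimate f(x) = sum_i alpha_i k(x_i, x);
    k_new holds the kernel values k(x_i, x)."""
    return k_new @ alpha
```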
Solution

TRIM. Generalization

• For out-of-sample data, the label can be estimated using $f(x) = \sum_{i=1}^{N} \alpha_i k(x_i, x)$.

Note that in this framework the class information for the incoming sample is not required at the prediction stage.
Experiments. Two Moons (original version, without kernel)

Experiments. YAMAHA Age Dataset

TRIM vs. traditional graph Laplacian regularized regression for the training set evaluation on the YAMAHA database.

Open-set evaluation for the kernelized regression on the YAMAHA database. (Left) Regression on the training set. (Right) Regression on out-of-sample data.