
Exploring Intrinsic Structures from Samples:
Supervised, Unsupervised, and Semisupervised Frameworks

Supervised by Prof. Xiaoou Tang & Prof. Jianzhuang Liu

Outline

- Trace Ratio Optimization
- Tensor Subspace Learning
  (preserve sample feature structures)
- Correspondence Propagation
  (explore the geometric structures and feature domain relations concurrently)
- Notations & Introductions
- Dimensionality Reduction

Concept. Tensor

- Tensor: multi-dimensional (or multi-way) arrays of components

Concept. Tensor: Application

- Real-world data are affected by multifarious factors. For person identification, for example, we may have facial images under different:
  - views and poses
  - lighting conditions
  - expressions
- The observed data evolve differently along the variation of the different factors (e.g., image columns and rows).

Concept. Tensor: Application

- It is desirable to dig through the intrinsic connections among the different affecting factors of the data.
- A tensor provides a concise and effective representation.

(Figure: a face-image tensor whose modes correspond to images, illumination, pose, expression, image columns, and image rows.)
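To make the multi-way structure concrete, here is a minimal NumPy sketch of such a face-image tensor; the factor names and mode sizes below are made up for illustration and are not from the slides.

```python
import numpy as np

# Made-up mode sizes: 10 identities x 3 illuminations x 5 poses x
# 4 expressions, each a 32x32 image (rows x columns).
faces = np.random.rand(10, 3, 5, 4, 32, 32)

# Fixing all factors but a few isolates how the data evolve along them,
# e.g. person 0, expression 0, across all illuminations and poses:
slab = faces[0, :, :, 0]                    # shape (3, 5, 32, 32)

# A vector-based method flattens each image and discards this structure:
flat = faces.reshape(-1, 32 * 32)           # shape (600, 1024)
print(slab.shape, flat.shape)
```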

Introduction

Concept. Dimensionality Reduction

- Preserve sample feature structures
- Enhance classification capability
- Reduce the computational complexity


Trace Ratio Optimization. Definition

maximize  Tr(W^T S_p W) / Tr(W^T S_l W)  w.r.t.  W^T W = I,

where S_p and S_l are positive semidefinite.

- Homogeneous property: the objective is invariant to W -> WQ for any orthogonal Q, so only the subspace spanned by W matters.
- Special case: when W is a single vector w, the objective becomes the Generalized Rayleigh Quotient w^T S_p w / w^T S_l w, solvable by GEVD.
- With the orthogonality constraint W^T W = I, the general case is an optimization over the Grassmann manifold.

Trace Ratio Formulation

- Linear Discriminant Analysis: S_p is the between-class scatter S_b and S_l is the within-class scatter S_w, giving max Tr(W^T S_b W) / Tr(W^T S_w W).
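As an illustration of the LDA trace ratio above, a minimal NumPy sketch that builds the standard scatter matrices and evaluates the objective for an arbitrary column-orthogonal W; the toy data and function names are mine, not the paper's.

```python
import numpy as np

def lda_scatters(X, y):
    """Between-class (Sb) and within-class (Sw) scatter of the rows of X."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw

def trace_ratio(W, Sp, Sl):
    """Tr(W^T Sp W) / Tr(W^T Sl W) for a column-orthogonal W."""
    return np.trace(W.T @ Sp @ W) / np.trace(W.T @ Sl @ W)

# Toy data: 60 samples, 3 classes, 5 features.
rng = np.random.default_rng(0)
means = rng.normal(scale=3.0, size=(3, 5))          # three class centers
X = np.vstack([rng.normal(loc=m, size=(20, 5)) for m in means])
y = np.repeat([0, 1, 2], 20)

Sb, Sw = lda_scatters(X, y)
W = np.linalg.qr(rng.normal(size=(5, 2)))[0]        # random orthogonal projection
print(trace_ratio(W, Sb, Sw))
```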

Trace Ratio Formulation

- Kernel Discriminant Analysis: map the samples into a Reproducing Kernel Hilbert Space and express the projection as a linear combination of the mapped samples. Decomposing the kernel Gram matrix and substituting this expansion reduces the kernelized objective to a trace ratio over the expansion coefficients, w.r.t. the same orthogonality constraint.

Trace Ratio Formulation

- Marginal Fisher Analysis
  - Intra-class graph (intrinsic graph)
  - Inter-class graph (penalty graph)
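A simplified sketch of how the two graphs might be built; this uses a per-sample k-nearest-neighbor approximation, and the paper's exact marginal-pair selection may differ.

```python
import numpy as np

def mfa_graphs(X, y, k1=3, k2=5):
    """Adjacency of the intrinsic (within-class k1-NN) and penalty
    (between-class k2-NN, i.e. marginal pairs) graphs; a sketch."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    Wi = np.zeros((n, n))   # intrinsic graph
    Wp = np.zeros((n, n))   # penalty graph
    for i in range(n):
        same = np.where(y == y[i])[0]
        diff = np.where(y != y[i])[0]
        # k1 nearest neighbors sharing the class label (skip self at index 0)
        for j in same[np.argsort(D[i, same])][1:k1 + 1]:
            Wi[i, j] = Wi[j, i] = 1.0
        # k2 nearest neighbors from the other classes
        for j in diff[np.argsort(D[i, diff])][:k2]:
            Wp[i, j] = Wp[j, i] = 1.0
    return Wi, Wp
```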

Trace Ratio Formulation

- Kernel Marginal Fisher Analysis: as with KDA, express the projection through the mapped samples and decompose the Gram matrix; the objective again reduces to a trace ratio over the expansion coefficients.

Trace Ratio Formulation

- 2-D Linear Discriminant Analysis
  - Left projection & right projection
  - Fix one projection matrix & optimize the other
- Discriminant Analysis with Tensor Representation

Trace Ratio Formulation

- Tensor Subspace Analysis

Trace Ratio Formulation

- Conventional solution: GEVD (the ratio trace relaxation)
- Singularity problem of the denominator scatter matrix:
  - Nullspace LDA
  - Dualspace LDA

from Trace Ratio to Trace Difference

Preprocessing: remove the null space of the total scatter matrix with Principal Component Analysis.

from Trace Ratio to Trace Difference

What will we do?

Objective:

W* = argmax_{W^T W = I} Tr(W^T S_p W) / Tr(W^T S_l W)

Define the current ratio

lambda_t = Tr(W_t^T S_p W_t) / Tr(W_t^T S_l W_t).

Then the trace ratio problem is converted into a trace difference problem:

Find W_{t+1} = argmax_{W^T W = I} Tr(W^T (S_p - lambda_t S_l) W),

so that each iteration maximizes the trace difference under the current ratio.

Constraint: W^T W = I.

Let W_{t+1} = [w_1, w_2, ..., w_m], where w_1, ..., w_m are the leading eigenvectors of S_p - lambda_t S_l. Since W_t itself attains a trace difference of zero, the maximum is nonnegative, and hence lambda_{t+1} >= lambda_t. The objective rises monotonically!

Main Algorithm

1: Initialization. Initialize W_0 as an arbitrary column-orthogonal matrix.

2: Iterative optimization.
For t = 1, 2, ..., T_max, do:
  1. Set lambda_t = Tr(W_{t-1}^T S_p W_{t-1}) / Tr(W_{t-1}^T S_l W_{t-1}).
  2. Conduct the eigenvalue decomposition of S_p - lambda_t S_l.
  3. Reshape the projection directions from the leading eigenvectors.
  4. Set W_t to the resulting column-orthogonal matrix.

3: Output the projection matrix.
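A minimal NumPy sketch of the iterative procedure above; the function and parameter names are mine, not the paper's, and Sp, Sl are assumed symmetric positive semidefinite.

```python
import numpy as np

def iterative_trace_ratio(Sp, Sl, m, T_max=100, tol=1e-8):
    """Iterative trace ratio sketch: maximize Tr(W^T Sp W) / Tr(W^T Sl W)
    over column-orthogonal W of shape (d, m)."""
    d = Sp.shape[0]
    rng = np.random.default_rng(0)
    W = np.linalg.qr(rng.normal(size=(d, m)))[0]   # arbitrary orthogonal init
    lam = np.trace(W.T @ Sp @ W) / np.trace(W.T @ Sl @ W)
    for _ in range(T_max):
        # Trace difference step: leading eigenvectors of Sp - lam * Sl.
        vals, vecs = np.linalg.eigh(Sp - lam * Sl)
        W = vecs[:, ::-1][:, :m]                   # m largest eigenvalues
        new_lam = np.trace(W.T @ Sp @ W) / np.trace(W.T @ Sl @ W)
        converged = new_lam - lam < tol            # ratio never decreases
        lam = new_lam
        if converged:
            break
    return W, lam
```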

Tensor Subspace Learning Algorithms

Traditional tensor discriminant algorithms:

- Tensor Subspace Analysis (He et al.)
- Two-dimensional Linear Discriminant Analysis (Ye et al.)
- Discriminant Analysis with Tensor Representation (Yan et al.)

Common traits:

- project the tensor along different dimensions or ways
- projection matrices for the different dimensions are derived iteratively
- solve a trace ratio optimization problem
- DO NOT CONVERGE!

Discriminant Analysis Objective

- Solve the projection matrices iteratively: leave one projection matrix as the variable while keeping the others constant.
- No closed-form solution.
- Works on the mode-k unfolding of the tensor.
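Mode-k unfolding is mechanical; a short NumPy sketch:

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: move mode k to the front and flatten the rest,
    giving an (I_k x prod of the other mode sizes) matrix."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.arange(24).reshape(2, 3, 4)
print(unfold(X, 1).shape)   # (3, 8)
```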

Objective Deduction

Discriminant Analysis Objective

Trace ratio: a general formulation for the objectives of the discriminant-analysis-based algorithms.

- DATER: the numerator and denominator are the between-class and within-class scatters of the unfolded data.
- TSA: the scatters are built with a diagonal weight matrix constructed from the image manifold.

Disagreement between the Objective and the Optimization Process

Why do the previous algorithms not converge?

- They fall back on GEVD at each step: the conversion from Trace Ratio to Ratio Trace induces an inconsistency among the objectives of the different dimensions!

from Trace Ratio to Trace Difference

What will we do?

Objective (for each mode of the tensor):

W* = argmax_{W^T W = I} Tr(W^T S_p W) / Tr(W^T S_l W)

Define lambda_t as the trace ratio attained by the current projection matrices. Then, as in the linear case, the trace ratio problem is converted into a trace difference problem:

Find W_{t+1} = argmax_{W^T W = I} Tr(W^T (S_p - lambda_t S_l) W).

Constraint: W^T W = I.

Let W_{t+1} = [w_1, ..., w_m], the leading eigenvectors of S_p - lambda_t S_l. As before, the maximum trace difference is nonnegative, so the objective rises monotonically, and now the projection matrices of the different dimensions share the same objective.

Main Algorithm

1: Initialization. Initialize the projection matrices U_1, ..., U_n as arbitrary column-orthogonal matrices.

2: Iterative optimization.
For t = 1, 2, ..., T_max, do:
  For k = 1, 2, ..., n, do:
    1. Set lambda to the current trace ratio value.
    2. Compute S_p^k and S_l^k from the mode-k unfolded data projected by the other modes' current matrices.
    3. Conduct the eigenvalue decomposition of S_p^k - lambda S_l^k.
    4. Reshape the projection directions from the leading eigenvectors.
    5. Update U_k with the resulting column-orthogonal matrix.

3: Output the projection matrices.
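Putting the pieces together, a self-contained sketch of the alternating optimization. The scatter construction here follows LDA-style scatters of the unfolded data and is an assumption, since the slide's exact formulas are not shown; dims[k] must not exceed the k-th mode size.

```python
import numpy as np

def unfolded_scatters(Z, y):
    """Between/within scatters of mode-k unfolded samples Z: (N, I_k, q)."""
    y = np.asarray(y)
    M = Z.mean(axis=0)
    Ik = Z.shape[1]
    Sb, Sw = np.zeros((Ik, Ik)), np.zeros((Ik, Ik))
    for c in np.unique(y):
        Zc = Z[y == c]
        Mc = Zc.mean(axis=0)
        Sb += len(Zc) * (Mc - M) @ (Mc - M).T
        Sw += sum((Zi - Mc) @ (Zi - Mc).T for Zi in Zc)
    return Sb, Sw

def tensor_trace_ratio(X, y, dims, T_max=10):
    """Alternating trace-ratio sketch for sample tensors X: (N, I_1, ..., I_n)."""
    rng = np.random.default_rng(0)
    n = X.ndim - 1
    Us = [np.linalg.qr(rng.normal(size=(X.shape[k + 1], dims[k])))[0]
          for k in range(n)]
    for _ in range(T_max):
        for k in range(n):
            Y = X
            for j in range(n):                   # project every mode except k
                if j != k:
                    Y = np.moveaxis(np.tensordot(Y, Us[j], axes=(j + 1, 0)),
                                    -1, j + 1)
            # Mode-k unfold every sample.
            Z = np.stack([np.moveaxis(Yi, k, 0).reshape(Yi.shape[k], -1)
                          for Yi in Y])
            Sb, Sw = unfolded_scatters(Z, y)
            lam = (np.trace(Us[k].T @ Sb @ Us[k])
                   / np.trace(Us[k].T @ Sw @ Us[k]))
            vals, vecs = np.linalg.eigh(Sb - lam * Sw)
            Us[k] = vecs[:, ::-1][:, :dims[k]]   # leading eigenvectors
    return Us
```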

Highlights of the Trace Ratio Based Algorithm

- The objective value is guaranteed to increase monotonically, and the multiple projection matrices are proved to converge.
- Only eigenvalue decomposition is applied in the iterative optimization, which makes the algorithm extremely efficient.
- Enhanced potential classification capability of the low-dimensional representation derived from the subspace learning algorithms.
- The first work to give a convergent solution to general tensor-based subspace learning.

Experimental Results: Projection Visualization

(Figure: visualization of the projection matrix W of PCA, ratio trace based LDA, and trace ratio based LDA (ITR) on the FERET database.)

Experimental Results: Face Recognition (Linear)

- Comparison: trace ratio based LDA vs. ratio trace based LDA (PCA+LDA)
- Comparison: trace ratio based MFA vs. ratio trace based MFA (PCA+MFA)

Experimental Results: Face Recognition (Kernelization)

- Trace ratio based KDA vs. ratio trace based KDA
- Trace ratio based KMFA vs. ratio trace based KMFA


Experimental Results: UCI Datasets

Testing classification errors on three UCI databases for both linear and kernel-based algorithms. Results are obtained from 100 realizations of randomly generated 70/30 splits of the data.

Experimental Results: Monotonicity of the Objective & Projection Matrix Convergence

Experimental Results: Face Recognition

1. TMFA TR mostly outperforms all the other methods considered in this work, with only one exception for the case G5P5 on the CMU PIE database.

2. For vector-based algorithms, the trace ratio based formulation is consistently superior to the ratio trace based one for subspace learning.

3. Tensor representation has the potential to improve the classification performance for both trace ratio and ratio trace formulations of subspace learning.

Correspondence Propagation

Geometric Structures & Feature Structures

- Explore the geometric structures and feature domain consistency for object registration.

Aim

- Exploit the geometric structures of sample features.
- Introduce human interaction for correspondence guidance.
- Seek a mapping of features between sets of different cardinalities.
- Objects are represented as sets of feature points.

Graph Construction

- Spatial graph
- Similarity graph

From Spatial Graph to Categorical Product Graph

Assignment Neighborhood Definition:

Suppose v_i, v_j are vertices of the spatial graph G^x of one object and v_i', v_j' are vertices of the spatial graph G^y of the other. Two assignments (v_i, v_i') and (v_j, v_j') are neighbors iff both pairs (v_i, v_j) and (v_i', v_j') are neighbors in G^x and G^y respectively, namely

(v_i, v_i') ~ (v_j, v_j')  iff  v_i ~ v_j and v_i' ~ v_j',

where ~ means the two vertices are neighbors.

From Spatial Graph to Categorical Product Graph

The adjacency matrix W^a of the product graph can be derived from

W^a = W^x ⊗ W^y,

where ⊗ is the matrix Kronecker product operator.

Smoothness along the spatial distribution is then enforced on the assignment scores through this product-graph structure.
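A tiny NumPy illustration of the Kronecker construction; the two spatial graphs below are made up.

```python
import numpy as np

# Adjacency matrices of two small spatial graphs: 3 feature points in
# image X, 2 feature points in image Y.
Wx = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]])
Wy = np.array([[0, 1],
               [1, 0]])

# Adjacency of the categorical product graph over the 3*2 candidate
# assignments: two assignments are adjacent iff both endpoint pairs are.
Wa = np.kron(Wx, Wy)
print(Wa.shape)               # (6, 6)
# Assignment (x0 -> y0) and assignment (x1 -> y1) are neighbors:
print(Wa[0 * 2 + 0, 1 * 2 + 1])   # 1
```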

Feature Domain Consistency & Soft Constraints

- Similarity measure: computed in the feature domain for each candidate assignment.
- One-to-one correspondence penalty: expressed with the matrix Hadamard product, with a sum taken over all elements of the resulting matrix T.

Assignment Labeling

Labeled assignments: reliable correspondences & inhomogeneous pairs.

- Inhomogeneous pair labeling: assign zeros to those pairs with extremely low similarity scores.
- Reliable pair labeling: assign ones to the reliable pairs.

Reliable Correspondence Propagation

Arrangement: the assignment variables, the coefficient matrices, and the spatial adjacency matrices are rearranged so that labeled and unlabeled assignments are grouped.

Reliable Correspondence Propagation

Objective, combining three terms:

- Feature domain agreement
- Geometric smoothness regularization
- One-to-one correspondence penalty

Reliable Correspondence Propagation: Solution

Relax the assignment variables to the real domain; the propagation objective then admits a closed-form solution for the unlabeled assignments in terms of the labeled ones.

Rearrangement and Discretization

- Inverse process of the element arrangement: reshape the assignment vector into a matrix.
- Thresholding: assignments larger than a threshold are regarded as correspondences.
- Eliciting: sequentially pick the assignments with the largest assignment scores.
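A sketch of both discretization strategies; the function name and the one-to-one suppression rule in the eliciting branch are my choices for illustration.

```python
import numpy as np

def discretize(x, n1, n2, threshold=None, top_k=None):
    """Reshape an assignment vector back into an (n1 x n2) score matrix and
    binarize it, either by thresholding or by greedily eliciting the
    highest-scoring one-to-one assignments."""
    S = np.asarray(x, dtype=float).reshape(n1, n2)  # inverse arrangement
    if threshold is not None:
        return S >= threshold
    M = np.zeros((n1, n2), dtype=bool)
    for _ in range(top_k if top_k is not None else min(n1, n2)):
        i, j = np.unravel_index(np.argmax(S), S.shape)
        M[i, j] = True
        S[i, :] = -np.inf          # suppress row/column to keep matches
        S[:, j] = -np.inf          # one-to-one
    return M
```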

Semi-supervised & Unsupervised Frameworks

- Exact pairwise correspondence labeling: users give exact correspondence guidance.
- Obscure correspondence guidance: rough correspondence of image parts.

Experimental Results. Details

Automatic feature matching score on the Oxford real image transformation dataset. The transformations include viewpoint change ((a) Graffiti and (b) Wall sequences), image blur ((c) Bikes and (d) Trees sequences), zoom and rotation ((e) Bark and (f) Boat sequences), illumination variation ((g) Leuven), and JPEG compression ((h) UBC).

Summary

Future Works

- From point-to-point correspondence to set-to-set correspondence.
- Multi-scale correspondence searching.
- Combine object segmentation and registration.

Publications:

[1] Huan Wang, Shuicheng Yan, Thomas Huang, and Xiaoou Tang, "A Convergent Solution to Tensor Subspace Learning", International Joint Conference on Artificial Intelligence (IJCAI 07, regular paper), Jan. 2007.

[2] Huan Wang, Shuicheng Yan, Thomas Huang, and Xiaoou Tang, "Trace Ratio vs. Ratio Trace for Dimensionality Reduction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Jun. 2007.

[3] Huan Wang, Shuicheng Yan, Thomas Huang, Jianzhuang Liu, and Xiaoou Tang, "Transductive Regression Piloted by Inter-Manifold Relations", International Conference on Machine Learning (ICML 07), Jun. 2007.

[4] Huan Wang, Shuicheng Yan, Thomas Huang, and Xiaoou Tang, "Maximum Unfolded Embedding: Formulation, Solution, and Application for Image Clustering", ACM International Conference on Multimedia (ACM MM 06), Oct. 2006.

[5] Shuicheng Yan, Huan Wang, Thomas Huang, and Xiaoou Tang, "Ranking with Uncertain Labels", IEEE International Conference on Multimedia & Expo (ICME 07), May 2007.

[6] Shuicheng Yan, Huan Wang, Xiaoou Tang, and Thomas Huang, "Exploring Feature Descriptors for Face Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 07, oral), Apr. 2007.


Thank You!

Transductive Regression on Multi-Class Data

- Explore the intrinsic feature structures w.r.t. the different classes for regression.

Regression Algorithms. Review

Exploit the manifold structure to guide the regression:

- Belkin et al., "Regularization and semi-supervised learning on large graphs":
  - transduces the function values from the labeled data to the unlabeled data utilizing local neighborhood relations
  - global optimization for a robust prediction
- Cortes et al., "On transductive regression":
  - Tikhonov regularization on the Reproducing Kernel Hilbert Space (RKHS)

Classification can be regarded as a special version of regression:

- Fei Wang et al., "Label Propagation Through Linear Neighborhoods" (see the sketch below):
  - an iterative procedure is deduced to propagate the class labels within local neighborhoods and has been proved convergent
  - regression values are constrained to 0 and 1 (binary): samples belonging to the corresponding class map to 1, otherwise to 0
  - the convergence point can be deduced from the regularization framework
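For concreteness, a generic graph label propagation sketch in the spirit of the methods reviewed above; it is not Wang et al.'s exact LNP update, and the parameter names are mine.

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.99, n_iter=200):
    """Iterative graph label propagation sketch. W: (n, n) nonnegative
    affinity matrix; Y: (n, c) with one-hot rows for labeled points and
    zero rows for unlabeled ones."""
    P = W / W.sum(axis=1, keepdims=True)     # row-normalized transition matrix
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * P @ F + (1 - alpha) * Y  # propagate, then re-clamp labels
    return F.argmax(axis=1)                  # hard class decision
```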

The Problem We are Facing

- Age estimation w.r.t. different genders (FG-NET Aging Database)
- Pose estimation w.r.t. different genders, illuminations, expressions, and persons (CMU-PIE Dataset)

The Problem We are Facing

Regression on multi-class samples. Traditional algorithms:

- All samples are considered as belonging to the same class.
- Samples close in the data space X are assumed to have similar function values (smoothness along the manifold).
- For an incoming sample, no class information is given.

Our goal: utilize the class information, which is easy to obtain for the training data, in the training process to boost the performance.

TRIM. Intra-Manifold Regularization

- Respective intrinsic graphs are built for the different sample classes.
- Correspondingly, the intra-manifold regularization terms for the different classes are calculated separately, with variants for p = 1 and p = 2.
- It may not be proper to preserve smoothness between samples from different classes.
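A sketch of the p = 2 intra-manifold penalty under these assumptions: per-class k-NN graphs with unnormalized Laplacians. The construction is mine for illustration and is not necessarily the paper's exact weighting.

```python
import numpy as np

def intra_manifold_penalty(f, X, y, k=5):
    """Sum over classes of the Laplacian smoothness penalty f_c^T L_c f_c,
    each L_c built from a k-NN graph on that class only."""
    total = 0.0
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        Xc, fc = X[idx], f[idx]
        D = np.linalg.norm(Xc[:, None] - Xc[None, :], axis=-1)
        W = np.zeros((len(idx), len(idx)))
        for i in range(len(idx)):
            for j in np.argsort(D[i])[1:k + 1]:   # skip self
                W[i, j] = W[j, i] = 1.0
        L = np.diag(W.sum(axis=1)) - W            # unnormalized graph Laplacian
        total += fc @ L @ fc
    return total
```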

TRIM. Inter-Manifold Regularization

Assumption:

- Samples with similar labels generally lie in similar relative positions on the corresponding sub-manifolds.

Motivation:

1. Align the sub-manifolds of the different class samples according to the labeled points and graph structures.
2. Derive the correspondence in the aligned space using the nearest neighbor technique.

TRIM. Manifold Alignment

- Minimize the correspondence error on the landmark points.
- Hold the intra-manifold structures.
- A global compactness regularization term is included, built on the Laplacian matrix of the graph whose weights are 1 if the two samples are of different classes and 0 otherwise.

TRIM. Inter-Manifold Regularization

- Concatenate the derived inter-manifold graphs to form the overall inter-manifold graph.
- Apply Laplacian regularization on it.

Objective Deduction

TRIM. Objective

- Fitness term
- RKHS norm
- Intra-manifold regularization
- Inter-manifold regularization

TRIM. Solution

The solution to the minimization of the objective admits an expansion over the labeled and unlabeled samples (generalized representer theorem):

f(x) = sum_i alpha_i k(x, x_i).

Thus the minimization over the Hilbert space boils down to minimizing the coefficient vector alpha over R^N. The minimizer is given in closed form, where K is the N x N Gram matrix of the labeled and unlabeled points over all the sample classes.
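A hedged sketch of what such a closed-form solve can look like, following the standard LapRLS-style derivation for a fitness term plus RKHS norm plus two Laplacian regularizers; the exact weighting and normalization in TRIM may differ, and all names here are mine.

```python
import numpy as np

def trim_style_solve(K, y_labeled, labeled_idx, L_intra, L_inter,
                     gamma_K=1e-2, gamma_A=1e-1, gamma_E=1e-1):
    """Closed-form expansion coefficients for a kernel regression objective
    of the TRIM flavor: squared fitness on the labeled points + RKHS norm
    + intra- and inter-manifold Laplacian regularizers."""
    N = K.shape[0]
    J = np.zeros((N, N))                       # selects the labeled points
    J[labeled_idx, labeled_idx] = 1.0
    y = np.zeros(N)
    y[labeled_idx] = y_labeled
    A = (J @ K + gamma_K * np.eye(N)
         + (gamma_A * L_intra + gamma_E * L_inter) @ K)
    alpha = np.linalg.solve(A, y)
    return alpha                               # f(x) = sum_i alpha_i k(x, x_i)
```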

TRIM. Generalization

- For out-of-sample data, the labels can be estimated using the same expansion f(x) = sum_i alpha_i k(x, x_i).
- Note that in this framework the class information for the incoming sample is not required at the prediction stage.
- An original (linear) version without the kernel is also available.
Two Moons

Experiments

YAMAHA Dataset

Experiments.Age Dataset

TRIM vs traditional graph Laplacian
regularized regression for the training set
evaluation on YAMAHA database.

Open set evaluation for the kernelized regression on the YAMAHA
database. (left) Regression on the training set. (right) Regression
on out
-
of
-
sample data