
# Exploring Intrinsic Structures from Samples: Supervised, Unsupervised, and Semisupervised Frameworks

Supervised by Prof. Xiaoou Tang & Prof. Jianzhuang Liu

Outline

- Trace Ratio Optimization
- Tensor Subspace Learning
- Correspondence Propagation

Preserve sample feature structures; explore the geometric structures and feature-domain relations concurrently.

Notations & Introductions

Concept. Tensor

Tensor: multi-dimensional (or multi-way) arrays of components.

Concept. Tensor: Application

Real-world data are affected by multifarious factors. For person identification, for example, we may have facial images with different

- views and poses,
- lighting conditions,
- expressions.

The observed data evolve differently along the variation of each factor, as well as along the image columns and rows.

Concept. Tensor: Application

It is desirable to dig out the intrinsic connections among the different factors affecting the data, and a tensor provides a concise and effective representation.

(Figure: a tensor of face images indexed by illumination, pose, expression, image columns, image rows, and images.)

Concept. Dimensionality Reduction

- Preserve sample feature structures
- Enhance classification capability
- Reduce the computational complexity

Trace Ratio Optimization. Definition

$$\max_{W} \frac{\operatorname{Tr}(W^\top S_p W)}{\operatorname{Tr}(W^\top S_l W)} \quad \text{s.t. } W^\top W = I,$$

where $S_p$ and $S_l$ are positive semidefinite.

Homogeneous property: scaling $W \to cW$ leaves the objective unchanged.

Special case: when $W$ is a vector $w$, the objective reduces to the generalized Rayleigh quotient $w^\top S_p w / w^\top S_l w$, solvable by GEVD.

With the orthogonality constraint $W^\top W = I$, the general problem is an optimization over the Grassmann manifold.

Trace Ratio Formulation: Linear Discriminant Analysis

$$\max_{W^\top W = I} \frac{\operatorname{Tr}(W^\top S_b W)}{\operatorname{Tr}(W^\top S_w W)},$$

where $S_b$ is the between-class scatter and $S_w$ the within-class scatter.
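A minimal NumPy sketch of this instantiation may help. The scatter construction is standard LDA; `lda_scatters` and `trace_ratio` are hypothetical helper names, not the authors' code.

```python
import numpy as np

def lda_scatters(X, y):
    """Between-class (S_b) and within-class (S_w) scatter matrices.
    X: (n, d) data matrix; y: (n,) integer class labels."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mu)[:, None]
        S_b += len(Xc) * diff @ diff.T                       # between-class scatter
        S_w += (Xc - Xc.mean(axis=0)).T @ (Xc - Xc.mean(axis=0))  # within-class scatter
    return S_b, S_w

def trace_ratio(W, S_p, S_l):
    """Objective Tr(W^T S_p W) / Tr(W^T S_l W)."""
    return np.trace(W.T @ S_p @ W) / np.trace(W.T @ S_l @ W)
```

Note the homogeneous property in action: scaling W scales numerator and denominator identically, so the ratio is unchanged.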

Trace Ratio Formulation: Kernel Discriminant Analysis

The same trace ratio objective is posed in the kernel-induced feature space. Decompose the projection as a linear combination of the mapped samples, and let the expansion coefficients be the new variables: the problem becomes a trace ratio over the kernel Gram matrix, again w.r.t. an orthogonality constraint.

Trace Ratio Formulation: Marginal Fisher Analysis

- Intra-class graph (intrinsic graph): connects each sample with its nearby samples of the same class.
- Inter-class graph (penalty graph): connects marginal sample pairs from different classes.

Trace Ratio Formulation: Kernel Marginal Fisher Analysis

As with kernel discriminant analysis, decompose the projection over the mapped samples and let the expansion coefficients be the variables, yielding a trace ratio problem on the Gram matrix w.r.t. an orthogonality constraint.

Trace Ratio Formulation: Tensor Extensions

- 2-D Linear Discriminant Analysis: left projection & right projection; fix one projection matrix and optimize the other.
- Discriminant Analysis with Tensor Representation
- Tensor Subspace Analysis

Conventional Solution: GEVD

GEVD suffers from the singularity problem of the within-class scatter matrix, addressed by variants such as null-space LDA and dual-space LDA.

From Trace Ratio to Trace Difference: Preprocessing

Remove the null space of the total scatter matrix with Principal Component Analysis.
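This preprocessing step can be sketched in NumPy as below (the function name and tolerance are illustrative assumptions): project the centered data onto the principal directions with non-negligible variance, discarding the null space.

```python
import numpy as np

def remove_null_space(X, tol=1e-10):
    """Drop the null space of the (centered) data scatter via PCA.
    Returns the reduced data and the (d, r) projection basis."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s.max()       # directions with non-negligible variance
    P = Vt[keep].T
    return Xc @ P, P
```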

From Trace Ratio to Trace Difference

Objective:

$$\max_{W^\top W = I} \frac{\operatorname{Tr}(W^\top S_p W)}{\operatorname{Tr}(W^\top S_l W)}.$$

Define

$$\lambda_t = \frac{\operatorname{Tr}(W_t^\top S_p W_t)}{\operatorname{Tr}(W_t^\top S_l W_t)}.$$

Then the trace ratio problem is converted into a trace difference problem: find

$$W_{t+1} = \arg\max_{W^\top W = I} \operatorname{Tr}\!\left(W^\top (S_p - \lambda_t S_l)\, W\right),$$

so that the objective value does not decrease.

Constraint: $W^\top W = I$. Let $W_{t+1}$ be formed by the eigenvectors of $S_p - \lambda_t S_l$ corresponding to its largest eigenvalues. We then have $\operatorname{Tr}(W_{t+1}^\top (S_p - \lambda_t S_l) W_{t+1}) \ge \operatorname{Tr}(W_t^\top (S_p - \lambda_t S_l) W_t) = 0$, and thus $\lambda_{t+1} \ge \lambda_t$: the objective rises monotonically!

Main Algorithm

1: Initialization. Initialize $W_0$ as an arbitrary column-orthogonal matrix.

2: Iterative optimization. For $t = 1, 2, \dots, T_{\max}$, do:

   1. Set $\lambda_t = \operatorname{Tr}(W_{t-1}^\top S_p W_{t-1}) / \operatorname{Tr}(W_{t-1}^\top S_l W_{t-1})$.
   2. Conduct the eigenvalue decomposition of $S_p - \lambda_t S_l$.
   3. Reshape the projection directions: form $W_t$ from the leading eigenvectors.
   4. Repeat until convergence.

3: Output the projection matrix $W$.
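The procedure above can be sketched in NumPy as follows. This is a minimal illustration of the trace-difference iteration, assuming a symmetric positive semidefinite $S_p$ and positive definite $S_l$; it is not the authors' implementation.

```python
import numpy as np

def iterative_trace_ratio(S_p, S_l, m, T_max=100, tol=1e-10):
    """Maximize Tr(W^T S_p W) / Tr(W^T S_l W) s.t. W^T W = I."""
    d = S_p.shape[0]
    W = np.eye(d)[:, :m]                      # arbitrary column-orthogonal init
    lam = np.trace(W.T @ S_p @ W) / np.trace(W.T @ S_l @ W)
    for _ in range(T_max):
        # Eigen-decomposition of the trace-difference matrix
        vals, vecs = np.linalg.eigh(S_p - lam * S_l)
        W = vecs[:, np.argsort(vals)[::-1][:m]]   # top-m eigenvectors
        lam_new = np.trace(W.T @ S_p @ W) / np.trace(W.T @ S_l @ W)
        if lam_new - lam < tol:               # objective rises monotonically
            lam = lam_new
            break
        lam = lam_new
    return W, lam
```

Each step only needs one eigenvalue decomposition, which is what makes the scheme efficient.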

Tensor Subspace Learning Algorithms

- Tensor Subspace Analysis (He et al.)
- Two-dimensional Linear Discriminant Analysis (Ye et al.)
- Discriminant Analysis with Tensor Representation (Yan et al.)

These algorithms project the tensor along its different dimensions (ways); the projection matrices for the different dimensions are derived iteratively, each step solving a trace ratio optimization problem. However, they DO NOT CONVERGE!

Discriminant Analysis Objective

Solve the projection matrices iteratively: leave one projection matrix as the variable while keeping the others constant. There is no closed-form solution; each subproblem works on the mode-k unfolding of the tensor.
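Mode-k unfolding can be sketched in one line of NumPy. The ordering of the trailing dimensions is a convention choice; this sketch flattens in C order.

```python
import numpy as np

def unfold(T, k):
    """Mode-k unfolding: a (T.shape[k], prod of remaining dims) matrix
    whose rows index the k-th way of the tensor."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)
```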

Objective Deduction

Trace ratio is a general formulation for the objectives of the discriminant-analysis-based algorithms:

- DATER: within-class scatter vs. between-class scatter of the unfolded data.
- TSA: diagonal weight matrix and affinity weights constructed from the image manifold.

Disagreement between the Objective and the Optimization Process

Why do previous algorithms not converge? They rely on GEVD, and the conversion from trace ratio to ratio trace induces an inconsistency among the objectives of different dimensions!

From Trace Ratio to Trace Difference (Tensor Case)

The same conversion applies mode by mode. With

$$\lambda_t = \frac{\operatorname{Tr}(W_t^\top S_p W_t)}{\operatorname{Tr}(W_t^\top S_l W_t)}$$

computed from the current projections, each mode solves the trace difference problem $\max_{W^\top W = I} \operatorname{Tr}(W^\top (S_p - \lambda_t S_l) W)$ under the orthogonality constraint, whose solution is given by the leading eigenvectors of $S_p - \lambda_t S_l$. The objective rises monotonically, and the projection matrices of the different dimensions share the same objective.

Main Algorithm

1: Initialization. Initialize $W_1^0, \dots, W_n^0$ as arbitrary column-orthogonal matrices.

2: Iterative optimization. For $t = 1, 2, \dots, T_{\max}$, do:

   For $k = 1, 2, \dots, n$, do:

   1. Set $\lambda$ to the current trace ratio value.
   2. Compute the mode-$k$ scatter matrices from the data projected along the other modes.
   3. Conduct the eigenvalue decomposition of the trace-difference matrix.
   4. Reshape the projection directions to form $W_k^t$.
   5. Continue to the next mode.

3: Output the projection matrices $W_1, \dots, W_n$.

Highlights of the Trace Ratio Based Algorithm

- The objective value is guaranteed to increase monotonically, and the multiple projection matrices are proved to converge.
- Only eigenvalue decomposition is applied in the iterative optimization, which makes the algorithm extremely efficient.
- Enhanced potential classification capability of the derived low-dimensional representation from the subspace learning algorithms.
- The first work to give a convergent solution to general tensor-based subspace learning.

Experimental Results: Projection Visualization

Visualization of the projection matrix W of PCA, ratio trace based LDA, and trace ratio based LDA (ITR) on the FERET database.

Experimental Results: Face Recognition (Linear)

- Comparison: trace ratio based LDA vs. ratio trace based LDA (PCA+LDA)
- Comparison: trace ratio based MFA vs. ratio trace based MFA (PCA+MFA)

Experimental Results: Face Recognition (Kernelization)

- Trace ratio based KDA vs. ratio trace based KDA
- Trace ratio based KMFA vs. ratio trace based KMFA

Experimental Results: UCI Datasets

Testing classification errors on three UCI databases for both linear and kernel-based algorithms. Results are obtained from 100 realizations of randomly generated 70/30 splits of data.

Experimental Results: Monotonicity of the Objective & Projection Matrix Convergence

Experimental Results: Face Recognition

1. TMFA TR mostly outperforms all the other methods considered in this work, with only one exception, for the case G5P5 on the CMU PIE database.

2. For vector-based algorithms, the trace ratio based formulation is consistently superior to the ratio trace based one for subspace learning.

3. Tensor representation has the potential to improve the classification performance for both trace ratio and ratio trace formulations of subspace learning.

Correspondence Propagation

Objective: explore the geometric structures and feature-domain consistency for object registration.

Aim:

- Exploit the geometric structures of sample features.
- Introduce human interaction for correspondence guidance.
- Seek a mapping of features between sets of different cardinalities.

Objects are represented as sets of feature points.

Graph Construction

- Spatial graph
- Similarity graph

From Spatial Graph to Categorical Product Graph

Assignment neighborhood definition: suppose $x_i, x_j$ are vertices of the spatial graph $G^x$ and $y_a, y_b$ are vertices of $G^y$. Two assignments $(x_i, y_a)$ and $(x_j, y_b)$ are neighbors iff both pairs $(x_i, x_j)$ and $(y_a, y_b)$ are neighbors in $G^x$ and $G^y$ respectively, where "neighbors" means the two vertices are adjacent in their graph.

From Spatial Graph to Categorical Product Graph

The adjacency structure of the product graph can be derived from the Kronecker product of the two spatial adjacency matrices, $W = W^x \otimes W^y$, where $\otimes$ is the matrix Kronecker product operator. This encodes smoothness along the spatial distribution.
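The neighborhood definition above is exactly what the Kronecker product encodes; a small NumPy check (the graphs and index layout here are illustrative):

```python
import numpy as np

# Adjacency matrices of two small spatial graphs (symmetric 0/1)
Wx = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]])
Wy = np.array([[0, 1],
               [1, 0]])

# Categorical product graph: assignments (i, a) and (j, b) are neighbors
# iff i ~ j in Wx AND a ~ b in Wy, i.e. W = Wx kron Wy.
W = np.kron(Wx, Wy)

def neighbors(i, a, j, b):
    """Assignment (i, a) occupies row i * |Vy| + a of W."""
    n_y = Wy.shape[0]
    return W[i * n_y + a, j * n_y + b] == 1
```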

Feature Domain Consistency & Soft Constraints

Similarity measure: assignments should agree with the feature-domain similarity scores.

One-to-one correspondence penalty: expressed with the matrix Hadamard product and an operator that returns the sum of all elements of its matrix argument T.

Assignment Labeling

Labeled assignments: reliable correspondences & inhomogeneous pairs.

- Inhomogeneous pair labeling: assign zeros to those pairs with extremely low similarity scores.
- Reliable pair labeling: assign ones to those reliable pairs.

Reliable Correspondence Propagation

Arrangement: split the assignment variables into labeled and unlabeled parts, and rearrange the coefficient matrices accordingly.

Objective:

- Feature domain agreement
- Geometric smoothness regularization
- One-to-one correspondence penalty

Reliable Correspondence Propagation: Solution

Relax the assignment variables to the real domain; the objective then admits a closed-form solution.

Rearrangement and Discretization

Inverse process of the element arrangement: reshape the assignment vector into a matrix. Then:

- Thresholding: assignments larger than a threshold are regarded as correspondences.
- Eliciting: sequentially pick the assignments with the largest assignment scores.
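The two discretization strategies can be sketched as follows (function names are hypothetical; `F` stands for the reshaped assignment-score matrix):

```python
import numpy as np

def threshold_matches(F, thresh):
    """Keep every assignment whose score exceeds the threshold."""
    return [tuple(idx) for idx in np.argwhere(F > thresh)]

def elicit_matches(F):
    """Greedily pick the largest remaining score, then remove its row
    and column to enforce one-to-one correspondence."""
    F = F.astype(float).copy()
    matches = []
    for _ in range(min(F.shape)):
        i, j = np.unravel_index(np.argmax(F), F.shape)
        matches.append((i, j))
        F[i, :] = -np.inf
        F[:, j] = -np.inf
    return matches
```

Thresholding allows many-to-many matches; eliciting enforces one-to-one correspondence at the cost of a greedy approximation.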

Semi-supervised & Unsupervised Frameworks

- Exact pairwise correspondence labeling: users give exact correspondence guidance.
- Obscure correspondence guidance: rough correspondence of image parts.

Experimental Results: Demonstration

Automatic feature matching score on the Oxford real image transformation dataset. The transformations include viewpoint change ((a) Graffiti and (b) Wall sequences), image blur ((c) Bikes and (d) Trees sequences), zoom and rotation ((e) Bark and (f) Boat sequences), illumination variation ((g) Leuven), and JPEG compression ((h) UBC).


Summary & Future Work

- From point-to-point correspondence to set-to-set correspondence.
- Multi-scale correspondence searching.
- Combine object segmentation and registration.

Publications

[1] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "A Convergent Solution to Tensor Subspace Learning", International Joint Conference on Artificial Intelligence (IJCAI 07, regular paper), Jan. 2007.

[2] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Trace Ratio vs. Ratio Trace for Dimensionality Reduction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Jun. 2007.

[3] Huan Wang, Shuicheng Yan, Thomas Huang, Jianzhuang Liu and Xiaoou Tang, "Transductive Regression Piloted by Inter-Manifold Relations", International Conference on Machine Learning (ICML 07), Jun. 2007.

[4] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, "Maximum Unfolded Embedding: Formulation, Solution, and Application for Image Clustering", ACM International Conference on Multimedia (ACM MM 06), Oct. 2006.

[5] Shuicheng Yan, Huan Wang, Thomas Huang and Xiaoou Tang, "Ranking with Uncertain Labels", IEEE International Conference on Multimedia & Expo (ICME 07), May 2007.

[6] Shuicheng Yan, Huan Wang, Xiaoou Tang and Thomas Huang, "Exploring Feature Descriptors for Face Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 07, oral), Apr. 2007.

Thank You!

Transductive Regression on Multi-Class Data

Explore the intrinsic feature structures w.r.t. different classes for regression.

Regression Algorithms. Review

Exploit the manifold structures to guide the regression:

- Belkin et al., "Regularization and Semi-supervised Learning on Large Graphs": transduces the function values from the labeled data to the unlabeled ones utilizing local neighborhood relations; a global optimization yields a robust prediction.

- Cortes et al., "On Transductive Regression": Tikhonov regularization on the Reproducing Kernel Hilbert Space (RKHS).

Classification can be regarded as a special case of regression:

- Fei Wang et al., "Label Propagation Through Linear Neighborhoods": an iterative procedure propagates the class labels within local neighborhoods and has been proved convergent. Regression values are constrained to 0 and 1 (binary): samples belonging to the corresponding class map to 1, and to 0 otherwise. The convergence point can be deduced from the regularization framework.
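A generic member of this propagation family can be sketched as below. This is the usual normalized iteration F <- alpha * S F + (1 - alpha) * Y, a stand-in for the idea rather than the specific linear-neighborhood weight construction of Wang et al.

```python
import numpy as np

def propagate_labels(W, y, alpha=0.9, iters=200):
    """Iterative label propagation on a graph.
    W: (n, n) symmetric affinity matrix.
    y: (n,) class indicator, 1/0 for labeled samples and -1 for unlabeled."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))        # symmetrically normalized affinity
    Y = np.where(y >= 0, y, 0).astype(float)
    F = Y.copy()
    for _ in range(iters):                 # contraction for alpha < 1
        F = alpha * S @ F + (1 - alpha) * Y
    return F
```

Because alpha < 1, the iteration is a contraction and its fixed point matches the minimizer of the corresponding regularization objective.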

The Problem We Are Facing

- Age estimation w.r.t. different genders (FG-NET Aging Database)
- Pose estimation w.r.t. different genders, illuminations, expressions, and persons (CMU-PIE Dataset)

The Problem We Are Facing

Previous methods consider all samples as belonging to the same class: samples close in the data space X are assumed to have similar function values (smoothness along the manifold), and for the incoming sample no class information is given.

Goal: utilize the class information in the training process to boost the performance.

Regression on Multi-Class Samples

The class information is easy to obtain for the training data.

TRIM. Intra-Manifold Regularization

Respective intrinsic graphs are built for the different sample classes; correspondingly, the intra-manifold regularization terms for the different classes are calculated separately.

The Regularization

The regularization term on each intrinsic graph is given for the cases p = 1 and p = 2. It may not be proper to preserve smoothness between samples from different classes.

TRIM. Inter-Manifold Regularization

Assumption: samples with similar labels generally lie in similar relative positions on the corresponding sub-manifolds.

Motivation:

1. Align the sub-manifolds of the different class samples according to the labeled points and graph structures.
2. Derive the correspondences in the aligned space using the nearest neighbor technique.

TRIM. Manifold Alignment

- Minimize the correspondence error on the landmark points.
- Hold the intra-manifold structures.

One term is a global compactness regularization, built on the Laplacian matrix of a graph whose weight is 1 if the two samples are of different classes and 0 otherwise.

TRIM. Inter-Manifold Regularization

Concatenate the derived inter-manifold graphs to form the Laplacian regularization.

TRIM. Objective

- Fitness term
- RKHS norm
- Intra-manifold regularization
- Inter-manifold regularization

TRIM. Solution

By the generalized representer theorem, the minimizer of the objective admits an expansion over the labeled and unlabeled samples; the minimization over the Hilbert space thus boils down to minimizing over the coefficient vector. The minimizer is given in closed form in terms of the N × N Gram matrix K of the labeled and unlabeled points over all the sample classes.
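For intuition, here is the closed-form coefficient solve for the standard Laplacian-regularized least squares template that this solution extends. In TRIM the single graph Laplacian `L` would be replaced by the combined intra- and inter-manifold regularizers; the function name and the exact regularization scaling here are illustrative assumptions.

```python
import numpy as np

def laprls_coefficients(K, L, y, labeled, gamma_A=1e-2, gamma_I=1e-2):
    """Expansion coefficients alpha for f(x) = sum_i alpha_i K(x, x_i)
    minimizing (1/l) * sum_labeled (y_i - f(x_i))^2
               + gamma_A * ||f||_K^2 + gamma_I * f^T L f
    (representer theorem: solve over coefficients, not functions)."""
    n = K.shape[0]
    l = len(labeled)
    J = np.zeros((n, n))
    J[labeled, labeled] = 1.0            # selects the labeled samples
    Y = np.zeros(n)
    Y[labeled] = y
    A = J @ K + gamma_A * l * np.eye(n) + gamma_I * l * (L @ K)
    return np.linalg.solve(A, Y)
```

With gamma_I = 0 and all points labeled, this reduces to ordinary kernel ridge regression.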

TRIM. Generalization

For out-of-sample data, the labels can be estimated from the learned expansion. Note that in this framework the class information of the incoming sample is not required at the prediction stage. An original version without the kernel is also available.
Two Moons

Experiments

YAMAHA Dataset

Experiments.Age Dataset