EE 290A: Generalized Principal Component Analysis


Lecture 2 (by Allen Y. Yang): Extensions of PCA

Sastry & Yang © Spring, 2011

EE 290A, University of California, Berkeley


Last time


Challenges in modern data clustering problems.

PCA reduces the dimensionality of the data while retaining as much data variation as possible.

Statistical view: The first $d$ PCs are given by the $d$ leading eigenvectors of the covariance.

Geometric view: Fitting a $d$-dim subspace model via SVD.
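The two views can be compared numerically. The sketch below is a minimal illustration, assuming samples are stored as rows of a NumPy array; the function names `pca_eig` and `pca_svd` and the synthetic data are hypothetical, chosen only for this example.

```python
import numpy as np

def pca_eig(X, d):
    """Statistical view: d leading eigenvectors of the sample covariance.
    X has shape (n_samples, D)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:d]       # indices of the d largest
    return evecs[:, order], evals[order]

def pca_svd(X, d):
    """Geometric view: best-fit d-dim subspace from the SVD of centered data."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T, s[:d] ** 2 / (X.shape[0] - 1)

# Synthetic data with clearly separated variances along each axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ np.diag([5.0, 3.0, 1.0, 0.5, 0.1])
B_eig, var_eig = pca_eig(X, 2)
B_svd, var_svd = pca_svd(X, 2)
```

The basis vectors may differ in sign between the two routes, so it is safer to compare the projectors `B @ B.T` than the bases themselves.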


This lecture


Determine an optimal number of PCs: $d$


Probabilistic PCA


Kernel PCA


Robust PCA shall be discussed later


Determine the number of PCs


Choosing the optimal number of PCs in the noise-free case is straightforward:
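In the noise-free case, one natural reading is that the number of PCs equals the rank of the mean-subtracted data matrix. The following is a minimal sketch under that assumption, with samples stored as rows; the synthetic subspace and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, D = 50, 6
# Two orthogonal directions spanning a 2-dim subspace of R^6 (illustrative choice).
basis = np.array([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]])
offset = rng.normal(size=(1, D))                 # nonzero mean of the data
X = rng.normal(size=(n, 2)) @ basis + offset     # noise-free samples, shape (n, D)

Xc = X - X.mean(axis=0)                          # subtract the sample mean
d_est = np.linalg.matrix_rank(Xc)                # number of nonzero singular values
```

`np.linalg.matrix_rank` counts singular values above a small numerical tolerance, which is why this only works when the data are essentially noise-free.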



In the noisy case


[Figure: plot with the knee point marked]

A Model Selection Problem


With moderate Gaussian noise, keeping 100% fidelity of the data requires preserving all $D$ dimensions.

However, can we still find a tradeoff between model complexity and data fidelity?
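One simple way to trade complexity against fidelity is to keep the smallest number of PCs whose discarded singular-value energy falls below a tolerance. The sketch below assumes that criterion; the function `choose_d`, the tolerance `tau`, and the synthetic data are hypothetical and are not taken from the slides.

```python
import numpy as np

def choose_d(X, tau):
    """Smallest d such that the energy discarded by keeping d PCs is below tau.
    X: (n_samples, D); tau: user-chosen tolerance on the residual energy."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    # residual[d] = sum of s_i^2 over the discarded components i > d
    tail = np.cumsum((s ** 2)[::-1])[::-1]
    residual = np.concatenate([tail, [0.0]])
    return int(np.argmax(residual < tau))        # first index where the test holds

rng = np.random.default_rng(4)
basis = np.array([[1.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]])
signal = rng.normal(size=(200, 2)) @ basis       # 2-dim structure in R^6
X = signal + 0.01 * rng.normal(size=(200, 6))    # moderate Gaussian noise
d = choose_d(X, tau=1.0)
```

A larger `tau` yields a simpler model with lower fidelity, and a smaller `tau` pushes `d` toward the full dimension $D$.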


More principled conditions


Probabilistic PCA: A generative approach



Given sample statistics, (*) contains ambiguities.

Assume $y$ is standard normal, and $\varepsilon$ is isotropic.

Then each observation is also Gaussian.
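Assuming (*) denotes the usual linear-Gaussian model of probabilistic PCA, the statements above can be written out as follows. The symbols $\mu$ (mean), $B$ (basis of the principal subspace), and $\sigma^2$ (noise variance) are assumptions introduced here for illustration.

```latex
\begin{aligned}
x &= \mu + B y + \varepsilon, \qquad B \in \mathbb{R}^{D \times d}, \\
y &\sim \mathcal{N}(0, I_d), \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_D), \\
x &\sim \mathcal{N}\!\left(\mu,\; B B^\top + \sigma^2 I_D\right).
\end{aligned}
```

Under this form, one source of ambiguity is visible directly: replacing $B$ by $BR$ for any orthogonal $R \in \mathbb{R}^{d \times d}$ leaves the distribution of $x$ unchanged.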


Determining principal axes by MLE



Compute the log-likelihood for $n$ samples.

The gradient of $L$ leads to stationary points.

Two nontrivial solutions.
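For the standard linear-Gaussian model with covariance $C = BB^\top + \sigma^2 I$, a well-known closed-form maximizer (due to Tipping and Bishop) sets $\sigma^2$ to the average of the $D - d$ smallest eigenvalues of the sample covariance $S$ and builds $B$ from the $d$ leading eigenpairs. The sketch below assumes that model and that $d < D$; the function names and test data are hypothetical.

```python
import numpy as np

def ppca_mle(X, d):
    """Closed-form PPCA maximum-likelihood estimates (requires d < D).
    X: (n_samples, D). Returns mu, B (D x d), sigma2."""
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / X.shape[0]                    # ML sample covariance
    evals, evecs = np.linalg.eigh(S)
    evals, evecs = evals[::-1], evecs[:, ::-1]    # descending order
    sigma2 = evals[d:].mean()                     # average discarded variance
    # The rotation ambiguity is fixed by choosing R = I.
    B = evecs[:, :d] @ np.diag(np.sqrt(evals[:d] - sigma2))
    return mu, B, sigma2

def log_likelihood(X, mu, B, sigma2):
    """L = -(n/2) * (D log(2 pi) + log|C| + tr(C^{-1} S)), with C = B B^T + sigma2 I."""
    n, D = X.shape
    Xc = X - mu
    S = Xc.T @ Xc / n
    C = B @ B.T + sigma2 * np.eye(D)
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * n * (D * np.log(2 * np.pi) + logdet
                       + np.trace(np.linalg.solve(C, S)))

rng = np.random.default_rng(2)
X = (rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5))
     + 0.1 * rng.normal(size=(500, 5)))
mu, B, sigma2 = ppca_mle(X, d=2)
L_hat = log_likelihood(X, mu, B, sigma2)
```

At this solution the stationarity condition $S C^{-1} B = B$ holds, which can be checked numerically.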


Kernel PCA: for nonlinear data


Nonlinear embedding


Example


Question: How do we recover the coefficients?

Compute the null space of the (embedded) data matrix.

The special polynomial embedding is called the Veronese map.
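As a small illustration of recovering coefficients from a null space, the sketch below samples points from two lines through the origin in $\mathbb{R}^2$, applies a degree-2 Veronese map, and reads the polynomial coefficients off the null space of the embedded data matrix. The specific lines and the helper `veronese2` are hypothetical choices for this example, not the example from the slide.

```python
import numpy as np

def veronese2(X):
    """Degree-2 Veronese map for 2-D points: (x1, x2) -> (x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1 ** 2, x1 * x2, x2 ** 2])

t = np.linspace(-1.0, 1.0, 20)
line_a = np.column_stack([t, t])            # points with x2 = x1
line_b = np.column_stack([t, -2.0 * t])     # points with x2 = -2 * x1
X = np.vstack([line_a, line_b])

V = veronese2(X)                            # embedded data, one row per sample
_, s, Vt = np.linalg.svd(V)
coef = Vt[-1]                               # right singular vector for the smallest singular value
coef = 2.0 * coef / coef[0]                 # fix the scale for readability
```

The recovered vector is proportional to $(2, -1, -1)$, the coefficients of $(x_1 - x_2)(2x_1 + x_2) = 2x_1^2 - x_1 x_2 - x_2^2$, which vanishes on both lines.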


Dimensionality Issue in Embedding


Given $D$ and order $n$, what is the dimension of the Veronese map?

Often the dimension blows up with large $D$ or $n$.

Question: Can we find the higher-order nonlinear structures without explicitly calling the embedding function?
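The number of degree-$n$ monomials in $D$ variables is $\binom{n + D - 1}{n}$, which gives the dimension of the Veronese embedding. A short sketch (the function name `veronese_dim` is hypothetical) shows how quickly it grows:

```python
from math import comb

def veronese_dim(D, n):
    """Number of degree-n monomials in D variables: C(n + D - 1, n)."""
    return comb(n + D - 1, n)

dims = {
    (2, 2): veronese_dim(2, 2),      # 3
    (3, 2): veronese_dim(3, 2),      # 6
    (10, 5): veronese_dim(10, 5),    # 2002
    (100, 3): veronese_dim(100, 3),  # 171700
}
```

Even a cubic embedding of 100-dimensional data already has more than 170,000 coordinates, which motivates avoiding the explicit map.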


Nonlinear PCA


Nonlinear PCs



In the case where $M$ is much larger than $n$


Kernel PCA


Computations in NLPCA involve only inner products of the embedded samples, not the samples themselves.

Therefore, the mapping relation can be expressed in the computation of PCA without explicitly calling the embedding function.

The inner product of two embedded samples is called the kernel function.
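A concrete case makes the point: for a suitably scaled degree-2 embedding of 2-D points, the inner product of the embedded samples equals $(x^\top z)^2$, so it can be evaluated without forming the embedding. The functions `embed` and `poly_kernel` below are hypothetical names for this sketch.

```python
import numpy as np

def embed(x):
    """Scaled degree-2 embedding of a 2-D point. The sqrt(2) weight on the
    cross term makes <embed(x), embed(z)> equal to (x . z)^2."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def poly_kernel(x, z):
    """Same quantity computed directly in the input space."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
explicit = float(embed(x) @ embed(z))   # via the embedding
implicit = poly_kernel(x, z)            # via the kernel function
```

Both routes give the same value, but the kernel route costs one inner product in the original space regardless of how large the embedded dimension is.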


Kernel Function


Computing NLPCs via Kernel Matrix
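A minimal sketch of the computation, assuming the standard kernel PCA recipe: center the kernel matrix, take its leading eigenvectors, and scale them by the square roots of the eigenvalues to obtain the coordinates of the training samples along the nonlinear PCs. The function name `kernel_pca` and the test data are hypothetical.

```python
import numpy as np

def kernel_pca(K, d):
    """Coordinates of the n training samples along the first d nonlinear PCs.
    K: (n, n) kernel matrix with K[i, j] = k(x_i, x_j)."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                # centers the embedded samples implicitly
    evals, evecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    evals = evals[::-1][:d]
    evecs = evecs[:, ::-1][:, :d]
    # Coordinate of sample i along PC j is sqrt(evals[j]) * evecs[i, j].
    return evecs * np.sqrt(np.maximum(evals, 0.0))

# Sanity check with the linear kernel, where kernel PCA reduces to ordinary PCA.
rng = np.random.default_rng(3)
X = rng.normal(size=(30, 4)) * np.array([4.0, 2.0, 1.0, 0.5])
Z = kernel_pca(X @ X.T, d=2)
```

With the linear kernel $k(x, z) = x^\top z$, the result matches the ordinary PCA scores up to the sign of each component, which is a convenient way to validate an implementation before switching to a nonlinear kernel.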


Examples of Popular Kernels


Polynomial kernel:



Gaussian kernel (Radial Basis Function):




Intersection kernel:
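The three kernels can be sketched as follows, using their commonly cited forms. The parameterizations (the `degree` and offset `c` of the polynomial kernel, the bandwidth `sigma` of the Gaussian kernel) are assumptions and may differ from the definitions on the original slide.

```python
import numpy as np

def polynomial_kernel(x, z, degree=2, c=0.0):
    """k(x, z) = (x . z + c)^degree"""
    return (np.dot(x, z) + c) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2))"""
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def intersection_kernel(x, z):
    """k(x, z) = sum_i min(x_i, z_i); typically used on nonnegative histograms."""
    return np.sum(np.minimum(x, z))

def kernel_matrix(X, kernel, **params):
    """Kernel matrix K[i, j] = kernel(X[i], X[j]) for samples stored as rows."""
    n = X.shape[0]
    return np.array([[kernel(X[i], X[j], **params) for j in range(n)]
                     for i in range(n)])

H = np.array([[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]])   # toy histograms
K = kernel_matrix(H, gaussian_kernel, sigma=0.5)
```

Any of these matrices can then be used in place of the explicit inner products of embedded samples.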
