Part I: Classifier Performance


Mahesan Niranjan


Department of Computer Science

The University of Sheffield

M.Niranjan@Sheffield.ac.uk

&

Cambridge Bioinformatics Limited

Mahesan.Niranjan@ntlworld.com

BCS, Exeter, July 2004


Relevant Reading


Bishop, Neural Networks for Pattern Recognition

http://www.ncrg.aston.ac.uk/netlab

David Hand, Construction and Assessment of Classification Rules

Lovell et al., CUED/F-INFENG/TR.299

Scott et al., CUED/F-INFENG/TR.323

Reports linked from http://www.dcs.shef.ac.uk/~niranjan



Pattern Recognition Framework


Two Approaches to Pattern Recognition


Probabilistic: explicit modelling of the probabilities encountered in Bayes' formula (written out after this list)

Discriminative: assume a parametric form for the class boundary and optimise it directly

In some specific cases (though often not) the two approaches reduce to the same answer
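For the first approach, the quantity being modelled is the posterior class probability given by Bayes' formula:

    P(C_k \mid x) = \frac{p(x \mid C_k)\, P(C_k)}{\sum_j p(x \mid C_j)\, P(C_j)}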



Pattern Recognition: Simple case



Gaussian distributions: isotropic, equal variances

Optimal classifier: distance to the mean, giving a linear class boundary (see the sketch below)
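A minimal NumPy sketch of this distance-to-mean rule (function and variable names are my own):

    import numpy as np

    def nearest_mean_classify(X, mu0, mu1):
        """Assign each row of X to the class whose mean is closer.

        With isotropic, equal-variance Gaussian classes and equal priors,
        this distance-to-mean rule is the Bayes-optimal classifier, and the
        boundary it induces is linear: the perpendicular bisector of the
        line joining the two means.
        """
        d0 = np.linalg.norm(X - mu0, axis=1)
        d1 = np.linalg.norm(X - mu1, axis=1)
        return (d1 < d0).astype(int)   # 1 if closer to the class-1 mean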


Distance can be misleading

Mahalanobis distance: measure distance scaled by the covariance of the data

The optimal classifier for this case is the Fisher Linear Discriminant
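A short NumPy sketch of both ideas, assuming the two classes share a common covariance Sigma (names are illustrative):

    import numpy as np

    def mahalanobis(x, mu, Sigma):
        """Distance from x to mu, scaled by the covariance Sigma."""
        d = x - mu
        return np.sqrt(d @ np.linalg.solve(Sigma, d))

    def fisher_direction(mu0, mu1, Sigma):
        """Fisher Linear Discriminant direction w = Sigma^{-1} (mu1 - mu0).

        Projecting onto w and thresholding gives the optimal linear
        classifier when both classes are Gaussian with shared covariance.
        """
        return np.linalg.solve(Sigma, mu1 - mu0)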


Support Vector Machines

Maximum Margin Perceptron

[Figure: two linearly separable classes (X and O); the maximum-margin hyperplane separates them with the widest possible margin]


Support Vector Machines

Nonlinear Kernel Functions

[Figure: classes (X and O) that are not linearly separable in the input space; a nonlinear kernel yields a curved class boundary]


Support Vector Machines

Computations


Quadratic Programming




Class boundary defined only by the data that lie close to it: the support vectors

Kernels in the data space equal scalar products in a higher-dimensional feature space (see the sketch below)
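As one concrete modern illustration (not the original demo), scikit-learn's SVC exposes each of these pieces: a QP-based fit, the resulting support vectors, and a choice of kernel:

    import numpy as np
    from sklearn.svm import SVC   # quadratic-programming-based SVM solver

    rng = np.random.default_rng(0)
    # Toy two-class data: one Gaussian blob per class.
    X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
    # Only the points that lie close to the boundary define it:
    print("support vectors:", clf.support_vectors_.shape[0], "of", len(X))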



Support Vector Machines

The Hypes


Strong theoretical basis: Computational Learning Theory, with complexity controlled by the Vapnik-Chervonenkis dimension

Not many parameters to tune

High performance on many practical problems, high-dimensional problems in particular


Support Vector Machines

The Truths


Worst-case bounds from learning theory are not very practical

Several parameters to tune:

What kernel?

Internal workings of the optimiser

Noise in the training data

Performance? Depends on who you ask


SVM: data-driven kernel

Fisher Kernel [Jaakkola & Haussler]

Kernel based on a generative model of all the data, as written out below
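In brief, the Fisher kernel maps each example to the gradient of the generative model's log-likelihood and takes inner products of these score vectors:

    U_x = \nabla_\theta \log P(x \mid \theta)

    K(x_i, x_j) = U_{x_i}^{\top} F^{-1} U_{x_j}, \qquad F = E_x\!\left[ U_x U_x^{\top} \right] \ \text{(the Fisher information matrix)}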








Classifier Performance


Error rates can be misleading

Imbalance in training/test data: 98% of the population healthy, 2% has the disease (see the sketch after this list)

Cost of misclassification can change after the classifier has been designed
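A toy sketch of how the 98/2 imbalance above defeats raw accuracy (all numbers and names are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    y_true = (rng.random(10_000) < 0.02).astype(int)   # 2% diseased

    # A useless classifier that declares everyone healthy...
    y_pred = np.zeros_like(y_true)

    accuracy = (y_pred == y_true).mean()       # ~0.98: looks excellent
    sensitivity = y_pred[y_true == 1].mean()   # 0.0: misses every case
    print(f"accuracy={accuracy:.3f}, sensitivity={sensitivity:.3f}")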


[Figure: one-dimensional classifier scores (x) for adverse and benign outcomes; a threshold on the score defines the class boundary]


True positive rate versus false positive rate: the ROC curve

Area under the ROC curve: neat statistical interpretation. The AUC is the probability that a randomly drawn positive example scores higher than a randomly drawn negative one (the Wilcoxon-Mann-Whitney statistic); the sketch below computes it directly.
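A direct NumPy sketch of that interpretation (function name is mine):

    import numpy as np

    def auc_mann_whitney(scores_pos, scores_neg):
        """AUC as P(score of random positive > score of random negative).

        Brute-force over all positive/negative pairs; ties count 1/2.
        """
        s_pos = np.asarray(scores_pos)[:, None]
        s_neg = np.asarray(scores_neg)[None, :]
        return ((s_pos > s_neg) + 0.5 * (s_pos == s_neg)).mean()

    # Example: well-separated scores give AUC close to 1.
    print(auc_mann_whitney([0.9, 0.8, 0.7], [0.3, 0.2, 0.4]))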


Convex Hull of ROC Curves

[Figure: ROC curves of several classifiers and their convex hull, true positive rate against false positive rate; each point on the hull is the best available operating point for some setting of class priors and misclassification costs]


Yeast Gene Example: MATLAB demo here

Part II: Particle Filters for Tracking and Sequential Problems

Mahesan Niranjan


Department of Computer Science

The University of Sheffield


Overview


Motivation


State Space Model


Kalman Filter and Extensions


Sequential MCMC Methods


Particle Filter & Variants


Motivation


Neural Networks for Learning:


Function Approximation


Statistical Estimation


Dynamical Systems


Parallel Processing


Guarantee Generalisation:


Regularise / control complexity


Cross validate to detect / avoid overfitting


Bootstrap to deal with model / data uncertainty


Many of the above tricks won’t work in a sequential setting


Interesting Applications


Speech Signal Processing


Medical Signals


Monitoring Liver Transplant Patients


Tracking the prices of options contracts in computational finance


Good References


Bar-Shalom and Fortmann: Tracking and Data Association

Jazwinski: Stochastic Processes and Filtering Theory

Arulampalam et al.: “Tutorial on Particle Filters…”, IEEE Transactions on Signal Processing

Arnaud Doucet: Technical Report 310, Cambridge University Engineering Department

Benveniste, A. et al.: Adaptive Algorithms and Stochastic Approximation

Simon Haykin: Adaptive Filters


Matrix Inversion Lemma
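For reference, the lemma in its standard Sherman-Morrison-Woodbury form, which underpins the recursive least squares update below:

    (A + UCV)^{-1} = A^{-1} - A^{-1} U \,(C^{-1} + V A^{-1} U)^{-1} V A^{-1}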


Linear Regression
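For reference, the batch least-squares estimate for the linear model y = X\theta + \epsilon, which the recursive form on the next slide updates one sample at a time:

    \hat{\theta} = (X^{\top} X)^{-1} X^{\top} y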


Recursive Least Squares
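A minimal sketch of the standard RLS recursion, obtained by applying the matrix inversion lemma to (X^T X)^{-1} one sample at a time (the forgetting factor lam is an optional extra of mine):

    import numpy as np

    def rls_update(theta, P, x, y, lam=1.0):
        """One recursive least squares step for a new pair (x, y).

        P is the current inverse "covariance" (X^T X)^{-1}; lam is an
        exponential forgetting factor (lam = 1 recovers batch LS).
        """
        Px = P @ x
        k = Px / (lam + x @ Px)              # gain vector
        theta = theta + k * (y - x @ theta)  # correct by prediction error
        P = (P - np.outer(k, Px)) / lam      # rank-one update via the lemma
        return theta, P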


State Space Model

x_t = f(x_{t-1}, w_t)        (state x_t, process noise w_t)
y_t = h(x_t, v_t)            (observation y_t, measurement noise v_t)


Simple Linear Gaussian Model
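A standard statement of the model, with matrix names A and C chosen here to match the Kalman filter equations that follow:

    x_t = A x_{t-1} + w_t, \qquad w_t \sim N(0, Q)
    y_t = C x_t + v_t, \qquad\;\; v_t \sim N(0, R)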


Kalman Filter

Prediction

Correction
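For the linear Gaussian model above, the two stages take their standard form (K_t is the Kalman gain, defined on the next slide):

    Prediction: \hat{x}_{t|t-1} = A \hat{x}_{t-1}, \qquad P_{t|t-1} = A P_{t-1} A^{\top} + Q
    Correction: \hat{x}_t = \hat{x}_{t|t-1} + K_t \,(y_t - C \hat{x}_{t|t-1}), \qquad P_t = (I - K_t C)\, P_{t|t-1}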


Kalman Filter

Innovation

Kalman Gain
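The innovation e_t = y_t - C x̂_{t|t-1} and the gain K_t = P_{t|t-1} C^T (C P_{t|t-1} C^T + R)^{-1} combine into one filter step; a compact NumPy sketch:

    import numpy as np

    def kalman_step(x, P, y, A, C, Q, R):
        """One Kalman iteration for x_t = A x_{t-1} + w, y_t = C x_t + v."""
        # Prediction
        x_pred = A @ x
        P_pred = A @ P @ A.T + Q
        # Innovation and its covariance
        e = y - C @ x_pred
        S = C @ P_pred @ C.T + R
        # Kalman gain and correction
        K = P_pred @ C.T @ np.linalg.inv(S)
        x_new = x_pred + K @ e
        P_new = (np.eye(len(x)) - K @ C) @ P_pred
        return x_new, P_new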


Bayesian Setting

Prior: p(x_t | y_{1:t-1})

Likelihood: p(y_t | x_t)

Innovation probability: p(y_t | y_{1:t-1})

Run multiple models and switch: Bar-Shalom

Set noise levels to maximum likelihood values: Jazwinski


Extended Kalman Filter

Lee Feldkamp @ Ford: successful training of Recurrent Neural Networks

Taylor series expansion around the operating point: first order or second order

Iterated Extended Kalman Filter
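The first-order expansion that stands in for A and C in the linear filter, with Jacobians evaluated at the current estimate:

    f(x) \approx f(\hat{x}) + F\,(x - \hat{x}), \qquad F = \partial f / \partial x \,\big|_{x = \hat{x}}
    h(x) \approx h(\hat{x}) + H\,(x - \hat{x}), \qquad H = \partial h / \partial x \,\big|_{x = \hat{x}}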


Iterated Extended Kalman Filter

Local linearization of the state and/or observation equations; propagation and update repeated, re-linearizing around the refined estimate


Unscented Kalman Filter

Generate sigma points at time t so that they represent the mean and covariance

Propagate these through the state equations

Recompute the predicted mean and covariance (see the recipe below)
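One common recipe for the sigma points and weights (the original Julier-Uhlmann parameterisation; kappa is a tuning constant and n the state dimension):

    \mathcal{X}_0 = \bar{x}, \qquad W_0 = \kappa / (n + \kappa)
    \mathcal{X}_i = \bar{x} \pm \big(\sqrt{(n+\kappa)\,P}\big)_i, \qquad W_i = 1 / \big(2(n+\kappa)\big), \qquad i = 1, \dots, 2n

    \bar{x}' = \sum_i W_i\, f(\mathcal{X}_i), \qquad P' = \sum_i W_i \big(f(\mathcal{X}_i) - \bar{x}'\big)\big(f(\mathcal{X}_i) - \bar{x}'\big)^{\top} + Q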




Formant Tracking Example

[Figure: source-filter model of speech: an excitation signal passed through a linear filter produces the speech signal]


Formant Tracking Example


Formant Tracking Example (continued)


Grid-based Methods

Discretize the continuous state into “cells” and integrate probabilities over each partition

Fixed partitioning of the state space



Sampling Methods: Bayesian Inference

Parameters θ, with uncertainty over the parameters expressed as a posterior distribution

Inference: integrate over that uncertainty (see below)
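In symbols, a standard statement of the same idea: samples from the posterior turn integrals into averages:

    p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)

    E[f] = \int f(\theta)\, p(\theta \mid D)\, d\theta \approx \frac{1}{N} \sum_{i=1}^{N} f(\theta^{(i)}), \qquad \theta^{(i)} \sim p(\theta \mid D)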


Basic Tool: Composition [Tanner]

To generate samples of p(x) = ∫ p(x | y) p(y) dy: draw y⁽ⁱ⁾ ~ p(y), then draw x⁽ⁱ⁾ ~ p(x | y⁽ⁱ⁾); the x⁽ⁱ⁾ are samples from the marginal p(x)


Importance Sampling
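The standard identity: draw from a tractable proposal q and weight the samples to correct towards the target p:

    E_p[f] = \int f(x)\, \frac{p(x)}{q(x)}\, q(x)\, dx \approx \sum_{i=1}^{N} w^{(i)} f(x^{(i)}), \qquad x^{(i)} \sim q, \quad w^{(i)} \propto \frac{p(x^{(i)})}{q(x^{(i)})}, \quad \sum_i w^{(i)} = 1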


Particle Filters

Prediction: propagate samples through the state equation; the weight of each sample comes from the likelihood of the new observation

Bootstrap Filters (Gordon et al., tracking)

CONDENSATION Algorithm (Isard et al., vision)


Sequential Importance Sampling

Recursive update of weights, known only up to a constant of proportionality
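With proposal q, the standard weight recursion (normalising the weights afterwards absorbs the unknown constant):

    w_t^{(i)} \propto w_{t-1}^{(i)} \, \frac{p(y_t \mid x_t^{(i)})\; p(x_t^{(i)} \mid x_{t-1}^{(i)})}{q(x_t^{(i)} \mid x_{t-1}^{(i)}, y_t)}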


Degeneracy in SIS

Variance of the weights monotonically increases: all except one decay to zero very rapidly

Effective number of particles: N_eff = 1 / Σ_i (w⁽ⁱ⁾)²

Resample if N_eff falls below a chosen threshold


Sampling, Importance Re-Sampling (SIR)

Multiply samples of high weight; kill off samples in parts of the space that are not relevant

Danger: “Particle Collapse”, where resampling leaves many copies of only a few distinct particles (see the sketch below)
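A compact bootstrap-filter sketch putting the SIS weight update and resampling together (the model callbacks and the resampling threshold are illustrative assumptions of mine):

    import numpy as np

    def bootstrap_filter_step(particles, weights, y, transition, likelihood,
                              rng, n_thresh=None):
        """One SIR / bootstrap particle filter step.

        transition(particles, rng) -> new particles (samples the state equation)
        likelihood(y, particles)   -> p(y | particle) for each particle
        """
        n = len(particles)
        particles = transition(particles, rng)        # prediction
        weights = weights * likelihood(y, particles)  # SIS weight update
        weights = weights / weights.sum()             # normalise

        n_eff = 1.0 / np.sum(weights ** 2)            # effective sample size
        if n_eff < (n_thresh or n / 2):               # degeneracy: resample
            idx = rng.choice(n, size=n, p=weights)    # multiply high-weight samples
            particles, weights = particles[idx], np.full(n, 1.0 / n)
        return particles, weights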


Marginalizing Part of the State Space

Suppose it is possible to analytically integrate with respect to part of the state space

Sample with respect to one part of the state; integrate analytically with respect to the rest

Rao-Blackwellisation: the analytic integration lowers the variance of the estimates


Variations to the Basic Algorithm


Integrate out part of the state space: Rao-Blackwellized particle filters (e.g. a multi-layer perceptron with a linear output layer)

Variational Importance Sampling (Lawrence et al.)

Auxiliary Particle Filters (Pitt et al.)

Regularized Particle Filters

Likelihood Particle Filters


Regularised PF: basic idea

Samples → kernel density estimate → resample from the smoothed density → propagate in time


Conclusion / Summary


Collection of powerful algorithms

New and interesting signal processing problems