Introduction to Machine Learning


Multivariate Methods


Name: 李政軒


Multivariate Data

- Multiple measurements.
- d inputs/features/attributes: a d-variate sample.
- N instances/observations/examples.

Multivariate Parameters

Parameter Estimation
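
The formulas on these two slides were embedded as images in the original; the block below restates the standard forms they refer to (mean vector, covariance matrix, correlation, and the sample estimates m and S). This is a reconstruction, not a copy of the slides.

```latex
% Multivariate parameters
\boldsymbol{\mu} = E[\mathbf{x}], \qquad
\boldsymbol{\Sigma} = E\left[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\right], \qquad
\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}

% Sample (maximum-likelihood) estimates from N instances x^t
\mathbf{m} = \frac{1}{N}\sum_{t=1}^{N} \mathbf{x}^t, \qquad
\mathbf{S} = \frac{1}{N}\sum_{t=1}^{N} (\mathbf{x}^t-\mathbf{m})(\mathbf{x}^t-\mathbf{m})^T
```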

Estimation of Missing Values

What to do if certain instances have missing attributes:

- Ignore those instances: not a good idea if the sample is small.
- Use 'missing' as an attribute value: it may carry information.
- Imputation: fill in the missing value.
  - Mean imputation: substitute the mean of the available values of that attribute (its most likely value).
  - Imputation by regression: predict the missing value from the other attributes (see the sketch after this list).
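
A minimal NumPy sketch of both imputation strategies; the toy array X and the least-squares fit below are assumptions made for this example, not part of the slides.

```python
import numpy as np

# Toy data: rows are instances, columns are attributes; NaN marks missing.
X = np.array([[1.0, 2.0],
              [2.0, np.nan],
              [3.0, 6.1],
              [4.0, 8.2]])

# Mean imputation: replace each missing entry with its column mean.
col_means = np.nanmean(X, axis=0)
X_mean = np.where(np.isnan(X), col_means, X)

# Imputation by regression: predict column 1 from column 0
# using the complete rows, then fill in the missing entries.
complete = ~np.isnan(X[:, 1])
A = np.c_[np.ones(complete.sum()), X[complete, 0]]   # design matrix [1, x0]
w, *_ = np.linalg.lstsq(A, X[complete, 1], rcond=None)
X_reg = X.copy()
missing = np.isnan(X[:, 1])
X_reg[missing, 1] = w[0] + w[1] * X[missing, 0]
```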

Multivariate Normal Distribution

- Mahalanobis distance: $(\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})$ measures the distance from x to μ in terms of Σ; it normalizes for differences in variances and for correlations.
- Bivariate case: d = 2.
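
A minimal NumPy sketch of this quadratic form; the mean and covariance values below are made up for illustration.

```python
import numpy as np

def mahalanobis_sq(x, mu, Sigma):
    """Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu)."""
    d = x - mu
    return float(d @ np.linalg.solve(Sigma, d))

mu = np.array([0.0, 0.0])
Sigma = np.array([[4.0, 1.2],     # unequal variances and
                  [1.2, 1.0]])    # nonzero correlation
print(mahalanobis_sq(np.array([2.0, 1.0]), mu, Sigma))
```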


Bivariate Normal

- [Figure: probability contour plot of the bivariate normal distribution. Its center is given by the mean; its shape and orientation depend on the covariance matrix.]
- If the x_i are independent, the off-diagonals of Σ are 0 and the Mahalanobis distance reduces to a Euclidean distance weighted by 1/σ_i:
  $\sum_{i=1}^{d} \left( \frac{x_i - \mu_i}{\sigma_i} \right)^2$
- If the variances are also equal, it reduces to the (squared) Euclidean distance.
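
For reference, the bivariate density behind these contours, in its standard form (the density itself was not legible in the extracted slides):

```latex
p(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}}
  \exp\left[ -\frac{1}{2(1-\rho^2)} \left( z_1^2 - 2\rho z_1 z_2 + z_2^2 \right) \right],
\qquad z_i = \frac{x_i - \mu_i}{\sigma_i}
```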


Independent Inputs: Naive Bayes

Parametric Classification

- If $p(\mathbf{x} \mid C_i) \sim \mathcal{N}(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i)$:
  $p(\mathbf{x} \mid C_i) = \frac{1}{(2\pi)^{d/2} |\boldsymbol{\Sigma}_i|^{1/2}} \exp\left[-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^T \boldsymbol{\Sigma}_i^{-1} (\mathbf{x}-\boldsymbol{\mu}_i)\right]$
- Discriminant functions:
  $g_i(\mathbf{x}) = \log p(\mathbf{x} \mid C_i) + \log P(C_i)$

Estimation of Parameters
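
The estimates on this slide were images in the original; the standard maximum-likelihood forms, with $r_i^t$ an indicator that instance t belongs to class C_i (notation assumed here), are:

```latex
\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad
\mathbf{m}_i = \frac{\sum_t r_i^t \mathbf{x}^t}{\sum_t r_i^t}, \qquad
\mathbf{S}_i = \frac{\sum_t r_i^t (\mathbf{x}^t-\mathbf{m}_i)(\mathbf{x}^t-\mathbf{m}_i)^T}{\sum_t r_i^t}
```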

Different S_i

- Estimating a separate mean m_i and covariance S_i for each class gives a quadratic discriminant (a sketch follows the figure note):
  $g_i(\mathbf{x}) = -\tfrac{1}{2}\log|\mathbf{S}_i| - \tfrac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T \mathbf{S}_i^{-1}(\mathbf{x}-\mathbf{m}_i) + \log \hat{P}(C_i)$
- [Figure: the class likelihoods, the posterior for C_1, and the discriminant, i.e., the curve where P(C_1 | x) = 0.5.]
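
A minimal NumPy sketch of this quadratic discriminant; the generated toy data and the equal priors are assumptions for the example.

```python
import numpy as np

def fit_class(X):
    """Per-class maximum-likelihood estimates m_i and S_i."""
    m = X.mean(axis=0)
    S = (X - m).T @ (X - m) / len(X)
    return m, S

def g(x, m, S, log_prior):
    """Quadratic discriminant g_i(x) from the formula above."""
    d = x - m
    return (-0.5 * np.log(np.linalg.det(S))
            - 0.5 * d @ np.linalg.solve(S, d)
            + log_prior)

rng = np.random.default_rng(0)
X1 = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=50)
X2 = rng.multivariate_normal([2, 2], [[0.5, 0.0], [0.0, 2.0]], size=50)
params = [(*fit_class(X), np.log(0.5)) for X in (X1, X2)]  # equal priors assumed

x = np.array([1.0, 1.0])
print("class:", 1 + int(np.argmax([g(x, m, S, lp) for m, S, lp in params])))
```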


Common Covariance Matrix S

- Covariances may be arbitrary but shared by both classes; the shared common sample covariance is $\mathbf{S} = \sum_i \hat{P}(C_i)\,\mathbf{S}_i$.
- The discriminant reduces to
  $g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T \mathbf{S}^{-1}(\mathbf{x}-\mathbf{m}_i) + \log \hat{P}(C_i)$
  which is a linear discriminant: $g_i(\mathbf{x}) = \mathbf{w}_i^T \mathbf{x} + w_{i0}$ with $\mathbf{w}_i = \mathbf{S}^{-1}\mathbf{m}_i$ and $w_{i0} = -\tfrac{1}{2}\mathbf{m}_i^T \mathbf{S}^{-1}\mathbf{m}_i + \log \hat{P}(C_i)$.


Diagonal S

- When the x_j, j = 1,...,d, are independent, Σ is diagonal and
  $p(\mathbf{x} \mid C_i) = \prod_j p(x_j \mid C_i)$  (the naive Bayes assumption).
- The discriminant becomes
  $g_i(\mathbf{x}) = -\tfrac{1}{2} \sum_{j=1}^{d} \left( \frac{x_j - m_{ij}}{s_j} \right)^2 + \log \hat{P}(C_i)$
- Classify based on weighted Euclidean distance (in s_j units) to the nearest mean; the variances may differ across dimensions (a sketch follows).
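
A minimal sketch of this diagonal-S case (Gaussian naive Bayes with a shared diagonal covariance); the helper names and the pooling of per-class residuals are choices made for this example.

```python
import numpy as np

def fit_diagonal(classes):
    """Per-class means m_i plus shared per-dimension std devs s_j,
    estimated from the residuals of all classes pooled together."""
    means = [X.mean(axis=0) for X in classes]
    pooled = np.vstack([X - m for X, m in zip(classes, means)])
    s = pooled.std(axis=0)                          # diagonal of shared S
    priors = np.array([len(X) for X in classes], dtype=float)
    return np.array(means), s, np.log(priors / priors.sum())

def predict(x, means, s, log_priors):
    """g_i(x) = -0.5 * sum_j ((x_j - m_ij) / s_j)^2 + log P(C_i)."""
    g = -0.5 * (((x - means) / s) ** 2).sum(axis=1) + log_priors
    return int(np.argmax(g))
```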


Diagonal S, Equal Variances

- Nearest mean classifier: classify based on Euclidean distance to the nearest mean,
  $g_i(\mathbf{x}) = -\frac{\|\mathbf{x}-\mathbf{m}_i\|^2}{2 s^2} + \log \hat{P}(C_i)$
- Each mean can be considered a prototype or template, so this is template matching (a sketch follows).
- [Figure: all classes have equal, diagonal covariance matrices with equal variances on both dimensions.]
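
The corresponding classifier is a one-liner; this sketch assumes equal priors, so the log-prior term can be dropped.

```python
import numpy as np

def nearest_mean(x, means):
    """Template matching: return the index of the closest class mean.
    Equivalent to maximizing g_i(x) above when the priors are equal."""
    return int(np.argmin(((means - x) ** 2).sum(axis=1)))
```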







Model Selection

- As we increase complexity (a less restricted S), bias decreases and variance increases.
- Assume simple models (allow some bias) to control variance: regularization.
- [Figure: different cases of the covariance matrices fitted to the same data lead to different boundaries.]
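
To make the complexity ordering concrete, these are the standard parameter counts for the covariance assumptions discussed above (K classes, d dimensions); the table is a summary added here, not from the slides:

Assumption                   Covariance matrix        Number of parameters
Shared, hyperspheric         S_i = S = s^2 I          1
Shared, axis-aligned         S_i = S, off-diag = 0    d
Shared, hyperellipsoidal     S_i = S                  d(d+1)/2
Different, hyperellipsoidal  S_i                      K * d(d+1)/2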


Discrete Features

- Binary features: $p_{ij} \equiv p(x_j = 1 \mid C_i)$.
- If the x_j are independent (naive Bayes'):
  $p(\mathbf{x} \mid C_i) = \prod_j p_{ij}^{x_j} (1-p_{ij})^{1-x_j}$
- The discriminant is linear (a sketch follows):
  $g_i(\mathbf{x}) = \sum_j \left[ x_j \log p_{ij} + (1-x_j)\log(1-p_{ij}) \right] + \log P(C_i)$
- Estimated parameters: $\hat{p}_{ij} = \frac{\sum_t x_j^t r_i^t}{\sum_t r_i^t}$
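
A minimal sketch of fitting and applying this linear discriminant for binary features; the one-hot indicator matrix r and the absence of smoothing mirror the slide's estimator (in practice one would add Laplace smoothing to avoid log 0).

```python
import numpy as np

def fit_bernoulli_nb(X, r):
    """MLE of p_ij = p(x_j = 1 | C_i) from binary data X (N x d) and
    one-hot class indicators r (N x K): p_ij = sum_t x_j^t r_i^t / sum_t r_i^t."""
    counts = r.sum(axis=0)                  # instances per class
    p_hat = (r.T @ X) / counts[:, None]     # K x d matrix of p_ij
    return p_hat, np.log(counts / counts.sum())

def g(x, p_hat, log_priors):
    """Linear discriminant: sum_j [x_j log p_ij + (1-x_j) log(1-p_ij)] + log P(C_i)."""
    return (x * np.log(p_hat) + (1 - x) * np.log(1 - p_hat)).sum(axis=1) + log_priors
```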


Discrete Features

- Multinomial (1-of-n_j) features: $x_j \in \{v_1, v_2, \ldots, v_{n_j}\}$, with $p_{ijk} \equiv p(x_j = v_k \mid C_i)$.
- If the x_j are independent:
  $p(\mathbf{x} \mid C_i) = \prod_j \prod_k p_{ijk}^{z_{jk}}$, where $z_{jk} = 1$ if $x_j = v_k$ and 0 otherwise, so
  $g_i(\mathbf{x}) = \sum_j \sum_k z_{jk} \log p_{ijk} + \log P(C_i)$
- Estimated parameters: $\hat{p}_{ijk} = \frac{\sum_t z_{jk}^t r_i^t}{\sum_t r_i^t}$



Multivariate Regression

- Multivariate linear model:
  $r^t = g(\mathbf{x}^t \mid w_0, w_1, \ldots, w_d) + \varepsilon = w_0 + w_1 x_1^t + w_2 x_2^t + \cdots + w_d x_d^t + \varepsilon$
- Multivariate polynomial model: define new higher-order variables,
  $z_1 = x_1,\quad z_2 = x_2,\quad z_3 = x_1^2,\quad z_4 = x_2^2,\quad z_5 = x_1 x_2$
  and use the linear model in this new z space (a sketch follows).
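
A minimal NumPy sketch of the z-space trick: expand (x_1, x_2) into the five z variables above and solve ordinary least squares in that space. The toy target function and noise level are assumptions made for the example.

```python
import numpy as np

def poly2_features(X):
    """Map (x1, x2) -> (x1, x2, x1^2, x2^2, x1*x2), the z space above."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.c_[x1, x2, x1**2, x2**2, x1 * x2]

# Toy data (an assumption for the example): r = 1 + 2*x1 - x2^2 + noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
r = 1 + 2 * X[:, 0] - X[:, 1] ** 2 + rng.normal(0, 0.1, 100)

# Linear least squares in z space (bias column prepended for w_0)
Z = np.c_[np.ones(len(X)), poly2_features(X)]
w, *_ = np.linalg.lstsq(Z, r, rcond=None)
print(np.round(w, 2))   # approximately [1, 2, 0, 0, -1, 0]
```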