# Brian D. - Informatics

AI and Robotics

Oct 24, 2013

Mutual Information

Brian Dils

I590

ALife/AI

02.28.05


What is Mutual Information?

A core concept in probability and information theory

MI is concerned with quantifying the dependence between two variables, i.e., how far they are from being independent


What is Mutual Information?

MI measures the amount of information in variable x that is shared with variable y

MI quantifies the distance (the Kullback-Leibler divergence) between the joint distribution of x and y and the product of their marginal distributions
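The definition above can be made concrete with a short sketch for discrete variables (the toy distributions and names here are illustrative, not from the slides): MI is the expected log-ratio between the joint probability and the product of the marginals.

```python
from math import log2

def mutual_information(joint):
    """MI in bits of a discrete joint distribution given as {(x, y): p}."""
    # Marginal distributions of x and y
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # I(X;Y) = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Perfectly correlated fair bits: x always equals y
dependent = {(0, 0): 0.5, (1, 1): 0.5}
# Independent fair bits: every pair equally likely
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

print(mutual_information(dependent))    # 1.0 bit: x fully determines y
print(mutual_information(independent))  # 0.0 bits: nothing is shared
```

The two extreme cases shown here correspond to the two scenarios discussed on the following slides.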


When is MI important?

Suppose we know y. If x contains no shared information with y, then the variables are totally independent

Mutual information: 0

The entropy of x may be very high, but x is not important for predicting y, since it shares no information with y


When is MI important?

Again we know y, but this time all the information conveyed in x is also conveyed in y

Mutual information: maximal, equal to the entropy of x

The conditional entropy of x given y is zero: knowing y leaves no uncertainty about x

x is not important to study on its own, because we could simply study y


When is MI important?

MI is important (and powerful) when two variables are not independent and are not identical in the information they convey


Why Apply MI?

If mutual information is maximized (i.e., dependencies are increased), conditional entropy can be minimized

Reducing conditional entropy makes the behavior of random variables more predictable, because their values are more dependent on one another
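The link between MI and conditional entropy is the identity I(X;Y) = H(X) - H(X|Y): for fixed H(X), raising MI lowers the conditional entropy one for one. A small sketch with a made-up noisy-channel distribution (all numbers illustrative) checks the identity numerically:

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: p}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def conditional_entropy_x_given_y(joint):
    """H(X|Y) = sum over y of p(y) * H(X | Y = y), joint given as {(x, y): p}."""
    py = {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p
    h = 0.0
    for y0, p_y0 in py.items():
        cond = {x: p / p_y0 for (x, y), p in joint.items() if y == y0}
        h += p_y0 * entropy(cond)
    return h

# A noisy channel: y copies x 80% of the time
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {0: 0.5, 1: 0.5}  # marginal of x

h_x = entropy(px)
h_x_given_y = conditional_entropy_x_given_y(joint)
mi = h_x - h_x_given_y  # I(X;Y) = H(X) - H(X|Y)
print(h_x, h_x_given_y, mi)
```

With a perfectly dependent channel H(X|Y) would drop to 0 and MI would rise to H(X); with an independent one MI would fall to 0, matching the two earlier slides.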


MI Applications

Discriminative training procedures for hidden Markov models have been proposed based on the maximum mutual information (MMI) criterion

Hidden parameters are predicted from known observations

Applicable to speech recognition, character recognition, and natural language processing
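A heavily simplified sketch of the MMI idea (not the slide's HMM setup; the categorical "word models" and all probabilities here are invented stand-ins): the objective rewards the likelihood of the correct label relative to the total likelihood over all competing labels, so maximizing it sharpens discrimination rather than just fit.

```python
from math import log, prod

# Two toy "word" models with categorical emission probabilities over symbols.
models = {
    "yes": {"a": 0.7, "b": 0.2, "c": 0.1},
    "no":  {"a": 0.1, "b": 0.2, "c": 0.7},
}
priors = {"yes": 0.5, "no": 0.5}

def likelihood(obs, word):
    """p(obs | word) for an i.i.d. categorical emission model."""
    return prod(models[word][o] for o in obs)

def mmi_objective(obs, correct):
    """log [ p(obs|correct) p(correct) / sum over w of p(obs|w) p(w) ]"""
    num = likelihood(obs, correct) * priors[correct]
    den = sum(likelihood(obs, w) * priors[w] for w in models)
    return log(num / den)

obs = ["a", "a", "b"]
print(mmi_objective(obs, "yes"))  # near 0 when the correct model dominates
```

Real MMI training does this over HMM state sequences and adjusts the model parameters to raise this objective; the ratio structure is the same.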


MI Applications

Mutual information is often used as a significance function for the computation of collocations in corpus linguistics

Collocations are essential to coherent speech

Identifying them is easy for humans, but hard for artificial systems

MI has been shown to improve such word connections in AI systems
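For collocations the usual significance score is pointwise mutual information (PMI) of a single word pair. A minimal sketch with invented corpus counts (the words and numbers are illustrative, not from a real corpus):

```python
from math import log2

# Toy corpus counts (illustrative numbers only)
total_tokens = 1_000_000
count_w1 = 500        # occurrences of "strong"
count_w2 = 300        # occurrences of "tea"
count_bigram = 40     # occurrences of the bigram "strong tea"

# PMI(w1, w2) = log2( p(w1, w2) / (p(w1) * p(w2)) )
p_w1 = count_w1 / total_tokens
p_w2 = count_w2 / total_tokens
p_pair = count_bigram / total_tokens
pmi = log2(p_pair / (p_w1 * p_w2))

print(f"PMI = {pmi:.2f} bits")  # large positive PMI suggests a collocation
```

A pair that co-occurs no more often than chance would score near 0, and one that avoids co-occurring would score negative, which is why PMI works as a ranking function for candidate collocations.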


MI Applications

Mutual information is used in medical imaging for image registration

Given a reference image (for example, a brain scan) and a second image that needs to be put into the same coordinate system as the reference, the second image is deformed until the mutual information between it and the reference image is maximized
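The registration loop can be sketched in one dimension (a toy stand-in for image deformation; the signals and the `shifted` helper are invented for illustration): try candidate alignments and keep the one that maximizes MI between the overlapping intensity values.

```python
from math import log2
from collections import Counter

def mutual_information(a, b):
    """MI in bits between two equal-length sequences of discrete values."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    return sum((c / n) * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in joint.items())

# Toy 1-D "registration": find the shift that best aligns two signals.
reference = [0, 0, 1, 2, 3, 3, 2, 1, 0, 0, 0, 0]
moving    = [0, 0, 0, 0, 1, 2, 3, 3, 2, 1, 0, 0]  # reference shifted by 2

def shifted(sig, s):
    """Shift left by s positions, padding with 0 (hypothetical helper)."""
    return sig[s:] + [0] * s

best_shift = max(range(5),
                 key=lambda s: mutual_information(reference, shifted(moving, s)))
print(best_shift)  # the MI-maximizing shift recovers the true offset of 2
```

Because MI only needs the joint histogram of intensities, not matching intensity scales, the same criterion works across imaging modalities (e.g. registering MRI to CT), which is why it became standard in medical image registration.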


MI Applications

Mutual information has been used as a criterion for feature selection and feature transformations in machine learning and agent-based learning

Using MI criteria, it was found that the more input variables are available, the lower the conditional entropy becomes

MI-based criteria can effectively select features AND roughly estimate optimal feature subsets, two classic problems in feature selection
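The simplest MI filter in this spirit (a toy sketch, not the procedure of the cited papers; the dataset is invented) ranks each feature by its mutual information with the class label and keeps the top scorers:

```python
from math import log2
from collections import Counter

def mutual_information(xs, ys):
    """MI in bits between two equal-length lists of discrete values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

# Toy dataset: the label is an exact copy of feature "a"; "b" is unrelated.
label = [0, 1, 0, 1, 0, 1, 0, 1]
features = {
    "a": [0, 1, 0, 1, 0, 1, 0, 1],  # perfectly informative about the label
    "b": [0, 0, 1, 1, 0, 0, 1, 1],  # independent of the label
}

# Rank features by MI with the label, as a filter-style selector would.
ranked = sorted(features,
                key=lambda f: mutual_information(features[f], label),
                reverse=True)
print(ranked)  # 'a' ranks first
```

Methods like Battiti's go further by penalizing candidate features for redundancy with features already selected, so the subset as a whole, not each feature alone, carries high MI with the label.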


References

Huang, D., & Chow, T.W.S. (2003). Searching optimal feature subset using mutual information. Proceedings of the 2003 International Symposium on Artificial Neural Networks (pp. 161-166). Bruges, Belgium.

Battiti, R. (1994). Using mutual information for selecting features in supervised neural net learning. IEEE Transactions on Neural Networks, 5(4), 537-550.

Bonnlander, B., & Weigend, A.S. (1994). Selecting input variables using mutual information and nonparametric density estimations. Proceedings of the 1994 International Symposium on Artificial Neural Networks (pp. 42-50). Tainan, Taiwan.

Wikipedia entries on "Mutual Information", "Probability Theory", and "Information Theory"