# Presentation

AI and Robotics

Nov 7, 2013

An introduction to Bayesian networks

Stochastic Processes Course

Hossein Amirkhani

Spring 2011

Outline

Introduction, Bayesian Networks, Probabilistic Graphical Models, Conditional Independence, I-equivalence.

Introduction

Our goal is to represent a joint distribution $P$ over some set of random variables $\mathcal{X} = \{X_1, \ldots, X_n\}$.

Even in the simplest case where these variables are binary-valued, a joint distribution requires the specification of $2^n - 1$ numbers.

The explicit representation of the joint distribution is unmanageable from every perspective: computationally, cognitively, and statistically.
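To make the growth concrete, the following minimal sketch counts the independent numbers in an explicit joint table over binary variables. The function name is hypothetical and chosen only for this illustration.

```python
def joint_table_parameters(n: int) -> int:
    """Independent parameters in an explicit joint table over n binary variables."""
    # There are 2**n table entries; one is determined by the others
    # because the probabilities must sum to 1.
    return 2 ** n - 1


for n in (5, 10, 20, 30):
    print(n, joint_table_parameters(n))
```

Already at 30 binary variables the table needs more than a billion entries, which is the sense in which the explicit representation does not scale.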

Bayesian Networks

Bayesian networks exploit independence properties of the distribution in order to allow a compact and natural representation.

They are a specific type of probabilistic graphical model.

BNs are directed acyclic graphs (DAGs).
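As a rough sketch of what "compact" means here, a network can be stored as a map from each node to its parents. The example below assumes binary variables and uses a hypothetical three-node network; the helper names are illustrative only. It checks acyclicity with the standard-library `graphlib` module (Python 3.9+) and counts the conditional-probability parameters.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(parents):
    """Return True if the node -> parents map contains no directed cycle."""
    try:
        TopologicalSorter(parents).prepare()  # raises CycleError on a cycle
        return True
    except CycleError:
        return False


def cpd_parameters(parents):
    """Parameters needed when every variable is binary.

    A node with k parents needs one number per parent configuration: 2**k.
    """
    return sum(2 ** len(ps) for ps in parents.values())


# Hypothetical network: Smoking influences LungCancer and YellowTeeth.
network = {"Smoking": [], "LungCancer": ["Smoking"], "YellowTeeth": ["Smoking"]}

print(is_dag(network), cpd_parameters(network))
```

For this network the factored form needs 5 numbers, against 7 for the explicit joint table over three binary variables; the gap widens quickly as the number of variables grows while the number of parents per node stays small.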

Probabilistic Graphical Models

Nodes are the random variables in our domain.

Edges correspond, intuitively, to direct influence of one node on another.

Figure: three kinds of graphical model (Factor Graph, Markov Random Field, Bayesian Network).

Graphs are an intuitive way of representing and visualising the relationships between many variables.

A graph allows us to abstract out the conditional independence relationships between the variables from the details of their parametric forms.

Thus we can answer questions like: “Is A dependent on B given that we know the value of C?” just by looking at the graph.

Graphical models allow us to define general algorithms that implement probabilistic inference efficiently.

Graphical models = statistics × graph theory × computer science.


Conditional Independence: Example 1

Tail-to-tail at c.

Example: Smoking → Lung Cancer, Smoking → Yellow Teeth.
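The tail-to-tail case can be checked numerically. The sketch below builds a joint distribution as p(c) p(a|c) p(b|c), with c standing for Smoking, a for Lung Cancer and b for Yellow Teeth. All probabilities are hypothetical values chosen only for illustration, and the helper names are not from the slides.

```python
from itertools import product

# Hypothetical numbers, for illustration only.
p_c = {0: 0.7, 1: 0.3}           # p(c): c = 1 means "smoker"
p_a_given_c = {0: 0.05, 1: 0.4}  # p(a = 1 | c)
p_b_given_c = {0: 0.1, 1: 0.8}   # p(b = 1 | c)


def bern(p1, value):
    """Probability of `value` for a binary variable with p(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Joint distribution implied by the tail-to-tail graph a <- c -> b.
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of an assignment, e.g. prob(a=1, c=0)."""
    names = ("a", "b", "c")
    return sum(
        p for key, p in joint.items()
        if all(key[names.index(k)] == v for k, v in fixed.items())
    )


# Without observing c, a and b are dependent: p(a, b) != p(a) p(b).
marginal_gap = abs(prob(a=1, b=1) - prob(a=1) * prob(b=1))

# Once c is observed, they are independent: p(a, b | c) = p(a | c) p(b | c).
conditional_gap = max(
    abs(prob(a=1, b=1, c=c) / prob(c=c)
        - (prob(a=1, c=c) / prob(c=c)) * (prob(b=1, c=c) / prob(c=c)))
    for c in (0, 1)
)

print(marginal_gap, conditional_gap)
```

With these numbers the marginal gap is clearly non-zero, while the conditional gap vanishes up to floating-point error: observing the common parent blocks the path.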

Conditional Independence: Example 2

Head-to-tail at c.

Example: Type of Car → Speed → Amount of Speeding Fine.
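For the head-to-tail case, the independence can be derived directly from the factorization. The short derivation below assumes the labelling a = Type of Car, c = Speed, b = Amount of Speeding Fine; the letters a and b are an assumption made here for illustration, since only the node c is named on the slide.

```latex
\begin{align}
p(a, b, c) &= p(a)\, p(c \mid a)\, p(b \mid c) \\
p(a, b \mid c) &= \frac{p(a)\, p(c \mid a)\, p(b \mid c)}{p(c)}
                = p(a \mid c)\, p(b \mid c) \\
p(a, b) &= p(a) \sum_{c} p(c \mid a)\, p(b \mid c) = p(a)\, p(b \mid a)
\end{align}
```

The second line uses Bayes' rule, $p(a)\,p(c \mid a)/p(c) = p(a \mid c)$, and shows that a and b are independent once c is observed. The third line shows that without observing c the joint does not, in general, factorize into $p(a)\,p(b)$.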

Conditional Independence: Example 3

Head-to-head at c (v-structure).

Example: Ability of Team A → Outcome of A vs. B Game ← Ability of Team B.
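The head-to-head case behaves the opposite way to the previous two: the parents are independent until the common child is observed. The sketch below illustrates this with a = "team A is strong", b = "team B is strong" and c = "A wins". All probabilities are hypothetical, and the helper names are for illustration only.

```python
from itertools import product

# Hypothetical numbers, for illustration only.
p_a = 0.5  # p(a = 1): team A is strong
p_b = 0.5  # p(b = 1): team B is strong
# p(c = 1 | a, b): probability that team A wins the game.
p_c_given_ab = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.5}


def bern(p1, value):
    """Probability of `value` for a binary variable with p(1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Joint distribution implied by the v-structure a -> c <- b.
joint = {
    (a, b, c): bern(p_a, a) * bern(p_b, b) * bern(p_c_given_ab[(a, b)], c)
    for a, b, c in product((0, 1), repeat=3)
}


def prob(**fixed):
    """Marginal probability of an assignment, e.g. prob(a=1, c=1)."""
    names = ("a", "b", "c")
    return sum(
        p for key, p in joint.items()
        if all(key[names.index(k)] == v for k, v in fixed.items())
    )


# With the outcome unobserved, the abilities are independent.
marginal_gap = abs(prob(a=1, b=1) - prob(a=1) * prob(b=1))

# After observing that A won, learning about B changes the belief about A.
p_a_given_c = prob(a=1, c=1) / prob(c=1)
p_a_given_bc = prob(a=1, b=1, c=1) / prob(b=1, c=1)

print(marginal_gap, p_a_given_c, p_a_given_bc)
```

Here the belief that A is strong rises further once we also learn that B is strong and A still won, so the two abilities are dependent given the outcome.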

D-separation

$A$, $B$, and $C$ are non-intersecting subsets of nodes in a directed graph.

A path from $A$ to $B$ is blocked if it contains a node such that either

a) the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set $C$, or

b) the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, is in the set $C$.

If all paths from $A$ to $B$ are blocked, $A$ is said to be d-separated from $B$ by $C$.

If $A$ is d-separated from $B$ by $C$, the joint distribution over all variables in the graph satisfies $A \perp B \mid C$.
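A d-separation query can be answered mechanically. The sketch below does not enumerate paths as in the definition above; it uses the equivalent moralized ancestral graph test instead: keep only the nodes in A, B, C and their ancestors, connect ("marry") parents that share a child, drop edge directions, and check whether C separates A from B in the resulting undirected graph. The function names and the node → parents representation are assumptions made for this illustration.

```python
from itertools import combinations


def ancestors_of(nodes, parents):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, A, B, C):
    """True if node sets A and B are d-separated by C in the DAG `parents`."""
    A, B, C = set(A), set(B), set(C)
    keep = ancestors_of(A | B | C, parents)

    # Moralize the ancestral subgraph: link each child to its parents,
    # and link every pair of parents that share a child.
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)

    # Undirected reachability from A that never enters C.
    stack, seen = list(A), set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v in B:
            return False
        stack.extend(n for n in nbrs[v] if n not in C and n not in seen)
    return True


# The three example graphs from the earlier slides (node names abbreviated).
tail_to_tail = {"LungCancer": ["Smoking"], "YellowTeeth": ["Smoking"]}
head_to_tail = {"Speed": ["Car"], "Fine": ["Speed"]}
head_to_head = {"Outcome": ["AbilityA", "AbilityB"]}

print(d_separated(tail_to_tail, {"LungCancer"}, {"YellowTeeth"}, {"Smoking"}))
print(d_separated(head_to_tail, {"Car"}, {"Fine"}, {"Speed"}))
print(d_separated(head_to_head, {"AbilityA"}, {"AbilityB"}, {"Outcome"}))
```

The first two queries report separation when the middle node is observed, and the third reports that observing the common child connects the two parents, matching the three examples.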

I-equivalence

Let $P$ be a distribution over $\mathcal{X}$. We define $I(P)$ to be the set of independence assertions that hold in $P$.

Two graph structures $K_1$ and $K_2$ over $\mathcal{X}$ are I-equivalent if $I(K_1) = I(K_2)$.

The set of all graphs over $\mathcal{X}$ is partitioned into a set of mutually exclusive and exhaustive I-equivalence classes.

The skeleton of a Bayesian network

The skeleton of a Bayesian network graph $G$ over $\mathcal{X}$ is an undirected graph over $\mathcal{X}$ that contains an edge $\{X, Y\}$ for every edge $(X, Y)$ in $G$.
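In code, taking the skeleton amounts to forgetting edge direction. A minimal sketch, assuming a graph is given as a collection of directed `(parent, child)` pairs (a representation chosen here for illustration):

```python
def skeleton(directed_edges):
    """Undirected version of a DAG: each edge (X, Y) becomes the set {X, Y}."""
    return {frozenset(edge) for edge in directed_edges}


v_structure = [("X", "Z"), ("Y", "Z")]   # X -> Z <- Y
common_cause = [("Z", "X"), ("Z", "Y")]  # X <- Z -> Y

print(skeleton(v_structure) == skeleton(common_cause))
```

The two graphs share a skeleton even though they encode different independencies, which is why the skeleton alone is not enough to characterize an equivalence class.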

Immorality

A v-structure $X \rightarrow Z \leftarrow Y$ is an immorality if there is no direct edge between $X$ and $Y$.
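The definition translates into a short search over pairs of parents. The sketch below assumes a DAG given as directed `(parent, child)` pairs with sortable node names; the function name is hypothetical.

```python
from itertools import combinations


def immoralities(directed_edges):
    """Return triples (x, z, y) with x -> z <- y and no edge between x and y."""
    edges = set(directed_edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)

    found = set()
    for z, ps in parents.items():
        # Every pair of parents of z forms a v-structure at z.
        for x, y in combinations(sorted(ps), 2):
            if (x, y) not in edges and (y, x) not in edges:
                found.add((x, z, y))
    return found


print(immoralities([("A", "O"), ("B", "O")]))
print(immoralities([("A", "O"), ("B", "O"), ("A", "B")]))
```

In the second graph the parents are directly connected, so the v-structure at `O` is not an immorality.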

Relationship between immorality, skeleton and I-equivalence

Let $G_1$ and $G_2$ be two graphs over $\mathcal{X}$. Then $G_1$ and $G_2$ have the same skeleton and the same set of immoralities if and only if they are I-equivalent.

We can use this theorem to recognize whether two BNs are I-equivalent or not.

In addition, this theorem can be used for identifying the structure of the Bayesian network related to a distribution.

We can construct the I-equivalence class for a distribution by determining its skeleton and its immoralities from the independence properties of the given distribution.

We then use both of these components to build a representation of the equivalence class.
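The theorem gives a purely structural test for I-equivalence. The following sketch applies it to DAGs given as directed `(parent, child)` pairs; the representation and function names are assumptions made for illustration.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected edge set of a DAG."""
    return {frozenset(e) for e in edges}


def immoralities(edges):
    """Triples (x, z, y) with x -> z <- y and x, y not adjacent."""
    edges = set(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    return {
        (x, z, y)
        for z, ps in parents.items()
        for x, y in combinations(sorted(ps), 2)
        if (x, y) not in edges and (y, x) not in edges
    }


def i_equivalent(g1, g2):
    """Same skeleton and same immoralities, as in the theorem above."""
    return skeleton(g1) == skeleton(g2) and immoralities(g1) == immoralities(g2)


chain = [("Car", "Speed"), ("Speed", "Fine")]        # Car -> Speed -> Fine
reversed_chain = [("Fine", "Speed"), ("Speed", "Car")]
fork = [("Speed", "Car"), ("Speed", "Fine")]         # Car <- Speed -> Fine
collider = [("Car", "Speed"), ("Fine", "Speed")]     # Car -> Speed <- Fine

print(i_equivalent(chain, reversed_chain), i_equivalent(chain, fork))
print(i_equivalent(chain, collider))
```

The chain, its reversal and the fork fall into one equivalence class; the collider has the same skeleton but an extra immorality, so it is in a different class.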

Identifying the Undirected Skeleton

The basic idea is to use independence queries of the form $(X \perp Y \mid \mathbf{U})$ for different sets of variables $\mathbf{U}$.

If $X$ and $Y$ are adjacent in $G$, we cannot separate them with any set of variables.

Conversely, if $X$ and $Y$ are not adjacent in $G$, we would hope to be able to find a set of variables that makes these two variables conditionally independent: we call this set a witness of their independence.

Let $G$ be an I-map of a distribution $P$, and let $X$ and $Y$ be two variables that are not adjacent in $G$. Then either $P \models (X \perp Y \mid \mathrm{Pa}_X^{G})$ or $P \models (X \perp Y \mid \mathrm{Pa}_Y^{G})$.

Thus, if $X$ and $Y$ are not adjacent in $G$, then we can find a witness of bounded size.

Thus, if we assume that $G$ has bounded indegree, say less than or equal to $d$, then we do not need to consider witness sets larger than $d$.
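The search for witnesses can be sketched as follows. This is an illustrative version under stated assumptions, not the algorithm as printed on the slides: `is_independent(x, y, u)` is a hypothetical oracle that answers whether $(X \perp Y \mid \mathbf{U})$ holds in $P$, and `d` is the assumed bound on the indegree.

```python
from itertools import combinations


def build_skeleton(variables, is_independent, d):
    """Start from the complete graph and drop every edge that has a witness.

    Returns the remaining undirected edges and the witness set found
    for each non-adjacent pair.
    """
    edges = {frozenset(pair) for pair in combinations(variables, 2)}
    witnesses = {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        # Only candidate witness sets of size at most d need to be tried.
        for size in range(d + 1):
            found = next(
                (set(u) for u in combinations(others, size)
                 if is_independent(x, y, set(u))),
                None,
            )
            if found is not None:
                witnesses[frozenset((x, y))] = found
                edges.discard(frozenset((x, y)))
                break
    return edges, witnesses


def chain_oracle(x, y, u):
    """Hypothetical oracle for Car -> Speed -> Fine:
    the only independence is Car and Fine given a set containing Speed."""
    return {x, y} == {"Car", "Fine"} and "Speed" in u


edges, witnesses = build_skeleton(["Car", "Speed", "Fine"], chain_oracle, d=1)
print(edges, witnesses)
```

In practice the oracle would be replaced by a statistical independence test on data, which is outside the scope of this sketch.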

Identifying Immoralities

At this stage we have reconstructed the undirected skeleton. Now, we want to reconstruct edge direction.

Our goal is to consider potential immoralities in the skeleton and for each one determine whether it is indeed an immorality.

A triplet of variables $X$, $Z$, $Y$ is a potential immorality if the skeleton contains $X - Z - Y$ but does not contain an edge between $X$ and $Y$.

A potential immorality $X - Z - Y$ is an immorality if and only if $Z$ is not in the witness set(s) for $X$ and $Y$.
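The test above can be sketched in a few lines. The code assumes the skeleton is a set of `frozenset` pairs and that a witness set has been recorded for every non-adjacent pair, keyed by `frozenset((x, y))`; these representations and the function name are assumptions for illustration.

```python
from itertools import combinations


def mark_immoralities(variables, skeleton, witnesses):
    """Return the directed edges (parent, child) forced by immoralities."""

    def adjacent(a, b):
        return frozenset((a, b)) in skeleton

    directed = set()
    for z in variables:
        neighbours = sorted(v for v in variables if v != z and adjacent(v, z))
        for x, y in combinations(neighbours, 2):
            if adjacent(x, y):
                continue  # x - z - y is not a potential immorality
            # Potential immorality: orient x -> z <- y unless z was needed
            # to make x and y independent.
            if z not in witnesses[frozenset((x, y))]:
                directed.add((x, z))
                directed.add((y, z))
    return directed


# Team example: the abilities are independent with an empty witness set.
team_skeleton = {frozenset({"A", "Outcome"}), frozenset({"B", "Outcome"})}
team_witnesses = {frozenset({"A", "B"}): set()}
print(mark_immoralities(["A", "B", "Outcome"], team_skeleton, team_witnesses))

# Chain example: Speed is in the witness set, so nothing is oriented.
chain_skeleton = {frozenset({"Car", "Speed"}), frozenset({"Speed", "Fine"})}
chain_witnesses = {frozenset({"Car", "Fine"}): {"Speed"}}
print(mark_immoralities(["Car", "Speed", "Fine"], chain_skeleton, chain_witnesses))
```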

Representing Equivalence Classes

An acyclic graph containing both directed and undirected edges is called a partially directed acyclic graph or PDAG.

Let $G$ be a DAG. A chain graph $K$ is a class PDAG of the equivalence class of $G$ if it shares the same skeleton as $G$, and contains a directed edge $X \rightarrow Y$ if and only if all $G'$ that are I-equivalent to $G$ contain the edge $X \rightarrow Y$.

If the edge is directed, then all the members of the equivalence class agree on the orientation of the edge.

If the edge is undirected, there are two DAGs in the equivalence class that disagree on the orientation of the edge.

Is the output of Mark-Immoralities the class PDAG?

Clearly, edges involved in immoralities must be directed in $K$.

The obvious question is whether $K$ can contain directed edges that are not involved in immoralities.

In other words, can there be additional edges whose direction is necessarily the same in every member of the equivalence class?
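For very small graphs, the class PDAG definition and the question above can be explored by brute force: enumerate every orientation of the skeleton, keep the acyclic ones with the same immoralities, and see which edges all members agree on. This is only an exponential-time sketch for illustration, with hypothetical function names, and is not an orientation-rule procedure.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import combinations, product


def immoralities(edges):
    """Triples (x, z, y) with x -> z <- y and x, y not adjacent."""
    edges = set(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    return {
        (x, z, y)
        for z, ps in parents.items()
        for x, y in combinations(sorted(ps), 2)
        if (x, y) not in edges and (y, x) not in edges
    }


def is_acyclic(edges):
    """True if the directed edges contain no cycle."""
    preds = {}
    for u, v in edges:
        preds.setdefault(v, set()).add(u)
    try:
        TopologicalSorter(preds).prepare()
        return True
    except CycleError:
        return False


def class_pdag(edges):
    """Return (directed, undirected) edges of the class PDAG of a small DAG."""
    edges = list(edges)
    target = immoralities(edges)
    members = []
    # Every orientation of the skeleton is a candidate member of the class.
    for flips in product((False, True), repeat=len(edges)):
        cand = [(v, u) if f else (u, v) for (u, v), f in zip(edges, flips)]
        if is_acyclic(cand) and immoralities(cand) == target:
            members.append(set(cand))
    directed = set.intersection(*members)  # orientations shared by all members
    undirected = {frozenset(e) for e in edges} - {frozenset(e) for e in directed}
    return directed, undirected


print(class_pdag([("Car", "Speed"), ("Speed", "Fine")]))
print(class_pdag([("X", "Z"), ("Y", "Z"), ("Z", "W")]))
```

In the second graph the edge `Z -> W` takes part in no immorality, yet every member of the class orients it the same way, because reversing it would create a new immorality at `Z`. So for this small example the answer to the question above is yes.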


References

D. Koller and N. Friedman: Probabilistic Graphical Models. MIT Press, 2009.

C. M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006.
