# Bayesian Networks - Division of Statistical Genomics

AI and Robotics

Nov 7, 2013

Bayesian Networks

Aldi Kraja

Division of Statistical Genomics

Bayesian Networks and Decision Graphs, Chapter 1

A causal network consists of a set of variables and a set of directed links between them

Variables represent events (propositions)

A variable can have any number of states

Purpose: Causal networks can be used to
follow how a change of certainty in one
variable may change certainty of other
variables

Causal networks

Figure: causal network for a reduced start-car problem. Fuel {Y, N} → Fuel Meter Standing {F, ½, E}; Fuel → Start {Y, N}; Clean Sparks {Y, N} → Start.

Causal Networks and d-separation

Serial connection (blocking)

A

B

C

Evidence may be transmitted through a serial connection unless the state of the intermediate variable in the connection is known.

A and C are d-separated given B.

When B is instantiated, it blocks the communication between A and C.

Causal networks and d-separation

Diverging connections (Blocking)

A

B

C

E

Influence can pass between all children of A unless the state of A is known.

Evidence may be transmitted through a diverging connection unless A is instantiated.

Causal networks and d-separation

Converging connections (opening)

A

B

C

E

Case 1: If nothing is known about A except what may be inferred from its parents, then the parents are independent: evidence on one parent has no influence on the other parents.

Case 2: If anything is known about the consequences, then information on one cause may tell us something about the other causes (the explaining-away effect).

Evidence may only be transmitted through a converging connection if either A or one of its descendants has received evidence.

Evidence on a variable is a statement of
the certainties of its states

If the variable is instantiated then the
variable provides hard evidence

Blocking in the case of serial and
diverging connections requires hard
evidence

Opening in the case of converging connections holds for all kinds of evidence.

D-separation

Two distinct variables A and B in a causal network are d-separated if, for all paths between A and B, there is an intermediate variable V (distinct from A and B) such that:

- the connection is SERIAL or DIVERGING and V is instantiated, or

- the connection is CONVERGING and neither V nor any of V's descendants have received evidence.
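The criterion above can be checked mechanically. Below is a minimal sketch (the function names and the dict-of-children graph encoding are mine; it enumerates all simple paths, which is only feasible for small networks), applied to the reduced start-car network from the earlier figure:

```python
def descendants(dag, node):
    """All nodes reachable from `node` via directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def all_paths(dag, a, b):
    """All simple undirected paths from a to b."""
    nbrs = {}
    for u, vs in dag.items():
        for v in vs:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
    paths, stack = [], [(a, [a])]
    while stack:
        node, path = stack.pop()
        if node == b:
            paths.append(path)
            continue
        for n in nbrs.get(node, ()):
            if n not in path:
                stack.append((n, path + [n]))
    return paths

def is_collider(dag, u, v, w):
    """True if u -> v <- w (converging connection at v)."""
    return v in dag.get(u, ()) and v in dag.get(w, ())

def d_separated(dag, a, b, evidence):
    """True if every path between a and b is blocked given `evidence`."""
    for path in all_paths(dag, a, b):
        blocked = False
        for u, v, w in zip(path, path[1:], path[2:]):
            if is_collider(dag, u, v, w):
                # converging: blocked unless v or a descendant has evidence
                if v not in evidence and not (descendants(dag, v) & evidence):
                    blocked = True
            elif v in evidence:  # serial or diverging: blocked if v instantiated
                blocked = True
        if not blocked:
            return False
    return True

# Reduced start-car network: Fuel -> FuelMeter, Fuel -> Start, CleanSparks -> Start
car = {"Fuel": ["FuelMeter", "Start"], "CleanSparks": ["Start"]}
print(d_separated(car, "FuelMeter", "Start", set()))       # diverging, open
print(d_separated(car, "FuelMeter", "Start", {"Fuel"}))    # Fuel blocks
print(d_separated(car, "Fuel", "CleanSparks", set()))      # collider unobserved
print(d_separated(car, "Fuel", "CleanSparks", {"Start"}))  # explaining away
```

Instantiating Fuel blocks the diverging path to Start, while instantiating the collider Start connects Fuel and CleanSparks.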

Probability Theory

Uncertainty arises from noise in the measurements and from the small sample size of the data.

Use probability theory to quantify the
uncertainty.

P(F=R)=4/10

P(F=G)=6/10

Figure: fields of ripe and unripe wheat infected by red or gray fungus.

Probability Theory

The probability of an event is the fraction of times that event occurs out of the total number of trials, in the limit that the total number of trials goes to infinity.
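A quick illustration of this limit (a sketch; the fair coin and the seed are my own choices): the observed fraction of heads approaches 1/2 as the number of trials grows.

```python
import random

random.seed(0)
# Frequency interpretation: estimate P(heads) for a fair coin from n trials.
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(n, heads / n)
```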

Probability Theory

Sum rule: p(X) = Σ_Y p(X,Y)

Product rule: p(X,Y) = p(Y|X) p(X)

Setup: N trials are tabulated with X taking values x_i (i = 1, …, M) and Y taking values y_j (j = 1, …, L); n_ij is the number of trials with X=x_i and Y=y_j, c_i the total count for X=x_i, and r_j the total count for Y=y_j.
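The two rules can be checked on a small table of counts n_ij (the numbers below are invented; the variable names follow the slides):

```python
# Contingency table of counts n_ij over X (rows i) and Y (columns j).
n = [[3, 7, 10],   # counts for X = x_1
     [5, 15, 60]]  # counts for X = x_2
N = sum(sum(row) for row in n)   # total number of trials

# Sum rule: p(X=x_i) = sum_j p(X=x_i, Y=y_j) = c_i / N
p_x = [sum(row) / N for row in n]
assert abs(sum(p_x) - 1.0) < 1e-12

# Product rule: p(X=x_i, Y=y_j) = p(Y=y_j | X=x_i) p(X=x_i) = (n_ij/c_i)(c_i/N)
for row in n:
    c_i = sum(row)
    for n_ij in row:
        assert abs(n_ij / N - (n_ij / c_i) * (c_i / N)) < 1e-12
print("sum and product rules hold for the counts table")
```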

Probability Theory

From the counts:

p(X=x_i, Y=y_j) = n_ij / N

p(X=x_i) = c_i / N, where c_i = Σ_j n_ij

p(Y=y_j | X=x_i) = n_ij / c_i

so that

p(X=x_i, Y=y_j) = (n_ij / c_i)(c_i / N) = p(Y=y_j | X=x_i) p(X=x_i)

which, in the limit N → ∞, gives the sum rule p(X) = Σ_Y p(X,Y) and the product rule p(X,Y) = p(Y|X) p(X).

Probability Theory

Symmetry property

p(X,Y) = p(Y,X), therefore p(Y|X) p(X) = p(X|Y) p(Y)

Bayes' theorem: p(Y|X) = p(X|Y) p(Y) / p(X)

Special case: if X and Y are independent, p(X,Y) = p(X) p(Y).
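A numeric check of the symmetry property and Bayes' theorem on an invented joint table (the state names and numbers are mine):

```python
# A made-up joint distribution over X and Y; marginals via the sum rule.
joint = {("x1", "y1"): 0.1, ("x1", "y2"): 0.3,
         ("x2", "y1"): 0.2, ("x2", "y2"): 0.4}
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in ("x1", "x2")}
p_y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in ("y1", "y2")}

for (x, y), p_xy in joint.items():
    direct = p_xy / p_x[x]                      # p(Y|X) from the product rule
    bayes = (p_xy / p_y[y]) * p_y[y] / p_x[x]   # p(X|Y) p(Y) / p(X)
    assert abs(direct - bayes) < 1e-12
print("Bayes' theorem: p(Y|X) = p(X|Y) p(Y) / p(X)")
```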

Probability Theory

P(W=u | F=R)=8/32=1/4

P(W=r | F=R)=24/32=3/4

P(W=u | F=G)=18/24=3/4

P(W=r | F=G)=6/24=1/4

P(F=R)=4/10=0.4

P(F=G)=6/10=0.6

Figure: conditional probabilities of ripe/unripe wheat under red vs. gray fungus.

Probability Theory

p(W=u)=p(W=u|F=R)p(F=R)+p(W=u|F=G)p(F=G)
=1/4*4/10+3/4*6/10=11/20

p(W=r) = 1 − 11/20 = 9/20

p(F=R|W=r) = p(W=r|F=R) p(F=R) / p(W=r) = (3/4)(4/10)(20/9) = 2/3

p(F=G|W=r) = 1 − 2/3 = 1/3
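The worked example can be reproduced exactly with rational arithmetic (a sketch using Python's fractions module; the variable names are mine, the probabilities come from the slides):

```python
from fractions import Fraction as F

# Priors and likelihoods taken from the slides
p_F = {"R": F(4, 10), "G": F(6, 10)}                       # fungus color
p_W_given_F = {("u", "R"): F(1, 4), ("r", "R"): F(3, 4),   # wheat given fungus
               ("u", "G"): F(3, 4), ("r", "G"): F(1, 4)}

# Sum rule: p(W=u) = sum over F of p(W=u|F) p(F)
p_u = sum(p_W_given_F[("u", f)] * p_F[f] for f in p_F)
print(p_u, 1 - p_u)                 # → 11/20 9/20

# Bayes' theorem: p(F=R|W=r) = p(W=r|F=R) p(F=R) / p(W=r)
p_R_given_r = p_W_given_F[("r", "R")] * p_F["R"] / (1 - p_u)
print(p_R_given_r, 1 - p_R_given_r)  # → 2/3 1/3
```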


Conditional probabilities

Diverging connection (blocking)

p(a|b)p(b)=p(a,b)

p(a|b,c)p(b|c)=p(a,b|c)

p(b|a)=p(a|b)p(b)/p(a)

p(b|a,c)=p(a|b,c)p(b|c)/p(a|c)

Structure: a ← b → c, with joint p(a,b,c) = p(a|b) p(c|b) p(b)

p(a,c|b) = p(a,b,c)/p(b) = p(a|b) p(c|b) p(b)/p(b) = p(a|b) p(c|b)

a ⊥ c | b
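The factorization p(a,b,c) = p(a|b) p(c|b) p(b) can be checked numerically (the CPT numbers below are made up): given b the connection is blocked, while with b unobserved evidence can pass.

```python
# Diverging connection a <- b -> c.
p_b = {0: 0.3, 1: 0.7}
p_a_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p(a|b), indexed [b][a]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}

def joint(a, b, c):
    return p_a_given_b[b][a] * p_c_given_b[b][c] * p_b[b]

# Given b, the connection is blocked: p(a,c|b) = p(a|b) p(c|b)
for b in (0, 1):
    for a in (0, 1):
        for c in (0, 1):
            assert abs(joint(a, b, c) / p_b[b]
                       - p_a_given_b[b][a] * p_c_given_b[b][c]) < 1e-12

# With b unobserved, a and c are dependent: p(a,c) != p(a) p(c)
p_a0 = sum(joint(0, b, c) for b in (0, 1) for c in (0, 1))
p_c0 = sum(joint(a, b, 0) for a in (0, 1) for b in (0, 1))
p_a0_c0 = sum(joint(0, b, 0) for b in (0, 1))
print(abs(p_a0_c0 - p_a0 * p_c0) > 1e-3)  # → True
```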

Conditional probabilities

Serial connection (blocking)


Structure: a → b → c, with joint p(a,b,c) = p(a) p(b|a) p(c|b)

p(a,c|b) = p(a,b,c)/p(b) = p(a) p(b|a) p(c|b)/p(b) =
p(a) {p(a|b) p(b)/p(a)} p(c|b)/p(b) = p(a|b) p(c|b)

a ⊥ c | b
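Mirroring the serial-connection algebra numerically (made-up CPTs, my own variable names): all conditionals are computed from the joint, and p(a,c|b) factors into p(a|b) p(c|b).

```python
# Serial connection a -> b -> c with p(a,b,c) = p(a) p(b|a) p(c|b).
p_a = {0: 0.4, 1: 0.6}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}

def joint(a, b, c):
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

for b in (0, 1):
    pb = sum(joint(a, b, c) for a in (0, 1) for c in (0, 1))
    for a in (0, 1):
        pa_b = sum(joint(a, b, c) for c in (0, 1)) / pb      # p(a|b)
        for c in (0, 1):
            pc_b = sum(joint(x, b, c) for x in (0, 1)) / pb  # p(c|b)
            # instantiating b blocks the path: p(a,c|b) = p(a|b) p(c|b)
            assert abs(joint(a, b, c) / pb - pa_b * pc_b) < 1e-12
print("b blocks the serial connection a -> b -> c")
```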

Conditional probabilities

Convergence connection (opening)


Structure: a → b ← c, with joint p(a,b,c) = p(a) p(c) p(b|a,c)

p(a,c|b) = p(a,b,c)/p(b) = p(a) p(c) p(b|a,c)/p(b), which does not factor into p(a|b) p(c|b) in general.

a ⊥ c | ∅, but a and c are dependent given b
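The explaining-away effect can be seen numerically in a toy v-structure (the numbers and the car-themed reading are my own): a and c are marginally independent, but once the child b is observed, evidence on c shifts the belief about a.

```python
# Converging connection a -> b <- c with p(a,b,c) = p(a) p(c) p(b|a,c).
# Reading: a = "fuel", c = "clean sparks", b = "car starts";
# the car is likely to start only when both causes hold.
p_a = {0: 0.5, 1: 0.5}
p_c = {0: 0.5, 1: 0.5}
p_b1 = {(a, c): 0.9 if (a, c) == (1, 1) else 0.1 for a in (0, 1) for c in (0, 1)}

def joint(a, b, c):
    pb = p_b1[(a, c)]
    return p_a[a] * p_c[c] * (pb if b == 1 else 1 - pb)

# Marginally, a and c are independent (the connection is blocked):
p_a1c1 = sum(joint(1, b, 1) for b in (0, 1))
assert abs(p_a1c1 - p_a[1] * p_c[1]) < 1e-12

# Observing b = 0 ("does not start") opens the connection:
p_b0 = sum(joint(a, 0, c) for a in (0, 1) for c in (0, 1))
p_a1_b0 = sum(joint(1, 0, c) for c in (0, 1)) / p_b0
# Learning c = 1 ("sparks are clean") shifts the blame onto the fuel:
p_a1_b0c1 = joint(1, 0, 1) / sum(joint(a, 0, 1) for a in (0, 1))
print(round(p_a1_b0, 3), round(p_a1_b0c1, 3))  # → 0.357 0.1
```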

Graphical Models

We need probability theory to quantify the uncertainty. All probabilistic inference can be expressed with the sum rule and the product rule.

p(a,b,c)=p(c|a,b)p(a,b)

p(a,b,c)=p(c|a,b)p(b|a)p(a)

Figure: DAG with edges a → b, a → c, b → c.

p(x_1, x_2, …, x_{K-1}, x_K) = p(x_K | x_1, …, x_{K-1}) ⋯ p(x_2 | x_1) p(x_1)

Graphical Models

DAG explaining the joint distribution of x_1, …, x_7

The joint distribution defined by a graph is given by the product, over all of the nodes
of a graph, of a conditional distribution of each node conditioned on the variables
corresponding to the parents of that node in the graph.

p(x_1, …, x_7) = p(x_1) p(x_2) p(x_3) p(x_4 | x_1, x_2, x_3) p(x_5 | x_1, x_3) p(x_6 | x_4) p(x_7 | x_5)

Figure: DAG over x_1, …, x_7 with parent sets as in the factorization.

In general, for a graph with K nodes:

p(x) = Π_{k=1}^{K} p(x_k | pa_k)

where pa_k denotes the parents of node x_k.
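The product over nodes can be evaluated directly (a sketch on the seven-node example; all variables are binary and the CPT entries are randomly generated, i.e. made up): summing the product of conditionals over all assignments confirms that the factorization defines a valid joint distribution.

```python
import itertools
import random

random.seed(1)
# Parent sets for the seven-node DAG example
parents = {1: [], 2: [], 3: [], 4: [1, 2, 3], 5: [1, 3], 6: [4], 7: [5]}

# Random CPTs: cpt[k][parent values] = p(x_k = 1 | pa_k)
cpt = {k: {pa: random.random() for pa in itertools.product((0, 1), repeat=len(ps))}
       for k, ps in parents.items()}

def p_joint(x):
    """x maps node -> value; multiply each node's conditional given its parents."""
    prob = 1.0
    for k, ps in parents.items():
        p1 = cpt[k][tuple(x[p] for p in ps)]
        prob *= p1 if x[k] == 1 else 1 - p1
    return prob

# The factorization defines a distribution: probabilities sum to 1
total = sum(p_joint(dict(zip(parents, vals)))
            for vals in itertools.product((0, 1), repeat=7))
print(round(total, 10))  # → 1.0
```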