Bayesian Networks - Division of Statistical Genomics

Nov 7, 2013

Bayesian Networks

Aldi Kraja

Division of Statistical Genomics


Bayesian Networks and Decision Graphs, Chapter 1


A causal network consists of a set of variables and a set of directed links between variables.


Variables represent events (propositions).


A variable can have any number of states.


Purpose: causal networks can be used to follow how a change of certainty in one variable may change the certainty of other variables.

Causal networks

[Figure: causal network for a reduced car-start problem. Variables and states: Fuel (Y, N); Clean Sparks (Y, N); Fuel Meter Standing (F, ½, E); Start (Y, N).]

Causal networks and d-separation


Serial connection (blocking)

A → B → C

Evidence may be transmitted through a serial connection unless the state of the variable in the connection is known.

A and C are d-separated given B.

When B is instantiated, it blocks communication between A and C.

Causal networks and d-separation


Diverging connections (blocking)

[Figure: A is the parent of B, C, ..., E.]

Influence can pass between all children of A unless the state of A is known.

Evidence may be transmitted through a diverging connection unless the parent A is instantiated.

Causal networks and d-separation


Converging connections (opening)

[Figure: B, C, ..., E are parents of A.]

Case 1: If nothing is known about A except what may be inferred from knowledge of its parents, then the parents are independent. Evidence on one of the parents has no influence on the other parents.

Case 2: If anything is known about the consequence, then information on one cause may tell us something about the other causes (the explaining-away effect).

Evidence may be transmitted through a converging connection only if either A or one of its descendants has received evidence.

Evidence


Evidence on a variable is a statement of the certainties of its states.


If the variable is instantiated, the variable provides hard evidence.


Blocking in the case of serial and diverging connections requires hard evidence.


Opening in the case of converging connections holds for all kinds of evidence.



D-separation


Two distinct variables A and B in a causal network are d-separated if, for all paths between A and B, there is an intermediate variable V (distinct from A and B) such that either:

- the connection is serial or diverging and V is instantiated, or
- the connection is converging and neither V nor any of V's descendants have received evidence.
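
The definition above can be turned into a small path-based check. The following is a minimal sketch in Python, not an efficient algorithm: it enumerates every simple path between the two variables and tests each intermediate variable against the two blocking conditions. It assumes all evidence is hard (instantiated variables), and the function names `descendants` and `d_separated`, as well as the extra node `D` in the converging example, are hypothetical choices for illustration.

```python
def descendants(children, v):
    """Return every node reachable from v by following directed links."""
    seen, stack = set(), [v]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def d_separated(edges, a, b, evidence=()):
    """True if a and b are d-separated given the instantiated variables in `evidence`.

    `edges` is an iterable of (parent, child) pairs describing the causal network.
    """
    evidence = set(evidence)
    edge_set = set(edges)
    children, neighbours = {}, {}
    for parent, child in edge_set:
        children.setdefault(parent, set()).add(child)
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)

    def blocked(path):
        # A path is blocked if at least one intermediate variable V blocks it.
        for prev, v, nxt in zip(path, path[1:], path[2:]):
            converging = (prev, v) in edge_set and (nxt, v) in edge_set
            if converging:
                # Blocks unless V or one of its descendants has received evidence.
                if not (({v} | descendants(children, v)) & evidence):
                    return True
            elif v in evidence:
                # Serial or diverging connection with V instantiated.
                return True
        return False

    def paths(node, visited):
        # Enumerate simple paths from `node` to b, ignoring link direction.
        if node == b:
            yield visited
            return
        for n in neighbours.get(node, ()):
            if n not in visited:
                yield from paths(n, visited + [n])

    return all(blocked(p) for p in paths(a, [a]))


# The three connection types from the slides (D is a hypothetical child of A).
serial = [("A", "B"), ("B", "C")]
diverging = [("A", "B"), ("A", "C"), ("A", "E")]
converging = [("B", "A"), ("C", "A"), ("E", "A"), ("A", "D")]

print(d_separated(serial, "A", "C", {"B"}))      # blocked by instantiating B
print(d_separated(diverging, "B", "C", {"A"}))   # blocked by instantiating A
print(d_separated(converging, "B", "C"))         # blocked while A has no evidence
print(d_separated(converging, "B", "C", {"D"}))  # opened by evidence on a descendant of A
```

Enumerating paths follows the wording of the definition directly, at the cost of exponential time on large networks.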

Probability Theory


Uncertainty arises from noise in the measurements and from the small sample size of the data.


Use probability theory to quantify the uncertainty.


P(F=R) = 4/10

P(F=G) = 6/10

[Figure: ripe and unripe wheat under red fungus (R) and gray fungus (G).]

Probability Theory


The probability of an event is the fraction of times that event occurs out of the total number of trials, in the limit that the total number of trials goes to infinity.

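Probability Theory


Sum rule:

$$p(X) = \sum_{Y} p(X, Y)$$

Product rule:

$$p(X, Y) = p(Y \mid X)\, p(X)$$

[Figure: grid of counts with columns i = 1, ..., M (X = x_i) and rows j = 1, ..., L (Y = y_j). Cell (i, j) holds n_ij, the number of trials with X = x_i and Y = y_j; c_i is the total of column i and r_j is the total of row j.]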

Probability Theory

$$p(X = x_i, Y = y_j) = \frac{n_{ij}}{N}$$

$$p(X = x_i) = \frac{c_i}{N}, \quad \text{where } c_i = \sum_{j} n_{ij}$$

$$p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j)$$

$$p(Y = y_j \mid X = x_i) = \frac{n_{ij}}{c_i}$$

$$p(X = x_i, Y = y_j) = \frac{n_{ij}}{N} = \frac{n_{ij}}{c_i} \cdot \frac{c_i}{N} = p(Y = y_j \mid X = x_i)\, p(X = x_i)$$
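
The counting argument behind the sum and product rules can be checked numerically. The sketch below is only illustrative: the count table `n` is made up (it is not data from the slides), and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical counts: n[i][j] = number of trials with X = x_i and Y = y_j.
n = [[3, 1, 2],
     [4, 6, 4]]
M, L = len(n), len(n[0])
N = sum(sum(row) for row in n)                 # total number of trials

joint = [[Fraction(n[i][j], N) for j in range(L)] for i in range(M)]    # p(X=x_i, Y=y_j)
c = [sum(n[i]) for i in range(M)]                                       # c_i = sum_j n_ij
p_x = [Fraction(c[i], N) for i in range(M)]                             # p(X=x_i)
cond = [[Fraction(n[i][j], c[i]) for j in range(L)] for i in range(M)]  # p(Y=y_j | X=x_i)

# Sum rule: p(X=x_i) = sum_j p(X=x_i, Y=y_j)
sum_rule_ok = all(p_x[i] == sum(joint[i]) for i in range(M))

# Product rule: p(X=x_i, Y=y_j) = p(Y=y_j | X=x_i) p(X=x_i)
product_rule_ok = all(joint[i][j] == cond[i][j] * p_x[i]
                      for i in range(M) for j in range(L))

print(sum_rule_ok, product_rule_ok)
```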

Probability Theory


Symmetry property:

$$p(X, Y) = p(Y, X)$$

$$p(Y \mid X)\, p(X) = p(X \mid Y)\, p(Y)$$

Bayes' theorem:

$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}$$

Special case (X and Y independent):

$$p(X, Y) = p(X)\, p(Y)$$

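Probability Theory


P(W=u | F=R) = 8/32 = 1/4

P(W=r | F=R) = 24/32 = 3/4

P(W=u | F=G) = 18/24 = 3/4

P(W=r | F=G) = 6/24 = 1/4

P(F=R) = 4/10 = 0.4

P(F=G) = 6/10 = 0.6

[Figure: unripe (u) and ripe (r) wheat under gray fungus (G) and red fungus (R); the conditional probabilities for each fungus sum to 1.]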

Probability Theory


p(W=u) = p(W=u|F=R) p(F=R) + p(W=u|F=G) p(F=G) = 1/4 * 4/10 + 3/4 * 6/10 = 11/20


p(W=r) = 1 - 11/20 = 9/20


p(F=R|W=r) = p(W=r|F=R) p(F=R) / p(W=r) = 3/4 * 4/10 * 20/9 = 2/3


p(F=G|W=r) = 1 - 2/3 = 1/3
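
The wheat and fungus calculation can be reproduced with exact fractions. This is a small sketch using the probabilities stated on the slides; the dictionary names are hypothetical.

```python
from fractions import Fraction

# Prior over fungus colour: R = red, G = gray.
p_f = {"R": Fraction(4, 10), "G": Fraction(6, 10)}

# Conditional p(W | F): u = unripe wheat, r = ripe wheat.
p_w_given_f = {
    "R": {"u": Fraction(1, 4), "r": Fraction(3, 4)},
    "G": {"u": Fraction(3, 4), "r": Fraction(1, 4)},
}

# Sum and product rules: p(W=w) = sum_f p(W=w | F=f) p(F=f)
p_w = {w: sum(p_w_given_f[f][w] * p_f[f] for f in p_f) for w in ("u", "r")}

# Bayes' theorem: p(F=f | W=r) = p(W=r | F=f) p(F=f) / p(W=r)
p_f_given_ripe = {f: p_w_given_f[f]["r"] * p_f[f] / p_w["r"] for f in p_f}

print(p_w["u"], p_w["r"])                          # 11/20 9/20
print(p_f_given_ripe["R"], p_f_given_ripe["G"])    # 2/3 1/3
```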

Conditional probabilities


Diverging connection (blocking)


p(a|b) p(b) = p(a,b)


p(a|b,c) p(b|c) = p(a,b|c)


p(b|a) = p(a|b) p(b) / p(a)


p(b|a,c) = p(a|b,c) p(b|c) / p(a|c)

[Figures: the diverging connection a ← b → c.]

p(a,b,c) = p(a|b) p(c|b) p(b)

p(a,c|b) = p(a,b,c) / p(b) = p(a|b) p(c|b) p(b) / p(b) = p(a|b) p(c|b)

a ⫫ c | b

Conditional probabilities


Serial connection (blocking)


p(a|b) p(b) = p(a,b)


p(a|b,c) p(b|c) = p(a,b|c)


p(b|a) = p(a|b) p(b) / p(a)


p(b|a,c) = p(a|b,c) p(b|c) / p(a|c)

[Figures: the serial connection a → b → c.]

p(a,b,c) = p(a) p(b|a) p(c|b)

p(a,c|b) = p(a,b,c) / p(b) = p(a) p(b|a) p(c|b) / p(b) = p(a) {p(a|b) p(b) / p(a)} p(c|b) / p(b) = p(a|b) p(c|b)


a ⫫ c | b
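
The serial-connection result can be checked numerically on a small example. The sketch below builds the joint distribution of a chain a → b → c from hypothetical binary conditional probability tables (the numbers are invented for illustration) and verifies that p(a,c|b) = p(a|b) p(c|b), while a and c remain dependent when b is not observed.

```python
from itertools import product

# Hypothetical binary CPTs for the chain a -> b -> c.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}   # p_c_given_b[b][c]

# Joint from the factorization p(a,b,c) = p(a) p(b|a) p(c|b).
joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
         for a, b, c in product((0, 1), repeat=3)}


def prob(a=None, b=None, c=None):
    """Probability of the given assignments; variables left as None are summed out."""
    return sum(p for (x, y, z), p in joint.items()
               if a in (None, x) and b in (None, y) and c in (None, z))


# Conditional independence given b: p(a,c|b) == p(a|b) p(c|b) for every assignment.
independent_given_b = all(
    abs(prob(a, b, c) / prob(b=b)
        - (prob(a=a, b=b) / prob(b=b)) * (prob(b=b, c=c) / prob(b=b))) < 1e-12
    for a, b, c in product((0, 1), repeat=3)
)

# Without evidence on b, a and c are dependent: p(a,c) != p(a) p(c) somewhere.
dependent_marginally = any(
    abs(prob(a=a, c=c) - prob(a=a) * prob(c=c)) > 1e-6
    for a, c in product((0, 1), repeat=2)
)

print(independent_given_b, dependent_marginally)
```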

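Conditional probabilities


Converging connection (opening)


p(a|b) p(b) = p(a,b)


p(a|b,c) p(b|c) = p(a,b|c)


p(b|a) = p(a|b) p(b) / p(a)


p(b|a,c) = p(a|b,c) p(b|c) / p(a|c)

[Figures: the converging connection a → b ← c.]

p(a,b,c) = p(a) p(c) p(b|a,c)

p(a,c|b) = p(a,b,c) / p(b) = p(a) p(c) p(b|a,c) / p(b)


a ⫫ c | ∅ (summing the joint over b gives p(a,c) = p(a) p(c))

a and c are not independent given b: the expression for p(a,c|b) does not in general factorize into p(a|b) p(c|b).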
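
The opening behaviour of a converging connection (explaining away) can also be illustrated numerically. The following sketch uses hypothetical numbers for two independent causes a and c of a common effect b. It checks that a and c are independent when b is unobserved, and that once b is observed, additionally learning c changes the belief in a.

```python
from itertools import product

# Hypothetical priors for two independent causes a and c.
p_a = {0: 0.8, 1: 0.2}
p_c = {0: 0.9, 1: 0.1}
# Hypothetical p(b=1 | a, c): the effect is likely if either cause is present.
p_b1 = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.7, (1, 1): 0.95}

# Joint from the factorization p(a,b,c) = p(a) p(c) p(b|a,c).
joint = {(a, b, c): p_a[a] * p_c[c] * (p_b1[(a, c)] if b == 1 else 1 - p_b1[(a, c)])
         for a, b, c in product((0, 1), repeat=3)}


def prob(a=None, b=None, c=None):
    """Probability of the given assignments; variables left as None are summed out."""
    return sum(p for (x, y, z), p in joint.items()
               if a in (None, x) and b in (None, y) and c in (None, z))


# With no evidence on b, the causes are independent: p(a,c) = p(a) p(c).
independent_marginally = all(
    abs(prob(a=a, c=c) - prob(a=a) * prob(c=c)) < 1e-12
    for a, c in product((0, 1), repeat=2)
)

# With b observed, evidence on c changes the belief in a (explaining away).
p_a1_given_b1 = prob(a=1, b=1) / prob(b=1)
p_a1_given_b1_c1 = prob(a=1, b=1, c=1) / prob(b=1, c=1)

print(independent_marginally)
print(p_a1_given_b1, p_a1_given_b1_c1)
```

With these particular numbers, observing the other cause c lowers the probability of a, which is the explaining-away effect described for Case 2 of the converging connection.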

Graphical Models


We need probability theory to quantify the uncertainty. All probabilistic inference can be expressed with the sum rule and the product rule.

p(a,b,c) = p(c|a,b) p(a,b)

p(a,b,c) = p(c|a,b) p(b|a) p(a)

[Figure: DAG over a, b and c with links a → b, a → c and b → c.]

$$p(x_1, x_2, \dots, x_{K-1}, x_K) = p(x_K \mid x_1, \dots, x_{K-1}) \cdots p(x_2 \mid x_1)\, p(x_1)$$

Graphical Models


DAG explaining the joint distribution of x_1, ..., x_7

[Figure: DAG with nodes x_1, ..., x_7.]

The joint distribution defined by a graph is given by the product, over all of the nodes of the graph, of a conditional distribution for each node conditioned on the variables corresponding to the parents of that node in the graph.

$$p(x_1, \dots, x_7) = p(x_1)\, p(x_2)\, p(x_3)\, p(x_4 \mid x_1, x_2, x_3)\, p(x_5 \mid x_1, x_3)\, p(x_6 \mid x_4)\, p(x_7 \mid x_5)$$

$$p(\mathbf{x}) = \prod_{k=1}^{K} p(x_k \mid \mathrm{pa}_k)$$
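
The general factorization can be sketched in code for the seven-variable example. In the sketch below the parent sets are read off the factorization on this slide, while the conditional probability tables are random placeholders (an assumption made purely for illustration, with all variables taken to be binary). The check at the end confirms that the product of the local conditionals defines a normalized joint distribution.

```python
import random
from itertools import product

# Parent sets pa_k for each node k, read off the factorization on the slide.
parents = {1: (), 2: (), 3: (), 4: (1, 2, 3), 5: (1, 3), 6: (4,), 7: (5,)}

# Hypothetical CPTs: cpt[k][parent_values] = p(x_k = 1 | pa_k), drawn at random.
rng = random.Random(0)
cpt = {k: {vals: rng.random() for vals in product((0, 1), repeat=len(pa))}
       for k, pa in parents.items()}


def joint(x):
    """p(x_1, ..., x_7) as the product over nodes of p(x_k | pa_k); x maps node -> 0 or 1."""
    p = 1.0
    for k, pa in parents.items():
        p_one = cpt[k][tuple(x[j] for j in pa)]
        p *= p_one if x[k] == 1 else 1.0 - p_one
    return p


# Summing the joint over all 2^7 assignments should give 1.
total = sum(joint(dict(zip(parents, values)))
            for values in product((0, 1), repeat=len(parents)))
print(total)
```

Each node stores only a table over its own parents, so the largest table here has 2^3 entries rather than the 2^7 entries needed for an unstructured joint.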