Impact of Structuring on Bayesian Network Learning and Reasoning
Mieczysław A. Kłopotek
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
First Warsaw International Seminar on Soft Computing
Warsaw, September 8th, 2003
Agenda
Definitions
Approximate Reasoning
Bayesian networks
Reasoning in Bayesian networks
Learning Bayesian networks from data
Structured Bayesian networks (SBN)
Reasoning in SBN
Learning SBN from data
Concluding remarks
Approximate Reasoning
One possible method of expressing uncertainty: the Joint Probability Distribution
Variables: causes, effects, observables
Reasoning: how probable is it that a variable takes a given value if we know the values of some other variables
Given: P(X,Y,...,Z)
Find: P(X=x | T=t,...,W=w)
Difficult if more than 40 variables have to be taken into account (hard to represent, hard to reason, hard to collect data)
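To see why this is hard, here is a minimal sketch of reasoning directly over an explicit joint table: the table has 2^n entries for n binary variables, and every query touches all of them. The toy distribution below is made up for illustration.

```python
from itertools import product

# A toy joint distribution P(X, Y, Z) over binary variables, stored as
# an explicit table -- the representation the slide warns about: its
# size grows as 2^n with n variables. (Values are made up.)
joint = {
    (x, y, z): p
    for (x, y, z), p in zip(
        product([0, 1], repeat=3),
        [0.10, 0.05, 0.20, 0.05, 0.15, 0.10, 0.05, 0.30],
    )
}

def query(joint, target_idx, target_val, evidence):
    """P(target = target_val | evidence), evidence = {var index: value}."""
    num = den = 0.0
    for assignment, p in joint.items():
        if all(assignment[i] == v for i, v in evidence.items()):
            den += p                                  # P(evidence)
            if assignment[target_idx] == target_val:
                num += p                              # P(target, evidence)
    return num / den

# P(X=1 | Z=0): a full sweep over the table, then normalization.
print(round(query(joint, 0, 1, {2: 0}), 4))  # 0.4
```

With 3 variables this is trivial; with the 40+ variables mentioned above, the same sweep is infeasible, which is what motivates the factored representation of the next slides.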
Bayesian Network
The method of choice for representing uncertainty in AI.
Many efficient reasoning methods and learning methods
Utilize explicit representation of structure to:
provide a natural and compact representation of large probability distributions,
allow for efficient methods for answering a wide range of queries.
Bayesian Network
Efficient and effective representation of a probability distribution
Directed acyclic graph
Nodes – random variables of interest
Edges – direct (causal) influence
Nodes are statistically independent of their non-descendants given the state of their parents
A Bayesian network
[Figure: a directed acyclic graph over nodes R, S, T, X, Y, Z]
Pr(r,s,x,z,y) = Pr(z) · Pr(s|z) · Pr(y|z) · Pr(x|y) · Pr(r|y,s)
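The factorization can be checked numerically. The CPT values below are hypothetical; only the structure of the product follows the slide's formula.

```python
# Hypothetical CPTs for the slide's factorization
#   Pr(r,s,x,z,y) = Pr(z) * Pr(s|z) * Pr(y|z) * Pr(x|y) * Pr(r|y,s).
# All variables binary; the probability values are made up.
P_z = {0: 0.6, 1: 0.4}
P_s_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P_s_given_z[z][s]
P_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}
P_x_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
P_r_given_ys = {(y, s): {0: 0.5 + 0.1 * y - 0.2 * s,
                         1: 0.5 - 0.1 * y + 0.2 * s}
                for y in (0, 1) for s in (0, 1)}

def pr(r, s, x, z, y):
    """One entry of the joint, as the product of local factors."""
    return (P_z[z] * P_s_given_z[z][s] * P_y_given_z[z][y]
            * P_x_given_y[y][x] * P_r_given_ys[(y, s)][r])

# The factors define a proper joint distribution: it sums to 1,
# while only 2 + 4 + 4 + 4 + 8 numbers are stored instead of 2^5.
total = sum(pr(r, s, x, z, y)
            for r in (0, 1) for s in (0, 1) for x in (0, 1)
            for z in (0, 1) for y in (0, 1))
print(round(total, 10))  # 1.0
```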
Applications of Bayesian networks
Genetic optimization algorithms with probabilistic mutation/crossing mechanisms
Classification, including text classification
Medical diagnosis (PathFinder, QMR), other decision-making tasks under uncertainty
Hardware diagnosis (Microsoft troubleshooter, NASA/Rockwell Vista project)
Information retrieval (Ricoh helpdesk)
Recommender systems
others
Reasoning – the problem with a Bayesian network
Pearl's fusion algorithm was elaborated for tree-like networks only
For other types of networks, transformations to trees:
transformation to a Markov tree (MT) is needed (Shafer/Shenoy, Spiegelhalter/Lauritzen) – except for trees and polytrees NP-hard
cutset reasoning (Pearl) – finding cutsets is difficult, and the reasoning complexity grows exponentially with the cutset size needed
evidence absorption reasoning by edge reversal (Shachter) – not always possible in a simple way
Towards MT – moral graph
[Figure: the example graph over R, S, T, X, Y, Z, moralized]
Parents of a node in the BN connected, edges not oriented
Towards MT – triangulated graph
[Figure: the moralized example graph with chords added]
All cycles with more than 3 nodes have at least one link between non-neighboring nodes of the cycle.
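The triangulation property above (every cycle longer than 3 has a chord, i.e. chordality) can be tested mechanically. Here is a minimal sketch using maximum cardinality search, a standard chordality test not mentioned on the slide; the first example graph is the triangulated graph from these slides.

```python
def is_triangulated(adj):
    """Chordality test. adj: dict node -> set of neighbors (undirected)."""
    # Maximum cardinality search: repeatedly number the vertex with
    # the most already-numbered neighbors.
    order, numbered = [], set()
    while len(order) < len(adj):
        v = max((n for n in adj if n not in numbered),
                key=lambda n: len(adj[n] & numbered))
        order.append(v)
        numbered.add(v)
    # The reverse of the MCS order is a perfect elimination ordering
    # iff the graph is chordal; verify that property.
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        earlier = {u for u in adj[v] if pos[u] < pos[v]}
        if earlier:
            # The latest-numbered earlier neighbor must be adjacent
            # to all other earlier neighbors of v.
            u = max(earlier, key=lambda n: pos[n])
            if not (earlier - {u}) <= adj[u]:
                return False
    return True

# The triangulated example graph from the slides
# (cliques {Z,T,Y}, {T,Y,S}, {Y,S,R}, {Y,X}):
g = {'Z': {'T', 'Y'}, 'T': {'Z', 'Y', 'S'},
     'Y': {'Z', 'T', 'S', 'R', 'X'},
     'S': {'T', 'Y', 'R'}, 'R': {'Y', 'S'}, 'X': {'Y'}}
print(is_triangulated(g))   # True

# A 4-cycle without a chord violates the property:
c4 = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'C', 'A'}}
print(is_triangulated(c4))  # False
```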
Towards MT – hypertree
[Figure: the triangulated graph with its cliques marked as hyperedges]
Hypertree = acyclic hypergraph
The Markov tree
[Figure: Markov tree over hypernodes {Z,T,Y}, {T,Y,S}, {Y,S,R}, {Y,X}]
Hypernodes of the hypertree are nodes of the Markov tree
Junction tree – alternative representation of MT
[Figure: junction tree over nodes {Z,T,S}, {Z,Y,S}, {Y,S,R}, {Y,X} with separators {Z,S}, {Y,S}, {Y} on the joining edges]
Common BN nodes assigned to edges joining MT nodes
Efficient reasoning in Markov trees, but ....
[Figure: the junction tree with messages (msg) passed along its edges]
MT node contents projected onto common variables are passed to the neighbors
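The projection step described above can be sketched in a few lines: a clique's table is marginalized onto the separator variables it shares with a neighbor, and that marginal is the message. The clique potential and its values below are hypothetical.

```python
from itertools import product

def project(table, vars_, sep):
    """Marginalize a clique table onto the separator variables `sep`.
    `table` maps full assignments (tuples, ordered as in vars_) to values."""
    idx = [vars_.index(v) for v in sep]
    msg = {}
    for assignment, val in table.items():
        key = tuple(assignment[i] for i in idx)
        msg[key] = msg.get(key, 0.0) + val   # sum out the other variables
    return msg

# Hypothetical potential on the clique {Z, Y, S} (binary variables,
# uniform for simplicity); the message to a neighbor sharing {Y, S}
# is the potential's projection onto that separator.
clique_vars = ['Z', 'Y', 'S']
potential = {a: 0.125 for a in product([0, 1], repeat=3)}
msg = project(potential, clique_vars, ['Y', 'S'])
print(msg)  # {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
```

Each message is only as large as the separator's state space, which is why propagation in a Markov tree with small cliques stays cheap.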
Triangulability test – triangulation not always possible
All neighbors need to be connected
Evidence absorption reasoning
[Figure: the example network over R, S, T, X, Y, Z shown before and after evidence absorption, and after edge reversal]
Efficient only for a good-luck selection of conditioning variables
Cutset reasoning – fixing values of some nodes creates a (poly)tree
[Figure: the example network with one node fixed, hence one edge ignorable]
How to overcome the difficulty when reasoning with BN
Learn directly a triangulated graph or Markov tree from data (Cercone N., Wong S.K.M., Xiang Y.)
Hard and inefficient for long dependence chains, danger of large hypernodes
Learn only tree-structured/polytree-structured BN (e.g. in Goldberg's Bayesian Genetic Algorithms, TAN text classifiers etc.)
Oversimplification, long dependence chains lost
Our approach: propose a more general class of Bayesian networks that is still efficient for reasoning
What is a structured Bayesian network
An analogue of well-structured programs
Graphical structure: nested sequences and alternatives
By collapsing sequences and alternatives to single nodes, a single node is obtainable
Efficient reasoning possible
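The collapsing idea can be sketched structurally. This is a rough sketch under an assumed reading (series-parallel-style reduction), not the author's exact procedure: a DAG of nested sequences and alternatives should reduce to a single edge by repeatedly (a) collapsing a "sequence" node with exactly one parent and one child, and (b) merging parallel edges left by the branches of an "alternative".

```python
from collections import Counter

def collapses(nodes, edge_list):
    """Check whether a DAG reduces to (at most) a single edge by
    sequence collapsing and parallel-edge merging. Assumed reading
    of SBN collapsing, for illustration only."""
    nodes = set(nodes)
    edges = Counter(edge_list)            # (parent, child) -> multiplicity
    changed = True
    while changed:
        changed = False
        for e in list(edges):             # (b) merge parallel edges
            if edges[e] > 1:
                edges[e] = 1
                changed = True
        for v in list(nodes):             # (a) collapse a sequence node
            ins = [e for e in edges if e[1] == v]
            outs = [e for e in edges if e[0] == v]
            if len(ins) == 1 and len(outs) == 1:
                p, c = ins[0][0], outs[0][1]
                del edges[ins[0]]
                del edges[outs[0]]
                edges[(p, c)] += 1        # bypass v with a direct edge
                nodes.discard(v)
                changed = True
                break
    return sum(edges.values()) <= 1 and len(nodes) <= 2

# A sequence containing an alternative (A -> B -> {C|D} -> E -> F) collapses:
print(collapses('ABCDEF', [('A', 'B'), ('B', 'C'), ('B', 'D'),
                           ('C', 'E'), ('D', 'E'), ('E', 'F')]))  # True
# A non-nested crossing structure does not:
print(collapses('ABCD', [('A', 'C'), ('A', 'D'),
                         ('B', 'C'), ('B', 'D')]))                # False
```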
Structured Bayesian Network (SBN), an example
[Figure: an example SBN; for comparison, a tree-structured BN]
SBN collapsing
SBN construction steps
[Figure: construction steps; the marked symbol means 0, 1 or 2 arrows]
Reasoning in SBN
Either directly in the structure
Or easily transformable to a Markov tree
Direct reasoning consists of:
Forward step (leaf node/root node valuation calculation)
Backward step (intermediate node valuation calculation)
Reasoning in SBN – forward step
[Figure: a sequence step A → B with valuation P(B|A), and an alternative step with parents C, E and valuation P(B|C,E); the marked symbol means 0, 1 or 2 arrows]
Reasoning in SBN – backward step: local context
[Figure: four local contexts (a)–(d) over nodes A, B, C, D and further elided nodes; the joint distribution of A, B is known, and the joint of C, D or of C alone is sought]
Reasoning in SBN – backward step: local reasoning
[Figure: Markov tree fragment {A,B,............} – {A,B,C,D} – {A,B}, exchanging messages Msg1(A,B) and Msg2(A,B); the factor P(A)·P(B|A,D) is not needed]
SBN – towards a MT
SBN – towards a MT
SBN – towards a MT
Towards a Markov tree – an example
[Figure: an example SBN over nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S]
Towards a Markov tree – an example
[Figure: the same example network, next step of the transformation]
Markov tree from SBN
[Figure: Markov tree over hypernodes {A,B,I}, {B,C,D,I}, {C,D,E,I}, {F,G,I}, {G,H,I}, {I,H,E,R}, {D,E,I}, {E,H,R,J}, {H,R,J}, {K,L,R}, {L,M,N,R}, {M,N,O,R}, {N,O,R}, {O,P,R}, {R,J,P}, {P,J,S}]
Structured Bayesian network – a Hierarchical (Object-Oriented) Bayesian network
[Figure: the example network grouped into nested modules]
Learning SBN from Data
Define the DEP() measure as follows: DEP(Y,X) = P(x|y) − P(x|¬y).
Define DEP[](Y,X) = (DEP(Y,X))²
Construct a tree according to the Chow/Liu algorithm using DEP[](Y,X), with Y belonging to the tree and X not.
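A minimal sketch of this tree-construction step, assuming the reading DEP(Y,X) = P(x=1|y=1) − P(x=1|y=0) for binary variables. The data set, the variable names, and the helper `dep2` are illustrative, not from the presentation.

```python
# Sketch of the slide's step: compute the squared DEP measure from data
# and greedily attach, Chow/Liu-style, the non-tree variable with the
# strongest link to a tree variable. The data below are made up.
data = [
    # columns: A, B, C
    (1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 1, 0),
    (1, 0, 1), (0, 0, 1), (1, 1, 0), (0, 0, 0),
]
names = ['A', 'B', 'C']

def dep2(rows, iy, ix):
    """DEP(Y,X)^2 with DEP(Y,X) = P(x=1|y=1) - P(x=1|y=0)."""
    y1 = [r for r in rows if r[iy] == 1]
    y0 = [r for r in rows if r[iy] == 0]
    p1 = sum(r[ix] for r in y1) / len(y1)
    p0 = sum(r[ix] for r in y0) / len(y0)
    return (p1 - p0) ** 2

# Grow the tree from variable 0, always adding the edge (Y in the tree,
# X outside it) with maximal squared DEP -- the greedy Chow/Liu step.
in_tree, edges = {0}, []
while len(in_tree) < len(names):
    y, x = max(((y, x) for y in sorted(in_tree)
                for x in range(len(names)) if x not in in_tree),
               key=lambda e: dep2(data, *e))
    edges.append((names[y], names[x]))
    in_tree.add(x)
print(edges)
```

With this toy data the A–B dependence is strongest, so B is attached first; a real implementation would also handle variables that never take one of the two values (empty conditioning sets).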
Continued ....
Let us call all the edges obtained by the previous algorithm "free edges". During the construction process the following types of edges may additionally appear: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge".
Do in a loop (till the termination condition below is satisfied):
For each two properly connected non-neighboring nodes, identify the unique connecting path between them.
Continued ....
Two nodes are properly connected if the path between them consists either of edges having the status of free edges or of oriented and unoriented (but not suspended) edges of the same loop, with no pair of oriented or transient oriented edges pointing in different directions and no transient edge pointing to one of the two connected points.
Note that in this sense there is at most one path properly connecting two nodes.
Continued ....
Connect that pair of non-neighboring nodes X, Y by an edge which maximizes DEP[](X,Y), the minimum of the unconditional DEP and the conditional DEP given a direct successor of X on the path to Y.
Identify the loop that has emerged from this operation.
Continued ....
We can have one of the following cases:
(1) the loop consists entirely of free edges,
(2) it contains some unoriented loop edges, but no oriented edge,
(3) it contains at least one oriented edge.
Depending on this, give a proper status to the edges contained in the loop: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge" (details in the written presentation).
Places of edge insertion
[Figure: six example path configurations over nodes such as X, B, C, D, E, G, H, Y, showing where the new X–Y edge is inserted]
Concluding Remarks
a new class of Bayesian networks defined
a completely new method of reasoning in Bayesian networks outlined
local computation – at most 4 nodes involved
applicable to a more general class of networks than known reasoning methods
the new class of Bayesian networks is easily transformed to Markov trees
the new class of Bayesian networks – a kind of hierarchical or object-oriented Bayesian networks
can be learned from data
THANK YOU