# Bayesian Network Learning

AI and Robotics

Nov 7, 2013

## Impact of Structuring on Bayesian Network Learning and Reasoning

Mieczysław A. Kłopotek

Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland

First Warsaw International Seminar on Soft Computing, Warsaw, September 8th, 2003


## Agenda

- Definitions
- Approximate reasoning
- Bayesian networks
- Reasoning in Bayesian networks
- Learning Bayesian networks from data
- Structured Bayesian networks (SBN)
- Reasoning in SBN
- Learning SBN from data
- Concluding remarks


## Approximate Reasoning

- One possible method of expressing uncertainty: the joint probability distribution
- Variables: causes, effects, observables
- Reasoning: how probable is it that a variable takes a given value if we know the values of some other variables?
- Given: P(X, Y, ..., Z)
- Find: P(X = x | T = t, ..., W = w)
- Difficult if more than about 40 variables have to be taken into account (hard to represent, hard to reason with, hard to collect data for)


## Bayesian Network

- The method of choice for representing uncertainty in AI
- Many efficient reasoning and learning methods
- Utilizes an explicit representation of structure to:
  - provide a natural and compact representation of large probability distributions
  - allow for efficient methods for answering a wide range of queries


## Bayesian Network

- An efficient and effective representation of a probability distribution
- A directed acyclic graph:
  - Nodes - random variables of interest
  - Edges - direct (causal) influence
- Nodes are statistically independent of their non-descendants given the state of their parents


## A Bayesian Network

(Figure: a DAG over the nodes R, Z, T, Y, X, S.)

Pr(r,s,x,z,y) = Pr(z) · Pr(s|z) · Pr(y|z) · Pr(x|y) · Pr(r|y,s)
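The factorization on this slide can be written out directly in code. Below is a minimal Python sketch: the CPT values are made-up illustrative numbers (the slides give none), and T is omitted because it does not appear in the formula.

```python
from itertools import product

# Made-up CPTs for binary (True/False) variables; the values are illustrative only.
P_z = 0.3                                    # P(Z = True)
P_s_z = {True: 0.8, False: 0.1}              # P(S = True | Z = z)
P_y_z = {True: 0.6, False: 0.2}              # P(Y = True | Z = z)
P_x_y = {True: 0.7, False: 0.4}              # P(X = True | Y = y)
P_r_ys = {(True, True): 0.9, (True, False): 0.5,
          (False, True): 0.3, (False, False): 0.05}   # P(R = True | Y = y, S = s)

def bern(p_true, value):
    """Probability that a Bernoulli variable with P(True) = p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(r, s, x, z, y):
    """Pr(r,s,x,z,y) = Pr(z) * Pr(s|z) * Pr(y|z) * Pr(x|y) * Pr(r|y,s)."""
    return (bern(P_z, z) * bern(P_s_z[z], s) * bern(P_y_z[z], y)
            * bern(P_x_y[y], x) * bern(P_r_ys[(y, s)], r))

# Sanity check: the factorized joint sums to 1 over all 2**5 assignments.
total = sum(joint(*vals) for vals in product([False, True], repeat=5))
```

Summing `joint` over subsets of variables answers queries such as P(X = x | T = t, ..., W = w) as a ratio of two such sums, which is exactly the computation that becomes infeasible when many variables are involved.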


## Applications of Bayesian Networks

- Genetic optimization algorithms with probabilistic mutation/crossover mechanisms
- Classification, including text classification
- Medical diagnosis (PathFinder, QMR) and other decision-making tasks under uncertainty
- Hardware diagnosis (Microsoft troubleshooter, NASA/Rockwell Vista project)
- Information retrieval (Ricoh helpdesk)
- Recommender systems
- others


## Reasoning: the Problem with a Bayesian Network

- Pearl's fusion algorithm was elaborated for tree-like networks only
- For other types of networks, transformations to trees are needed:
  - transformation to a Markov tree (MT) (Shafer/Shenoy, Spiegelhalter/Lauritzen)
    - NP-hard, except for trees and polytrees
  - cutset reasoning (Pearl)
    - finding cutsets is difficult, and the reasoning complexity grows exponentially with the required cutset size
  - evidence absorption reasoning by edge reversal (Shachter)
    - not always possible in a simple way


## Towards MT: the Moral Graph

(Figure: the moral graph over the nodes R, Z, T, Y, X, S.)

Parents of a node in the BN are connected; edges are not oriented.
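The moralization step described above can be sketched as follows. The parent map is the example network from the earlier slide (T omitted, since its parents are not given there); the function is a generic sketch, not code from the presentation.

```python
def moralize(parents):
    """Build the moral graph of a BN: marry each node's parents, drop directions.
    `parents` maps each node to the list of its parents in the DAG."""
    edges = set()
    def add(u, v):
        if u != v:
            edges.add(frozenset((u, v)))
    for child, ps in parents.items():
        for p in ps:
            add(p, child)                # keep every DAG edge, now unoriented
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                add(ps[i], ps[j])        # connect ("marry") co-parents
    return edges

# The example network: Z -> S, Z -> Y, Y -> X, Y -> R, S -> R.
bn = {"Z": [], "S": ["Z"], "Y": ["Z"], "X": ["Y"], "R": ["Y", "S"]}
moral = moralize(bn)
# Moralization adds the undirected edge Y-S, because Y and S share the child R.
```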


## Towards MT: the Triangulated Graph

(Figure: the triangulated graph over the nodes R, Z, T, Y, X, S.)

All cycles with more than 3 nodes have at least one link between non-neighboring nodes of the cycle.
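A standard way to test this "triangulated" (chordal) property is maximum cardinality search (Tarjan/Yannakakis). The sketch below is generic and not taken from the slides.

```python
def is_triangulated(adj):
    """Chordality test via maximum cardinality search.
    `adj` maps each node to the set of its neighbors in an undirected graph."""
    order, weight = [], {v: 0 for v in adj}
    while weight:
        v = max(weight, key=weight.get)      # vertex with most numbered neighbors
        del weight[v]
        for u in adj[v]:
            if u in weight:
                weight[u] += 1
        order.append(v)
    pos = {v: i for i, v in enumerate(order)}
    # The reversed MCS order is a perfect elimination ordering iff the graph
    # is chordal: each vertex's earlier neighbors (minus the latest one, u)
    # must all be adjacent to u.
    for v in order:
        earlier = [u for u in adj[v] if pos[u] < pos[v]]
        if len(earlier) > 1:
            u = max(earlier, key=pos.get)
            if any(w != u and w not in adj[u] for w in earlier):
                return False
    return True

# The moral graph of the running example is triangulated ...
chordal = is_triangulated({"Z": {"S", "Y"}, "S": {"Z", "Y", "R"},
                           "Y": {"Z", "S", "X", "R"}, "X": {"Y"}, "R": {"Y", "S"}})
# ... while a 4-cycle without a chord is not.
not_chordal = is_triangulated({"A": {"B", "D"}, "B": {"A", "C"},
                               "C": {"B", "D"}, "D": {"C", "A"}})
```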


## Towards MT: the Hypertree

(Figure: the hypertree over the nodes R, Z, T, Y, X, S.)

Hypertree = acyclic hypergraph


## The Markov Tree

(Figure: Markov tree with the nodes {Z,T,Y}, {T,Y,S}, {Y,S,R}, {Y,X}.)

Hypernodes of the hypertree are the nodes of the Markov tree.


## Junction Tree: an Alternative Representation of MT

(Figure: junction tree with the cluster nodes {Z,T,S}, {Z,Y,S}, {Y,S,R}, {Y,X} and the separators {Z,S}, {Y,S}, {Y}.)

Common BN nodes are assigned to the edges joining MT nodes.


## Efficient Reasoning in Markov Trees, but ....

(Figure: messages passed along the junction tree {Z,T,S} - {Z,Y,S} - {Y,S,R} - {Y,X} over the separators {Z,S}, {Y,S}, {Y}.)

MT node contents projected onto the common variables are passed to the neighbors.
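One such message is just the clique's table summed down to the separator variables. A minimal generic sketch, with a made-up uniform potential on the clique {Y, S, R}:

```python
from itertools import product

def project(potential, clique_vars, separator):
    """Sum a clique potential down to the separator variables.
    `potential` maps assignments (tuples aligned with `clique_vars`) to numbers."""
    keep = [clique_vars.index(v) for v in separator]
    message = {}
    for assignment, value in potential.items():
        key = tuple(assignment[i] for i in keep)
        message[key] = message.get(key, 0.0) + value
    return message

# Made-up uniform potential on the clique {Y, S, R} (binary variables).
phi = {vals: 0.125 for vals in product([0, 1], repeat=3)}

# The message sent over the separator {Y, S} to the neighboring clique.
msg = project(phi, ["Y", "S", "R"], ["Y", "S"])
```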


## Triangulability Test

- Triangulation is not always possible
- All neighbors need to be connected


## Evidence Absorption Reasoning

(Figure: the network over R, Z, T, Y, X, S before and after evidence absorption, and after edge reversal.)

Efficient only for a lucky selection of conditioning variables.


## Cutset Reasoning

Fixing the values of some nodes creates a (poly)tree.

(Figure: the network over R, Z, T, Y, X, S with one node fixed; the edges out of the fixed node hence become ignorable.)


## How to Overcome the Difficulty when Reasoning with BN

- Learn a triangulated graph or Markov tree directly from data (Cercone N., Wong S.K.M., Xiang Y.)
  - Hard and inefficient for long dependence chains; danger of large hypernodes
- Learn only tree-structured/polytree-structured BNs (e.g. in Goldberg's Bayesian Genetic Algorithms, TAN text classifiers, etc.)
  - Oversimplification; long dependence chains are lost
- Our approach: propose a more general class of Bayesian networks that is still efficient for reasoning


## What is a Structured Bayesian Network?

- An analogon of well-structured programs
- Graphical structure: nested sequences and alternatives
- By collapsing sequences and alternatives to single nodes, a single node is obtainable
- Efficient reasoning is possible


## Structured Bayesian Network (SBN): an Example

(Figure: an example SBN. For comparison: a tree-structured BN.)


## SBN Collapsing


## SBN Construction Steps

(Figure: construction steps; the special arrow symbol in the figure means 0, 1, or 2 arrows.)


## Reasoning in SBN

- Either directly in the structure
- Or easily transformable to a Markov tree
- Direct reasoning consists of:
  - a forward step (leaf node/root node valuation calculation)
  - a backward step (intermediate node valuation calculation)


## Reasoning in SBN: Forward Step

(Figure: the special arrow symbol means 0, 1, or 2 arrows. Example fragments: A→B with the conditional P(B|A); C,E→B with the conditional P(B|C,E).)


## Reasoning in SBN Backward Step: Local Context

(Figure: four local contexts (a)-(d) over the nodes A, B, C, D. In each, the joint distribution of A, B is known and the joint distribution of C, D, or of C alone, is sought.)


## Reasoning in SBN Backward Step: Local Reasoning

(Figure: Markov tree fragment {A,B,...} - {A,B,C,D} - {A,B}. Messages Msg1(A,B) and Msg2(A,B) arrive at the node {A,B,C,D}, which holds P(A)·P(B|A,D); the node {A,B} is not needed.)


## SBN: Towards a MT

(Figure: stepwise transformation of the SBN into a Markov tree.)






## Towards a Markov Tree: an Example

(Figure: an SBN over the nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S.)




## Markov Tree from SBN

(Figure: Markov tree with the hypernodes {A,B,I}, {B,C,D,I}, {C,D,E,I}, {D,E,I}, {F,G,I}, {G,H,I}, {I,H,E,R}, {E,H,R,J}, {H,R,J}, {K,L,R}, {L,M,N,R}, {M,N,O,R}, {N,O,R}, {O,P,R}, {R,J,P}, {P,J,S}.)


## Structured Bayesian Network: a Hierarchical (Object-Oriented) Bayesian Network

(Figure: the SBN over the nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S viewed as a hierarchical network.)


## Learning SBN from Data

- Define the DEP measure as follows: DEP(Y,X) = P(x|y) − P(x|¬y).
- Define DEP[2](Y,X) = (DEP(Y,X))².
- Construct a tree according to the Chow/Liu algorithm using DEP[2](Y,X), with Y belonging to the tree and X not.
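The DEP measure and the Chow/Liu-style tree growth can be sketched from these definitions. Everything below (function names, the synthetic data) is illustrative; the slides give no code.

```python
import random

def dep(data, y, x):
    """DEP(Y,X) = P(x|y) - P(x|not y), estimated from samples of binary variables.
    `data` is a list of dicts mapping variable names to 0/1."""
    rows_y1 = [row for row in data if row[y] == 1]
    rows_y0 = [row for row in data if row[y] == 0]
    p_x_given_y = sum(row[x] for row in rows_y1) / len(rows_y1)
    p_x_given_not_y = sum(row[x] for row in rows_y0) / len(rows_y0)
    return p_x_given_y - p_x_given_not_y

def dep2(data, y, x):
    """DEP[2](Y,X) = (DEP(Y,X))**2."""
    return dep(data, y, x) ** 2

def grow_tree(data, variables):
    """Chow/Liu-style greedy growth: repeatedly add the edge (Y, X) maximizing
    DEP[2](Y,X) with Y already in the tree and X outside it."""
    in_tree, outside, edges = [variables[0]], list(variables[1:]), []
    while outside:
        y, x = max(((y, x) for y in in_tree for x in outside),
                   key=lambda pair: dep2(data, pair[0], pair[1]))
        edges.append((y, x))
        in_tree.append(x)
        outside.remove(x)
    return edges

# Synthetic data: S and Y depend on Z, X depends on Y (10% noise on each link).
random.seed(0)
data = []
for _ in range(2000):
    z = random.random() < 0.5
    y = z ^ (random.random() < 0.1)
    s = z ^ (random.random() < 0.1)
    x = y ^ (random.random() < 0.1)
    data.append({"Z": int(z), "Y": int(y), "S": int(s), "X": int(x)})

edges = grow_tree(data, ["Z", "Y", "S", "X"])
```

The slides then extend this tree with loop edges, as described next.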


## Continued ....

Let us call all the edges obtained by the previous algorithm "free edges". During the construction process the following types of edges may appear: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge".

Do in a loop (until the termination condition below is satisfied): for each two properly connected non-neighboring nodes, identify the unique connecting path between them.


## Continued ....

Two nodes are properly connected if the path between them consists either of edges having the status of free edges, or of oriented or unoriented (but not suspended) edges of the same loop, with no pair of oriented or transient oriented edges pointing in different directions and no transient edge pointing to one of the two connected points.

Note that in this sense there is at most one path properly connecting two nodes.


## Continued ....

Connect a pair of non-neighboring nodes X, Y by an edge that maximizes DEP[2](X,Y), the minimum of the unconditional DEP and the conditional DEP given a direct successor of X on the path to Y.

Identify the loop that has emerged from this operation.


## Continued ....

We can have one of the following cases:

1. the loop consists entirely of free edges;
2. it contains some unoriented loop edges, but no oriented edge;
3. it contains at least one oriented edge.

Depending on this, give a proper status to the edges contained in the loop: "node X loop unoriented edge", "node X loop oriented edge", "node X loop transient edge" (details in the written presentation).


## Places of Edge Insertion

(Figure: six example configurations of the path between X and Y, over the auxiliary nodes B, C, D, E, G, H, showing where a new edge may be inserted.)


## Concluding Remarks

- A new class of Bayesian networks has been defined
- A completely new method of reasoning in Bayesian networks has been outlined
  - local computation: at most 4 nodes involved
  - applicable to a more general class of networks than known reasoning methods
- The new class of Bayesian networks is easily transformed to Markov trees
- The new class of Bayesian networks is a kind of hierarchical or object-oriented Bayesian network
- It can be learned from data


THANK YOU