Bayesian Network Learning


Impact of Structuring on Bayesian Network Learning and Reasoning


Mieczysław A. Kłopotek

Institute of Computer Science,

Polish Academy of Sciences,

Warsaw, Poland,

First Warsaw International Seminar on Soft Computing
Warsaw, September 8th, 2003



Agenda


Definitions


Approximate Reasoning


Bayesian networks


Reasoning in Bayesian networks


Learning Bayesian networks from data


Structured Bayesian networks (SBN)


Reasoning in SBN


Learning SBN from data


Concluding remarks


Approximate Reasoning


One possible method of expressing
uncertainty: Joint Probability Distribution


Variables: causes, effects, observables


Reasoning: How probable is it that a variable takes a given value if we know the values of some other variables?


Given: P(X,Y,....,Z)


Find: P(X=x | T=t,...,W=w)


Difficult if more than 40 variables have to be taken into account:

hard to represent,

hard to reason,

hard to collect data
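
To make the query concrete, here is a minimal Python sketch (not part of the talk) that answers P(X=x | T=t, W=w) by brute-force summation over a full joint table. The three binary variables and all probabilities are invented for illustration. The table has 2^n rows for n binary variables, which suggests why this direct approach stops being practical at a few dozen variables.

```python
from itertools import product

# Hypothetical joint distribution over three binary variables (X, T, W).
# The numbers are illustrative only; they sum to 1.
VARS = ("X", "T", "W")
JOINT = dict(zip(product((0, 1), repeat=3),
                 (0.10, 0.05, 0.15, 0.10, 0.05, 0.20, 0.05, 0.30)))

def conditional(joint, variables, query, evidence):
    """Return P(query | evidence) by summing matching rows of the joint."""
    def matches(row, assignment):
        return all(row[variables.index(v)] == val
                   for v, val in assignment.items())

    p_evidence = sum(p for row, p in joint.items() if matches(row, evidence))
    p_both = sum(p for row, p in joint.items()
                 if matches(row, evidence) and matches(row, query))
    return p_both / p_evidence

# P(X=1 | T=1, W=1) = 0.30 / (0.10 + 0.30)
p = conditional(JOINT, VARS, {"X": 1}, {"T": 1, "W": 1})
print(round(p, 3))
```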



Bayesian Network

The method of choice for representing uncertainty in AI.

Many efficient reasoning methods and learning methods

Utilize explicit representation of structure to:

provide a natural and compact representation of large probability distributions.

allow for efficient methods for answering a wide range of queries.


Bayesian Network


Efficient and effective representation of a
probability distribution


Directed acyclic graph


Nodes: random variables of interest

Edges: direct (causal) influence

Nodes are statistically independent of their non-descendants given the state of their parents


A Bayesian network

[Figure: directed acyclic graph over the nodes R, Z, T, Y, X, S]

Pr(r,s,x,z,y) = Pr(z) · Pr(s|z) · Pr(y|z) · Pr(x|y) · Pr(r|y,s)
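
As an illustration of the factorization above, the following Python sketch builds the joint from five small conditional tables. All variables are assumed binary and every number is hypothetical; only the shape of the product follows the slide. The factored form needs 1 + 2 + 2 + 2 + 4 = 11 parameters instead of the 2^5 - 1 = 31 of a full table.

```python
from itertools import product

# Hypothetical CPTs for binary variables: each entry is P(child = 1 | parents).
P_Z = 0.3
P_S = {0: 0.2, 1: 0.7}                                        # P(s=1 | z)
P_Y = {0: 0.1, 1: 0.6}                                        # P(y=1 | z)
P_X = {0: 0.05, 1: 0.8}                                       # P(x=1 | y)
P_R = {(0, 0): 0.01, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9}   # P(r=1 | y, s)

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(r, s, x, z, y):
    """Pr(r,s,x,z,y) = Pr(z) Pr(s|z) Pr(y|z) Pr(x|y) Pr(r|y,s)."""
    return (bern(P_Z, z) * bern(P_S[z], s) * bern(P_Y[z], y)
            * bern(P_X[y], x) * bern(P_R[(y, s)], r))

# Sanity check: the factored joint sums to 1 over all 32 assignments.
total = sum(joint(*values) for values in product((0, 1), repeat=5))
print(round(total, 6))
```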


Applications of Bayesian
networks


Genetic optimization algorithms with
probabilistic mutation/crossing mechanism


Classification, including text classification


Medical diagnosis (PathFinder, QMR), other
decision making tasks under uncertainty


Hardware diagnosis (Microsoft
troubleshooter, NASA/Rockwell Vista project)


Information retrieval (Ricoh helpdesk)


Recommender systems


other


Reasoning: the problem with a Bayesian network

Fusion algorithm of Pearl elaborated for tree-like networks only


For other types of networks, transformations to trees:

transformation to Markov tree (MT) is needed (Shafer/Shenoy, Spiegelhalter/Lauritzen)

NP-hard except for trees and polytrees


Cutset reasoning (Pearl)


finding cutsets difficult,
the reasoning complexity grows exponentially with
cutset size needed


evidence absorption reasoning by edge reversal
(Shachter)


not always possible in a simple way


Towards MT: moral graph

[Figure: moral graph over the nodes R, Z, T, Y, X, S]

Parents of a node in BN connected, edges not oriented
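
The moralization rule can be sketched in a few lines of Python. The parent sets below are read off the factorization Pr(z) Pr(s|z) Pr(y|z) Pr(x|y) Pr(r|y,s) shown earlier; node T is left out because its edges cannot be recovered from the text, so this is an illustrative fragment rather than the full network of the figure.

```python
from itertools import combinations

# Parent sets taken from the factorization on the earlier slide (T omitted).
PARENTS = {"Z": [], "S": ["Z"], "Y": ["Z"], "X": ["Y"], "R": ["Y", "S"]}

def moralize(parents):
    """Return the undirected edge set of the moral graph."""
    edges = set()
    for child, pars in parents.items():
        # Keep every parent-child link, dropping its orientation.
        for parent in pars:
            edges.add(frozenset((parent, child)))
        # Connect ("marry") all parents that share this child.
        for a, b in combinations(pars, 2):
            edges.add(frozenset((a, b)))
    return edges

moral = moralize(PARENTS)
print(sorted(tuple(sorted(e)) for e in moral))
```

Here the only added link is Y-S, because Y and S are both parents of R.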


Towards MT: triangulated graph

[Figure: triangulated graph over the nodes R, Z, T, Y, X, S]

All cycles with more than 3 nodes have at least one link between non-neighboring nodes of the cycle.


Towards MT: hypertree

[Figure: hypertree over the nodes R, Z, T, Y, X, S]

Hypertree = acyclic hypergraph


The Markov tree

Z,T,Y

T,Y,S

Y,S,R

Y,X

Hypernodes of hypertree are
nodes of the Markov tree


Junction tree: alternative representation of MT

Nodes: Z,T,S   Z,Y,S   Y,S,R   Y,X

Edge labels: Z,S   Y,S   Y

Common BN nodes assigned to edges joining MT nodes
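
The edge labels of a junction tree are simply the intersections of the neighboring Markov tree nodes. The short Python sketch below checks this for the four nodes listed on this slide; the chain-shaped layout in `TREE_EDGES` is an assumption, since the drawing itself did not survive.

```python
# Markov tree nodes as listed on the slide.
CLIQUES = [{"Z", "T", "S"}, {"Z", "Y", "S"}, {"Y", "S", "R"}, {"Y", "X"}]

# Assumed tree layout: a chain joining consecutive nodes.
TREE_EDGES = [(0, 1), (1, 2), (2, 3)]

# Each edge carries the BN variables common to the two nodes it joins.
separators = [CLIQUES[i] & CLIQUES[j] for i, j in TREE_EDGES]

for (i, j), sep in zip(TREE_EDGES, separators):
    print(sorted(CLIQUES[i]), "-", sorted(sep), "-", sorted(CLIQUES[j]))
```

Under that assumed layout the intersections come out as Z,S; Y,S; and Y, matching the labels on the slide.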


Efficient reasoning in Markov trees, but ....

[Figure: the junction tree with nodes Z,T,S; Z,Y,S; Y,S,R; Y,X and edge labels Z,S; Y,S; Y; a message (msg) is passed along each edge]

MT node contents projected onto common variables are passed to the neighbors
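
A message in this scheme is the content of a Markov tree node summed down to the variables it shares with a neighbor. The Python sketch below shows only that projection step; the uniform table over (Y, S, R) is a placeholder, and the function name `project` is introduced here for illustration.

```python
from collections import defaultdict
from itertools import product

def project(potential, variables, keep):
    """Sum a table over `variables` down to the variables listed in `keep`."""
    positions = [variables.index(v) for v in keep]
    message = defaultdict(float)
    for assignment, value in potential.items():
        message[tuple(assignment[i] for i in positions)] += value
    return dict(message)

# Hypothetical content of the node {Y, S, R}: a uniform table over binary values.
NODE_VARS = ("Y", "S", "R")
potential = {a: 0.125 for a in product((0, 1), repeat=3)}

# Message sent over the edge labelled Y,S.
message = project(potential, NODE_VARS, ("Y", "S"))
print(message)
```

A full propagation scheme would also multiply incoming messages into the receiving node before projecting again; that part is omitted here.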


Triangulability test: triangulation not always possible

All neighbors need to be connected
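
One standard way to test whether an undirected graph is already triangulated is to repeatedly remove a node whose neighbors are all mutually connected; the graph is triangulated exactly when every node can be removed this way. The Python sketch below implements that test. It is offered as one reading of the slide's "all neighbors need to be connected", not as the procedure shown in the lost figure.

```python
from itertools import combinations

def is_triangulated(adjacency):
    """Return True if nodes can be eliminated one by one, each time
    choosing a node whose remaining neighbors are pairwise connected."""
    adj = {v: set(ns) for v, ns in adjacency.items()}
    while adj:
        candidate = next(
            (v for v, ns in adj.items()
             if all(b in adj[a] for a, b in combinations(ns, 2))),
            None)
        if candidate is None:
            return False          # no eliminable node: a chordless cycle remains
        for n in adj[candidate]:
            adj[n].discard(candidate)
        del adj[candidate]
    return True

# A 4-cycle without a chord, and the same cycle with the chord A-C added.
square = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
chorded = {"A": {"B", "C", "D"}, "B": {"A", "C"},
           "C": {"A", "B", "D"}, "D": {"A", "C"}}
print(is_triangulated(square), is_triangulated(chorded))
```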


Evidence absorption reasoning

[Figure: three copies of the network over the nodes R, Z, T, Y, X, S, illustrating evidence absorption and edge reversal]

Efficient only for a lucky selection of conditioning variables


Cutset reasoning: fixing values of some nodes creates a (poly)tree

[Figure: network over the nodes R, Z, T, Y, X, S with one node fixed; annotations: "Node fixed", "Hence edge ignorable"]


How to overcome the difficulty
when reasoning with BN


Learn directly a triangulated graph or Markov tree from data (Cercone N., Wong S.K.M., Xiang Y.)


Hard and inefficient for long dependence chains,
danger of large hypernodes


Learn only tree-structured/polytree-structured BN (e.g. in Goldberg’s Bayesian Genetic Algorithms, TAN text classifiers etc.)


Oversimplification, long dependence chains lost


Our approach: Propose a more general class
of Bayesian networks that is still efficient for
reasoning


What is a structured Bayesian network

An analogue of well-structured programs

Graphical structure: nested sequences and alternatives

By collapsing sequences and alternatives to single nodes, one single node is obtainable


Efficient reasoning possible



Structured Bayesian Network (SBN), an example

For comparison: a tree-structured BN


SBN collapsing


SBN construction steps

[Figure legend: the marked link means 0, 1 or 2 arrows]


Reasoning in SBN


Either directly in the structure


Or easily transformable to Markov tree


Direct reasoning consisting of


Forward step (leaf node/root node valuation calculation)

Backward step (intermediate node valuation calculation)



Reasoning in SBN: forward step

[Figure: network fragments over the nodes A, B and A, B, C, E, with the labels P(B|A) and P(B|C,E); legend: the marked link means 0, 1 or 2 arrows]


Reasoning in SBN: backward step, local context

[Figure: four local contexts (a), (b), (c), (d) over the nodes A, B, C, D]

Joint distribution of A,B known, joint C,D or C sought


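Reasoning in SBN: backward step, local reasoning

[Figure: Markov tree fragment with nodes "A,B,............" and "A,B,C,D" joined by an edge labelled A,B; messages Msg 1(A,B) and Msg 2(A,B); annotations: P(A)*P(B|A,D), "Not needed"]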


SBN: towards a MT


Towards a Markov tree: an example

[Figure: SBN over the nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S]


Markov tree from SBN

A,B,I

B,C,D,I

C,D,E,I

F,G,I

G,H,I

I,H,E,R

D,E,I

E,H,R,J

H,R,J

K,L,R

L,M,N,R

M,N,O,R

N,O,R

O,P,R

R,J,P

P,J,S
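
A quick calculation shows why small hypernodes matter. The Python sketch below takes the sixteen Markov tree nodes listed for this example and compares the total number of table entries they need with the size of a full joint table. It assumes every variable is binary, which the slides do not state.

```python
# Markov tree nodes from the slide, one letter per variable.
NODES = ["ABI", "BCDI", "CDEI", "FGI", "GHI", "IHER", "DEI", "EHRJ",
         "HRJ", "KLR", "LMNR", "MNOR", "NOR", "OPR", "RJP", "PJS"]

variables = set("".join(NODES))

# Assumption: all variables are binary, so a node over k variables has 2**k entries.
largest_node = max(len(node) for node in NODES)
tree_entries = sum(2 ** len(node) for node in NODES)
joint_entries = 2 ** len(variables)

print(len(variables), largest_node, tree_entries, joint_entries)
```

Under the binary assumption the tree needs 176 entries in total, against 262144 for the joint over the 18 variables, and no node holds more than 4 variables.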


Structured Bayesian network: a Hierarchical (Object-Oriented) Bayesian network

[Figure: SBN over the nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, R, S]


Learning SBN from Data

Define the DEP() measure as follows: DEP(Y,X) = P(x|y) - P(x|¬y).

Define DEP[](Y,X) = (DEP(Y,X))^2

Construct a tree according to Chow/Liu algorithm using DEP[](Y,X) with Y belonging to the tree and X not.
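
The tree-building step can be sketched as follows, under several assumptions that are not confirmed by the slides: all variables are binary, the second term of DEP conditions on the complement of y, probabilities are estimated by simple counting, and "Y belonging to the tree and X not" is read as greedy growth from a root (in the style of Prim's algorithm) rather than the usual Chow/Liu maximum spanning tree over mutual information. The data and function names are hypothetical.

```python
def cond_prob(rows, x, y, y_value):
    """Estimate P(x=1 | y=y_value) by counting over rows of 0/1 values."""
    selected = [row for row in rows if row[y] == y_value]
    return sum(row[x] for row in selected) / len(selected)

def dep(rows, y, x):
    """DEP(Y,X) = P(x|y) - P(x|not y), for binary variables."""
    return cond_prob(rows, x, y, 1) - cond_prob(rows, x, y, 0)

def dep_sq(rows, y, x):
    """Squared DEP, used as the edge weight."""
    return dep(rows, y, x) ** 2

def grow_tree(rows, n_vars, root=0):
    """Greedily attach the outside node X with the largest squared DEP
    from some node Y already in the tree."""
    in_tree, edges = {root}, []
    while len(in_tree) < n_vars:
        y, x = max(((y, x) for y in sorted(in_tree)
                    for x in range(n_vars) if x not in in_tree),
                   key=lambda edge: dep_sq(rows, *edge))
        edges.append((y, x))
        in_tree.add(x)
    return edges

# Hypothetical data over three binary variables (columns 0, 1, 2).
ROWS = [(0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 1, 1),
        (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)]
tree = grow_tree(ROWS, 3)
print(tree)
```

On this toy data the sketch first attaches variable 1 to the root and then attaches variable 2 to variable 1, because the dependence of 2 on 1 is stronger than its dependence on 0.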



Continued ....

Let us call all the edges obtained by the previous algorithm “free edges”.

During the construction process the following types of edges may additionally appear: “node X loop unoriented edge”, “node X loop oriented edge”, “node X loop transient edge”.

Do in a loop (till the termination condition below is satisfied):

For each two properly connected non-neighboring nodes identify the unique connecting path between them.



Continued ....

Two nodes are properly connected if the path between them consists either of edges having the status of free edges or of oriented, unoriented (but not suspended) edges of the same loop, with no pair of oriented or transient oriented edges pointing in different directions and no transient edge pointing to one of the two connected points.

Note that in this sense there is at most one path properly connecting two nodes.



Continued ....

Connect by an edge that pair of non-neighboring nodes X,Y which maximizes DEP[](X,Y), taken as the minimum of the unconditional DEP and the conditional DEP given a direct successor of X on the path to Y.

Identify the loop that has emerged from this operation.


Continued ....

We can have one of the following cases:

(1) it consists entirely of free edges

(2) it contains some unoriented loop edges, but no oriented edge.

(3) it contains at least one oriented edge.

Depending on this, give a proper status to edges contained in a loop: “node X loop unoriented edge”, “node X loop oriented edge”, “node X loop transient edge” (details in written presentation).


Places of edge insertion

[Figure: six example configurations showing where an edge between X and Y may be inserted; node labels X, Y, B, C, D, E, G, H]


Concluding Remarks


new class of Bayesian networks defined

completely new method of reasoning in Bayesian networks outlined

Local computation: at most 4 nodes involved

applicable to a more general class of networks than known reasoning methods

new class of Bayesian networks easily transformed to Markov trees

new class of Bayesian networks: a kind of hierarchical or object-oriented Bayesian networks

Can be learned from data


THANK YOU