Bayesian Belief Networks: Computational Considerations for the Environmental Researcher


Stephen Jensen & Alix I Gitelman

Statistics Department

Oregon State University

August, 2003

Bayesian Belief Networks (BBN)



Bayesian Belief Networks are a class of models which can be used to describe succinctly the dependencies and interactions between large sets of variables. BBN have been used extensively in fields such as artificial intelligence and the decision sciences. BBN are also well suited to environmental research, with its large numbers of variables and extensive dependencies and interactions. In fact, BBN are ideally suited to studying relationships between effluent limitations and water quality (e.g., Borsuk et al., 2003).



BBN are graphical models in which a set of nodes represents the random variables and a set of edges represents relationships between the nodes. A BBN can contain directed or undirected edges, and even a mixture of the two (some examples are shown below):
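As a minimal sketch of these structures (the node names A-D are hypothetical, not the poster's variables), a directed BBN can be stored as a parent map and an undirected one as a symmetric neighbour map; a depth-first search then verifies that a directed structure is acyclic:

```python
# Illustrative encodings of the graph types discussed in the text.
# Node names (A, B, C, D) are made up for this sketch.

directed = {            # DAG: node -> set of parents
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

undirected = {          # undirected graph: node -> set of neighbours (symmetric)
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def is_acyclic(parents):
    """Check that the directed graph has no cycles via depth-first search."""
    state = {}          # node -> "visiting" | "done"
    def visit(v):
        if state.get(v) == "visiting":
            return False                      # back edge: a cycle exists
        if state.get(v) == "done":
            return True
        state[v] = "visiting"
        ok = all(visit(p) for p in parents.get(v, ()))
        state[v] = "done"
        return ok
    return all(visit(v) for v in parents)

print(is_acyclic(directed))   # True: a valid directed BBN structure
```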










Undirected BBN have their genesis in statistical physics work by Gibbs (1902). They have since been used for modeling spatial interactions (Besag, 1974) and for interpreting hierarchical log-linear models by analogy to Markov random fields (Darroch et al., 1980). Directed BBN originated in path analysis (Wright, 1921). They have seen much more use recently in the guise of causal networks (Pearl, 1986; Lauritzen & Spiegelhalter, 1988).



Estimating BBN



Several directed BBN algorithms have been devised, including HUGIN (Andersen et al., 1989), TETRAD (Spirtes et al., 1993) and DEAL (Bøttcher & Dethlefsen, 2003). To determine a BBN model for a set of data, a canned package such as HUGIN or DEAL can be used exclusively. These packages search across some space of likely models using either an exhaustive search or a heuristic such as a greedy search. The networks are scored in some manner to determine the best candidates. These packages are useful for structural learning, that is, understanding which nodes are connected to which. Using these packages, the probability of an edge being included in the graph is not given, and neither are standard error estimates provided for parameter (i.e., node probability) estimates. Furthermore, for efficiency, these canned packages all require frequent triangulation of a graph, which is known to be an NP-hard problem, although restriction to decomposable models requires only a single triangulation (Deshpande et al., 2001).
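To make the search-and-score idea concrete, here is a toy sketch of greedy structural learning over three binary variables, scoring each candidate DAG with BIC. This is not the actual HUGIN, TETRAD or DEAL machinery; the variable names and the data-generating rule (B copies A 80% of the time, C is independent) are invented for illustration:

```python
import math
from itertools import product

# Invented binary data: B agrees with A in 32 of 40 rows, C is independent.
data = [{"A": i % 2,
         "B": (1 - i % 2) if i % 5 == 0 else i % 2,
         "C": (i // 2) % 2}
        for i in range(40)]

def node_score(node, parents):
    """BIC contribution of one binary node given its parent set."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0, 0])[row[node]] += 1
    ll = sum(c * math.log(c / (n0 + n1))
             for n0, n1 in counts.values()
             for c in (n0, n1) if c)
    n_params = 2 ** len(parents)      # one free probability per parent config
    return ll - 0.5 * n_params * math.log(len(data))

def has_cycle(parents):
    """Depth-first search for a directed cycle in a parent map."""
    state = {}
    def visit(v):
        if state.get(v) == "gray":    # back edge found
            return True
        if state.get(v) == "black":
            return False
        state[v] = "gray"
        found = any(visit(p) for p in parents[v])
        state[v] = "black"
        return found
    return any(visit(v) for v in parents)

def greedy_search(nodes):
    """Repeatedly add the single best-scoring edge until nothing improves."""
    parents = {v: set() for v in nodes}
    while True:
        best = None
        for u, v in product(nodes, repeat=2):
            if u == v or u in parents[v]:
                continue
            trial = {w: set(ps) for w, ps in parents.items()}
            trial[v].add(u)
            if has_cycle(trial):
                continue
            gain = (node_score(v, sorted(trial[v]))
                    - node_score(v, sorted(parents[v])))
            if gain > 1e-9 and (best is None or gain > best[0]):
                best = (gain, u, v)
        if best is None:
            return parents
        parents[best[2]].add(best[1])

print(greedy_search(["A", "B", "C"]))   # exactly one edge linking A and B
```

The BIC penalty is what keeps the spurious edges involving C out of the model, which illustrates why these packages return a single best structure but no edge probabilities.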



An extension of this method is possible by first using HUGIN or another package to find a set of candidate models (that is, to perform the structural learning), and then estimating the parameters of those models by using a Markov chain Monte Carlo method (this is the parameter learning phase). In this way, variability estimates for the model parameters can be calculated. However, this approach is two-stage, and in it, probabilities for an edge being included in the model remain uncalculated. Others have simply assumed a structure for a BBN, and then used extant software to estimate the strength of the edges (Borsuk et al., 2003).



Reversible jump Markov chain Monte Carlo (RJMCMC; Green, 1995) is a generalization of the Metropolis-Hastings algorithm to models with varying dimension of the parameter space. This makes it well suited for graphical model estimation, where the number (and direction) of edges may be unknown a priori. That is, using the RJMCMC algorithm, we can accomplish both structural and parameter learning. Furthermore, edge probabilities and estimate variability are calculated as a matter of course. RJMCMC works well for discrete undirected graphical models (Dellaportas & Forster, 1999), but is computationally demanding. Efficiency is improved by restricting the search to decomposable models. Algorithms for undirected decomposable graphs (UDGs) exist in purely continuous (Giudici & Green, 1999) and purely discrete settings (Giudici et al., 2000). Fronk & Giudici (2000) provided an algorithm for directed acyclic graphs (DAGs).



The RJMCMC algorithms for graphical models involve three steps:

1.

Select an edge for addition, removal or reversal.

2.

Decide whether to make the structural change and update the structure and parameters.

3.

Update the parameters in some fashion without making a structural change.
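The three steps can be sketched for a single candidate edge A -> B on binary data. This toy example (invented data, uniform priors, a clamped uniform birth proposal) is illustrative only; it is not the Dellaportas & Forster or Fronk & Giudici algorithm:

```python
import math, random

# Invented data: B agrees with A in 32 of 40 observations.
data = [(a, b) for a in (0, 1) for b in (0, 1)
        for _ in range([16, 4, 4, 16][2 * a + b])]

def loglik(edge, params):
    """Log-likelihood of B given A: params = (p,) without the edge
    (one success probability) or (p0, p1) with it (one per value of A)."""
    ll = 0.0
    for a, b in data:
        p = params[a] if edge else params[0]
        ll += math.log(p if b else 1.0 - p)
    return ll

def clamp(p):
    # Crude clamping keeps probabilities away from 0 and 1 for this sketch.
    return min(0.999, max(0.001, p))

random.seed(1)
edge, params = False, (0.5,)
included, n_iter, burn = 0, 4000, 500
for t in range(n_iter):
    # Steps 1-2: propose toggling the edge, drawing any new parameters
    # from a (clamped) Uniform(0,1).  With uniform priors and uniform
    # proposals the reversible-jump acceptance ratio reduces to a
    # likelihood ratio (the Jacobian is 1).
    new_edge = not edge
    new_params = tuple(clamp(random.random())
                       for _ in range(2 if new_edge else 1))
    accept = loglik(new_edge, new_params) - loglik(edge, params)
    if random.random() < math.exp(min(0.0, accept)):
        edge, params = new_edge, new_params
    # Step 3: a quick random-walk Metropolis update of the current parameters.
    prop = tuple(clamp(p + random.gauss(0, 0.1)) for p in params)
    if random.random() < math.exp(min(0.0, loglik(edge, prop) - loglik(edge, params))):
        params = prop
    if t >= burn:
        included += edge
print("estimated P(A -> B edge | data):", included / (n_iter - burn))
```

Because the uniform prior and uniform proposal cancel here, the dimension-changing acceptance probability is just a likelihood ratio; in real RJMCMC samplers the proposal, prior and Jacobian terms must all be carried explicitly.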


RJMCMC algorithms will often require the same re-triangulation of the graph as the traditional algorithms, making the structural updating quite computationally intensive. Restriction to decomposable models makes the computation easier, as the model structure can be updated in polynomial time (Deshpande et al., 2001). Updating the parameters (step 3 above) is a simple Metropolis-Hastings step, which is rather quick.
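Decomposable here means the undirected graph is triangulated (chordal). As a sketch of why this restriction is computationally attractive, chordality itself can be verified in linear time with maximum cardinality search (the Tarjan-Yannakakis test); the function below is an illustrative implementation, not code from the poster:

```python
def is_decomposable(adj):
    """adj: undirected graph as {node: set of neighbours}.
    Returns True iff the graph is chordal (decomposable), i.e. every
    cycle of length four or more has a chord."""
    # Maximum cardinality search: repeatedly number the vertex with the
    # most already-numbered neighbours.
    order, number = [], {}
    weight = {v: 0 for v in adj}
    while len(order) < len(adj):
        v = max((u for u in adj if u not in number), key=weight.get)
        number[v] = len(order)
        order.append(v)
        for w in adj[v]:
            if w not in number:
                weight[w] += 1
    # The MCS order is a perfect elimination ordering iff the graph is
    # chordal: each vertex's earlier neighbours must all attach to the
    # latest-numbered of them.
    for v in order:
        earlier = [w for w in adj[v] if number[w] < number[v]]
        if earlier:
            u = max(earlier, key=number.get)
            if any(w != u and w not in adj[u] for w in earlier):
                return False
    return True

square = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
print(is_decomposable(square))              # False: the 4-cycle has no chord
square["A"].add("C"); square["C"].add("A")  # triangulate with one chord
print(is_decomposable(square))              # True
```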

Recommendations


Depending on the nature of prior information regarding the ecological systems one wishes to model using BBN, there are several computational approaches from which to choose:

- Assume “known” nodes and perform structural learning (e.g., DEAL, HUGIN).
- Assume “known” edges and perform parameter learning (e.g., WINBUGS).
- Perform a two-step approach: first learn the structure, then the parameters.
- Perform structural and parameter learning simultaneously (RJMCMC; e.g., BayesX).

Tradeoffs for these approaches include computational expense (in terms of programming costs and CPU time) and the feasibility (really, availability) of probability and uncertainty estimation.


Four Node Example: Structural Learning vs RJMCMC Methods

or “This is all very nice, Steve, but will it work?”


In this example we consider a simple data set of four continuous variables, to which we will fit a directed graphical model. This model is fit by using a traditional algorithm (in this case DEAL), and also by an RJMCMC algorithm (using the package BayesX). The data are from the Mid-Atlantic Integrated Assessment (MAIA) data set for 1997-1998. Specifically, the variables are BUGIBI, an index of biotic integrity for macro-invertebrates; LAT, latitude of the sample point; ANC, acid neutralizing capacity; and SO4, sulfate.

References

Andersen, S.K., Olesen, K.G., Jensen, F.V. and Jensen, F. (1989). HUGIN: a shell for building Bayesian belief universes for expert systems. In Shafer, G. and Pearl, J. (eds) (1990), Readings in Uncertain Reasoning. Morgan Kaufmann Publishers, San Mateo, CA.

BayesX. http://www.stat.uni-muenchen.de/~lang/bayesx/bayesx.html

Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36, 302-309.

Borsuk, M.E., Stow, C.A. and Reckhow, K.H. (2003). Integrated approach to total maximum daily load development for Neuse River Estuary using Bayesian probability network model. J. Water Res. Pl. & Mgmt. 129(4), 271-282.

Bøttcher, S.G. and Dethlefsen, C. (2003). DEAL: A package for learning Bayesian networks. Technical report R-2003-03, Department of Mathematical Sciences, Aalborg University.

BUGS. http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml

Darroch, J.N., Lauritzen, S.L. and Speed, T.P. (1980). Markov fields and log-linear interaction models for contingency tables. Ann. Stat. 8, 522-539.

Dawid, A. and Lauritzen, S. (1993). Hyper Markov laws in the statistical analysis of decomposable graphical models. Ann. Stat. 21, 1272-1317.

DEAL. http://www.math.auc.dk/novo/deal/

Dellaportas, P. and Forster, J.J. (1999). Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Biometrika 86, 615-633.

Deshpande, A., Garofalakis, M. and Jordan, M.I. (2001). In Breese, J. and Koller, D. (eds), Uncertainty in Artificial Intelligence (UAI), Proceedings of the Seventeenth Conference 2001.

Fronk, E. and Giudici, P. (2000). Markov chain Monte Carlo model selection for DAG models. Technical report #118, Dept. of Political Economy and Quantitative Methods, University of Pavia.

Gansner, E., Koutsofios, E. and North, S. Drawing graphs with dot. http://www.research.att.com/sw/tools/graphviz/dotguide.pdf

Gibbs, W. (1902). Elementary Principles of Statistical Mechanics. Yale University Press, New Haven, Connecticut.

Giudici, P., Green, P.J. and Tarantola, C. (2000). Efficient model determination for discrete graphical models. Biometrika. To appear.

Giudici, P. and Green, P.J. (1999). Decomposable graphical Gaussian model determination. Biometrika 86, 785-801.

Green, P.J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711-732.

HUGIN. http://www.hugin.com/

Lauritzen, S.L. (2003). gRaphical models: A Software Perspective. Slides from DSC-2003.

Lauritzen, S.L. and Spiegelhalter, D.J. (1988). Local computations with probabilities on graphical structures and their application to expert systems (with discussion). J. Roy. Stat. Soc. Ser. B 50, 157-224.

Pearl, J. (1986). Fusion, propagation and structuring in belief networks. Artificial Intelligence 29, 241-288.

Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction and Search. Springer-Verlag, New York.

TETRAD. http://www.phil.cmu.edu/projects/tetrad/

Waagepetersen, R. and Sorensen, D. (2001). A tutorial on reversible jump MCMC with a view toward QTL-mapping. International Statistical Review 69, 49-61.

Wright, S. (1921). Correlation and causation. J. Agric. Res. 20, 557-585.

[Figure labels: example structures (Directed Graph, Undirected Graph, Chain Graph); four-node example panels comparing the Structural Learning Method with the Reversible Jump MCMC Method, with edge probabilities 0.130 and 0.125.]

This research is funded by U.S. EPA Science To Achieve Results (STAR) Program Cooperative Agreement #CR-829095.

The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This poster has not been formally reviewed by EPA. The views expressed here are solely those of the authors and of STARMAP, the Program they represent. EPA does not endorse any products or commercial services mentioned in this presentation.

[Figure labels: remaining four-node example panels, with RJMCMC probabilities 0.107 and 0.638 and structural-learning relative scores 1, 0.8211, 0.7234 and 0.7999.]
