Bayesian Belief Networks: Computational Considerations for the Environmental
Researcher
Stephen Jensen & Alix I Gitelman
Statistics Department
Oregon State University
August, 2003
Bayesian Belief Networks (BBN)
Bayesian Belief Networks are a class of models which can be used to describe succinctly the dependencies and interactions between large sets of variables. BBN have been used extensively in fields such as artificial intelligence and decision sciences. BBN are also well suited to environmental research, with its large numbers of variables and extensive dependencies and interactions. In fact, BBN are ideally suited to studying relationships between effluent limitations and water quality (e.g., Borsuk et al., 2003).
BBN are graphical models: a set of nodes represents the random variables, and a set of edges represents relationships between the nodes. A BBN can contain directed or undirected edges, and even a mixture of the two; examples of each appear in the figure below.
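To make the idea concrete, the minimal sketch below (not taken from any of the packages discussed later; the network, variable names, and probabilities are hypothetical) stores a small discrete directed BBN as a parent list plus conditional probability tables, and evaluates a joint probability as the product of each node's probability given its parents.

```python
# Minimal sketch of a discrete directed BBN: a hypothetical three-node network
# Rain -> WetField -> BugIndexLow, with made-up probability tables.

parents = {
    "Rain": [],
    "WetField": ["Rain"],
    "BugIndexLow": ["WetField"],
}

# Conditional probability tables: cpt[node][parent_values] = P(node = True | parents)
cpt = {
    "Rain": {(): 0.3},
    "WetField": {(True,): 0.9, (False,): 0.2},
    "BugIndexLow": {(True,): 0.6, (False,): 0.1},
}

def joint_probability(assignment):
    """P(assignment) = product over nodes of P(node | its parents)."""
    prob = 1.0
    for node, value in assignment.items():
        parent_values = tuple(assignment[p] for p in parents[node])
        p_true = cpt[node][parent_values]
        prob *= p_true if value else 1.0 - p_true
    return prob

print(joint_probability({"Rain": True, "WetField": True, "BugIndexLow": False}))
# 0.3 * 0.9 * (1 - 0.6) = 0.108
```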
Undirected BBN have their genesis in statistical physics work by Gibbs (1902). They have also been used through the years for modeling spatial interactions (Besag, 1974) and for interpreting hierarchical log-linear models by analogy to Markov random fields (Darroch et al., 1980).
Directed BBN originated in path analysis (Wright, 1921). They have seen much more use recently in the guise of causal networks (Pearl, 1986; Lauritzen & Spiegelhalter, 1988).
Estimating BBN
Several directed BBN algorithms have been devised, including HUGIN (Andersen et al., 1989), TETRAD (Spirtes et al., 1993) and DEAL (Bøttcher & Dethlefsen, 2003).
To determine a BBN model for a set of data, a canned package such as HUGIN or DEAL can be used exclusively. These packages search across some space of likely models using either an exhaustive search or possibly a heuristic like a greedy search. The networks are scored in some manner to determine the best candidates. These packages are useful for structural learning, that is, understanding which nodes are connected to which.
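The "search and score" idea behind these packages can be sketched in a few lines. The example below is not the HUGIN or DEAL algorithm: it is a generic greedy hill-climb over single-edge additions for a linear-Gaussian network, scoring each candidate DAG by BIC. The data are simulated, and the variable names merely echo the MAIA example later in the poster.

```python
# Generic greedy "search and score" sketch for linear-Gaussian BBN structure
# learning (illustration only; not the HUGIN/DEAL algorithm). Simulated data.
import numpy as np

rng = np.random.default_rng(1)
n = 200
lat = rng.normal(size=n)
anc = 0.8 * lat + rng.normal(size=n)          # hypothetical dependence
bug = 0.5 * anc + rng.normal(size=n)
data = np.column_stack([lat, anc, bug])
names = ["LAT", "ANC", "BUGIBI"]

def local_bic(child, parent_ids):
    """BIC contribution of one node's linear-Gaussian regression on its parents."""
    y = data[:, child]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parent_ids])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.mean((y - X @ beta) ** 2)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * (X.shape[1] + 1) * np.log(n)

def bic(dag):
    return sum(local_bic(c, sorted(dag[c])) for c in dag)

def creates_cycle(dag, u, v):
    """Would adding u -> v create a directed cycle (i.e., is u reachable from v)?"""
    stack, seen = [v], set()
    while stack:
        w = stack.pop()
        if w == u:
            return True
        if w not in seen:
            seen.add(w)
            stack.extend(c for c in dag if w in dag[c])
    return False

dag = {i: set() for i in range(len(names))}   # dag[child] = set of parents
while True:
    best, best_score = None, bic(dag)
    for u in dag:                              # try every legal single-edge addition
        for v in dag:
            if u != v and u not in dag[v] and not creates_cycle(dag, u, v):
                dag[v].add(u)
                score = bic(dag)
                dag[v].remove(u)
                if score > best_score:
                    best, best_score = (u, v), score
    if best is None:                           # no addition improves the score
        break
    dag[best[1]].add(best[0])

print({names[c]: [names[p] for p in dag[c]] for c in dag})
```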
Using these packages, the probability of an edge being included in the graph is not given, and neither are standard error estimates provided for parameter (i.e., node probability) estimates. Furthermore, for efficiency, these canned packages all require frequent triangulation of a graph, which is known to be an NP-hard problem, although restricting attention to decomposable models requires only a single triangulation (Deshpande et al., 2001).
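For undirected graphs, "decomposable" is the same as "chordal" (triangulated), and membership in that class can be verified in polynomial time with a maximum cardinality search. The sketch below is a generic illustration of such a check, the kind of test a decomposable-model search must run after each proposed edge change; the graphs are hypothetical.

```python
# Sketch: checking whether an undirected graph is decomposable (i.e. chordal)
# using maximum cardinality search. Illustration only; hypothetical graphs.

def mcs_ordering(adj):
    """Maximum cardinality search; returns a candidate perfect elimination ordering."""
    weight = {v: 0 for v in adj}
    unvisited, visited = set(adj), []
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])   # most already-visited neighbours
        visited.append(v)
        unvisited.remove(v)
        for u in adj[v]:
            if u in unvisited:
                weight[u] += 1
    return list(reversed(visited))

def is_decomposable(adj):
    """The graph is chordal iff the MCS ordering is a perfect elimination ordering."""
    order = mcs_ordering(adj)
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if pos[u] > pos[v]]
        for i, a in enumerate(later):
            for b in later[i + 1:]:
                if b not in adj[a]:               # later neighbours must form a clique
                    return False
    return True

four_cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
triangulated = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"a", "c"}}
print(is_decomposable(four_cycle))    # False: the 4-cycle has no chord
print(is_decomposable(triangulated))  # True: the chord a-c triangulates it
```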
An extension of this method is possible by first using HUGIN or another package to find a set of candidate models (that is, to perform the structural learning), and then estimating the parameters of those models using a Markov chain Monte Carlo method (the parameter learning phase). In this way, variability estimates for the model parameters can be calculated. However, this approach is two-stage, and in it, probabilities for an edge being included in the model remain uncalculated.
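As an illustration of the second (parameter-learning) stage, the toy sketch below runs a random-walk Metropolis-Hastings sampler for the strength of a single edge under a fixed structure and reports a posterior mean and standard deviation as the variability estimate. The data, priors, and proposal scale are all invented for the example; this is not the output of HUGIN or of any particular MCMC package.

```python
# Toy parameter learning for a fixed structure x -> y (linear-Gaussian link)
# via random-walk Metropolis-Hastings. Simulated data; arbitrary prior choices.
import numpy as np

rng = np.random.default_rng(7)
n = 100
x = rng.normal(size=n)
y = 0.7 * x + rng.normal(scale=0.5, size=n)    # true edge strength 0.7

def log_post(beta, sigma=0.5):
    loglik = -0.5 * np.sum((y - beta * x) ** 2) / sigma**2
    logprior = -0.5 * beta**2 / 10.0            # N(0, 10) prior on the edge strength
    return loglik + logprior

beta, draws = 0.0, []
for _ in range(5000):
    prop = beta + rng.normal(scale=0.2)         # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(beta):
        beta = prop
    draws.append(beta)

draws = np.array(draws[1000:])                  # discard burn-in
print("edge strength: mean %.3f, sd %.3f" % (draws.mean(), draws.std()))
```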
Others have simply assumed a structure for a BBN, and then used extant software to estimate the strength of the edges (Borsuk et al., 2003).
Reversible jump Markov chain Monte Carlo (RJMCMC; Green, 1995) is a generalization of the Metropolis-Hastings algorithm to models with varying dimension of the parameter space. This makes it well suited for graphical model estimation, where the number (and direction) of edges may be unknown a priori. That is, using the RJMCMC algorithm, we can accomplish both structural and parameter learning. Furthermore, edge probabilities and estimate variability are calculated as a matter of course.
RJMCMC works well for discrete undirected graphical models (Dellaportas & Forster, 1999), but is computationally demanding. Efficiency is improved by restricting the search to decomposable models. Algorithms for undirected decomposable graphs (UDGs) exist in purely continuous (Giudici & Green, 1999) and purely discrete settings (Giudici et al., 2000). Fronk & Giudici (2000) provided an algorithm for directed acyclic graphs (DAGs).
The RJMCMC algorithms for graphical models involve three steps:
1. Select an edge for addition, removal or reversal.
2. Decide whether to make the structural change and update the structure and parameters.
3. Update the parameters in some fashion without making a structural change.
RJMCMC algorithms will often require the same re-triangulation of the graph as the traditional algorithms, making the structural updating quite computationally intensive. Restriction to decomposable models makes the computation easier, as the model structure can be updated in polynomial time (Deshpande et al., 2001). Updating the parameters (step 3 above) is a simple Metropolis-Hastings step, which is rather quick.
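To make the three steps concrete, the following toy sketch runs a reversible jump sampler over just two models, "no edge between x and y" and "one directed edge x -> y with strength beta". Each iteration either proposes adding or removing the edge (a dimension-changing move with its reversible-jump acceptance ratio) or updates beta with an ordinary Metropolis-Hastings step; the posterior edge probability is simply the fraction of iterations spent in the one-edge model. Everything here (data, priors, proposal scales) is invented for illustration, and this is not the BayesX implementation.

```python
# Toy reversible jump MCMC over two models: M0 (no edge) and M1 (edge x -> y
# with strength beta). Simulated data, arbitrary priors, known noise variance.
import numpy as np

rng = np.random.default_rng(42)
n = 80
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)               # the data actually contain the edge

def loglik(beta):
    """Gaussian log-likelihood with unit noise variance; beta=None means no edge."""
    mean = 0.0 if beta is None else beta * x
    return -0.5 * np.sum((y - mean) ** 2)

def logprior_beta(beta):
    return -0.5 * beta**2 / 4.0 - 0.5 * np.log(2 * np.pi * 4.0)   # beta ~ N(0, 4)

PROP_SD = 1.0                                   # sd of the birth proposal for beta
def logq(beta):
    return -0.5 * beta**2 / PROP_SD**2 - 0.5 * np.log(2 * np.pi * PROP_SD**2)

beta = None                                     # start in M0 (edge absent)
edge_present, beta_draws = [], []
for _ in range(20000):
    if rng.uniform() < 0.5:                     # steps 1-2: propose a structural change
        if beta is None:                        # birth move: add the edge, draw new beta
            prop = rng.normal(scale=PROP_SD)
            log_a = loglik(prop) + logprior_beta(prop) - loglik(None) - logq(prop)
            if np.log(rng.uniform()) < log_a:
                beta = prop
        else:                                   # death move: remove the edge
            log_a = loglik(None) - (loglik(beta) + logprior_beta(beta)) + logq(beta)
            if np.log(rng.uniform()) < log_a:
                beta = None
    elif beta is not None:                      # step 3: ordinary M-H update of beta
        prop = beta + rng.normal(scale=0.3)
        log_a = (loglik(prop) + logprior_beta(prop)
                 - loglik(beta) - logprior_beta(beta))
        if np.log(rng.uniform()) < log_a:
            beta = prop
    edge_present.append(beta is not None)
    if beta is not None:
        beta_draws.append(beta)

print("posterior edge probability:", np.mean(edge_present))
print("edge strength: mean %.3f, sd %.3f" % (np.mean(beta_draws), np.std(beta_draws)))
```

Because the two model priors and the move-selection probabilities are equal, they cancel in the acceptance ratio, and the Jacobian of the birth move is one; the proposal density for the new beta is the only extra term.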
Recommendations
Depending on the nature of prior information regarding the ecological systems one wishes to model using BBN, there are several computational approaches from which to choose.
• Assume "known" nodes, perform structural learning (e.g., DEAL, HUGIN).
• Assume "known" edges, perform parameter learning (e.g., WinBUGS).
• Perform a two-step approach: first learn the structure, then the parameters.
• Perform structural and parameter learning simultaneously (RJMCMC; e.g., BayesX).
Tradeoffs among these approaches include computational expense (in terms of programming costs and CPU time) and the feasibility (really, availability) of probability and uncertainty estimation.
Four Node Example: Structural Learning vs RJMCMC Methods
or “This is all very nice, Steve, but will it work?”
In this example we consider a simple data set of four continuous variables, to which we fit a directed graphical model. The model is fit using a traditional algorithm (in this case DEAL) and also by an RJMCMC algorithm (using the package BayesX). The data come from the Mid-Atlantic Integrated Assessment (MAIA) data set for 1997-1998. Specifically, the variables are BUGIBI, an index of biotic integrity for macroinvertebrates; LAT, latitude of the sample point; ANC, acid neutralizing capacity; and SO4, sulfate.
References
Andersen, S.K., Olesen, K.G., Jensen, F.V. and Jensen, F. (1989). HUGIN: a shell for building Bayesian belief universes for expert systems. In Shafer, G. and Pearl, J. (eds) (1990), Readings in Uncertain Reasoning. Morgan Kaufmann Publishers, San Mateo, CA.
BayesX. http://www.stat.uni-muenchen.de/~lang/bayesx/bayesx.html
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36, 302-309.
Borsuk, M.E., Stow, C.A. and Reckhow, K.H. (2003). Integrated approach to total maximum daily load development for Neuse River Estuary using Bayesian probability network model. J. Water Res. Pl. & Mgmt. 129(4), 271-282.
Bøttcher, S.G. and Dethlefsen, C. (2003). DEAL: a package for learning Bayesian networks. Technical report R-2003-03, Department of Mathematical Sciences, Aalborg University.
BUGS. http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
Darroch, J.N., Lauritzen, S.L. and Speed, T.P. (1980). Markov fields and log-linear interaction models for contingency tables. Ann. Stat. 8, 522-539.
Dawid, A. and Lauritzen, S. (1993). Hyper Markov laws in the statistical analysis of decomposable graphical models. Ann. Stat. 21, 1272-1317.
DEAL. http://www.math.auc.dk/novo/deal/
Dellaportas, P. and Forster, J.J. (1999). Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Biometrika 86, 615-633.
Deshpande, A., Garofalakis, M. and Jordan, M.I. (2001). In Breese, J. and Koller, D. (eds), Uncertainty in Artificial Intelligence (UAI), Proceedings of the Seventeenth Conference, 2001.
Fronk, E. and Giudici, P. (2000). Markov chain Monte Carlo model selection for DAG models. Technical report #118, Department of Political Economy and Quantitative Methods, University of Pavia.
Gansner, E., Koutsofios, E. and North, S. Drawing graphs with dot. http://www.research.att.com/sw/tools/graphviz/dotguide.pdf
Gibbs, W. (1902). Elementary Principles of Statistical Mechanics. Yale University Press, New Haven, Connecticut.
Giudici, P., Green, P.J. and Tarantola, C. (2000). Efficient model determination for discrete graphical models. Biometrika. To appear.
Giudici, P. and Green, P.J. (1999). Decomposable graphical Gaussian model determination. Biometrika 86, 785-801.
Green, P.J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711-732.
HUGIN. http://www.hugin.com/
Lauritzen, S.L. (2003). gRaphical models: a software perspective. Slides from DSC-2003.
Lauritzen, S.L. and Spiegelhalter, D.J. (1988). Local computations with probabilities on graphical structures and their application to expert systems (with discussion). J. Roy. Stat. Soc. Ser. B 50, 157-224.
Pearl, J. (1986). Fusion, propagation and structuring in belief networks. Artificial Intelligence 29, 241-288.
Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction and Search. Springer-Verlag, New York.
TETRAD. http://www.phil.cmu.edu/projects/tetrad/
Waagepetersen, R. and Sorensen, D. (2001). A tutorial on reversible jump MCMC with a view toward QTL-mapping. International Statistical Review 69, 49-61.
Wright, S. (1921). Correlation and causation. J. Agric. Res. 20, 557-585.
[Figure: candidate four-node models from the Structural Learning Method and the Reversible Jump MCMC Method, with model probabilities 0.130 and 0.125.]
[Figure: examples of a Directed Graph, an Undirected Graph, and a Chain Graph.]
This research is funded by the U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement #CR-829095.
The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This poster has not been formally reviewed by EPA. The views expressed here are solely those of the authors and of STARMAP, the Program they represent. EPA does not endorse any products or commercial services mentioned in this presentation.
[Figure: further candidate models, with model probabilities 0.107 and 0.638 and relative scores 1, 0.8211, 0.7234 and 0.7999.]
[Figures deal1-deal4: candidate graphs drawn with dot version 1.10; EPS previews not included.]