# Bayesian Networks

AI and Robotics

Nov 7, 2013


BAYESIAN NETWORKS

Ivan Bratko

Faculty of Computer and Information Science

University of Ljubljana

BAYESIAN NETWORKS

Bayesian networks, or belief networks: an approach to handling uncertainty in knowledge-based systems

Mathematically well-founded in probability theory, unlike many other, earlier approaches to representing uncertain knowledge

The type of problem belief nets are intended for: given that some things are known to be true, how likely are some other events?

BURGLARY EXAMPLE

We have an alarm system to warn about burglary.

We have received an automatic alarm phone call; how likely is it that there actually was a burglary?

We cannot tell for sure whether there was a burglary, but we can characterize it by a probability.

BURGLARY EXAMPLE

There are a number of events involved:

burglary

sensor, which may be triggered by a burglar

lightning, which may also trigger the sensor

alarm, which may be triggered by the sensor

call, which may be triggered by the sensor

BAYES NET REPRESENTATION

There are variables (e.g. burglary, alarm) that can take
values (e.g. alarm = true, burglary = false).

There are probabilistic relations among variables, e.g.:

if burglary = true

then it is more likely that alarm = true

EXAMPLE BAYES NET

[Network structure: burglary → sensor, lightning → sensor, sensor → alarm, sensor → call]

PROBABILISTIC DEPENDENCIES AND CAUSALITY

Belief networks define probabilistic dependencies (and
independencies) among the variables

They may also reflect causality (burglar triggers sensor)

EXAMPLE OF REASONING IN BELIEF NETWORK

In a normal situation, burglary is not very likely.

We receive an automatic warning call; since the sensor causes the warning call, the probability of the sensor being on increases; since burglary is a cause for triggering the sensor, the probability of burglary increases.

Then we learn there was a storm. Lightning may also trigger the sensor. Since lightning now also explains how the call happened, the probability of burglary decreases.

TERMINOLOGY

Bayes network = belief network = probabilistic network = causal network

BAYES NETWORKS, DEFINITION

A Bayes net is a DAG (directed acyclic graph):

Nodes ~ random variables

Arcs ~ “X has direct influence on Y”

For each node: a conditional probability table quantifying the effects of the parent nodes

MAJOR PROBLEM IN HANDLING
UNCERTAINTY

In general, with uncertainty, the problem is the handling
of dependencies between events.

In principle, this can be handled by specifying the
complete probability distribution over all possible
combinations of variable values.

However, this is impractical or impossible: for n binary variables, 2^n - 1 probabilities are needed - too many!

Belief networks usually allow this number to be greatly reduced in practice.

BURGLARY DOMAIN

Five events: B, L, S, A, C

Complete probability distribution:

p( B L S A C) = ...

p( ~B L S A C) = ...

p( ~B ~L S A C) = ...

p( ~B L ~S A C) = ...

...

Total: 32 probabilities (31 of them independent, since they must sum to 1)

WHY DID BELIEF NETS BECOME SO POPULAR?

If some things are mutually independent then not all
conditional probabilities are needed.

p(XY) = p(X) p(Y|X), p(Y|X) needed

If X and Y independent:

p(XY) = p(X) p(Y), p(Y|X) not needed!

Belief networks provide an elegant way of stating
independences

EXAMPLE FROM J. PEARL

[Network structure: Burglary → Alarm, Earthquake → Alarm, Alarm → John calls, Alarm → Mary calls]

Burglary causes alarm

Earthquake causes alarm

When they hear the alarm, neighbours John and Mary phone

Occasionally John confuses the phone ringing with the alarm

Occasionally Mary fails to hear the alarm

PROBABILITIES

P(B) = 0.001, P(E) = 0.002

A | P(J | A)        A | P(M | A)
T | 0.90            T | 0.70
F | 0.05            F | 0.01

B E | P(A | B E)
T T | 0.95
T F | 0.95
F T | 0.29
F F | 0.001
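These tables are all a query algorithm needs: any query can be answered by summing entries of the full joint distribution, which the network factors into CPT products. A minimal Python sketch (the enumeration approach and variable names are mine; the numbers come from the tables above) computing p(Burglary | John calls, Mary calls):

```python
from itertools import product

# CPTs from the slides (Pearl's burglary/earthquake example)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J | A)
P_M = {True: 0.70, False: 0.01}                      # P(M | A)

def joint(b, e, a, j, m):
    """Probability of one complete assignment: a product of CPT entries."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# p(B | J, M): sum the joint over the unobserved variables E and A
num = sum(joint(True, e, a, True, True) for e, a in product([True, False], repeat=2))
den = sum(joint(b, e, a, True, True) for b, e, a in product([True, False], repeat=3))
print(num / den)   # ≈ 0.286
```

Both neighbours calling raises the probability of burglary from 0.001 to roughly 0.29, even though either caller alone is unreliable.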

HOW ARE INDEPENDENCIES STATED IN
BELIEF NETS

[Network structure: a chain A → B → C → D]

If C is known to be true, then the probability of D is independent of A and B:

p( D | A B C) = p( D | C)

In general, let A1, A2, ... be the nondescendants of C, B1, B2, ... the parents of C, and D1, D2, ... the descendants of C.

C is independent of C's nondescendants given C's parents:

p( C | A1, ..., B1, ...) = p( C | B1, ...)

INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE: EXAMPLE

[Network structure, one plausible reconstruction of the figure: a → b → c → d, e → d, e → f; b is c's parent, d is c's descendant, and a, e, f are nondescendants of c]

p(c | a b) = p(c | b)

Because: c is independent of its nondescendant a given c's parent b.

INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE

But, for this Bayesian network:

p(c | b d f) ≠ p(c | b d)

Although f is a nondescendant of c, it cannot be ignored: knowing f, e becomes more likely; e may also cause d, so when e becomes more likely, c becomes less likely.

The problem is that the descendant d is given.
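The warning above can be checked numerically. The sketch below fills the example network with assumed numbers and enumerates the joint distribution; the arc layout (a → b → c → d, e → d, e → f) is a reconstruction of the figure, and every CPT value is an illustrative assumption, not from the slides:

```python
from itertools import product

# Assumed arcs: a -> b, b -> c, c -> d, e -> d, e -> f
# All numbers below are made up for illustration.
pa, pe = 0.3, 0.4
pb = {True: 0.8, False: 0.1}                      # p(b | a)
pc = {True: 0.9, False: 0.2}                      # p(c | b)
pd = {(True, True): 0.99, (True, False): 0.7,
      (False, True): 0.8, (False, False): 0.05}   # p(d | c, e)
pf = {True: 0.9, False: 0.1}                      # p(f | e)

def joint(a, b, c, d, e, f):
    p = (pa if a else 1 - pa) * (pb[a] if b else 1 - pb[a])
    p *= pc[b] if c else 1 - pc[b]
    p *= pe if e else 1 - pe
    p *= pd[(c, e)] if d else 1 - pd[(c, e)]
    p *= pf[e] if f else 1 - pf[e]
    return p

def p_c_given(given):
    """p(c = true | given), with `given` a dict of observed values."""
    names = ["a", "b", "c", "d", "e", "f"]
    num = den = 0.0
    for vals in product([True, False], repeat=6):
        v = dict(zip(names, vals))
        if any(v[k] != val for k, val in given.items()):
            continue
        p = joint(*vals)
        den += p
        if v["c"]:
            num += p
    return num / den

# Nondescendant a can be ignored given parent b (both ≈ 0.9):
print(p_c_given({"a": True, "b": True}), p_c_given({"b": True}))
# But nondescendant f cannot be ignored once descendant d is given:
print(p_c_given({"b": True, "d": True, "f": True}), p_c_given({"b": True, "d": True}))
```

With these numbers, observing f lowers p(c | b d): f makes e more likely, e explains away d, and so c loses support, exactly as the slide argues.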

SAFER FORMULATION OF
INDEPENDENCE

C is independent of C's nondescendants given C's parents (only), provided that no descendant of C is given.

STATING PROBABILITIES IN BELIEF NETS

For each node X with parents Y1, Y2, ..., specify conditional probabilities of the form

p( X | Y1 Y2 ...)

for all possible states of Y1, Y2, ...

[Network structure: Y1 → X, Y2 → X]

Specify:

p( X | Y1, Y2)

p( X | ~Y1, Y2)

p( X | Y1, ~Y2)

p( X | ~Y1, ~Y2)

BURGLARY EXAMPLE

p(burglary) = 0.001

p(lightning) = 0.02

p(sensor | burglary lightning) = 0.9

p(sensor | burglary ~lightning) = 0.9

p(sensor | ~burglary lightning) = 0.1

p(sensor | ~burglary ~lightning) = 0.001

p(alarm | sensor) = 0.95

p(alarm | ~sensor) = 0.001

p(call | sensor) = 0.9

p(call | ~sensor) = 0.0

BURGLARY EXAMPLE

10 numbers plus the structure of the network

are equivalent to

2^5 - 1 = 31 numbers required to specify the complete probability distribution (without structure information).
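The bookkeeping behind these two counts can be made explicit. A small sketch, assuming one CPT entry per combination of binary parent values (parent counts read off the network structure above):

```python
# Full joint over n binary variables: 2^n - 1 independent numbers
n = 5                                   # B, L, S, A, C
full_joint = 2 ** n - 1                 # 31

# Factored form: a node with k binary parents needs 2^k CPT entries
parents = {"burglary": 0, "lightning": 0, "sensor": 2, "alarm": 1, "call": 1}
factored = sum(2 ** k for k in parents.values())
print(full_joint, factored)             # 31 10
```

The gap widens rapidly: with more variables the full joint grows exponentially, while the factored count grows only with the number of nodes and the sizes of their parent sets.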

EXAMPLE QUERIES FOR BELIEF
NETWORKS

p( burglary | alarm) = ?

p( burglary lightning) = ?

p( burglary | alarm ~lightning) = ?

p( alarm ~call | burglary) = ?

PROBABILISTIC REASONING IN BELIEF NETS

Easy in the forward direction, from ancestors to descendants, e.g.:

p( alarm | burglary lightning) = ?

In the backward direction, from descendants to ancestors, apply Bayes' formula:

p( B | A) = p(B) * p(A | B) / p(A)
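A worked check of both directions with the burglary CPT values above (the arithmetic is mine; note that p(sensor | burglary) = 0.9 whatever lightning does, since both CPT rows give 0.9):

```python
# Burglary-network CPT values from the slides
p_b, p_l = 0.001, 0.02
p_s = {(True, True): 0.9, (True, False): 0.9,
       (False, True): 0.1, (False, False): 0.001}   # p(sensor | B, L)
p_a = {True: 0.95, False: 0.001}                    # p(alarm | S)

# Forward: p(alarm | burglary lightning), summing out sensor
p_alarm = p_a[True] * p_s[(True, True)] + p_a[False] * (1 - p_s[(True, True)])
print(p_alarm)                    # 0.95*0.9 + 0.001*0.1 = 0.8551

# Backward: p(burglary | sensor) via Bayes' formula
p_s_b = p_l * p_s[(True, True)] + (1 - p_l) * p_s[(True, False)]     # p(S | B) = 0.9
p_s_nb = p_l * p_s[(False, True)] + (1 - p_l) * p_s[(False, False)]  # p(S | ~B)
p_sensor = p_b * p_s_b + (1 - p_b) * p_s_nb
print(p_b * p_s_b / p_sensor)     # ≈ 0.232
```

The forward step is a plain weighted sum over the intermediate variable; the backward step needs the prior p(sensor), which is itself computed forward from the parents.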

BAYES' FORMULA

A variant of Bayes' formula to reason about the probability of hypothesis H given evidence E in the presence of background knowledge B:

p( X | Y) = p( X) * p( Y | X) / p( Y)

p( H | E B) = p( H | B) * p( E | H B) / p( E | B)
REASONING RULES

1. Probability of conjunction:

p( X1 X2 | Cond) = p( X1 | Cond) * p( X2 | X1 Cond)

2. Probability of a certain event:

p( X | Y1 ... X ...) = 1

3. Probability of an impossible event:

p( X | Y1 ... ~X ...) = 0

4. Probability of negation:

p( ~X | Cond) = 1 - p( X | Cond)

5. If the condition involves a descendant of X then use Bayes' theorem:

If Cond0 = Y Cond, where Y is a descendant of X in the belief net,

then p( X | Cond0) = p( X | Cond) * p( Y | X Cond) / p( Y | Cond)

6. Cases when condition Cond does not involve a descendant of X:

(a) If X has no parents then p( X | Cond) = p( X), with p( X) given

(b) If X has parents, then

p( X | Cond) = Σ_S p( X | S) * p( S | Cond)

where the sum is over all possible states S of X's parents.
A SIMPLE IMPLEMENTATION IN PROLOG

In: I. Bratko, Prolog Programming for Artificial Intelligence, Third edition, Pearson Education 2001 (Chapter 15)

An interaction with this program:

?- prob( burglary, [call], P).
P = 0.232137

Now we learn there was a heavy storm, so:

?- prob( burglary, [call, lightning], P).
P = 0.00892857

Lightning explains the call, so burglary seems less likely. However, if the weather was fine then burglary becomes more likely:

?- prob( burglary, [call, not lightning], P).
P = 0.473934
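The Prolog source is in the book; as a sketch, the same three queries can be reproduced by brute-force enumeration of the joint distribution (the `prob` helper below is my own Python stand-in for the book's predicate, using the CPTs from the earlier slides):

```python
from itertools import product

# Burglary-network CPTs from the earlier slides
p_b, p_l = 0.001, 0.02
p_s = {(True, True): 0.9, (True, False): 0.9,
       (False, True): 0.1, (False, False): 0.001}   # p(sensor | B, L)
p_a = {True: 0.95, False: 0.001}                    # p(alarm | S)
p_c = {True: 0.9, False: 0.0}                       # p(call | S)

def joint(b, l, s, a, c):
    """Probability of one complete assignment: a product of CPT entries."""
    p = (p_b if b else 1 - p_b) * (p_l if l else 1 - p_l)
    p *= p_s[(b, l)] if s else 1 - p_s[(b, l)]
    p *= p_a[s] if a else 1 - p_a[s]
    p *= p_c[s] if c else 1 - p_c[s]
    return p

def prob(query, evidence):
    """p(query = true | evidence), with evidence a dict of observed values."""
    names = ["b", "l", "s", "a", "c"]
    num = den = 0.0
    for vals in product([True, False], repeat=5):
        v = dict(zip(names, vals))
        if any(v[k] != val for k, val in evidence.items()):
            continue
        p = joint(*vals)
        den += p
        if v[query]:
            num += p
    return num / den

print(prob("b", {"c": True}))                    # ≈ 0.232137
print(prob("b", {"c": True, "l": True}))         # ≈ 0.00892857
print(prob("b", {"c": True, "l": False}))        # ≈ 0.473934
```

Enumeration visits all 2^5 assignments, which is exactly the exponential cost the next slide warns about; it is fine here but does not scale.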

The complexity of reasoning in belief networks grows exponentially with the number of nodes.

Substantial algorithmic improvements are required to handle large networks efficiently.

d-SEPARATION

Follows from the basic independence assumption of Bayes networks

d-separation = direction-dependent separation

Let E = a set of “evidence nodes” (a subset of the variables in the Bayes network)

Let Vi, Vj be two variables in the network

d-SEPARATION

Nodes Vi and Vj are conditionally independent given set E if E d-separates Vi and Vj

E d-separates Vi, Vj if all (undirected) paths between Vi and Vj are “blocked” by E

If E d-separates Vi, Vj, then Vi and Vj are conditionally independent, given E

We write I( Vi, Vj | E)

This means: p( Vi, Vj | E) = p( Vi | E) * p( Vj | E)
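In the burglary network, the only path between alarm and call is alarm ← sensor → call, so E = {sensor} d-separates them (sensor is a common cause, the first blocking condition below). A sketch confirming the predicted independence numerically (the helper names are mine; the CPTs are from the slides):

```python
from itertools import product

# Burglary-network CPTs; check I(alarm, call | sensor)
p_b, p_l = 0.001, 0.02
p_s = {(True, True): 0.9, (True, False): 0.9,
       (False, True): 0.1, (False, False): 0.001}   # p(sensor | B, L)
p_a = {True: 0.95, False: 0.001}                    # p(alarm | S)
p_c = {True: 0.9, False: 0.0}                       # p(call | S)

def joint(b, l, s, a, c):
    p = (p_b if b else 1 - p_b) * (p_l if l else 1 - p_l)
    p *= p_s[(b, l)] if s else 1 - p_s[(b, l)]
    p *= p_a[s] if a else 1 - p_a[s]
    p *= p_c[s] if c else 1 - p_c[s]
    return p

def cond(targets, evidence):
    """p(targets | evidence), both given as dicts of variable values."""
    names = ["b", "l", "s", "a", "c"]
    num = den = 0.0
    for vals in product([True, False], repeat=5):
        v = dict(zip(names, vals))
        if any(v[k] != val for k, val in evidence.items()):
            continue
        p = joint(*vals)
        den += p
        if all(v[k] == val for k, val in targets.items()):
            num += p
    return num / den

lhs = cond({"a": True, "c": True}, {"s": True})
rhs = cond({"a": True}, {"s": True}) * cond({"c": True}, {"s": True})
print(lhs, rhs)   # equal: p(a c | s) = p(a | s) * p(c | s) = 0.95 * 0.9
```

Once the sensor's state is known, hearing the alarm tells us nothing further about whether a call was made, which is exactly I(alarm, call | sensor).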

BLOCKING A PATH

A path between Vi and Vj is blocked by nodes E if there is a “blocking node” Vb on the path. Vb blocks the path if one of the following holds:

1. Vb is in E and both arcs on the path lead out of Vb, or

2. Vb is in E and one arc on the path leads into Vb and one leads out of it, or

3. neither Vb nor any descendant of Vb is in E, and both arcs on the path lead into Vb.

CONDITION 1

Vb is a common cause:

[Vi ← Vb → Vj]

CONDITION 2

Vb is a “closer, more direct cause” of Vj than Vi is:

[Vi → Vb → Vj]

CONDITION 3

Vb is a common consequence of Vi and Vj, but neither Vb nor any of Vb's descendants is in E:

[Vi → Vb ← Vj, with a descendant Vd of Vb; Vb not in E, Vd not in E]