bayesian networks - Artificial Intelligence Laboratory


7 Nov 2013


BAYESIAN NETWORKS


Ivan Bratko

Faculty of Computer and Information Sc.

University of Ljubljana

BAYESIAN NETWORKS


Bayesian networks, or belief networks: an approach to
handling uncertainty in knowledge-based systems



Mathematically well-founded in probability theory, unlike
many other, earlier approaches to representing uncertain
knowledge



Type of problems intended for belief nets: given that
some things are known to be true, how likely are some
other events?

BURGLARY EXAMPLE


We have an alarm system to warn about burglary.



We have received an automatic alarm phone call; how
likely is it that there actually was a burglary?



We cannot tell about burglary for sure, but characterize it
probabilistically instead


BURGLARY EXAMPLE


There are a number of events involved:


burglary

sensor, which may be triggered by a burglar

lightning, which may also trigger the sensor

alarm, which may be triggered by the sensor

call, which may be triggered by the sensor


BAYES NET REPRESENTATION


There are variables (e.g. burglary, alarm) that can take
values (e.g. alarm = true, burglary = false).



There are probabilistic relations among variables, e.g.:


if burglary = true then it is more likely that alarm = true

EXAMPLE BAYES NET




burglary → sensor ← lightning

sensor → alarm

sensor → call

PROBABILISTIC DEPENDENCIES AND CAUSALITY


Belief networks define probabilistic dependencies (and
independencies) among the variables



They may also reflect causality (burglar triggers sensor)

EXAMPLE OF REASONING IN A BELIEF NETWORK



In a normal situation, burglary is not very likely.


We receive an automatic warning call; since the sensor causes
the warning call, the probability of the sensor being on
increases; since burglary is a cause for triggering the
sensor, the probability of burglary increases.


Then we learn there was a storm. Lightning may also
trigger the sensor. Since lightning now also explains how
the call happened, the probability of burglary decreases.

TERMINOLOGY

Bayes network = belief network = probabilistic network = causal network




BAYES NETWORKS, DEFINITION


A Bayes net is a DAG (directed acyclic graph)



Nodes ~ random variables



Link X → Y intuitively means: “X has direct influence on Y”



For each node: conditional probability table quantifying
effects of parent nodes

MAJOR PROBLEM IN HANDLING
UNCERTAINTY


In general, with uncertainty, the problem is the handling
of dependencies between events.


In principle, this can be handled by specifying the
complete probability distribution over all possible
combinations of variable values.


However, this is impractical or impossible: for n binary
variables, 2^n - 1 probabilities are needed, which is too many!


Belief networks usually allow this number to be
reduced in practice

BURGLARY DOMAIN


Five events: B, L, S, A, C



Complete probability distribution:


p( B L S A C) = ...

p( ~B L S A C) = ...

p( ~B ~L S A C) = ...

p( ~B L ~S A C) = ...

...


Total: 32 probabilities, one per combination of values (31 of them independent, since they must sum to 1)

WHY DID BELIEF NETS BECOME SO POPULAR?


If some things are mutually independent then not all
conditional probabilities are needed.


p(XY) = p(X) p(Y|X), p(Y|X) needed



If X and Y independent:


p(XY) = p(X) p(Y), p(Y|X) not needed!



Belief networks provide an elegant way of stating
independences

EXAMPLE FROM J. PEARL


Burglary → Alarm ← Earthquake

Alarm → John calls

Alarm → Mary calls



Burglary causes alarm


Earthquake causes alarm


When they hear the alarm, neighbours John and Mary phone


Occasionally John confuses the phone ringing for the alarm


Occasionally Mary fails to hear the alarm



PROBABILITIES

P(B) = 0.001, P(E) = 0.002

A   P(J | A)        A   P(M | A)
T   0.90            T   0.70
F   0.05            F   0.01

B   E   P(A | B, E)
T   T   0.95
T   F   0.95
F   T   0.29
F   F   0.001
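
As a worked illustration of how these tables are used, the following Python sketch computes the probability of burglary given that both John and Mary call, by summing the joint distribution over the unobserved variables. The names `joint`, `pick` and `posterior_burglary` are illustrative choices and not part of the original slides; the numbers are copied from the tables above as given.

```python
# Numbers copied from the slide's tables (Pearl's example).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.95,
       (False, True): 0.29, (False, False): 0.001}   # P(A | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J | A)
P_M = {True: 0.70, False: 0.01}                       # P(M | A)
BOOLS = (True, False)


def pick(p_true, value):
    """Probability of `value` for a binary variable with P(true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Joint probability, factorized along the network structure."""
    return (pick(P_B, b) * pick(P_E, e) * pick(P_A[(b, e)], a)
            * pick(P_J[a], j) * pick(P_M[a], m))


def posterior_burglary(j=True, m=True):
    """P(Burglary | John calls = j, Mary calls = m), summing out E and A."""
    numerator = sum(joint(True, e, a, j, m) for e in BOOLS for a in BOOLS)
    denominator = sum(joint(b, e, a, j, m)
                      for b in BOOLS for e in BOOLS for a in BOOLS)
    return numerator / denominator


if __name__ == "__main__":
    print(posterior_burglary())
```

Even with both neighbours calling, the posterior stays well below one half, because the prior probability of burglary is so small.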


HOW ARE INDEPENDENCIES STATED IN
BELIEF NETS



A → B → C → D


If C is known to be true, then the probability of D is independent of A and B:


p( D | A B C) = p( D | C)





A1, A2, ...: nondescendants of C

B1, B2, ...: parents of C

C

D1, D2, ...: descendants of C




C is independent of C's nondescendants given C's parents

p( C | A1, ..., B1, ..., D1, ...) = p( C | B1, ..., D1, ...)


INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE

EXAMPLE



Nodes: a, b, c, d, e, f

b: parent of c

d: descendant of c

a, e, f: nondescendants of c


By applying rule about nondescendants:


p(c|ab) = p(c|b)

Because: c independent of c's nondesc. a given c's
parents (node b)

INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE

But, for this Bayesian network:


p(c|bdf) ≠ p(c|bd)


Although f is c's nondescendant, it cannot be ignored:

knowing f, e becomes more likely;


e may also cause d, so when e becomes more likely, c
becomes less likely.


The problem is that descendant d is given.


SAFER FORMULATION OF
INDEPENDENCE

C is independent of C's nondescendants given
C's parents (only) and not C's descendants.

STATING PROBABILITIES IN BELIEF NETS

For each node X with parents Y1, Y2, ..., specify
conditional probabilities of form:


p( X | Y1 ∧ Y2 ∧ ...)

for all possible states of Y1, Y2, ...



Y1 → X ← Y2

Specify:


p( X | Y1, Y2)


p( X | ~Y1, Y2)


p( X | Y1, ~Y2)


p( X | ~Y1, ~Y2)

BURGLARY EXAMPLE


p(burglary) = 0.001


p(lightning) = 0.02


p(sensor | burglary ∧ lightning) = 0.9


p(sensor | burglary ∧ ~lightning) = 0.9


p(sensor | ~burglary ∧ lightning) = 0.1


p(sensor | ~burglary ∧ ~lightning) = 0.001


p(alarm | sensor) = 0.95


p(alarm | ~sensor) = 0.001


p(call | sensor) = 0.9


p(call | ~sensor) = 0.0
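
Together with the network structure, the numbers above determine the whole joint distribution. As a sketch, writing B, L, S, A, C for burglary, lightning, sensor, alarm and call (the abbreviations used on the BURGLARY DOMAIN slide), the joint probability factorizes into one term per node, and one entry of the complete distribution can be worked out by plain multiplication:

```latex
\begin{align*}
p(B \wedge L \wedge S \wedge A \wedge C)
  &= p(B)\, p(L)\, p(S \mid B \wedge L)\, p(A \mid S)\, p(C \mid S) \\[4pt]
p(B \wedge \lnot L \wedge S \wedge A \wedge C)
  &= p(B)\, p(\lnot L)\, p(S \mid B \wedge \lnot L)\, p(A \mid S)\, p(C \mid S) \\
  &= 0.001 \times 0.98 \times 0.9 \times 0.95 \times 0.9 \\
  &\approx 0.000754
\end{align*}
```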

BURGLARY EXAMPLE


10 numbers plus structure of network


are equivalent to


2^5 - 1 = 31 numbers required to specify
complete probability distribution (without
structure information).
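
The count of 10 can be checked mechanically: each binary node needs one number per combination of its parents' values. The following Python sketch does this for the burglary network; the dictionary `PARENTS` and the function names are illustrative assumptions, with the structure taken from the slides.

```python
# Parents of each node in the burglary network (structure from the slides).
PARENTS = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}


def network_parameters(parents):
    """Numbers needed by a belief net over binary variables:
    one probability per combination of parent values, for every node."""
    return sum(2 ** len(ps) for ps in parents.values())


def full_joint_parameters(n):
    """Numbers needed for the complete distribution over n binary variables."""
    return 2 ** n - 1


if __name__ == "__main__":
    print(network_parameters(PARENTS), full_joint_parameters(len(PARENTS)))
```

The saving grows quickly with the number of variables, as long as each node has only a few parents.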


EXAMPLE QUERIES FOR BELIEF
NETWORKS




p( burglary | alarm) = ?

p( burglary ∧ lightning) = ?

p( burglary | alarm ∧ ~lightning) = ?

p( alarm ∧ ~call | burglary) = ?
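
For a network this small, queries of this kind can be answered by brute force: build the joint distribution from the conditional probabilities and sum over all 32 combinations of values. The Python sketch below does that; `query`, `joint` and the table layout `P_TRUE` are hypothetical names chosen for illustration, and this is not the method the slides go on to describe.

```python
from itertools import product

# Structure and numbers of the burglary network, copied from the slides.
PARENTS = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}
# P(variable = true | values of its parents, in the order listed above).
P_TRUE = {
    "burglary": {(): 0.001},
    "lightning": {(): 0.02},
    "sensor": {(True, True): 0.9, (True, False): 0.9,
               (False, True): 0.1, (False, False): 0.001},
    "alarm": {(True,): 0.95, (False,): 0.001},
    "call": {(True,): 0.9, (False,): 0.0},
}


def joint(world):
    """Probability of one complete assignment {variable: bool}."""
    p = 1.0
    for var, parents in PARENTS.items():
        p_true = P_TRUE[var][tuple(world[q] for q in parents)]
        p *= p_true if world[var] else 1.0 - p_true
    return p


def query(goal, given=None):
    """p(goal | given); both arguments are dicts {variable: bool}."""
    given = given or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(PARENTS)):
        world = dict(zip(PARENTS, values))
        if all(world[v] == val for v, val in given.items()):
            weight = joint(world)
            denominator += weight
            if all(world[v] == val for v, val in goal.items()):
                numerator += weight
    return numerator / denominator


if __name__ == "__main__":
    print(query({"burglary": True}, {"alarm": True}))
    print(query({"burglary": True, "lightning": True}))
    print(query({"burglary": True}, {"alarm": True, "lightning": False}))
    print(query({"alarm": True, "call": False}, {"burglary": True}))
```

Enumeration is exponential in the number of variables, so it is only practical as a check on small examples.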

PROBABILISTIC REASONING IN BELIEF NETS


Easy in forward direction, from ancestors to
descendants, e.g.:

p( alarm | burglary ∧ lightning) = ?



In backward direction, from descendants to ancestors,
apply Bayes' formula



p( B | A) = p(B) * p(A | B) / p(A)
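
As a worked instance of the backward direction, the probabilities stated earlier for the burglary example give the probability of burglary once the sensor is known to be on. This is a sketch of the arithmetic only; the two forward quantities are obtained by summing over the states of lightning and burglary.

```latex
\begin{align*}
p(\mathit{sensor} \mid \mathit{burglary})
  &= 0.9 \times 0.02 + 0.9 \times 0.98 = 0.9 \\
p(\mathit{sensor})
  &= 0.001 \times 0.9 + 0.999 \times (0.1 \times 0.02 + 0.001 \times 0.98) \\
  &= 0.0009 + 0.00297702 = 0.00387702 \\
p(\mathit{burglary} \mid \mathit{sensor})
  &= \frac{p(\mathit{burglary})\; p(\mathit{sensor} \mid \mathit{burglary})}{p(\mathit{sensor})}
   = \frac{0.001 \times 0.9}{0.00387702} \approx 0.2321
\end{align*}
```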

BAYES' FORMULA



Basic form:

$$p(X \mid Y) = \frac{p(X)\, p(Y \mid X)}{p(Y)}$$

A variant of Bayes' formula to reason about the probability
of hypothesis H given evidence E in the presence of
background knowledge B:

$$p(H \mid E \wedge B) = \frac{p(H \mid B)\, p(E \mid H \wedge B)}{p(E \mid B)}$$



REASONING RULES

1. Probability of conjunction:


p( X1 ∧ X2 | Cond) = p( X1 | Cond) * p( X2 | X1 ∧ Cond)


2. Probability of a certain event:


p( X | Y1 ∧ ... ∧ X ∧ ...) = 1


3. Probability of impossible event:


p( X | Y1 ∧ ... ∧ ~X ∧ ...) = 0


4. Probability of negation:


p( ~X | Cond) = 1 - p( X | Cond)

5. If condition involves a descendant of X then use Bayes' theorem:


If Cond0 = Y ∧ Cond where Y is a descendant of X in belief net

then p(X|Cond0) = p(X|Cond) * p(Y|X ∧ Cond) / p(Y|Cond)


6. Cases when condition Cond does not involve a descendant of X:


(a) If X has no parents then p(X|Cond) = p(X), p(X) given



(b) If X has parents Parents then

$$p(X \mid Cond) = \sum_{S \,\in\, possible\_states(Parents)} p(X \mid S)\; p(S \mid Cond)$$

A SIMPLE IMPLEMENTATION IN PROLOG


In: I. Bratko, Prolog Programming for Artificial Intelligence,
Third edition, Pearson Education 2001 (Chapter 15)


An interaction with this program:

?- prob( burglary, [call], P).

P = 0.232137

Now we learn there was a heavy storm, so:

?- prob( burglary, [call, lightning], P).

P = 0.00892857


Lightning explains call, so burglary seems less likely.
However, if the weather was fine then burglary becomes
more likely:


?- prob( burglary, [call, not lightning], P).

P = 0.473934
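
The interaction above can be reproduced with a short Python sketch of the six reasoning rules. This is an illustrative port written for this text, not Bratko's Prolog program: the literal representation `(name, truth_value)`, the tables `PARENTS`, `PRIOR`, `CPT` and the helper `is_ancestor` are all assumptions made for the example.

```python
# Burglary network from the slides: structure and probabilities.
PARENTS = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}
PRIOR = {"burglary": 0.001, "lightning": 0.02}
# P(variable = true | parent values, in the order given in PARENTS).
CPT = {
    "sensor": {(True, True): 0.9, (True, False): 0.9,
               (False, True): 0.1, (False, False): 0.001},
    "alarm": {(True,): 0.95, (False,): 0.001},
    "call": {(True,): 0.9, (False,): 0.0},
}


def is_ancestor(x, y):
    """True if variable x is a proper ancestor of variable y."""
    return any(p == x or is_ancestor(x, p) for p in PARENTS[y])


def prob(goal, cond=()):
    """p(goal | cond). A literal is (name, truth_value); a list of
    literals is a conjunction; cond is a sequence of literals."""
    cond = tuple(cond)
    if isinstance(goal, list):                    # rule 1: conjunction
        if not goal:
            return 1.0
        return prob(goal[0], cond) * prob(goal[1:], (goal[0],) + cond)
    var, value = goal
    if goal in cond:                              # rule 2: certain event
        return 1.0
    if (var, not value) in cond:                  # rule 3: impossible event
        return 0.0
    if not value:                                 # rule 4: negation
        return 1.0 - prob((var, True), cond)
    for i, y in enumerate(cond):                  # rule 5: Bayes' theorem
        if is_ancestor(var, y[0]):
            rest = cond[:i] + cond[i + 1:]
            return prob(goal, rest) * prob(y, (goal,) + rest) / prob(y, rest)
    if not PARENTS[var]:                          # rule 6(a): no parents
        return PRIOR[var]
    return sum(p * prob(list(zip(PARENTS[var], states)), cond)  # rule 6(b)
               for states, p in CPT[var].items())


if __name__ == "__main__":
    print(prob(("burglary", True), [("call", True)]))
    print(prob(("burglary", True), [("call", True), ("lightning", True)]))
    print(prob(("burglary", True), [("call", True), ("lightning", False)]))
```

With the probabilities from the slides, the three calls should agree with the values 0.232137, 0.00892857 and 0.473934 shown above, up to rounding. The sketch divides by the probability of the evidence in rule 5, so evidence that has probability zero raises a `ZeroDivisionError`.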



COMMENTS




In the worst case, the complexity of reasoning in belief
networks grows exponentially with the number of nodes.



Large networks require substantial algorithmic improvements
for efficient reasoning.

d-SEPARATION


Follows from basic independence assumption of Bayes
networks


d-separation = direction-dependent separation


Let E = set of “evidence nodes” (subset of variables in
Bayes network)


Let Vi, Vj be two variables in the network

d-SEPARATION


Nodes Vi and Vj are conditionally independent given set
E if E d-separates Vi and Vj


E d-separates Vi, Vj if all (undirected) paths (Vi, Vj) are
“blocked” by E


If E d-separates Vi, Vj, then Vi and Vj are conditionally
independent, given E


We write I(Vi, Vj | E)


This means: p(Vi, Vj | E) = p(Vi | E) * p(Vj | E)


BLOCKING A PATH

A path between Vi and Vj is blocked by nodes E if there is a
“blocking node” Vb on the path. Vb blocks the path if one of
the following holds:

1. Vb in E and both arcs on path lead out of Vb, or

2. Vb in E and one arc on path leads into Vb and one out, or

3. neither Vb nor any descendant of Vb is in E, and both arcs
on path lead into Vb

CONDITION 1

Vb is a common cause:

Vi ← Vb → Vj

CONDITION 2


Vb is a “closer, more direct cause” of Vj than Vi is:

Vi → Vb → Vj

CONDITION 3


Vb is a common consequence of Vi and Vj, and neither Vb nor any
descendant of Vb is in E:

Vi → Vb ← Vj   (Vb not in E)

Vb → Vd   (Vd not in E)
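
The three blocking conditions can be turned into a direct check. The Python sketch below enumerates all undirected paths between two nodes of the burglary network and tests each interior node against the conditions; the function names and the `PARENTS` table are assumptions made for illustration, and the path enumeration is only suitable for small networks.

```python
# Structure of the burglary network used earlier in the slides.
PARENTS = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}


def children(node):
    return [c for c, ps in PARENTS.items() if node in ps]


def descendants(node):
    found, stack = set(), children(node)
    while stack:
        n = stack.pop()
        if n not in found:
            found.add(n)
            stack.extend(children(n))
    return found


def undirected_paths(start, goal, path=None):
    """All simple paths from start to goal, ignoring arc direction."""
    path = path or [start]
    if start == goal:
        yield path
        return
    for nxt in PARENTS[start] + children(start):
        if nxt not in path:
            yield from undirected_paths(nxt, goal, path + [nxt])


def is_blocked(path, evidence):
    """True if some interior node Vb of the path blocks it, given evidence."""
    for prev, vb, nxt in zip(path, path[1:], path[2:]):
        arcs_into_vb = (prev in PARENTS[vb]) + (nxt in PARENTS[vb])
        if arcs_into_vb == 2:
            # Condition 3: common consequence, unobserved, as are its descendants.
            if vb not in evidence and not (descendants(vb) & evidence):
                return True
        elif vb in evidence:
            # Condition 1 (both arcs lead out) or condition 2 (one in, one out).
            return True
    return False


def d_separated(vi, vj, evidence=()):
    evidence = set(evidence)
    return all(is_blocked(p, evidence) for p in undirected_paths(vi, vj))


if __name__ == "__main__":
    print(d_separated("burglary", "lightning"))            # no evidence
    print(d_separated("burglary", "lightning", {"call"}))  # call observed
    print(d_separated("alarm", "call", {"sensor"}))        # sensor observed
```

In this network, burglary and lightning are d-separated when nothing is observed, but observing call (a descendant of their common consequence, sensor) connects them. This matches the earlier reasoning example, where learning about lightning changes the belief in burglary once the call has been received.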