School of Computer Science and Informatics
Cardiff University

Artificial Intelligence
IV. Uncertain Knowledge and Reasoning
3. Bayesian Networks

F.C. Langbein

1.4

Overview

Bayesian networks

Syntax

Global semantics

Local semantics

Markov blanket

Constructing Bayesian networks

F.C. Langbein, Artificial Intelligence – IV. Uncertain Knowledge and Reasoning; 3. Bayesian Networks

Bayesian Networks

A simple, graphical notation for conditional independence assertions

Compact specification of full joint distributions

Syntax

a set of nodes, one per variable X_l

a directed, acyclic graph (link ≈ “directly influences”)

a conditional probability distribution (CPD) for each node given its parents: P(X_l | Parents(X_l))

Simplest case: CPD is a conditional probability table (CPT), giving a distribution over X_l for each combination of parent values
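As a minimal sketch, a CPT can be stored as a map from parent-value combinations to P(X = true | parents). The node and the numbers below are illustrative (the classic alarm example), not fixed by the slide:

```python
# P(Alarm = true | Burglary, Earthquake), one entry per combination
# of parent values.  Numbers are illustrative textbook values.
cpt_alarm = {
    (True,  True):  0.95,
    (True,  False): 0.94,
    (False, True):  0.29,
    (False, False): 0.001,
}

def p_alarm(a, b, e):
    """P(Alarm = a | Burglary = b, Earthquake = e) from the CPT."""
    p_true = cpt_alarm[(b, e)]
    return p_true if a else 1 - p_true

print(round(p_alarm(False, True, False), 2))  # 0.06
```

For a Boolean variable, only P(X = true | parents) needs storing; the false case follows by complement, which is why a node with k Boolean parents costs 2^k numbers.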


Dentistry Example

Topology of network encodes conditional independence assertions

Weather is independent of the other variables

Toothache and Catch are conditionally independent given Cavity


Burglar Alarm Example

I’m at work. Is there a burglary at home?

Neighbour John calls to say my alarm is ringing.
Neighbour Mary does not call.
Sometimes the alarm is set off by minor earthquakes.

Boolean variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls

Construct network to reflect causal knowledge

A burglar may set the alarm off

An earthquake may set the alarm off

The alarm may cause Mary to call

The alarm may cause John to call


Burglar Alarm Example

Less space: max. k parents ⇒ O(n d^k) numbers vs. O(d^n)

Faster to answer queries

Simpler to find CPTs
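The saving can be checked directly for the five-Boolean-variable alarm network (d = 2, at most k = 2 parents), using the parent counts from the slides:

```python
# Parameter counts: a Boolean node with k parents needs 2^k numbers
# (one per parent combination), while the full joint over n Boolean
# variables needs 2^n - 1 independent numbers.
num_parents = {"Burglary": 0, "Earthquake": 0, "Alarm": 2,
               "JohnCalls": 1, "MaryCalls": 1}

network_numbers = sum(2**k for k in num_parents.values())  # 1+1+4+2+2
full_joint_numbers = 2**len(num_parents) - 1               # 2^5 - 1

print(network_numbers, full_joint_numbers)  # 10 31
```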


Global Semantics

Global semantics defines the full joint distribution as the product of the local conditional distributions:

P(X_1, …, X_n) = ∏_{l=1}^{n} P(X_l | Parents(X_l))

Combines chain rule and independence

Examples

P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(¬b) P(¬e) P(a | ¬b ∧ ¬e) P(j | a) P(m | a)

P(J | B, E) = α Σ_a Σ_m P(J, B, E, a, m)
            = α Σ_a Σ_m P(J | a) P(B) P(E) P(a | B, E) P(m | a)

P(B | J) = α Σ_e Σ_a Σ_m P(B, J, e, a, m)
         = α Σ_e Σ_a Σ_m P(B) P(J | a) P(e) P(a | B, e) P(m | a)
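The first example, P(j ∧ m ∧ a ∧ ¬b ∧ ¬e), can be evaluated as a plain product of local conditionals. The CPT values below are the classic illustrative textbook numbers, assumed here since the slide gives none:

```python
# Global semantics: the joint is the product of each node's
# conditional given its parents.  CPT values are illustrative.
P_b = 0.001            # P(Burglary)
P_e = 0.002            # P(Earthquake)
P_a = {(True, True): 0.95, (True, False): 0.94,   # P(Alarm | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_j_given_a = {True: 0.90, False: 0.05}           # P(JohnCalls | Alarm)
P_m_given_a = {True: 0.70, False: 0.01}           # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as a product of the local conditionals."""
    pb = P_b if b else 1 - P_b
    pe = P_e if e else 1 - P_e
    pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
    pj = P_j_given_a[a] if j else 1 - P_j_given_a[a]
    pm = P_m_given_a[a] if m else 1 - P_m_given_a[a]
    return pb * pe * pa * pj * pm

# P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(¬b) P(¬e) P(a|¬b,¬e) P(j|a) P(m|a)
p = joint(False, False, True, True, True)
print(round(p, 6))  # 0.000628
```

Queries such as P(B | J) then follow by summing this product over the hidden variables and normalising, exactly as in the α-expressions above.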


Local Semantics

Local semantics: each node is conditionally independent of its non-descendants given its parents

Theorem: local semantics ⇔ global semantics
(Proof: apply chain rule with an ordering of parents before children)


Markov Blanket

Each node is conditionally independent of all others given its Markov blanket

Markov blanket = parents + children + children’s parents
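The definition reads off directly from a DAG stored as a parents dictionary; a small sketch (the encoding and helper are illustrative, using the alarm network’s structure from the slides):

```python
# Markov blanket = parents + children + children's other parents.
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

def markov_blanket(node):
    children = [c for c, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for c in children:                 # children's other parents
        blanket |= set(parents[c])
    blanket.discard(node)              # the node itself is excluded
    return blanket

print(sorted(markov_blanket("Alarm")))
# ['Burglary', 'Earthquake', 'JohnCalls', 'MaryCalls']
```

For Burglary the blanket is {Alarm, Earthquake}: Earthquake enters not as a parent or child but as the other parent of the child Alarm.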


Constructing Bayesian Networks

Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics

CHOOSE an ordering of variables X_1, …, X_n
For l ← 1 to n
    ADD X_l to the network
    SELECT parents from X_1, …, X_{l−1} such that
        P(X_l | Parents(X_l)) = P(X_l | X_1, …, X_{l−1})

This choice of parents guarantees the global semantics:

P(X_1, …, X_n) = ∏_{l=1}^{n} P(X_l | X_1, …, X_{l−1})   (chain rule)
               = ∏_{l=1}^{n} P(X_l | Parents(X_l))       (by construction)
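The SELECT step can be sketched by brute force: search for a smallest parent set among the predecessors such that conditioning on it matches conditioning on all predecessors. Here the full joint (built from the classic illustrative CPT values, which the slide does not give) serves as the conditional-independence oracle:

```python
from itertools import product, combinations

# Exact full joint for the alarm network (illustrative CPT values),
# used as an oracle for the SELECT step below.
P_b, P_e = 0.001, 0.002
P_a = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}
P_j = {1: 0.90, 0: 0.05}
P_m = {1: 0.70, 0: 0.01}

VARS = ["B", "E", "A", "J", "M"]
joint = {}
for b, e, a, j, m in product([0, 1], repeat=5):
    p = (P_b if b else 1 - P_b) * (P_e if e else 1 - P_e)
    p *= P_a[(b, e)] if a else 1 - P_a[(b, e)]
    p *= P_j[a] if j else 1 - P_j[a]
    p *= P_m[a] if m else 1 - P_m[a]
    joint[(b, e, a, j, m)] = p

def cond(var, val, given):
    """P(var = val | given) computed from the full joint."""
    i = VARS.index(var)
    num = sum(p for w, p in joint.items() if w[i] == val
              and all(w[VARS.index(g)] == v for g, v in given.items()))
    den = sum(p for w, p in joint.items()
              if all(w[VARS.index(g)] == v for g, v in given.items()))
    return num / den if den else 0.0

def select_parents(x, predecessors):
    """Smallest S ⊆ predecessors with P(x|S) = P(x|predecessors)."""
    for k in range(len(predecessors) + 1):
        for S in combinations(predecessors, k):
            if all(abs(cond(x, 1, dict(zip(predecessors, vals)))
                       - cond(x, 1, {v: dict(zip(predecessors, vals))[v]
                                     for v in S})) < 1e-9
                   for vals in product([0, 1], repeat=len(predecessors))):
                return set(S)

net = {}
order = ["B", "E", "A", "J", "M"]     # causal ordering
for i, x in enumerate(order):
    net[x] = select_parents(x, order[:i])
print(net)  # B, E parentless; A gets {B, E}; J and M get {A}
```

With the causal ordering B, E, A, J, M this recovers exactly the original structure; a non-causal ordering such as M, J, A, B, E yields the larger network discussed on the next slides.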



Burglar Alarm Example

Suppose we choose the ordering M, J, A, B, E

P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes


Burglar Alarm Example

For non-causal (∼ diagnostic) directions

Deciding conditional independence is hard
(causal models and conditional independence seem hardwired for humans)

Assessing conditional probabilities is hard

Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers
instead of the 1 + 1 + 4 + 2 + 2 = 10 numbers needed


Car Diagnosis Example

Initial evidence: car does not start

Testable variables (green); “broken, so fix it” variables (orange)

Hidden variables (grey) ensure sparse structure, reduce parameters


Car Insurance Example

Note, arcs do not deny independence

However, absence of an arc asserts independence

