# Bayes Rule and Bayesian Networks - iitk.ac.in

Artificial Intelligence and Robotics

7 Nov 2013

Presentation by: Ravikiran Gunale (Y7159)

BAYES RULE AND BAYESIAN NETWORKS

Bayes Rule:

If the conditional probability of event B given event A is known, how do we determine the conditional probability of event A given event B?

P(A|B) = P(B|A) P(A) / P(B)
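The rule above is a one-line computation. A minimal sketch, using made-up numbers for a hypothetical disease/test scenario (the events and values are illustrative, not from the slides):

```python
# Bayes rule: P(A|B) = P(B|A) * P(A) / P(B).
# Hypothetical scenario: A = "disease", B = "positive test".

def bayes_rule(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A) and P(B)."""
    return p_b_given_a * p_a / p_b

p_a = 0.01              # prior P(disease) -- illustrative value
p_b_given_a = 0.95      # P(positive | disease)
p_b_given_not_a = 0.05  # P(positive | no disease)

# Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes_rule(p_b_given_a, p_a, p_b)
print(posterior)  # the updated belief P(disease | positive)
```

Note how a rare prior keeps the posterior small even with an accurate test: this is exactly the "change your beliefs in the light of new evidence" behaviour described next.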

Provides a mathematical rule explaining how you should change your existing beliefs in the light of new evidence.

Propositional logic is fully formalized, but a common situation in human reasoning is to perform inference from incomplete and uncertain knowledge. Propositional logic fails in this situation. Hence: Bayes Rule and Bayesian Networks.

Bayesian Networks:

A probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Car example (directed graph) with nodes: Fuel, Clean spark plug, Fuel meter standing, Start.

G = (V, E) is a directed acyclic graph.

X is a Bayesian network with respect to G if its joint probability density function can be written as a product of the individual density functions, conditional on their parent variables:

P(x) = Π P(x_v | parents(x_v)), v belongs to V.

X is a Bayesian network with respect to G if it satisfies the local Markov property: each variable is conditionally independent of its non-descendants given its parent variables.

An Example

Nodes: Family out (fo), Unhealthy (U), Lights on (lo), Dog out (do), Hear bark (H).

P(fo) = 0.15
P(U) = 0.1
P(do | fo, U) = 0.99
P(do | fo, not U) = 0.9
P(do | not fo, U) = 0.97
P(do | not fo, not U) = 0.3
P(lo | fo) = 0.6
P(lo | not fo) = 0.15
P(H | do) = 0.7
P(H | not do) = 0.01
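With these tables, the joint probability of any full configuration is just the product of the local conditionals, following the factorization given earlier. A minimal sketch over the slide's numbers (the variable encoding is an assumption made for illustration):

```python
# CPTs from the family-out example; each table gives P(node = True | parents).
p_fo = 0.15
p_u = 0.10
p_do = {(True, True): 0.99, (True, False): 0.90,
        (False, True): 0.97, (False, False): 0.30}  # keyed by (fo, U)
p_lo = {True: 0.60, False: 0.15}                    # keyed by fo
p_h  = {True: 0.70, False: 0.01}                    # keyed by do

def joint(fo, u, do, lo, h):
    """P(fo, u, do, lo, h) = P(fo) P(u) P(do|fo,u) P(lo|fo) P(h|do)."""
    p  = p_fo if fo else 1 - p_fo
    p *= p_u if u else 1 - p_u
    p *= p_do[(fo, u)] if do else 1 - p_do[(fo, u)]
    p *= p_lo[fo] if lo else 1 - p_lo[fo]
    p *= p_h[do] if h else 1 - p_h[do]
    return p

# e.g. family out, healthy, dog out, lights on, bark heard:
print(joint(True, False, True, True, True))  # 0.15*0.9*0.9*0.6*0.7
```

Summing `joint` over all 32 configurations gives 1, which is a quick sanity check that the tables define a proper distribution.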

Independence assumption:

If there are n random variables, the complete distribution is specified by 2^n - 1 joint probabilities. For n = 5, 2^n - 1 = 31, but in the example above we needed only 10 values. If n = 10, we need 21 values.

Where is this saving coming from? Bayesian networks have built-in independence assumptions, as in the family-out and hear-bark example.
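The count of 10 versus 31 can be verified directly: each binary node with k parents needs 2^k conditional probabilities. A minimal sketch for the family-out structure:

```python
# Parent sets of the family-out network (all variables binary).
parents = {"fo": [], "U": [], "do": ["fo", "U"], "lo": ["fo"], "H": ["do"]}

n = len(parents)
full_joint = 2**n - 1                               # 31 numbers for n = 5
network = sum(2**len(p) for p in parents.values())  # 1 + 1 + 4 + 2 + 2 = 10

print(full_joint, network)  # 31 10
```

The saving grows quickly with n as long as each node keeps few parents, since the network cost is a sum of small powers of two rather than a single 2^n term.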

d-separation:

Let P be a trail (path) from u to v. Then P is said to be d-separated by a set of nodes Z iff one of the following holds:

1) P contains a chain, i → m → j (or i ← m ← j), such that the middle node m is in Z,

2) P contains a fork, i ← m → j, such that the middle node m is in Z, or

3) P contains an inverted fork (collider), i → m ← j, such that the middle node m is not in Z and no descendant of m is in Z.

If u and v are not d-separated, then they are d-connected.
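The three cases above can be sketched as a small trail-blocking test on the family-out network. This is a minimal illustration, not a full d-separation algorithm (a complete test must examine every trail between u and v); the `parents` map and helper names are assumptions for this sketch:

```python
# Parent sets of the family-out network; arrow directions along a trail
# are read off from this map (i -> m iff i is a parent of m).
parents = {"fo": set(), "U": set(), "do": {"fo", "U"},
           "lo": {"fo"}, "H": {"do"}}

def descendants(m):
    """All nodes reachable from m by following child edges."""
    kids = {c for c, ps in parents.items() if m in ps}
    out = set(kids)
    for k in kids:
        out |= descendants(k)
    return out

def blocked(trail, Z):
    """True if the trail (a list of nodes) is d-separated by the set Z."""
    for i, m, j in zip(trail, trail[1:], trail[2:]):
        if i in parents[m] and j in parents[m]:
            # inverted fork i -> m <- j: blocks unless m or a
            # descendant of m is observed
            if m not in Z and not (descendants(m) & Z):
                return True
        elif m in Z:
            # chain or fork through m: blocks when m is observed
            return True
    return False

print(blocked(["lo", "fo", "do"], {"fo"}))  # fork at fo, fo observed: True
print(blocked(["fo", "do", "U"], set()))    # collider at do, unobserved: True
```

Note the asymmetry the definition encodes: observing the middle of a chain or fork blocks the trail, while observing a collider (or its descendant, such as H below do) opens it.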

Consistent probabilities:

Consider a system in which we have P(A|B) = 0.7, P(B|A) = 0.3, P(B) = 0.5.

The above values are inconsistent: they imply P(A) = P(A|B) P(B) / P(B|A) = 0.35 / 0.3 ≈ 1.17, which exceeds 1.

The following property of Bayesian networks comes to the rescue. If you specify the probabilities of all the nodes given all parent combinations:

1) The numbers will be consistent.

2) The network will uniquely define a distribution.
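The inconsistency check above is a two-line computation, shown here as a sketch:

```python
# P(A,B) = P(A|B) P(B) = P(B|A) P(A)  =>  P(A) = P(A|B) P(B) / P(B|A).
p_a_given_b, p_b_given_a, p_b = 0.7, 0.3, 0.5

p_a = p_a_given_b * p_b / p_b_given_a
print(p_a)  # about 1.17 -- not a valid probability, so the inputs conflict
```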

Inference and learning:

Parameter learning:

Case 1: Complete Data

Each data case is a configuration over all the variables in the network.

To ensure that the parameters are learned independently, we make two assumptions:

1. Global independence

2. Local independence

Maximum likelihood estimation:

The likelihood of M given D is

L(M|D) = Π P(d|M), d belongs to D,

where M is the network and D the set of cases. Then we choose the parameter that maximizes the likelihood:

a' = arg max_a L(M_a|D).
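Under the global and local independence assumptions, maximizing this likelihood reduces each CPT entry to a relative frequency in the matching cases. A minimal sketch for one entry of the family-out network; the five data cases are made up for illustration, not from the slides:

```python
# Hypothetical complete-data cases (each a full configuration of fo, U, do).
cases = [
    {"fo": 1, "U": 1, "do": 1},
    {"fo": 1, "U": 1, "do": 1},
    {"fo": 1, "U": 1, "do": 0},
    {"fo": 1, "U": 0, "do": 1},
    {"fo": 0, "U": 1, "do": 1},
]

# ML estimate of P(do | fo, U): count "do" among cases with fo = 1, U = 1.
match = [c for c in cases if c["fo"] == 1 and c["U"] == 1]
p_do_given_fo_u = sum(c["do"] for c in match) / len(match)

print(p_do_given_fo_u)  # 2 of the 3 matching cases have do = 1
```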

Bayesian estimation:

MLE has drawbacks when used on a sparse database. In Bayesian estimation we instead use the data to update a prior distribution over the parameters.
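For a binary parameter, one standard way to carry out such an update is a Beta prior with counts, shown here as a sketch (the prior values and counts are illustrative assumptions):

```python
# Beta(a, b) prior over a binary parameter, updated with observed counts.
a, b = 1.0, 1.0  # uniform prior -- an illustrative choice
k, n = 2, 3      # observed: 2 successes in 3 cases

posterior_mean = (a + k) / (a + b + n)  # Beta posterior mean
mle = k / n                             # ML estimate for comparison

print(mle, posterior_mean)  # 0.666... vs 0.6
```

On sparse data the posterior mean is pulled toward the prior instead of committing to extreme frequencies, which is exactly the drawback of MLE this addresses.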

Incomplete Data:

Some values may be missing or intentionally removed, and in the extreme case some variables may simply not be observable. Approximate techniques are used for parameter estimation.

Structure Learning:

In the simple case, a Bayesian network can be specified by an expert and used for inference. In other applications, the network structure and parameters must be learned from data.

X → Y → Z (type 1) and X ← Y → Z (type 2) represent the same dependency; X → Y ← Z (type 3) can be uniquely identified.

First the skeleton of the underlying graph is systematically determined; then all arrows whose directionality is dictated by the observed conditional independencies are oriented.

Applications:

The most common application is medical diagnosis, e.g. PATHFINDER, a program to diagnose diseases of the lymph node.

Others: modeling knowledge in computational biology, decision support systems, bioinformatics, image processing, etc.