# class15


Bayesian Inference

Artificial Intelligence

CMSC 25000

February 26, 2002

Agenda

Motivation

Reasoning with uncertainty

Medical Informatics

Probability and Bayes’ Rule

Bayesian Networks

Noisy-OR

Decision Trees and Rationality

Conclusions

Motivation

Uncertainty in medical diagnosis

Diseases produce symptoms

In diagnosis, observed symptoms => disease ID

Uncertainties

Symptoms may not occur

Symptoms may not be reported

Diagnostic tests not perfect

False positive, false negative

How do we estimate confidence?

Motivation II

Uncertainty in medical decision-making

Physicians, patients must decide on treatments

Treatments may not be successful

Treatments may have unpleasant side effects

Choosing treatments

People are BAD at reasoning intuitively

Provide systematic analysis

Probabilities Model Uncertainty

The World

Features: random variables {X_1, X_2, ..., X_n}

Feature values: X_i takes one of {x_i1, x_i2, ..., x_ik_i}

States of the world: assignments of values to variables

Exponential in # of variables: prod_{i=1..n} k_i possible states; >= 2^n when every k_i >= 2

Probabilities of World States

P(S_j): joint probability of an assignment (state)

States are distinct and exhaustive, so the probabilities sum to one:

sum_{j=1..N} P(S_j) = 1, where N = prod_{i=1..n} k_i

Typically care about a SUBSET of assignments

aka a “circumstance”, e.g. with four binary variables:

P(X_2=t, X_4=f) = sum_{u in {t,f}} sum_{v in {t,f}} P({X_1=u, X_2=t, X_3=v, X_4=f})

Exponential in # of don’t cares
A Simpler World

2^n world states = Maximum entropy

Many variables independent

P(strep,ebola) = P(strep)P(ebola)

Conditionally independent

Depend on same factors but not on each other

P(fever,cough|flu) = P(fever|flu)P(cough|flu)
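As a small sketch of how conditional independence shrinks the model: given the disease, the joint over symptoms factors into one small table per symptom. The numbers below (P(fever|flu)=0.8, P(cough|flu)=0.7) are illustrative assumptions, not values from the lecture.

```python
vals = [True, False]
p_fever = {True: 0.80, False: 0.20}   # P(fever=v | flu), assumed
p_cough = {True: 0.70, False: 0.30}   # P(cough=v | flu), assumed

# Conditional independence: P(fever, cough | flu) = P(fever|flu) P(cough|flu)
joint_given_flu = {(f, c): p_fever[f] * p_cough[c]
                   for f in vals for c in vals}

# The factored joint is still a proper distribution over the 4 states.
assert abs(sum(joint_given_flu.values()) - 1.0) < 1e-9
```

Two 2-entry tables replace one 4-entry table here; with many symptoms the savings are exponential.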

Probabilistic Diagnosis

Question:

How likely is a patient to have a disease if they have
the symptoms?

Probabilistic Model: Bayes’ Rule

P(D|S) = P(S|D)P(D)/P(S)

Where

P(S|D) : Probability of symptom given disease

P(D): Prior probability of having disease

P(S): Prior probability of having symptom
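A minimal numeric sketch of the rule. The disease and symptom probabilities below are assumed for illustration, and P(S) is expanded by total probability, a step the slide leaves implicit.

```python
# Bayes' rule for diagnosis: P(D|S) = P(S|D) P(D) / P(S).
# Illustrative assumed numbers:
p_d = 0.01              # P(D): prior probability of the disease
p_s_given_d = 0.90      # P(S|D): symptom probability given disease
p_s_given_not_d = 0.05  # symptom probability without the disease

# P(S) via total probability: P(S) = P(S|D)P(D) + P(S|~D)P(~D)
p_s = p_s_given_d * p_d + p_s_given_not_d * (1 - p_d)
p_d_given_s = p_s_given_d * p_d / p_s
print(p_d_given_s)   # posterior: well above the 1% prior, but far from certain
```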

Modeling (In)dependence

Bayesian network

Nodes = Variables

Arcs = Child depends on parent(s)

No arcs = independent (0 incoming: only a priori)

Parents of X = pi(X)

For each X need P(X | pi(X))

Simple Bayesian Network

MCBN1

[Figure: network with nodes A, B, C, D, E and arcs A→B, A→C, B→D, C→D, C→E]

A = only a priori

B depends on A

C depends on A

D depends on B,C

E depends on C

Need (with truth table sizes):

P(A): 2

P(B|A): 2*2

P(C|A): 2*2

P(D|B,C): 2*2*2

P(E|C): 2*2
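The factorization above can be sketched directly: the joint is the product of the five local tables. The network structure matches the slide; the probability values are illustrative assumptions.

```python
import itertools

# CPTs for MCBN1 with binary variables (values assumed for illustration).
p_a = 0.3                                   # P(A=t)
p_b_given_a = {True: 0.8, False: 0.1}       # P(B=t | A)
p_c_given_a = {True: 0.6, False: 0.2}       # P(C=t | A)
p_d_given_bc = {(True, True): 0.9, (True, False): 0.5,
                (False, True): 0.4, (False, False): 0.05}
p_e_given_c = {True: 0.7, False: 0.1}       # P(E=t | C)

def p(val, p_true):
    return p_true if val else 1.0 - p_true

def joint(a, b, c, d, e):
    # Chain rule along the network:
    # P(A,B,C,D,E) = P(A) P(B|A) P(C|A) P(D|B,C) P(E|C)
    return (p(a, p_a) * p(b, p_b_given_a[a]) * p(c, p_c_given_a[a])
            * p(d, p_d_given_bc[(b, c)]) * p(e, p_e_given_c[c]))

total = sum(joint(*v) for v in itertools.product([True, False], repeat=5))
assert abs(total - 1.0) < 1e-9   # the factored joint is a valid distribution
```

The five tables hold 2 + 4 + 4 + 8 + 4 = 22 entries (the sizes listed above), versus 2^5 = 32 for an explicit joint table.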

Simplifying with Noisy-OR

How many computations?

p = # parents; k = # values for variable

(k-1)k^p

Very expensive! 10 binary parents=2^10=1024

Reduce computation by simplifying model

Treat each parent as possible independent cause

Only 11 computations

10 causal probabilities + “leak” probability

“Some other cause”

Noisy-OR Example

[Network: A → B]

Pn(b|a) = 1 - (1-ca)(1-l)

Pn(~b|a) = (1-ca)(1-l)

Pn(b|~a) = 1 - (1-l) = l = 0.5

Pn(~b|~a) = (1-l)

P(B|A):

      b     ~b
a     0.6   0.4
~a    0.5   0.5

Pn(b|a) = 1 - (1-ca)(1-l) = 0.6

(1-ca)(1-l) = 0.4

(1-ca) = 0.4/(1-l) = 0.4/0.5 = 0.8

ca = 0.2
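The derivation can be checked mechanically: plugging the recovered ca = 0.2 and the leak l = 0.5 back into the noisy-OR formulas reproduces the table.

```python
# Noisy-OR with a single cause A plus a leak l:
#   Pn(b|a)  = 1 - (1 - ca)(1 - l)
#   Pn(b|~a) = 1 - (1 - l) = l
ca, l = 0.2, 0.5   # causal strength and leak from the slide

p_b_given_a = 1 - (1 - ca) * (1 - l)
p_b_given_not_a = 1 - (1 - l)

assert abs(p_b_given_a - 0.6) < 1e-9       # matches the table row for a
assert abs(p_b_given_not_a - 0.5) < 1e-9   # matches the table row for ~a
```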

Noisy-OR Example II

[Network: A → C ← B]

Full model: P(c|ab), P(c|a~b), P(c|~ab), P(c|~a~b) & negations

Noisy-OR: ca, cb, l

Pn(c|ab) = 1 - (1-ca)(1-cb)(1-l)

Pn(c|~ab) = 1 - (1-cb)(1-l)

Pn(c|a~b) = 1 - (1-ca)(1-l)

Pn(c|~a~b) = 1 - (1-l) = l

Assume:

P(a)=0.1

P(b)=0.05

P(c|~a~b) = l = 0.3

ca = 0.5

P(c|b) = 0.7

Pn(c|b) = Pn(c|ab)P(a) + Pn(c|~ab)P(~a)

1 - 0.7 = (1-ca)(1-cb)(1-l)(0.1) + (1-cb)(1-l)(0.9)

0.3 = 0.5(1-cb)(0.07) + (1-cb)(0.7)(0.9)

    = 0.035(1-cb) + 0.63(1-cb) = 0.665(1-cb)

cb ≈ 0.55
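The algebra above can be replayed numerically; this sketch repeats the slide's derivation to recover cb from the observed P(c|b).

```python
# Given: P(a)=0.1, leak l=0.3, ca=0.5, and observed P(c|b)=0.7.
# From Pn(c|b) = Pn(c|ab)P(a) + Pn(c|~ab)P(~a), factoring out (1-cb):
#   1 - P(c|b) = (1-cb) * [ (1-ca)(1-l)P(a) + (1-l)P(~a) ]
p_a, l, ca = 0.1, 0.3, 0.5
p_c_given_b = 0.7

coeff = (1 - ca) * (1 - l) * p_a + (1 - l) * (1 - p_a)   # 0.035 + 0.63
cb = 1 - (1 - p_c_given_b) / coeff
print(round(cb, 2))   # 0.55, as on the slide
```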

Graph Models

Bipartite graphs

E.g. medical reasoning

Generally, diseases cause symptoms (not the reverse)

[Figure: bipartite network with diseases d1-d4 in the top layer, symptoms s1-s6 in the bottom layer, arcs from diseases to symptoms]

Topologies

Generally more complex

Polytree: One path between any two nodes

General Bayes Nets

Graphs with undirected cycles

No directed cycles: a variable can’t be its own cause

Issue: Automatic net acquisition

Update probabilities by observing data

Learn topology: use statistical evidence of independence plus
heuristic search to find the most probable structure

Decision Making

Design model of rational decision making

Maximize expected value among alternatives

Uncertainty from

Outcomes of actions

Choices taken

To maximize outcome

Select maximum over choices

Weighted average value of chance outcomes

Gangrene Example

Decision 1:

  Amputate foot:
    Live 0.99 → 850
    Die 0.01 → 0

  Medicine:
    Die 0.05 → 0
    Full recovery 0.7 → 1000
    Worse 0.25 → Decision 2

Decision 2:

  Amputate leg:
    Live 0.98 → 700
    Die 0.02 → 0

  Medicine:
    Live 0.6 → 995
    Die 0.4 → 0
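The rollback computation for this tree can be sketched directly: chance nodes take the probability-weighted average, decision nodes take the max (probabilities and values from the slide).

```python
# Fold back the gangrene decision tree from the leaves.
def chance(outcomes):
    """Expected value of a chance node: sum of probability * value."""
    return sum(p * v for p, v in outcomes)

# Second decision, reached only if the patient gets worse on medicine:
amputate_leg = chance([(0.98, 700), (0.02, 0)])   # approx. 686
medicine_2 = chance([(0.6, 995), (0.4, 0)])       # approx. 597
worse = max(amputate_leg, medicine_2)             # amputate leg wins

# First decision:
amputate_foot = chance([(0.99, 850), (0.01, 0)])              # approx. 841.5
medicine_1 = chance([(0.05, 0), (0.7, 1000), (0.25, worse)])  # approx. 871.5
best = max(amputate_foot, medicine_1)
print(best)   # medicine first is the rational choice
```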

Decision Tree Issues

Problem 1: Tree size

k activities : 2^k orders

Solution 1: Hill-climbing

Choose best apparent choice after one step

Use entropy reduction

Problem 2: Utility values

Difficult to estimate (sensitivity, duration)

Values change depending on the phrasing of the question

Solution 2c: Model effect of outcome over lifetime

Conclusion

Reasoning with uncertainty

Many real systems uncertain, e.g. medical diagnosis

Bayes’ Nets

Model (in)dependence relations in reasoning

Noisy-OR simplifies model/computation

Assumes causes independent

Decision Trees

Model rational decision making

Maximize outcome: Max choice, average outcomes

Holmes Example (Pearl)

Holmes is worried that his house will be burgled. For the time period of interest, there is a 10^-4 a priori chance of this happening, and Holmes has installed a burglar alarm to try to forestall this event. The alarm is 95% reliable in sounding when a burglary happens, but also has a false positive rate of 1%. Holmes’ neighbor, Watson, is 90% sure to call Holmes at his office if the alarm sounds, but he is also a bit of a practical joker and, knowing Holmes’ concern, might (30%) call even if the alarm is silent. Holmes’ other neighbor Mrs. Gibbons is a well-known lush and often befuddled, but Holmes believes that she is four times more likely to call him if there is an alarm than not.

Holmes Example: Model

There are four binary random variables:

B: whether Holmes’ house has been burgled

A: whether his alarm sounded

W: whether Watson called

G: whether Gibbons called

[Figure: network with arcs B→A, A→W, A→G]

Holmes Example: Tables

P(B):

B=#t     B=#f
0.0001   0.9999

P(A|B):

B     A=#t   A=#f
#t    0.95   0.05
#f    0.01   0.99

P(W|A):

A     W=#t   W=#f
#t    0.90   0.10
#f    0.30   0.70

P(G|A):

A     G=#t   G=#f
#t    0.40   0.60
#f    0.10   0.90
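With the tables in place, any query can be answered by brute-force enumeration over the joint. As a sketch, this computes P(B=#t | W=#t), i.e. how worried Holmes should be after only Watson's call.

```python
# CPTs copied from the tables above (each entry is P(var = #t | parent)).
p_b_true = 0.0001
p_a_given_b = {True: 0.95, False: 0.01}   # P(A=#t | B)
p_w_given_a = {True: 0.90, False: 0.30}   # P(W=#t | A)
p_g_given_a = {True: 0.40, False: 0.10}   # P(G=#t | A)

def p(val, p_true):
    return p_true if val else 1.0 - p_true

def joint(b, a, w, g):
    # P(B,A,W,G) = P(B) P(A|B) P(W|A) P(G|A)
    return (p(b, p_b_true) * p(a, p_a_given_b[b])
            * p(w, p_w_given_a[a]) * p(g, p_g_given_a[a]))

tf = [True, False]
# P(B=#t | W=#t): marginalize out A and G, then normalize by P(W=#t).
num = sum(joint(True, a, True, g) for a in tf for g in tf)
den = sum(joint(b, a, True, g) for b in tf for a in tf for g in tf)
p_b_given_w = num / den
print(p_b_given_w)   # ~0.00028: Watson's call barely moves the 10^-4 prior
```

Watson's 30% joker rate makes his call weak evidence; conditioning on the alarm itself, or on Gibbons as well, would move the posterior much further.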