Nov 7, 2013

Graphical Causal Models: Determining Causes from Observations

William Marsh

Risk Assessment and Decision Analysis (RADAR)

Computer Science

RADAR Group, Computer Science


Risk Assessment and Decision Analysis


Research areas


Software engineering, safety, finance, legal


A new initiative in medical data analysis: DIADEM

Norman Fenton

Group leader

Martin Neil

http://www.dcs.qmul.ac.uk/researchgp/radar/


Outline


Graphical Causal Models


Bayesian networks: prediction or diagnosis


Causal induction: learning causes from data


Causal effect estimation: strength of causal relationships from data



DIADEM project

Bayesian Nets

Detecting Asthma Exacerbations


Aim to assist early detection of asthma episodes in Paediatric A&E


Using only data already available electronically


Network created by


Experts


Data

Bayes’ Theorem

Joint probability:

P(A,B) = P(A|B).P(B) = P(B|A).P(A)

P(A|B) = P(B|A).P(A) / P(B)

Revised belief about A, given evidence B: P(A|B)

Prior probability of A: P(A)

Factor to update belief about A, given evidence B: P(B|A) / P(B)

Bayes’ Theorem (Made Easy)


A person has a positive test result


How likely is it they are infected?


17%

[Figure: Infection (yes, no) -> Test (pos, neg)]

False positive P(T=pos|I=no) = 5%

Negligible false negative

Infection rate: P(I) = 1%
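
The 17% figure can be checked by applying Bayes' theorem to the numbers on this slide. The following is a minimal sketch; the variable names are illustrative, and "negligible false negative" is taken to mean P(T=pos|I=yes) = 1, which is an assumption.

```python
# Posterior probability of infection given a positive test, via Bayes' theorem.
p_infected = 0.01            # infection rate P(I=yes), from the slide
p_pos_given_clear = 0.05     # false positive rate P(T=pos|I=no), from the slide
p_pos_given_infected = 1.0   # assumption: "negligible false negative" taken as zero

# Total probability of a positive result, P(T=pos)
p_pos = p_pos_given_infected * p_infected + p_pos_given_clear * (1 - p_infected)

# Bayes' theorem: P(I=yes|T=pos) = P(T=pos|I=yes).P(I=yes) / P(T=pos)
p_infected_given_pos = p_pos_given_infected * p_infected / p_pos

print(f"P(T=pos) = {p_pos:.4f}")
print(f"P(I=yes|T=pos) = {p_infected_given_pos:.1%}")
```

Under these assumptions the posterior is about 16.8%, which rounds to the 17% quoted: most positive results come from the much larger uninfected group.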

Medical Uses of BNs


Diagnosis


Differential diagnosis from symptoms


Prediction


Likely outcome


Building a BN


From expert knowledge


expert system


From data


data mining

Beyond Bayesian Networks

Cause versus Association


Both represent the fever-infection association


‘Causal model’ has arrow from cause to effect

[Figure: Infection -> Fever, or Fever -> Infection?]

Joint probability same:

P(I,F) = P(F|I).P(I) = P(I|F).P(F)
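
A small numerical check may make this concrete: a joint distribution over Infection (I) and Fever (F) built in one arrow direction can be re-parameterised in the other direction and gives exactly the same joint, so the data alone cannot tell the two graphs apart. This is only a sketch; the probabilities below are hypothetical and do not come from the slides.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only for illustration (not from the slides).
p_i = F(1, 10)                      # P(I=yes)
p_f_given_i = {True: F(9, 10),      # P(F=yes|I=yes)
               False: F(1, 5)}      # P(F=yes|I=no)


def bern(p, value):
    """Probability of a binary outcome: p if it occurs, 1 - p otherwise."""
    return p if value else 1 - p


# Joint built in the "Infection -> Fever" direction: P(I,F) = P(F|I).P(I)
joint = {(i, f): bern(p_f_given_i[i], f) * bern(p_i, i)
         for i in (True, False) for f in (True, False)}

# Parameters of the "Fever -> Infection" direction, derived from that joint
p_f = joint[(True, True)] + joint[(False, True)]            # P(F=yes)
p_i_given_f = {f: joint[(True, f)] / (joint[(True, f)] + joint[(False, f)])
               for f in (True, False)}                      # P(I=yes|F)

# Joint rebuilt in the reverse direction: P(I,F) = P(I|F).P(F)
joint_reversed = {(i, f): bern(p_i_given_f[f], i) * bern(p_f, f)
                  for i in (True, False) for f in (True, False)}

print(joint == joint_reversed)  # True: both factorisations give the same joint
```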

Causal Induction


Discover causal relationships from data


Sometimes distinguishable







… different conditional independence

[Figure: two alternative causal graphs over A, B and C]
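
As an illustration of how conditional independence can separate causal structures, the sketch below compares a chain (A -> B -> C) with a collider (A -> B <- C). These two graphs are chosen here as an assumption, since the slide's own graphs are not recoverable, and all probabilities are hypothetical. The independences are checked exactly by enumeration rather than estimated from samples.

```python
from fractions import Fraction as F
from itertools import product

INDEX = {"a": 0, "b": 1, "c": 2}


def bern(p, value):
    """Probability of a binary outcome: p if it occurs, 1 - p otherwise."""
    return p if value else 1 - p


def chain():
    """Joint for A -> B -> C (hypothetical probabilities)."""
    return {(a, b, c): bern(F(1, 2), a)
            * bern(F(4, 5) if a else F(1, 5), b)
            * bern(F(9, 10) if b else F(3, 10), c)
            for a, b, c in product((0, 1), repeat=3)}


def collider():
    """Joint for A -> B <- C (hypothetical probabilities)."""
    p_b = {(0, 0): F(1, 10), (0, 1): F(1, 2), (1, 0): F(1, 2), (1, 1): F(9, 10)}
    return {(a, b, c): bern(F(1, 2), a) * bern(F(3, 10), c) * bern(p_b[(a, c)], b)
            for a, b, c in product((0, 1), repeat=3)}


def prob(joint, **fixed):
    """Marginal probability of the given assignments, e.g. prob(j, a=1, c=0)."""
    return sum(p for key, p in joint.items()
               if all(key[INDEX[name]] == value for name, value in fixed.items()))


def a_indep_c(joint):
    """Is A independent of C?  P(A,C) = P(A).P(C) for every value."""
    return all(prob(joint, a=a, c=c) == prob(joint, a=a) * prob(joint, c=c)
               for a, c in product((0, 1), repeat=2))


def a_indep_c_given_b(joint):
    """Is A independent of C given B?  P(A,B,C).P(B) = P(A,B).P(B,C)."""
    return all(prob(joint, a=a, b=b, c=c) * prob(joint, b=b)
               == prob(joint, a=a, b=b) * prob(joint, b=b, c=c)
               for a, b, c in product((0, 1), repeat=3))


for name, joint in (("chain", chain()), ("collider", collider())):
    print(name, a_indep_c(joint), a_indep_c_given_b(joint))
```

With these numbers the chain has A and C dependent but independent given B, while the collider shows the opposite pattern, so the two structures leave different signatures in the data. Graphs that imply the same independences (for example A -> B -> C and A <- B <- C) cannot be separated this way, which is why the distinction is only sometimes possible.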

Causal Induction


Application


Discover causal relationships from data


Need lots of data



Applied to gene regulatory networks


Data from micro-array experiments


Recent explanation of limitations

Estimating Causal Effects


Suppose A is a cause of B





What is the causal effect?


Is it p(B | A) ?

[Figure: A -> B]

Benefits of Sports?


Is there a relationship between sport and exam success?


Data available


‘Intelligence’ correlate


Is this the correct test?

[Figure: causal graph over intelligence, sport and exam result]

P(exam=pass|sport) > P(exam=pass|no-sport)

Benefits of Sports?


When we condition on ‘sport’


Probability for ‘exam result’


Probability for ‘intelligence’ changes


What if I decide to start sport?

p(pass|sport) > p(pass|no-sport)

73%

67%

[Figure: causal graph over intelligence, sport and exam result, with ‘sport’ observed]

Intervention v Observation


Causal effect differs from conditional probability


Mostly interested in consequence of change


Causal effects can be measured by a Randomised Control Trial


Causal effect of sport on exam results not identifiable

P(pass|do(sport)) < P(pass|do(no sport))

[Figure: causal graph over intelligence, sport and exam result, with ‘sport’ changed by intervention]
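
The gap between observing and intervening can be reproduced in a toy model. In the sketch below, intelligence influences both sport and exam result, and sport has a small negative effect on the exam result. Every number is hypothetical and chosen only to reproduce the pattern on these slides. The code reads intelligence directly, which is possible only in a simulation; with intelligence unmeasured, the do() quantity could not be computed from the data.

```python
from fractions import Fraction as F

# Hypothetical parameters (not from the slides).
p_intel = {"high": F(1, 2), "low": F(1, 2)}        # P(intelligence)
p_sport = {"high": F(4, 5), "low": F(1, 5)}        # P(sport=yes | intelligence)
p_pass = {("high", True): F(17, 20), ("high", False): F(9, 10),
          ("low", True): F(2, 5), ("low", False): F(1, 2)}  # P(pass | intelligence, sport)


def p_sport_is(sport, intel):
    """P(sport=sport | intelligence=intel)."""
    return p_sport[intel] if sport else 1 - p_sport[intel]


def observe(sport):
    """P(pass | sport): conditioning on sport also shifts belief about intelligence."""
    num = sum(p_intel[i] * p_sport_is(sport, i) * p_pass[(i, sport)] for i in p_intel)
    den = sum(p_intel[i] * p_sport_is(sport, i) for i in p_intel)
    return num / den


def do(sport):
    """P(pass | do(sport)): sport is set from outside, intelligence keeps its prior."""
    return sum(p_intel[i] * p_pass[(i, sport)] for i in p_intel)


print("observe:", float(observe(True)), float(observe(False)))
print("do:     ", float(do(True)), float(do(False)))
```

With these assumed numbers the observed pass rate is higher among those who do sport, yet the pass rate under do(sport) is lower than under do(no sport): the observed association is produced by intelligence, not by sport.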

Benefit of Sport


New observable variable ‘attendance at lectures’


Causal effect of sport on exam results now identifiable


[Figure: causal graph over sport (S), attendance (A), exam result (E) and intelligence]







P(E|do(S)) = Σ_A P(A|S) . Σ_S' P(E|S',A).P(S')
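
The adjustment P(E|do(S)) = Σ_A P(A|S) . Σ_S' P(E|S',A).P(S') uses only the observed variables S, A and E. The sketch below checks it against a toy model in which intelligence is hidden. It assumes the structure intelligence -> sport, sport -> attendance, and attendance and intelligence -> exam result; that structure and all probabilities are hypothetical, not taken from the slides.

```python
from fractions import Fraction as F
from itertools import product


def bern(p, value):
    """Probability of a binary outcome: p if it occurs, 1 - p otherwise."""
    return p if value else 1 - p


# Hypothetical parameters (1 = yes/high/pass, 0 = no/low/fail).
P_I = F(1, 2)                                   # P(I=1), intelligence (hidden)
P_S = {1: F(4, 5), 0: F(1, 5)}                  # P(S=1 | I)
P_A = {1: F(3, 5), 0: F(9, 10)}                 # P(A=1 | S)
P_E = {(1, 1): F(19, 20), (1, 0): F(3, 5),
       (0, 1): F(7, 10), (0, 0): F(1, 5)}       # P(E=1 | A, I)

# Observed joint P(S, A, E): intelligence is summed out, as in real data.
observed = {}
for i, s, a, e in product((0, 1), repeat=4):
    p = bern(P_I, i) * bern(P_S[i], s) * bern(P_A[s], a) * bern(P_E[(a, i)], e)
    observed[(s, a, e)] = observed.get((s, a, e), 0) + p


def p_obs(s=None, a=None, e=None):
    """Marginal of the observed joint; None means 'sum over this variable'."""
    return sum(p for (s_, a_, e_), p in observed.items()
               if s in (None, s_) and a in (None, a_) and e in (None, e_))


def adjusted(e, s):
    """P(E=e | do(S=s)) computed from observed quantities only."""
    total = 0
    for a in (0, 1):
        p_a_given_s = p_obs(s=s, a=a) / p_obs(s=s)
        inner = sum(p_obs(s=s2, a=a, e=e) / p_obs(s=s2, a=a) * p_obs(s=s2)
                    for s2 in (0, 1))
        total += p_a_given_s * inner
    return total


def ground_truth(e, s):
    """P(E=e | do(S=s)) computed with access to the hidden intelligence variable."""
    return sum(bern(P_I, i) * bern(P_A[s], a) * bern(P_E[(a, i)], e)
               for i, a in product((0, 1), repeat=2))


for s in (1, 0):
    print(s, float(adjusted(1, s)), adjusted(1, s) == ground_truth(1, s))
```

In this model the value computed from observed data matches the value computed with the hidden variable exactly, which is what identifiability means here. The agreement depends on the assumed graph: if intelligence also affected attendance directly, the formula would no longer be valid.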
Estimating Causal Effects


Rules to convert causal to statistical questions


Generalises e.g. stratification, potential outcomes


Assumptions: a causal model


Some assumptions may be testable


Causal model


Some variables observed, others not measured


Some causal effects identifiable


Challenges


Causal models for complex applications


Statistical implications

Example Application


Royal London trauma service


Criteria for activation of the trauma team


Aim to prevent unnecessary trauma team calls


Extensive records of trauma patient outcomes


US study of 1495 admissions proposed new ‘triage’ criteria


Significant decrease in overtriage: 51% to 29%


Insignificant increase in undertriage: 1% to 3%


None of the patients undertriaged by new criteria died


Does this show safety of new criteria?

DIADEM Project

Digital Economy in Healthcare


Data Information and Analysis for clinical DEcision Making


EPSRC Digital Economy Cluster


Partnership between solution providers and clinical data analysis problem holders


Summarise unsolved data analysis needs, in relation to the analysis techniques available

Join the DIADEM cluster

Cluster Activities and Outcomes


Engage stakeholders and build a community:


Creation of a community web-site and forum


Meetings with potential ‘problem holders’


Workshops


A road map: data and information


Follow-up proposal


A self-sustaining website


health data analytics

Summary


Bayesian networks


Prediction and diagnosis


Causal induction


Identify (some) causal relationships from (lots of) data


Causal effects


Experimental results from …


… non-experimental data


… assumptions (causal model)

Join the DIADEM cluster