Quantitative and qualitative approaches to reasoning under uncertainty in medical decision making





John Fox , David Glasspool and Jonathan Bury

Imperial Cancer Research Fund Labs

Lincoln's Inn Fields

London WC2A 3PX

United Kingdom

jf@acl.icnet.uk; dg@acl.icnet.uk; jb@acl.icnet.uk

Medical decision making frequently requires the effective management and
communication of uncertainty and risk. However a tension exists between classical
probability theory, which is precise and rigorous but which people find non-intuitive
and difficult to use, and qualitative approaches which are ad hoc but can be more
versatile and easily comprehensible. In this paper we review a range of approaches to
uncertainty management, then describe a logical approach, argumentation, which
subsumes qualitative as well as quantitative representations and has a clear formal
semantics. The approach is illustrated and evaluated in five decision support
applications.



Representing and managing uncertainty is central to understanding and supporting
much clinical decision-making and is extensively studied in AI, computer science and
psychology. Implementing effective decision models for practical clinical applications
presents a dilemma. On the one hand, informal and qualitative representations of
uncertainty may be natural for people to understand but they often lack formal rigour.
On the other hand formal approaches based on probability theory are precise but can
be awkward and non-intuitive to use. While in many cases probability theory is an
ideal for optimal decision-making it is often impractical.

In this paper we review a range of approaches to uncertainty representation, then
describe a logical approach, argumentation, which can subsume both qualitative and
quantitative approaches within the same formal framework. We believe that this may
enable improvements in both the scope and comprehensibility of decision support
systems and illustrate the approach with five decision support applications developed
by our group.


Decision Theory and Decision Support

It is generally accepted that human reasoning and decision-making can exhibit various
shortcomings when compared with accepted prescriptive theories derived from
mathematical logic and statistical decision-making, and there are systematic patterns
of distortion and error in people's use of uncertain information [1, 2, 3, 4]. In the
1970s and 1980s cognitive scientists generally came to the view that these
characteristic failures come about because people revise their beliefs by processes that
bear little resemblance to formal mathematical calculation. Kahneman and Tversky
developed a celebrated account of human decision-making and its weaknesses in
terms of what they called heuristics and biases [5]. They argued that people judge
things to be highly likely when, for example, they come easily to mind or are typical
of a class rather than by means of a proper calculation of the relative probabilities.
Such heuristic methods are often reasonable approximations for practical
decision-making but they can also lead to systematic errors.

If people demonstrate imperfect reasoning or decision-making then it would
presumably be desirable to support them with techniques that avoid errors and comply
with rational rules. Mathematical logic (notably "classical" propositional logic and the
predicate calculus) is traditionally taken as the gold standard for logical reasoning and
deduction, while expected utility theory (EUT) plays the equivalent role for
decision-making. A standard view on the "correct" way to take decisions is summarised
by Lindley as follows:

"... there is essentially only one way to reach a decision sensibly. First, the
uncertainties present in the situation must be quantified in terms of values
called probabilities. Second, the consequences of the courses of actions
must be similarly described in terms of utilities. Third, that decision must
be taken which is expected on the basis of the calculated probabilities to
give the greatest utility. The force of 'must' used in three places there is
simply that any deviation from the precepts is liable to lead the decision
maker in procedures which are demonstrably absurd" [6], p.vii.
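
Lindley's three steps can be illustrated with a minimal sketch. The option names, outcome probabilities and utilities below are invented purely for illustration and are not taken from any clinical source; the function names are likewise assumptions of this sketch.

```python
# Minimal sketch of an expected-utility decision.
# All options, probabilities and utilities are hypothetical illustration values.

def expected_utility(outcomes):
    """Sum probability * utility over a list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def best_option(options):
    """Return the name of the option with the greatest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


# Steps 1 and 2: quantify the uncertainties (probabilities) and the
# consequences (utilities) of each course of action.
options = {
    "treat": [(0.7, 0.9), (0.3, 0.2)],
    "wait": [(0.5, 1.0), (0.5, 0.1)],
}

# Step 3: take the decision expected to give the greatest utility.
choice = best_option(options)
```

Even this toy case needs four probabilities and four utilities before any decision can be computed, which is the practical burden discussed in the paragraphs that follow.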

However, many people think that this overstates the value of mathematical methods
and understates the capabilities of human decision makers. There are a number of
problems with EUT as a practical method for decision making, and there are
indications that, far from being "irrational", human decision processes depart from
EUT because they are optimised to make various tradeoffs to address these problems
in practical situations.

An expected-utility decision procedure requires that we know, or can estimate
reasonably accurately, all the required probability and utility parameters. This is
frequently difficult in real-world situations since a decision may still be urgently
required even if precise quantitative data are not available. Even when it is possible to
establish the necessary parameters, the cost of obtaining good estimates may
outweigh the expected benefits. Furthermore, in many situations a decision is needed
before the decision options, or the relevant information sources, are fully known. The
complete set of options may only emerge as the decision making process evolves. The
potential value of mathematical decision theory is frequently limited by the lack of
objective quantitative data on which to base the calculations, the limited range of
functions that it can be used to support, and the problem that the underlying numerical
representation of the decision is very different from the intuitive understanding of
human decision makers.

Human decision-making may also not be as "absurd" as the normative theories
appear to suggest. One school of thought argues that many apparent biases
are actually artefacts of the highly artificial situations that researchers
create in order to study reasoning and judgement in controlled laboratory conditions.
When we look at real-world decision-making we see that human reasoning and
decision-making is more impressive than the research implies.

Shanteau [7] studied expert decision-making, "investigating factors that lead to
competence in experts, as opposed to the usual emphasis on incompetence". He
identified a number of important positive characteristics of expert decision makers.
First, they know what is relevant to specific decisions, what to attend to in a busy
environment, and they know when to make exceptions to general rules. Secondly,
experts know a lot about what they know, and can make decisions about their own
decision processes: they know which decisions to make and when, and which to skip,
for example. They often have good communication skills and the ability to articulate
their decisions and how they arrived at them. They can adapt to changing task
conditions, and are frequently able to find novel solutions to problems. Classical
deduction and probabilistic reasoning do not deal with these meta-cognitive skills.

It has also been strongly argued that people are well adapted for making decisions
under adverse conditions: time pressure, lack of detailed information and knowledge
etc. Gigerenzer, for instance, has suggested that people make decisions in a "fast and
frugal" way, which is to say human cognition is rational in the sense that it is
optimised for speed at the cost of occasional and usually inconsequential errors [4].


Tradeoffs in Effective Decision-Making

The strong view expressed by Lindley and others is that the only rational or
"coherent" way to reason with uncertainty is to require that we comply with certain
mathematical axioms, the axioms of EUT. In practice compliance with these axioms
is often difficult because there is insufficient data to permit a valid calculation of the
expected utility of the decision options, or a valid calculation may require too much
time. An alternative perspective is that human decision-making is rational in that it
incorporates practical tradeoffs that, for example, trade a lower cost (e.g. errors in
decision-making) against a higher one (e.g. the amount of input data required or the
time taken to calculate expected utility).

Tradeoffs of this kind not only simplify decision-making but in practice may
entail only modest costs in the accuracy or effectiveness of decision-making.
Consequently the claim that we should not model decision-support systems on human
cognitive processes is less compelling than it may at first appear.

This possibility has been studied extensively in the field of medical decision-making.
In the prediction of sudden infant death, for example, Carpenter et al. [8]
attempted to predict death from a simple linear combination of eight variables. They
found that weights can be varied across a broad range without decreasing predictive
accuracy. In diagnosing patients suffering from dyspepsia, Fox et al. [9] found that giving
all pieces of evidence equal weights produced the same accuracy as a more precise
statistical method (and also much the same pattern of errors). In another study [10]
we developed a system for the interpretation of blood data in leukaemia diagnosis,
using the EMYCIN expert system shell. EMYCIN provided facilities to attach
numerical "certainty factors" to inference rules. Initially a system was developed
using the full range of available values (-1 to +1). These values were then replaced
with just two: if the rule made a purely categorical inference the certainty factor was
set to be 1.0 while if there was uncertainty associated with the rule the certainty
factor was set to 0.5. The effect was to increase diagnostic accuracy by 5%!

"Good surgeons, the saying goes, know how to operate, better surgeons know when to
operate and the best surgeons know when not to operate. That's true for all of medicine"
Richard Smith, Editor of the British Medical Journal

In a study of whether or not to admit patients with suspected heart attacks to
hospital, O’Neil and Glowinski [11] found no advantage of a precise decision
procedure over simply "adding up the pros and cons". A similar comparison by
Pradhan et al. in a diagnosis task [12] showed a slight increase in accuracy of
diagnosis with precise statistical reasoning, but the effect was so small it would have
no practical clinical value.

While the available evidence is not conclusive, a provisional hypothesis is that for
some decisions strict use of quantitatively precise decision-making methods may not
add much practical value to the design of decision support systems over simpler, more
“ad hoc” methods.
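
The contrast between a precisely weighted score and "adding up the pros and cons" can be sketched as follows. The finding names and weights are hypothetical and are not those used in any of the studies cited above; the sketch only shows that the two rules share one structure and differ in how much numerical precision the weights carry.

```python
# Hypothetical illustration: weighted linear score versus unit-weight tally.

def weighted_score(findings, weights):
    """Sum the weight of every finding that is present."""
    return sum(weights[name] for name, present in findings.items() if present)


def tally_score(findings, weights):
    """Same rule, with each weight replaced by its sign (+1 or -1)."""
    unit = {name: (1 if w > 0 else -1) for name, w in weights.items()}
    return weighted_score(findings, unit)


# Positive weights are "pros" for a hypothesis, negative weights are "cons".
weights = {"finding_a": 2.3, "finding_b": 0.8, "finding_c": -1.4}
patient = {"finding_a": True, "finding_b": True, "finding_c": True}

precise = weighted_score(patient, weights)  # uses the full numerical weights
simple = tally_score(patient, weights)      # counts pros minus cons
```

In this invented case both scores are positive, so a rule of the form "accept the hypothesis if the score exceeds zero" would reach the same conclusion with either method.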


Qualitative Methods for Decision-Making

Work in artificial intelligence raises an even more radical option for the designers of
decision support systems. The desire to develop versatile automata has stimulated a
great deal of research in new methods of decision making under uncertainty, ranging
from sophisticated refinements of probabilistic methods such as Bayesian networks
and Dempster-Shafer belief functions to non-probabilistic methods such as fuzzy
logic and possibility theory. Good overviews of the different approaches and their
applications are [13, 14].

These "non-standard" approaches are similar to probability methods in that they
treat uncertainty as a matter of degree. However, even this apparently innocuous
assumption has been widely questioned on the practical grounds that such methods
also demand a great deal of quantitative input data, and that decision-makers often find
them difficult to understand because they do not capture intuitions about uncertainty,
the nature of "belief" and "doubt", and the form of natural justifications for
decision-making.

Consequently, interest has grown in AI in the use of non-numerical methods that
seem to have some "common-sense" validity for reasoning under uncertainty but are
ad hoc from a formal point of view. These include non-classical logics such as
non-monotonic logics, default logic and defeasible reasoning. Cognitive approaches to
decision making are also gaining ground, including
the idea of using informal endorsements for alternative decision options [15] and
formalisations of everyday strategies of reasoning about competing beliefs and
actions based on logical argumentation [16, 17, 18].


Argumentation is a formalisation of the idea that decisions are made on the basis
of arguments for or against a claim. Fox and Das [19] propose that argumentation
may be the basis of a generalised decision theory, embracing standard probability
theory as a special case, as well as other qualitative and semi-quantitative
approaches. To take an example, suppose we wished to make the following informal
argument:

If three or more first degree relatives of a patient have contracted breast
cancer, then this is one reason to believe that the patient carries a gene
predisposing to breast cancer.

In the scheme of [19] arguments are defined as logical structures having three
terms: a claim, the grounds and a qualifier. In the example the claim term would be
the proposition "this patient carries a gene predisposing to breast cancer" and the
grounds term, "three or more first degree relatives of this patient have contracted
breast cancer", is the justification for the argument. The final term, the qualifier,
specifies the nature and strength of the argument which can be drawn from the
grounds to the claim. In the example the qualifier is the informal "this is one reason
to believe", but it could be more conventional, as in "given the grounds are true the
claim is true with a conditional probability of 0.43".
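
The three-term structure can be written down directly as a record type. This is a minimal sketch, not the representation used in [19]; the class and field names are assumptions chosen to mirror the terms in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    """Hypothetical record for an argument with three terms."""
    claim: str         # the proposition the argument bears on
    grounds: str       # the justification for the argument
    qualifier: object  # nature and strength; its type depends on the dictionary


# The informal argument from the example above.
informal = Argument(
    claim="this patient carries a gene predisposing to breast cancer",
    grounds=("three or more first degree relatives of this patient "
             "have contracted breast cancer"),
    qualifier="this is one reason to believe",
)

# The same claim and grounds with a conventional numerical qualifier.
numerical = Argument(informal.claim, informal.grounds, 0.43)
```

Leaving the qualifier untyped reflects the point that the same structure can carry either an informal phrase or a number.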

Qualifiers are specified with reference to a particular dictionary of terms, along
with an aggregation function that specifies how qualifiers in multiple arguments are
to be combined. One possible dictionary is the set of real numbers from 0.0 to 1.0,
which would allow qualifiers to be specified as precise numerical probability values
and, with an appropriate aggregation function based on the theorems of classical
probability, would allow the scheme to reduce to standard probability theory.

In many situations little can be known of the strength of arguments, other than
that they indicate an increase or decrease in the overall confidence in the claim. In this
case the dictionary and aggregation function might be simpler. In [20] for example we
describe a system which assesses carcinogenicity of novel compounds using a
dictionary comprising the symbols + (argument in favour) and - (argument against). The
aggregation function in this case simply takes the number of arguments in favour of a
proposition minus the number against. Other dictionaries can include additional
qualifiers, such as ++ (the argument confirms the claim) and -- (the argument refutes
the claim).
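
A symbolic dictionary and its aggregation function can be sketched in a few lines. Counting arguments for minus arguments against follows the description above; how ++ and -- interact with that count is not stated in the text, so letting them override the count, and reporting a conflict when both occur, is an assumption of this sketch.

```python
def aggregate(qualifiers):
    """Aggregate a list of qualifiers from the dictionary {+, -, ++, --}.

    '+' and '-' are counted: the result is (number for) - (number against).
    Letting '++' or '--' override the count is an assumption of this sketch.
    """
    allowed = {"+", "-", "++", "--"}
    unknown = set(qualifiers) - allowed
    if unknown:
        raise ValueError(f"qualifiers not in dictionary: {sorted(unknown)}")

    confirmed = "++" in qualifiers
    refuted = "--" in qualifiers
    if confirmed and refuted:
        return "conflict"
    if confirmed:
        return "confirmed"
    if refuted:
        return "refuted"
    return qualifiers.count("+") - qualifiers.count("-")
```

For example, `aggregate(["+", "+", "-"])` gives a net support of 1, while `aggregate(["+", "--"])` reports the claim as refuted regardless of the supporting argument.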

Another possible dictionary adopts confidence terms and requires a logical
aggregation function that provides a formalised semantics for combining such terms
according to common usage. Such terms have been formally categorised by their
logical structure, for example in [21, 22].

A feature of argumentation theory is that it subsumes these apparently diverse
approaches within a single formalism which has a clear consistent formal semantics
[19]. Additionally, argumentation gives a different perspective on the decision-making
process which we believe to be more in line with the way people naturally
think about probability and possibility than standard probability theory. Given the
evidence reviewed above that people do not naturally use EUT in their everyday
decision making, our tentative position is that expressing a preference or confidence
in terms of arguments for and against each option will be more accessible and
comprehensible than providing a single number representing its aggregate probability.

We have developed a number of decision support systems to explore this idea. In
the next sections we describe a technology which supports an argument-based
decision procedure, then outline several applications which have been built using it,
and finally consider quantitative evaluations of the applications.


Practical Applications of the Argumentation Method

We have used the argumentation framework in designing five decision support
systems to date.



The CAPSULE (Computer Aided Prescribing Using Logic Engineering) system was
developed to assist GPs with prescribing decisions [19, 23]. CAPSULE analyses
patient notes and constructs a list of relevant candidate medications, together with
arguments for and against each option (based on nine different criteria, including
efficacy, contra-indications, drug interactions, side effects, costs etc).



The CADMIUM radiology workstation is an experimental package for combining
image processing with logic-based decision support. CADMIUM has mainly been
applied to screening for breast cancer, in which an argument-based
decision procedure has the dual function of controlling image processing functions
which extract and describe micro-calcifications in breast x-rays, and interpreting the
descriptions in terms of whether they are likely to indicate benign or malignant
abnormalities. The decision procedure assesses the arguments and presents them to
the user in a structured report.



The RAGs (Risk Assessment in Genetics) system allows the user to describe a
patient's family tree incorporating information on the known incidence of cancers
within the family. This information is evaluated to assess the likelihood that a genetic
predisposition to a particular cancer is present. The software makes recommendations
for patient management in language which is comprehensible to both clinician and
patient.

A family tree graphic is used for incremental data entry (Figure 1). Data about
relatives are added by clicking on each relative’s icon, and completing a simple form.
RAGs analyses the data and provides detailed assessments of genetic risk by
weighing simple arguments like the example above rather than using
probabilities. Based on the aggregate genetic risk level the patient is classified as at
high, moderate or low risk of carrying a BrCa1 genetic mutation, which implies an
80% lifetime risk of developing breast cancer. Appropriate referral advice can be
given for the patient based on this classification. The software can provide a
comprehensible explanation for its decisions based on the arguments it has applied
(Figure 1).

RAGs uses a set of 23 risk arguments (e.g. if the client has more than two
first-degree relatives with breast cancer under the age of 50 then this is a risk factor)
and a simple dictionary of qualifiers which allows small positive and negative integer
values as well as plus and minus infinity (equivalent to confirming or refuting the
claim). This scheme thus preserves relative rather than absolute weighting of different
factors in the analysis.
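
A dictionary of small integers plus infinite values can be aggregated by simple summation. The sketch below is not the RAGs implementation: the function names, the example weights and the band thresholds are all hypothetical, and the actual 23 arguments and classification guidelines are not reproduced here.

```python
import math


def aggregate_risk(weights):
    """Sum integer argument weights; an infinite weight confirms or refutes."""
    if math.inf in weights and -math.inf in weights:
        raise ValueError("confirming and refuting arguments both apply")
    return sum(weights)  # any infinite weight dominates the finite ones


def classify(total, moderate_at=2, high_at=5):
    """Map an aggregate score to a risk band (thresholds are hypothetical)."""
    if total >= high_at:
        return "high"
    if total >= moderate_at:
        return "moderate"
    return "low"


# Three hypothetical arguments apply: two raise the risk, one lowers it.
band = classify(aggregate_risk([2, 1, -1]))
```

Because only the ordering and rough size of the weights matter to the band boundaries, the scheme captures relative rather than absolute weighting, as described above.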



Patients with symptoms or signs which may indicate cancer should be
investigated quickly so that treatment may commence as soon as possible. The ERA
(Early Referrals Application) system has been designed in the context of the UK
Department of Health’s (DoH) “2 week” guideline which states that patients with
suspected cancer should be seen and assessed by an appropriate specialist within 2
weeks of their presentation. Referral criteria are given for each of 12 main cancer
groups (e.g. Breast, Lung, Colorectal).

The practical decision as to whether or not to refer a particular patient is treated in
ERA as based on a set of patient-specific arguments (see Figure 2). The published
referral criteria have been specifically designed to be unambiguous and easy to apply.
Should any of the arguments apply to a particular patient, an early referral is
warranted. Qualifiers as to the weight of each argument are thus unnecessary, with
arguments generally acting categorically in an all or nothing fashion. An additional
feature of this domain is that there are essentially no counter-arguments. The value of
argumentation in this example lies largely in the ability to give meaningful
explanations in the form of reasons for referral expressed in English.
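
The categorical, all or nothing rule can be sketched as a check over a set of criteria, returning both the decision and the reasons that support it. The criteria below are placeholders and are not the published referral guideline; the record fields and function name are assumptions of this sketch.

```python
def referral_decision(patient, criteria):
    """Return (refer, reasons): refer if any criterion applies to the patient.

    `criteria` maps a plain-English reason to a predicate over the patient
    record. No weights are needed because each argument acts categorically.
    """
    reasons = [reason for reason, applies in criteria.items() if applies(patient)]
    return bool(reasons), reasons


# Placeholder criteria, NOT the published referral guideline.
criteria = {
    "criterion A applies (placeholder)":
        lambda p: p.get("sign_a", False),
    "criterion B applies (placeholder)":
        lambda p: p.get("age", 0) >= 50 and p.get("sign_b", False),
}

refer, reasons = referral_decision({"age": 62, "sign_b": True}, criteria)
```

The list of reasons is what provides the explanation: each entry is already a sentence that can be shown to the user.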



The applications described so far involve making a single decision: what drug to
prescribe, or whether to refer a patient. Most decisions are made in the context of
plans of action, however, where they may interact or conflict with other planned
actions or anticipated events. REACT (Risk, Events, Actions and their Consequences
over Time) is being developed to provide decision support for extended plans. In
effect REACT is a logical spreadsheet that allows a user to manipulate graphical
widgets representing possible clinical events and interventions on a timeline interface
and propagates their implications (both qualitative and quantitative) to numerical
displays of risk (or other parameters) and displays of arguments and
counter-arguments.

While the REACT user creates a plan, a knowledge-based DSS analyses it according
to a set of definable rules and provides feedback on interactions between events.
Rules may specify, for example, that certain events are mutually exclusive, that
certain combinations of events are impossible, or that events have different
consequences depending on prior or simultaneous events. Global measures (for
example the predicted degree of risk or predicted cost or benefit of combinations of
events) can be displayed graphically alongside the planning timeline. Qualitative
arguments for and against each individual action proposed in the plan can be
reviewed, and can be used to generate recommended actions when specified
combinations of plan elements occur.
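
One kind of rule mentioned above, mutual exclusion between events, can be sketched as a check over a timeline. This is not REACT's rule language: the data structures, the event names and the decision to flag only exclusive events whose time intervals overlap are all assumptions of this sketch.

```python
from itertools import combinations


def find_conflicts(plan, exclusive_pairs):
    """Return pairs of planned events that are marked mutually exclusive
    and whose time intervals overlap.

    `plan` maps event name -> (start, end); `exclusive_pairs` is a set of
    frozensets of event names. Both structures are hypothetical.
    """
    conflicts = []
    for (a, (a_start, a_end)), (b, (b_start, b_end)) in combinations(plan.items(), 2):
        overlap = a_start < b_end and b_start < a_end
        if overlap and frozenset((a, b)) in exclusive_pairs:
            conflicts.append((a, b))
    return conflicts


plan = {
    "intervention_x": (0, 6),
    "intervention_y": (4, 10),
    "event_z": (12, 14),
}
rules = {frozenset(("intervention_x", "intervention_y"))}
conflicts = find_conflicts(plan, rules)
```

Re-running such a check whenever the user moves an event on the timeline is one simple way to obtain the spreadsheet-like propagation described above.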

Figure 1: The RAGs software application. A family tree has been input for
the patient Karen, who has had three relatives affected by relevant cancers. The
risk assessment system has determined a pattern of inheritance that may indicate
a genetic factor, and provides an explanation in the left-hand panel. Referral
advice for the patient is also available.

Figure 2: The ERA early referral application. After completing patient
details on a form, the program provides referral recommendations with advice,
and can contact the hospital to automatically make an appointment.



A number of these applications have been used to carry out controlled evaluations
(with the exception of ERA and REACT, which are still in development).

In the case of the CAPSULE prescribing system a controlled study with
Oxfordshire general practitioners showed the potential for substantial improvements
in the quality of their prescribing decisions [23]. With decision support there was a
70% increase in the number of times the GPs' decisions agreed with those of experts
considering the cases, and a 50% reduction in the number of times that they missed a
cheaper but equally effective medication.

The risk classifications generated by the RAGs software were compared for 50
families with those provided by the leading probabilistic risk assessment software. This
uses the Claus model [27], a mathematical model of genetic risk of breast cancer
based on a large dataset of cancer cases. Despite the use of a very simple weighting
scheme the RAGs system produced exactly the same risk classification (into high,
medium or low risk, according to established guidelines) for all cases as the
probabilistic system [24, 25]. RAGs resulted in more accurate pedigree taking and
more appropriate management decisions than either pencil and paper or standard
probabilistic software [26].

In the CADMIUM image-processing system users are provided with assistance in
interpreting mammograms using an argument-based decision making component.
Radiographers who were trained to interpret mammograms were asked to make
decisions as to whether observed abnormalities were benign or malignant, with and
without decision support. The decision support condition produced clear
improvements in the radiographers’ performance, in terms of increased hits and
correct rejections and reduced misses and false positives [28].


Summary and conclusions

Human reasoning and decision-making can exhibit various shortcomings when
compared with accepted prescriptive theories. Good decision-making is central to
many important human activities and much effort is directed at developing decision
support technologies to assist us. One school of thought argues that these systems
must be based on prescriptive axioms of decision-making since to do otherwise leads
inevitably to irrational conclusions and choices. Others argue that decision-making in
the real world demands practical tradeoffs and a failure to make those tradeoffs would
be irrational. Although the debate remains inconclusive, we have described a number
of examples of medical systems in which the use of simple logical argumentation
appears to provide effective decision support together with a versatile representation
of uncertainty which fits comfortably with people's intuitions.



The RAGs project was supported by the Economic and Social Research Council, and
much of the development and evaluation work was carried out by Andrew Coulson
and Jon Emery. Michael Humber carried out much of the implementation of the ERA
system.


1. Kahneman D., Slovic P. and Tversky A. (eds): Heuristics and Biases. Cambridge
University Press, Cambridge (1982)

2. Evans J. St B. and Over D.E.: Rationality and reasoning. Psychology Press, London (19

3. Wright G. and Ayton P. (eds): Subjective Probability. Wiley, Chichester, UK (1994)

4. Gigerenzer G. and Todd P.M.: Simple heuristics that make us smart. Oxford University
Press, Oxford (1999)

5. Kahneman D. and Tversky A.: Judgement under Uncertainty: Heuristics and Biases.
Science, Vol. 185, 27 (1974) pp. 1124

6. Lindley D.V.: Making Decisions (2nd Edition). Wiley, Chichester, UK (1985)

7. Shanteau J.: Psychological characteristics of expert decision makers. In: J Mumpower
(ed) Expert judgement and expert systems. NATO ASI Series, volume F35 (1987)

8. Carpenter R G, Gardner A, McWeeny P M. and Emery J L.: Multistage scoring system
for identifying infants at risk of unexpected death. Archives of Disease in Childhood, 53(8),
(1977) pp.600

9. Fox J., Barber DC. and Bardhan KD.: Alternatives to Bayes? A quantitative comparison
with rule-based diagnosis. Methods of Information in Medicine, 10 (4) (1980) pp.210

10. Fox J., Myers CD., Greaves MF. and Pegram S.: Knowledge acquisition for expert
systems: experience in leukaemia diagnosis. Methods of Information in Medicine, 24 (1)
(1985) pp.65

11. O'Neil MJ. and Glowinski AJ.: Evaluating and validating very large knowledge-based
systems. Medical Informatics, 15 (3) (1990) pp.237

12. Pradhan M.: The sensitivity of belief networks to imprecise probabilities: an experimental
investigation. Artificial Intelligence Journal, 84 (1-2) (1996) pp.365

13. Krause P. and Clark C.: Uncertainty and Subjective Probability in AI Systems. In Wright
G & Ayton P (eds) Subjective Probability. John Wiley & Sons (1994) pp.501

14. Hunter A. and Parsons S. (eds): Applications of uncertainty formalisms. Springer-Verlag,
LNAI 1455 (1998)

15. Cohen P.R.: Heuristic Reasoning: An Artificial Intelligence Approach. Pitman Advanced
Publishing Program, Boston (1985)

16. Fox J.: On the necessity of probability: Reasons to believe and grounds for doubt. In
Wright G, Ayton P (eds) Subjective Probability. John Wiley, Chichester (1994)

17. Fox J., Krause P. and Ambler S.: Arguments, contradictions and practical reasoning. In
Neumann B (ed) Proceedings of the 10th European Conference on AI (ECAI92), Vienna,
Austria (1992) pp.623

18. Curley SP. and Benson PG.: Applying a cognitive perspective to probability construction.
In G Wright & P Ayton (eds) Subjective Probability. John Wiley & Sons, Chichester,
England (1994) pp. 185

19. Fox J. and Das S K.: Safe and Sound: Artificial Intelligence in Hazardous Applications.
American Association of Artificial Intelligence and MIT Press (2000)

20. Tonnelier CAG., Fox J., Judson P., Krause P., Pappas N. and Patel M.: Representation of
Chemical Structures in Knowledge-Based Systems. J. Chem. Inf. Sci. 37 (1997) pp.117

21. Goransson M, Krause P.J and Fox J.: Acceptability of Arguments as Logical
Uncertainty. In: Clarke M, Kruse R and Moral S (eds) Symbolic and Quantitative
Approaches to Reasoning and Uncertainty. Proceedings of European Conference
ECSQARU93. Lecture Notes in Computer Science 747. Springer-Verlag (1993) pp.85

22. Glasspool DW. and Fox J.: Understanding probability words by constructing concrete
mental models. In Hahn M, Stoness SC (eds) Proceedings of the 21st Annual Conference
of the Cognitive Science Society (1999) pp.185

23. Walton RT, Gierl C, Yudkin P, Mistry H, Vessey MP and Fox J.: Evaluation of computer
support for prescribing (CAPSULE) using simulated cases. British Medical Journal
(1997) pp.791

24. Emery J, Watson E, Rose P and Andermann A.: A systematic review of the literature
exploring the role of primary care in genetic services. Fam. Prac. 16 (1999) pp.426

25. Coulson AS, Glasspool DW, Fox J and Emery J.: Computerized Genetic Risk Assessment
from Family Pedigrees. MD Computing, in press.

26. Emery J., Walton R., Murphy M., Austoker J., Yudkin P., Chapman C., Coulson A.,
Glasspool D. and Fox J.: Computer support for interpreting family histories of breast and
ovarian cancer in primary care: comparative study with simulated cases. British Medical
Journal, 321 (2000) pp.28

27. Claus E, Schildkraut J, Thompson WD and Risch N.: The genetic attributable risk of
breast and ovarian cancer. Cancer 77 (1996) pp. 2318

28. Taylor P, Fox J and Todd-Pokropek A.: The development and evaluation of CADMIUM:
a prototype system to assist in the interpretation of mammograms. Medical Image
Analysis 3 (4) (1999) pp.321