LINKÖPINGS UNIVERSITET
732A45 Statistical Evidence Evaluation
Institutionen för datavetenskap
Fall semester 2012
Avd. för Statistik/ANd
Exercises
Exercises on activity level propositions and combination of evidence
1. Shoeprints and dirty shoes
Assume a burglary has been committed in a private home. The offender has evidently entered the
house through the backdoor; the evidence for that assumption is a number of strange
footmarks left in the soil outside the backdoor.
Very soon after the burglary a suspect is apprehended wearing shoes with a substantial amount
of soil on the soles. A comparison between the soil on the shoes and the soil outside the
backdoor of the house shows enough agreement in compounds to state that there is an
uncontroversial match. Furthermore, this type of soil is not that common: its prevalence is about
8% on average.
We wish to evaluate the match against a pair of propositions stating that (1) the suspect was
walking around close to the backdoor; and (2) the suspect has never been in the
neighbourhood of the backdoor.
There are two studies available to help in this evaluation. The first is concerned with the
transfer of soil material to shoe soles and consists of 1200 experiments using different
footwear and a variety of soil types. The shoes in this study were initially carefully cleaned and
then inspected 30 minutes after they had been in contact with the soil. The soil used in the
experiments was dyed to facilitate the identification of residues. The results are given in the
file soiltranfer.txt. The second is a survey of shoe soles with respect to residues from soil and
comprises results from 1500 sampled items of footwear of different types. The results are given in the
file soilresidues.txt. Both files can be downloaded from the course web
(www.ida.liu.se/~732A45/info/diary.en.shtml).
When evaluating the match it can be taken for granted that there was contact between the
shoes of the offender and the soil outside the backdoor; thus the question of contact does not
need to be considered.
a) Use all available data in an efficient way to evaluate the match with a BN, taking into
account transfer and background nodes. Use the “Match form” of the network.
b) Reconstruct the network to an “X-Y form”. See pages 201ff in the book. What are the
immediate consequences for the background node(s)?
c) Now, suppose we also identify the suspect’s shoe as being of type F. Would that affect
the probability tables of the obtained network? Does the evidence value change?
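As a rough numerical companion to tasks a)-c), the sketch below evaluates the likelihood ratio that a simple transfer/background network reduces to when one group of matching soil is found on the shoes and contact is taken as given. It is only an illustration under stated assumptions: the counts are placeholders (they are not taken from soiltranfer.txt or soilresidues.txt, whose layout is not described here), the function name activity_level_lr is made up for this example, and a soiled shoe is assumed to carry exactly one group of background soil.

```python
# Sketch: likelihood ratio for matching soil on a shoe, with contact taken as given.
# All counts below are HYPOTHETICAL placeholders, not values from the course files.


def activity_level_lr(t, b0, b1, gamma):
    """LR for 'one matching group found' under a simple transfer/background model.

    t     : P(soil is transferred and still present | contact)
    b0    : P(no background soil on a shoe)
    b1    : P(one group of background soil on a shoe)
    gamma : prevalence of the matching soil type
    """
    # Proposition 1: either the soil was transferred and there is no background,
    # or nothing was transferred and a matching background group is present.
    numerator = b0 * t + b1 * gamma * (1.0 - t)
    # Proposition 2: the soil can only be background that happens to match.
    denominator = b1 * gamma
    return numerator / denominator


# Hypothetical study summaries (replace with counts read from the two files).
transfer_with_residue, transfer_total = 900, 1200   # assumed, transfer study
survey_without_soil, survey_total = 600, 1500       # assumed, shoe survey

t = transfer_with_residue / transfer_total
b0 = survey_without_soil / survey_total
b1 = 1.0 - b0      # assumption: a soiled shoe carries exactly one group
gamma = 0.08       # prevalence stated in the exercise

lr = activity_level_lr(t, b0, b1, gamma)
print(f"t={t:.2f}, b0={b0:.2f}, b1={b1:.2f}, LR={lr:.2f}")
```

Comparing such a hand calculation with the value reported by the BN is a quick consistency check on the probability tables, provided the network uses the same simplifications.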
2. Shoeprints: Combining findings
Now, besides the recovery of soil residues on the suspect’s shoes, there are also matches in
pattern between all shoe marks and the suspect’s shoes. This pattern is classified as number
113 in an internal classification list. The file shoeresidues.txt also contains information
about the classified pattern of each shoe in the survey.
a) Use the information in shoeresidues.txt and construct a BN with which the match(es)
in pattern can be evaluated against source level propositions.
b) Combine the network obtained in a) with the network obtained in task 1 to get a
combined evidence value of the match in soil and the match in pattern. What
assumptions need to be made to make this network valid?
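The arithmetic behind a) and b) can be sketched as follows, purely as an illustration. The count of pattern 113 in the survey and the soil likelihood ratio are invented placeholders, the helper names pattern_lr and combine are made up for this example, and a match in pattern is assumed to be certain if the suspect’s shoe is the source.

```python
# Sketch: source-level LR for the pattern match and its combination with a soil LR.
# The pattern count and the soil LR are HYPOTHETICAL placeholders.


def pattern_lr(pattern_count, survey_size):
    """LR = 1 / relative frequency, assuming a certain match if the shoe is the source."""
    if pattern_count <= 0:
        raise ValueError("pattern not observed; a plain frequency estimate breaks down")
    return survey_size / pattern_count


def combine(*lrs):
    """Multiply LRs. This is only valid if the findings are conditionally
    independent given each proposition."""
    result = 1.0
    for value in lrs:
        result *= value
    return result


lr_pattern = pattern_lr(pattern_count=30, survey_size=1500)   # assumed count
lr_soil = 6.5                                                  # assumed soil LR
lr_combined = combine(lr_soil, lr_pattern)
print(f"LR(pattern)={lr_pattern:.1f}, LR(combined)={lr_combined:.1f}")
```

The product is meaningful only when both likelihood ratios refer to the same pair of propositions, which is worth checking here because the soil match was evaluated at activity level and the pattern match at source level.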
3. Transfer of fibres
Let’s assume that in bank robberies where an escape car is used, the driver of the car
seldom takes part in the robbery itself (although he may be convicted of assisting in the robbery).
A provisional estimate is that the car driver takes an active part in only 2% of all bank
robberies. Now assume that a witness has pointed out a suspect as having taken an active part in a bank
robbery. Inspecting the escape car we find 6 strange wool fibres on the driver’s seat, but no
strange fibres are recovered anywhere else in the car. The suspect has a woollen pull-over
with exactly the same kind of fibres as those recovered from the driver’s seat. The fibres are
quite common, though, and are estimated to be found in 23% of all woollen pull-overs on
the market. The witness has, however, stated that the suspect wore that particular pull-over
when committing the robbery.
There is also a large study of car seats investigating the prevalence of persisting fibres.
This study shows that one may expect to find one group of strange fibres on 43% of all car
seats, two groups of (different) strange fibres on 22% of all car seats, and more than two
groups on 10% of all car seats. Further, an experimental study shows that woollen garments
tend to leave groups of between 2 and 10 fibres on car seats in 2 out of 5 cases when there is
contact between the garment and the seat, and more than 10 fibres in 1 out of 15 cases. One
may also assume that once some fibres have been left they are very persistent and can
therefore be recovered at an inspection with a probability close to 100%.
Use all this information to set up a BN with which it is possible to evaluate the match
between the recovered fibres and the fibres of the suspect’s pull-over against the pair of
propositions:
Hp: The suspect drove the escape car
Hd: The suspect has never sat on the driver’s seat of the escape car
Discuss the validity of the way the evidence is evaluated. Are there serious drawbacks with
the procedure?
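For orientation, the figures given above can be pushed through the same kind of simple transfer/background formula that a basic network for this case encodes. The sketch below is one possible reading, not the intended solution. Its assumptions: exactly one group of fibres was recovered, a background group is not penalised for consisting of 6 fibres (the seat study only reports the number of groups), all transfer outcomes with fewer than 2 fibres are lumped together as “no relevant transfer”, and recovery is treated as certain. The 2% figure for drivers taking an active part is not used, since it concerns the link between driving and the witness statement rather than the fibre findings.

```python
# Sketch: LR for the 6 recovered fibres, using only figures stated in the exercise.
# ASSUMPTIONS: one recovered group; the size of a background group is ignored;
# "no relevant transfer" lumps together all outcomes with fewer than 2 fibres.

gamma = 0.23                        # share of woollen pull-overs with this fibre type
b1 = 0.43                           # P(one group of strange fibres on a car seat)
b0 = 1.0 - (0.43 + 0.22 + 0.10)     # P(no group of strange fibres on a car seat)
t_2_to_10 = 2 / 5                   # P(2-10 fibres left | contact); 6 fibres fall here
t_over_10 = 1 / 15                  # P(more than 10 fibres left | contact)
t_none = 1.0 - t_2_to_10 - t_over_10
recovery = 1.0                      # persistence and recovery taken as certain

# Hp: transfer from the suspect's pull-over and no background group,
#     or no relevant transfer and one matching background group.
numerator = b0 * t_2_to_10 * recovery + b1 * gamma * t_none
# Hd: the recovered group is background that happens to match.
denominator = b1 * gamma

lr = numerator / denominator
print(f"b0={b0:.2f}, t_none={t_none:.3f}, LR={lr:.3f}")
```

Under these assumptions the value is only slightly above 1, which is a useful starting point for the discussion of drawbacks requested above: the result is sensitive to how group size and the lumped transfer category are handled.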
4. DNA in a crime case
Assume a stain with the genotype AB has been found, and a suspect has been tested and found to have
the genotype AB as well. Assume the population frequencies of A and B are 0.03 and 0.12,
respectively. Construct a Bayesian network for this situation, using one node for
each genotype, one node for each of the involved paternal alleles, and one node for each of
the involved maternal alleles. Also use one node for the hypotheses of the prosecutor/defence.
a) If you assume the prior probability that the suspect deposited the stain is 0.01, what
is the posterior probability given the genetic data? (Use the network to compute it.)
b) Instead, use the network to compute the probability of the observed data under each of
the two hypotheses.
c) Set up an equation computing the answer in (a) from the answer in (b).
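The results from the network in a)-c) can be cross-checked by brute-force enumeration over the allele nodes. The following is a minimal sketch under assumptions that are not stated in the exercise: all alleles other than A and B are pooled into a single allele labelled "x", paternal and maternal alleles are drawn independently from the population frequencies, and under the defence hypothesis the stain comes from an unrelated random person. The function names are made up for this example.

```python
from itertools import product

# Allele frequencies from the exercise; "x" pools every allele other than A and B.
FREQ = {"A": 0.03, "B": 0.12, "x": 0.85}


def genotype(paternal, maternal):
    """Unordered genotype label built from the two allele nodes."""
    return "".join(sorted((paternal, maternal)))


def genotype_prob(gt):
    """P(genotype) for a random person: sum over paternal/maternal allele pairs."""
    return sum(FREQ[p] * FREQ[m]
               for p, m in product(FREQ, repeat=2)
               if genotype(p, m) == gt)


def prob_data(same_source, suspect_gt="AB", stain_gt="AB"):
    """P(suspect genotype, stain genotype | hypothesis)."""
    p_suspect = genotype_prob(suspect_gt)
    if same_source:
        # Prosecutor: the suspect left the stain, so the stain copies his genotype.
        p_stain = 1.0 if stain_gt == suspect_gt else 0.0
    else:
        # Defence: the stain comes from an unrelated random person (assumption).
        p_stain = genotype_prob(stain_gt)
    return p_suspect * p_stain


prior = 0.01
p_hp = prob_data(same_source=True)     # task b), prosecutor
p_hd = prob_data(same_source=False)    # task b), defence
lr = p_hp / p_hd
# Task c): Bayes' theorem links the answers in b) to the posterior in a).
posterior = prior * p_hp / (prior * p_hp + (1.0 - prior) * p_hd)
print(f"P(data|Hp)={p_hp:.6f}, P(data|Hd)={p_hd:.8f}, LR={lr:.2f}, posterior={posterior:.4f}")
```

The last assignment is one way of writing the equation asked for in c).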
5. DNA in a disputed paternity case
Using the same type of nodes as in exercise 4, set up a simple BN for paternity cases, using a
single locus with three possible alleles: A, B, and C. Use nodes for the genotype of the
mother, the genotype of the father, and the genotype of the child, and in addition nodes for
paternal and maternal alleles. Assume the allele frequencies are 0.3, 0.45, and 0.25,
respectively. Also use a node for the two usual hypotheses in paternity cases.
a) Compute the posterior for the hypothesis that the putative father is the father,
assuming that the prior is 0.5, and that the data is AB, AC, AC for the mother, father,
and child, respectively.
b) Compute the likelihood ratio in the same case for the putative father being the real
father.
c) Assume you also have data from an independent locus, with possible alleles X, Y, and
Z, and frequencies 0.1, 0.1, 0.8, respectively. Make a network combining information
from both loci, and compute the result if the data is XX, XY, XY for the mother,
father, and child, respectively.
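A hand calculation that mirrors the allele nodes can be used to check the output of the networks in a)-c). The sketch below is an illustration under assumptions not spelled out in the exercise: each parent passes on either allele with probability 0.5, no mutation occurs, and under the alternative hypothesis the true father is an unrelated random man whose transmitted allele follows the population frequencies. The genotype probabilities of the mother and the putative father are the same under both hypotheses and cancel in the ratio, so they are left out. All function names are made up for this example.

```python
from itertools import product


def genotype(a, b):
    """Unordered genotype label, e.g. ('C', 'A') -> 'AC'."""
    return "".join(sorted((a, b)))


def transmit(gt):
    """Distribution of the allele passed on by a parent with genotype gt."""
    dist = {}
    for allele in gt:
        dist[allele] = dist.get(allele, 0.0) + 0.5
    return dist


def p_child(child, mother, paternal_dist):
    """P(child genotype | mother genotype, distribution of the paternal allele)."""
    target = genotype(*child)
    total = 0.0
    for (m, pm), (f, pf) in product(transmit(mother).items(), paternal_dist.items()):
        if genotype(m, f) == target:
            total += pm * pf
    return total


def paternity_lr(mother, father, child, freq):
    p_hp = p_child(child, mother, transmit(father))   # putative father is the father
    p_hd = p_child(child, mother, freq)               # random man is the father (assumption)
    return p_hp / p_hd


def posterior(prior, lr):
    return prior * lr / (prior * lr + (1.0 - prior))


lr1 = paternity_lr("AB", "AC", "AC", {"A": 0.3, "B": 0.45, "C": 0.25})
lr2 = paternity_lr("XX", "XY", "XY", {"X": 0.1, "Y": 0.1, "Z": 0.8})
post1 = posterior(0.5, lr1)
post_both = posterior(0.5, lr1 * lr2)   # loci independent, so the LRs multiply
print(f"LR1={lr1:.2f}, LR2={lr2:.2f}, posterior(locus 1)={post1:.4f}, "
      f"posterior(both loci)={post_both:.4f}")
```

Multiplying the two likelihood ratios relies on the loci being independent, as stated in c).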