LINKÖPINGS UNIVERSITET
732A45 Statistical Evidence Evaluation
Department of Computer and Information Science (Institutionen för datavetenskap)
Fall semester 2012
Division of Statistics (Avd. för Statistik)/ANd
Exercises
Exercises on activity level propositions and combination of evidence
1. Shoeprints and dirty shoes
Assume that a private home has been burgled. The offender evidently entered the house
through the back door; the evidence for this is a number of foreign footmarks left in the
soil outside the back door.
Very soon after the burglary a suspect is apprehended wearing shoes with a substantial amount
of soil on the soles. A comparison between the soil on the shoes and the soil outside the
back door of the house shows enough correspondences in compounds to state that there is an
uncontroversial match. Furthermore, this type of soil is not very common: its prevalence is
about 8% on average.
We wish to evaluate the match against a pair of propositions stating that (1) the suspect was
walking around close to the back door, and (2) the suspect has never been in the
neighbourhood of the back door.
There are two studies available to help in this evaluation. The first concerns the transfer
of soil material to shoe soles and consists of 1200 experiments using different footwear and
a variety of soil types. The shoes in this study were initially carefully cleaned and then
inspected 30 minutes after they had been in contact with the soil. The soil used in the
experiment was dyed to facilitate the identification of residues. The results are given in
the file soiltranfer.txt. The second is a survey of shoe soles with respect to soil residues
and comprises results from 1500 sampled items of footwear of different types. The results are
given in the file soilresidues.txt. Both files can be downloaded from the course web
(www.ida.liu.se/~732A45/info/diary.en.shtml).
When evaluating the match it can be taken for granted that there was contact between the
offender's shoes and the soil outside the back door; thus contact itself need not be
considered.
a) Use all available data in an efficient way to evaluate the match with a Bayesian network
   (BN), taking into account transfer and background nodes. Use the “Match form” of the
   network.
b) Reconstruct the network to an “X-Y form”. See pages 201ff in the book. What are the
   immediate consequences for the background node(s)?
c) Now, suppose we also identify the suspect’s shoe as being of type F. Would that affect
   the probability tables of the obtained network? Does the evidence value change?
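
As a rough hand check for the network in a), the match can also be evaluated with a
simplified closed-form expression. The sketch below is not the network asked for: it assumes
that transferred scene soil always yields a match, that background soil matches with a
probability equal to the soil prevalence (0.08 above), and that transfer and background are
independent. The function name soil_match_lr and the values used for t and b are
hypothetical placeholders; the actual estimates have to come from soiltranfer.txt and
soilresidues.txt.

```python
def soil_match_lr(t, b, gamma):
    """Likelihood ratio for a soil match under simplified assumptions.

    t     -- P(scene soil transferred and still present | suspect walked there)
    b     -- P(background soil from elsewhere is present on the shoes)
    gamma -- prevalence of the matching soil type
    """
    # Hp: match via transfer, or via matching background soil when no transfer.
    p_match_hp = t + (1 - t) * b * gamma
    # Hd: a match can only arise from background soil of the matching type.
    p_match_hd = b * gamma
    return p_match_hp / p_match_hd


# Hypothetical placeholder values for t and b, NOT estimates from the course files.
lr = soil_match_lr(t=0.5, b=0.5, gamma=0.08)
print(f"LR = {lr:.1f}")
```

Comparing this figure with the ratio obtained from the network is one way to spot errors in
the probability tables, provided the network encodes the same assumptions.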
2. Shoeprints: Combining findings
Now, besides the recovery of soil residues on the suspect’s shoes, there are also matches in
pattern between all shoe marks and the suspect’s shoes. This pattern is classified as number
113 in an internal classification list. The file shoeresidues.txt also contains information
about the classified pattern of each shoe in the survey.
a) Use the information in shoeresidues.txt to construct a BN with which the match(es) in
   pattern can be evaluated against source level propositions.
b) Combine the network obtained in a) with the network obtained in task 1 to get a combined
   evidence value for the match in soil and the match in pattern. What assumptions need to
   be made for this network to be valid?
3. Transfer of fibres
Let’s assume that in bank robberies where an escape car is used, the driver of the car seldom
takes part in the robbery itself (although he may be convicted of assisting in the robbery).
A provisional estimate is that the driver takes an active part in only 2% of all bank
robberies. Now assume that a witness has pointed out a suspect as having taken an active part
in a bank robbery. Inspecting the escape car we find 6 foreign wool fibres on the driver’s
seat, but no foreign fibres are recovered anywhere else in the car. The suspect has a woollen
pullover with exactly the same kind of fibres as those recovered from the driver’s seat. The
fibres are quite common, though, and are estimated to be found in 23% of all woollen
pullovers on the market. The witness has, however, stated that the suspect wore that
particular pullover when committing the robbery.
There is also a large study of car seats investigating the prevalence of persisting fibres.
This study shows that one may expect to find one group of foreign fibres on 43% of all car
seats, two groups of (different) foreign fibres on 22% of all car seats and more than two
groups on 10% of all car seats. Further, an experimental study shows that woollen garments
tend to leave groups of between 2 and 10 fibres on car seats in 2 out of 5 cases when there
is contact between the garment and the seat, and more than 10 fibres in 1 out of 15 cases.
One may also assume that once fibres have been left they are very persistent and can
therefore be recovered at an inspection with a probability close to 100%.
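
The figures above give only some of the states of the background and transfer nodes; the
remaining probabilities follow as complements. The sketch below completes the two tables
with exact fractions. The state labels, and the choice to lump every unlisted transfer
outcome into a single "other" state, are assumptions made for illustration and are not
prescribed by the exercise.

```python
from fractions import Fraction

# Background: number of groups of foreign fibres on a car seat (survey figures above).
background = {
    "1 group": Fraction(43, 100),
    "2 groups": Fraction(22, 100),
    ">2 groups": Fraction(10, 100),
}
# Assumption: the remaining seats carry no group of foreign fibres.
background["0 groups"] = 1 - sum(background.values())

# Transfer from a woollen garment, given contact (experimental figures above).
transfer = {
    "2-10 fibres": Fraction(2, 5),
    ">10 fibres": Fraction(1, 15),
}
# Assumption: every other outcome (fewer than 2 fibres) is lumped into one state.
transfer["other"] = 1 - sum(transfer.values())

for name, table in (("background", background), ("transfer", transfer)):
    assert sum(table.values()) == 1  # each table must be a proper distribution
    print(name, {state: str(p) for state, p in table.items()})
```

Working with Fraction rather than floats keeps the complements exact, which makes it easy to
confirm that each table sums to one before it is entered into the BN software.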
Use all this information to set up a BN with which it is possible to evaluate the match
between the recovered fibres and the fibres of the suspect’s pullover against the pair of
propositions:
Hp: The suspect drove the escape car
Hd: The suspect has never sat on the driver’s seat of the escape car
Discuss the validity of the way the evidence is evaluated. Are there serious drawbacks with
the procedure?
4. DNA in a crime case
Assume a stain with the genotype AB has been found, and a suspect has been tested and found
to have the genotype AB as well. Assume the population frequencies of A and B are 0.03 and
0.12, respectively. Construct a Bayesian network for this situation, where you use one node
for each genotype, one node for each of the involved paternal alleles, and one node for each
of the involved maternal alleles. Also use one node for the hypotheses of the
prosecutor/defence.
a) If you assume that the prior probability that the suspect deposited the stain is 0.01,
   what is the posterior probability given the genetic data? (Use the network to compute
   it.)
b) Use the network instead to compute the probability of the observed data under each of
   the two hypotheses.
c) Set up an equation computing the answer in (a) from the answer in (b).
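
The network results in a) to c) can be checked by hand. The sketch below is one way to do
that, assuming Hardy-Weinberg proportions, an unrelated alternative donor under the defence
hypothesis, and error-free typing; the function names are hypothetical helpers and are not
part of any BN software.

```python
def genotype_frequency(p, q):
    """Hardy-Weinberg frequency of a heterozygote with allele frequencies p and q."""
    return 2 * p * q


def posterior_from_likelihoods(prior, lik_hp, lik_hd):
    """Bayes' theorem for two mutually exclusive and exhaustive hypotheses."""
    return prior * lik_hp / (prior * lik_hp + (1 - prior) * lik_hd)


p_a, p_b = 0.03, 0.12
f_ab = genotype_frequency(p_a, p_b)

# Hp: the suspect left the stain, so the stain genotype follows from the suspect's.
lik_hp = f_ab * 1.0
# Hd: suspect and stain donor are unrelated, so both genotypes arise independently.
lik_hd = f_ab * f_ab

lr = lik_hp / lik_hd
posterior = posterior_from_likelihoods(0.01, lik_hp, lik_hd)
print(f"P(E|Hp) = {lik_hp:.6f}, P(E|Hd) = {lik_hd:.8f}")
print(f"LR = {lr:.2f}, posterior = {posterior:.4f}")
```

The function posterior_from_likelihoods is the equation asked for in c) written out as code:
it takes the two probabilities from b) and the prior from a) and returns the posterior.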
5. DNA in a disputed paternity case
Using the same type of nodes as in exercise 4, set up a simple BN for paternity cases, using
a single locus with three possible alleles: A, B, and C. Use nodes for the genotype of the
mother, the genotype of the father, and the genotype of the child, and in addition nodes for
paternal and maternal alleles. Assume the allele frequencies are 0.3, 0.45, and 0.25,
respectively. Also use a node for the two usual hypotheses in paternity cases.
a) Compute the posterior for the hypothesis that the putative father is the father,
   assuming that the prior is 0.5, and that the data are AB, AC, AC for the mother, father,
   and child, respectively.
b) Compute the likelihood ratio in the same case for the putative father being the real
   father.
c) Assume you also have data from an independent locus, with possible alleles X, Y, and
   Z, and frequencies 0.1, 0.1, 0.8, respectively. Make a network combining information
   from both loci, and compute the result if the data are XX, XY, XY for the mother,
   father, and child, respectively.
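
The paternity network can likewise be checked by direct enumeration. The sketch below
assumes Mendelian inheritance without mutation, a random unrelated man as the alternative
father, and independence between the two loci; all function names are hypothetical helpers
introduced for illustration.

```python
from itertools import product


def transmit(genotype):
    """Distribution of the allele a parent with this genotype passes on."""
    dist = {}
    for allele in genotype:
        dist[allele] = dist.get(allele, 0.0) + 0.5
    return dist


def child_likelihood(child, maternal, paternal):
    """P(child genotype) given distributions of the maternal and paternal allele."""
    target = sorted(child)
    return sum(
        pm * pf
        for (m, pm), (f, pf) in product(maternal.items(), paternal.items())
        if sorted((m, f)) == target
    )


def paternity_lr(mother, father, child, freqs):
    """LR for 'putative father is the father' versus 'a random man is the father'."""
    maternal = transmit(mother)
    numerator = child_likelihood(child, maternal, transmit(father))
    # Under the alternative, the paternal allele is drawn from the population.
    denominator = child_likelihood(child, maternal, freqs)
    return numerator / denominator


def posterior(prior, lr):
    """Posterior probability from a prior probability and a likelihood ratio."""
    return prior * lr / (prior * lr + 1 - prior)


lr1 = paternity_lr("AB", "AC", "AC", {"A": 0.3, "B": 0.45, "C": 0.25})
lr2 = paternity_lr("XX", "XY", "XY", {"X": 0.1, "Y": 0.1, "Z": 0.8})
print("locus 1:", lr1, posterior(0.5, lr1))
# Independent loci: the likelihood ratios multiply.
print("both loci:", lr1 * lr2, posterior(0.5, lr1 * lr2))
```

The genotype probabilities of the mother and the putative father are the same under both
hypotheses and cancel in the ratio, which is why only the child's genotype probability
appears in paternity_lr.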