BINF 101 Introduction to Bioinformatics Spring 2008 Final Exam
May 1, 2008
This is an open book, open notes test.
Please write all your answers on the blue books.
Write your name on the blue books.
1.(15 points) For the following distance matrix (of some gene sequences s1, s2, s3, s4):
      s1   s2   s3   s4
s1     0    5    4    4
s2     5    0    3    7
s3     4    3    0    6
s4     4    7    6    0
(a) Is it additive? Explain why or why not.
(b) Apply the neighbor-joining algorithm (my version) to construct a phylogenetic tree.
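For part (a), one standard test of additivity is the four-point condition: for every four taxa i, j, k, l, the two largest of the three sums d(i,j)+d(k,l), d(i,k)+d(j,l), d(i,l)+d(j,k) must be equal. The sketch below applies that check to the matrix above. The function name is_additive and the tolerance tol are illustrative choices, not part of the exam, and the course may have used a different but equivalent criterion.

```python
from itertools import combinations

def is_additive(d, tol=1e-9):
    """Four-point condition: for every quartet, the two largest of the
    three pairwise sums must be equal (within tol)."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Distance matrix from question 1 (rows and columns ordered s1, s2, s3, s4).
D = [
    [0, 5, 4, 4],
    [5, 0, 3, 7],
    [4, 3, 0, 6],
    [4, 7, 6, 0],
]

print(is_additive(D))
```

With only four sequences there is a single quartet to test, so the check reduces to comparing the three sums 5+6, 4+7, and 4+3.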
2.(15 points) For the following multiple sequence alignment of four sequences s1, s2, s3, s4, of length 3, find the maximum parsimony tree; that is, find the phylogenetic tree based on the parsimony method.
s1: GGT
s2: AGG
s3: AAC
s4: GAA
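With four sequences there are only three unrooted topologies, so each can be scored directly. The sketch below uses Fitch's counting rule, which is one common way to compute a parsimony cost and may differ in presentation from the method taught in the course. Trees are written as nested tuples, and the names fitch_site and parsimony_score are illustrative.

```python
def fitch_site(tree, states):
    """Return (candidate state set, substitution count) for one column.
    A tree is either a leaf name (str) or a pair (left, right)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, seqs):
    """Sum the Fitch cost over all alignment columns."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(length)
    )

seqs = {"s1": "GGT", "s2": "AGG", "s3": "AAC", "s4": "GAA"}

# The three possible groupings of four leaves, rooted on the middle edge.
topologies = [
    (("s1", "s2"), ("s3", "s4")),
    (("s1", "s3"), ("s2", "s4")),
    (("s1", "s4"), ("s2", "s3")),
]

scores = {t: parsimony_score(t, seqs) for t in topologies}
for t, score in scores.items():
    print(t, score)
```

The tree or trees with the smallest score are the maximum parsimony candidates. Where the root is placed on the middle edge does not change the Fitch cost.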
3.(15 points) Assume that the score of a match is 1 and the score of a mismatch is 0, and that there is no penalty for gaps. Find the optimal alignment of the following two sequences using the dynamic programming method:
s1: HVADLVAL
s2: ADLHTVAD
NOTE: No credit for visual matching (the computer can't see); you must follow the algorithm step by step.
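The step-by-step procedure can be sketched as a global (Needleman-Wunsch style) table fill followed by a traceback. This is a minimal sketch under the scoring stated in the question (match 1, mismatch 0, gap 0). The function name align and the tie-breaking order in the traceback (diagonal, then up, then left) are assumptions, so other equally optimal alignments are possible.

```python
def align(s1, s2, match=1, mismatch=0, gap=0):
    """Global alignment by dynamic programming.
    Returns (score, aligned_s1, aligned_s2)."""
    n, m = len(s1), len(s2)
    # F[i][j] = best score for aligning s1[:i] with s2[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + gap
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if s1[i - 1] == s2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # align the two residues
                          F[i - 1][j] + gap,     # gap in s2
                          F[i][j - 1] + gap)     # gap in s1
    # Traceback from the bottom-right corner to the origin.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if s1[i - 1] == s2[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                a1.append(s1[i - 1]); a2.append(s2[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            a1.append(s1[i - 1]); a2.append("-")
            i -= 1
        else:
            a1.append("-"); a2.append(s2[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(a1)), "".join(reversed(a2))

score, top, bottom = align("HVADLVAL", "ADLHTVAD")
print(score)
print(top)
print(bottom)
```

Because gaps are free and mismatches score 0, the optimal score equals the number of matched columns, which is the length of a longest common subsequence of the two sequences.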
4.(15 points) In the maximum likelihood method for constructing the phylogenetic tree, consider the following tree topology T of three given leaves x1, x2, x3 (which are nucleotides) with given branch lengths t = (t1, t2, t3, t4); for example, x1 = A, x2 = G, x3 = C, and t1 = 6, t2 = 3, t3 = 4, t4 = 2. What is the likelihood (or the probability) of the tree, P(x1, x2, x3 | T, t), in terms of the probability assignments q(a) and p(xi | xj, t), where q(a) is the probability of assigning a nucleotide a to a node and p(xi | xj, t) is the (conditional) probability of mutation from xj to xi after time t?
[Figure: tree topology T, with leaves x1, x2, x3]
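As a hedged illustration of the form such a likelihood takes, assume (this is an assumption, not something read off the figure) that x1 and x2 hang from an internal node x4 by branches t1 and t2, and that x4 and x3 hang from the root x5 by branches t4 and t3. The labels x4 and x5 are introduced here only for illustration. Summing over the unobserved internal nucleotides then gives:

```latex
P(x_1, x_2, x_3 \mid T, t)
  = \sum_{x_4} \sum_{x_5}
    q(x_5)\, p(x_4 \mid x_5, t_4)\, p(x_3 \mid x_5, t_3)\,
    p(x_1 \mid x_4, t_1)\, p(x_2 \mid x_4, t_2)
```

Each sum runs over the four nucleotides. If the actual figure attaches the branches differently, the same pattern applies with the branch lengths reassigned accordingly.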
5.(15 points) In the BLAST search, what is the meaning of the p-value? How does increasing or decreasing the word size w affect the results? How does increasing or decreasing the threshold T affect the results?
6.(15 points) How is the PAM-1 scoring matrix constructed? What is the meaning of PAM-n, for example, PAM-250? How is the BLOSUM scoring matrix constructed? What is the meaning of BLOSUM-n, for example, BLOSUM-61?
7.(15 points) For multiple sequence alignment, we discussed two methods: multi-dimensional dynamic programming and Feng-Doolittle's progressive alignment using a guide tree (as used in CLUSTALX). What are the pros and cons of each of them?
[Figure for question 4, tree T: two internal nodes (◦), with branches labeled t1, t4, t2, t3]