Bioinformatics - Multiple sequence comparisons


Multiple sequence comparisons

Alignments of more than two sequences are called "multiple sequence alignments". This
technique is used to determine the relationships among several sequences and to
construct phylogenetic trees.

Determining family history is a favorite pastime, generating trees of the relationships among
family members. In science and medicine, researchers often have to determine the relationships
and descent of DNA sequences and proteins to hunt down diseases or to determine the level
of relatedness among organisms, proteins, genes, or plain sequences. Bioinformatically, this
problem is solved by first identifying the two sequences that are most closely related and
aligning them. Then, increasingly divergent sequences are added to the alignment to generate a
multiple sequence alignment of all sequences. This process can take quite some
time, depending on the number, length, and degree of similarity of the sequences.
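
The "closest pair first, then increasingly divergent sequences" idea can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the sequence names and
bases are made up, the scoring values (match 1, mismatch -1, gap -2) are arbitrary, and the
helper names nw_score and addition_order are hypothetical. It is not how ClustalW itself is
implemented, which builds a guide tree rather than a simple ordering.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score, keeping one DP row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def addition_order(seqs):
    """Start with the best-scoring pair, then add the sequence closest to any already chosen."""
    names = list(seqs)
    score = {
        frozenset(pair): nw_score(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(names, 2)
    }
    first, second = max(combinations(names, 2), key=lambda p: score[frozenset(p)])
    order = [first, second]
    remaining = [n for n in names if n not in order]
    while remaining:
        nxt = max(remaining, key=lambda n: max(score[frozenset((n, m))] for m in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Made-up toy sequences, not real HIV/SIV data.
toy = {
    "seqA": "ATGCGTACGTTAG",
    "seqB": "ATGCGTACGATAG",
    "seqC": "ATGCGAACGATAC",
    "seqD": "TTGAGAACCATAC",
}
order = addition_order(toy)
print(order)
```

Every pair of sequences has to be compared before the first alignment step, so the number of
pairwise comparisons grows roughly with the square of the number of sequences. This is one
reason the process slows down as more and longer sequences are added.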


ClustalW is one of the better-known tools for multiple sequence alignment. It has been
around for several years and remains widely used. In this exercise, perform a
multiple sequence alignment to determine the relationship among the genes from HIV and
SIV DNA isolated from a variety of primates.


Open The DNALC BioServer at


Underneath 'Sequence Server' select 'Enter'


Click 'Manage Groups'


Wait until the 'Classes' window has loaded. Then, in the upper right-hand corner, find
'Sequence sources:' and click on the arrowhead right underneath it (to the right of the
word 'Classes')


Select 'Public'


Find 'HIV / SIV env', check the box to the left of it, and select 'OK' at the bottom


On the workspace you will now find one sequence displayed. View it by clicking
'OPEN'. (You will not be able to edit the sequence unless you are the registered user
who entered it.)


Select 'DONE' when you are done viewing the sequence


In order to pull more sequences onto the workspace move your cursor to the arrow head
to the right of the word 'None', then click it.


Select another sequence, and repeat until you have pulled all sequences onto the workspace


Check the sequences you wish to align (all), make sure that 'Align: CLUSTALW' shows
in the window next to 'COMPARE', and click 'COMPARE'


Wait for the alignment to be displayed. Please be patient, as this may take a while.


View the alignment and determine where the sequences differ from each other.


How are differences indicated in the output?


How many differences can you identify?


Are the differences distributed evenly among the sequences, or are some sequences
more similar to each other than to the rest?
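
If you copy the aligned sequences out of the result window, the questions above can also be
checked programmatically. The sketch below marks every column in which all sequences agree
with '*', the symbol Clustal-style output uses for fully conserved columns, and lists the
remaining columns as differences. The isolate names and bases are invented for illustration,
the function names are hypothetical, and the BioServer display may mark differences in its
own way.

```python
def conservation_line(aligned):
    """Return '*' for columns where all sequences share the same base, ' ' otherwise."""
    rows = list(aligned.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned sequences must all have the same length")
    return "".join(
        "*" if len(set(col)) == 1 and col[0] != "-" else " "
        for col in zip(*rows)
    )


def variable_columns(aligned):
    """1-based positions of columns that contain a mismatch or a gap."""
    line = conservation_line(aligned)
    return [i + 1 for i, mark in enumerate(line) if mark != "*"]


# Invented toy alignment; '-' is a gap inserted by the aligner.
toy_alignment = {
    "isolate1": "ATGAGAGTGA",
    "isolate2": "ATGAGAGCGA",
    "isolate3": "ATG-GAGTGA",
}

for name, row in toy_alignment.items():
    print(f"{name:10s} {row}")
print(f"{'':10s} {conservation_line(toy_alignment)}")
print("columns with differences:", variable_columns(toy_alignment))
```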


Try to identify the sequences that deviate most from the others and redo the ClustalW
alignment after unchecking those.
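
One hedged way to decide which sequences "deviate a lot" is to compute, for each sequence,
its average fraction of differing columns against all the others and look at the largest
values. The sketch below assumes you already have the aligned rows as equal-length strings.
The names s1 to s4 and their bases are made up, and the helper names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned columns that differ, skipping columns where both rows have a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_distances(aligned):
    """Average p-distance of each sequence to every other sequence."""
    names = list(aligned)
    totals = {name: 0.0 for name in names}
    for m, n in combinations(names, 2):
        d = p_distance(aligned[m], aligned[n])
        totals[m] += d
        totals[n] += d
    return {name: totals[name] / (len(names) - 1) for name in names}


# Invented toy alignment in which s4 is deliberately the odd one out.
toy = {
    "s1": "ATGAGAGTGA",
    "s2": "ATGAGAGCGA",
    "s3": "ATGAGAGTGA",
    "s4": "TTGCGACTCA",
}
means = mean_distances(toy)
for name in sorted(means, key=means.get, reverse=True):
    print(f"{name}: {means[name]:.2f}")
```

Sequences at the top of this ranking are candidates for unchecking before the alignment is
rerun. Where to draw the line is a judgment call and not something the numbers decide.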


Which sequences are more closely related to each other, the ones with more or with fewer
differences? Try to determine which two sequences are closest to each other, then
determine the relationships of the other sequences to these two.
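
The last question amounts to grouping the sequences step by step, closest first. A minimal
sketch of that grouping is average-linkage clustering (UPGMA) on the fraction of differing
columns. Note that this is a technique swapped in for illustration: ClustalW builds its own
guide tree with a different method (neighbor-joining), and the sequence names and bases below
are invented.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned columns in which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(aligned):
    """Average-linkage clustering; returns a Newick-style string without branch lengths."""
    clusters = {name: [name] for name in aligned}  # cluster label -> member names

    def avg(c1, c2):
        ds = [p_distance(aligned[m], aligned[n])
              for m in clusters[c1] for n in clusters[c2]]
        return sum(ds) / len(ds)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[f"({a},{b})"] = merged
    return next(iter(clusters)) + ";"


# Invented toy alignment with two obvious groups.
toy = {
    "h1": "ATGAGAGTGA",
    "h2": "ATGAGAGCGA",
    "m1": "TTGCGACTCA",
    "m2": "TTGCGACAGA",
}
tree = upgma(toy)
print(tree)
```

The innermost parentheses hold the pair that was joined first, that is, the two closest
sequences. Each further level of nesting shows how the remaining sequences relate to them.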