Introduction to Bioinformatics

MBBC 324 Homework #2 Spring 2011



1. Weblem 1.7 (IB - Sontum)

Recover and align the pancreatic ribonuclease sequences from sperm whale, horse and
hippopotamus. Are the results consistent with the relationships shown by the SINES?
(Figure 1.5)


Some hints for this problem: Sequence alignment is an important bioinformatics tool for
looking at evolutionary relationships. This problem can be done at the NCBI website, but
it is easier to do it using the NCBI Genome Workbench.

1) Setting up your project: Open the Genome Workbench. In the Search View window,
choose “Search NCBI Public Databases” for the Search Tool and “Protein” for the
Selected NCBI Database. We are going to load pancreatic ribonuclease protein
sequences from horse, whale and hippo. In the search window, type RNAS1_HORSE
and hit return or press the green Start arrow. When the search result returns,
double-click on it, select “Create new Project” and click OK. This will add the
sequence to your project workspace. Repeat these steps with RNAS1_BALAC and
RNAS1_HIPAM to load the whale and hippo sequences into your project file.

2) Align the sequences using the Needleman-Wunsch Alignment. First select the
P00674 and P00673 sequences in your project window by Ctrl-clicking on them.
Then use “Tools/Run Tools”, select “Needleman-Wunsch Alignment”, and press Next
and then Finish to add a Global alignment record to your project file. Repeat
this step for P00674 and P00672, and then again for P00673 and P00672. This will
add three Global alignment records to your project window.

3) View the alignments by choosing one of the global alignment records and using
View/Open View to select “Multi-pane Cross Alignment View”. This will give you a
dot-plot alignment as well as a sequence alignment. Mouse over the red and blue
aligned sequences to find the percentage of identical matches and the number of
gaps. Use the percentage of identical matches to answer Weblem 1.7.
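
For readers who want to see what the Needleman-Wunsch step computes, the following is a minimal sketch of global alignment and percent identity in Python. The scoring values (match +1, mismatch -1, gap -1) and the two short toy peptides are assumptions chosen for illustration; they are not the Genome Workbench defaults or the ribonuclease sequences, and tools differ in how they define the identity denominator.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b with a linear gap penalty; return both gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Toy peptides (hypothetical, for illustration only).
toy_a, toy_b = "HEAGAWGHEE", "PAWHEAE"
aligned_a, aligned_b = needleman_wunsch(toy_a, toy_b)
print(aligned_a)
print(aligned_b)
print(f"{percent_identity(aligned_a, aligned_b):.1f}% identical")
```

Running the same two functions on each pair of downloaded sequences would give three identity values to compare, in the same spirit as the three Global alignment records above.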

2. Problem 5.1 (IB - Sontum)

Draw a dot plot of the following sequence from the wheat dwarf virus genome:
ttttcgtgagtgcgcggaggctttt against itself. In what respects is this not a perfect palindrome?

Dot plots are the simplest way to see sequence alignment. This will give you one
example of their use.
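
A dot plot can also be produced with a few lines of code. The sketch below, a minimal text-only version, marks a cell whenever the residue in the row equals the residue in the column; the function names `dot_plot` and `render` are made up for this example.

```python
def dot_plot(a, b):
    """Return a matrix of booleans: True where a[i] equals b[j]."""
    return [[x == y for y in b] for x in a]


def render(a, b):
    """Format the dot plot as text, with b along the top and a down the side."""
    lines = ["  " + " ".join(b)]
    for x, row in zip(a, dot_plot(a, b)):
        lines.append(x + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


seq = "ttttcgtgagtgcgcggaggctttt"
print(render(seq, seq))
```

In a self-comparison the main diagonal is always filled. Any palindromic character of the sequence shows up along the opposite diagonal, so gaps there are the places to look at when answering the question.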


3. Problem 2.2 (IB - Ward)

For M. genitalium and H. influenzae, what are the values of (a) gene density in genes/kb,
(b) average gene size in bp, (c) number of genes? Which factor contributes most to the
reduction of genome size in M. genitalium relative to H. influenzae?
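
The arithmetic behind parts (a) and (b) can be sketched as follows. The input numbers are placeholders for a hypothetical genome, not values for either organism, and taking average gene size as total coding length divided by gene count is one reasonable definition assumed here.

```python
def genome_stats(genome_bp, n_genes, coding_bp):
    """Gene density (genes per kb) and average gene size (bp) for one genome."""
    return {
        "density_per_kb": n_genes / (genome_bp / 1000),
        "avg_gene_bp": coding_bp / n_genes,
    }


# Hypothetical genome: 1,000,000 bp, 800 genes, 880,000 bp of coding sequence.
print(genome_stats(1_000_000, 800, 880_000))
```

Substituting the published genome size, gene count and coding length for each organism gives the quantities to compare.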


4. (Ward) Mycoplasmas have recently made the news in work by J. Craig Venter. Why?
Please access the appropriate paper and give a short account of what Venter’s group did
and some specifics about changes that were made to the Mycoplasma genome. What are
some of the ethical and social implications of this work? Please provide references for
any material you use to answer the question other than your textbook.


5. Weblem 2.8 (IB - Ward)

(a) How many predicted ORFs are there on Saccharomyces cerevisiae chromosome X?
(b) How many tRNA genes?


6. Weblem 2.19 (IB - Ward)

(a) What is the normal function of the protein that is defective in Menkes disease? (b) Is
there a homologue of this gene in the A. thaliana genome? (c) If so, what is its function in
A. thaliana?


7. Problem 4.2 (IG - Ward)


8. Problem 4.3 (IG - Ward)


9. (Ward) You have obtained the following sequence reads from an environmental DNA
genome sample:


Read 1: ATGCGATCTGTGAGCCGAGTCTTTA

Read 2: AACAAAAATGTTGTTATTTTTATTTCAGATG

Read 3: TTCAGATGCGATCTGTGAGCCGAG

Read 4: TGTCTGCCATTCTTAAAAACAAAAATGT

Read 5: TGTTATTTTTATTTCAGATGCGA

Read 6: AACAAAAATGTTGTTATT


(a) Use these sequence reads to create a sequence contig.

(b) Translate the sequence into all possible reading frames.

(c) Identify the gene and organism using BLASTp.
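
Parts (a) and (b) can be done by hand, but a short script is a useful cross-check. The sketch below assumes error-free reads that all come from the same strand, uses a simple greedy merge on the longest suffix/prefix overlap (one of several possible assembly strategies), and translates with the standard genetic code. The helper names and the minimum overlap of 3 bases are choices made for this example.

```python
from itertools import product

READS = [
    "ATGCGATCTGTGAGCCGAGTCTTTA",
    "AACAAAAATGTTGTTATTTTTATTTCAGATG",
    "TTCAGATGCGATCTGTGAGCCGAG",
    "TGTCTGCCATTCTTAAAAACAAAAATGT",
    "TGTTATTTTTATTTCAGATGCGA",
    "AACAAAAATGTTGTTATT",
]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    unique = sorted(set(reads))
    # Reads fully contained in another read add no new sequence.
    seqs = [r for r in unique if not any(r != s and r in s for s in unique)]
    while len(seqs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join; return the separate contigs
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for n, s in enumerate(seqs) if n not in (best_i, best_j)]
        seqs.append(merged)
    return seqs


# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Translations of the three forward and three reverse-strand frames."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


contigs = greedy_assemble(READS)
for contig in contigs:
    print(contig)
    for name, protein in six_frames(contig).items():
        print(name, protein)
```

The frame with the longest stretch uninterrupted by `*` is the natural candidate to submit to BLASTp for part (c).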


10. (Ward) Compare and contrast the approaches of Roche/454 pyrosequencing vs.
Illumina/Solexa NGS in a short paragraph. What are the advantages and disadvantages of
both? Please reference your work.