CSE4990/6990 Bioinformatics

CVM 8890
Final Project

Fall 2005

BioPerl and BLAST

Due December 9, 2005



In this project we will be using BioPerl and BLAST, so you need to make sure that both
are properly installed on your computer and that both work. Make sure you can run some
of the simple programs on the BioPerl web site that use BioPerl modules. After you have
BioPerl working, install BLAST using the instructions distributed in class. Before you
can query a database, you must index the database for BLAST using formatdb. These
instructions were also given out in class.
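
As a rough illustration of the indexing step, the sketch below builds a formatdb command line. It is written in Python purely for illustration (the project itself uses BioPerl) and assumes the legacy NCBI formatdb program is on your PATH. The helper names and the file name in the usage comment are hypothetical. The instructions handed out in class remain the authority on which options to use. Here -i names the input FASTA file and -p states whether it holds protein (T) or nucleotide (F) sequences.

```python
import subprocess


def formatdb_command(fasta_path, is_protein):
    """Build the argument list for legacy NCBI formatdb.

    -i names the input FASTA file; -p is T for protein, F for nucleotide.
    """
    return ["formatdb", "-i", fasta_path, "-p", "T" if is_protein else "F"]


def index_database(fasta_path, is_protein):
    """Run formatdb; raises CalledProcessError if it exits with an error."""
    subprocess.run(formatdb_command(fasta_path, is_protein), check=True)


# Hypothetical usage (the file name is a placeholder):
# index_database("zebrafish_proteins.fasta", is_protein=True)
```

A protein set is indexed with is_protein=True, and a chromosome sequence with is_protein=False.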


You get to define your own project (within limits). Your program should use a set of
sequences to query a database using BLAST. Here are some possible examples:


1. Suppose you are interested in an organism that has not been sequenced, and a close
relative of your organism has been sequenced. Download and index a protein set for the
sequenced organism (http://www.ncbi.nlm.nih.gov/Ftp/). Download a set of assembled
ESTs (TCs) for your organism from TIGR (http://www.tigr.org/tdb/tgi/index.shtml).
BLAST each EST against the protein database and generate a file that contains the best
hit for each EST. Make sure that you set some limits on the hits (for example, a
maximum E-value).


    Zebrafish (sequenced) and catfish (ESTs)
    Rice (sequenced) and maize (ESTs)



2. Same as #1, except that you download the sequence for one chromosome (FASTA format)
of a sequenced organism and index the chromosome. BLAST each EST of a related,
unsequenced organism against the chromosome database.


3. Same as #1 or #2 above, except that you BLAST peptides against a protein set or a chromosome.
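
One way to approach the "best hit for each query" step shared by these examples is to ask BLAST for tabular output and filter it afterwards. The sketch below is in Python for illustration only (the project itself uses BioPerl). It assumes legacy blastall run with -m 8, which writes 12 tab-separated columns per hit, with the E-value and bit score in the last two. The 1e-5 cutoff, the function names, and the choice of bit score as the measure of "best" are all assumptions, not requirements of the assignment.

```python
def blastall_command(program, database, query, output, evalue=1e-5):
    """Argument list for legacy blastall with tabular (-m 8) output.

    program: blastx for ESTs vs. proteins, blastn for ESTs vs. a chromosome,
    blastp or tblastn for peptide queries.
    """
    return ["blastall", "-p", program, "-d", database, "-i", query,
            "-e", str(evalue), "-m", "8", "-o", output]


def best_hits(lines, max_evalue=1e-5):
    """Return {query_id: (subject_id, evalue, bit_score)} with the best hit.

    "Best" here means the highest bit score among hits whose E-value is at
    or below max_evalue; queries with no qualifying hit are omitted.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete -m 8 record
        query_id, subject_id = fields[0], fields[1]
        evalue, bit_score = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query_id not in best or bit_score > best[query_id][2]:
            best[query_id] = (subject_id, evalue, bit_score)
    return best
```

Passing the open BLAST output file to best_hits gives one entry per query that had an acceptable hit, which can then be written to the results file. Whatever cutoff you settle on belongs in the report alongside the other search parameters.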


Submit the following:

1. The code for your program.

2. The query file that you used.

3. The files for your indexed database.

4. An output file that lists the relevant hits (or the results of your analysis if your
program is doing more than just giving hits).

5. A relatively short (one or two pages) report that describes the problem you have
solved and the database and query sequences you have used, explains why it is an
interesting problem, and gives the parameters that you used for your BLAST search and
the criteria you have used to determine which results to generate.

6. A short query file that I can use to test your program. The query file should have
10 to 20 sequences and should generate some hits when I run it.
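
A short test query file can be cut from the full query file. The sketch below, in Python for illustration only, copies the first few FASTA records. The function names and the default limit of 20 are assumptions. Taking the first records does not guarantee hits, so run the resulting file through your program and confirm that it produces some before you submit it.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def first_records(lines, limit=20):
    """Return the first `limit` records as FASTA text, one line per sequence."""
    out = []
    for count, (header, seq) in enumerate(read_fasta(lines)):
        if count >= limit:
            break
        out.append(">" + header + "\n" + seq + "\n")
    return "".join(out)
```

Opening the full query file and writing the result of first_records to a new file gives a small query set in the same format.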