Download - WV IDeA Network of Biomedical Research Excellence

Perl Programming: Developing Key Tools for Bioinformatics

An Informative Look Behind the Importance of Programming Skills
and Brief Tutorial on Getting Started With Perl and Bioperl

Andrew C. Rieser

4-1-04

“Computers are powerful devices for understanding any
system that can be described in a mathematical way” (Gibas,
2001).






At one time, computer skills for biologists weren’t important



Mass Quantities of resources on the Internet



Numerous tools to manipulate and discover genomic info



Programming Skills can be very important



The Rapid Growth of GenBank and Other Online Databases

Different Formats that these sequences are stored in


Cynthia Gibas: an assistant professor in biology at Virginia Tech

James Tisdall: consultant for Biocomputing Associates of Kimberton, PA. He was one of the first people to use Perl in bioinformatics, and he is also the developer of DNA WorkBench, a parallel-processing bioinformatics Perl program used worldwide.

http://biowb.sdsc.edu/register.cgi



The majority of biological researchers are not programmers


Biologists often feel that basic computing skills
are all that they need to fulfill their everyday
research tasks.


In many cases, Tisdall observes, “you can
accomplish quite a bit using existing tools.”


“What happens when you want to do something a
preexisting tool doesn’t do? What happens when
you can’t find a tool to accomplish a particular
task, and you can’t find someone to write it for
you?” (Tisdall, 2001).



Skill is in Strong Demand



Makes You More Marketable as a Biologist



Saves Time



Easy Skill to Pick Up

“The only chance biologists have of keeping up with the job of
analyzing data is by developing libraries of reusable software tools.”


Cynthia Gibas, 2001

Common Languages Relevant to Bioinformatics: C/C++, Python, and Perl


Perl has become a very popular bioinformatics programming language
because of its suitability for rapid prototyping: the ability to quickly
write a working program.

Also, Perl is excellent for string manipulation, and bioinformatics deals
mainly with large strings of genetic sequences and base pairs.
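To make the string-manipulation point concrete, here is a minimal sketch of two common sequence tasks: counting bases and transcribing DNA to RNA. The subroutine names and the example sequence are made up for illustration and are not from the talk.

```perl
use strict;
use warnings;

# Count each base in a DNA string (illustrative helper, not from the talk).
sub base_counts {
    my ($dna) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0);
    # Split into single characters and tally only A, C, G, and T.
    $count{$_}++ for grep { exists $count{$_} } split //, uc $dna;
    return %count;
}

# Transcribe DNA to RNA by replacing every T with U.
sub transcribe {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;
    return $rna;
}

my $dna = 'ATGCGTTAGC';    # made-up example sequence
my %n   = base_counts($dna);
print "Length: ", length($dna), "\n";
print "$_: $n{$_}\n" for sort keys %n;
print "RNA: ", transcribe($dna), "\n";
```

Both tasks come down to one built-in operator each (a hash tally and `tr///`), which is a large part of why short Perl scripts go a long way with sequence data.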


Perl is mainly used on Unix/Linux-based operating systems; however,
you can install both Perl and Bioperl on Windows-based operating
systems.

I will be showing you how to install Perl/Bioperl on Win2k
and XP systems, because this is what I installed it on, and I
realized that there weren’t any really good installation
directions for Windows-based systems.



Visit http://www.activestate.com/Products/ActivePerl/?_x=1

Download ActiveState Perl (ActivePerl)




Run the Setup and Follow the Directions



Verify that Perl was Installed
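One way to verify the installation is to type perl -v at the command prompt, which prints the installed version. As a small sketch, the same information can be printed from a script; the file name version.pl and the subroutine name are assumptions for illustration.

```perl
use strict;
use warnings;

# $^V is the version of the running interpreter, $^O the operating
# system name, and $^X the path to the perl executable.
sub perl_info {
    return sprintf("Perl version %vd on %s, interpreter at %s", $^V, $^O, $^X);
}

print perl_info(), "\n";
```

Save it as version.pl and run perl version.pl. If Windows reports that perl is not a recognized command, the installation did not add Perl to the PATH.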





This will set up Perl on your computer.

It should create a folder on your hard drive (typically C:\Perl)
unless you change the location. This folder contains all the modules
and libraries needed to run Perl on your system.

Running Perl on Windows operating systems requires the use of the
MS-DOS command prompt, so get used to the command line, because
this is what you will be using to run all your Perl scripts.


All of your Perl scripts can easily be written in a
basic text editor such as Notepad, then saved with the .pl extension.

** Make sure that “Save as type” is set to “All Files”



Let’s develop a simple “Hello World” test program.

In Notepad, simply type: print "Hello World";

Then save it as test.pl

Now let’s test whether the script works.

1.) Open the MS-DOS prompt, type cd \ and hit ENTER

2.) Type perl C:\windows\desktop\test.pl (or wherever you saved your test.pl) and hit ENTER

3.) “Hello World” should print on the screen! It’s as easy as that!
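A slightly fuller version of the test script is sketched below. The strict and warnings pragmas, the trailing newline, and the greeting subroutine are additions for illustration and are not part of the one-line script above.

```perl
use strict;      # catch undeclared variables
use warnings;    # report common mistakes

# Return the text to print; the newline keeps the command prompt
# from appearing on the same line as the output.
sub greeting {
    return "Hello World\n";
}

print greeting();
```

Note the straight double quotes: if an editor substitutes curly quotes, Perl will report a syntax error.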


Now it will get a little more complicated. Next we
will install Bioperl, which is what caused me the most
trouble! However, once I figured it out, it was fairly
simple.


http://www.bioperl.org/Core/Latest/ (newest release)

Download BioPerl and unpack it using WinZip (or another extraction tool). Extract to the Perl lib directory (typically C:\Perl\lib).

Download Nmake and save it in the Perl lib directory:
http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe

Install Bioperl and its modules

For more info:
http://www.bioperl.org/Core/windows-bioperl.html

Detailed Instructions

1. Open the MS-DOS prompt, type cd C:\Perl\Lib and hit ENTER

2. Type perl Makefile.PL and hit ENTER. You will have to specify the directory.

3. Type nmake and hit ENTER

4. Next, type nmake test and hit ENTER

5. Finally, type nmake install and hit ENTER

It’s that easy. To install other modules not
included in the Bioperl package, follow the
directions below using PPM.



To use PPM

Just go to the DOS command line and type "PPM"
(without the quotes). You will be at the PPM command
prompt. (I should mention that you need to be connected to
the Internet at this point.)

At the PPM prompt, enter "install YOUR::MODNAME".
You will be asked whether you want to continue. At that
point, PPM will connect to ActiveState and see if the
module you requested is available in a pre-compiled form.
If it is, it will install the module and you are all done! PPM
is especially nice because it will even install other required
modules for you.

If you get an error message that the module was not found,
then it's not available from ActiveState and you will have
to find it elsewhere.


NOW BIOPERL IS SUCCESSFULLY INSTALLED!!!
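A quick way to confirm the claim is to ask Perl whether it can load a Bioperl module. The sketch below assumes the Bio::Seq and Bio::SeqIO modules from the Bioperl core were installed; the subroutine name module_available is made up for illustration.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this Perl installation,
# 0 otherwise. The eval traps the error that require raises on failure.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::Seq -> Bio/Seq.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Bio::Seq and Bio::SeqIO are two modules from the Bioperl core.
for my $module (qw(Bio::Seq Bio::SeqIO)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found';
}
```

The same check works for any module installed through PPM: add its name to the list.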

Perl (BioPerl) Examples:



Easy to learn



Rapid Prototyping

FASTA Format:

Fasta.pl Only 9 Lines of Code

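use strict;
use warnings;

# get_data, extract_data, and print_sequence are helper subroutines
# that are not shown on this slide; they must be defined or imported.
my @file_data = ();
my $dna = '';

# Read the sequence file from a hardcoded location.
@file_data = get_data("C:\\Perl\\bin\\BioInfo\\sample.dna.txt");

# Extract the sequence from the file data.
$dna = extract_data(@file_data);

# Print the sequence (the 25 is presumably the line width).
print_sequence($dna, 25);

exit;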

Hardcode the file location, then run the Perl script.
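Fasta.pl calls get_data, extract_data, and print_sequence without showing them. The following is a minimal sketch of what such helpers might look like. The bodies are assumptions and not the author's code, format_sequence is a made-up extra helper, and the sample record is written to a temporary file so the example runs without the hardcoded Windows path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read every line of a file into a list.
sub get_data {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Join the sequence lines of FASTA data, skipping the '>' header line,
# blank lines, and '#' comment lines, then strip all whitespace.
sub extract_data {
    my (@lines) = @_;
    my $sequence = '';
    for my $line (@lines) {
        next if $line =~ /^\s*$/ or $line =~ /^\s*#/ or $line =~ /^>/;
        $sequence .= $line;
    }
    $sequence =~ s/\s//g;
    return $sequence;
}

# Break a sequence into lines of $width characters.
sub format_sequence {
    my ($sequence, $width) = @_;
    my $out = '';
    for (my $pos = 0; $pos < length $sequence; $pos += $width) {
        $out .= substr($sequence, $pos, $width) . "\n";
    }
    return $out;
}

sub print_sequence { print format_sequence(@_); }

# Demo: write a tiny made-up FASTA record to a temporary file, read it back.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} ">sample made-up sequence\nATGCGTACGTTAGC\nGGCTAAGCTA\n";
close $fh;

my @file_data = get_data($path);
my $dna       = extract_data(@file_data);
print_sequence($dna, 10);
```

To use it on a real file, pass that file's path to get_data instead of the temporary one.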

Revcomp.pl

Reverse complementary strands of the FASTA-format sequence we just made!

http://jje.uchicago.edu/revcomp.pl
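The contents of the linked revcomp.pl are not shown here, so the following is only a sketch of the usual Perl idiom for a reverse complement: reverse the string, then swap each base for its partner. The subroutine name and the example sequence are made up for illustration.

```perl
use strict;
use warnings;

# Return the reverse complement of a DNA string.
sub revcomp {
    my ($dna) = @_;
    my $rc = reverse $dna;           # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # A<->T, C<->G, preserving case
    return $rc;
}

print revcomp('ATGCGTTAGC'), "\n";    # prints GCTAACGCAT
```

Characters other than A, C, G, and T (for example N) pass through unchanged in this sketch.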

To develop computer programming skills and the
ability to write your own scripts, you must first
learn how to program. To do this, I recommend
reading …








Ezzell, C. (2000). Hooking up Biologists. Scientific American 283(5), 22.

Gibas, C., and Jambeck, P. (2001). Developing Bioinformatics Computer Skills. Sebastopol: O’Reilly.

Roos, D. (2001). Bioinformatics: Trying to Swim in a Sea of Data. Science 291(5507), 1260-1261.

Stewart, B. (2001, December 7). An Interview with Lincoln Stein. Retrieved April 14, 2002 from the World Wide Web:
http://www.oreillynet.com/pub/a/network/2001/12/07/stein.html

Tisdall, J. (2001, October 15). Why Biologists Want to Program Computers. Retrieved April 12, 2002 from the World Wide Web:
http://www.oreilly.com/news/perlbio_1001.html

Tisdall, J. (2001). Beginning Perl for Bioinformatics. Sebastopol: O’Reilly.