Galaxy User Manual

crashclappergapSoftware and s/w Development

Dec 13, 2013 (3 years and 5 days ago)



1. Overview

Galaxy is a framework for integrating computational tools. It allows nearly any tool that
can be run from the command line to be wrapped in a structured, well-defined interface.

On top of these tools, Galaxy provides an accessible environment for interactive analysis
that transparently tracks the details of analyses, a workflow system for convenient reuse,
data management, sharing, publishing, and more.

Galaxy can be used to execute the existing tools or to add new tools to Galaxy and
execute them. To execute a tool, you have to provide the input dataset that the tool
needs. Each tool is provided with a description and help for using it. Clicking on the
“execute” button invokes the tool with the given dataset, to be run on the grid.

2. Invoking Galaxy:
Load a web browser and access http://<ip-address>:8080. Here <ip-address> is the IP
address of the grid headnode on which Galaxy is installed. Galaxy is installed on GARUDA
on the OSDD headnode.


3. Steps for Tool integration in Galaxy:

1) Create a directory named after the new tool: $HOME/galaxy-central/tools/newtool

2) Create newtool.xml and a wrapper file newtool.py, newtool.perl, or newtool.sh in the
above directory. The file newtool.xml holds the details of the tool to be added, including
the formats of the input and output files. The wrapper file (.py, .perl, or .sh) holds
the details of the executable to be invoked, the arguments to be passed, and the job to
be done on the given input files.

3) Edit $HOME/galaxy-central/tool_config.xml to provide the details of the new tool.
Once this file is changed, the new tool is visible in Galaxy.
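
The first two steps can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name scaffold_tool, the galaxy_root argument, and the skeleton file contents are hypothetical, and a Python wrapper is used although a .perl or .sh wrapper would follow the same pattern. Editing tool_config.xml (step 3) is left as a manual step.

```python
import os

# Hypothetical skeletons; a real tool needs its own parameters and command line.
XML_SKELETON = """<tool id="%(name)s" name="%(name)s">
  <description>TODO: describe the tool</description>
  <command interpreter="python">%(name)s.py $input_file $output_file</command>
  <inputs>
    <param name="input_file" type="data" format="txt" label="Input file" />
  </inputs>
  <outputs>
    <data name="output_file" format="txt" />
  </outputs>
  <help>TODO: help text</help>
</tool>
"""

PY_SKELETON = """#!/usr/bin/env python
import sys

def __main__():
    infile, outfile = sys.argv[1], sys.argv[2]
    # TODO: invoke the real executable on infile and write outfile

if __name__ == "__main__":
    __main__()
"""

def scaffold_tool(galaxy_root, name):
    """Create <galaxy_root>/tools/<name>/ with skeleton <name>.xml and <name>.py."""
    tool_dir = os.path.join(galaxy_root, "tools", name)
    if not os.path.isdir(tool_dir):
        os.makedirs(tool_dir)
    xml_path = os.path.join(tool_dir, name + ".xml")
    py_path = os.path.join(tool_dir, name + ".py")
    with open(xml_path, "w") as handle:
        handle.write(XML_SKELETON % {"name": name})
    with open(py_path, "w") as handle:
        handle.write(PY_SKELETON)
    return xml_path, py_path
```

Calling scaffold_tool(os.path.expanduser("~/galaxy-central"), "newtool") would create the directory from step 1 and the two files from step 2, which then need to be filled in by hand.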


EXAMPLE 1: For integration of Fasta for sequence alignment:


Step 1:

A directory named fastatool was created in $HOME/galaxy-central/tools, i.e.
$HOME/galaxy-central/tools/fastatool


Step 2:

An XML file (fastatool.xml) and a tool wrapper (fastatool.py, a Python script) were
created in the $HOME/galaxy-central/tools/fastatool folder.







Contents of fastatool.xml:

<tool id="fastatool" name="fastatool">
  <description>For sequence alignment</description>

  <!-- Invoking fastatool.py with the relevant parameters (parameters to be passed) -->
  <command interpreter="python">fastatool.py $input_file1 $input_file2 $output_file</command>

  <!-- Specify input file formats -->
  <inputs>
    <param name="input_file1" type="data" format="fasta" label="Query File" />
    <param name="input_file2" type="data" format="fasta" label="Library File" />
  </inputs>

  <!-- Specify output format -->
  <outputs>
    <data name="output_file" format="txt">
    </data>
  </outputs>

  <help>

**What it does**

This tool is for sequence alignment. The user can upload a query sequence file and a
library (database) file.

Once the Fasta_Alignment tool is invoked, the user can select the uploaded query and
database files in the drop-down list and execute Fasta.

The Fasta job is scheduled using the PBS scheduler. Once the job is done, the alignment
file is returned to the user.

  </help>
</tool>

Contents of fastatool.py:

#!/usr/bin/env python

import sys, os

def __main__():
    # Assign the input and output file paths to temporary variables
    outfile = sys.argv[3]
    arg1 = sys.argv[1]
    arg2 = sys.argv[2]

    # Command to run the Fasta tool. The Fasta executable is invoked here
    # along with the arguments to be passed.
    cmd = "/usr/local/GARUDA/OSDD/fasta34 -Q -O " + outfile + " " + arg1 + " " + arg2
    os.system(cmd)

if __name__ == "__main__" : __main__()
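
Because the wrapper joins its arguments into a single shell string, file names containing spaces or shell metacharacters would be misparsed. A hedged alternative is to pass the arguments as a list through the standard subprocess module. The executable path and the -Q -O flags are taken from the listing above; the helper name build_fasta_command is hypothetical, and the sketch assumes fasta34 takes its arguments in the same order as in that listing.

```python
#!/usr/bin/env python
import subprocess
import sys

FASTA_EXECUTABLE = "/usr/local/GARUDA/OSDD/fasta34"  # path taken from fastatool.py

def build_fasta_command(query, library, outfile):
    """Return the argument list equivalent to the shell string in fastatool.py."""
    return [FASTA_EXECUTABLE, "-Q", "-O", outfile, query, library]

def main(argv):
    # Galaxy passes: query file, library file, output file
    query, library, outfile = argv[1], argv[2], argv[3]
    # A list argument bypasses the shell, so file names are passed verbatim
    return subprocess.call(build_fasta_command(query, library, outfile))

if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

The return value of subprocess.call is the exit status of the executable, which the wrapper passes on so that a failed run is not silently reported as success.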


Step 3:

Add the tool name in the $HOME/galaxy-central/tool_config.xml file:

<section name="fastatool" id="mTools1">
  <tool file="fastatool/fastatool.xml" />
</section>

Figure 1: Snapshot of Fasta tool integrated in Galaxy
(figure label: Fasta Tool for Sequence Alignment)

Figure 2: Snapshot to show execution of Fasta from Galaxy
(figure label: Download Fasta Alignment Output)


EXAMPLE 2: To add Autodock Tool in Galaxy


The files autodock.py and autodock.xml were created in the ~/galaxy-central/tools/autodock
folder.


Contents of autodock.py:

------------------------------------------------------------------------------------------------------------

#!/usr/bin/env python

import sys, os, shutil, tempfile, warnings

def __main__():
    arg = sys.argv[1]
    inputfile = sys.argv[2]

    # Autodock understands only the .dpf extension, but the files uploaded
    # through Galaxy have a .dat extension. Convert the .dat file to a .dpf file.
    # (rstrip(".dat") would strip trailing characters, not the suffix,
    # so the extension is removed explicitly.)
    if inputfile.endswith(".dat"):
        inputfile = inputfile[:-len(".dat")]
    inputfile = inputfile + ".dpf"

    # Create the dpf file
    f1 = open(inputfile, "w")
    f1.close()
    shutil.copyfile(sys.argv[2], inputfile)

    outfile = sys.argv[3]

    # Untar all supporting files in the current directory
    cmd = "tar -xf " + arg + " -C . "
    os.system(cmd)

    template = """
cp /usr/local/GARUDA/OSDD/autodock4 .
STR1=`echo %s | cut -f 1 -d "."`
./autodock4 -p %s -l $STR1.dlg
cp $STR1.dlg %s
"""

    script = template % (inputfile, inputfile, outfile)
    Script_file = "autodock_file.sh"

    FILE = open(Script_file, "w")
    FILE.write(script)
    FILE.close()

    cmd = " sh " + Script_file
    os.system(cmd)

if __name__ == "__main__" : __main__()

------------------------------------------------------------------------------------------------------------
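
When converting the .dat name to .dpf, note that str.rstrip(".dat") removes any trailing run of the characters '.', 'd', 'a' and 't' rather than the literal suffix, so a name such as dataset_data.dat loses more than its extension. A minimal sketch of a suffix-based conversion follows; to_dpf_name is a hypothetical helper name and the file names are made up for illustration.

```python
import os

def to_dpf_name(path):
    """Return path with a trailing .dat replaced by .dpf (or .dpf appended)."""
    base, ext = os.path.splitext(path)
    if ext == ".dat":
        return base + ".dpf"
    return path + ".dpf"

if __name__ == "__main__":
    name = "dataset_data.dat"
    print(name.rstrip(".dat") + ".dpf")  # dataset_.dpf (too much removed)
    print(to_dpf_name(name))             # dataset_data.dpf
```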



Contents of autodock.xml:

------------------------------------------------------------------------------------------------------------

<tool id="autodock4" name="Autodock" version="4.2.3">
  <description>from your system</description>
  <command interpreter="python">autodock.py $input_file $input_file1 $output_file</command>

  <inputs>
    <param name="input_file" type="data" format="txt" label="All Input files (tar)" />
    <param name="input_file1" type="data" format="txt" label="Input dpf file" />
  </inputs>

  <outputs>
    <data name="output_file" format="txt">
    </data>
  </outputs>

  <help>

**What it does**

This tool is for docking. The user needs to provide the required input files as a tar
file and the .dpf file (docking parameter file) as inputs.

  </help>
</tool>

------------------------------------------------------------------------------------------------------------
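
A common mistake in a tool definition is a $variable in the command line that no input or output declares. The sketch below checks for that with the standard library. It is a simplification under stated assumptions: undeclared_variables is a hypothetical helper, only top-level param and data elements are considered, and only plain $name references are recognised, so any richer template syntax in the command is ignored.

```python
import re
import xml.etree.ElementTree as ET

def undeclared_variables(tool_xml):
    """Return the $names used in <command> that no <param> or <data> declares."""
    tool = ET.fromstring(tool_xml)
    declared = set()
    for param in tool.findall("inputs/param"):
        declared.add(param.get("name"))
    for data in tool.findall("outputs/data"):
        declared.add(data.get("name"))
    command = tool.findtext("command") or ""
    used = re.findall(r"\$(\w+)", command)
    return sorted(set(used) - declared)

EXAMPLE_TOOL = """<tool id="autodock4" name="Autodock" version="4.2.3">
  <command interpreter="python">autodock.py $input_file $input_file1 $output_file</command>
  <inputs>
    <param name="input_file" type="data" format="txt" label="All Input files (tar)" />
    <param name="input_file1" type="data" format="txt" label="Input dpf file" />
  </inputs>
  <outputs><data name="output_file" format="txt" /></outputs>
</tool>"""

if __name__ == "__main__":
    print(undeclared_variables(EXAMPLE_TOOL))  # [] when every variable is declared
```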





Figure 3: Screenshot of Execution of au























Fig1: OSDD-Garuda Interface Architecture (Method 1)






3. Prerequisites

The only prerequisite to run your own Galaxy is a Python interpreter, version 2.4 or
greater.

$ python --version
Python 2.6.4
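
The same check can be made from within Python. This minimal sketch mirrors the wording "version 2.4 or greater" literally; meets_minimum is a hypothetical helper name, and whether a particular Galaxy release runs on any given newer interpreter is not something this check can tell.

```python
import sys

MINIMUM_VERSION = (2, 4)  # minimum stated in this manual

def meets_minimum(version_info, minimum=MINIMUM_VERSION):
    """True if the (major, minor, ...) tuple is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)

if __name__ == "__main__":
    if meets_minimum(sys.version_info):
        print("Python version is sufficient")
    else:
        print("Python 2.4 or greater is required")
```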

4. Creating a Galaxy instance

Cloning the Galaxy repository: The development and release repositories are
available through the Bitbucket hosting service.


To create a local clone of the release repository run the following:

$ hg clone http://www.bx.psu.edu/hg/galaxy galaxy_dist



Initial Setup: Galaxy includes a setup script that can be run to configure a new instance:

$ cd ~/gridmon/galaxy-dist
/galaxy-dist$ sh setup.sh

This script performs two main actions:

- Creates initial configuration files, including the main file universe_wsgi.ini,
  and empty directories for storing data files.

- Fetches all of the Galaxy framework's dependencies, packaged as Python eggs,
  for the current platform.

5. Running Galaxy

Now that initial configuration is complete, you can start your Galaxy instance by running:

gridmon@g8 galaxy-dist$ sh run.sh

When running a new instance for the first time, Galaxy first initializes its database.
Galaxy uses a database migration system to automatically handle any changes to the
database schema. On first load it runs all migrations to ensure the database is in a known
state, which may take a little time.

Once the database is initialized, the normal startup process proceeds, loading tool
configurations, starting the job runner, and finally initializing the web interface on
port 8080. You can now access your Galaxy at http://localhost:8080
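
Since the first start can take a while, a script that depends on Galaxy being up may want to wait until the web port accepts connections. The following is a minimal sketch using only the standard socket module; wait_for_port is a hypothetical helper name and the timeout values are arbitrary. A successful TCP connection shows only that something is listening on the port, not that Galaxy has finished loading.

```python
import socket
import time

def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.time() + timeout
    while True:
        try:
            conn = socket.create_connection((host, port), timeout=interval)
            conn.close()
            return True
        except (socket.error, socket.timeout):
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 8080, timeout=120.0) could be called after starting run.sh and returns True once the port is reachable, or False if the timeout expires first.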


See GetGalaxy for more information on setting up Galaxy on other platforms (e.g. Mac
OS X).

Running analyses with Galaxy

1. Access your new Galaxy instance

Load a web browser and access http://localhost:8080.