What is Seahawk?
Seahawk is a browser for Moby Web services, which are online tools using a shared
registry and data formats.
This commonality lets users chain together multiple services into an
analysis pipeline, commonly
referred to as a workflow when visual and executable by a
The goal of Seahawk is to help biologists automate their analyses (i.e. create
workflows) without needing to explicitly write a program.
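As a rough illustration of what chaining services means, the sketch below models a registry in which each service declares an input and an output data type, so that a workflow is just a list of services whose types line up. The service names, type names, and functions are all hypothetical stand-ins; this is not the Moby registry API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: service name -> (input type, output type, function).
# The names and behaviours are illustrative only, not real Moby services.
Service = Tuple[str, str, Callable[[str], str]]

REGISTRY: Dict[str, Service] = {
    "fetch_sequence": ("GI", "AminoAcidSequence", lambda gi: f"sequence-for-{gi}"),
    "run_blast": ("AminoAcidSequence", "BlastReport", lambda seq: f"blast({seq})"),
}


def services_for(data_type: str) -> List[str]:
    """List the services that accept the given data type as input."""
    return [name for name, (in_type, _, _) in REGISTRY.items() if in_type == data_type]


def run_workflow(steps: List[str], data_type: str, value: str) -> Tuple[str, str]:
    """Run the services in order, checking that each input type matches."""
    for name in steps:
        in_type, out_type, func = REGISTRY[name]
        if in_type != data_type:
            raise TypeError(f"{name} expects {in_type}, got {data_type}")
        value, data_type = func(value), out_type
    return data_type, value


if __name__ == "__main__":
    print(services_for("GI"))
    print(run_workflow(["fetch_sequence", "run_blast"], "GI", "12345"))
```

The shared data types are what make the menu of "services that can run on this data" possible: the lookup in services_for only needs the type of the item on screen.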
To make a wider array of tools available within Seahawk, the Daggoo system helps users adapt
forms on existing Web sites to Moby's specifications. Seahawk has been developed so that
Moby and external Web tools can be browsed to create workflows "by demonstration".
How is Seahawk different from other Moby clients?
Seahawk is an application, which allows richer user interaction than Web pages.
Examples include drag ‘n’ drop utilities and tooltips. Since it is written in Java, it
can also be embedded in other applications.
Seahawk is data-centric. There is only one screen type in Seahawk, the data display. Services
are shown as choices in popup menus. Displayed data can be your own text files, binary
files (e.g. sequence traces and images), Web pages, or MOBY objects.
Figure: The Seahawk interface, with cascading menus of services that can be run on data (an NCBI gi number in this case), and a workflow derived from interactively browsing services.
How do I use Seahawk?
A visual guide provided with this tutorial demonstrates how data and hyperlink navigation
work in Seahawk. It also shows how to save a tab’s navigation history as a workflow.
The data for these exercises can be found at
1. Creating a Taverna workflow in Seahawk
Creating MOBY data by highlighting text:
Open Seahawk, then click on the Clipboard tab to give it the focus.
In a Web browser, go to the NCBI homepage and use the text box at the top of the
screen to look up a gene of interest to you, or “ferredoxin” if nothing strikes your fancy.
Follow the appropriate links through to a GenBank protein record.
Choose FASTA from the dropdown menu in the upper left corner.
Highlight the sequence, then drag it onto the clipboard. Seahawk imports the data as
a MOBY AminoAcidSequence.
Click the AminoAcidSequence link on the Seahawk Clipboard, and run the service
runNCBIBlastp, using the default parameters.
Run a point mutation analysis (PMUT) on the Blast results.
Save the PMUT analysis as an HTML document to your desktop.
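The highlighted text in this exercise is in FASTA format: a header line starting with ">" followed by one or more lines of sequence. The sketch below shows one way such text could be split into a header and a sequence, which is roughly the information a sequence object needs to carry. It is an illustration only, not Seahawk's actual import logic, and the sample record is made up.

```python
from typing import Tuple


def parse_fasta(text: str) -> Tuple[str, str]:
    """Split a single FASTA record into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must start with a '>' header line")
    header = lines[0][1:]
    # Sequence lines are joined and upper-cased; no alphabet check is done here.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


# Made-up record for illustration; not a real GenBank entry.
SAMPLE = """>example_protein hypothetical ferredoxin-like protein
MATYKVTLKT
PDGEHEFECP
"""

if __name__ == "__main__":
    print(parse_fasta(SAMPLE))
```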
Data collation with the clipboard:
’s first DNA sequence in Seahawk and add it to the clipboard.
Do the same with the second and third DNA sequences.
Run a multiple sequence alignment on the collection of three sequences.
Creating MOBY data by drag ‘n’ drop:
Drop the ABI trace file (a.k.a. sequence chromatogram) onto the Seahawk window.
Bluejay can’t show the contents of the binary trace file.
Find the chain of services that will do the following:
Do the base calling to turn the trace file into a DNA sequence “read”
Trim the vector sequence from the read (the left and right vector sequences for the
experiment are also found on the tutorial Web page).
Trim the read of low quality regions (i.e. sections of 10 or bases with more than 30%
BLAST the trimmed sequence against UniProt (a non-redundant reference gene
, setting the e-value threshold to 1.0x10
Save the trace file exercise as a workflow, then try loading it in Taverna.
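To make the vector trimming step more concrete, here is a minimal sketch of what such a service might do with the left and right vector sequences. It assumes exact, case-insensitive matching, which is a simplification; the services used in the exercise may well tolerate mismatches. The function name and the sample sequences are hypothetical.

```python
def trim_vector(read: str, left_vector: str, right_vector: str) -> str:
    """Return the part of the read between the left and right vector sequences.

    Exact, case-insensitive matching only. A vector that is not found is
    simply ignored on that side.
    """
    seq = read.upper()
    left = left_vector.upper()
    right = right_vector.upper()
    start, end = 0, len(seq)
    if left:
        pos = seq.find(left)
        if pos != -1:
            start = pos + len(left)  # keep everything after the left vector
    if right:
        pos = seq.rfind(right)
        if pos != -1 and pos >= start:
            end = pos  # keep everything before the right vector
    return seq[start:end]


if __name__ == "__main__":
    # Made-up read with made-up vector ends.
    print(trim_vector("ggccATGAAACCCttaa", "GGCC", "TTAA"))
```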
2. Creating a Moby Service from an Existing Web Service
BioMoby tools and Web Services let you extract lots of information without scripting Perl
(screen scraping)! By the end of this exercise, you should:
Be able to find Web Services (described in WSDL) for online tools you use
Be able to extract specific data from a Web Service using Seahawk
Finding Web Services
BioMoby clients: Seahawk, MOWserv, jORCA…
Google query: “toolname WSDL”
Using Web Services in Seahawk
Find sample input
Put the sample data on the clipboard.
Find the WSDL you need.
Drop the WSDL tab onto Seahawk.
Wait for Seahawk to display a service form in your Web browser.
Drag the sample data from Seahawk onto the appropriate service form field, and
fill in any other required fields.
Ensure it says “Paste Noted” in the upper
Click “Execute Service” and be patient!
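If you are curious what a WSDL document contains before dropping it onto Seahawk, the sketch below lists the operation names declared in a WSDL 1.1 file using only the Python standard library. The embedded WSDL fragment is made up for illustration, and real documents are much larger; this is not how Seahawk itself processes the file.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace of WSDL 1.1 elements.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# A tiny, made-up WSDL 1.1 fragment used only for illustration.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    name="ExampleService" targetNamespace="http://example.org/wsdl">
  <portType name="ExamplePortType">
    <operation name="run_search"/>
    <operation name="fetch_record"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text: str) -> List[str]:
    """Return the names of the operations declared in the portType elements."""
    root = ET.fromstring(wsdl_text)
    path = f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation"
    return [op.get("name") for op in root.findall(path)]


if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

Each operation corresponds to one callable function of the Web Service, which is roughly what ends up presented as a service form.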
In these two exercises, we’ll try to get from an NCBI GI number to functional descriptions. First,
we’ll map from GI number to the Entrez Gene ID via the GI’s GenBank record. You’ll need to
find the NCBI Entrez WSDL document (hint: try the Google query tip above).
For convenience, you can use
the NCBI gi in the Seahawk “H
tab, accessed by clicking the
icon. If you followed the steps
above, Seahawk should recognize the Entrez Gene ID (and some other data) in the output.
You can click on the link in the Seahawk browser and create a Moby service producing
Entrez Gene IDs. All you need to do is fill in some metadata to label and describe the service.
The NCBI WSDL main page lists the values you can use for the “db” field. Try “protein”.
Once you’ve created the service, you will be able to see it in the Seahawk service menu
whenever any GI number is highlighted. Go ahead, try another GI!
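For comparison with the form-based approach, the sketch below builds a record-fetching request by hand, with the same “db” and identifier inputs. It assumes the REST-style NCBI E-utilities efetch endpoint rather than the WSDL interface used in this exercise, and the identifier shown is a placeholder, not a real record. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the REST-style E-utilities efetch endpoint, not the WSDL service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db: str, record_id: str, retmode: str = "xml") -> str:
    """Build an efetch URL for one record in the given database."""
    query = urlencode({"db": db, "id": record_id, "retmode": retmode})
    return f"{EUTILS_EFETCH}?{query}"


if __name__ == "__main__":
    # "12345" is a placeholder identifier for illustration only.
    print(efetch_url("protein", "12345"))
```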
Entrez Gene ID
format is shown in red, just above the Summary section. Create a
service that extracts GeneRIFs (concise functional descriptions of a gene) from the Entrez Gene’s
Web Service XML output. You will need to find the “generif” XML link, and on the next page pick
the appropriate item from the dropdown menu to get function sentences.
In case you are not familiar with GeneRIFs, below is how they normally appear in a browser,
with yellow highlight added for emphasis.
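The extraction you set up interactively in this exercise amounts to picking certain text elements out of an XML document. The sketch below does the equivalent by hand on a simplified, made-up fragment. The element names (Gene-commentary, Gene-commentary_type, Gene-commentary_text) are an assumption about the Entrez Gene XML layout and should be checked against the real output you see in Seahawk.

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified, made-up fragment in the style of Entrez Gene XML (assumed layout).
SAMPLE_XML = """<Entrezgene>
  <Entrezgene_comments>
    <Gene-commentary>
      <Gene-commentary_type value="generif">18</Gene-commentary_type>
      <Gene-commentary_text>Example functional description one.</Gene-commentary_text>
    </Gene-commentary>
    <Gene-commentary>
      <Gene-commentary_type value="comment">254</Gene-commentary_type>
      <Gene-commentary_text>Not a GeneRIF.</Gene-commentary_text>
    </Gene-commentary>
  </Entrezgene_comments>
</Entrezgene>"""


def extract_generifs(xml_text: str) -> List[str]:
    """Collect the text of every commentary whose type is marked 'generif'."""
    root = ET.fromstring(xml_text)
    texts = []
    for commentary in root.iter("Gene-commentary"):
        type_el = commentary.find("Gene-commentary_type")
        text_el = commentary.find("Gene-commentary_text")
        if type_el is None or text_el is None:
            continue
        if type_el.get("value") == "generif":
            texts.append((text_el.text or "").strip())
    return texts


if __name__ == "__main__":
    print(extract_generifs(SAMPLE_XML))
```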
3. Create a Moby Service from a Web Form
If the resource you are interested in putting in your workflow does not have a WSDL,
you can use a technique nearly identical to the last exercise to create a
service out of a standard Web form. Instead of dragging the WSDL URL onto Seahawk,
drag the URL of the Web page containing the form. Note that Web forms can be pretty
complex, so this will not work for every form. You will be asked in class to perform
one of these exercises manually, and the other using Seahawk, to highlight why you
might choose one method over the other.
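To give a feel for the manual route, the sketch below lists the named fields of a Web form using Python's standard HTML parser. These are the values a hand-written script would have to supply. The sample form is made up, and real forms often add hidden fields or JavaScript, which is part of why the manual method gets tedious.

```python
from html.parser import HTMLParser
from typing import List


class FormFieldParser(HTMLParser):
    """Collect the names of input, select, and textarea fields inside forms."""

    def __init__(self) -> None:
        super().__init__()
        self.in_form = False
        self.fields: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:  # unnamed controls (e.g. a plain submit button) are skipped
                self.fields.append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


# Made-up form for illustration only.
SAMPLE_HTML = """
<form action="/run" method="post">
  <textarea name="sequence"></textarea>
  <select name="database"><option>protein</option></select>
  <input type="text" name="evalue">
  <input type="submit" value="Go">
</form>
"""


def form_field_names(html: str) -> List[str]:
    """Return the field names found in the given HTML text."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(form_field_names(SAMPLE_HTML))
```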
Given the 9 GI numbers for nucleotide (EST) sequences in the
the sequences. If the sequence contains the promoter motif, design a hybridization
probe using the online version of Primer 3. Output the motif containing sequences a
GO:0080112), find the related genes in the
Arabidopsis genome, then BLAST the sequences to find the equivalent rice gene.