The Canadian Bioinformatics Resources – An Overview - TERENA ...


Oct 2, 2013 (3 years and 8 months ago)


10/3/2013 6:51:11 AM

The Canadian Bioinformatics Resources

An Overview


What are we?


A distributed computational biology
resource with nodes across the country,
accessed through the Web and computer
terminals via CA*net 3


Distributed


Networking History


Started with 56Kbps lines for the institutes



Collaborated with CANARIE as a test-bed
application for the new CA*net II (OC3
ATM lines @ 155Mbps), with Nova Scotia
being the first site in the world connected to
a national advanced network



Networking Present


In 1998 CANARIE was mandated to create
the first national optical Internet network


CA*net 3 uses wavelength multiplexing
over fiber to each province’s gigabit point
of presence (GigaPoP) Cisco 12008 router


Nova Scotia was the first province
connected to CA*net 3, and our CBR
servers now run Gigabit Ethernet


Networking Future


By the end of 2000 all CBR servers should
be on the gigabit backbone


Currently CA*net 3 runs two OC48
(2.5Gbps) wavelengths, preparing to
increase to eight OC192s (total 80Gbps)
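
The capacity figures quoted on the networking slides can be checked with a little arithmetic. The sketch below assumes the standard SONET convention that an OC-n line runs at n times the OC-1 rate of 51.84 Mbit/s; the function names are illustrative and do not come from the slides.

```python
OC1_MBPS = 51.84  # SONET OC-1 line rate in Mbit/s (standard base rate)


def oc_rate_mbps(level):
    """Line rate in Mbit/s of a single OC-<level> channel."""
    return level * OC1_MBPS


def total_gbps(level, count):
    """Aggregate rate in Gbit/s of <count> wavelengths, each at OC-<level>."""
    return count * oc_rate_mbps(level) / 1000.0


# CA*net II used OC3; CA*net 3 runs two OC48s and plans eight OC192s.
print("OC3:        %.2f Mbps" % oc_rate_mbps(3))
print("2 x OC48:   %.2f Gbps" % total_gbps(48, 2))
print("8 x OC192:  %.2f Gbps" % total_gbps(192, 8))
```

Under that assumption the three lines come to 155.52 Mbps, 4.98 Gbps and 79.63 Gbps, which agrees with the rounded 155Mbps, 2.5Gbps per wavelength and 80Gbps total given on the slides.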


Network Associations



EMBNet: Canadian national node



APBioNet: founding member



Canadian Bioinformatics
Workshops Partner


Who uses it?


Internal NRC users and their industrial
collaborators use CBR-I terminal access.


Members of not-for-profit, research
organizations use CBR-II. Basic Web
access is free, terminal and SeqWeb access
come with the $195 registration.

I vs. II

Internal/External

$195



What is it used for?


Large redundant disk storage, automated updates


Retrieval of known DNA/Protein sequences


High speed parallel processing, interface features


Alignment of new sequence to the database sequences
(Blast, FastA)


System Stability


Intensive computation, tool integration (ClustalW &
WebPhylip)

Computational Resources


What is it used for?


Analysis tool suites (GCG)


Investigate sequences using a wide range of tools


Platform specific software access (Sun,
SGI, DEC, Linux)


Protein identification and characterization


Finding novel functions, structures (Expasy,
HyperChem)

Analysis Software


What is it used for?


Installation of server programs and
infrastructure for tool development



Sharing of users’ programs and expertise



Sample projects using CBR resources:
MAGPIE, Bluejay, Proteomics

Tool Development


MAGPIE


Intelligent automated DNA analysis using
queued requests for local and
Internet-accessed programs


Web based reporting with private/public
data views


Automated public database submission
based on human confirmation of MAGPIE
analysis
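
The queue-then-confirm workflow described for MAGPIE can be pictured with a minimal sketch. Everything here is hypothetical: the two toy "tools", the request tuples and the `confirmed` flag are stand-ins chosen for illustration, since the slides do not describe MAGPIE's actual data structures or programs.

```python
from collections import deque


def gc_content(sequence):
    """Fraction of G and C bases; stands in for a locally installed tool."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def sequence_length(sequence):
    """Trivial second tool so the queue holds more than one request type."""
    return len(sequence)


# Hypothetical tool registry; a real system would also wrap remote services.
TOOLS = {"gc_content": gc_content, "length": sequence_length}


def run_queue(requests, tools=TOOLS):
    """Process (sequence id, tool name, sequence) requests in FIFO order."""
    pending = deque(requests)
    results = []
    while pending:
        seq_id, tool_name, sequence = pending.popleft()
        evidence = tools[tool_name](sequence)
        # Nothing is marked confirmed until a person has reviewed it.
        results.append({"id": seq_id, "tool": tool_name,
                        "evidence": evidence, "confirmed": False})
    return results


def ready_for_submission(results):
    """Only human-confirmed results are eligible for public submission."""
    return [r for r in results if r["confirmed"]]


requests = [("orf1", "gc_content", "ATGC"), ("orf1", "length", "ATGC")]
results = run_queue(requests)
results[0]["confirmed"] = True  # a curator signs off on the first result only
print(ready_for_submission(results))
```

The point of the design is the separation of steps: analysis runs unattended from the queue, while submission is gated on an explicit human decision.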


MAGPIE: Project Overview


MAGPIE: DNA/Gene Summary


MAGPIE: More DNA Info


MAGPIE: Evaluated Evidence


MAGPIE: Human Annotation


Bluejay


Java-based visualization of data in XML
format using abstract graphics libraries


Adding XML to CBR services will make
advanced queries and tool integration easier
while staying compatible with old browsers


Proxy server (in mod_perl) for transparent
access to non-XML data sources on the
Web



Bluejay Proxy Server

[Diagram: Client, Proxy (mod_perl under Apache), Server]

Steps labelled in the diagram:
URI request from user
URI request from browser
Intervention by request handler
URI request from content handler
URI content translation in content handler
URI content display

Messages labelled in the diagram:
GET http://.../path/to/file
GET /path/to/file
Content-type: text/xml
Content-type: text/...
GET /path/to/file
Content-type: ...
Accept: text/xml, ...
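
The two proxy steps named in the diagram, intervention by a request handler and content translation in a content handler, can be sketched as follows. This is a Python stand-in for illustration only, not the mod_perl code; the `<document>` wrapper element, its `source-type` attribute and all function names are assumptions, because the slides do not show the XML vocabulary Bluejay used.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape, quoteattr


def split_proxy_request(absolute_uri):
    """Request-handler step: split the absolute URI a browser sends to a
    proxy into the (host, path) pair used for the upstream request."""
    parts = urlsplit(absolute_uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path


def client_accepts_xml(accept_header):
    """True if text/xml is among the media types of an Accept header."""
    media_types = [item.split(";")[0].strip().lower()
                   for item in accept_header.split(",")]
    return "text/xml" in media_types


def translate_content(content_type, body, accept_header):
    """Content-handler step: pass XML through, wrap other text types in XML.

    The wrapper element is hypothetical and only marks where a real
    translation of the non-XML source would happen.
    """
    base_type = content_type.split(";")[0].strip().lower()
    if base_type == "text/xml" or not client_accepts_xml(accept_header):
        return content_type, body
    if not base_type.startswith("text/"):
        return content_type, body  # leave non-text content untouched
    wrapped = "<document source-type=%s>%s</document>" % (
        quoteattr(base_type), escape(body))
    return "text/xml", wrapped


host, path = split_proxy_request("http://example.org/path/to/file")
new_type, new_body = translate_content(
    "text/plain", "ACGT < N", "text/xml, text/html")
print(host, path, new_type)
print(new_body)
```

Because the translation happens in the proxy, the client only ever sees `text/xml`, which is what makes access to non-XML sources transparent.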


Bluejay


Proteomics


Web front-end to a relational database
storing 2-D gel experiment information


Facilities to upload images and experiment
spot linking/annotation information from
client, then display the annotated image data
dynamically according to user preferences


Integrated pI/MW estimations to search for
known proteins that a spot might represent
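
A rough sketch of how a gel spot's estimated molecular weight and isoelectric point might be matched against known proteins is given below. It assumes average amino acid residue masses plus one water molecule for the MW estimate, takes each candidate's pI as a stored value rather than computing it, and uses invented toy candidates and tolerances; none of this reflects the actual CBR database or search rules.

```python
# Average residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015  # one water is added for the free N- and C-termini


def molecular_weight(sequence):
    """Estimate the average molecular weight of a peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def match_spot(spot_mw, spot_pi, candidates, mw_tolerance=0.10, pi_tolerance=0.5):
    """Return names of candidates whose MW is within a relative tolerance
    and whose stored pI is within an absolute tolerance of the spot."""
    hits = []
    for protein in candidates:
        mw = molecular_weight(protein["sequence"])
        mw_ok = abs(mw - spot_mw) <= mw_tolerance * spot_mw
        pi_ok = abs(protein["pi"] - spot_pi) <= pi_tolerance
        if mw_ok and pi_ok:
            hits.append(protein["name"])
    return hits


# Toy placeholder entries; sequences and pI values are made up.
candidates = [
    {"name": "toy_A", "sequence": "GAVL", "pi": 5.5},
    {"name": "toy_B", "sequence": "WWKR", "pi": 10.0},
]
print(match_spot(360.0, 5.6, candidates))
```

In practice the tolerances would have to reflect how precisely a spot's position on a 2-D gel can be read, which is why they are left as parameters here.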



Proteomics


How much is CBR being used?


14,000 external homepage hits since Aug. ’99


550,000 pages served since Feb. ’99


Example Users


Internally, a MAGPIE project on a pathogen
(Candida albicans) is being used by BRI,
requiring tens of thousands of database
searches



Externally, The Pox Virus Resource (UVic)
is using CBR-II processors and databases to
keep analysis up to date


Why use us instead of others?


High speed Internet connectivity


Command line/X Windows access


“One-stop” analysis


User support



Less infrastructure on the client end


Coming up


Small & medium enterprise services


Always more Web services


Genematcher


Who are we?

Heidi Bishop: User support

Heather Penney: Admin/Graphics

Rob Hutten: UNIX Sys Admin

Christoph Sensen: Project Leader

Sheldon Briand: Application/DB Support
