ELIXIR in Sweden


Oct 4, 2013


Bioinformatics Infrastructure for Life Sciences (BILS) and ELIXIR in Sweden

Bengt Persson

BILS


Bioinformatics Infrastructure for Life Sciences

Distributed national research infrastructure, similar to ELIXIR

Bioinformatics support
  specialised nodes
    Large-scale sequencing, Proteomics, Systems biology, Metabolomics, Structural calculations, Biobank-related bioinformatics
  general nodes

Bioinformatics network
  nodes at each of the 6 large university cities
  annual workshop

Bioinformatics computation and data storage
  in collaboration with SNIC (Swedish National Infrastructure for Computing)

Swedish node in ELIXIR

ELIXIR Funding Agencies Planning Meeting 11.10.2010

ELIXIR: a sustainable infrastructure for biological information in Europe.


ELIXIR Scientific & Technical Structure


Funding from Swedish Research Council

2010: 4 MSEK
2011: 8.5 MSEK
2012: 13 MSEK


Planned BILS activities


Provided by the participating groups
  leading bioinformatics groups in Sweden

General support
  distributed at the six large university cities

Specialised support
  Large-scale sequencing
  Proteomics
  Systems biology
  Metabolomics
  Structural calculations
  Biobank-related bioinformatics
  ...

Training

Initial BILS activities


Support to users of large-scale sequencing facilities

Providing analysis tools and methods

Maintaining databases and storage of primary data

Setting up pipelines for the first analysis steps of data

Evaluating, reporting on, and setting up bioinformatics software

In-depth bioinformatics support

National data repository for mass spectrometry proteomics
  in close collaboration with proteomics groups
  interfacing with European efforts

BILS/BBMRI.SE collaboration
  Building interfaces that give researchers using biobank data seamless access to bioinformatics tools and databases.

Support in metagenomics



Plans for autumn 2010

BILS positions at each of the six large university towns
  Umeå
  Uppsala
  Stockholm
  Gothenburg
  Linköping
  Lund

BILS technical coordinator


Swedish ELIXIR node


Sweden has a long tradition of providing bioinformatics tools and databases for the life science community

With the establishment of BILS, long-term support of these will be guaranteed, and Sweden is able to take responsibility for a number of tools and databases that are of European interest.



Data resources


Bioinformatics methods/services


Computational and storage resources




Data resources


Primary databases of data produced in Sweden


Secondary databases developed in Sweden



Human Protein Atlas
  One major Swedish contribution to the international research community.
  Localisation of human proteins in cells, tissues and organs.
  Allows for a systematic exploration of the human proteome using antibody-based proteomics.
  Version 6 (March 2010) contains
    more than 9 million high-resolution images
    8,400 (40%) protein-coding genes
    48 different normal tissues
    20 different cancer types
    47 different human cell lines
  Focus on interfacing to various ELIXIR resources in order to facilitate the utilisation of this important information resource.




Data resources, cont.


Several databases with protein families and orthologues, e.g.:

InParanoid/MultiParanoid and OrthoDisease
  Comprehensive databases of orthologs in eukaryotes and disease gene orthologs

Pfam
  The Sonnhammer group is a partner of the Pfam consortium, contributing software tools such as NIFAS and Pfamalyzer for analysing protein domain architecture and evolution.


HOPS


FunShift


MolMeth (The Molecular Methods Database; http://www.molmeth.org)
  Structured database developed for the BBMRI project with the aim of providing best practice-based protocols for molecular analyses of different types of samples

PROPHECY (http://prophecy.lundberg.gu.se/)
  Database with quantitative high-resolution genome-wide phenotypic information about genetic perturbations

Microarray databases at LCB-Data-Warehouse (http://www2.lcb.uu.se/lcbdw.php)

SDR and MDR databases (http://www.sdr-enzymes.org, http://www.mdr-enzymes.org)
  short-chain and medium-chain dehydrogenases/reductases, including HMMs for family designation

Data resources, cont.


Long-term storage of primary data from a variety of sources, e.g. large-scale sequencing and proteomics

The framework for storage uses GRID storage and is scalable to fit European nodes.

For proteomics, the web-based analysis system Proteios provides complete analysis for several proteomics workflows and generates XML in PRIDE format.

The Proteios analysis system works seamlessly with files on local storage or on remote storage, such as the GRID storage.



National storage

[Figure: national storage diagram. Labels: Swestore; srm://srm.ndgf.org/biogrid/db/uniprot/UniProt14.8/uniprot_sprot.fasta.gz; BILS web pages (any user); NSC, C3SE, Lunarc, HPC2N, UPPMAX, PDC (users of SNIC systems)]
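
As an illustration only, the Swestore path shown on this slide can be taken apart with Python's standard library. The sketch below only parses the string (joined across the line break in the slide text); it does not contact the storage system, and the helper name describe_srm_url is hypothetical, not part of any BILS or Swestore tooling.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Example path as shown on the slide, joined across the line break.
SWESTORE_EXAMPLE = (
    "srm://srm.ndgf.org/biogrid/db/uniprot/UniProt14.8/uniprot_sprot.fasta.gz"
)


def describe_srm_url(url: str) -> dict:
    """Split an SRM URL into scheme, host, directory and file name.

    Hypothetical helper for illustration; it performs string parsing only.
    """
    parts = urlsplit(url)
    if parts.scheme != "srm":
        raise ValueError(f"not an SRM URL: {url!r}")
    path = PurePosixPath(parts.path)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "directory": str(path.parent),
        "filename": path.name,
        "compressed": path.suffix == ".gz",  # True for gzip-style names
    }


if __name__ == "__main__":
    for key, value in describe_srm_url(SWESTORE_EXAMPLE).items():
        print(f"{key}: {value}")
```

Parsing the location this way separates the storage host from the file path, which is the part a user would need regardless of which of the listed centres the data is accessed from.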


Bioinformatics methods/services

Methods/services developed and maintained in Sweden

General European interest

Examples:
  Structural calculations
    Pcons, Pfrag
  Analysis of membrane proteins
    TOPPRED, TMHMM, Phobius, Zpred, GPCRHMM, SHRIMP, SCAMPI, OCTOPUS, SPOCTOPUS, TOPCONS
  Tools for microarray analysis
    BASE (BioArray Software Environment)
  Functional predictions
    SFINX, FunCoup, Dasher



Bioinformatics methods, cont.

EVALLER
  Web tool for electronically testing a protein’s potential allergenicity/cross-reactivity based on its amino acid sequence.
  Feasible for scanning purposes and as a key part of an integrated allergenicity assessment procedure.

UniDomInt
  An integrated database of domain-domain interactions.

jSquid
  A Java tool to visualize networks and edge scores.

GPCRHMM
  A hidden Markov model for GPCR detection.

SFINX
  Integrated functional and structural protein feature prediction.

Dasher
  A Java DAS client for displaying annotations on a protein sequence.





Bioinformatics methods, cont.

Interfaces towards BBMRI
  Development of interfaces between bioinformatics services (BILS/ELIXIR) and the biobank infrastructure (BBMRI.SE/BBMRI), which will be important for making the extensive bioinformatics resources available for large-scale biomedical studies using biobanks

Science for Life Laboratory in Stockholm/Uppsala
  Development of analysis pipelines, computer programs, methods and standards that will be of European interest



Bio-compute centres

Provided by BILS together with SNIC (Swedish National Infrastructure for Computing)

Computational resources needed for periodic calculation campaigns
  e.g. for sequence comparisons and HMM calculations when new major database releases require updates of the accompanying information.

Example: dedicate one of our large clusters to such campaigns for a number of days every 2-3 months



Infrastructure collaboration


[Figure: infrastructure collaboration diagram. Labels: National, Regional, European; Computation, Storage, Grid, Cloud; Bioinformation; Other BMS infrastructures, e.g. BBMRI]
