The MoSGrid Portal - Applied Bioinformatics Group - Eberhard Karls ...


grant: 01IG09006

The MoSGrid Portal

A workflow-enabled Grid Portal for Molecular Simulations

Sandra Gesing

Center for Bioinformatics, University of Tübingen

sandra.gesing@uni-tuebingen.de

28.04.2010




www.mosgrid.de

Outline


Motivation
MoSGrid (Molecular Simulation Grid)
The MoSGrid portal
Domain specific workflows
MSML (Molecular Simulation Markup Language)
Future work


Motivation



Numerous applications for molecular simulations and docking, e.g.
  Materials science
  Structural biology
  Drug design

Sophisticated tools and algorithms support scientists

High-performance computing facilities are available


Motivation

Drawbacks of using molecular simulations and docking



Usability of tools is limited



Complexity of methods



Lack of graphical user interfaces

Complexity of infrastructures

Many end users lack computer science background

Need for self-explanatory and intuitive user interfaces

A portal for molecular simulations and docking


Portals




Single point of entry



Possibility to customize views and tools



Store user preferences



No installation of software on the user’s side



No firewall issues


Unifying Diversity

12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa


Slide copied from: Stuart Owen, "Workflows with Taverna"


MoSGrid

Molecular Simulation Grid (D-Grid project)

Goal

Providing users with Grid access to molecular simulation tools and docking tools via a workflow-enabled portal

Implementation of high-performance computing
Workflows
Annotations of results
Data mining
Use of the D-Grid infrastructure



MoSGrid Partners

Universität zu Köln
Eberhard-Karls-Universität Tübingen
Universität Paderborn
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Technische Universität Dresden
Technische Universität Dortmund
Bayer Technology Services GmbH, Leverkusen
Origines GmbH, Martinsried
GETLIG&TAR, Falkensee
BioSolveIT, Sankt Augustin
COSMOlogic GmbH & Co. KG, Leverkusen



MoSGrid in a Nutshell

[Architecture diagram]
Portal: WS-PGRADE
High-level middleware service level: gUSE
Grid resources: UNICORE 6
Cloud file system: XtreemFS
Items shown in the diagram: Recipe, Structure, Workflow, Result


Credential Management



User management based on Liferay features

Community management

Organization management

X.509 user certificates

SAML (Security Assertion Markup Language)

Minimize credential data transfers

Set of maximum hops for trust delegation

Usable for single sign-on infrastructures (e.g., Shibboleth)



WS-PGRADE

[Three figure slides; images not preserved]

gUSE Architecture

User interface: WS-PGRADE

High-level middleware service layer: gUSE (grid User Support Environment)

Grid resources middleware layer: UNICORE 6


gUSE Submitter

Interface GridService
  actionJobSubmit
  actionJobAbort
  actionJobOutput
  actionJobStatus
  actionJobResource

[Diagram: jobs JOB1, JOB2, JOB3, JOB4, ..., JOBn and the GridService]
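
The five action names on this slide suggest a common submitter interface that each middleware plug-in implements. A minimal Python sketch of that idea follows. gUSE itself is Java-based, and the method signatures, the `JobState` values, and the `InMemoryGridService` class are assumptions made for illustration, not the gUSE API.

```python
from abc import ABC, abstractmethod
from enum import Enum


class JobState(Enum):
    """Hypothetical job states; the real gUSE states are not given on the slide."""
    SUBMITTED = "submitted"
    ABORTED = "aborted"


class GridService(ABC):
    """Interface named on the slide; all signatures here are assumed."""

    @abstractmethod
    def action_job_submit(self, job_id: str) -> None: ...

    @abstractmethod
    def action_job_abort(self, job_id: str) -> None: ...

    @abstractmethod
    def action_job_output(self, job_id: str) -> str: ...

    @abstractmethod
    def action_job_status(self, job_id: str) -> JobState: ...

    @abstractmethod
    def action_job_resource(self, job_id: str) -> str: ...


class InMemoryGridService(GridService):
    """Toy implementation that only tracks job state in a dictionary."""

    def __init__(self, resource: str = "local-test-resource") -> None:
        self._resource = resource
        self._jobs = {}

    def action_job_submit(self, job_id: str) -> None:
        self._jobs[job_id] = JobState.SUBMITTED

    def action_job_abort(self, job_id: str) -> None:
        self._jobs[job_id] = JobState.ABORTED

    def action_job_output(self, job_id: str) -> str:
        return f"output of {job_id} ({self._jobs[job_id].value})"

    def action_job_status(self, job_id: str) -> JobState:
        return self._jobs[job_id]

    def action_job_resource(self, job_id: str) -> str:
        return self._resource


# Submit two jobs, then abort the second one.
service = InMemoryGridService()
for job in ("JOB1", "JOB2"):
    service.action_job_submit(job)
service.action_job_abort("JOB2")
```

Keeping the interface abstract means a UNICORE-specific submitter and a test double can be swapped without touching the code that schedules JOB1 to JOBn.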


gUSE Submitter for UNICORE

[Diagram labels: gUSE, JOB1, JOB2, JOB3, JOB4, ..., JOBn, actionJobSubmit, UNICORE 6, Uspace, Resources]

1 - Security
2 - Registry
3 - Submit job
4 - Upload data
5 - Start job
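
The numbered labels in the diagram read as the order in which a job submission proceeds. A small sketch of that ordering follows, assuming each step can be modelled as a plain function call. The names `run_submission` and `perform` are hypothetical, and no real UNICORE or gUSE call is made.

```python
from typing import Callable, List, Tuple

# Step order as numbered on the slide.
SUBMIT_STEPS: Tuple[str, ...] = (
    "Security",
    "Registry",
    "Submit job",
    "Upload data",
    "Start job",
)


def run_submission(job_id: str, perform: Callable[[str, str], bool]) -> List[str]:
    """Run the steps in slide order and stop at the first failing step.

    `perform(step, job_id)` is a caller-supplied hook (hypothetical) that
    returns True on success. The list of completed steps is returned.
    """
    completed: List[str] = []
    for step in SUBMIT_STEPS:
        if not perform(step, job_id):
            break
        completed.append(step)
    return completed


def fail_on_upload(step: str, job_id: str) -> bool:
    """Example hook that simulates a failed data upload."""
    return step != "Upload data"


done_all = run_submission("JOB1", lambda step, job_id: True)
done_partial = run_submission("JOB2", fail_on_upload)
```

Returning the completed steps makes it easy to report how far a submission got before it failed.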


ASM (Application Specific Module)

Library for managing WS-PGRADE workflows
  Listing of users and workflows in the local repository
  Import of workflows into the user space
  Upload/download of input and output files
  Setting the parameters of a job in a workflow
  Submission of workflows
  Monitoring of workflows
  Deletion of workflows

Usable in portlets and Java tools

Implicit use of gUSE submitter




Distributed Data Management

XtreemFS is an object-based grid and cloud filesystem
Ability to minimize data transfer
Low latency, local availability through replication
Grid Security Infrastructure (GSI) support




Distributed Data Management

XtreemFS integration
  Portlet
  UNICORE
  GSI support

Data flow
  WS-PGRADE
  XtreemFS
  Frontend nodes
  Compute nodes

UNICORE mediates data transfers

[Diagram labels: XtreemFS, UNICORE, TSI]


Domain Molecular Dynamics



Study and simulation of molecular motion

Provide a molecular dynamics service on multiple levels

Direct upload of job descriptions

Workflows and standard recipes for repeating tasks

Analysis of relevant properties


Equilibration of Proteins


Proteins from databases (e.g., the Protein Data Bank, PDB) do not necessarily represent a near-native conformation/configuration

For all kinds of production runs, minimization and equilibration are indispensable prerequisites

Eases the work of experienced users

Lowers the hurdle for novice users




MoSGrid Portal use case: Gromacs_EQ

[Workflow diagram]
Tools, in order: pdb2gmx, editconf, genbox, grompp
File labels: structure (pdb/gro), topology (top/itp), EM.mdp (mdp), structure (pdb), box (pdb), solvated (pdb), adj. top. (top/itp), topol.tpr, mdout.mdp


[Workflow diagram, continued]
mdrun: ener.edr, traj.trr, traj.xtc, md.log, state.cpt, SYSTEM_EM.pdb
grompp with FULL.mdp (mdp): topol.tpr, mdout.mdp
mdrun: ener.edr, traj.trr, traj.xtc, md.log, state.cpt, SYSTEM_EQ.pdb
g_energy, xmgrace: Analysis.jpg (appears twice in the diagram)
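
The tool chain in this diagram can be read as a dependency graph. The sketch below encodes one plausible reading of it and derives a valid execution order with Python's standard `graphlib` module (Python 3.9 or later). The `_em` and `_eq` suffixes and the exact edges are assumptions drawn from the order of the labels, not a copy of the MoSGrid workflow definition, and no GROMACS tool is actually invoked.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on. The suffixes _em
# (energy minimization) and _eq (equilibration) are labels invented here
# to tell the two grompp/mdrun passes apart.
DEPENDENCIES = {
    "pdb2gmx": set(),
    "editconf": {"pdb2gmx"},
    "genbox": {"editconf"},
    "grompp_em": {"genbox"},
    "mdrun_em": {"grompp_em"},
    "g_energy_em": {"mdrun_em"},
    "xmgrace_em": {"g_energy_em"},
    "grompp_eq": {"mdrun_em"},
    "mdrun_eq": {"grompp_eq"},
    "g_energy_eq": {"mdrun_eq"},
    "xmgrace_eq": {"g_energy_eq"},
}


def execution_order(dependencies):
    """Return one order in which every stage runs after its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
position = {stage: index for index, stage in enumerate(order)}
```

Because the analysis of the minimization run and the equilibration pass both depend only on `mdrun_em`, a scheduler working from this graph could run them side by side.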


MD Portlet


Domain Quantum Chemistry



Study and simulation of molecular electronic behavior relative to their chemical reactivity

Survey: MoSGrid community
  First implementation for Gaussian
  Then support for
    Turbomole
    GAMESS-US
    Further relevant QC applications



Domain Quantum Chemistry


Gaussian Jobs


Single input file


Defines molecular geometry and task


Result


Unstructured output


Platform-dependent checkpoint file


Integrated multi-step job option


Not usable for generalized workflows


Domain Quantum Chemistry


First prototype


Workflow controlled by portlet

Three phases
  Pre-processing
  Job execution
  Post-processing


Domain Quantum Chemistry


Assisted job creation
  Guiding GUI
  Most common options available

Pre-created job description
  Upload of Gaussian job description file

Monitoring of jobs

Post-processing and presentation of results

Workflows


Domain Quantum Chemistry


Preprocessing


Portlet (GUI) supports common options

Automatic generation of job description

Submission of job
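
To make "automatic generation of job description" concrete, here is a minimal sketch that assembles a Gaussian input file from a few options a GUI might collect. The function name, its parameters, and the water geometry are illustrative assumptions. The layout (Link 0 line, route section, title, charge and multiplicity, geometry, closing blank line) follows the standard Gaussian input format, but this is not the MoSGrid portlet code.

```python
from typing import Iterable, Tuple

Atom = Tuple[str, float, float, float]


def build_gaussian_input(
    title: str,
    route: str,
    atoms: Iterable[Atom],
    charge: int = 0,
    multiplicity: int = 1,
    checkpoint: str = "job.chk",
) -> str:
    """Assemble a Gaussian input: Link 0 line, route section, title,
    charge/multiplicity line, Cartesian geometry, trailing blank line."""
    if not route.startswith("#"):
        raise ValueError("route section must start with '#'")
    lines = [f"%chk={checkpoint}", route, "", title, "", f"{charge} {multiplicity}"]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # the geometry block ends with a blank line
    return "\n".join(lines) + "\n"


# Water with an approximate geometry (coordinates in angstrom, illustrative).
water = [
    ("O", 0.000000, 0.000000, 0.117300),
    ("H", 0.000000, 0.757200, -0.469200),
    ("H", 0.000000, -0.757200, -0.469200),
]
job_text = build_gaussian_input("water optimization", "# B3LYP/6-31G(d) Opt", water)
```

Generating the text in one place keeps the GUI free of format details and lets an uploaded, pre-created job file take the same path afterwards.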


Domain Quantum Chemistry


Post-processing

Parsing of result file

Python scripts executed by portlet

Relevant information about molecular properties

Data in CSV format saved and accessible
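
A post-processing script of this kind might look like the following sketch, which pulls the SCF energies out of a Gaussian log and writes them as CSV. The regular expression targets the usual `SCF Done:` line. The sample log, the function names, and the column names are assumptions for illustration, and the actual MoSGrid scripts are not shown in the slides.

```python
import csv
import io
import re

# Matches lines such as
# " SCF Done:  E(RB3LYP) =  -76.4089533236     A.U. after   10 cycles"
SCF_LINE = re.compile(r"SCF Done:\s+E\((?P<method>[^)]+)\)\s+=\s+(?P<energy>-?\d+\.\d+)")


def extract_scf_energies(log_text):
    """Return (step, method, energy in hartree) for every SCF Done line."""
    rows = []
    for match in SCF_LINE.finditer(log_text):
        rows.append((len(rows) + 1, match.group("method"), float(match.group("energy"))))
    return rows


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["step", "method", "energy_hartree"])
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative log fragment, not output from a real MoSGrid job.
SAMPLE_LOG = """\
 SCF Done:  E(RB3LYP) =  -76.4070000000     A.U. after   10 cycles
 Some other output line
 SCF Done:  E(RB3LYP) =  -76.4089533236     A.U. after    8 cycles
"""

rows = extract_scf_energies(SAMPLE_LOG)
csv_text = rows_to_csv(rows)
```

Since the Gaussian output is unstructured text, a narrow pattern per property keeps the parser easy to extend when further molecular properties are needed.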


Domain Docking



CADDSuite (Computer-aided Drug Design)




Domain Docking

Galaxy available for local resources in Tübingen


MolDB



Stores molecules in binary format, which allows for fast export

Automatically creates and stores canonical SMILES, fingerprints, and functional group counts for imported molecules

Automatically saves and restores docking/rescoring results

DB can be filtered by all stored molecule properties before exporting molecules

Current speed for import/export: ~100 compounds/sec.



MSML

Molecular Simulation Markup Language


Based on CML (Chemical Markup Language)


Common interpretation by humans and computers

Follows the minimum information principle

Description: http://xml-cml.org/convention/dictionary

XSL transformation

Used for validation purposes: validator.xml-cml.org
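
Since MSML is based on CML, a molecule description can be read with any XML parser. The sketch below parses a small plain-CML fragment with Python's standard library. The sample molecule and the helper names are assumptions for illustration, and no MSML-specific convention or dictionary entry is shown, because the slides do not spell them out.

```python
import xml.etree.ElementTree as ET

# Namespace of the CML schema; the prefix "cml" is chosen locally.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Illustrative plain-CML fragment (water), not an actual MSML document.
SAMPLE_CML = """\
<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.0000" y3="0.0000" z3="0.1173"/>
    <atom id="a2" elementType="H" x3="0.0000" y3="0.7572" z3="-0.4692"/>
    <atom id="a3" elementType="H" x3="0.0000" y3="-0.7572" z3="-0.4692"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>
"""


def read_atoms(cml_text):
    """Return (id, element, x, y, z) for each atom in the molecule."""
    root = ET.fromstring(cml_text)
    atoms = []
    for atom in root.findall("cml:atomArray/cml:atom", CML_NS):
        atoms.append((
            atom.get("id"),
            atom.get("elementType"),
            float(atom.get("x3")),
            float(atom.get("y3")),
            float(atom.get("z3")),
        ))
    return atoms


def read_bonds(cml_text):
    """Return the pair of atom ids referenced by each bond."""
    root = ET.fromstring(cml_text)
    bonds = root.findall("cml:bondArray/cml:bond", CML_NS)
    return [tuple(bond.get("atomRefs2").split()) for bond in bonds]


atoms = read_atoms(SAMPLE_CML)
bonds = read_bonds(SAMPLE_CML)
```

The same structure is readable by a person and by a program, which is the "common interpretation" the slide refers to.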



Future Work


WS-PGRADE
  Integration of the UNICORE IDB to offer drop-down boxes of available tools

MD and QC portlets
  Adaptation to the gUSE workflow engine via the ASM libraries

CADDSuite
  Export of workflows from Galaxy to WS-PGRADE

MSML
  Further development



Involved Projects

SHIWA (SHaring Interoperable Workflows for Large Scale Scientific Simulations on Available DCIs)
  EU project
  Duration: 01.07.2010 to 30.06.2012
  Tübingen participates via Galaxy workflow export

CompChem Virtual Organization
  EGEE project
  Available resources


Future Projects

SCI-BUS (SCIentific gateway Based User Support)
  EU project
  Duration: 01.10.2011 to 30.09.2014
  Pan-European resources
  Tübingen participates by extending the MoSGrid portal with an interactive molecule editor and a semantic search


Acknowledgements

Oliver Kohlbacher
Ákos Balaskó
Georg Birkenheuer
Sebastian Breuers
Richard Grunzke
Sonja Herres-Pawlis
Valentina Huber
Miklos Kozlovszky
Jens Krüger
István Márton
Patrick Schäfer
Bernd Schuller
Johannes Schuster
Anna Szikszay Fabri
Klaus-Dieter Warzecha
Martin Wewior
