Bioinformatics – A Glimpse of the Future - University of Edinburgh

weinerthreeforks, Biotechnology

2 Oct 2013

3-Oct-13

Anatoly Sorokin

Existing Standards in Systems Biology

Anatoly Sorokin

Computational Systems Biology Group

University of Edinburgh

Standard

2000-2010 is the decade of standards in biology

31 MIBBI standards
56 OBO ontologies
About 80 exchange formats

Scope of interest

Language

Controlled vocabulary


Standards and Languages

CML: description of chemical structure

MathML: representation of mathematical formulas

PSI: standard description of protein interaction data

AnatML: language to describe interaction at organ level

GeneOntology: standard and ontology to describe gene function and regulation

Standards for Computational Systems Biology

BioPAX: language for exchange of biological network databases

SBML: language for biochemical model exchange

CellML: language to describe mathematical models

SBGN: visual language for biological model description

MI standards

Reporting guidelines specify the minimum amount of metadata (information) and data required to meet a specific aim.

The aim is to provide enough metadata and data to enable the unambiguous reproduction and interpretation of an experiment.

Normally informal, human-readable specifications that inform the development of formal data models (e.g. XML or UML) and data exchange formats.


Exchange format

Strict structure for exchanging data or models

Mainly XML

Well-defined meta-model, often supported by a software API


Ontologies

“ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences” (Wikipedia)

Often used as a controlled vocabulary and a description support framework

Example: GeneOntology


BioPAX

“Biological PAthway eXchange - A data exchange ontology and format for biological pathway integration, aggregation and inference”

BioPAX Goals

BioPAX = Biological PAthway eXchange

Data exchange format for pathway data

Include support for these pathway types:
  Metabolic pathways
  Signaling pathways
  Protein-protein, molecular interactions
  Gene regulatory pathways
  Genetic interactions

Accommodate representations used in existing databases such as BioCyc, BIND, WIT, aMAZE, KEGG, Reactome, etc.

PathwayCommons: collection of pathways in BioPAX

http://www.pathwaycommons.org

BioPAX

BioPAX ontology and format are defined in OWL (XML)

Ontology built using GKB Editor and Protégé

Semantic mapping is still an issue

Level 1 represents metabolic pathway data

Level 2 adds support for molecular interactions, post-translational modifications, and experimental description from the PSI-MI model (backwards compatible)

Level 3 adds support for generics and protein states, and rearranges the reaction representation

BioPAX Ontology: Top Level

Pathway
  A set of interactions
  E.g. Glycolysis, MAPK, Apoptosis

Interaction
  A set of entities and some relationship between them
  E.g. Reaction, Molecular Association, Catalysis

Physical Entity
  A building block of simple interactions
  E.g. Small molecule, Protein, DNA, RNA

[Diagram labels: Entity, Pathway, Interaction, Physical Entity; edge types: Subclass (is a), Contains (has a)]
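
The two edge types in the diagram, "is a" and "has a", map naturally onto inheritance and containment. The following is a minimal Python sketch of the top-level classes, not the BioPAX OWL definition; the field names `participants` and `interactions` and the example objects are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Root of the sketch: every object below is an Entity ("is a")."""
    name: str


@dataclass
class PhysicalEntity(Entity):
    """A building block of interactions, e.g. a small molecule or a protein."""


@dataclass
class Interaction(Entity):
    """A set of entities and some relationship between them ("has a")."""
    participants: List[Entity] = field(default_factory=list)


@dataclass
class Pathway(Entity):
    """A set of interactions ("has a")."""
    interactions: List[Interaction] = field(default_factory=list)


# Illustrative objects only; the names are not taken from any database.
glucose = PhysicalEntity("glucose")
g6p = PhysicalEntity("glucose-6-phosphate")
step = Interaction("hexokinase reaction", participants=[glucose, g6p])
glycolysis = Pathway("Glycolysis", interactions=[step])

print(isinstance(glycolysis, Entity))
print([p.name for p in glycolysis.interactions[0].participants])
```

Inheritance carries the "Subclass (is a)" relation, while the list-valued fields carry "Contains (has a)".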

BioPAX Ontology: Interactions

[Diagram: class hierarchy under Interaction, with classes Control, Conversion, Catalysis, BiochemicalReaction, ComplexAssembly, Modulation, Transport, TransportWithBiochemicalReaction, Physical Interaction]
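
The slide gives only the class names, so the arrangement below is an assumption based on the BioPAX Level 2 ontology: Control and Conversion sit under Physical Interaction, Catalysis and Modulation under Control, and the reaction types under Conversion. It is a sketch in Python, not a substitute for the OWL file.

```python
class Interaction:
    """Top of the interaction branch."""


class PhysicalInteraction(Interaction):
    pass


class Control(PhysicalInteraction):
    pass


class Conversion(PhysicalInteraction):
    pass


class Catalysis(Control):
    pass


class Modulation(Control):
    pass


class BiochemicalReaction(Conversion):
    pass


class ComplexAssembly(Conversion):
    pass


class Transport(Conversion):
    pass


class TransportWithBiochemicalReaction(Transport, BiochemicalReaction):
    """Assumed to inherit from both parents, hence multiple inheritance."""


def ancestors(cls):
    """Return the names of all superclasses of cls, nearest first."""
    return [c.__name__ for c in cls.__mro__ if c is not cls and c is not object]


print(ancestors(Catalysis))
print(ancestors(TransportWithBiochemicalReaction))
```

Python's method resolution order gives a quick way to list every "is a" relation implied by the hierarchy.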

BioPAX Ontology: Physical Entities

[Diagram: class hierarchy under PhysicalEntity, with classes Complex, RNA, Protein, Small Molecule, DNA]

BioPAX and other standards

[Figure: coverage of BioPAX, PSI-MI 2, SBML and CellML. Panels: Database Exchange Formats; Simulation Model Exchange Formats (Rate Formulas). Data types shown: Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular, Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (Low Detail to High Detail); Metabolic Pathways (Low Detail to High Detail); Biochemical Reactions; Small Molecules (Low Detail to High Detail)]

Simulation-related standards

[Diagram labels: Model, Simulation, Result; Minimal Requirements, Exchange format, Ontology; SED-ML, SBRML, "?"; arrows "implements" and "makes sense of"]

SBML

The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks. SBML is applicable to metabolic networks, cell-signaling pathways, regulatory networks, and many others.


SBML

Reaction: container for rate law

Species: reactants, products, or modifiers of reaction

Compartment: container for species

Parameter, Rule, Event
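
To show how these components fit together, here is a small Python sketch that writes a toy SBML Level 2 Version 4 document with the standard library. The model itself (ids `toy_model`, `cell`, `A`, `B`, `k`, `conversion` and all values) is made up for illustration, and the output has not been checked with an SBML validator.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def build_model() -> ET.Element:
    """Build a toy model: one compartment, two species, one reaction A -> B."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "4"})
    model = ET.SubElement(sbml, "model", {"id": "toy_model"})

    # Compartment: container for species.
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell", "size": "1"})

    # Species: reactants, products, or modifiers of reactions.
    species = ET.SubElement(model, "listOfSpecies")
    for sid, conc in (("A", "10"), ("B", "0")):
        ET.SubElement(species, "species",
                      {"id": sid, "compartment": "cell",
                       "initialConcentration": conc})

    parameters = ET.SubElement(model, "listOfParameters")
    ET.SubElement(parameters, "parameter", {"id": "k", "value": "0.1"})

    # Reaction: container for the rate law (kineticLaw, written in MathML).
    reactions = ET.SubElement(model, "listOfReactions")
    reaction = ET.SubElement(reactions, "reaction",
                             {"id": "conversion", "reversible": "false"})
    reactants = ET.SubElement(reaction, "listOfReactants")
    ET.SubElement(reactants, "speciesReference", {"species": "A"})
    products = ET.SubElement(reaction, "listOfProducts")
    ET.SubElement(products, "speciesReference", {"species": "B"})

    law = ET.SubElement(reaction, "kineticLaw")
    math = ET.SubElement(law, "math", {"xmlns": MATHML_NS})
    apply_ = ET.SubElement(math, "apply")
    ET.SubElement(apply_, "times")
    for symbol in ("k", "A", "cell"):  # rate = k * [A] * compartment size
        ET.SubElement(apply_, "ci").text = symbol
    return sbml


doc = build_model()
print(ET.tostring(doc, encoding="unicode"))
```

In practice a dedicated SBML library would normally be used instead of hand-built XML; the sketch only makes the nesting of Compartment, Species, Parameter and Reaction visible.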


Characteristics of SBML

Many top-level types, little nesting
  Units, Compartment, Species, Parameter, Reaction, Rule, Function, Event

Non-modular structure
  Next SBML ‘Level’ (3) will introduce modularity

Emphasis on reactions

Some math implicit
  Explicit rate equations; implicit integration
  Implicit concentration conversion between compartments

Compartments are physical containers for species
  Spatial dimensions (volume, surface)
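
"Explicit rate equations; implicit integration" means the model states the rate law, while turning rates into time courses is left to the simulation software. A minimal Python sketch of that division of labour follows; forward Euler is chosen here only for brevity, and the reaction A -> B with its values is an invented example.

```python
from typing import List, Tuple


def rate(k: float, a: float) -> float:
    """Explicit part: the mass-action rate law for A -> B, as a model states it."""
    return k * a


def simulate(a0: float, b0: float, k: float,
             dt: float, steps: int) -> List[Tuple[float, float, float]]:
    """Implicit part: the simulator decides how to integrate (forward Euler here)."""
    a, b = a0, b0
    trajectory = [(0.0, a, b)]
    for i in range(1, steps + 1):
        v = rate(k, a)      # evaluate the rate law at the current state
        a -= v * dt         # A is consumed
        b += v * dt         # B is produced
        trajectory.append((i * dt, a, b))
    return trajectory


trajectory = simulate(a0=10.0, b0=0.0, k=0.1, dt=0.1, steps=100)
t_end, a_end, b_end = trajectory[-1]
print(f"t={t_end:.1f}  A={a_end:.3f}  B={b_end:.3f}")
```

Because the model does not fix the integration method, two tools can give slightly different numbers for the same model, which is one reason simulation settings need their own description.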



Structure of SBML

The notes field of SBase is intended to store information for humans to read.

The annotation field of SBase provides a container for software-generated annotations that are not intended to be seen by humans.

The id field is required for most structures and is used to identify a component within the model definition.

The name field is optional and provides a human-readable label for the component.




MIRIAM

A model description requires extra information

Biological
  Description of elements of the model

Mathematical
  Definition of math concepts

Referential
  Author name
  Paper reference etc.

http://www.ebi.ac.uk/compneur-srv/miriam/



Reference correspondence

The model must be encoded in a public, standardized, machine-readable format (SBML, CellML, GENESIS ...)

The model must comply with the standard in which it is encoded!

The model must be clearly related to a single reference description. If a model is composed from different parts, there should still be a description of the derived/combined model.

The encoded model structure must reflect the biological processes listed in the reference description.

The model must be instantiated in a simulation: all quantitative attributes have to be defined, including initial conditions.

When instantiated, the model must be able to reproduce all results given in the reference description within an epsilon (algorithms, round-off errors)
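
The last rule, reproduction "within an epsilon", can be read as a tolerance check between published and re-simulated values. A minimal Python sketch follows; the tolerance values and the sample numbers are assumptions, since MIRIAM does not prescribe a specific epsilon.

```python
import math
from typing import Sequence


def reproduces(reference: Sequence[float], simulated: Sequence[float],
               rel_tol: float = 1e-3, abs_tol: float = 1e-9) -> bool:
    """Return True if every simulated value matches the reference within tolerance."""
    if len(reference) != len(simulated):
        return False
    return all(math.isclose(r, s, rel_tol=rel_tol, abs_tol=abs_tol)
               for r, s in zip(reference, simulated))


# Illustrative numbers only: values read from a paper versus a fresh simulation.
published = [10.0, 9.05, 8.19, 7.41]
resimulated = [10.0, 9.0484, 8.1873, 7.4082]
print(reproduces(published, resimulated))
```

The relative tolerance absorbs differences caused by the integration algorithm and round-off; the absolute tolerance keeps the check meaningful for values close to zero.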


Attribution annotation

The model has to be named.

A citation of the reference description must be attached (complete citation, unique identifier, unambiguous URL). The citation should make it possible to identify the authors of the model.

The name and contact details of the model creators must be attached.

The date and time of creation and last modification should be specified. A history is useful but not required.

The model should be linked to a precise statement about the terms of distribution. MIRIAM does not require “freedom of use” or “no cost”.
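
The rules above separate what must be present (name, citation, creators) from what should be present (dates, terms of distribution). A minimal Python sketch of such a check follows; the dictionary keys and the example record are hypothetical and not part of MIRIAM or any model format.

```python
from typing import Dict, List

# Key names are assumptions chosen for this sketch.
REQUIRED = ("name", "citation", "creators")                         # "must"
RECOMMENDED = ("created", "modified", "terms_of_distribution")      # "should"


def check_attribution(metadata: dict) -> Dict[str, List[str]]:
    """List the attribution items that are absent or empty in a metadata record."""
    return {
        "missing_required": [k for k in REQUIRED if not metadata.get(k)],
        "missing_recommended": [k for k in RECOMMENDED if not metadata.get(k)],
    }


# Hypothetical record; every value is a placeholder.
example = {
    "name": "Toy model",
    "citation": "doi:10.0000/example",
    "creators": [{"name": "A. Modeller", "contact": "modeller@example.org"}],
    "created": "2013-10-03T00:00:00Z",
}
report = check_attribution(example)
print(report)
```

Here the record passes the required items, and the report lists the modification date and distribution terms as still missing.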


External resource annotation

The annotation must unambiguously relate a piece of knowledge to a model constituent.

The referenced information should be described using a triplet {data-type, identifier, qualifier}

The data-type should be written as a Uniform Resource Identifier (URI)

The identifier is analysed within the framework of the data-type.

Data-type and identifier can be combined in a single URI:
  http://www.myResource.org/#myIdentifier
  urn:lsid:myResource.org:myIdentifier

Qualifiers (optional) should refine the link between the model constituent and the piece of knowledge: “has a”, “is version of”, “is homolog to” etc.
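
The triplet and its two combined forms can be sketched in a few lines of Python. The class and function names are assumptions for illustration; the URIs are the placeholder examples from the slide, not real resources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceAnnotation:
    """The {data-type, identifier, qualifier} triplet."""
    data_type: str                    # URI naming the resource
    identifier: str                   # interpreted within that resource
    qualifier: Optional[str] = None   # e.g. "is version of"; optional


def combine_url(annotation: ResourceAnnotation) -> str:
    """Combine data-type and identifier in the URL style shown above."""
    return f"{annotation.data_type.rstrip('/')}/#{annotation.identifier}"


def combine_lsid(authority: str, identifier: str) -> str:
    """Combine an authority name and identifier in the urn:lsid style shown above."""
    return f"urn:lsid:{authority}:{identifier}"


note = ResourceAnnotation("http://www.myResource.org/", "myIdentifier",
                          qualifier="is version of")
print(combine_url(note))
print(combine_lsid("myResource.org", note.identifier))
```

Keeping the qualifier outside the combined URI mirrors the slide: the URI says which piece of knowledge is meant, and the qualifier says how the model constituent relates to it.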


SBO

Part of OBO Foundry

Assigns meanings to mathematical elements of SBML

Allows automatic validation of semantic consistency of the math part of a model

http://www.ebi.ac.uk/sbo


SBO

Types and roles of reaction participants, including terms like “substrate”, “catalyst” etc., but also “macromolecule” or “channel”.

Parameters used in quantitative models. This vocabulary includes terms like “Michaelis constant”, “forward unimolecular rate constant” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to other parameters.

Mathematical expressions. Examples of terms are “mass action kinetics”, “Henri-Michaelis-Menten equation” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to the other vocabularies.

Modelling framework, which specifies how to interpret the rate law. E.g. “continuous modelling”, “discrete modelling” etc.

Event type, such as “catalysis” or “addition of a chemical group”.
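
A "mathematical expression" term can be pictured as a function whose arguments are themselves named by other vocabulary terms. The Python sketch below uses the standard textbook forms of two such expressions; the dictionary keys are plain labels chosen here, not SBO identifiers, and the numbers are arbitrary.

```python
from typing import Callable, Dict


def mass_action(k: float, *concentrations: float) -> float:
    """Mass action kinetics: rate constant times the product of concentrations."""
    rate = k
    for c in concentrations:
        rate *= c
    return rate


def henri_michaelis_menten(vmax: float, km: float, s: float) -> float:
    """Henri-Michaelis-Menten equation: v = Vmax * S / (Km + S).

    km plays the role of the "Michaelis constant" parameter term.
    """
    return vmax * s / (km + s)


# Labels map to callables, loosely mirroring "term -> lambda function".
RATE_LAWS: Dict[str, Callable[..., float]] = {
    "mass action kinetics": mass_action,
    "Henri-Michaelis-Menten equation": henri_michaelis_menten,
}

v = RATE_LAWS["Henri-Michaelis-Menten equation"](10.0, 2.0, 2.0)
print(v)
```

When a rate law in a model is tagged with such a term, software can compare the tagged meaning with the formula actually written, which is the basis for the semantic consistency checks mentioned earlier.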


MIASE

Minimum Information About a Simulation Experiment

What base model to use and which modifications to apply

What simulation task to run on those models (algorithms, see KiSAO; simulation parameters)

How to post-process the numerical results and how to present them

http://www.ebi.ac.uk/compneur-srv/miase/

A subset of MIASE could be encoded in SED-ML
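
The three MIASE questions (which model and modifications, which task, which post-processing and presentation) can be captured in a small record. The Python sketch below is only an illustration of that checklist; the class and field names are invented here and do not follow the SED-ML schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelSpec:
    source: str                                             # where the base model lives
    changes: Dict[str, float] = field(default_factory=dict)  # modifications to apply


@dataclass
class Task:
    algorithm: str                                          # ideally a KiSAO term
    parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class Experiment:
    model: ModelSpec
    task: Task
    post_processing: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def missing_parts(exp: Experiment) -> List[str]:
    """Name the MIASE-style items that are still empty in an experiment record."""
    checks = {
        "model source": exp.model.source,
        "simulation algorithm": exp.task.algorithm,
        "post-processing": exp.post_processing,
        "outputs": exp.outputs,
    }
    return [label for label, value in checks.items() if not value]


# Hypothetical experiment; file name and settings are placeholders.
experiment = Experiment(
    model=ModelSpec("toy_model.xml", changes={"k": 0.2}),
    task=Task("deterministic time course", {"end_time": 10.0, "steps": 100}),
    post_processing=["normalise A by its initial value"],
    outputs=["plot A and B against time"],
)
print(missing_parts(experiment))
```

An empty result means every item of the checklist has been filled in; it says nothing about whether the entries themselves are correct.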

[Figure-only slides: Description of models; Simulations; Simulation task; Data generation; Production of results]


KiSAO

Kinetic Simulation Algorithm Ontology

Classification of simulation algorithms and methods

Definition, literature references

Relations between different simulation algorithms and methods

http://www.ebi.ac.uk/compneur-srv/kisao/index.html


http://bioportal.bioontology.org/visualize/40844


SBRML

Systems Biology Results Markup Language

A new markup language for specifying the results from operations on SBML models

http://www.comp-sys-bio.org/tiki-index.php?page=SBRML

[Figure-only slides: Dimension example]


TEDDY

The TErminology for the Description of DYnamics (TEDDY) project aims to provide an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.

http://www.ebi.ac.uk/compneur-srv/teddy/


TEDDY top-level structure

Temporal Behaviour (concrete behaviours of a model, more or less the same as trajectories):
  Oscillation, Steady State, Fixed Point, Cycle, ...

Behaviour Characteristic (properties to characterise concrete behaviours):
  Period, Amplitude, ...

Behaviour Diversification (system properties describing the ability of systems to exhibit different behaviours):
  Bifurcation, Bi-Stability

Functional Motif (structural features of a system necessary for specific function):
  Negative Feedback, FFL, ...
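
To make the link between a Temporal Behaviour and its Behaviour Characteristics concrete, the Python sketch below estimates the amplitude and period of a sampled oscillation. The estimators (half the peak-to-peak range, and the spacing of upward crossings of the mean) are simple choices made for this illustration and are not defined by TEDDY; the test signal is synthetic.

```python
import math
from typing import List, Optional, Sequence


def amplitude(values: Sequence[float]) -> float:
    """Half of the peak-to-peak range of the sampled trajectory."""
    return (max(values) - min(values)) / 2


def period(times: Sequence[float], values: Sequence[float]) -> Optional[float]:
    """Average spacing between upward crossings of the mean value."""
    mean = sum(values) / len(values)
    crossings: List[float] = []
    for i in range(1, len(values)):
        if values[i - 1] < mean <= values[i]:
            # Linear interpolation between the two samples around the crossing.
            frac = (mean - values[i - 1]) / (values[i] - values[i - 1])
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
    if len(crossings) < 2:
        return None  # not enough cycles to estimate a period
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)


# Synthetic oscillation: amplitude 3, period 2, sampled every 0.01 time units.
times = [i * 0.01 for i in range(1000)]
values = [3.0 * math.sin(2 * math.pi * t / 2.0) for t in times]
print(amplitude(values), period(times, values))
```

A steady-state trajectory would yield no crossings, so `period` returns `None`, which separates the two kinds of Temporal Behaviour in this toy setting.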



Questions
