Rocca-Serra presentation


2 Oct 2013 (3 years and 10 months ago)


The European Bioinformatics Institute

MGED ontology for consistent annotation of microarray experiments

Manchester Bioinformatics Week
Ontologies Workshop, March 23-24th 2002

Philippe Rocca-Serra
Microarray Informatics Team
EBI-EMBL, Hinxton, Cambridge


ArrayExpress: a database for Gene Expression Studies

[Figure: gene expression data matrix, with axes labelled Samples and Genes]

ArrayExpress goals

To create a public repository for gene expression data:
- apply a standard format
- apply curation to the data (high quality control)
- easy access to information
- search and retrieve information

To compare experiments.

To perform analysis and data mining using complex querying.

What kind of data should be stored?

[Figure: the gene expression data matrix surrounded by its annotations: Experiment (platform, conditions…), Samples, Genes & transcription units]

Important issues about data annotation

Sufficient annotation of the experiment, genes and samples

Efficient annotation:
- Machine processable: effective mining agents
- Homogeneous: consistent annotation
- Unambiguous: accurate description, sample discrimination



MIAME Requirements: addressing the issue of sufficient annotation

- Experimental design: the set of hybridisation experiments as a whole
- Array design: each array used and each element (spot) on the array
- Samples: samples used, extract preparation and labelling
- Hybridisations: procedures and parameters
- Measurements: images, quantitation, specifications
- Normalisation controls: types, values, specifications

(Brazma et al, Nature Genetics, 2001)

Samples: samples used, extract preparation and labelling

Recorded info should be sufficient to interpret and replicate the experiment
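
As a rough illustration of "sufficient annotation", the six MIAME sections listed above can be treated as a checklist that a submission either fills in or leaves empty. The sketch below is hypothetical: the section keys and the `missing_miame_sections` helper are names made up for this example and are not part of MIAME or of any ArrayExpress tool.

```python
# Hypothetical checklist of the six MIAME sections named on the slide.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridisations",
    "measurements",
    "normalisation_controls",
)


def missing_miame_sections(submission):
    """Return the MIAME sections that are absent or empty in a submission."""
    return [name for name in MIAME_SECTIONS if not submission.get(name)]


# Illustrative submission: two sections filled in, one present but empty.
example = {
    "experimental_design": "the set of hybridisation experiments as a whole",
    "samples": "samples used, extract preparation and labelling",
    "measurements": "",
}

print(missing_miame_sections(example))
```

A real check would look inside each section as well; this one only reports which sections have no content at all.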

Second Challenge

Addressing the issue of annotation efficiency requires machine understandable annotations:
- Avoid free text and natural language:
  - Avoid synonyms: adrenaline / epinephrine
  - General use of controlled vocabularies (CV) and ontologies
  - Gene annotation using e.g. GO and pathway analysis
- Create a new ontology where necessary:
  - Task assigned to MGED for Biomaterial (sample) description

One of the main MGED goals: to facilitate the adoption of standards for DNA-array experiment annotation and data representation
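
The synonym problem mentioned above (adrenaline / epinephrine) can be sketched as a lookup that maps every accepted spelling to one preferred term and rejects anything else. This is only an illustration: the vocabulary contents, the choice of "epinephrine" as the preferred term, and the helper names are assumptions made for the example, not taken from the MGED ontology.

```python
# Hypothetical controlled vocabulary: preferred term -> known synonyms.
# Treating "epinephrine" as the preferred term is an arbitrary choice here.
CONTROLLED_VOCABULARY = {
    "epinephrine": {"adrenaline"},
}


def build_synonym_index(vocabulary):
    """Map every preferred term and synonym (lower-cased) to its preferred term."""
    index = {}
    for preferred, synonyms in vocabulary.items():
        index[preferred.lower()] = preferred
        for synonym in synonyms:
            index[synonym.lower()] = preferred
    return index


SYNONYM_INDEX = build_synonym_index(CONTROLLED_VOCABULARY)


def normalise_term(free_text):
    """Return the preferred term, or raise if the text is not in the vocabulary."""
    key = free_text.strip().lower()
    if key not in SYNONYM_INDEX:
        raise ValueError(f"{free_text!r} is not in the controlled vocabulary")
    return SYNONYM_INDEX[key]


print(normalise_term("Adrenaline"))
print(normalise_term("epinephrine"))
```

Rejecting unknown terms instead of storing them as free text is what keeps two samples annotated with different synonyms comparable by a machine.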

Ontology integration in the object model describing ArrayExpress database

ArrayExpress DB is an implementation of the MAGE-OM model (a UML model)

MAGE model by construction includes the use of ontology entries:
- 37 locations for an "Ontology Entry"
- 36 cases of simple Controlled Vocabularies: e.g. Image Format (TIFF, JPEG)
- 1 has required development of specific modelling: Biomaterial (sample) description
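
A simple controlled vocabulary of the kind counted above, such as Image Format (TIFF, JPEG), can be sketched as an entry type that only accepts values from a fixed list. The class below is a loose illustration, not the MAGE-OM definition: the field names, the `CONTROLLED_VOCABULARIES` table and the "ImageFormat" key are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical table of controlled vocabularies keyed by category.
# Only the TIFF and JPEG values come from the slide.
CONTROLLED_VOCABULARIES = {
    "ImageFormat": frozenset({"TIFF", "JPEG"}),
}


@dataclass(frozen=True)
class OntologyEntry:
    """A category/value pair whose value must come from a controlled vocabulary."""

    category: str
    value: str

    def __post_init__(self):
        allowed = CONTROLLED_VOCABULARIES.get(self.category)
        if allowed is None:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.value not in allowed:
            raise ValueError(
                f"{self.value!r} is not an allowed {self.category} value"
            )


entry = OntologyEntry("ImageFormat", "TIFF")
print(entry)
```

The one case that needed specific modelling, Biomaterial (sample) description, does not fit a flat list like this, which is why it gets its own ontology.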


MAGE BioMaterial Model

Facts about MGED biomaterial ontology

Authors:
- Developed by Chris Stoeckert, U. Penn and Helen Parkinson, EBI
- Coordinated with the ArrayExpress database model (mapping available)

Technical choices: Use of the OIL Language
- A new standard for building ontologies that provides support for Formal Semantics and Reasoning:
  - Class/property modelling primitives based on Frame based systems
  - Semantics Capturing based on Description Logics
  - Syntax for encoding primitives and semantics based on existing Web languages: XML

Availability: http://mged.sourceforge.net/Ontologies.shtml

MGED ontology: features & complexity

Facts about the ontology:
- 75 classes
- 70 slots
- 98 individuals
- more individuals to be added


Using MGED Ontology: a Browseable Form


MGED defined concepts: internal terms


Linking to external ontologies: an application

MGED Ontology                     Instances                           External References

BioMaterialDescription
  Biosource Property
    Organism                                                          NCBI Taxonomy: Mus musculus musculus, id: 39442
    Age                           7 weeks after birth
    DevelopmentStage                                                  Mouse Anatomical Dictionary: Stage 28
    Sex                           Female
    StrainOrLine                                                      International Committee on Standardized Genetic Nomenclature for Mice: C57BL/6
    BiosourceProvider             Charles River, Japan
    OrganismPart                                                      Mouse Anatomical Dictionary: Liver
  BioMaterialManipulation
    EnvironmentalHistory
      CultureCondition
        Temperature               22 ± 2 °C
        Humidity                  55 ± 5%
        Light                     12 hours light/dark cycle
      PathogenTests               Specified pathogen free conditions
      Water                       ad libitum
      Nutrients                   MF, Oriental Yeast, Tokyo, Japan
    Treatment
      CompoundBasedTreatment
        (Compound)                                                    ChemIDplus: Fenofibrate, CAS 49562-28-9
        (Treatment_application)   in vivo, oral gavage
        (Measurement)             100mg/kg body weight
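
The sample annotation above mixes values recorded directly against MGED Ontology classes with values drawn from external resources. One way to picture this is a record that carries the class, the value, and an optional source. The sketch below is hypothetical: the `Annotation` type, its field names and the `external_sources` helper are made up for this example; only the classes, values and resource names come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    ontology_class: str           # MGED Ontology class or slot
    value: str                    # instance recorded for the sample
    source: Optional[str] = None  # external resource, if the value comes from one


SAMPLE = [
    Annotation("Organism", "Mus musculus musculus, id: 39442", "NCBI Taxonomy"),
    Annotation("Age", "7 weeks after birth"),
    Annotation("DevelopmentStage", "Stage 28", "Mouse Anatomical Dictionary"),
    Annotation("Sex", "Female"),
    Annotation(
        "StrainOrLine",
        "C57BL/6",
        "International Committee on Standardized Genetic Nomenclature for Mice",
    ),
    Annotation("OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    Annotation("Compound", "Fenofibrate, CAS 49562-28-9", "ChemIDplus"),
]


def external_sources(annotations):
    """Group annotated classes by the external resource their value refers to."""
    grouped = {}
    for annotation in annotations:
        if annotation.source is not None:
            grouped.setdefault(annotation.source, []).append(
                annotation.ontology_class
            )
    return grouped


for source, classes in external_sources(SAMPLE).items():
    print(f"{source}: {', '.join(classes)}")
```

Keeping the source next to the value is what lets a term such as "Liver" be resolved against the resource that defines it rather than read as free text.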


Referencing to external ontologies

- NCBI taxonomy database
- Jackson Lab mouse strains and genes
- Edinburgh mouse atlas anatomy
- GO Gene Ontology
- HUGO nomenclature for Human genes
- Chemical and compound Ontologies - Merck index
- TAIR
- Flybase
- …and many more…

www.mged.org/ontology/


Planning MGED ontology’s future

Ontology availability made simple?

[Figure: diagram with the labels ArrayExpress DB; Direct Submission in MAGE-ML; Large centres LIMS; Submission via MIAMExpress; Curation DB; Other submitters; MGED/ArrayExpress ontology; External Ontologies]

Planning MGED ontology’s future

Making the ontology available where it’s needed:
- Develop browser or other interface for the ontology and link to LIMS
- Incorporate the ontology into submission/annotation and curation tools (MIAMExpress)

Further ontology development: new instances, class refinement

Better integration of available ontologies

Writing guidelines on how to use ontologies for annotating data:
- Developing Use cases (non trivial task)



Resources

- List of ontology resources from MGED pages
- MAGE-MIAME-ontology mappings, MIAME glossary
- Schemas for both ArrayExpress and MIAMExpress
- Annotation examples in MAGE-ML

URL: www.mged.org | www.ebi.ac.uk/microarray

mailing lists:
microarray-ontol-request@ebi.ac.uk
microarray-annot-request@ebi.ac.uk


Acknowledgements

EBI-EMBL: H. Parkinson, S. Sansone, E. Holloway, A. Brazma

University of Pennsylvania: C. Stoeckert

And the Microarray Informatics Team.