Semantic Web Applications in Bioinformatics - NUBIOS

2 Oct 2013

“Semantic Web” Applications
in Bioinformatics

Amr AL-Hossary

M.B.B.Ch

Agenda


Web & Semantic Web


RDF & RDF Schema


Elements


Schema


Namespaces


Queries


Pain, Design Patterns, and Limits


Applications of SW in Bioinformatics

Web Today


Documents for HUMANS


Increasing dramatically


Hard to process on semantic level


e.g. searching for “give her a ring” doesn’t return “engage her”.


Solution (semantic web)


annotating definitions


abstract representation of classes & relations

What is the Semantic Web?

Myths about Semantic Web…


is top-down


needs ontologies at the beginning


requires all information to be converted to RDF


must be centralized


handles only binary relations


requires the entire graph to exist in one memory store

Truths


A means of describing (a Web of) Data


A system defining and incorporating semantics


A mechanism for making statements about things


A format for associating metadata


A strategy for federating data systems (with or without a triplestore)

Needs


shared definitions of knowledge domains, i.e. ontologies,


association of concepts to existing data,


metadata information describing information sources and contents,


search tools able to make the best use of this additional information.

RDF


(Directed) graph data model


Set of Binary relations (triples)


Subject Predicate Object


Unlike a DBMS: the absence of a statement does NOT mean the statement is false (open-world assumption).


XML and RDF/OWL are inherently different


XML = thesaurus document structure


RDF = thesaurus document content
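
The triple model and the open-world reading above can be sketched in a few lines of Python. This is only an illustration, not an RDF library: the `Triple` type, the `holds` helper and the example statements (built from names used later in these slides) are assumptions made for the sketch.

```python
from typing import NamedTuple, Optional, Set


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical statements; the slides do not give these exact triples.
graph: Set[Triple] = {
    Triple("Amr", "studied", "Medicine"),
    Triple("Amr", "worksIn", "NU"),
}


def holds(graph: Set[Triple], triple: Triple) -> Optional[bool]:
    """Open-world lookup: True if asserted, None (unknown) otherwise.

    It never returns False, because a missing statement is not a denial.
    """
    return True if triple in graph else None
```

A closed-world database query would answer "no" for a missing row; here `holds` answers `None`, meaning "not known".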

Recombinant Data Space


RDF is about Graphs rather than statements


Separate Graphs can be merged easily into one aggregate graph


Graphs can be filtered and pivoted, without losing meaning

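[Figure, three slides: a node-and-edge diagram with nodes Amr, Medicine, Zaynab, Ali, Abolhouda and N U, and edge labels “Works in”, “Studied” and “Son of”. The first slide shows separate small graphs, the second the same graphs merged on their shared nodes, the third a filtered view containing only Amr, Abolhouda, Medicine and N U.]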
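
Merging and filtering can be sketched with plain Python sets, treating each graph as a set of triples. The edges below are hypothetical: the slides name the people and places but the exact connections are assumed here for illustration.

```python
# Two independently produced graphs (edges are illustrative assumptions).
graph_a = {
    ("Amr", "worksIn", "NU"),
    ("Amr", "studied", "Medicine"),
}
graph_b = {
    ("Abolhouda", "worksIn", "NU"),
    ("Ali", "sonOf", "Zaynab"),
}

# Merge: set union. Nodes with the same name ("NU") become shared nodes.
merged = graph_a | graph_b


def filter_by_predicate(graph, predicate):
    """Keep only the triples that use the given predicate."""
    return {t for t in graph if t[1] == predicate}


def pivot_on_object(graph, predicate, obj):
    """Pivot: list the subjects that point at `obj` through `predicate`."""
    return sorted(s for s, p, o in graph if p == predicate and o == obj)


works_in = filter_by_predicate(merged, "worksIn")
colleagues = pivot_on_object(merged, "worksIn", "NU")
```

Because every triple is self-describing, the union needs no schema alignment step, and the filtered subset is still a valid graph.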

RDF Elements


Resources R


Properties P


Literal Values L


Assertions "R P L" or "R P R"


Namespaces
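
A minimal sketch of these elements in Python, assuming a hand-rolled representation rather than any particular RDF library. The `nettab` prefix is the one used in the SPARQL example in these slides; `nettab:name` and its literal value are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Namespace prefixes map a short name to the start of an IRI.
PREFIXES = {
    "nettab": "http://www.nettab.org/tutorial-ns#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


@dataclass(frozen=True)
class Resource:
    iri: str


@dataclass(frozen=True)
class Literal:
    value: str


def expand(qname: str, prefixes: dict = PREFIXES) -> Resource:
    """Turn a prefixed name such as 'nettab:givesTalk' into a full IRI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {qname!r}")
    return Resource(prefixes[prefix] + local)


def assertion(subject: Resource, prop: Resource,
              obj: Union[Resource, Literal]) -> Tuple:
    """Build an 'R P R' or 'R P L' statement; literals are objects only."""
    if not isinstance(subject, Resource) or not isinstance(prop, Resource):
        raise TypeError("subject and property must be resources")
    return (subject, prop, obj)


# "R P R": the object is another resource.
talk = assertion(expand("nettab:hst"), expand("nettab:givesTalk"),
                 Resource("http://www.know-who.net/talks/nettab.ppt"))
# "R P L": the object is a literal value (hypothetical property and value).
name = assertion(expand("nettab:hst"), expand("nettab:name"),
                 Literal("example name"))
```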

RDF Schema


RDF Schema (RDFS) is a vocabulary to create vocabularies...


Comparable to XML Schema or XML DTD


Used to standardize which “tags” the creator of a graph is allowed to use for annotating resources


Introduces notions such as "Class" and "Subclass"


Helps define which relations a resource of a certain type may have

RDFS Namespace Elements


X rdf:type rdfs:Class


denotes that resource X is a class


R rdf:type rdf:Property


denotes that resource R is a property


R rdfs:domain X


denotes that the subject of any statement using R is an X


R rdfs:range Y


denotes that the object of any statement using R is a Y

Cited from http://www.nettab.org/2007/slides/Tutorial_Stoermer.pdf
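
One way to read the domain and range statements is as inference rules, which is how RDFS entailment treats them: using a property lets a reasoner conclude the types of its subject and object. The sketch below assumes a toy schema (`ex:givesTalk`, `ex:Speaker`, `ex:Talk` are hypothetical names) and at most one domain and one range per property.

```python
RDF_TYPE = "rdf:type"

# Hypothetical schema statements.
schema = {
    ("ex:givesTalk", "rdfs:domain", "ex:Speaker"),
    ("ex:givesTalk", "rdfs:range", "ex:Talk"),
}

# Hypothetical instance data.
data = {
    ("ex:hst", "ex:givesTalk", "ex:nettabTalk"),
}


def infer_types(schema, data):
    """Derive rdf:type triples from rdfs:domain and rdfs:range statements."""
    domains = {p: c for p, k, c in schema if k == "rdfs:domain"}
    ranges = {p: c for p, k, c in schema if k == "rdfs:range"}
    inferred = set()
    for s, p, o in data:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))  # subject is in the domain
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))   # object is in the range
    return inferred


inferred = infer_types(schema, data)
```

Nothing is rejected here: a statement that looks "wrong" simply yields extra type conclusions, which is a common source of surprise for people used to schema validation.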

Query Languages


RDQL


SERQL


SPARQL (W3C Recommendation since 2008)


SPARQL Example

PREFIX nettab: <http://www.nettab.org/tutorial-ns#>

SELECT ?x ?z

WHERE { ?x nettab:givesTalk ?z }


Matching triple:

Subject: http://www.nettab.org/tutorial-ns#hst

Predicate: http://www.nettab.org/tutorial-ns#givesTalk

Object: http://www.know-who.net/talks/nettab.ppt

Cited from http://www.nettab.org/2007/slides/Tutorial_Stoermer.pdf
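
What the query engine does for a single triple pattern can be sketched in Python. This is a toy matcher, not a SPARQL implementation: the `match` function is an assumption of the sketch, and the second triple (`attends`) is a hypothetical non-matching statement added for contrast.

```python
NS = "http://www.nettab.org/tutorial-ns#"

triples = [
    # The matching triple from the slide.
    (NS + "hst", NS + "givesTalk", "http://www.know-who.net/talks/nettab.ppt"),
    # Hypothetical extra triple that should not match the pattern.
    (NS + "hst", NS + "attends", "http://www.nettab.org/2007/"),
]


def match(pattern, triples):
    """Return one variable binding per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must be equal.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


rows = match(("?x", NS + "givesTalk", "?z"), triples)
```

A real engine joins the bindings of several patterns; with one pattern the result is simply the list of matching rows.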

RDF Limitation (& Design Patterns)


N-ary Relations


It understands only binary relations


Amr Leads Pulse in 2009



Exceptions


Human RBCs are anucleate
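
A common design pattern for the n-ary case is to introduce an intermediate node that stands for the relation itself and to attach each participant to it with a binary statement. The sketch below applies that pattern to "Amr leads Pulse in 2009"; the property names (`ex:leader`, `ex:group`, `ex:year`) and the node identifier are hypothetical.

```python
def nary_to_triples(node_id, relation_type, roles):
    """Express an n-ary relation as binary triples around one relation node."""
    triples = [(node_id, "rdf:type", relation_type)]
    triples.extend((node_id, role, value) for role, value in roles.items())
    return triples


# "Amr leads Pulse in 2009" has three participants, so it needs a helper node.
leadership = nary_to_triples(
    "_:lead1",                 # blank-node style identifier (assumed)
    "ex:Leadership",
    {
        "ex:leader": "ex:Amr",
        "ex:group": "ex:Pulse",
        "ex:year": "2009",
    },
)
```

The cost of the pattern is one extra node and one extra triple per relation, and queries must traverse the helper node to connect the leader with the group.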

Example Applications of Semantic Web in Bioinformatics

Vocabularies (Thesauri)


Example Thesauri in Medicine



UMLS


SNOMED


MeSH


Galen

OKKAM (for ENS)

Addison’s disease in medical vocabularies



Synonyms


Addisonian syndrome


Bronzed disease


Addison melanoderma


Asthenia pigmentosa


Primary adrenal deficiency


Primary adrenal insufficiency


Primary adrenocortical insufficiency


Chronic adrenocortical insufficiency
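
The value of a thesaurus here is that all of these names resolve to one concept. A minimal Python sketch of that lookup follows, using the synonyms listed above; the concept identifier is a made-up placeholder, not a real UMLS or SNOMED code, and the normalization rule is an assumption.

```python
# Placeholder identifier, NOT a real vocabulary code.
ADDISON = "concept:addisons-disease"

SYNONYMS = {
    ADDISON: [
        "Addison's disease",
        "Addisonian syndrome",
        "Bronzed disease",
        "Addison melanoderma",
        "Asthenia pigmentosa",
        "Primary adrenal deficiency",
        "Primary adrenal insufficiency",
        "Primary adrenocortical insufficiency",
        "Chronic adrenocortical insufficiency",
    ],
}


def normalize(term: str) -> str:
    """Lowercase, unify apostrophes and collapse whitespace."""
    return " ".join(term.replace("’", "'").lower().split())


# Inverted index: normalized term -> concept identifier.
INDEX = {
    normalize(term): concept
    for concept, terms in SYNONYMS.items()
    for term in terms
}


def lookup(term: str):
    """Return the concept for a term, or None if the term is not indexed."""
    return INDEX.get(normalize(term))
```

A search tool built on such an index can return documents annotated with the concept regardless of which synonym the user typed.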

Eponym

Symptoms

Clinical Varieties

Disease of Endocrine System

Addison’s Disease

SNOMED International

Disease/Diagnosis

Disease of the Adrenal Gland

Endocrine Disease

Addison’s Disease

MeSH

Adrenal Gland Disease

Adrenal Gland Hypofunction

Disease

Endocrine Disorder

Adrenal Disorder

Corticoadrenal insufficiency

Addison’s Disease

AOD

Adrenal Cortical Disorder

Endocrine Diseases

Disorder of Adrenal Gland

Hypoadrenalism

Corticoadrenal insufficiency

Adrenal Hypofunction

Addison’s Disease

Read Codes

Organizing concept

Endocrine Diseases

Adrenal Gland Diseases

Hypoadrenalism

Adrenal cortical hypofunction

Adrenal Cortex Diseases

Adrenal Gland Hypofunction

Addison’s Disease

SNOMED

MeSH

AOD

Read Codes



UMLS

UMLS vocabularies available in RDF/OWL



NCI Thesaurus (OWL)


http://ncicb.nci.nih.gov/core/EVS



Gene Ontology


http://www.geneontology.org




Repository of biomedical ontologies (OBO, OWL)


http://www.bioontology.org/ncbo/faces/index.xhtml


User-defined Datatypes


Based on syntax used in Protégé


Semantics derived from XML Schema datatypes


For numbers: min, max, digits, fraction digits


For strings: length (min, max, equal), regular expression patterns


Class(Teenager complete restriction(age someValuesFrom(datatype(xsd:int minInclusive("13"^^xsd:int) maxInclusive("19"^^xsd:int)))))
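
The meaning of the Teenager definition can be paraphrased in Python: an individual is a Teenager if at least one of its `age` values falls in the integer range 13 to 19 inclusive. The `IntRange` class and `is_teenager` function are names invented for this sketch; it mirrors the restriction but is not an OWL reasoner.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class IntRange:
    """A user-defined integer datatype with inclusive bounds."""
    min_inclusive: int
    max_inclusive: int

    def contains(self, value: int) -> bool:
        return self.min_inclusive <= value <= self.max_inclusive


# xsd:int restricted with minInclusive 13 and maxInclusive 19.
TEEN_AGE = IntRange(13, 19)


def is_teenager(ages: Iterable[int]) -> bool:
    """someValuesFrom: true if at least one age value is in the range."""
    return any(TEEN_AGE.contains(age) for age in ages)
```

Because the restriction is existential, an individual with no recorded age is not classified as a Teenager by this check; under the open-world assumption a reasoner would treat that case as unknown rather than false.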

Biological Pathway eXchange (BioPAX)


Represent:


Metabolic pathways


Signaling pathways


Protein-protein, molecular interactions


Gene regulatory pathways


Genetic interactions


Community effort: pathway databases distribute pathway information in a standard format

cPath


cPath is a database and software suite for storing, visualizing, and analyzing biological pathways

cPath Key Features


Identifier mapping system (e.g. for proteins)


Scalable pathway data aggregation


Simple web interface for browsing and querying


Standard web service API for application communication


100% open source


Java, Tomcat, MySQL, Lucene, Struts, YUI


Local installation and customization

iHOP (Information Hyperlinked over Proteins)

Adding value via text mining

Pathway Commons

A Genome-Phenome Integrated Approach for Mining Disease-Causal Genes using Semantic Web

Gudivada Ranga Chandra

Email: gudx6u@cchmc.org

Department of Biomedical Engineering/University of Cincinnati

Division of Biomedical Informatics/Cincinnati Children’s Hospital Medical Center

Questions?

References


RDF standard and technologies (presentation). Heiko Stoermer, University of Trento, Italy.
http://www.nettab.org/2007/slides/Tutorial_Stoermer.pdf

The Unified Medical Language System (UMLS) and the Semantic Web (presentation). Olivier Bodenreider, National Library of Medicine, USA.
http://www.nettab.org/2007/slides/Tutorial_Bodenreider.pdf

Semantic Web for Health Care and Life Science Interest Group: A Vision for Advancing Research Communities (presentation). Eric Neumann, Teranode Co., USA.
http://www.nettab.org/2007/slides/SemanticWeb_Neumann.pdf

OKKAM web site
http://www.okkam.org/

Unified Medical Language System (UMLS)
http://www.nlm.nih.gov/research/umls/

Biological Pathway eXchange (BioPAX)
http://www.biopax.org/

cPath: Demo Site
http://cbio.mskcc.org/cpath/

iHOP (Information Hyperlinked over Proteins)
http://www.ihop-net.org/UniPub/iHOP/

Pathway Commons
http://www.pathwaycommons.org/pc/