Department of Bioinformatics - BiGCaT
Bioinformatics with semantic web technologies
The context
Bioinformatics
1. Using information to understand biology
2. Tools:
   1. Automation
   2. Multivariate statistics
(Cheminformatics: using information to understand biology)
What is information? #1
• Human language
  “Protein X binds to inhibitor Y ...”
  Huh?
• Tables: CSV, Excel, …
  1,5,CDK3,4.1,...
  -1,7,CDK3,4.2,...
  Huh?
What is information? #2
• Names are not enough to understand the biology or chemistry
• What did they mean in the first place?
  – What measurement is behind that metabolite name?
  – What was the genotype of that cell line used for that dose-response assay?
  – How did they get that logP for glucose??
Meaning
• Controlled vocabulary
  – things have a description
• Thesaurus
  – things have a description, and are hierarchically organized
• Ontology
  – things have a description, are hierarchically organized, and have implications
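The three levels above can be sketched with plain Python dictionaries. This is only an illustration: the terms, the hierarchy, and the inherited "facts" are made up for the example and do not come from any real vocabulary or ontology.

```python
# Hypothetical terms for illustration only.

# Controlled vocabulary: each term has a description.
vocabulary = {
    "protein": "a biological macromolecule made of amino acids",
    "kinase": "an enzyme that transfers phosphate groups",
    "CDK3": "cyclin-dependent kinase 3",
}

# Thesaurus: the same terms, plus a "broader term" hierarchy.
broader = {"CDK3": "kinase", "kinase": "protein"}


def ancestors(term):
    """Walk up the hierarchy and return all broader terms, nearest first."""
    result = []
    while term in broader:
        term = broader[term]
        result.append(term)
    return result


# Ontology: the hierarchy also has implications. In this sketch, a statement
# made about a class is inherited by every term below it.
class_facts = {
    "protein": {"has_amino_acid_sequence"},
    "kinase": {"can_phosphorylate"},
}


def implied_facts(term):
    """Collect the facts stated for the term itself and for all its ancestors."""
    facts = set(class_facts.get(term, set()))
    for parent in ancestors(term):
        facts |= class_facts.get(parent, set())
    return facts
```

Calling `implied_facts("CDK3")` returns facts that were never stated about CDK3 directly, which is the kind of implication the ontology bullet refers to.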
… a solution
Resource Description Framework
• A simple framework to express knowledge
  – no prescribed syntax
    • works in HTML, JSON, XML, and many more
  – decouples meaning from assertions
    • Web Ontology Language (OWL) is a layer
  – technology independent
  – dereferenceable (semantic web)
RDF in HTML?
O. Jankowski, http://www.pharmash.com/posts/2010-09-27-sparql-to-chart.html, 2010
Five Stars (T. Berners-Lee)
http://5stardata.info/
Resource Description Framework
• Two kinds of facts:
  <subject> <relatesTo> <object> .
  <subject> <hasValue> "value" .
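The two kinds of facts can be modelled as a small data structure: a statement whose object is either another resource or a literal value. The sketch below is a minimal illustration in plain Python, not an RDF library; the class names and the example identifiers (`proteinX`, `bindsTo`, `inhibitorY`, `hasName`) are assumptions chosen to echo the earlier "Protein X binds to inhibitor Y" example.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """Something identified by a name or URI (angle brackets on the slide)."""
    uri: str


class Literal(NamedTuple):
    """A plain value such as a string or number (quoted on the slide)."""
    value: str


class Triple(NamedTuple):
    subject: Resource
    predicate: Resource
    obj: Union[Resource, Literal]


def to_line(t):
    """Render a triple in the angle-bracket notation used on the slide."""
    if isinstance(t.obj, Resource):
        obj = f"<{t.obj.uri}>"
    else:
        obj = f'"{t.obj.value}"'
    return f"<{t.subject.uri}> <{t.predicate.uri}> {obj} ."


# Hypothetical facts: one relates two resources, one attaches a value.
facts = [
    Triple(Resource("proteinX"), Resource("bindsTo"), Resource("inhibitorY")),
    Triple(Resource("proteinX"), Resource("hasName"), Literal("Protein X")),
]

for fact in facts:
    print(to_line(fact))
```

Real RDF uses full URIs for subjects and predicates; the short names here only keep the example readable.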
… working out solutions
Toxicity Prediction
Malaria data set ...
OpenTox: running models
models = opentox.listModels(ontologyService);
model = models.get(3); // third model
js.say(
  opentox.predictWithModel(service, model, molecules)
);
Bayesian statistics
Linked (Open) Drug Data
M. Samwald et al. JChemInf, 2011, ...
ChEMBL
Combining data sets
WHERE {
  ?sider sider:cid [ pubchem:std_inchi ?inchi ] .
  ?sider sider:side_effect ?side_effect .
  FILTER regex(?side_effect, "hypertension", "i") .
  ?mol bo:inchi ?inchi .
  ?act chembl:forMolecule ?mol .
  ?act chembl:type ?type ;
       cito:citesAsDataSource ?paper ;
       chembl:standardValue ?value ;
       chembl:standardUnits "nM" ;
       chembl:type "IC50" ;
       chembl:onAssay ?assay .
  FILTER (xsd:float(?value) < 10000) .
  ?assay chembl:hasTarget ?target .
  ?target dc:title ?title .
  ?target chembl:organism "Homo sapiens" .
  ?target owl:sameAs ?target_bio2rdf .
  ?interaction hprd:Uniprot_A [ owl:sameAs ?target_bio2rdf ] .
  ?interaction hprd:uniprot_B [ owl:sameAs ?target_b ] .
}
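The query pattern above combines data sets by joining on a shared identifier: the InChI links a side-effect record to a molecule, and the molecule links to its measured activities. The same idea can be sketched in plain Python. Everything in this sketch is hypothetical: the records, the placeholder InChI strings, the target names, and the function name are invented for illustration, and a simple case-insensitive substring test stands in for the regex filter.

```python
# Hypothetical records standing in for two data sets. The InChI strings are
# placeholders, not real identifiers, and all values are invented.
side_effects = [
    {"inchi": "InChI=placeholder-A", "side_effect": "Hypertension"},
    {"inchi": "InChI=placeholder-B", "side_effect": "Nausea"},
]
activities = [
    {"inchi": "InChI=placeholder-A", "type": "IC50", "value_nM": 120.0, "target": "target-1"},
    {"inchi": "InChI=placeholder-A", "type": "IC50", "value_nM": 50000.0, "target": "target-2"},
    {"inchi": "InChI=placeholder-B", "type": "IC50", "value_nM": 5.0, "target": "target-3"},
]


def targets_for_side_effect(keyword, max_nM=10000.0):
    """Return targets hit below max_nM by compounds with a matching side effect."""
    # Step 1: InChIs of compounds whose side effect matches (case-insensitive).
    wanted = {
        record["inchi"]
        for record in side_effects
        if keyword.lower() in record["side_effect"].lower()
    }
    # Step 2: join on InChI and keep IC50 activities below the threshold.
    return [
        activity["target"]
        for activity in activities
        if activity["inchi"] in wanted
        and activity["type"] == "IC50"
        and activity["value_nM"] < max_nM
    ]


print(targets_for_side_effect("hypertension"))
```

In the linked-data setting the join keys are shared URIs and `owl:sameAs` links rather than dictionary fields, so the data sets do not need to be loaded into one program first.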
Linked Open Data
Domain Ontologies
• PRotein Ontology (PRO)
• Chemical Entities of Biological Interest (ChEBI)
• Chemical Information Ontology (CHEMINF)
• Bibliographic Ontology (BIBO)
• Citation Typing Ontology (CiTO)
• Cell Type Ontology
• Next: BioAssay Ontology (BAO), Cell Line Ontology (CLO)
Units reasoning→
ops:Nanomolar
  rdf:type qudt:MolarConcentrationUnit ,
           qudt:SIDerivedUnit ;
  rdfs:label "Nanomolar"^^xsd:string ;
  qudt:abbreviation "nM"^^xsd:string ;
  qudt:conversionMultiplier 0.000001 ;
  qudt:conversionOffset "0.0"^^xsd:double ;
  qudt:symbol "nmol/dm^3"^^xsd:string .
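A conversion multiplier and offset like the ones above are enough to convert between units by going through a common base unit. The sketch below assumes the base unit is mol/m^3, which is consistent with the multiplier of 0.000001 for nanomolar. The micromolar entry, the unit keys, and the function names are assumptions added for illustration and are not taken from the slide. Because both offsets are 0.0, the order in which multiplier and offset are applied does not affect the result here.

```python
import math

# Conversion data to the assumed base unit mol/m^3. The nanomolar values are
# the ones shown on the slide; the micromolar entry is added for illustration.
UNITS = {
    "nM": {"multiplier": 0.000001, "offset": 0.0},
    "uM": {"multiplier": 0.001, "offset": 0.0},
}


def to_base(value, unit):
    """Convert a value to the base unit (mol/m^3 in this sketch)."""
    u = UNITS[unit]
    return value * u["multiplier"] + u["offset"]


def convert(value, from_unit, to_unit):
    """Convert between two units by going through the base unit."""
    target = UNITS[to_unit]
    return (to_base(value, from_unit) - target["offset"]) / target["multiplier"]


# A 10000 nM threshold expressed in micromolar.
print(convert(10000, "nM", "uM"))
```

With unit descriptions like this attached to the data, a filter such as "below 10000 nM" can also be applied to values that were reported in another concentration unit.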
… summary
Summary
• Complements other bioinformatics methods
  – replacing XML (huh?), SQL (huh?)
• Single (set of) standard(s)
  – thus: tools available for any environment
• Helps us understand biology
  – by supporting explanations of strange patterns
• Greatly improved expressiveness!
Thanx
• ChEMBL-RDF contributors (Janna Hastings, Peter Ansell, Mark Davies)
• Bioclipse team (Ola Spjuth, Jonathan Alvarsson, Samuel Lampa, Martin Eklund, ...)
• OpenTox team (Nina Jeliazkova, Barry Hardy, Roland Grafstrom…)
• W3C HCLS team (many)
Further detail
http://www.sciencedirect.com/science/article/pii/S1570826812000376?v=s5
http://www.jcheminf.com/content/3/1/19
http://www.jbiomedsem.com/content/2/S1/S6
http://www.biomedcentral.com/1756-0500/4/487
http://chem-bla-ics.blogspot.com/