Semantic Web and Web 2.0


Oct 21, 2013

oreChem: Linking Chemistry Scholarship into the Semantic Web and Web 2.0


Carl Lagoze, Cornell University

Prasenjit Mitra, William Brouwer (Penn State University)

Mark Borkum (University of Southampton)

The Fourth Paradigm


Machine-actionable Substrate

- Integration of Datasets
- Exposure of Process

Scholarship 2.0

Linked Data Cloud

Requirements of Scholarship 2.0

A “data-aware document”

text: 2006 Astrophysics paper

Hubble optical observation (Baltimore, MD)

Basic object information (Strasbourg, France)

XMM-Newton X-ray observation (Vilspa, Spain)

Chandra X-ray observation (Cambridge, MA)

Reuse, Aggregation, Reuse ...

Identity?

Description?

Identity

Description

Object-Centered Sociality

Mashup

Reputation

Relationships

Conversation

Groups

Sharing

Collaboration

Actions

Presence

Open Archives Initiative Object Reuse and Exchange

Triples

describes

aggregation

Resource Map

http://www.openarchives.org/ore/
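The relationship named on this slide (a Resource Map describes an aggregation, expressed as triples) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ore:describes`, `ore:aggregates`, `ore:ResourceMap` and `ore:Aggregation` terms come from the ORE vocabulary, but every `example.org` URI and both helper function names are hypothetical.

```python
# Namespace of the OAI-ORE vocabulary, plus the standard rdf:type predicate.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_resource_map(rem_uri, aggregation_uri, resource_uris):
    """Return (subject, predicate, object) triples for one Resource Map.

    Hypothetical helper: the Resource Map describes the Aggregation,
    and the Aggregation aggregates each listed resource.
    """
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
        (rem_uri, ORE + "describes", aggregation_uri),
    ]
    for uri in resource_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs standing in for a paper and the observations it uses.
REM = "http://example.org/rem/paper-2006"
AGGREGATION = "http://example.org/aggregation/paper-2006"
PARTS = [
    "http://example.org/paper-2006/text",
    "http://example.org/observations/hubble-optical",
    "http://example.org/observations/chandra-xray",
]

triples = build_resource_map(REM, AGGREGATION, PARTS)
print(to_ntriples(triples))
```

Plain tuples are used here instead of an RDF library so the sketch stays self-contained; a real deployment would more likely rely on an RDF toolkit and one of the ORE serializations.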




oreChem



The Chemical Semantic Web


- At-source capture of experiment data and research process (Electronic Lab Notebook)
- Compound object authoring
- Retrospective harvesting of chemistry data
- Representation/Reuse through common ORE data model and ontology
- Cloud-based triple store
- Chemical structure search
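To make the triple-store item concrete, here is a minimal in-memory sketch of storing triples and querying them by pattern. It is not the oreChem implementation: the `TripleStore` class, the `ex:` identifiers, and the `ex:formula` and `ex:reportedIn` predicates are all hypothetical, and an exact molecular-formula lookup stands in for real chemical structure search, which needs substructure matching.

```python
class TripleStore:
    """Toy in-memory triple store with wildcard pattern matching."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def papers_reporting_formula(store, formula):
    """Find papers that report any compound with the given formula."""
    papers = set()
    for compound, _, _ in store.match(p="ex:formula", o=formula):
        for _, _, paper in store.match(s=compound, p="ex:reportedIn"):
            papers.add(paper)
    return sorted(papers)


# Hypothetical data: three compounds harvested from two papers.
store = TripleStore()
store.add("ex:compound/acetic-acid", "ex:formula", "C2H4O2")
store.add("ex:compound/acetic-acid", "ex:reportedIn", "ex:paper/1")
store.add("ex:compound/ethanol", "ex:formula", "C2H6O")
store.add("ex:compound/ethanol", "ex:reportedIn", "ex:paper/1")
store.add("ex:compound/methyl-formate", "ex:formula", "C2H4O2")
store.add("ex:compound/methyl-formate", "ex:reportedIn", "ex:paper/2")

print(papers_reporting_formula(store, "C2H4O2"))
# prints ['ex:paper/1', 'ex:paper/2']
```

The two-step lookup is the kind of join that becomes possible once data from different sources shares one graph model and common identifiers.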



In the future ideal world …

<?xml version="1.0" ?>
<cml version="3" convention="org-synth-report" xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" x2="-2.9149999618530273" y2="0.7699999809265137" />
      <atom id="a2" elementType="C" x2="-1.5813208400249916" y2="1.5399999809265137" />
      <atom id="a3" elementType="O" x2="-0.24764171819695613" y2="0.7699999809265134" />
      <atom id="a4" elementType="O" x2="-1.5813208400249912" y2="3.0799999809265137" />
      <atom id="a5" elementType="H" x2="-4.248679083681063" y2="1.5399999809265137" />
      <atom id="a6" elementType="H" x2="-2.914999961853028" y2="-0.7700000190734864" />
      <atom id="a7" elementType="H" x2="-4.248679083681063" y2="-1.907348645691087E-8" />
      <atom id="a8" elementType="H" x2="1.0860374036310796" y2="1.5399999809265132" />
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1" />
      <bond atomRefs2="a2 a3" order="1" />
      <bond atomRefs2="a2 a4" order="2" />
      <bond atomRefs2="a1 a5" order="1" />
      <bond atomRefs2="a1 a6" order="1" />
      <bond atomRefs2="a1 a7" order="1" />
      <bond atomRefs2="a3 a8" order="1" />
    </bondArray>
  </molecule>
</cml>
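Because the CML above is ordinary namespaced XML, a program can read the chemistry directly. The sketch below uses Python's standard `xml.etree.ElementTree` to count atoms and bonds and derive a molecular formula. The embedded document repeats the slide's molecule with the 2D coordinates omitted for brevity, and `summarize_molecule` is a hypothetical helper name, not part of any CML toolkit.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace declared on the <cml> root element in the slide's example.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Same atoms and bonds as the slide; x2/y2 coordinates left out for brevity.
CML_DOC = """<cml version="3" convention="org-synth-report"
     xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" />
      <atom id="a2" elementType="C" />
      <atom id="a3" elementType="O" />
      <atom id="a4" elementType="O" />
      <atom id="a5" elementType="H" />
      <atom id="a6" elementType="H" />
      <atom id="a7" elementType="H" />
      <atom id="a8" elementType="H" />
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1" />
      <bond atomRefs2="a2 a3" order="1" />
      <bond atomRefs2="a2 a4" order="2" />
      <bond atomRefs2="a1 a5" order="1" />
      <bond atomRefs2="a1 a6" order="1" />
      <bond atomRefs2="a1 a7" order="1" />
      <bond atomRefs2="a3 a8" order="1" />
    </bondArray>
  </molecule>
</cml>"""


def summarize_molecule(cml_text):
    """Return id, formula, bond count and double bonds of the first molecule."""
    root = ET.fromstring(cml_text)
    molecule = root.find("cml:molecule", CML_NS)
    atoms = molecule.findall("cml:atomArray/cml:atom", CML_NS)
    bonds = molecule.findall("cml:bondArray/cml:bond", CML_NS)

    counts = Counter(atom.get("elementType") for atom in atoms)
    # Hill order: carbon, then hydrogen, then the remaining elements A-Z.
    ordered = [e for e in ("C", "H") if e in counts]
    ordered += sorted(e for e in counts if e not in ("C", "H"))
    formula = "".join(
        f"{e}{counts[e] if counts[e] > 1 else ''}" for e in ordered
    )
    return {
        "id": molecule.get("id"),
        "formula": formula,
        "bond_count": len(bonds),
        "double_bonds": [b.get("atomRefs2") for b in bonds
                         if b.get("order") == "2"],
    }


summary = summarize_molecule(CML_DOC)
print(summary["formula"])  # prints C2H4O2
```

The formula and the single C=O double bond between `a2` and `a4` are consistent with acetic acid, which is the kind of fact a triple store could index once the markup is captured at the source.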

Chem4Word: Chemistry Drawing in Word

Author/edit 1D and 2D chemistry. Change chemical layout styles.

Relationships: Navigate and link referenced chemistry

Data: Semantics stored in Chemical Markup Language

Intent: Recognizes chemical dictionary and ontology terms

Intelligence: Verifies validity of authored chemistry

Available soon: http://research.microsoft.com/chem4word/

Triple store

data

Unfortunately …

PSU


NMR Spectra and Structural Data


Experiment data


Bibliographic metadata


Citations


Figures


Tables


Chunks


Reactions


Molecular Compounds

Cambridge

Indiana



Computational Chemistry (Gaussian)

triplestore

Southampton

Ontologies

Chemistry Ontology (Nico Adams, Cambridge)

Experiments Ontology (prototype)

Document Ontology

Reaction Ontology

molecules

Data (capture)

Semantic Graph (storage)

Mash-up (reuse)

text

observations

measurements

documents

data

molecules

data

scientists

datument

lab notebook

experiment

“May all your problems be technical”

Scholarly communities behave very differently (example: preprint servers)

Field       | Preprint server         | Outcome  | Started by                 | Scope                                    | Societies
Physics     | arXiv                   | success  | 1991, Ginsparg @ LANL      | high-energy physics, step-wise expansion | hands-off, cooperative
Biomedicine | eBiomed / PubMedCentral | modified | 1999, Director of NIH      | all of biomedicine                       | take control
Chemistry   | CPS                     | failure  | 2000, commercial publisher | all of chemistry                         | adverse

Chemistry is particularly challenging


- Commercial value of chemical information (pharmaceuticals)
- Nature of chemistry research culture
  - predominance of synthesis (creation) overshadows the discovery mode typical of physics or biology
  - autonomy: successful research with limited reliance on others
- Monopoly of scholarly societies qua publishers
  - ACS (CAS)
  - RSC

The Future


- Continue work on technical innovations and infrastructure
- Demonstrate through value-add applications
- Understand socio-technical barriers
- International workshop/study
- Chemistry as “canary in coal mine”
- Integrate with larger infrastructure effort
- Data Conservancy