Introduction to The Semantic Web


21 Oct 2013


Rick Bradshaw M.S.

Sr. Data Architect

Office of the Associate VP

Health Sciences IT

Overview


Introduce the Semantic Web


Interactive study of ClinicalTrials.gov, semantic web style


Take a closer look at RDF


Run example SPARQL queries


Introduce federation


Run example SPARQL queries against federated data




Semantic Web Definition


The Semantic Web facilitates applying machine-readable semantic data/metadata to resources that are distributed across the web/internet


Often associated with specific technologies


RDF


Resource Description Framework


RDFS


RDF Schema


OWL


Web Ontology Language


Web 3.0 (?)

http://en.wikipedia.org/wiki/Semantic_Web

Machine-readable


A computer can read and “understand” data


Ask specific questions and get specific answers


Aggregate specific data, perform calculations, organize/order returned data


Can Google read and “understand” web data?

Example


Specific Question


How many Spinal Muscular Atrophy trials have been conducted at the University of Utah, and when were they conducted?


Specific Answer = ?


Google’s Answer



spinal muscular atrophy trial university of utah



14,500 pages


Top hit is very relevant in content


Is it “computable”?


HTML

<h2>Enrolling/Ongoing:&nbsp;</h2>
<p>Clinical and Genetic Studies in Spinal Muscular Atrophy</p>
<p>Metabolic Dysfunction in SMA: impact of nutritional management</p>
<p>Prospective Study of Bone Abnormalities in SMA</p>
<p>STOP SMA: Phenylbutyrate trial in pre-symptomatic infants with SMA</p>
<p><span><span>Pilot newborn screening project for identification and prospective followup of infants with spinal muscular atrophy</span></span></p>
<p><span><span>Atalauren extension study in patients with Duchenne Muscular Dystrophy</span></span></p>
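
To make the "Is it computable?" question concrete, here is a minimal sketch that parses a trimmed copy of the markup above with Python's standard `html.parser`. The `ParagraphCollector` class and the `PAGE` constant are illustrative names, not part of the original slides.

```python
from html.parser import HTMLParser

# Fragment adapted from the page source shown above (trimmed to three entries).
PAGE = """
<h2>Enrolling/Ongoing:&nbsp;</h2>
<p>Clinical and Genetic Studies in Spinal Muscular Atrophy</p>
<p>Metabolic Dysfunction in SMA: impact of nutritional management</p>
<p>Prospective Study of Bone Abnormalities in SMA</p>
"""


class ParagraphCollector(HTMLParser):
    """Collects the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the open paragraph.
        if self._in_p:
            self.paragraphs[-1] += data


collector = ParagraphCollector()
collector.feed(PAGE)
collector.close()

# All the markup yields is a list of strings. Nothing marks them as trials,
# and there is no start date or study site to compute with.
titles = [p.strip() for p in collector.paragraphs]
```

The parser recovers the paragraph text, but a program still cannot tell that each string names a clinical trial, which is the gap the RDF representation is meant to close.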



ClinicalTrials.gov RDF/XML


Semantic Web Data for Clinical Trials


(1) http://static.linkedct.org/


(2) http://static.linkedct.org/page/trials/NCT00661453


Triples Triples Triples


Triple Statement


<s><p><o>


Subject (s)


the resource


Predicate (p)


the relationship


Often called the “property” in OWL


Object (o)


object of the relationship


Example


(s) trial:NCT00661453


(p) linkedct:brief_title


(o) “CARNIVAL Type I: Valproic Acid and Carnitine in Infants With SMA Type I”


Abbreviations


For ease of readability


trial:NCT00661453


“trial:” - abbreviation for namespace “http://static.linkedct.org/resource/trials/”


“linkedct:” - abbreviation for namespace “http://static.linkedct.org/resource/linkedct/”
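
The abbreviation scheme can be sketched in a few lines of Python. The two namespace IRIs are the ones given above. The helper names `expand` and `abbreviate` are hypothetical and chosen only for illustration.

```python
# Prefix table taken from the two abbreviations listed above.
PREFIXES = {
    "trial": "http://static.linkedct.org/resource/trials/",
    "linkedct": "http://static.linkedct.org/resource/linkedct/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'prefix:local' into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local


def abbreviate(iri, prefixes=PREFIXES):
    """Turn a full IRI back into 'prefix:local' when a namespace matches."""
    # Try the longest namespace first so the most specific prefix wins.
    for prefix, namespace in sorted(
        prefixes.items(), key=lambda item: len(item[1]), reverse=True
    ):
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri


full_iri = expand("trial:NCT00661453")
```

Here `full_iri` is the long form of the subject used in the example triple, and `abbreviate(full_iri)` gives back the short form.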


Triple Notations


There are many


Turtle


RDF


OWL


OBO

Triples Text

Subject              Predicate            Object
trial:NCT00661453    rdf:type             ct:trials
trial:NCT00661453    ct:brief_title       “CARNIVAL…”
trial:NCT00661453    ct:start_date        “April 2008”
trial:NCT00661453    ct:condition         cond:12347
cond:12347           rdf:type             ct:condition
cond:12347           ct:condition_name    “Spinal Muscular…”
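
As a rough illustration of how a program can ask specific questions of these triples, the sketch below stores them as plain Python tuples and matches simple patterns against them. It assumes the condition resource is `cond:12347` in every row. The helpers `match` and `condition_names` are hypothetical names, and this stands in for a real triple store and SPARQL engine.

```python
# The six triples from the table, as (subject, predicate, object) tuples.
TRIPLES = [
    ("trial:NCT00661453", "rdf:type", "ct:trials"),
    ("trial:NCT00661453", "ct:brief_title", "CARNIVAL…"),
    ("trial:NCT00661453", "ct:start_date", "April 2008"),
    ("trial:NCT00661453", "ct:condition", "cond:12347"),
    ("cond:12347", "rdf:type", "ct:condition"),
    ("cond:12347", "ct:condition_name", "Spinal Muscular…"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def condition_names(triples, trial):
    """Follow trial -> ct:condition -> ct:condition_name."""
    names = []
    for _, _, condition in match(triples, s=trial, p="ct:condition"):
        for _, _, name in match(triples, s=condition, p="ct:condition_name"):
            names.append(name)
    return names


names = condition_names(TRIPLES, "trial:NCT00661453")
```

The two-step lookup in `condition_names` is the kind of join that a SPARQL query expresses declaratively.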

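Triple Graph

[Figure: the triples for trial:NCT00661453 drawn as a graph of labeled edges]

trial:NCT00661453 --rdf:type--> ct:trial
trial:NCT00661453 --ct:brief_title--> “CARNIVAL Type I: Valproic Acid and Carnitine in Infants With Spinal Muscular Atrophy (SMA) Type I”
trial:NCT00661453 --ct:start_date--> “April 2008”
trial:NCT00661453 --ct:condition--> cond:12347
cond:12347 --rdf:type--> ct:condition
cond:12347 --ct:condition_name--> “Spinal Muscular Atrophy Type I”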

RDF XML


(see file under #2)

<rdf:RDF …>
  <rdf:Description rdf:about="http://static.linkedct.org/resource/trials/NCT00481013">
    <linkedct:brief_title>Valproic Acid in Ambulant Adults With Spinal Muscular Atrophy</linkedct:brief_title>
  </rdf:Description>
</rdf:RDF>
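
The following sketch shows one way to pull triples out of RDF/XML like the fragment above using only Python's standard library. The slide elides the namespace declarations, so the ones in `DOC` are assumptions: the `rdf` IRI is the standard W3C one and the `linkedct` IRI is taken from the Abbreviations slide. The function `literal_triples` is a hypothetical helper that handles only literal-valued properties, and a full RDF parser would cover far more of the syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed from the Abbreviations slide; the original fragment elides it.
LINKEDCT_NS = "http://static.linkedct.org/resource/linkedct/"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:linkedct="{LINKEDCT_NS}">
  <rdf:Description rdf:about="http://static.linkedct.org/resource/trials/NCT00481013">
    <linkedct:brief_title>Valproic Acid in Ambulant Adults With Spinal Muscular Atrophy</linkedct:brief_title>
  </rdf:Description>
</rdf:RDF>"""


def literal_triples(rdf_xml):
    """Extract (subject, predicate, literal) triples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


parsed = literal_triples(DOC)
```

Each child element of `rdf:Description` becomes one triple whose subject is the `rdf:about` IRI.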


Observations


RDF is a standard supporting consistent data representation


Rules about standards apply


Use an existing standard whenever possible

Popular RDF Standards


Friend of a Friend


alias=foaf


describes people and links


Dublin Core


alias=dc


“metadata” standard


Simple Knowledge Organization System


alias=skos


terminology, thesauri, …

Data Federation


Combine data from more than one data source


Heterogeneous data


Not all data sources use the same standards


ds1.firstName


ds2.first_name


ds3.person_name


Homogeneous data


All data sources use the same standards


ds1.firstName


ds2.firstName


ds3.firstName


Property Alignment Assertions


ds1:firstName owl:equivalentProperty foaf:firstName


ds2:first_name owl:equivalentProperty foaf:firstName
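
A minimal sketch of what these alignment assertions buy in practice: once both source properties are declared equivalent to `foaf:firstName`, a single query can cover both data sources. The person identifiers and names below are invented sample data, and `align` is a hypothetical helper. Rewriting predicates in place is a shortcut used here for brevity, whereas an OWL reasoner would add the equivalent triples alongside the originals.

```python
# Alignment assertions from the slide, as a lookup table.
EQUIVALENT_PROPERTY = {
    "ds1:firstName": "foaf:firstName",
    "ds2:first_name": "foaf:firstName",
}

# Hypothetical sample data from two heterogeneous sources.
SOURCE_TRIPLES = [
    ("ds1:person1", "ds1:firstName", "Ada"),
    ("ds2:person7", "ds2:first_name", "Grace"),
]


def align(triples, mapping):
    """Rewrite each predicate to its aligned property, if one is declared."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


aligned = align(SOURCE_TRIPLES, EQUIVALENT_PROPERTY)

# One query over the shared property now reaches both sources.
first_names = sorted(o for _, p, o in aligned if p == "foaf:firstName")
```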

Class Alignment Assertions


ds1:Person owl:equivalentClass foaf:Person


ds2:HumanBeing owl:equivalentClass foaf:Person


Rule-based Assertions


Use rules to evaluate complicated “if-then” scenarios and assert results


SWRL


Semantic Web Rule Language


JRL - Jena Rule Language
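
To illustrate the "if-then" idea without committing to SWRL or Jena rule syntax, here is a sketch of a single rule written as a plain Python function. The rule itself, the class name `ex:SMATrial`, and the helper `apply_rules` are all hypothetical. The input triples reuse the trial and condition from the earlier slides.

```python
TRIPLES = [
    ("trial:NCT00661453", "ct:condition", "cond:12347"),
    ("cond:12347", "ct:condition_name", "Spinal Muscular Atrophy Type I"),
]


def sma_trial_rule(triples):
    """IF ?t ct:condition ?c AND ?c ct:condition_name mentions SMA
    THEN assert ?t rdf:type ex:SMATrial (a hypothetical class)."""
    sma_conditions = {
        s
        for s, p, o in triples
        if p == "ct:condition_name" and "Spinal Muscular" in o
    }
    return {
        (s, "rdf:type", "ex:SMATrial")
        for s, p, o in triples
        if p == "ct:condition" and o in sma_conditions
    }


def apply_rules(triples, rules):
    """Run each rule once and add whatever it asserts to the graph."""
    asserted = set(triples)
    for rule in rules:
        asserted |= rule(asserted)
    return asserted


result = apply_rules(TRIPLES, [sma_trial_rule])
```

The rule's conclusion ends up in `result` as an ordinary triple, so later queries can use it like any other statement.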

Reasoning


Compute assertions


Adds new triple statements to the triple graph


Implications


Data of interest must be read from all data sources to compute assertions


When data sources are large, this can take a long time and requires adequate computational resources
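
The sketch below shows, under simplifying assumptions, how reasoning adds new triples to the graph. It forward-chains over `owl:equivalentProperty` and `owl:equivalentClass` until nothing new can be inferred. The alignment assertions come from the earlier slides, while the instance data and the function name `materialize` are invented for illustration. It implements only this small fragment of OWL semantics, not a full reasoner.

```python
FACTS = [
    # Alignment assertions from the earlier slides.
    ("ds1:firstName", "owl:equivalentProperty", "foaf:firstName"),
    ("ds2:first_name", "owl:equivalentProperty", "foaf:firstName"),
    ("ds1:Person", "owl:equivalentClass", "foaf:Person"),
    ("ds2:HumanBeing", "owl:equivalentClass", "foaf:Person"),
    # Hypothetical instance data from two sources.
    ("ds1:p1", "rdf:type", "ds1:Person"),
    ("ds1:p1", "ds1:firstName", "Ada"),
    ("ds2:p7", "rdf:type", "ds2:HumanBeing"),
    ("ds2:p7", "ds2:first_name", "Grace"),
]


def materialize(triples):
    """Add inferred triples until a full pass produces nothing new."""
    graph = set(triples)
    while True:
        # Equivalence is symmetric, so record both directions.
        prop_eq, class_eq = set(), set()
        for s, p, o in graph:
            if p == "owl:equivalentProperty":
                prop_eq |= {(s, o), (o, s)}
            elif p == "owl:equivalentClass":
                class_eq |= {(s, o), (o, s)}

        inferred = set()
        for s, p, o in graph:
            for a, b in prop_eq:
                if p == a:
                    inferred.add((s, b, o))
            if p == "rdf:type":
                for a, b in class_eq:
                    if o == a:
                        inferred.add((s, "rdf:type", b))

        if inferred <= graph:
            return graph
        graph |= inferred


closed = materialize(FACTS)
```

Every pass scans the whole graph, which hints at why reasoning over large data sources is slow and resource-hungry.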

Use Case


Combine clinical trial data with patient data


SMA trial data from clinicaltrials.gov (linkedct.org) with patient demographics for 5 different trials

Resources


W3Schools


http://www.w3schools.com/semweb/default.asp


W3C Web Sites


http://www.w3.org/standards/semanticweb/


http://www.w3.org/RDF/


http://www.w3.org/standards/techs/owl#w3c_all


Safari Books


http://proquest.safaribooksonline.com


Semantic Web Programming


Semantic Web for the Working Ontologist


Resources


Jena Java API


Protégé


D2R


Entity Relationship Diagram

TRIAL: TRIAL_ID, BRIEF_TITLE, CONDITION_ID, START_DATE

CONDITION: CONDITION_ID, CONDITION_NAME
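
As a sketch of how rows in these two tables relate to the triples shown earlier, the code below maps each row to a subject and each column to a predicate. It assumes TRIAL.CONDITION_ID refers to CONDITION.CONDITION_ID. The sample rows reuse values from the Triples Text slide, and `rows_to_triples` is a hypothetical hand-written mapping, not the mapping language of D2R or any other tool.

```python
# Sample rows built from the values on the Triples Text slide.
TRIAL_ROWS = [
    {
        "TRIAL_ID": "NCT00661453",
        "BRIEF_TITLE": "CARNIVAL…",
        "CONDITION_ID": "12347",
        "START_DATE": "April 2008",
    }
]
CONDITION_ROWS = [
    {"CONDITION_ID": "12347", "CONDITION_NAME": "Spinal Muscular…"}
]


def rows_to_triples(trial_rows, condition_rows):
    """Map each row to a subject and each column to a predicate."""
    triples = []
    for row in trial_rows:
        subject = f"trial:{row['TRIAL_ID']}"
        triples += [
            (subject, "rdf:type", "ct:trials"),
            (subject, "ct:brief_title", row["BRIEF_TITLE"]),
            (subject, "ct:start_date", row["START_DATE"]),
            # The foreign key becomes a link to another resource.
            (subject, "ct:condition", f"cond:{row['CONDITION_ID']}"),
        ]
    for row in condition_rows:
        subject = f"cond:{row['CONDITION_ID']}"
        triples += [
            (subject, "rdf:type", "ct:condition"),
            (subject, "ct:condition_name", row["CONDITION_NAME"]),
        ]
    return triples


generated = rows_to_triples(TRIAL_ROWS, CONDITION_ROWS)
```

Primary keys turn into resource identifiers and the foreign key turns into an edge between resources, which is how the relational join becomes a path in the triple graph.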