Semantic Web and Ontology - About CETI


21 Oct 2013


Discussion

Technology: Semantic Web and Ontology

“To create a single knowledge resource portal for the clinical and translational research community that would provide a ‘front door’ for a variety of resources.”


Vision


There is a need for an alternative, engaging system that will assist in finding and leveraging opportunities for collaboration on a broader scale.

In the context of clinical and translational research, the ability to manage and reason upon complex and large-scale data sets is of particular importance, and remains an area of open research.

The goal of ResearchIQ is to empower non-technical domain experts to reason upon and pose questions related to such heterogeneous data sets.

The Idea

The initial design and evaluation of Research-IQ was conducted in three phases [Fig 1].

Design Phases

Satyajeet Raje, Ankush Srivastava, Omkar Lele, Tara Payne

http://www.ceti.cse.ohio-state.edu; http://citih.osumc.edu

Figure 2: System Architecture Diagram

Figure 5: The Semantic Web


The Semantic Web is a "web of data".

It extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata about pages and how they are related to each other. [Fig 5]

Figure 7: An example Ontology


Ontology - An ontology is a standardized representation of knowledge as a set of concepts within a domain, and the relationships between those concepts.

It can be used to reason about the entities within that domain, and may be used to describe the domain.

“formal, explicit specification of a shared conceptualization” [Gruber, 1993]
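As an illustration of the definition above, the following minimal Python sketch represents a toy ontology as a set of concepts linked by "is a" relationships and reasons over it by following those links transitively. The concept names and the helper functions (`build_parent_map`, `ancestors`, `is_a`) are hypothetical and chosen only for this example; they are not taken from ResearchIQ or from any real ontology.

```python
from collections import defaultdict

# Hypothetical toy ontology: each (child, parent) pair is one "is a" relationship.
IS_A = [
    ("myocardial_infarction", "heart_disease"),
    ("heart_disease", "cardiovascular_disease"),
    ("cardiovascular_disease", "disease"),
    ("diabetes", "disease"),
]


def build_parent_map(edges):
    """Map each concept to the set of its direct parents."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    return parents


def ancestors(concept, parents):
    """Return every concept reachable from `concept` by following is-a links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(concept, candidate, parents):
    """True if `candidate` is a direct or inferred ancestor of `concept`."""
    return candidate in ancestors(concept, parents)


PARENTS = build_parent_map(IS_A)
```

With this structure, `is_a("myocardial_infarction", "disease", PARENTS)` holds even though that relationship is never stated directly. This is the kind of simple inference an ontology makes possible.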


Semantic search seeks to improve search accuracy by understanding:

1. searcher intent

2. the contextual meaning of terms as they appear in the searchable data space

It can be applied on the Web or within a closed system.

The goal is to generate more relevant results.

This application was supported by Award Number UL1RR025755 from the National Center for Research Resources.

The content is solely the responsibility of the developers and does not necessarily represent the official views of the National Center for Research Resources or the National Institutes of Health.

A special thanks to Prof. Rajiv Ramnath and Prof. Jay Ramanathan for initiating this project at CETI.

Introduction

System Architecture

User Interface

Acknowledgements

Figure 1: Overview of the three-phase design (Phases 1 and 2) and initial evaluation (Phase 3) process utilized for Research-IQ.

ResearchIQ

An ontology-anchored integrative query tool


The system takes existing databases (like GenBank and PubMed) or free text sources (OSU pro web site) as input.

The output is essentially an ontology-anchored data store and an interface to query it.

Visualizing the results is equally important, as this is a portal to several distinct resources and services.

Figure 3: Annotation Pipeline

Figure 4: Query Pipeline


The annotation engine uses MetaMap, a tool provided by the National Library of Medicine (NLM).

The “semantic view” of a document is a list of concepts from the different domain ontologies.

The data is then indexed using Lucene.

At the same time, it is pushed into a triple store.

A derived ontology is generated for query purposes by inferencing on the acquired data based on the UMLS.
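The annotation steps above can be sketched roughly as follows. This is a simplified illustration and not the ResearchIQ implementation: a small hand-written dictionary stands in for MetaMap, a plain inverted index stands in for Lucene, and a Python list of tuples stands in for the triple store. The concept identifiers, the `mentions` predicate and the function names are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical term-to-concept dictionary standing in for MetaMap output.
# The identifiers are invented placeholders, not real UMLS identifiers.
TERM_TO_CONCEPT = {
    "heart attack": "concept:myocardial_infarction",
    "myocardial infarction": "concept:myocardial_infarction",
    "aspirin": "concept:aspirin",
    "diabetes": "concept:diabetes",
}


def semantic_view(text):
    """Return the sorted list of concepts whose terms occur in the text."""
    lowered = text.lower()
    found = {concept for term, concept in TERM_TO_CONCEPT.items() if term in lowered}
    return sorted(found)


def annotate(documents):
    """Build an inverted index (concept -> document ids) and a list of triples."""
    index = defaultdict(set)   # stand-in for the Lucene index
    triples = []               # stand-in for the triple store
    for doc_id, text in documents.items():
        for concept in semantic_view(text):
            index[concept].add(doc_id)
            triples.append((doc_id, "mentions", concept))
    return index, triples


DOCS = {
    "doc1": "Aspirin after a heart attack",
    "doc2": "Myocardial infarction in patients with diabetes",
}
INDEX, TRIPLES = annotate(DOCS)
```

In this sketch, "heart attack" and "myocardial infarction" map to the same concept, so both documents are indexed under it. That shared concept is what lets a concept-based query reach documents that use different wording.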


The query is in the form of a list of concepts.

The initial solution set and their scores are obtained by running the query through Lucene.

Using this set as seeds, we propagate the scores through the triple store to find conceptually connected results.

The inferenced ontology is used as a graph to generate these results.
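The poster does not specify how scores are propagated, so the following is only one plausible sketch of the idea. It assumes that each hop along an edge multiplies the score by a fixed decay factor, that a node keeps the best score that reaches it, and that propagation stops after a fixed number of hops. The function name `propagate_scores` and the parameters `decay` and `hops` are hypothetical.

```python
from collections import defaultdict


def propagate_scores(seed_scores, edges, decay=0.5, hops=2):
    """Spread seed scores along undirected graph edges with a per-hop decay.

    seed_scores: dict of node -> initial score (e.g. from the text index)
    edges: iterable of (node_a, node_b) pairs taken from the graph
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = dict(seed_scores)
    frontier = dict(seed_scores)
    for _ in range(hops):
        next_frontier = {}
        for node, score in frontier.items():
            passed = score * decay
            for other in neighbours[node]:
                # Keep the best score seen per node; only improvements travel on.
                if passed > scores.get(other, 0.0):
                    scores[other] = passed
                    next_frontier[other] = passed
        frontier = next_frontier
    return scores


EDGES = [
    ("doc1", "concept:mi"),
    ("concept:mi", "doc2"),
    ("doc2", "concept:diabetes"),
    ("concept:diabetes", "doc3"),
]
RESULT = propagate_scores({"doc1": 1.0}, EDGES)
```

Here `doc2` receives a score although it was not in the seed set, because it shares a concept with `doc1`. `doc3` lies beyond the two-hop limit and is not returned.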

Querying

Annotation


RDF - The Resource Description Framework is a family of W3C specifications originally designed as a metadata data model.

OWL - The Web Ontology Language is a family of knowledge representation languages for authoring ontologies.

SPARQL - An RDF query language; SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns.
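To make the notions of triples and triple patterns concrete, here is a minimal pure-Python sketch that stores data as (subject, predicate, object) tuples and evaluates a conjunction of triple patterns, with terms starting with `?` treated as variables. It is not a SPARQL engine and covers only conjunctions, not disjunctions or optional patterns. The `ex:` names and the functions `match_pattern` and `query` are hypothetical.

```python
# Hypothetical data: each tuple is one (subject, predicate, object) triple.
TRIPLES = [
    ("ex:paper1", "ex:mentions", "ex:Aspirin"),
    ("ex:paper1", "ex:author", "ex:alice"),
    ("ex:paper2", "ex:mentions", "ex:Aspirin"),
    ("ex:paper2", "ex:author", "ex:bob"),
    ("ex:paper3", "ex:mentions", "ex:Insulin"),
]


def is_variable(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if is_variable(term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return all variable bindings that satisfy every pattern (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


# Who wrote the papers that mention aspirin?
RESULTS = query(
    [("?paper", "ex:mentions", "ex:Aspirin"), ("?paper", "ex:author", "?who")],
    TRIPLES,
)
```

The two patterns correspond roughly to the SPARQL graph pattern `{ ?paper ex:mentions ex:Aspirin . ?paper ex:author ?who }`. The shared variable `?paper` is what joins them.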

Figure 6: The Semantic Web stack


Speed of annotating and querying


Evaluation of results


Dependence on other tools


Security


New data sources (New annotation pipelines)


Change in standards over time (New versions of the ontology)

Challenges

Significance

To the best of our knowledge, no such platform has been developed in the clinical and translational research domain.

The pilot study and early results showed that:

Ontologies can be used to implement semantic search successfully.

The results were more comprehensive than those of pure syntactic search.

Figure 3: Search engine home page

Figure 4: Search engine page with data