Efficient Processing of Semantic Information on the Web


Georg Lausen

Technische Fakultät

Universität Freiburg


The amount of information available on the Web is still increasing rapidly.

(Semi-)automatic data extraction.

Resource Description Framework (RDF).

SPARQL is the standard query language for RDF.

Efficiency and scalability of query processing.
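To make the RDF and SPARQL points concrete, here is a minimal sketch of how a conjunction of triple patterns (the core of a SPARQL query) can be evaluated over a set of RDF-style triples. The data, the prefixes, and the helper names `match_pattern` and `evaluate` are assumptions made for illustration; a real RDF store would rely on indexes and a query optimizer rather than nested scans.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical example data: (subject, predicate, object) triples.
TRIPLES: List[Triple] = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]


def is_var(term: str) -> bool:
    """Terms starting with '?' are treated as query variables."""
    return term.startswith("?")


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, else return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check consistency with an earlier binding.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def evaluate(patterns: List[Triple], triples: List[Triple]) -> List[Binding]:
    """Evaluate a conjunction of triple patterns by naive nested scans."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        next_solutions: List[Binding] = []
        for binding in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly: SELECT ?name WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?name }
QUERY: List[Triple] = [
    ("ex:alice", "foaf:knows", "?x"),
    ("?x", "foaf:name", "?name"),
]
RESULT = evaluate(QUERY, TRIPLES)
```

Each additional triple pattern behaves like a join on the shared variables, which is why the efficiency and scalability of join processing dominates the cost of answering such queries.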


Efficiency and Scalability: A Variety of Approaches


Single-machine RDF stores

Parallel database approach: Vertica and others

Approaches based on Hadoop (MapReduce paradigm)
    Hadoop
    Hadoop++
    Integration of databases: HadoopDB
    Language translation
        Mapping SPARQL to Hadoop/HBase directly
        Mapping SPARQL to Pig Latin

Non-Hadoop clusters
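As an illustration of the MapReduce paradigm behind the Hadoop-based approaches, the sketch below simulates a reduce-side join of two triple patterns in plain Python. It is not the method of any system listed above. The data, the tags, and the function names are hypothetical, and a real Hadoop job would distribute the map, shuffle, and reduce phases across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]
Tagged = Tuple[str, str]

# Hypothetical example data: (subject, predicate, object) triples.
TRIPLES: List[Triple] = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]


def map_phase(triple: Triple) -> Iterator[Tuple[str, Tagged]]:
    """Emit (join key, tagged value) pairs for the patterns
    ?x foaf:knows ?y  and  ?y foaf:name ?n, joined on ?y."""
    s, p, o = triple
    if p == "foaf:knows":
        yield o, ("knows", s)  # ?y is the object of foaf:knows
    elif p == "foaf:name":
        yield s, ("name", o)   # ?y is the subject of foaf:name


def shuffle(pairs: Iterable[Tuple[str, Tagged]]) -> Dict[str, List[Tagged]]:
    """Group values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[Tagged]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[Tagged]) -> Iterator[Triple]:
    """Combine both sides of the join for one value of ?y."""
    knowers = [v for tag, v in values if tag == "knows"]
    names = [v for tag, v in values if tag == "name"]
    for x in knowers:
        for n in names:
            yield (x, key, n)  # one solution (?x, ?y, ?n)


def run_job(triples: Iterable[Triple]) -> List[Triple]:
    """Run map, shuffle, and reduce sequentially on one machine."""
    mapped = (pair for t in triples for pair in map_phase(t))
    grouped = shuffle(mapped)
    output: List[Triple] = []
    for key in sorted(grouped):
        output.extend(reduce_phase(key, grouped[key]))
    return output


JOINED = run_job(TRIPLES)
```

Every join in a query costs at least one such shuffle in this naive scheme, which is one reason translating SPARQL to a higher-level language such as Pig Latin, rather than hand-writing jobs, is attractive.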


Cluster-based Parallelism vs. Parallel Database/Single-Machine RDF Store


Each technology has its own advantages and problems.


Rough characterization:

                                                Querying   Loading
Parallel Database / Single-Machine RDF Store       +          -
Cluster-based Parallelism                          -          +

Loading in the context of Web research: Extract-Transform-Load (ETL) schema.


SPARQL provides a declarative way of specifying the transformation and querying.
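The Extract-Transform-Load idea can be sketched in a few lines. In the example below, the document records, the `raw:` predicates, and the renaming table are all hypothetical. The transformation is written as data rather than as procedural code, loosely in the spirit of a declarative SPARQL CONSTRUCT query; it is an illustration only, not the pipeline used by the author.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical, already-crawled Web documents.
DOCUMENTS: List[dict] = [
    {"url": "http://example.org/a", "title": "Page A",
     "links": ["http://example.org/b"]},
    {"url": "http://example.org/b", "title": "Page B", "links": []},
]


def extract(documents: Iterable[dict]) -> Iterator[Triple]:
    """Extract: turn each document into triples of an initial RDF-style graph."""
    for doc in documents:
        yield (doc["url"], "raw:title", doc["title"])
        for target in doc["links"]:
            yield (doc["url"], "raw:link", target)


# Transform, stated declaratively: map raw predicates to a target vocabulary.
RENAME: Dict[str, str] = {"raw:title": "dc:title", "raw:link": "ex:linksTo"}


def transform(triples: Iterable[Triple],
              rename: Dict[str, str]) -> Iterator[Triple]:
    """Rewrite predicates according to `rename`; others pass through unchanged."""
    for s, p, o in triples:
        yield (s, rename.get(p, p), o)


def load(triples: Iterable[Triple]) -> Dict[str, Set[Tuple[str, str]]]:
    """Load: index (subject, object) pairs by predicate for later querying."""
    store: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for s, p, o in triples:
        store[p].add((s, o))
    return store


STORE = load(transform(extract(DOCUMENTS), RENAME))
```

Because the three stages are chained generators, triples stream from extraction to loading without materializing the intermediate graph, so loading cost is dominated by how the store builds its indexes.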

ETL and Querying in the context of Web research

[Figure: pipeline from Web documents via an initial RDF graph to an RDF store, with steps labeled E, L, and T; annotations: efficient loading, efficient querying, SPARQL]

PigSPARQL: Mapping SPARQL to Pig Latin; to appear in Semantic Web Information Management (SWIM 2011).