Authors: Li Ding and Tim Finin


Oct 21, 2013




Presented by: Lohith Ram


Introduction


The Conceptual Model of the Semantic Web on the Web


Creating a Global Catalog


Measuring Semantic Web documents


Measuring Semantic Web Terms


Conclusion


9/16/2013

Characterizing the Semantic Web on the Web

2


Characterizing the Semantic Web on the Web involves three important parts:


Designing a Conceptual Model

Creating a Global Catalog

Measuring Data

Web of Belief Ontology:


Foundation for Semantic Web characterization


Captures not only the semantic structure of an RDF graph but also its provenance in
terms of the Web and the agent world


Important notions from the Model:

1. Semantic Web Document (SWD)


Pure Semantic Web Document (PSWD)


Embedded Semantic Web Document (ESWD)

2. URI reference (URIref)


The URIref of an rdfs:Resource conveys dual semantics:

I. A unique identifier for the resource

II. The web address of the SWD defining the resource


URIrefs are widely used to merge RDF graphs distributed on the Semantic Web

3. Semantic Web Terms (SWT)


Named resources that have meta-usages in SWDs
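
The role of URIrefs in merging RDF graphs can be illustrated with a minimal sketch. It assumes a graph is just a set of (subject, predicate, object) triples and ignores blank nodes; the example.org URIs and the helper names `merge_graphs` and `describe` are hypothetical and not taken from the presentation.

```python
# Toy illustration: an RDF graph modelled as a set of (subject, predicate, object) triples.
# The example.org URIs are hypothetical; FOAF is used only as a familiar vocabulary.
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.org/people#alice"

# Two graphs published in different documents that mention the same URIref.
graph_a = {(ALICE, FOAF + "name", "Alice")}
graph_b = {(ALICE, FOAF + "mbox", "mailto:alice@example.org")}


def merge_graphs(*graphs):
    """Merge graphs by set union; nodes sharing a URIref collapse into one node."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, uriref):
    """Return every (predicate, object) pair stated about one URIref, sorted."""
    return sorted((p, o) for s, p, o in graph if s == uriref)


merged = merge_graphs(graph_a, graph_b)
for predicate, obj in describe(merged, ALICE):
    print(predicate, obj)
```

Because both documents use the same URIref for the resource, the merged graph carries both statements about it without any extra alignment step.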



Six types of meta-usages are defined: DEF-C, DEF-P, REF-C, REF-P, POP-C, and POP-P
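
As a rough sketch of how meta-usages might be detected in a set of triples, the code below tags each term with the usages it exhibits. The detection rules are simplifying assumptions made for illustration (for example, which predicates count as "referencing" a class), not the definitions from the presentation, and the example.org terms are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/onto#"  # hypothetical namespace

RDF_TYPE = RDF + "type"
CLASS_TYPES = {RDFS + "Class", OWL + "Class"}
PROPERTY_TYPES = {RDF + "Property", OWL + "ObjectProperty", OWL + "DatatypeProperty"}
CLASS_REFERRING = {RDFS + "subClassOf", RDFS + "domain", RDFS + "range"}
PROPERTY_REFERRING = {RDFS + "subPropertyOf"}


def meta_usages(triples):
    """Map each term to the set of meta-usages it shows (assumed rules)."""
    usages = {}

    def add(term, usage):
        usages.setdefault(term, set()).add(usage)

    for s, p, o in triples:
        if p == RDF_TYPE and o in CLASS_TYPES:
            add(s, "DEF-C")   # subject is defined as a class
        elif p == RDF_TYPE and o in PROPERTY_TYPES:
            add(s, "DEF-P")   # subject is defined as a property
        elif p == RDF_TYPE:
            add(o, "POP-C")   # object is populated as a class
        elif p in CLASS_REFERRING:
            add(o, "REF-C")   # object is referenced as a class
        elif p in PROPERTY_REFERRING:
            add(o, "REF-P")   # object is referenced as a property
        else:
            add(p, "POP-P")   # predicate is populated as a property
    return usages


triples = [
    (EX + "Person", RDF_TYPE, RDFS + "Class"),
    (EX + "name", RDF_TYPE, RDF + "Property"),
    (EX + "name", RDFS + "domain", EX + "Person"),
    ("http://example.org/data#alice", RDF_TYPE, EX + "Person"),
    ("http://example.org/data#alice", EX + "name", "Alice"),
]
usages = meta_usages(triples)
for term in sorted(usages):
    print(term, sorted(usages[term]))
```

The sketch does not count the RDF and RDFS vocabulary predicates themselves as populated properties, which a full analysis would need to decide on.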

Two additional concepts are used for studying ontologies:


Semantic Web Ontology (SWO):


A sub-class of Semantic Web Document that physically groups definitions of SWTs.


An SWO is identified either by containing DEF-C, DEF-P, REF-C, or REF-P meta-usages
or by containing instances of owl:Ontology

Semantic Web Namespace (SWN):


A sub-class of rdfs:Resource that logically groups SWTs and enables distributed definition.


An SWN is identified as the namespace part of an SWT.
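
Extracting the namespace part of an SWT can be sketched as follows. The splitting rule (cut after the last `#`, otherwise after the last `/`) is a common convention assumed here, not a rule stated in the presentation, and `split_uriref` and `group_by_namespace` are hypothetical helper names.

```python
from collections import defaultdict


def split_uriref(uriref):
    """Split a URIref into (namespace, local name).

    Assumed convention: the namespace ends at the last '#', or at the
    last '/' when the URIref contains no '#'.
    """
    for separator in ("#", "/"):
        index = uriref.rfind(separator)
        if index != -1:
            return uriref[: index + 1], uriref[index + 1:]
    return uriref, ""


def group_by_namespace(terms):
    """Group SWT URIrefs by their namespace part."""
    groups = defaultdict(list)
    for term in terms:
        namespace, local_name = split_uriref(term)
        groups[namespace].append(local_name)
    return dict(groups)


terms = [
    "http://www.w3.org/2002/07/owl#Class",
    "http://www.w3.org/2002/07/owl#Ontology",
    "http://xmlns.com/foaf/0.1/Person",
]
by_namespace = group_by_namespace(terms)
for namespace, local_names in by_namespace.items():
    print(namespace, local_names)
```

Grouping terms this way is what makes a namespace a logical grouping: the terms may be defined in several documents yet still share one namespace.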


Estimating the number of online SWDs

A Hybrid Semantic Web Harvesting Framework

Harvesting result and performance



To harvest as many SWDs on the Web as possible at minimum cost, an automatic,
hybrid Semantic Web harvesting framework that integrates several harvesting
methods was developed. Figure 2 below illustrates its workflow.


Figure 2 (workflow components): Swoogle System, Bootstrapping, Google-based Meta Crawling, Bounded HTML Crawling, RDF Crawling, Inductive learner and Swoogle Sample Dataset
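
One of the harvesting methods, bounded HTML crawling, can be pictured with a small sketch. It assumes that "bounded" means a cap on link depth and on the number of pages visited, and that candidate SWDs are recognised by file suffix; both are assumptions for illustration, and the link graph below is a hypothetical in-memory stand-in for real HTTP fetches.

```python
from collections import deque

# Hypothetical link graph: page URL -> URLs it links to.
LINKS = {
    "http://example.org/index.html": [
        "http://example.org/about.html",
        "http://example.org/foaf.rdf",
    ],
    "http://example.org/about.html": ["http://example.org/deep/page.html"],
    "http://example.org/deep/page.html": ["http://example.org/deep/onto.owl"],
}
RDF_SUFFIXES = (".rdf", ".owl")  # assumed heuristic for spotting candidate SWDs


def bounded_crawl(seed, links, max_depth=1, max_pages=100):
    """Breadth-first crawl from seed, limited by depth and by pages visited."""
    seen = {seed}
    queue = deque([(seed, 0)])
    candidates = []
    visited = 0
    while queue and visited < max_pages:
        url, depth = queue.popleft()
        visited += 1
        if url.endswith(RDF_SUFFIXES):
            candidates.append(url)  # hand over to the RDF crawler
            continue
        if depth >= max_depth:
            continue                # do not follow links past the bound
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return candidates


seed = "http://example.org/index.html"
shallow = bounded_crawl(seed, LINKS, max_depth=1)
deep = bounded_crawl(seed, LINKS, max_depth=3)
print(shallow)
print(deep)
```

Raising the depth bound finds more candidate documents at the cost of visiting more HTML pages, which is the trade-off a bound is meant to control.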



SWD Top-level Domains: Analyzing the top-level domains (TLDs) of SWDs suggests
the degree to which Semantic Web data is published by region and type of
organization
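
A TLD breakdown of this kind can be computed with a few lines of code. The sketch below assumes the SWD URLs are already available as a list; the URLs shown are hypothetical, and taking the last dot-separated label of the host name is a simplification that does not handle IP-address hosts.

```python
from collections import Counter
from urllib.parse import urlparse


def top_level_domain(url):
    """Return the last dot-separated label of the URL's host name."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1] if "." in host else ""


# Hypothetical SWD URLs; a real study would read these from the harvested catalog.
swd_urls = [
    "http://example.org/a.rdf",
    "http://example.org/b.rdf",
    "http://example.com/foaf.rdf",
    "http://example.edu/onto.owl",
]

tld_counts = Counter(top_level_domain(url) for url in swd_urls)
for tld, count in tld_counts.most_common():
    print(tld, count)
```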



SWD Source Websites



SWD Age



SWD Size



The SW06MAR dataset has 1,576,927 distinct Semantic Web terms defined with respect
to 14,488 Semantic Web namespaces


Four SWT-usage patterns are derived by analyzing combinations of the six basic types of
meta-usages




SWT Definition Complexity


A simple way to measure it is to count the number of triples used to define an SWT


The figure below shows the cumulative distribution of the size of SWT definitions in the curve
labelled “all”
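
The triple-counting measure can be sketched as follows. It assumes that the triples "used to define" a term are those with the term as subject, and that the cumulative distribution is the share of terms whose definition size is at most k; both are assumptions made for illustration, and the terms and triples are hypothetical.

```python
EX = "http://example.org/onto#"  # hypothetical namespace


def definition_sizes(triples, terms):
    """Count, for each term, the triples that have it as subject (assumed measure)."""
    sizes = {term: 0 for term in terms}
    for subject, _predicate, _obj in triples:
        if subject in sizes:
            sizes[subject] += 1
    return sizes


def cumulative_distribution(sizes):
    """Map each observed size k to the fraction of terms with size <= k."""
    values = sorted(sizes.values())
    total = len(values)
    return {k: sum(v <= k for v in values) / total for k in sorted(set(values))}


triples = [
    (EX + "Person", "rdf:type", "rdfs:Class"),
    (EX + "Person", "rdfs:label", "Person"),
    (EX + "Person", "rdfs:subClassOf", EX + "Agent"),
    (EX + "name", "rdf:type", "rdf:Property"),
    (EX + "name", "rdfs:domain", EX + "Person"),
    (EX + "Agent", "rdf:type", "rdfs:Class"),
]
terms = [EX + "Person", EX + "name", EX + "Agent"]

sizes = definition_sizes(triples, terms)
cdf = cumulative_distribution(sizes)
for k, share in cdf.items():
    print(k, round(share, 2))
```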



SWT Instance Space


Measured by counting POP-C and POP-P meta-usages of SWTs.


The figure below shows the cumulative distribution of the number of SWTs populated as a
class (or property) by at least m instances (or SWDs)
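
The "populated by at least m instances" count can be sketched for the class case. The code assumes that an instance of a class is any subject of an rdf:type triple pointing at it; the property case (POP-P) would be counted analogously over predicates. The data and helper names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/onto#"    # hypothetical ontology namespace
DATA = "http://example.org/data#"  # hypothetical instance namespace


def class_instance_counts(triples):
    """Count distinct instances per class from rdf:type triples (POP-C usages)."""
    instances = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            instances.setdefault(obj, set()).add(subject)
    return {cls: len(members) for cls, members in instances.items()}


def populated_by_at_least(counts, m):
    """Number of classes populated by at least m instances."""
    return sum(1 for n in counts.values() if n >= m)


triples = [
    (DATA + "alice", RDF_TYPE, EX + "Person"),
    (DATA + "bob", RDF_TYPE, EX + "Person"),
    (DATA + "carol", RDF_TYPE, EX + "Person"),
    (DATA + "doc1", RDF_TYPE, EX + "Document"),
]

counts = class_instance_counts(triples)
for m in (1, 2, 4):
    print(m, populated_by_at_least(counts, m))
```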



Semantic Web


Is not just one universal RDF graph


Is a federated collection of documents distributed on and accessed via the World Wide Web


Must be studied from both the Web perspective and the semantic perspective


Characterizing the Semantic Web on the Web


Estimated the size of the Semantic Web using Google


Implemented a hybrid framework for harvesting Semantic Web data


Measured results to answer questions on the Semantic Web’s current deployment status


The statistics support several conclusions about the emerging Semantic Web


Semantic Web data is growing steadily on the Web


The space of instances is sparsely populated


Ontologies can be induced or amended by reverse engineering


Questions?
