Identity and schema for Linked Data


Oct 21, 2013 (3 years and 9 months ago)


Hideaki Takeda / National Institute of Informatics

Identity and schema for Linked Data

Hideaki Takeda

National Institute of Informatics

takeda

nii.ac.jp

2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA

IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea


How to put the data into a computer?


How to describe the data?


The way to describe individual data


Schema/Class/Concept



The way to describe relationships among schema/class/concept


Ontology/Taxonomy/Thesaurus



How to refer to the data?


The way to identify individual data


Identifier


Relationship among identifiers




Architecture for the Semantic Web

Tim Berners-Lee, http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

The world of instances (Linked Data)

The world of classes (Ontologies)


Layers of Semantic Web


Ontology


Descriptions on classes


RDFS, OWL


Challenges for ontology building


Ontology building is difficult by nature


Consistency, comprehensiveness, logicality


Alignment of ontologies is more difficult

Tim Berners-Lee, http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

Descriptions on classes: Ontology

Descriptions on instances: Linked Data


Layers of Semantic Web


Linked Data


Descriptions on instances (individuals)


RDF + (RDFS, OWL)


Pros for Linked Data


Easy to write (mainly fact description)


Easy to link (fact to fact link)


Cons for Linked Data

Difficult to describe complex structures

Class descriptions are still needed (-> ontology)


Tim Berners-Lee, http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/

Descriptions on classes: Ontology

Descriptions on instances: Linked Data


Importance of Identifiers for Entities


Everything should be identifiable!


Humans can identify things with vague identifiers, or even without identifiers, with help from the context around the things

On the web, the context is usually not available, and the computer can seldom understand the context even if it exists

So we need identifiers for all things


Identification System


Identification is one of the primary functions of human information processing

Naming: e.g., names for people, pets, and some everyday things

OK if the number of things is not so big

Systematic Identification

e.g., phone number, postcode, passport number, product number, ISBN

If the number of things is big enough


Requirements for Systematic Identification


Identifier is stable and sustainable


Uniqueness is guaranteed


Identifier publisher is reliable and sustainable



Identification system for Web


Not so different from conventional identification systems


Difference


Cross-system use

Truly digitized

Requirements for Systematic Identification for the Web

Identifier is stable and sustainable (even after an entity may disappear)

Uniqueness is guaranteed over all systems

Descriptions should be associated with identifiers, since entities may not be accessible

Identifier publisher is reliable and sustainable



Solutions for the Requirements by LOD

Requirements for Systematic Identification for the Web

1. Identifier is stable and sustainable (even after an entity may disappear)

(up to each identifier publisher)

2. Uniqueness is guaranteed over all systems

URI (not URN)

3. Descriptions should be associated with identifiers

Dereferenceable URI

If a URI is accessed, a description associated with it should be returned

4. Identifier publisher is reliable and sustainable
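
The dereferenceable-URI requirement can be illustrated with a small sketch. The helper below only builds an HTTP request that asks a URI for an RDF description through content negotiation; it does not send anything. The function name `build_rdf_request`, the choice of `application/rdf+xml` as the media type, and the sample DBpedia URI are assumptions made for illustration, not part of the original slides.

```python
import urllib.request


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request asking `uri` for an RDF description."""
    # The Accept header is how a client asks a dereferenceable URI
    # for machine-readable data instead of an HTML page.
    return urllib.request.Request(uri, headers={"Accept": media_type})


# Hypothetical example URI, used only for illustration.
request = build_rdf_request("http://dbpedia.org/resource/Tokyo")
print(request.full_url, request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; whether RDF comes back depends on the identifier publisher.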



Some examples

ISBN (International Standard Book Number)

Abstract

A unique numeric commercial book identifier

13 digits

Prefix: 978 or 979 (for compatibility with EAN code)

Group (language-sharing country group): 1 to 5 digits

Publisher code:

Item number:

Check digit: 1 digit

Management: two layers

National ISBN Agency

Publisher

Requirement Satisfaction

1. (Stable ID) Maybe (versioning often matters, and sometimes a publisher may re-use an ISBN)

2. (Unique ID) Uniqueness is guaranteed, but it is not a URI

3. (Dereferenceable) No mechanism (Amazon does this instead!)

4. (Reliable publisher) Yes
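
The role of the check digit can be made concrete with a short sketch. It assumes the standard ISBN-13 (EAN-13) rule, which the slide does not spell out: the first 12 digits are weighted 1, 3, 1, 3, ... and the check digit brings the total to a multiple of 10. The function names are illustrative.

```python
def isbn13_check_digit(first12):
    """Return the check digit for the first 12 digits of an ISBN-13."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn):
    """Check a 13-digit ISBN, ignoring hyphens and spaces."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(is_valid_isbn13("978-0-306-40615-7"))  # True
print(is_valid_isbn13("978-0-306-40615-8"))  # False
```

The check digit only detects transcription errors; it says nothing about whether the number was actually assigned, which is why the stability and publisher requirements still matter.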




Some examples

DOI (Digital Object Identifier)


Abstract


An identifier for scientific digital objects (mostly scientific articles)


An unfixed string: “prefix/suffix”


Prefix: assigned for publishers


Suffix: assigned for each object


Management: three layers


IDF (International DOI Foundation)


Registration Agency


Publisher


Requirement Satisfaction


1. (Stable ID) Yes (not re-usable)

2. (Unique ID) Uniqueness is guaranteed and URI accessible (http://dx.doi.org/"DOI")

3. (Dereferenceable) Mapping to object pages but no RDF

4. (Reliable publisher) Maybe
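
The "prefix/suffix" structure and the resolver URI pattern above can be sketched as follows. This is a minimal illustration, assuming the DOI is split at its first slash and appended to the `http://dx.doi.org/` resolver named on the slide; the function names and the sample DOI are chosen for the example only.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"  # resolver prefix quoted on the slide


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("a DOI has the form prefix/suffix")
    return prefix, suffix


def doi_to_uri(doi):
    """Turn a DOI string into an HTTP URI under the resolver."""
    split_doi(doi)  # validate the shape first
    # Percent-encode characters that are unsafe in a URI path; keep "/".
    return RESOLVER + quote(doi, safe="/")


# Sample DOI used only for illustration.
print(split_doi("10.1000/182"))
print(doi_to_uri("10.1000/182"))
```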



Some examples

DBpedia (as Identifier)

Abstract

A Wikipedia page

Name of Wikipedia page

Maintained manually

Disambiguation page

Redirect page

Requirement Satisfaction

1. (Stable ID) Maybe (pages sometimes disappear, sometimes change names, sometimes change contents)

2. (Unique ID) Uniqueness is mostly guaranteed and URI accessible

3. (Dereferenceable) RDF

4. (Reliable publisher) Maybe





Identification of relationship between identifiers

Co-existence of multiple identification systems in a field

Difference of coverage

Difference of viewpoint

An entity can have multiple identifiers

Need for mapping between identifiers in different identification systems

Method: Use special properties

owl:sameAs, (rdfs:seeAlso, skos:exactMatch)

http://sameas.org

Some problems

Logical inconsistency with owl:sameAs

Maintenance
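
One way to see why owl:sameAs links are hard to maintain is to compute the groups of identifiers they imply. The sketch below treats sameAs as symmetric and transitive and merges identifiers with a small union-find. All identifiers are made up for illustration, and `sameas_groups` is a hypothetical helper, not a function of any library.

```python
def sameas_groups(links):
    """Group identifiers connected by owl:sameAs links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Made-up identifiers from four imaginary identification systems.
GOOD = [("a:Paris", "b:Paris_France"), ("c:Paris_Texas", "d:Paris_TX")]
BAD = ("a:Paris", "c:Paris_Texas")  # one mistaken link

print(sameas_groups(GOOD))          # two separate groups
print(sameas_groups(GOOD + [BAD]))  # everything collapses into one group
```

A single mistaken link merges two otherwise distinct entities, which is the kind of logical inconsistency listed above.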




LOD Cloud

(Linking Open Data)


Summary for ID


Identification is the crucial part in LOD


Data availability


Data inconsistency


Data interoperability


Establishment of a good identification system leads to a reliable and sustainable LOD.



Structuring Information


A wide range of structuring information


Keywords, tags


A freely chosen word or phrase just indicating some features


Controlled vocabulary


Mapping to the fixed set of words or phrases


e.g., the list of countries, the name authorities


Classification


System for classifying entities. Often hierarchical. Class may not carry meaning.


Taxonomy


Hierarchical term system for classification. Upper/lower relation usually means general/specific relation

e.g., the subject headings of LC

Thesaurus

System for semantics. More different types of relations: (hypernym, hyponym), synonym, antonym, homonym, holonym, meronym

Ontology

System of concepts. Concepts rather than words. More various relations, the definitions of concepts


Examples in Library Science


Many systems in the library community


Classification


Universal Decimal Classification (UDC)


Controlled Vocabulary


the authority files for person names, organizations, location names


Library of Congress: 8 million records, MADS & SKOS

British Library: 2.6 million records, foaf & BIO (A vocabulary for biographical information)

National Diet Library (Japan): 1 million records, foaf

Deutsche Nationalbibliothek (DNB, Germany): 1.8 & 1.3 million records (names & organizations)

Virtual International Authority File (VIAF): 4 million records

Taxonomy

Subject Heading: LC, NDL,

Library of Congress: MADS & SKOS

British Library:

National Diet Library (Japan): 0.1 million records, SKOS

Deutsche Nationalbibliothek (DNB, Germany): 0.16 million records






UDC as Linked Data

| UDC ELEMENT | DEFINITION | SKOS TERM | UDC SUBPROPERTY |
| UDC number (notation) | UDC notation is combination of symbols (numerals, signs and letters) that represent a class, its position in the hierarchy and its relation to other classes. Notation is a language-independent indexing term that enables mechanical sorting and filing of subjects. Also called 'UDC number' and 'UDC classmark' | skos:notation | --- |
| class identifier (URI) | A unique identifier assigned to each UDC class. It identifies the relationship between a class' meaning and its notational representation | skos:Concept | --- |
| broader class (URI) | Superordinate class: the class hierarchically above the class in question | skos:broader | --- |
| caption | Verbal description of the class content | skos:prefLabel | --- |
| including note | Extension of the caption containing verbal examples of the class content (usually a selection of important terms that do not appear in the subdivision) | skos:note | udc:includingNote |
| application note | Instructions for number building, further extension and specification of the class | skos:note | udc:applicationNote |
| scope note | Note explaining the extent and the meaning of a UDC class. Used to resolve disambiguation or to distinguish this class from other similar classes | skos:scopeNote | --- |
| examples | Examples of combination are used to illustrate UDC class building i.e. complex subject statements | skos:example | --- |
| see also reference | Indication of conceptual relationship between UDC classes from different hierarchies | skos:related | --- |

<skos:Concept rdf:about="http://udcdata.info/025553">
  <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
  <skos:broader rdf:resource="http://udcdata.info/025461"/>
  <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
  <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
  <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
  <skos:related rdf:resource="http://udcdata.info/000016"/>
</skos:Concept>

http://udcdata.info/

69,000 records

40 Languages
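
To show how such a SKOS record can be consumed, here is a minimal sketch that reads the UDC concept above with Python's standard XML parser. The slide shows only the `skos:Concept` element, so the enclosing `rdf:RDF` root and the namespace declarations are assumptions added to make the fragment parseable; `read_concept` is an illustrative name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"

# The record from the slide, wrapped in an assumed rdf:RDF root element.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://udcdata.info/025553">
    <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
    <skos:broader rdf:resource="http://udcdata.info/025461"/>
    <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
    <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
    <skos:related rdf:resource="http://udcdata.info/000016"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concept(xml_text):
    """Extract URI, notation, broader class and labels of the first concept."""
    root = ET.fromstring(xml_text)
    concept = root.find(f"{{{SKOS}}}Concept")
    # One preferred label per language tag.
    labels = {
        el.get(f"{{{XML}}}lang"): el.text
        for el in concept.findall(f"{{{SKOS}}}prefLabel")
    }
    broader = concept.find(f"{{{SKOS}}}broader")
    return {
        "uri": concept.get(f"{{{RDF}}}about"),
        "notation": concept.findtext(f"{{{SKOS}}}notation"),
        "broader": broader.get(f"{{{RDF}}}resource"),
        "labels": labels,
    }


info = read_concept(DOC)
print(info["notation"], info["labels"]["en"])
```

The notation (510.6) stays language-independent, while the caption is available per language through `skos:prefLabel`.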


http://id.loc.gov/authorities/names/n79084664.html

<http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#PersonalName> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#Authority> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#authoritativeLabel> "Natsume, Sōseki, 1867-1916"@en .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#elementList> _:bnode7authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode8authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:bnode010 .
_:bnode8authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#FullNameElement> .
_:bnode8authoritiesnamesn79084664 <http://www.loc.gov/mads/rdf/v1#elementValue> "Natsume, Sōseki,"@en .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode11authoritiesnamesn79084664 .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:bnode11authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#DateNameElement> .
_:bnode11authoritiesnamesn79084664 <http://www.loc.gov/mads/rdf/v1#elementValue> "1867-1916"@en .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#classification> "PL812.A8" .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#hasExactExternalAuthority> <http://viaf.org/viaf/sourceID/LC%7Cn+79084664#skos:Concept> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection> <http://id.loc.gov/authorities/names/collection_NamesAuthorizedHeadings> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#isMemberOfMADSScheme> <http://id.loc.gov/authorities/names> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#isMemberOfMADSCollection> <http://id.loc.gov/authorities/names/collection_LCNAF> .
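
The `elementList` in the record above is an RDF collection: a chain of blank nodes linked by `rdf:first` and `rdf:rest` that ends in `rdf:nil`. The sketch below walks such a chain over a hand-copied subset of the triples. The blank node labels are shortened and the language tags dropped for readability; both are simplifications made for this example, and the helper names are illustrative.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MADS = "http://www.loc.gov/mads/rdf/v1#"

# Subset of the triples above as (subject, predicate, object) tuples.
TRIPLES = [
    ("_:bnode7", RDF + "first", "_:bnode8"),
    ("_:bnode7", RDF + "rest", "_:bnode010"),
    ("_:bnode8", MADS + "elementValue", "Natsume, Sōseki,"),
    ("_:bnode010", RDF + "first", "_:bnode11"),
    ("_:bnode010", RDF + "rest", RDF + "nil"),
    ("_:bnode11", MADS + "elementValue", "1867-1916"),
]


def list_items(triples, head):
    """Follow rdf:first / rdf:rest from `head` until rdf:nil."""
    index = {(s, p): o for s, p, o in triples}
    items = []
    node = head
    while node != RDF + "nil":
        items.append(index[(node, RDF + "first")])
        node = index[(node, RDF + "rest")]
    return items


def element_values(triples, head):
    """Return the elementValue of every member of the list, in order."""
    index = {(s, p): o for s, p, o in triples}
    return [index[(item, MADS + "elementValue")] for item in list_items(triples, head)]


print(element_values(TRIPLES, "_:bnode7"))
```

The two values are exactly the parts that make up the authoritative label, kept in order by the list structure.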


http://id.loc.gov/authorities/subjects/sh85008180.html


http://data.bnf.fr/11932084/intelligence_artificielle/


Some examples

Scientific Names for Species and Taxa


Abstract


Names for biological species and other taxa (kingdom, division, class, order, family, tribe, genus)

A string

Binomial name for species

Academic societies maintain taxon names individually

E.g., Papilio xuthus (Asian Swallowtail, ナミアゲハ, 호랑나비)

Requirement Satisfaction

1. Mostly yes (names sometimes disappear, change, or change contents)

2. Uniqueness is generally guaranteed but, precisely speaking, there is some ambiguity because of changes.

3. No. Many systems exist but none covers all species

4. Maybe




| Taxon | Plants | Algae | Fungi | Animals |
| Domain | | | | |
| Kingdom | | | | |
| Division/Phylum | -phyta | -phyta | -mycota | |
| Subdivision/Subphylum | -phytina | -phytina | -mycotina | |
| Class | -opsida | -phyceae | -mycetes | |
| Subclass | -idae | -phycidae | -mycetidae | |
| Order | -ales | -ales | -ales | |
| Suborder | -ineae | -ineae | -ineae | |
| Superfamily | -acea | -acea | -acea | -oidea |
| Family | -aceae | -aceae | -aceae | -idae |
| Subfamily | -oideae | -oideae | -oideae | -inae |
| Tribe | -eae | -eae | -eae | -ini |
| Subtribe | -inae | -inae | -inae | -ina |
| Genus | | | | |
| Subgenus | | | | |
| Species | | | | |
| Subspecies | | | | |


Ontology

An ontology is an explicit specification of a conceptualization [Gruber]

An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an Ontology is a systematic account of Existence. For AI systems, what "exists" is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions, or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory.


Conceptualization

object

box

red box

blue box

yellow box

on_desk(A)

on(A, B)

put(A,B)

object

box

box


color:{red, blue, yellow}

on_desk(A)

on(A, B)

put(A,B)

object

box

desk

on(A/box, B/object)

put(A/box,B/object)

box


color:{red, blue, yellow}

Trade off between generality and efficiency

There are many possible ways to conceptualize the target world


Types of Ontologies


Upper (top-level) ontology vs. Domain ontology

Upper Ontology: A common ontology throughout all domains

Domain Ontology: An ontology which is meaningful in a specific domain

Object ontology vs. Task ontology

Object Ontology: An ontology on “things” and “events”

Task Ontology: An ontology on “doing”

Heavy-weight ontology vs. light-weight ontology

Heavy-weight ontology: fully described ontology including concept definitions and relations, in particular in a logical way

Light-weight ontology: partially described ontology, typically including only is-a relations


Top-level ontology

Ontology which covers all of the world!

Very… difficult

e.g., how does a thing exist?

Is a thing a four-dimensional existence?

Or does a thing exist three-dimensionally over time?

Common requirements

A small number of concepts can cover the world

Concepts can be used in lower ontologies

Concepts should be general and abstract


Top-level ontology


Three approaches


Formal approach


Logical formalization


Fully Abstract


Pros: clean


Cons: hardly understandable


e.g., Sowa’s top-level ontology, DOLCE


Linguistic approach


Use and extension of linguistic concepts


Partially abstract and partially general


Pros: understandable


Cons: limitation to the linguistic world


e.g., Penman Upper Model, WordNet


Empirical Approach


Use and extension of everyday concepts


Mostly general


Pros: understandable and applicable to all the world


Cons: lack of solid foundation


e.g., SUMO, Cyc, EDR


Empirical top-level ontology

SUMO (Suggested Upper Merged Ontology)

Collection and organization of concepts used frequently

Simple relationship between concepts



Formal Ontology: DOLCE


DOLCE (a Descriptive Ontology for Linguistic and Cognitive Engineering)

Intended as a reference system for top-level ontologies

Logical definition

Particular (DOLCE) vs. Universal

Particular: ontology about things, phenomena, quality…

Universal: ontology for describing particulars, such as categories and attributes


Formal Ontology: DOLCE


Concepts


Endurant / Perdurant / Quality / Abstract

Endurant:

“Things”

An existence over time

May change its attributes

Perdurant:

“process”

No change over time

May switch a part to the other

Relations

Parthood (abstract or perdurant)

Temporary Parthood (endurant)

Constitution (endurant or perdurant)

Participation between perdurant and endurant



Linguistic top-level ontology

WordNet

A lexical reference system

“Link-based electronic dictionary”

Concepts

synset

Noun 79,689

Verb 13,508

Relations

synonym

hypernym/hyponym (is-a)

holonym/meronym (a-part-of)

http://www.cogsci.princeton.edu/cgi-bin/webwn


Linguistic top-level ontology

WordNet

Top-level

{ entity, physical thing (that which is perceived or known or inferred to have its own physical existence (living or nonliving)) }

{ psychological_feature, (a feature of the mental life of a living organism) }

{ abstraction, (a general concept formed by extracting common features from specific examples) }

{ state, (the way something is with respect to its main attributes; "the current state of knowledge"; "his state of health"; "in a weak financial state") }

{ event, (something that happens at a given place and time) }

{ act, human_action, human_activity, (something that people do or cause to happen) }

{ group, grouping, (any number of entities (members) considered as a unit) }

{ possession, (anything owned or possessed) }

{ phenomenon, (any state or process known through the senses rather than by intuition or reasoning) }



Summary for structuring information


Keywords, tags / Controlled vocabulary / Classification / Taxonomy / Thesaurus / Ontology

The differences are not clear-cut, and not important

The trend is toward more structured ones

The same requirements as for identification systems



Summary


Requirements for Successful Structuring Systems

1. Entity is stable and sustainable

2. Uniqueness is guaranteed over all systems

3. Descriptions should be associated with entities

4. System publisher is reliable and sustainable

Learn from success in the library community

LOD Tech. can help


Schema/Vocabulary for LOD

Class/Concept description

Axiom of a concept in ontology

Database schema for a table in a relational database

Object definition in Object-Oriented Programming/DB

Class description in Semantic Web

RDFS/OWL description for a class

RDFS: Simple class system

OWL: Description Logic-based

Class description in Linked Data

Mostly RDFS-based (exception: owl:sameAs)

Simple structure (mostly property-value pairs)


Schema/Vocabulary for LOD


The importance of sharing schema


Interoperability


Generic applications


Some famous and frequently used schemata

Dublin Core

FOAF (Friend-Of-A-Friend)


SKOS (Simple Knowledge Organization System)


Usage of Common Vocabularies

| Prefix | Namespace | Used by |
| dc | http://purl.org/dc/elements/1.1/ | 66 (31.88 %) |
| foaf | http://xmlns.com/foaf/0.1/ | 55 (26.57 %) |
| dcterms | http://purl.org/dc/terms/ | 38 (18.36 %) |
| skos | http://www.w3.org/2004/02/skos/core# | 29 (14.01 %) |
| akt | http://www.aktors.org/ontology/portal# | 17 (8.21 %) |
| geo | http://www.w3.org/2003/01/geo/wgs84_pos# | 14 (6.76 %) |
| mo | http://purl.org/ontology/mo/ | 13 (6.28 %) |
| bibo | http://purl.org/ontology/bibo/ | 8 (3.86 %) |
| vcard | http://www.w3.org/2006/vcard/ns# | 6 (2.90 %) |
| frbr | http://purl.org/vocab/frbr/core# | 5 (2.42 %) |
| sioc | http://rdfs.org/sioc/ns# | 4 (1.93 %) |

LDOW2011 Presentation, Christian Bizer (Freie Universität Berlin), 2011


(Simple) Dublin Core


Started from the library community


Now maintained by DCMI (Dublin Core Metadata Initiative)


(Simple) Dublin Core


Just 15 elements


Simple is best


No range restriction


http://purl.org/dc/elements/1.1/






15 elements


Title


Creator


Subject


Description


Publisher


Contributor


Date


Type


Format


Identifier


Source


Language


Relation


Coverage


Rights



dc terms


Qualified Dublin Core


Domain & Range


More precise terms


Extension of simple dc


Properties in the /terms/ namespace

abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid

Properties in the /elements/1.1/ namespace

contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, type

Vocabulary Encoding Schemes

DCMIType, DDC, IMT, LCC, LCSH, MESH, NLM, TGN, UDC

Syntax Encoding Schemes

Box, ISO3166, ISO639-2, ISO639-3, Period, Point, RFC1766, RFC3066, RFC4646, RFC5646, URI, W3CDTF

Classes

Agent, AgentClass, BibliographicResource, FileFormat, Frequency, Jurisdiction, LicenseDocument, LinguisticSystem, Location, LocationPeriodOrJurisdiction, MediaType, MediaTypeOrExtent, MethodOfAccrual, MethodOfInstruction, PeriodOfTime, PhysicalMedium, PhysicalResource, Policy, ProvenanceStatement, RightsStatement, SizeOrDuration, Standard

DCMI Type Vocabulary

Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text

Terms related to the DCMI Abstract Model

memberOf, VocabularyEncodingScheme


| Dcterms | subPropertyOf | Domain | Range |
| contributor | dc:contributor | rdfs:Resource | dcterms:Agent |
| creator | dc:creator, dcterms:contributor | rdfs:Resource | dcterms:Agent |
| coverage | dc:coverage | rdfs:Resource | dcterms:LocationPeriodOrJurisdiction |
| spatial | dc:coverage, dcterms:coverage | rdfs:Resource | dcterms:Location |
| temporal | dc:coverage, dcterms:coverage | rdfs:Resource | dcterms:PeriodOfTime |
| date | dc:date | rdfs:Resource | rdfs:Literal |
| available | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| created | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| dateAccepted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| dateCopyrighted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| dateSubmitted | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| issued | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| modified | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| valid | dc:date, dcterms:date | rdfs:Resource | rdfs:Literal |
| description | dc:description | rdfs:Resource | rdfs:Resource |
| abstract | dc:description, dcterms:description | rdfs:Resource | rdfs:Resource |
| tableOfContents | dc:description, dcterms:description | rdfs:Resource | rdfs:Resource |
| format | dc:format | rdfs:Resource | dcterms:MediaTypeOrExtent |
| extent | dc:format, dcterms:format | rdfs:Resource | dcterms:SizeOrDuration |
| medium | dc:format, dcterms:format | dcterms:PhysicalResource | dcterms:PhysicalMedium |
| identifier | dc:identifier | rdfs:Resource | rdfs:Literal |
| bibliographicCitation | dc:identifier, dcterms:identifier | dcterms:BibliographicResource | rdfs:Literal |
| language | dc:language | rdfs:Resource | dcterms:LinguisticSystem |
| publisher | dc:publisher | rdfs:Resource | dcterms:Agent |
| relation | dc:relation | rdfs:Resource | rdfs:Resource |
| source | dc:source, dcterms:relation | rdfs:Resource | rdfs:Resource |
| conformsTo | dc:relation, dcterms:relation | rdfs:Resource | dcterms:Standard |
| hasFormat | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| hasPart | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| hasVersion | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isFormatOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isPartOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isReferencedBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isReplacedBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isRequiredBy | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| isVersionOf | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| references | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| replaces | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| requires | dc:relation, dcterms:relation | rdfs:Resource | rdfs:Resource |
| rights | dc:rights | rdfs:Resource | dcterms:RightsStatement |
| accessRights | dc:rights, dcterms:rights | rdfs:Resource | dcterms:RightsStatement |
| license | dc:rights, dcterms:rights | rdfs:Resource | dcterms:LicenseDocument |
| subject | dc:subject | rdfs:Resource | rdfs:Resource |
| title | dc:title | rdfs:Resource | rdfs:Literal (formerly rdfs:Resource) |
| alternative | dc:title, dcterms:title | rdfs:Resource | rdfs:Literal (formerly rdfs:Resource) |
| type | dc:type | rdfs:Resource | rdfs:Class |
| audience | | rdfs:Resource | dcterms:AgentClass |
| educationLevel | dcterms:audience | rdfs:Resource | dcterms:AgentClass |
| mediator | dcterms:audience | rdfs:Resource | dcterms:AgentClass |
| accrualMethod | | dcmitype:Collection | dcterms:MethodOfAccrual |
| accrualPeriodicity | | dcmitype:Collection | dcterms:Frequency |
| accrualPolicy | | dcmitype:Collection | dcterms:Policy |
| instructionalMethod | | rdfs:Resource | dcterms:MethodOfInstruction |
| provenance | | rdfs:Resource | dcterms:ProvenanceStatement |
| rightsHolder | | rdfs:Resource | dcterms:Agent |

http://www.kanzaki.com/docs/sw/dc-domain-range.html

http://dublincore.org/documents/dcmi-terms/
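
The subPropertyOf column is what lets a generic application treat a precise term as its simpler parent. The sketch below is a toy illustration of that inference over a few rows of the table; the dictionary holds only a hand-picked subset, and `super_properties`, `entail`, and the `ex:doc` subject are hypothetical names used for the example.

```python
# A few rows of the table above: property -> its listed super-properties.
SUB_PROPERTY_OF = {
    "dcterms:created": ["dc:date", "dcterms:date"],
    "dcterms:date": ["dc:date"],
    "dcterms:creator": ["dc:creator", "dcterms:contributor"],
    "dcterms:contributor": ["dc:contributor"],
}


def super_properties(prop):
    """All properties implied by `prop` through rdfs:subPropertyOf."""
    seen = []
    stack = [prop]
    while stack:
        current = stack.pop()
        for parent in SUB_PROPERTY_OF.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return sorted(seen)


def entail(triple):
    """Triples that follow from `triple` by replacing its property with each super-property."""
    s, p, o = triple
    return [(s, q, o) for q in super_properties(p)]


print(super_properties("dcterms:creator"))
print(entail(("ex:doc", "dcterms:created", "2012-08-13")))
```

A consumer that only understands the 15 simple elements can therefore still use data published with the more precise dcterms properties.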


The Friend of a Friend (FOAF)


Metadata describing persons and their relationships


Voluntary project


@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<#JW>
    a foaf:Person ;
    foaf:name "Jimmy Wales" ;
    foaf:mbox <mailto:jwales@bomis.com> ;
    foaf:homepage <http://www.jimmywales.com/> ;
    foaf:nick "Jimbo" ;
    foaf:depiction <http://www.jimmywales.com/aus_img_small.jpg> ;
    foaf:interest <http://www.wikimedia.org> ;
    foaf:knows [
        a foaf:Person ;
        foaf:name "Angela Beesley"
    ] .

<http://www.wikimedia.org>
    rdfs:label "Wikipedia" .

Classes:

| Agent | Document | Group | Image | LabelProperty | OnlineAccount | OnlineChatAccount | OnlineEcommerceAccount | OnlineGamingAccount | Organization | Person | PersonalProfileDocument | Project |

Properties:

| account | accountName | accountServiceHomepage | age | aimChatID | based_near | birthday | currentProject | depiction | depicts | dnaChecksum | familyName | family_name | firstName | focus | fundedBy | geekcode | gender | givenName | givenname | holdsAccount | homepage | icqChatID | img | interest | isPrimaryTopicOf | jabberID | knows | lastName | logo | made | maker | mbox | mbox_sha1sum | member | membershipClass | msnChatID | myersBriggs | name | nick | openid | page | pastProject | phone | plan | primaryTopic | publications | schoolHomepage | sha1 | skypeID | status | surname | theme | thumbnail | tipjar | title | topic | topic_interest | weblog | workInfoHomepage | workplaceHomepage | yahooChatID |


SKOS (Simple Knowledge Organization System)

Metadata for taxonomy

Hierarchical structure of concepts

Invented to represent taxonomies such as subject headings

=/= subclass relationship among classes

W3C Recommendation 18 August 2009


SKOS (Simple Knowledge Organization System)


SKOS Core (hierarchical concept structure)


skos:semanticRelation


skos:broaderTransitive



skos:narrowerTransitive


skos:broader



skos:narrower


skos:related


skos:prefLabel


skos:altLabel


skos:hiddenLabel


subPropertyOf
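
The relation between skos:broader and skos:broaderTransitive can be sketched in a few lines. In SKOS, skos:broader is a sub-property of skos:broaderTransitive and only the latter is transitive, so the transitive pairs are derived rather than asserted. The first pair below comes from the UDC example earlier in the deck; the second pair and the `udc:top` identifier are made up for illustration.

```python
def broader_transitive(broader):
    """Derive skos:broaderTransitive pairs from asserted skos:broader pairs."""
    # Every broader pair is also a broaderTransitive pair (subPropertyOf).
    closure = set(broader)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # Chain a -> b and b -> d into a -> d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


BROADER = {
    ("udc:025553", "udc:025461"),  # from the UDC record shown earlier
    ("udc:025461", "udc:top"),     # hypothetical parent, for illustration
}

print(sorted(broader_transitive(BROADER)))
```

Keeping the asserted links separate from the derived ones preserves the direct hierarchy while still allowing ancestor queries.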


SKOS (Simple Knowledge Organization System)


SKOS Mapping


skos:mappingRelation


skos:closeMatch


skos:exactMatch


skos:broadMatch


skos:narrowMatch


skos:relatedMatch

subPropertyOf


Linked Open Vocabulary (LOV)


A technical platform for search and quality assessment across the vocabulary ecosystem

Register schemata

Search schemata

http://labs.mondeca.com/dataset/lov/


More Info.


http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset


Summary for schema


Some major schemata


DC, DC terms, FOAF, SKOS …


More domain-specific schemata


CIDOC CRM


PRISM





Re-using is highly recommended


LOV


Summary


Three layers


Ontology/Thesaurus/Taxonomy


Schema


Identification


Not just top-down, rather bottom-up

Each layer has its own role

Do not pursue the value of each layer alone; rather, make a good combination of them