Oct 22, 2013

Controlled Vocabulary &
Thesaurus Design

Resources & Future Directions

Developed by the Association for Library Collections & Technical Services and
Library of Congress’s Cataloger’s Learning Workshop

Thesaurus Design Software


Comprehensive list of Thesaurus Software


http://www.willpower.demon.co.uk/thessoft.htm


Comparison of Thesaurus Software


http://www.willpower.demon.co.uk/thestabl.htm

A Cautionary Note


The true work of controlled vocabulary design is the
collection and intellectual organization of terms!


Thesaurus software


A tool for developing thesauri


Analogous to the functionality of a word processor for
writing a book


Unfortunately, it will not do all of the work for you


That said, it is always good to have the right tool for
the job!

Example 1

Example 2

Example 3

The Future of Controlled Vocab


Integrated ILS and thesauri


The semantic web


Goals


Make semantic relationships machine-readable


Distributed database platform


Combined sets of semantic relationships = Semantic Web


Some XML-based technologies


RDF - Resource Description Framework


OWL - Web Ontology Language


RSS - Really Simple Syndication


Ex. MARCXML
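
To make "machine-readable semantic relationships" concrete, here is a minimal sketch in plain Python that stores each relationship as a subject-predicate-object triple, which is the basic shape of an RDF statement. It is not RDF syntax itself. The `ex:` prefix, the term names, and the `broader` / `related` predicate labels are hypothetical and chosen only for illustration.

```python
# Each relationship is one (subject, predicate, object) triple.
# All identifiers below are made up for illustration.
triples_a = {
    ("ex:Cats", "broader", "ex:Mammals"),
    ("ex:Mammals", "broader", "ex:Animals"),
}
triples_b = {
    ("ex:Cats", "related", "ex:Veterinary medicine"),
}

# Combining independently published sets of relationships is a set union.
combined = triples_a | triples_b


def objects_of(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def ancestors(triples, term):
    """Follow 'broader' links transitively and return all broader terms."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for parent in objects_of(triples, current, "broader"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


print(ancestors(combined, "ex:Cats"))  # ['ex:Mammals', 'ex:Animals']
```

Because every statement has the same three-part shape, a program can merge sets from different sources and walk the relationships without knowing anything about the subject matter.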

Continuum of Vocabulary Control

Complexity: Less to More

List: Ambiguity Control

Synonym Ring: Synonym Control

Taxonomy: Ambiguity Control, Synonym Control, Hierarchical Relationships

Ontology

Thesaurus: Ambiguity Control, Synonym Control, Hierarchical Relationships, Associative Relationships
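
As a rough illustration of how the structures on this continuum differ, the sketch below models a list, a synonym ring, the hierarchy of a taxonomy, and a thesaurus entry as plain Python data. The terms are invented, and the `UF` / `BT` / `NT` / `RT` keys follow common thesaurus abbreviations (used for, broader term, narrower term, related term). Treat it as one possible representation, not a prescribed data model.

```python
# A list: a flat set of allowed terms; qualifiers disambiguate homographs.
term_list = {"Mercury (planet)", "Mercury (element)"}

# A synonym ring: terms treated as equivalent for retrieval,
# with no preferred term among them.
synonym_ring = {"car", "automobile", "motorcar"}

# The hierarchy of a taxonomy: each term points to its parent.
taxonomy_parent = {"Sedans": "Cars", "Cars": "Vehicles"}

# A thesaurus entry: equivalence, hierarchical and associative
# relationships recorded for one preferred term.
thesaurus = {
    "Cars": {
        "UF": ["Automobiles", "Motorcars"],  # used for (synonym control)
        "BT": ["Vehicles"],                  # broader term
        "NT": ["Sedans"],                    # narrower term
        "RT": ["Roads"],                     # related term (associative)
    }
}


def expand_query(word, ring):
    """Synonym-ring search: any member of the ring retrieves all members."""
    return sorted(ring) if word in ring else [word]


def preferred_term(entry_term, thes):
    """Map an entry term to its preferred term, or None if unknown."""
    for preferred, rels in thes.items():
        if entry_term == preferred or entry_term in rels["UF"]:
            return preferred
    return None


print(expand_query("car", synonym_ring))
print(preferred_term("Automobiles", thesaurus))
```

Each step along the continuum adds a kind of relationship, which is why the thesaurus entry needs the richest structure.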

A Terminology Challenge

Business

Information Science

Computer Science

Controlled Vocabulary

Taxonomy

Ontology

Changing Definitions


When the standard was published


Ontology ~ Taxonomy = Hierarchy


Now…


Taxonomy = Controlled Vocabulary = Ontology


Expect it to continue to change!

Taxonomies are Everywhere


Especially in product websites


Examples include


Yahoo!, Amazon, HomeDepot, etc.


To see them, you just need to look around

Ontologies


Part of the Semantic Web suite of technologies


Ontologies are:


Published in a Namespace (like a URL)


Consist of Objects, Associations, and Instances


Completely analogous to Controlled Vocabularies


Terms, Relationships, Application of the term to some Thing
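
The analogy above can be sketched with three small Python classes, one per ontology component, each named after its controlled-vocabulary counterpart. The namespace URL, class names, and sample terms are all hypothetical. This is only meant to show the correspondence, not how any ontology language defines these concepts.

```python
from dataclasses import dataclass

# Hypothetical namespace; a published ontology would use its own.
NAMESPACE = "http://example.org/vocab#"


@dataclass(frozen=True)
class Term:
    """Ontology 'object' / controlled-vocabulary term."""
    name: str

    def uri(self):
        # Identify the term inside the namespace.
        return NAMESPACE + self.name.replace(" ", "_")


@dataclass(frozen=True)
class Relationship:
    """Ontology 'association' / relationship between two terms."""
    source: Term
    kind: str
    target: Term


@dataclass(frozen=True)
class Application:
    """Ontology 'instance' / a term applied to some thing."""
    term: Term
    thing: str


pansori = Term("Pansori")
music = Term("Korean music")
rel = Relationship(pansori, "broader", music)    # illustrative relationship
applied = Application(pansori, "recording-001")  # made-up resource identifier

print(music.uri())  # http://example.org/vocab#Korean_music
```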

Topic Map Model

Knowledge Layer (Index)

Information Layer (Content)

The Information Layer


The lower layer contains the content


usually digital, but need not be


can be in any format or notation


can be text, graphics, video, audio, etc.

The Knowledge Layer


The upper layer consists of topics and associations


Topics represent the subjects that the information is about


Associations represent relationships between those subjects
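
The two-layer model can be sketched in a few lines of Python: topics and associations form the knowledge layer, and occurrences point from a topic down to content in the information layer. The class and method names are hypothetical, as is the file locator. The topic names, the "consists of" association type and the "Video" occurrence type are taken from the Pansori example in this deck, but pairing ChunHyang-Ga with Sarang-Ga through "consists of" is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    topic_type: str
    # Occurrences link the knowledge layer to content:
    # each entry is (occurrence type, locator of the resource).
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    assoc_type: str
    members: tuple  # names of the topics the association connects


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, topic_type):
        self.topics[name] = Topic(name, topic_type)
        return self.topics[name]

    def associate(self, assoc_type, *names):
        # Refuse associations that mention unknown topics.
        missing = [n for n in names if n not in self.topics]
        if missing:
            raise KeyError(f"unknown topics: {missing}")
        self.associations.append(Association(assoc_type, names))

    def related(self, name):
        """Return (association type, other topic) pairs for one topic."""
        pairs = []
        for a in self.associations:
            if name in a.members:
                pairs.extend((a.assoc_type, m) for m in a.members if m != name)
        return pairs


tm = TopicMap()
work = tm.add_topic("ChunHyang-Ga", "Pansori")
tm.add_topic("Sarang-Ga", "Part")
# Assumed pairing, for illustration only.
tm.associate("consists of", "ChunHyang-Ga", "Sarang-Ga")
# Hypothetical locator into the information layer.
work.occurrences.append(("Video", "file:///example/chunhyang-ga.mp4"))

print(tm.related("Sarang-Ga"))  # [('consists of', 'ChunHyang-Ga')]
```

Keeping the content outside the map, referenced only by locator, is what lets the same index sit on top of text, audio, or video alike.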

Topics: ChunHyang-Ga (Pansori); KangSan-Je (Sect); Kim, MH (Tambour); Shin, JH (Composer); Jung-Mo-Li (Rhythm); Sarang-Ga (Part); Present (Genealogy); Soonchang (Region); SeolRyong-Ge (Tone); Song, KR; Park, NJ; Man-Jung; ChunHyang Editorials (Editorials)

Associations: belongs to; famous for; composed-by; singer-tone; consists of; is a body of; is a dunum of; exponent of; teacher-student; part-rhythm; singer-part; played-by; tambour-rhythm; has editorials; appears in; matches with; is a member of

Pansori Occurrence Mapping

Topics: ChunHyang-Ga (Pansori); KangSan-Je (Sect); Kim, Myunghwan (Tambour); Shin, Jaehyo (Composer); Jung-Mo-Li (Rhythm); Sarang-Ga (Part); Present (Genealogy); Soonchang (Region); SeolRyong-Ge (Tone); Song, Kwangrok; Park, Nokju; Man-Jung; ChunHyang Editorials (Editorials)

Associations: belongs to; famous in; composed-by; singer-tone; consists of; is a body of; is a dunum of; exponent of; teacher-student; part-rhythm; singer-part; played by; tambour-rhythm; has editorials; appears in; matches with; is a member of

Pansori Occurrence Types

Contents

Genealogy

Introduction

Website

Article

Video

Position

Album

Image

Sound

Critique

Paper

Date of birth

Biography

Book

Birthplace

Real name

Pen name

Nick name

Activity year

Structure

Producing year

Pansori Domain TM Structures

Topic Types: Je; Sect; KangSan-Je; DongPyon-Je; DongCho-Je; SeoPyon-Je; ChungGo-Je; People; Tambour; Composer; Singer; Singer genealogy; Pansori Type; Creative Pansori; Pansori; Dan-Ga; Byon-Chang; SeungDo-Chang; Chang-Guk; Part; Body; Rhythm; Tone; Region; Editorials; Pansori Elements

Association Types: Appears in; Belongs to; Composed by; Consists of; Exponent of; Famous in; Has Editorials; Hierarchical relationship; Is a body of; Is a dunum of; Is a member of; Matches with; Part-Rhythm; Part-Tone; Played by; Singer-Part; Singer-Tone; Tambour-Rhythm; Teacher-Student

Occurrence Types: Contents; Concert; Article; Paper; Work Structure; Nick Name; Real Name; Image; Sound; Introduction; Genealogy; Producing Year; Book; Type; Position; Editorial; Video; Critique; Prize; Albums; Web site; Homepage; Date Of Birth; Activities Year; Birth Place; Pen Name

Relational Schema for Topic Maps

Ontology Crosswalks

EnvML: Time; Latitude; Longitude; Altitude; Species; Environment

SensorML: Timestamp; X; Y; Z; Sensor; Precision; Mote

Ontology Integration

EnvML: Time; Latitude; Longitude; Altitude; Species; Environment

SensorML: Timestamp; X; Y; Z; Sensor; Precision; Mote

EnvSensML: Timestamp; Latitude; Longitude; Altitude; Species; Sensor; Precision; Environment; Mote
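
A crosswalk and an integration can both be expressed as field mappings. The sketch below, in Python, renames fields from each source schema into the combined EnvSensML field set shown on the slide, and derives a SensorML-to-EnvML crosswalk from the same mappings. The field names come from the slides. The pairing of X, Y and Z with Longitude, Latitude and Altitude is an assumption, since the slides do not state it, and the function names and sample values are hypothetical.

```python
# Combined field set, in the order listed on the slide.
ENVSENSML_FIELDS = [
    "Timestamp", "Latitude", "Longitude", "Altitude", "Species",
    "Sensor", "Precision", "Environment", "Mote",
]

# Source field -> EnvSensML field.
ENVML_TO_MERGED = {
    "Time": "Timestamp", "Latitude": "Latitude", "Longitude": "Longitude",
    "Altitude": "Altitude", "Species": "Species", "Environment": "Environment",
}
# ASSUMPTION: X/Y/Z correspond to Longitude/Latitude/Altitude.
SENSORML_TO_MERGED = {
    "Timestamp": "Timestamp", "X": "Longitude", "Y": "Latitude",
    "Z": "Altitude", "Sensor": "Sensor", "Precision": "Precision",
    "Mote": "Mote",
}


def to_merged(record, mapping):
    """Rename a record's fields into EnvSensML names."""
    return {mapping[k]: v for k, v in record.items()}


def crosswalk(sensor_record):
    """SensorML -> EnvML: keep only fields that EnvML also has."""
    merged_to_envml = {v: k for k, v in ENVML_TO_MERGED.items()}
    merged = to_merged(sensor_record, SENSORML_TO_MERGED)
    return {merged_to_envml[k]: v for k, v in merged.items()
            if k in merged_to_envml}


def integrate(env_record, sensor_record):
    """Merge one record from each schema into the EnvSensML field set."""
    merged = dict.fromkeys(ENVSENSML_FIELDS)  # every field starts as None
    merged.update(to_merged(sensor_record, SENSORML_TO_MERGED))
    merged.update(to_merged(env_record, ENVML_TO_MERGED))  # EnvML wins on overlap
    return merged


# Made-up sample records.
sensor = {"Timestamp": "2013-10-22T12:00:00Z", "X": -77.0, "Y": 38.9,
          "Z": 10.0, "Sensor": "thermometer", "Precision": 0.1, "Mote": "m-7"}
env = {"Time": "2013-10-22T12:00:00Z", "Latitude": 38.9, "Longitude": -77.0,
       "Altitude": 10.0, "Species": "Quercus alba", "Environment": "forest"}

print(crosswalk(sensor))
print(integrate(env, sensor))
```

The crosswalk loses Sensor, Precision and Mote because EnvML has no place for them, whereas the integrated schema keeps every field from both sources.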

MARCXML Schema
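
For readers who have not seen MARCXML, the sketch below parses a small, simplified fragment with Python's standard `xml.etree.ElementTree` and pulls out the subject terms. The namespace URI is the one MARCXML uses. The sample record is made up and omits the leader and control fields, so it is not a complete catalog record, and the helper name `subfield_values` is hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up, simplified fragment: one title field (245) and
# two topical subject fields (650).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Example title</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Thesauri</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Subject headings</subfield>
  </datafield>
</record>"""


def subfield_values(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)
print(subfield_values(record, "245", "a"))  # ['Example title']
print(subfield_values(record, "650", "a"))  # ['Thesauri', 'Subject headings']
```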

Resources Available


List of thesauri


http://www.lub.lu.se/metadata/subject-help.html


Thesaurus construction guide


http://www.willpower.demon.co.uk/


Course materials, for updated slide sets


http://www.moebiustrip.org/CV/

Course Goals


Understand and apply fundamental concepts of
controlled vocabulary and thesaurus design, and why
they are important


Understand and apply diverse types of term
relationships to structure descriptive terms


Understand and apply both basic rules and best
practices from existing thesauri to the construction
and maintenance of thesauri and controlled
vocabularies


Develop a basis for exercising individual judgment for
making thesaurus and controlled vocabulary
decisions

Wrapping up


Any last questions?


Course evaluations