Reintroducing GLIMIR - Music OCLC Users Group

The world’s libraries. Connected.

Reintroducing GLIMIR

Plenary Session: WorldCat Local Panel

Music OCLC Users Group Annual Meeting


San Jose, California

2013 February 27

Jay Weitz

Senior Consulting Database Specialist

WorldCat Quality Management Division

OCLC


Reintroducing GLIMIR: Definition and Objectives

GLIMIR = GLobal LIbrary Manifestation IdentifieR


To identify records describing the same manifestation: Manifestation Clusters.


Parallel records: Same resource with same content in same format, but described in
different languages of cataloging.


Create OCLC Manifestation Identifiers (OMI) and index them in WorldCat.


To identify records describing different manifestations with the same content:
Content Clusters.


Originals, reprints, microform reproductions, digital reproductions.


Create OCLC Content Identifiers (OCI) and index them in WorldCat.


To improve FRBR work sets by merging those containing records that GLIMIR
assesses to be equal in content.


Informing FRBR of algorithm improvements.
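
The two kinds of cluster described above can be pictured with a small sketch. It is only an illustration: the record fields `manifestation_key` and `content_key` are hypothetical stand-ins for whatever the GLIMIR matching actually computes, and the record numbers are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    number: int               # illustrative record number, not a real one
    manifestation_key: str    # hypothetical: same resource, same format
    content_key: str          # hypothetical: same content, any format
    cataloging_language: str


def cluster(records, key):
    """Group record numbers by the value returned by `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record.number)
    return {k: sorted(v) for k, v in groups.items()}


records = [
    Record(1, "print-original", "content-1", "eng"),
    Record(2, "print-original", "content-1", "fre"),  # parallel record
    Record(3, "microform", "content-1", "eng"),       # reproduction
    Record(4, "print-other", "content-2", "eng"),
]

# Manifestation clusters: one identifier (OMI) would be assigned per group.
manifestation_clusters = cluster(records, lambda r: r.manifestation_key)
# Content clusters: one identifier (OCI) would be assigned per group.
content_clusters = cluster(records, lambda r: r.content_key)
```

In this toy data the parallel English and French records fall into one manifestation cluster, while the microform reproduction joins them only at the content level.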


FRBR algorithm:


Works in real time.


Makes author/title key.


Creates work clusters.


Assigns the OCLC Work Identifier (OWI).
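
A minimal sketch of an author/title key and the work clusters built from it. The normalization rules shown (strip diacritics, lowercase, drop punctuation) are assumptions made for illustration; the slides do not specify how the FRBR algorithm builds its key.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Strip diacritics, lowercase, and collapse punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", stripped.lower())
    return " ".join(cleaned.split())


def author_title_key(author, title):
    """Hypothetical key format: normalized author, a slash, normalized title."""
    return f"{normalize(author)}/{normalize(title)}"


def work_clusters(records):
    """Group (record_id, author, title) tuples that share an author/title key."""
    clusters = defaultdict(list)
    for record_id, author, title in records:
        clusters[author_title_key(author, title)].append(record_id)
    return dict(clusters)


example = work_clusters([
    ("rec-1", "Dvořák, Antonín", "Rusalka"),
    ("rec-2", "Dvorak, Antonin", "Rusalka!"),
    ("rec-3", "Dvořák, Antonín", "Stabat Mater"),
])
```

Each resulting group is where a work identifier such as the OWI would be attached. Records with no clear author give a weaker key, which is one way multiple work sets for the same work can arise.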

Duplicate Detection and Resolution (DDR):


Works as an offline process.


Launches queries to find
candidate duplicates.


Resolution program determines
“retained” record.


GLIMIR adapts DDR algorithms,
creates clusters and identifiers.
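
The two DDR stages named above (find candidates, then resolve to a retained record) might be sketched as follows. The blocking key and the retention rule (most holdings, then lowest record number) are assumptions chosen for illustration, not the criteria DDR actually applies.

```python
from collections import defaultdict


def find_candidates(records, blocking_key):
    """Return groups of records that share a blocking key (possible duplicates)."""
    buckets = defaultdict(list)
    for record in records:
        buckets[blocking_key(record)].append(record)
    return [group for group in buckets.values() if len(group) > 1]


def choose_retained(group):
    """Hypothetical rule: keep the record with the most holdings;
    break ties by preferring the lowest record number."""
    return max(group, key=lambda r: (r["holdings"], -r["id"]))


records = [
    {"id": 10, "key": "title-a", "holdings": 3},
    {"id": 11, "key": "title-a", "holdings": 7},
    {"id": 12, "key": "title-b", "holdings": 1},
]

candidate_groups = find_candidates(records, lambda r: r["key"])
retained = [choose_retained(group) for group in candidate_groups]
```

Where DDR merges the other records into the retained one, a GLIMIR-style adaptation would keep every record and attach a shared cluster identifier to the group instead.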

Reintroducing GLIMIR: Relation to FRBR and DDR


Reintroducing GLIMIR:

Diagram of Metadata and Identifier Structure


Identifiers at all levels




Holdings at all levels



Metadata summaries at higher
levels
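
As a rough picture of "holdings at all levels", the sketch below rolls library holdings up from individual records to the manifestation, content, and work levels. All identifiers and library symbols are invented, and the tuple layout is an assumption made for this example.

```python
from collections import defaultdict

# (record id, OMI, OCI, OWI, holding libraries); every value is illustrative.
records = [
    ("r1", "omi-1", "oci-1", "owi-1", {"LIB1", "LIB2"}),
    ("r2", "omi-1", "oci-1", "owi-1", {"LIB2", "LIB3"}),
    ("r3", "omi-2", "oci-1", "owi-1", {"LIB4"}),
    ("r4", "omi-3", "oci-2", "owi-1", {"LIB1"}),
]


def roll_up(records, level):
    """Union the holdings of all records sharing the identifier at tuple index `level`."""
    holdings = defaultdict(set)
    for record in records:
        holdings[record[level]] |= record[4]
    return dict(holdings)


by_manifestation = roll_up(records, 1)  # grouped by OMI
by_content = roll_up(records, 2)        # grouped by OCI
by_work = roll_up(records, 3)           # grouped by OWI
```

Using set union means a library that holds two parallel records of the same manifestation is counted once at the cluster level.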


WorldCat.org: Before GLIMIR: Multiple Works, Scattered Holdings


Retrieves and displays one
representative record per work set.


Currently there may be multiple
work sets for the same work
(particularly for works without clear
authors).


Depending on the search, these
records may be scattered in large
result sets.

Reintroducing GLIMIR: Before


WorldCat.org: After GLIMIR: One Work, Consolidated Holdings


Consolidated work set (more
likely to get a thumbnail image).


Includes translations.


Briefer short lists, more complete
retrieval.

Reintroducing GLIMIR: After



Perception of the duplicate problem in WorldCat has worsened as more
records with non-English languages of cataloging are loaded and parallel
records are added.


Holdings scatter.


DDR has deleted nearly 13 million records
since 1992.


Perception of duplicates in WorldCat
remains.


GLIMIR OMI should have a bigger impact
on perceived duplication.


Importance of good work groups.

Reintroducing GLIMIR: Perceived Duplicates


GLIMIR complements de-duplication:


Hides records that are duplicates but cannot be de-duplicated
(styles/rules too different, sparse records).


Surfaces holdings, hides less
desired descriptions.


Gives a more accurate count of the number of manifestations in
WorldCat.

Reintroducing GLIMIR: De-Duplication


Just as with FRBR,
improvements to general
matching have been identified:


Typo tolerance in pagination.


Improvements to lists of noise
titles.


Improved language and
transliteration sensitivity.


Interpretation of size (e.g. gr8 = octavo = 8º = 22 cm = 8 in.)


Normalizing titles.
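
The size-interpretation item can be illustrated with a small sketch that maps different notations onto centimeters before comparing them. The lookup table holds only the equivalences given in the example above, and the 2 cm tolerance is an assumption rather than a documented matching rule.

```python
import re

# Named formats from the slide's example, all treated as about 22 cm.
NAMED_SIZES_CM = {"gr8": 22.0, "octavo": 22.0, "8o": 22.0, "8º": 22.0}


def size_in_cm(text):
    """Convert a size statement to centimeters, or return None if unrecognized."""
    token = text.strip().lower().rstrip(".")
    if token in NAMED_SIZES_CM:
        return NAMED_SIZES_CM[token]
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(cm|in)", token)
    if match is None:
        return None
    value = float(match.group(1))
    return value * 2.54 if match.group(2) == "in" else value


def same_size(a, b, tolerance_cm=2.0):
    """Treat two size statements as equal if both parse and differ by at most the tolerance."""
    x, y = size_in_cm(a), size_in_cm(b)
    return x is not None and y is not None and abs(x - y) <= tolerance_cm


matches = [same_size("gr8", "22 cm"), same_size("octavo", "8 in."), same_size("8o", "30 cm")]
```

A tolerance is needed because the notations are only roughly equal: 8 in. is about 20.3 cm, not exactly 22 cm.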

Reintroducing GLIMIR: De-Duplication



“Cast list.”


Dates.


Scores, Parts, Scores and Parts.

Reintroducing GLIMIR: Music and Film


Reintroducing GLIMIR:

Show GLIMIR Search Results



Reintroducing GLIMIR:

Show All GLIMIR Cluster Records



Reintroducing GLIMIR:

Search Without GLIMIR Option


Reintroducing GLIMIR:

Same Search with GLIMIR Option Selected


Reintroducing GLIMIR: GLIMIR Cluster


Reintroducing GLIMIR: Cluster Holdings

Information Displays on Each Bibliographic Record



Robert Bremer


Ted Fons


Janifer Gatenby


Richard O. Greene


Ying Li


W. Michael Oskins


Patricia Schuette Sexton


Gail Thornburg


Kelly Womble

Reintroducing GLIMIR: Acknowledgements