web.tech.lib: Leveraging Metadata to Create Better Web Services

By Erik Mitchell

Libraries have been increasingly concerned with data creation, management, and publication. This increase is partly driven by shifting metadata standards in libraries and partly by the growth of data and metadata repositories being managed by libraries. In order to manage these data sets, libraries are looking for new preservation and discovery systems as well as opportunities to connect local metadata with the Web. In this column, I will look at some potential benefits and problems that data brings and consider some ways to get started with new data management and publication tools.

As libraries become responsible for more data, they are also being called on to support data preservation, discovery, and analysis tools. Data types and sources include bibliographic records, digital library collection metadata, Web site resources, and research data. While this is an exciting area of Web development, the range of needs from researchers, librarians, archivists, and general users means the process of creating and publishing data has become complex. Likewise, a growing set of metadata means it is becoming more difficult to manage metadata using a single set of systems. While this situation poses difficulties for libraries, it is also a chance for libraries to demonstrate their relevance in a changing landscape and create new opportunities to engage their patrons in analytical and participatory roles. Over the next few issues, we will explore this area, beginning with a look at how libraries can get started by converting existing metadata and using it in new research platforms, and continuing with this theme until the upcoming JWL special issue on data curation, 6(4).


Upgrade your existing metadata

Perhaps the largest set of data, and the one with the most need for change, is library bibliographic data. While the migration of bibliographic data to a new platform is a long process, and one in which libraries must work together, initiatives like the Library of Congress Bibliographic Framework Transition Initiative (Marcum 2011) are moving towards a standard that will natively work in new research platforms. Part of this transition includes the adoption of Linked Open Data as a model for representing data and relationships between data sets. An introductory summit, Linked Open Data in Libraries, Archives, and Museums (LOD-LAM), discussed the use of Resource Description Framework (RDF)-encoded vocabularies, publishing data under open licenses, and the use of graph structures as opposed to relational databases to represent metadata (Intro to the LODLAM Talk 2011).
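
To make the contrast between graph structures and relational tables more concrete, here is a minimal Python sketch that stores a few statements as subject-predicate-object triples, the basic unit of RDF. The example.org identifiers and the sample values are hypothetical; the predicates are Dublin Core terms and the FOAF name property. It illustrates the idea only and is not a full RDF implementation.

```python
# A minimal sketch of RDF-style triples using only plain Python tuples.
# All example.org URIs and literal values are hypothetical.
DCTERMS = "http://purl.org/dc/terms/"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

triples = [
    ("http://example.org/book/1", DCTERMS + "title", "Sample Title"),
    ("http://example.org/book/1", DCTERMS + "creator", "http://example.org/person/7"),
    ("http://example.org/book/2", DCTERMS + "creator", "http://example.org/person/7"),
    ("http://example.org/person/7", FOAF_NAME, "Sample Author"),
]


def objects(subject, predicate):
    """Return every object linked to a subject by a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject that points at an object through a predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


# Follow the graph: from a book to its creator, then to all works by that creator.
creator = objects("http://example.org/book/1", DCTERMS + "creator")[0]
works = subjects(DCTERMS + "creator", creator)
print(works)
```

Because the creator is an identifier rather than a text string, any other data set that uses the same identifier can be joined to these statements without a shared table schema.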


Understanding RDF, Linked Open Data, and Web ontologies can be challenging, if not overwhelming. Traditional metadata creation and management tools are not always best suited for this work. New tools are being created to help migrate metadata from spreadsheets, databases, and resource-focused metadata models to graph-based RDF models. The Extensible Catalog's Metadata Services Toolkit (http://code.google.com/p/xcmetadataservicestoolkit/) provides a platform for ingesting and manipulating bibliographic metadata.

Another interesting tool for bibliographic and other metadata is Google Refine (http://google.com/refine). This tool is a Java application that runs on the local machine and is used through a Web browser such as Chrome; it is capable of ingesting, manipulating, and exporting metadata. It accomplishes these tasks through a set of tools that include data ingest and normalization, data linking, and record encoding. Google Refine also includes plug-ins that can export RDF-encoded linked data, normalized spreadsheet data, and other metadata formats that are ready to be used in a number of data research tools.

Just being aware of the tools is not enough, however. Libraries need accessible tools and familiarity with tool use in order to create Linked Open Data. Sites like Free Your Metadata (http://freeyourmetadata.org), created by Seth Van Hooland, Max De Wilde, and Ruben Verborgh, explore the process of converting metadata to LOD using commonly available tools and standards. The videos and tutorials on the site step users through the process of using Google Refine to create RDF-formatted linked data from traditional bibliographic records. While Free Your Metadata focuses on bibliographic data and vocabularies, applying these practices to any local data creates opportunities to implement new discovery and research services on library metadata.
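
For readers who want to see what a spreadsheet-to-RDF conversion amounts to, the following Python sketch turns rows of a small CSV export into N-Triples statements. It is an illustration under stated assumptions and not a description of how Google Refine works internally: the column names, the sample rows, and the example.org base URI are hypothetical, and the column-to-predicate mapping uses Dublin Core terms.

```python
import csv
import io

# Hypothetical spreadsheet export; column names and values are assumptions.
SPREADSHEET = '''id,title,date
1,Sample Title One,2011-05-01
2,"Sample ""Quoted"" Title",2012-02-27
'''

BASE = "http://example.org/record/"  # hypothetical base URI for local records
PREDICATES = {
    "title": "http://purl.org/dc/terms/title",
    "date": "http://purl.org/dc/terms/date",
}


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal.

    Newlines and other control characters are not handled in this sketch.
    """
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(text):
    """Turn each non-empty mapped cell into one N-Triples statement."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = "<%s%s>" % (BASE, row["id"])
        for column, predicate in PREDICATES.items():
            if row[column]:  # skip empty cells
                literal = escape_literal(row[column])
                lines.append('%s <%s> "%s" .' % (subject, predicate, literal))
    return lines


for line in rows_to_ntriples(SPREADSHEET):
    print(line)
```

Each output line is a complete statement, so the result can be concatenated with statements from other sources, which is part of what makes the graph model attractive for combining local data sets.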


Analyze and publish your data

One of the reasons libraries are motivated to create and publish data using these new standards is the need to make their data available in research services and discovery tools. Research-focused Web services differ from other digital library services in that they support data discovery, visualization, and export functionality. Community-managed sites like JSTOR's Data for Research (http://jstor.org/dfr) demonstrate how a single data set can be made useful in new ways with a different interface.

After using tools like Google Refine to migrate and transform metadata, libraries need platforms on which their metadata can be made available. Sites like DataHub (http://thedatahub.org) and Freebase (http://freebase.com) are potential places to publish linked data. However, these sites focus on providing access to data as opposed to research support. Developing research tools is possible, and most common development platforms have programming libraries available to facilitate the process. There are also a number of tools that offer ready-made services without the overhead of local development.

For example, Viewshare is a hosted implementation of the open source Recollection software (http://sourceforge.net/projects/loc-recollect/) developed in conjunction with the Library of Congress. Recollection is different from other digital library platforms in that it focuses on supporting data visualization and research services in addition to traditional discovery services. Viewshare takes this a step further by offering a cloud-based implementation of Recollection so that libraries can publish data without the overhead of implementing and managing Recollection software.

Two important features that Recollection supports are geographic and date/time discovery interfaces. In order to accomplish this, the service relies on normalized date/time and geo-location data. It also supports the normalization of this data, making it possible for libraries with geo-location and date/time data in different formats to easily migrate that data into the platform. The platform also supports data export in multiple formats, enabling researchers to take data with them for use in their own systems.
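
As an illustration of what normalizing geo-location data can involve, the Python sketch below converts coordinates written either as decimal text or as degrees-minutes-seconds into signed decimal degrees. It does not describe how Recollection performs normalization; the accepted input shapes and the function name are assumptions made for the example.

```python
import re

# Matches strings such as 36°06'00"N or 36 6 0 N
# (degrees, minutes, seconds, then a hemisphere letter).
DMS_PATTERN = re.compile(
    r"^\s*(\d{1,3})\D+(\d{1,2})\D+(\d{1,2}(?:\.\d+)?)\D*([NSEW])\s*$",
    re.IGNORECASE,
)


def to_decimal_degrees(value):
    """Normalize a coordinate given as decimal text or degrees-minutes-seconds."""
    text = value.strip()
    try:
        return float(text)  # already decimal, e.g. "-80.25"
    except ValueError:
        pass
    match = DMS_PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognized coordinate: %r" % value)
    degrees, minutes, seconds, hemisphere = match.groups()
    decimal = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if hemisphere.upper() in "SW" else decimal


print(to_decimal_degrees("36 6 0 N"))
print(to_decimal_degrees("80°15'00\"W"))
```

Values that match neither shape raise an error rather than being guessed at, so problem records can be reviewed by hand before they are loaded into a discovery interface.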

By providing a cloud platform for this open source software on the viewshare.org site, the Library of Congress makes it easy for small and medium-size repositories to leverage Linked Open Data in a new discovery/research service. Viewshare can be populated by simple spreadsheets or by an OAI-PMH provider, meaning it can easily fit into a library's suite of Web services.
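
To show what an OAI-PMH provider exposes to a harvesting service, the Python sketch below builds a ListRecords request URL and reads the Dublin Core titles out of a response. The repository address (example.org) and the trimmed sample response are hypothetical; a real response carries more elements, and this is not a description of how Viewshare harvests.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core element namespace


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles_from_response(xml_text):
    """Pull every Dublin Core title out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_NS + "title")]


# Hypothetical, heavily trimmed response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(titles_from_response(SAMPLE_RESPONSE))
```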


Good practice and getting started

Just as library IT had to grow to include new types of automation systems such as electronic resource management systems, archival management systems, and online reference services, the new crop of data visualization services requires Web service teams to be prepared to publish data in standard metadata formats. This may or may not mean adopting a new metadata standard; popular standards like Dublin Core, MODS, and METS support the metadata elements that are commonly used in discovery and research systems. What will be required, however, is the implementation of metadata following strict encoding guidelines for standard vocabularies and interoperable metadata formats like date/time and geographic data.

In particular, it is important to follow the strict date/time encoding guidelines of ISO 8601 (http://www.iso.org/iso/support/faqs/faqs_widely_used_standards/widely_used_standards_other/date_and_time_format.htm) and the geo-location guidelines of ISO TC211 (http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=54904).
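
A small Python sketch can show what following ISO 8601 means in practice: dates recorded in several local styles are rewritten as YYYY-MM-DD. The list of input formats is an assumption for illustration, and a slash-separated date is read month first, which will not suit every collection. Month names are matched in English under Python's default locale.

```python
from datetime import datetime

# Input formats assumed for illustration; a real collection needs its own list.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %b %Y"]


def to_iso8601(value):
    """Return the date as YYYY-MM-DD, trying each known format in turn."""
    text = value.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # not this format; try the next one
    raise ValueError("unrecognized date: %r" % value)


for sample in ["February 27, 2012", "02/27/2012", "27 Feb 2012"]:
    print(to_iso8601(sample))
```

Dates that match none of the listed formats raise an error instead of being silently altered, which keeps ambiguous values visible during cleanup.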


By taking a look at all of your library Web metadata and moving in the direction of linked open data, you can enable easy use of your metadata in these new visualization services. While this requires a different approach, there are new off-the-shelf tools that can help your Web presence grow. Beginning your Web service development by focusing on the metadata issues may feel somewhat backwards. In fact, it may seem frustrating when metadata normalization takes more time than service development. But publishing your metadata as linked open data, creating services that make it possible to publish and harvest metadata at a large scale, and adding a research platform interface to your digital library will have an impact on your library service utilization for years to come. Over the next few columns, I will explore the role of data and metadata in Web services and consider how to create services that make this data work.


References

Intro to the LODLAM Talk: Live from the Smithsonian. 2011.

Marcum, Deanna. 2011. A Bibliographic Framework for the Digital Age. Accessed February 27, 2012. http://www.loc.gov/marc/transition/.


This is a preprint submitted for consideration in the Journal of Web Librarianship, copyright 2012, Taylor & Francis. The Journal of Web Librarianship is available online at: http://www.informaworld.com.