Digital Libraries: Study into the features
of the DSpace Suite



Devika P. Madalli

Documentation Research and Training Centre

Indian Statistical Institute

Bangalore 560059




International Workshop on Building Digital Libraries, DRTC/ISI, 7th-11th March 2005

Introduction

Digital libraries encompass a whole range of
information-service work, such as:


Organization of digital information


Information retrieval


User interface


Archiving and preservation


Services and social issues


Evaluation and applications to particular areas

Desirable Features of DL Software


Structured


Accessible


Searchable


Extensible


Massive


Heterogeneous


Persistent


DL’s operation should be examined under…

1. Architectural design: modular and open

2. Backend database: scalable, robust, data formats

3. Network capabilities: web-based and seamless operations, persistent IDs, security and authentication

4. Metadata and interoperability: compatible with world standards such as Dublin Core and OAI-PMH

Technical Issues


Open source software vs. commercial OS


Hardware and peripheral requirements


Network Components


Standards: data formats, metadata, network, access, interoperability, encoding

Approaches to Building DL


Digitization: retro-conversion of non-digital resources to digital


Digitally born resources: involves inter-conversion to standard formats and storage

Why DSpace Digital Library


An open source technology platform which can be
customized and its capabilities can be extended


A service model for open access and/or digital
archiving for perpetual access


A platform for building an Institutional Repository whose
collections are searchable and retrievable on the Web


To make available institution-based scholarly
material in digital formats. The collection will be open
and interoperable.


DSpace is

Architecture and System Requirements

The DSpace system is organized into three layers:


The Storage Layer: responsible for physical storage of metadata and content


The Business Layer: deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow


The Application Layer: contains components that communicate with the networked world outside of the individual DSpace installation, for example the Web user interface and the modules for the metadata harvesting service
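
The three-layer split described above can be sketched in a few lines. This is only an illustration of the layering idea, not DSpace's actual code: the class names, method names, the in-memory dictionaries, and the example e-mail address are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class StorageLayer:
    """Physical storage of metadata and content (here: two in-memory dicts)."""
    metadata: dict = field(default_factory=dict)
    content: dict = field(default_factory=dict)

    def put(self, item_id, metadata, data):
        self.metadata[item_id] = metadata
        self.content[item_id] = data


@dataclass
class BusinessLayer:
    """Content management, e-people, authorization, and workflow."""
    storage: StorageLayer
    submitters: set = field(default_factory=set)  # e-people allowed to submit

    def submit(self, eperson, item_id, metadata, data):
        # Authorization check happens here, never in the storage layer.
        if eperson not in self.submitters:
            raise PermissionError(f"{eperson} may not submit items")
        self.storage.put(item_id, metadata, data)

    def describe(self, item_id):
        return self.storage.metadata[item_id]


@dataclass
class ApplicationLayer:
    """Outward-facing components such as a Web UI or a harvesting endpoint."""
    business: BusinessLayer

    def show_item(self, item_id):
        title = self.business.describe(item_id)["title"]
        return f"Item {item_id}: {title}"


# Hypothetical usage: one authorized e-person submits one item.
business = BusinessLayer(StorageLayer(), submitters={"alice@example.org"})
business.submit("alice@example.org", "item-1", {"title": "Sample report"}, b"%PDF")
print(ApplicationLayer(business).show_item("item-1"))
```

The point of the sketch is the direction of the dependencies: the application layer talks only to the business layer, and only the business layer touches storage.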

Features of a near ideal DL


Low cost, including all hardware and software
components


Technically simple to install and manage


Robust


Scalable


Open and interoperable


Modular


User Friendly


Multi-user (including both searching and
maintenance)


Multimedia digital object enabled


Platform independent (including both client and
server components), interoperable

DSpace is a joint project of MIT
Libraries and Hewlett-Packard Labs

What is DSpace?


Digital Object management system


Create, search and retrieve digital objects


Facilitate preservation of digital objects


Open source software


Allows open access and digital archiving


Allows building
Institutional Repositories


H/W and S/W requirements


UNIX recommended (Java-based program should
run on anything)


Open source, built on Apache web server and
Tomcat Servlet engine


Uses the PostgreSQL relational database

What can DSpace do?


Captures


Digital content in any format directly from creators
(e.g. researchers, authors)


Describes


Descriptive, technical, rights metadata


Persistent identifiers


OAI-PMH version 2.0 compliant


Allows metadata creation

Possible types of Content


Preprints, articles


Postprints


Technical Reports


Conference Papers


Theses/Dissertations


Datasets: e.g. statistical, geospatial, scientific


Images: visual, scientific, etc.


Audio files


Video files


Digitized library collections

Formats of Content

File Formats


Supported: fully supports the format


Known: recognizes the format, but cannot
guarantee full support


Unsupported: cannot recognize the format;
these will be listed as "application/octet-stream" (Unknown)
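
The three support levels can be illustrated with a small lookup. This is a minimal sketch, assuming a registry keyed by file extension; the registry entries, the `classify` helper, and the level assigned to each example format are hypothetical and do not reproduce DSpace's real format registry.

```python
import os
from enum import Enum


class SupportLevel(Enum):
    SUPPORTED = "supported"      # format is fully supported
    KNOWN = "known"              # recognized, full support not guaranteed
    UNSUPPORTED = "unsupported"  # not recognized


# Hypothetical registry; the entries are examples only.
FORMAT_REGISTRY = {
    ".pdf": ("application/pdf", SupportLevel.SUPPORTED),
    ".doc": ("application/msword", SupportLevel.KNOWN),
}

# Unrecognized files fall back to the generic binary type.
UNKNOWN = ("application/octet-stream", SupportLevel.UNSUPPORTED)


def classify(filename):
    """Return (mime_type, support_level) for a file name."""
    ext = os.path.splitext(filename)[1].lower()
    return FORMAT_REGISTRY.get(ext, UNKNOWN)


for name in ("thesis.pdf", "notes.doc", "data.xyz"):
    mime, level = classify(name)
    print(name, mime, level.value)
```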



Information Model


Communities


Departments, Labs, Research Centers, Schools…


Collections


Items


Files (bitstreams)


Multiple formats: same content


Complex objects: multiple files
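
The hierarchy above (communities contain collections, collections contain items, items contain files) might be modeled as nested records. The sketch below is an assumption-laden illustration, not DSpace's schema: the class and field names and the sample data are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    """A single stored file."""
    name: str
    mime_type: str


@dataclass
class Item:
    title: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """For example a department, lab, research center, or school."""
    name: str
    collections: List[Collection] = field(default_factory=list)

    def count_files(self):
        return sum(len(item.bitstreams)
                   for coll in self.collections
                   for item in coll.items)


# One item holding the same content in two formats.
thesis = Item("Sample thesis", [
    Bitstream("thesis.pdf", "application/pdf"),
    Bitstream("thesis.ps", "application/postscript"),
])
dept = Community("Sample Department", [Collection("Theses", [thesis])])
print(dept.count_files())  # 2
```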


Intellectual Property


Click-through license during submission


Grants DSpace non-exclusive right to acquire,
manage, preserve, distribute the item


Does not grant DSpace copyright


Copy of license stored with item

Goodies


Modular architecture, well-defined APIs


100% open source


Programmed in Java


RDBMS and SQL for metadata


CNRI “handles” for persistent identifiers


OpenURL linking


OAI-PMH for exposing metadata

Backend Technology


Apache, Tomcat, OpenSSL/mod_ssl


Java 1.3, JSP 1.2, Servlet 2.3


PostgreSQL 7, JDBC (RDBMS)


CNRI Handle System 5 (persistent IDs)


Lucene 1.2 (index/search)
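
As a small illustration of how handle-based persistent identifiers are used, a handle has the form prefix/suffix and can be turned into a resolvable link through the public Handle proxy at hdl.handle.net. The helper names below are hypothetical, and the prefix `123456789` is only a placeholder, not a registered naming authority.

```python
def split_handle(handle):
    """Split 'prefix/suffix' at the first slash; the suffix may contain more."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle: {handle!r}")
    return prefix, suffix


def handle_url(handle, resolver="https://hdl.handle.net"):
    """Build a resolver link for a handle (validates the handle first)."""
    prefix, suffix = split_handle(handle)
    return f"{resolver}/{prefix}/{suffix}"


print(handle_url("123456789/42"))  # https://hdl.handle.net/123456789/42
```

Because the link goes through a resolver, the item's actual location can change without breaking citations that use the handle.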


Metadata standards


Dublin Core only (currently)


Descriptive metadata only



OAI-PMH v2.0 (Open Archives Initiative Protocol
for Metadata Harvesting)
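
To make the harvesting protocol concrete: an OAI-PMH 2.0 request is an ordinary HTTP request whose `verb` parameter selects one of six operations, and `oai_dc` is the metadata prefix for unqualified Dublin Core. The sketch below only builds the request URL; the base URL is a made-up example, and `oai_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request(base_url, verb, **args):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **args})
    return f"{base_url}?{query}"


# Hypothetical repository endpoint; harvest all records as Dublin Core.
url = oai_request(
    "https://repository.example.org/oai/request",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```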


Capabilities


Exports in XML format


Next version will have METS (Metadata Encoding
and Transmission Standard) for export
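
As a rough idea of what exporting descriptive Dublin Core metadata as XML can look like, the sketch below serializes a few fields with Python's standard library. The `record` wrapper element, the `to_dc_xml` helper, and the sample values are assumptions for illustration; this is not DSpace's actual export schema. Only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dc_xml(fields):
    """Serialize (element, value) pairs as Dublin Core elements."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dc_xml([
    ("title", "Sample report"),
    ("creator", "Doe, Jane"),
    ("date", "2005-03-07"),
])
print(xml_text)
```

Pairs are used instead of a dictionary because Dublin Core elements are repeatable, for example several `creator` entries on one item.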