Why DSpace Digital Library


31 Jan 2013



Digital Libraries: Study into the features of the DSpace Suite



Devika P. Madalli

Documentation Research and Training Centre

Indian Statistical Institute

Bangalore 560059






Introduction

Digital libraries encompass a wide range of information-service work, such as:


Organization of digital information


Information retrieval


User interface


Archiving and preservation


Services and social issues


Evaluation and applications to particular areas


Desirable Features of DL Software


Structured


Accessible


Searchable


Extensible


Massive


Heterogeneous


Persistent



A DL's operation should be examined under…


Architectural design


Modular and Open


Backend Database


scalable, robust, data formats


Network capabilities


web-based and seamless operations, persistent IDs, security and authentication


Metadata and Interoperability


compatible with world standards such as Dublin Core and OAI-PMH


Technical Issues


Open source software vs. commercial OS


Hardware and peripheral requirements


Network Components


Standards


data formats, metadata, network, access, interoperability, encoding


Approaches to Building DL


Digitization


retro-conversion of non-digital resources to digital


Digitally born resources


involves inter-conversion to standard formats and storage


Why DSpace Digital Library


An open source technology platform that can be customized and whose capabilities can be extended


A service model for open access and/or digital archiving for perpetual access


A platform for building an Institutional Repository whose collections are searchable and retrievable on the Web


Makes institution-based scholarly material available in digital formats; the collection is open and interoperable


DSpace is


Architecture and System Requirements

The DSpace system is organized into three layers:


The Storage Layer: responsible for physical storage of metadata and content


The Business Layer: deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow


The Application Layer: containing components that communicate with the networked world outside of the individual DSpace installation, for example the Web user interface and the modules for the metadata harvesting service


Features of a near ideal DL


Low cost, including all hardware and software components


Technically simple to install and manage


Robust


Scalable


Open and interoperable


Modular


User Friendly


Multi-user (including both searching and maintenance)


Multimedia digital object enabled


Platform independent (including both client and server components), interoperable

DSpace is a joint project of MIT Libraries and Hewlett-Packard Labs

What is DSpace?


Digital Object management system


Create, search and retrieve digital objects


Facilitate preservation of digital objects


An open source software


Allows open access and digital archiving


Allows building Institutional Repositories


H/W and S/W requirements


UNIX recommended (a Java-based program should run on anything)


Open source, built on the Apache web server and Tomcat servlet engine


Uses a PostgreSQL or Oracle relational database

What can DSpace do?


Captures


Digital content in any format, directly from creators (e.g. researchers, authors)



Describes


Descriptive, technical, rights metadata


Persistent identifiers


OAI-PMH version 2.0 compliant


Allows metadata creation

Possible types of Content


Preprints, articles


Postprints


Technical Reports


Conference Papers


Theses/Dissertations


Datasets


e.g. statistical, geospatial, scientific


Images


visual, scientific, etc.


Audio files


Video files


Digitized library collections

Formats of Content

File Formats


Supported: the repository administrator can tell submitters which file formats the organization commits to supporting in the future


Known: the format is recognized, but full support cannot be guaranteed


Unsupported: the format is not recognized; such files are listed as "application/octet-stream" (Unknown)
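
The three support levels can be pictured as a small lookup table. The following is a minimal sketch, assuming a hypothetical in-memory registry (`FORMAT_REGISTRY`) keyed by file extension; the entries, the `classify` function, and the levels assigned to each format are illustrative only and do not reproduce DSpace's own format registry.

```python
import os
from enum import Enum


class SupportLevel(Enum):
    SUPPORTED = "supported"
    KNOWN = "known"
    UNSUPPORTED = "unsupported"


# Hypothetical registry: the level given to each format is a local policy
# decision made by the repository administrator.
FORMAT_REGISTRY = {
    ".pdf": ("application/pdf", SupportLevel.SUPPORTED),
    ".txt": ("text/plain", SupportLevel.SUPPORTED),
    ".doc": ("application/msword", SupportLevel.KNOWN),
}

# Fallback used for anything the registry does not recognize.
UNKNOWN_FORMAT = ("application/octet-stream", SupportLevel.UNSUPPORTED)


def classify(filename):
    """Return (mime_type, support_level) for a submitted file name."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMAT_REGISTRY.get(extension, UNKNOWN_FORMAT)


print(classify("thesis.pdf"))
print(classify("notes.doc"))
print(classify("capture.raw42"))
```

An unrecognized extension falls through to the "application/octet-stream" entry, which mirrors how unrecognized formats are listed as Unknown.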



Information Model


Communities


Departments, Labs, Research Centers, Schools…


Collections


Items


Files (bitstreams)



Multiple formats, same content


Complex objects


multiple files
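
The hierarchy above (communities contain collections, collections contain items, items contain files) can be sketched with plain data classes. This is a simplified illustration, not DSpace's actual object model; the class fields, the `count_bitstreams` helper, and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    """A single file belonging to an item."""
    name: str
    mime_type: str


@dataclass
class Item:
    title: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)

    def count_bitstreams(self) -> int:
        """Total number of files across all collections and items."""
        return sum(
            len(item.bitstreams)
            for collection in self.collections
            for item in collection.items
        )


# One item held in two formats: multiple formats, same content.
report = Item(
    "Annual report",
    [
        Bitstream("report.pdf", "application/pdf"),
        Bitstream("report.txt", "text/plain"),
    ],
)
community = Community("DRTC", [Collection("Technical Reports", [report])])
print(community.count_bitstreams())
```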


Intellectual Property


Click-through license during submission


Grants DSpace a non-exclusive right to acquire, manage, preserve, and distribute the item


Does not grant DSpace copyright


Copy of license stored with item

Goodies


Modular architecture, well-defined APIs


100% open source


Programmed in Java


RDBMS and SQL for metadata


CNRI “handles” for persistent identifiers


OpenURL linking


OAI-PMH for exposing metadata

Backend Technology


Apache, Tomcat, OpenSSL/mod_ssl


Java


PostgreSQL/Oracle


CNRI Handle System 5 (persistent IDs)



Lucene Search Engine
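
A handle has the form prefix/suffix and can be turned into a stable link through the public Handle proxy at hdl.handle.net. The sketch below only builds that link; the `handle_url` function and the handle `123456789/42` are made up for illustration, and no resolution request is sent.

```python
HANDLE_PROXY = "https://hdl.handle.net/"


def handle_url(handle):
    """Build a resolvable URL for a handle of the form 'prefix/suffix'."""
    prefix, separator, suffix = handle.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError("expected a handle of the form 'prefix/suffix'")
    # Simplification: the suffix is appended as-is, without URL quoting.
    return HANDLE_PROXY + handle


# Hypothetical handle, used only to show the shape of the result.
print(handle_url("123456789/42"))
```

Because the link goes through the proxy rather than a server address, it can stay the same even if the repository moves to another host.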


Standards


Dublin Core only


Descriptive metadata only



OAI-PMH v2.0 (Open Archives Initiative Protocol for Metadata Harvesting)



UNICODE Compliant
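
An OAI-PMH harvester talks to a repository through plain HTTP requests, each carrying a `verb` parameter. The sketch below only constructs such a request URL; the base URL `http://repository.example.org/oai/request` and the `oai_request_url` helper are hypothetical, and the real endpoint depends on the installation.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, params=None):
    """Build an OAI-PMH request URL; nothing is sent over the network."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    query = {"verb": verb}
    query.update(params or {})
    return base_url + "?" + urlencode(query)


# Hypothetical endpoint; oai_dc is the unqualified Dublin Core format
# that every OAI-PMH repository is required to offer.
url = oai_request_url(
    "http://repository.example.org/oai/request",
    "ListRecords",
    {"metadataPrefix": "oai_dc"},
)
print(url)
```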


Capabilities


Exports in XML format


Supports crosswalks through OAI-PMH


DC (Dublin Core)



Qualified DC


METS (Metadata Encoding and Transmission Standard)


MODS (Metadata Object Description Schema; sibling of MARCXML)



Can be extended to any Metadata Schema
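
To make the XML export of Dublin Core concrete, the sketch below serializes a few DC elements with Python's standard library. The wrapper element `record` and the `to_dc_xml` function are assumptions made for this example and do not reproduce DSpace's actual export format; only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dc_xml(fields):
    """Serialize {element_name: [values]} as XML with dc:* child elements."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
            element.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dc_xml(
    {
        "title": ["Digital Libraries: Study into the features of the DSpace Suite"],
        "creator": ["Devika P. Madalli"],
    }
)
print(xml_text)
```

Each Dublin Core element may repeat, which is why the values are given as lists.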


Customization


Screens (Manakin)



E-mails


Any language interface


Metadata


Input forms


Display of results


Fields to be Indexed


Access restrictions


License (in addition to Creative Commons)


Advanced Features


Grid Compliant (Storage)



LDAP authentication


Usage statistics generation


SFX Server integration


RSS (Really Simple Syndication)



Item Recommendation to a friend


Use of Thesaurus (though not OWL/SKOS/RDF)



Full-text indexing of PDF and MS Word files
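
Full-text indexing rests on an inverted index that maps each term to the documents containing it. The toy sketch below shows the idea only; it is not Lucene, it assumes the text has already been extracted from the PDF or Word files, and the function names and sample texts are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample texts standing in for extracted file contents.
documents = {
    "item-1": "DSpace captures digital content",
    "item-2": "Digital preservation of theses",
}
index = build_index(documents)
print(search(index, "digital content"))
```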

Important Sites


http://www.dspace.org


http://www.sourceforge.net/projects/dspace


http://wiki.dspace.org


http://mailman.mit.edu/mailman/listinfo/dspace-general


http://lists.sourceforge.net/lists/listinfo/dspace-tech


http://lists.sourceforge.net/lists/listinfo/dspace-devel

DRTC Sites


https://drtc.isibang.ac.in (Librarians' Digital Library)



http://drtc.isibang.ac.in/dlrg (Discussion Forum)



http://drtc.isibang.ac.in/sdl (Harvester in LIS)



http://drtc.isibang.ac.in


http://drtc.isibang.ac.in/blog

Questions?

Thank You

devika@drtc.isibang.ac.in