All_Section_2.doc - Apache Tomcat


29 Nov 2012


SECTION 2: OVERALL DESCRIPTION (Describe what approach will be used to
solve the problem)


Be done with this on Wednesday

2.1 Testing Strategy
(add to glossary)



The Archival Information Packages (AIPs) generated from FDsys will have to be
unpackaged in order to retrieve the XML-based content metadata. Within the content
metadata, there is a METS binding file, named aip.xml, that describes the relationships
of data within the Archival Information Package. Additionally, the METS binding file
pinpoints the location of the package components. The package components are
randomly assigned to folders, and in order to connect them, scripts using an XML Java
library must be written. The scripts will parse the metadata in the XML files into a
human-readable form so that it can be ingested and the original digital object recovered
by using a software repository. See Figure 1 for an illustration of the process described
above.
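
As a rough illustration of the kind of script described above, the sketch below reads a METS binding file with the standard Java XML (JAXP/DOM) API and lists the component locations it records. It assumes that aip.xml points to its components through METS FLocat elements carrying an xlink:href attribute, which is the usual METS convention but has not been confirmed against an actual FDsys AIP. The class and method names are hypothetical.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: lists the component locations recorded in a METS binding file. */
final class MetsFileLocator {
    // Standard METS and XLink namespace URIs.
    static final String METS_NS = "http://www.loc.gov/METS/";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    /**
     * Returns the xlink:href value of every METS FLocat element, in document order.
     * Assumption: the binding file locates its components through FLocat elements.
     */
    static List<String> listComponentLocations(InputStream metsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-based lookups
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(metsXml);

            List<String> locations = new ArrayList<String>();
            NodeList flocats = doc.getElementsByTagNameNS(METS_NS, "FLocat");
            for (int i = 0; i < flocats.getLength(); i++) {
                Element flocat = (Element) flocats.item(i);
                String href = flocat.getAttributeNS(XLINK_NS, "href");
                if (!href.isEmpty()) {
                    locations.add(href);
                }
            }
            return locations;
        } catch (Exception e) {
            // Parsing and I/O failures are surfaced as unchecked exceptions in this sketch.
            throw new IllegalStateException("Could not read METS binding file", e);
        }
    }
}
```

A caller would open the unpacked aip.xml as an input stream and pass it to listComponentLocations; the returned paths could then be resolved against the folder into which the AIP was unpacked.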


Figure 1



2.2 User Characteristics


(add to glossary)



The intended user is anyone interested in reconstructing the archive. The user can
do so using only the content data and the available preservation resources, such as
Fedora Commons and DSpace. The user must have knowledge of XML schemas, more
specifically METS, MODS, and PREMIS, as well as knowledge of repository software
such as DSpace and Fedora Commons, including unpacking the software and following
the installation process. In addition, the user should have a strong understanding of
the XML Java library. These user prerequisites must be met in order to ensure the
successful reconstruction of the archive.


2.3 Testing Tools and Environment


(add to glossary)




Testing tools will consist mainly of two digital repository software packages, Fedora
Commons and DSpace. Fedora Commons is a digital repository that makes digital
management of information possible by providing the basis for software systems
through the use of abstractions of digital information. The Windows-based Fedora
3.4.1 will be used. The Fedora Commons prerequisites will be the following: Java SE
Development Kit (JDK) 6, Derby SQL as the database, and a Tomcat 6.0.20 server.
DSpace is also an open-source digital repository software that allows open sharing of
information. Furthermore, DSpace will be used to preserve all types of digital content,
such as text, images, moving images, MPEGs, and data sets. The Windows-based
DSpace 1.6.2 will be used. The following will be the prerequisite software for DSpace:
Java SDK 1.5 or later, PostgreSQL 8.x for Windows or Oracle 9 or later, Apache Ant
1.6.2 or later, Jakarta Tomcat 5.x or later, and Apache Maven 2.0.8 or later.



2.4 Testing Tools*



Fedora Commons uses the Derby SQL database server.

DSpace uses the PostgreSQL database server.


2.5 Content Data within Archival Information Package

(added to glossary)

The FDsys AIPs consist of digital objects and the associated technical,
descriptive, and preservation metadata pertaining to the digital object(s). AIPs define
how digital objects and their associated metadata are packaged using the METS standard,
more specifically through a binding METS file named aip.xml. The aip.xml file describes
the relationships between digital object(s) and metadata. Additionally, the AIPs have a
MODS file, named mods.xml, and a PREMIS file, named premis.xml, which hold
metadata about the digital objects. Each component of the AIP illustrated in Figure 2
will be discussed in detail.
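
Since the package components may sit in different folders, a first check on an unpacked AIP could be to confirm that the three metadata files named above are present somewhere under the unpack directory. The following is a minimal sketch of such a check using only java.io.File. The class and method names are hypothetical, and it assumes that the files keep the exact names aip.xml, mods.xml, and premis.xml after unpacking.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: locates the expected metadata files in an unpacked AIP. */
final class AipMetadataFinder {
    // File names taken from the description of the AIP contents.
    static final String[] EXPECTED_NAMES = {"aip.xml", "mods.xml", "premis.xml"};

    /** Maps each expected file name to the first matching file found under the root. */
    static Map<String, File> find(File unpackedRoot) {
        Map<String, File> found = new LinkedHashMap<String, File>();
        collect(unpackedRoot, found);
        return found;
    }

    /** Returns the expected file names that were not found. */
    static List<String> missing(Map<String, File> found) {
        List<String> missing = new ArrayList<String>();
        for (String name : EXPECTED_NAMES) {
            if (!found.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    // Walks the directory tree recursively, recording the first hit for each name.
    private static void collect(File dir, Map<String, File> found) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                collect(child, found);
            } else {
                for (String name : EXPECTED_NAMES) {
                    if (name.equals(child.getName()) && !found.containsKey(name)) {
                        found.put(name, child);
                    }
                }
            }
        }
    }
}
```

An empty list from missing(find(root)) would indicate that all three files were located; a non-empty list names the files that still need to be accounted for before parsing begins.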



Figure 2




2.6 Database Considerations

(added to glossary)




The database used will depend on the repository software. Fedora Commons uses
Derby SQL Database 10.5.3. Additionally, Fedora Commons supports the following four
external databases: MySQL, Oracle, PostgreSQL, and Microsoft SQL Server. DSpace
uses PostgreSQL 8.x for Windows or Oracle 9 or later as the database server.


2.7 Design and Implementation Constraints*



This testing project must use Open Archival Information System (OAIS)
compliant open-source repository software, such as Fedora Commons and DSpace,
with AIPs exported from FDsys. Scripts will be written to


2.8 Assumptions and Dependencies


(add to glossary)




The AIP testing is heavily dependent on GPO providing the set of data from its
archival storage. The assumption is that AIPs in FDsys are truly independent of
hardware and software. Furthermore, it is assumed that reconstruction of an archive can
be achieved using only the content data within the AIPs. It is assumed that the scripts
that will be written can only handle AIPs exported from FDsys. Additionally, it is
assumed that information extracted by the scripts can be ingested into OAIS-compliant
repository software such as Fedora Commons and DSpace.