Preservation and Access

HATHI TRUST


A Shared Digital Repository

Digital Repositories for Preservation and Access

Digital Directions 2013

Jeremy York

July 22, 2013

Unless otherwise noted, these slides and their contents are licensed under a Creative Commons Attribution Unported License.

Digital repositories


Primary mission to preserve content


Performs actions to this end

Reasons to preserve content


For access


Guard against threats to content


Digitization is an accepted method of preservation reformatting


Digital content deteriorates and is fragile

Reasons to provide access


Meet needs of designated community


Check on integrity of content


Content that is accessible is more likely to be valued and preserved in the future

Reasons access might not be offered


Copyright


Privacy


Licensing


Needs of user community


Content available elsewhere


Technical limitations


Networking and storage requirements


A number of models


Full user access to preserved digital objects


No end-user access to digital objects


Delayed or triggered user access to digital objects


Partial access to digital objects


Requirements to preserve content


OAIS


“An OAIS is an Archive, consisting of an organization...of people and systems that has accepted the responsibility to preserve information and make it available for a Designated Community.” [does not imply unrestricted access]

OAIS


Support information model


Define target of preservation (content data and representation information)


Define metadata needed to preserve, identify, contextualize information (PDI)


Fulfill responsibilities


Accept information from Producers


Obtain control sufficient to preserve


Ensure understandable to designated community


Ensure preservation


Make available to designated community with information supporting authenticity

Ensure preservation


Some strategies:


Transformation


Validation


Checks on integrity


Replication


Choice of formats


Migration

TRAC


Starts with “a mission to provide reliable, long-term access to managed digital resources to its designated community, now and into the future”


Encompasses


Organizational Infrastructure


Digital Object Management


Technical Infrastructure

TRAC (2)


Borrows vocabulary from OAIS


Adapts ideas for applying criteria from nestor and Digital Curation Centre


Documentation (evidence)


Transparency


Adequacy


Measurability

Preserve Content

[Diagram relating OAIS and TRAC concepts: Transparency, Documentation, Adequacy, Measurability, Provenance, Context, Reference, Fixity, Access Rights, Designated Community, Mission, Organizational Infrastructure, Digital Object Management, Technical Infrastructure, Representation Information, Content Data, Preservation Actions, Authenticity, Reliability, Integrity]

Where does access come in


Some level of access is necessary


Management, integrity


What is preserved may not be what is most useful to the end user


Implications across the repository

Content formats


Can the content you are preserving be delivered over the Web?


Will you be storing derivative files?


Is some kind of transformation needed?


Do the files offer consistent functionality?


Implications for scale of repository, access systems, changes to services


In HathiTrust:


Limited to 3 formats, largely uniform in technical characteristics


ITU G4 TIFF


JPEG2000


Unicode (with and without coordinates)

Storage of information about content


Is information about object adequately available for both preservation and access?


Structural information


Preservation information with implications for interface


HathiTrust uses METS as a wrapper


Available for preservation and access
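
To illustrate how a METS wrapper can serve both preservation and access, here is a minimal sketch that reads a file inventory from a METS file section. The element and attribute names (`fileSec`, `fileGrp`, `file`, `FLocat`, `CHECKSUM`, `MIMETYPE`) come from the METS schema; the sample document, its file names, checksum values, and the `list_files` helper are invented for illustration and are not taken from an actual HathiTrust METS file. The sample is a trimmed fragment, not a complete valid METS document.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the METS and XLink standards.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical, heavily trimmed METS fragment (illustrative values only).
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="image">
      <mets:file ID="IMG00000001" MIMETYPE="image/tiff"
                 CHECKSUM="0123abcd" CHECKSUMTYPE="MD5">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"
                     xlink:href="00000001.tif"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="ocr">
      <mets:file ID="OCR00000001" MIMETYPE="text/plain"
                 CHECKSUM="4567ef01" CHECKSUMTYPE="MD5">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"
                     xlink:href="00000001.txt"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def list_files(mets_xml):
    """Return one dict per mets:file with its use, type, checksum, and location."""
    root = ET.fromstring(mets_xml)
    files = []
    for group in root.findall("mets:fileSec/mets:fileGrp", NS):
        for entry in group.findall("mets:file", NS):
            locat = entry.find("mets:FLocat", NS)
            href = None
            if locat is not None:
                href = locat.get("{http://www.w3.org/1999/xlink}href")
            files.append({
                "use": group.get("USE"),
                "mimetype": entry.get("MIMETYPE"),
                "checksum": entry.get("CHECKSUM"),
                "href": href,
            })
    return files


for record in list_files(SAMPLE_METS):
    print(record)
```

The same inventory can feed a preservation task (compare each recorded checksum with the stored file) and an access task (decide which files an interface should display).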

Content Package

[Diagram labels: images, text, Source METS, HT METS, Zip]

Architecture

[Diagram labels: images, text, bib data, Source METS, HT METS]

../uc1/pairtree_root/b3/54/34/86/b34543486

b34543486.zip

b34543486.mets.xml
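
The storage path above uses a Pairtree-style directory layout. The following is a minimal sketch of the basic idea, assuming the simple rule that an identifier is split into two-character directory segments and that the object sits in a final directory named for the identifier. The function name, the namespace argument, and the sample identifier are hypothetical, and the character-escaping rules of the full Pairtree specification are left out.

```python
from pathlib import PurePosixPath


def pairtree_path(namespace, identifier):
    """Build a Pairtree-style path: namespace/pairtree_root/<pairs>/<identifier>."""
    # Split the identifier into two-character segments; a trailing odd
    # character becomes a one-character segment.
    pairs = [identifier[i:i + 2] for i in range(0, len(identifier), 2)]
    return PurePosixPath(namespace, "pairtree_root", *pairs, identifier)


# Hypothetical identifier chosen for illustration only.
print(pairtree_path("uc1", "b3543486"))
```

Spreading objects across short directory names keeps any single directory from growing very large, which helps when a file system has to hold millions of objects.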

Storage


Does the storage system support needs for ingest and access?


In HathiTrust:


Need to have fast access to repository systems to support services

Security


Data Integrity


Checksum validation, digital object provenance


Physical security


Biometric door systems, locked racks


Network security


Firewalling, vulnerability scanning


Application security


Developer best practices, input validation


Access control…
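
Checksum validation, listed under data integrity above, can be sketched as follows. This is an illustration under stated assumptions rather than a description of HathiTrust's implementation: the function names are hypothetical, and SHA-256 is only a default chosen for the example (a repository would use whichever algorithm it recorded at ingest).

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded for it."""
    return file_checksum(path, algorithm) == recorded_checksum.lower()
```

Running such a check on a schedule, and after every copy or migration, is one way to detect silent corruption before the last good replica is lost.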

Differential access to content


Rights database


Ensures appropriate access


Holdings database


Facilitates lawful uses of materials

Authentication/Authorization


Mechanisms to enable differential access, ensure security and appropriate use
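
A rights database, a holdings database, and authentication can combine into a single access decision. The sketch below shows one possible shape for that logic. Everything in it is hypothetical: the `User` class, the rights values `open` and `restricted`, the access levels `full`, `limited`, and `none`, and the policy itself are invented for illustration and do not describe HathiTrust's actual rules.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class User:
    authenticated: bool
    institution: Optional[str] = None


def access_level(item_id: str, user: User,
                 rights_db: Dict[str, str],
                 holdings_db: Dict[str, Set[str]]) -> str:
    """Return 'full', 'limited', or 'none' for this user and item."""
    # Items with no rights record are treated as inaccessible (fail closed).
    rights = rights_db.get(item_id, "unknown")
    if rights == "open":
        return "full"
    # Restricted items: allow limited access only to authenticated users
    # whose institution is recorded as holding the item.
    held_by = holdings_db.get(item_id, set())
    if rights == "restricted" and user.authenticated and user.institution in held_by:
        return "limited"
    return "none"


# Illustrative data only.
rights_db = {"item-1": "open", "item-2": "restricted"}
holdings_db = {"item-2": {"Example University"}}
member = User(authenticated=True, institution="Example University")
guest = User(authenticated=False)

print(access_level("item-2", member, rights_db, holdings_db))
print(access_level("item-2", guest, rights_db, holdings_db))
```

Keeping the decision in one function that defaults to no access makes the policy easier to audit than checks scattered across user interfaces and APIs.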

User services


Bibliographic and full-text search indexes


Collection-building capabilities


User interfaces

APIs and Datasets


Data API


Bibliographic API


OAI



Hathifiles



Datasets


More


Quality


User Support


Correction

Provide Access

[Diagram labels: Content Package, Content Formats, Architecture, Storage, Authentication, Security, Authorization, Differential Access, Services / User Interfaces, Lawful Uses, APIs and Datasets, Copyright/Agreements, User Support, Indexes, Correction, Information Quality]

Preservation and Access

[Diagram repeating the labels of the Preserve Content and Provide Access diagrams, together with OAIS, TRAC, Preservation, and Access]

Thank you!

How to find out more


About: http://www.hathitrust.org/about


Twitter: http://twitter.com/hathitrust


Facebook: http://www.facebook.com/hathitrust


Monthly newsletter: http://www.hathitrust.org/updates


RSS: http://www.hathitrust.org/updates_rss


Contact us: feedback@issues.hathitrust.org


Blogs: http://www.hathitrust.org/blogs


Large-scale Search


Perspectives from HathiTrust