The Digital Object Architecture

A presentation at

Louisiana State University

Baton Rouge, Louisiana

August 26, 2005

Robert E. Kahn

Corporation for National Research Initiatives

Reston, Virginia

Selected Major Network Issues


How to get affordable broadband access
to homes, businesses, government, etc.


How to add more dimensionality to the
mobile wireless experience


How to take advantage of many
devices/appliances being on the Internet


Protecting critical elements (including
infrastructure elements such as DNS)


Stifling SPAM; detecting and fighting
Viruses


Selected Major Issues (cont’d)


Identity Management (w/o certificates)


Trust in the security mechanisms


Managing Privacy


How to enable more widespread sharing
of important information on the net


Trusting your information to the Net


Managing your information on the Net over
very long periods of time


Infrastructure Development


What is so hard about it?


Making it scalable over platforms, size and time


Achieving Critical Mass


Getting Buy-in


Pleasing many essential participants


Displacing prior capabilities


Structuring matters to deal with concerns about empire
building


It’s a lot easier to create brand new capabilities
than to affect existing means of operation


Infrastructure Creation is a
Subtractive Process


Infrastructure reduces a common, shared
capability to its basic and essential attributes


These attributes are not always recognized
or understood up front


Upon further scrutiny, capabilities are usually
deleted from a well-conceived architecture
over time


Consensus develops when no more can be
removed without disabling the infrastructure


What is the Problem?


Managing information in the Net over very long
periods of time


e.g. centuries or more


Dealing with very large amounts of information in
the Net over time


When information, its location(s) and even the
underlying systems may change dramatically
over time


Respecting and protecting rights, interests and
value


A Meta-level Architecture


Allows for arbitrary types of information
systems


Allows for dynamic formatting and data
typing


Can accommodate interoperability
between multiple different information
systems


Allows metadata schema to be identified
and typed

Digital Object Architecture: Motivation


To reformulate the Internet architecture around the
notion of uniquely identifiable data structures


Enabling existing and new types of information to be
reliably managed and accessed in the Internet
environment
over long periods of time


Providing mechanisms to stimulate innovation, the
creation of dynamic new forms of expression and to
manifest older forms


While supporting intellectual property protection and
fine-grained access control, and enabling well-formed
business practices to emerge

Objective of the Framework


[Diagram: Organizing Heterogeneous Systems. Labels: Internet objective, Best-effort Packet Delivery, Seamless Interoperability, Heterogeneous Networks, Heterogeneous Information Systems]

Digital Object Architecture: Technical Components


Digital Objects (DOs)


Structured data, independent of the platform on which it was
created


Consisting of “elements” of the form <type,value>


One of which is its unique, persistent identifier


Resolution of Unique Identifiers


Maps an identifier into “state information” about the DO


Handle System is a general purpose resolution system


Repositories from which DOs may be accessed


And into which they may be deposited


Metadata Registries


Repositories that contain general information about DOs


Supports multiple metadata schemes


Can map queries into unique DO specifications (via handles)


What is a Digital Object?


Defined data structure, machine independent


Consisting of a set of elements


Each of the form <type,value>


One of which is the unique identifier


Identifiers are known as “Handles”


Format is “prefix/suffix”


Prefix is unique to a naming authority


Suffix can be any string of bits assigned by that authority


Data structure can be parsed; types can be resolved
within the architecture


Associated properties record and transaction record
containing metadata and usage information
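
The structure described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the element type name "identifier", and the sample content are hypothetical and do not come from any Digital Object specification.

```python
from dataclasses import dataclass
from typing import List

ID_TYPE = "identifier"  # hypothetical type name for the handle element


@dataclass(frozen=True)
class Element:
    """One <type,value> element of a digital object."""
    type: str
    value: bytes


@dataclass
class DigitalObject:
    """A set of typed elements, exactly one of which is the identifier."""
    elements: List[Element]

    def __post_init__(self) -> None:
        ids = [e for e in self.elements if e.type == ID_TYPE]
        if len(ids) != 1:
            raise ValueError("exactly one identifier element is required")

    @property
    def handle(self) -> str:
        """The unique identifier, decoded from its element."""
        for element in self.elements:
            if element.type == ID_TYPE:
                return element.value.decode("utf-8")
        raise ValueError("identifier element missing")

    def values(self, type_name: str) -> List[bytes]:
        """All values stored under one element type."""
        return [e.value for e in self.elements if e.type == type_name]


memo = DigitalObject([
    Element(ID_TYPE, b"2304.1/memo123"),
    Element("text/plain", b"Staff meeting moved to Friday."),
])
print(memo.handle)  # 2304.1/memo123
```

Values are kept as bytes here so that the structure stays independent of any one platform's string encoding, which is one possible reading of "machine independent".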

Interoperability & Federated Repositories

Create a cohesive interoperable collection
of repository-based systems


Initially, perhaps, around a core set of
projects, content, applications and/or
organizations as in ADL


Demonstrate interoperability between
different repository collections


Develop procedures to ensure continued
accessibility to key archival information

Repository Notion

[Diagram: a Client reaches a Digital Object Repository through a logical external interface, the Repository Access Protocol (RAP); the repository may be any hardware and software configuration]



Provides distributed Digital Object storage.



May itself be a Digital Object.



Provides a dynamic acquisition and
execution mechanism for the mobile code that
implements the content type operations.



Exclusively accessed using the Repository
Access Protocol (RAP).


[Diagram labels: Disseminate, Deposit]

Nesting of Repository Functionality

[Diagram: nested levels labeled Core, Structure, Content, and Aggregation & De-aggregation]

Core Interface must be present at each level

Other levels could be separately defined later


Repositories & Digital Objects

[Diagram labels: REPOSITORY, IPv6]


Each Digital Object has its own unique & persistent ID


Content Providers want to assign IDs


Could be upwards of trillions of DOs per Repository


Objects may be replicated in multiple Repositories

Handle System


Distributed Identifier Service on the Internet



First General Purpose Resolution system



Can be used to locate repositories that contain digital objects
given their handles, and more!



Other indirect references



Public Keys, Authentication information for DOs



Accommodates interoperability between many different information
systems; for example


DNS was demonstrated on the Handle System in preparation for Y2K


Can support ENUM, RFID, and more

Attributes of the Handle System


The basic Architecture of the Handle
System is flat, scalable, and extensible


Logically central, but physically
decentralized


Supports Local Handle Services, if desired


Handle resolutions return entire “Handle
Records” or portions thereof


Handle Records are also


digital objects


signed by the servers


doubly certificated by the system
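
Returning a whole Handle Record, or only portions of it, can be illustrated with a small in-memory table. This is a sketch under assumptions, not the Handle System protocol: the class and method names, the entry type names, and the sample values are all made up for the example, and signing and certification are left out.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A handle record is modeled here as a list of (type, value) entries.
HandleRecord = List[Tuple[str, str]]


class HandleTable:
    """Hypothetical in-memory stand-in for a handle service."""

    def __init__(self) -> None:
        self._records: Dict[str, HandleRecord] = {}

    def register(self, handle: str, record: HandleRecord) -> None:
        self._records[handle] = list(record)

    def resolve(self, handle: str,
                types: Optional[Iterable[str]] = None) -> HandleRecord:
        """Return the entire record, or only the entries of the given types."""
        record = self._records[handle]  # raises KeyError for unknown handles
        if types is None:
            return list(record)
        wanted = set(types)
        return [(t, v) for t, v in record if t in wanted]


table = HandleTable()
table.register("2304.1/memo123", [
    ("REPOSITORY", "repo.example.org"),         # illustrative entry
    ("URL", "http://example.org/memo123"),      # illustrative entry
])
whole = table.resolve("2304.1/memo123")
portion = table.resolve("2304.1/memo123", types=["URL"])
```

Asking for a portion keeps responses small when a client needs only one kind of state information, such as the repository that holds the object.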


Resolution Mechanism

[Diagram: a Handle is sent to the Handle System <www.handle.net>, which spans multiple sites and multiple servers, and a Handle Record is returned]



System is non-nodal



Scalable & Distributed



Supports global (and local) resolution



With backup for reliability, mirroring for efficiency

Type Resolution


Types are resolvable in the Handle
System


Types may be created dynamically


Types may be locally named, mapped into
bit strings without semantics


Primary prefix zero “0” is used for system
identifiers


0.type/<type> is the system handle for
type


Other handles may cross reference this
handle (e.g. for international use)
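
The naming rule for types, and the idea of one handle cross-referencing a system type handle, might be sketched as follows. The record layout, the "see" field, and the locally named alias are assumptions made for this illustration; only the 0.type/<type> form comes from the slide.

```python
from typing import Dict, Tuple

SYSTEM_TYPE_PREFIX = "0.type"


def type_handle(type_name: str) -> str:
    """Build the system handle for a type, in the form 0.type/<type>."""
    if not type_name:
        raise ValueError("type name must not be empty")
    return f"{SYSTEM_TYPE_PREFIX}/{type_name}"


# Made-up records: one system type and one locally named alias that
# cross-references it through a hypothetical "see" field.
RECORDS: Dict[str, Dict[str, str]] = {
    "0.type/URL": {"description": "uniform resource locator"},
    "2304.1/adresse-web": {"see": "0.type/URL"},
}


def resolve_type(handle: str, records: Dict[str, Dict[str, str]],
                 max_hops: int = 4) -> Tuple[str, Dict[str, str]]:
    """Follow cross references until a record without "see" is reached."""
    for _ in range(max_hops):
        record = records[handle]
        if "see" not in record:
            return handle, record
        handle = record["see"]
    raise LookupError("too many cross references")


print(type_handle("URL"))  # 0.type/URL
```

The hop limit guards against reference loops, which a dynamically extensible type space would otherwise permit.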


Handle Format

[Diagram: the handle 2304.40/1234, where the prefix 2304.40 identifies the prefix authority and the suffix 1234 is the item ID (any format)]

In use, a Handle is an opaque string.


Other examples of Handles


2304/general info

2304/1

2304.HQ/staff

2304.1/memo123

2304.22.Pub/2004
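
Although a handle is used as an opaque string, the prefix/suffix format can be taken apart when needed. The sketch below assumes that the first slash separates prefix from suffix and that dots in the prefix separate nested authorities, as the examples above suggest; the names Handle, parse_handle, and authority_path are hypothetical.

```python
from typing import List, NamedTuple


class Handle(NamedTuple):
    prefix: str
    suffix: str

    @property
    def authority_path(self) -> List[str]:
        """Dot-separated segments of the prefix, outermost first."""
        return self.prefix.split(".")


def parse_handle(text: str) -> Handle:
    """Split a handle at its first slash into prefix and suffix."""
    prefix, separator, suffix = text.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"not in prefix/suffix form: {text!r}")
    return Handle(prefix, suffix)


handle = parse_handle("2304.40/1234")
print(handle.prefix, handle.suffix)  # 2304.40 1234
print(handle.authority_path)         # ['2304', '40']
```

Splitting only at the first slash leaves the suffix free to take any format, including one that contains further slashes.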


Direct Access and Proxies

[Diagram: Direct Access, or Indirect Access through one or more Proxy Servers]

Redirection of Handle Requests

[Diagram labels: Direct Access; one or more Local Handle Services; General Registry of all Naming Authorities; Redirection Information]

[Diagram labels: Literary, Music, Video, Financial, Grid, ENUM, RFID; “Simple Lookup”, URL, IP addresses, “Unfederated Databases”]

Digital Object Overview

[Diagram: a Digital Object with a Unique Identifier (Handle) and Content Type(s); Access Requests and Disseminations of Information. Example labels: “Hamlet”, “It’s a Book”, “Get Page(2)”]


Digital objects are uniquely identified in a given identifier space.


Data elements reference sequences of typed data.


A Digital Object can have zero or more Content Types to reflect
intended uses by its creator.


Content Type Operations are accessible as DOs

[Diagram labels: “Hamlet”, Data Element, Data Element, Content Type Operations]

The Digital Object Identifier (DOI®)


Used by the International DOI
Foundation (IDF) to reference high-quality
materials of publishers (and
other owners of IP)


Major Commercial User of the Handle
System at present with approximately
12 Million handles


Usage growing at about 4 Million per
year


DNS domain names, by comparison,
are relatively flat with perhaps 40%
churn per year.

Setting up a Local Handle Service...


Download the software from http://www.handle.net


Follow the instructions in the installation script.


Send your “site bundle”, containing the IP address of
your server and your administrator information, to the
Global Handle Registry® (GHR) administrator


Site is under re-development to accommodate
widespread use via automated means


Experimental Repository software also available on-line

Managing Rights & Interests


Not just about copyright


Terms and Conditions (T&Cs) for use may be
contained within each DO; also information
about intrinsic value, such as monetary value


T&Cs are intended to indicate clearly what one
can and/or cannot do with a given DO, where
such clarity is intended by the owner of the DO


Not an enforcement means, although it may be
used by an enforcement system


Mobile programs that are Digital Objects may
apply such terms to themselves and to any
digital objects they contain

Handle-DNS Integration


Development Environment


C/C++, Linux/Windows


Additional Modules


DNS Interface integrated with handle server


Cache/Preload Module


Database Connection Pools


C-Version Handle-DNS Admin Toolkit


Performance Improvements


Exception Processing


Memory Leak Protection


Thread Pool Management

Design & Implementation


Simple Handle Server Workflow (C-Version)

[Diagram labels: Client, Handle Requests, Handle Server, Listener, Thread Pool, Message Processor, Storage Management Interface, Database Connection Pool, DB]
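
One way to read the workflow labels is that a listener accepts handle requests and passes each one to a pooled worker, which runs the message processor against storage. The sketch below follows that reading in Python, although the server on the slide is written in C; the function names, the reply strings, and the dictionary standing in for the storage layer are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Stand-in for the storage management interface and database.
STORE: Dict[str, str] = {"2304.1/memo123": "http://example.org/memo123"}


def process_message(request: str) -> str:
    """Message processor: look the requested handle up in storage."""
    value = STORE.get(request)
    return f"OK {value}" if value is not None else "NOT_FOUND"


def serve(requests: List[str], workers: int = 4) -> List[str]:
    """Listener stand-in: give each request to a thread from the pool.

    Replies are returned in the same order as the requests.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_message, requests))


replies = serve(["2304.1/memo123", "2304/unknown"])
print(replies)  # ['OK http://example.org/memo123', 'NOT_FOUND']
```

A fixed pool bounds the number of threads the server creates under load, which is the usual reason for pooling threads instead of spawning one per request.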

External Protocol Converter







[Diagram labels: DNS Protocol, DNS Protocol Converter, Handle Protocol, ports 53, 8000, and 2641, Handle Process Module, Handle Server; annotation: Latency]

Plug & Play Interfaces



Integrate DNS Interface with Handle Server




[Diagram labels: DNS Protocol, DNS Message Processor, Handle Protocol, ports 53, 8000, and 2641, Handle Message Processor, Handle Server]

Cache & Storage Management


Preload (Cache) Module


Preload Handle Records from Database into RAM


Reduce Database Access Times


Improve Throughput of Handle Server


Storage Management API


User Transparent


RAM or Database


Combination of RAM and Database


Multiple Database Interfaces


MySQL, PostgreSQL, etc.


Features of Cache Module


Efficient Query Performance


STL RBTree, Hash Table


Configurable size of RAM for each Handle Record,
or total records
Storage Management API

[Diagram labels: Operations (Create, Modify, Delete), Storage Management Interface, RAM, Periodic Update, Data Base]
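
The combination of a RAM cache, preloading, and a periodic database update can be sketched as a small write-back store. This is an illustration under assumptions and not the toolkit's API: the class and method names are hypothetical, Python replaces the C/C++ of the slides, and SQLite (through the standard sqlite3 module) stands in for MySQL or PostgreSQL.

```python
import sqlite3
from typing import Dict, Optional, Set


class HandleStore:
    """Serves handle records from RAM and writes changes back on flush()."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self._db = db
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS handles "
            "(handle TEXT PRIMARY KEY, record TEXT NOT NULL)")
        # Preload: copy every stored record into RAM once, at start-up.
        self._ram: Dict[str, str] = dict(
            self._db.execute("SELECT handle, record FROM handles"))
        self._dirty: Set[str] = set()

    def get(self, handle: str) -> Optional[str]:
        return self._ram.get(handle)  # never touches the database

    def create(self, handle: str, record: str) -> None:
        if handle in self._ram:
            raise KeyError(f"{handle} already exists")
        self._ram[handle] = record
        self._dirty.add(handle)

    def modify(self, handle: str, record: str) -> None:
        if handle not in self._ram:
            raise KeyError(handle)
        self._ram[handle] = record
        self._dirty.add(handle)

    def delete(self, handle: str) -> None:
        del self._ram[handle]
        self._dirty.add(handle)

    def flush(self) -> int:
        """Periodic update: write changed handles to the database."""
        for handle in self._dirty:
            if handle in self._ram:
                self._db.execute(
                    "INSERT OR REPLACE INTO handles VALUES (?, ?)",
                    (handle, self._ram[handle]))
            else:
                self._db.execute(
                    "DELETE FROM handles WHERE handle = ?", (handle,))
        self._db.commit()
        written = len(self._dirty)
        self._dirty.clear()
        return written
```

In this design a caller would invoke flush() from a timer; changes made since the last flush are lost if the process stops first, which is the usual price of serving reads and writes from RAM.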

Benchmark


UDP Interface for DNS Protocol


Compared to BIND 9.3.0

[Chart: Handle-DNS vs. BIND. X-axis: Number of Client Requests (10^3), 2 to 98. Y-axis: Responses per Second, 0 to 16000. Series: Handle-DNS, BIND]

Business Potential


Selling infrastructure technology


Providing identification, management and
Metadata services


Enabling third-party value-added
capabilities


Helping organizations manage their own
information better & offer new types of
services


Stimulating access to “surface information”
and “embedded information” with
appropriate access controls and conditions
of use

Conclusions


Managing Digital Objects for long-term access is a
key challenge


Initial Technology Components are available;
Industry is expected to generate more over time


Third-party value-added providers in the private
sector will ultimately shape the long-term evolution


Interoperability and reliable information access are
critical objectives


A diversity of applications (with user-friendly
interfaces) need to be developed & deployed


Application Projects have a central role to play in
demonstrating the technology and using it effectively