METS - API


application programming interface

Markus Enders, SUB Göttingen

Jens Ludwig, SUB Göttingen

METS Implementors Meeting, May 8th, 2007


Why?

necessity of an API

Why?

METS has a complex data model:

the most common instantiation of METS is its XML form

an API should be based on the data model and is (theoretically) independent of its XML representation

Why?

The API should support creation of METS as well:

creation of invalid data should not be possible (e.g. wrong order of elements ...)

100% valid METS data

The API should be focused on METS elements and their appropriate attributes and relationships
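One way to read the "wrong order of elements" point: the caller adds parts in any order, and the API alone decides the order in which they are serialized. The following is only a minimal sketch under that assumption. The class `OrderedMetsSketch` and its methods are hypothetical and are not part of METSbeans. Only the sequence metsHdr, dmdSec, structMap is taken from the METS schema. Attribute escaping and required child elements (e.g. the `<div>` inside `<structMap>`) are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a typed METS API; not the METSbeans classes.
public class OrderedMetsSketch {
    private String createDate;                                   // for <metsHdr>, optional
    private final List<String> dmdSecIds = new ArrayList<>();
    private final List<String> structMapTypes = new ArrayList<>();

    public void setHeader(String createDate) { this.createDate = createDate; }
    public void addDmdSec(String id)         { dmdSecIds.add(id); }
    public void addStructMap(String type)    { structMapTypes.add(type); }

    // Always emits metsHdr, then dmdSec, then structMap,
    // regardless of the order in which the caller added them.
    public String toXml() {
        StringBuilder sb = new StringBuilder("<mets xmlns=\"http://www.loc.gov/METS/\">");
        if (createDate != null) {
            sb.append("<metsHdr CREATEDATE=\"").append(createDate).append("\"/>");
        }
        for (String id : dmdSecIds) {
            sb.append("<dmdSec ID=\"").append(id).append("\"/>");
        }
        for (String type : structMapTypes) {
            sb.append("<structMap TYPE=\"").append(type).append("\"/>");
        }
        return sb.append("</mets>").toString();
    }

    public static void main(String[] args) {
        OrderedMetsSketch mets = new OrderedMetsSketch();
        mets.addStructMap("LOGICAL");   // added first ...
        mets.addDmdSec("DMDID01");      // ... but serialized after the dmdSec
        System.out.println(mets.toXml());
    }
}
```

Because the order lives in `toXml()` only, application code cannot produce a wrongly ordered document, which is the property the slide asks for.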

Why?

The API connects the application with the serialization level.

API as a framework for METS creation / parsing

Multi-tier applications:

Why?

[Diagram with the boxes: Application, METS API, Database, Repository, XML]

Implementation Issues:

Maintenance:

Changes in the METS schema must be reflected by the API

Programming language:

more than one language should be supported

Multi-level access:

Granularity of access

Implementation Issues:

Maintenance:

Derive classes from the xml-schema, e.g. with Apache xmlbeans or SUN JAXB

provides java classes for the xml-schema

Implementation Issues:

Programming language:

php-java bridge:

http://php-java-bridge.sourceforge.net

Inline-Java perl module:

http://search.cpan.org/~patl/Inline-Java/


Implementation Issues:

Multi-level access (granularity of access):

access to single elements / attributes

higher level for more widespread functionality


Implementation Issues:

Apache xmlbeans based API for java

Creates an interface for each schema object and an implementation to read / write this object to XML

Other implementations possible (repository)

Can create a DOM tree at any time, e.g. if non-schema based xml-data needs to be stored.

Implementation Issues:

level one:

METSbeans

xmlbeans based API for java

allows access to single METS elements, attributes and their relationships

level two:

more complex functions which are based on the METSbeans

METSbeans

every type from the schema becomes one class

classes are generated automatically from the XML schema

additional APIs can be generated and integrated for any xml-schema based data format (e.g. MODS, PREMIS etc.)

METSbeans

internal architecture:

for every type in the xml schema, an appropriate java interface exists

every interface is implemented during the automatic generation process

additional implementations of an interface are possible

high flexibility to access mets-data outside a file system
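The interface / implementation split can be illustrated with a reduced sketch. Everything in it is an assumption made for illustration: `Div` stands in for a generated interface such as DivType (cut down to one attribute), `InMemoryDiv` plays the role of the generated implementation, and `StoreBackedDiv` is a hypothetical second implementation that keeps its values in a key/value store instead of an XML file.

```java
import java.util.HashMap;
import java.util.Map;

public class DivBackendsSketch {
    // Stand-in for a generated interface such as DivType.
    public interface Div {
        String getTYPE();
        void setTYPE(String type);
    }

    // Plays the role of the automatically generated implementation.
    public static class InMemoryDiv implements Div {
        private String type;
        public String getTYPE() { return type; }
        public void setTYPE(String type) { this.type = type; }
    }

    // Hypothetical alternative: the value lives in a store (a Map here).
    public static class StoreBackedDiv implements Div {
        private final Map<String, String> store;
        private final String key;

        public StoreBackedDiv(Map<String, String> store, String divId) {
            this.store = store;
            this.key = divId + "/TYPE";
        }
        public String getTYPE() { return store.get(key); }
        public void setTYPE(String type) { store.put(key, type); }
    }

    // Application code depends on the interface only.
    public static String describe(Div div) {
        return "div of type " + div.getTYPE();
    }

    public static void main(String[] args) {
        Div fromXml = new InMemoryDiv();
        fromXml.setTYPE("Monograph");

        Map<String, String> repository = new HashMap<>();
        Div fromStore = new StoreBackedDiv(repository, "DIV01");
        fromStore.setTYPE("TitlePage");

        System.out.println(describe(fromXml));
        System.out.println(describe(fromStore));
    }
}
```

`describe` never learns which implementation it received, which is what makes access to data outside a file system possible without touching application code.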

METSbeans

internal architecture:

schema type: <xsd:complexType name="divType">

interface: DivType

class: DivTypeImpl

METSbeans

internal architecture:

xmlbeans has a set of native data types:

XmlObject, XmlString

XmlShort, XmlTime

etc...


METSbeans

internal architecture:

MetsDocument as the topmost class instantiates the document.

No other object can be created without this object.

An instance can be created by:

parsing a file

using a factory class to create a new document

METSbeans

snippet: MetsDocument

example factory class:

MetsDocument mets = MetsDocument.Factory.newInstance();

example parsing a file:

XmlObject xml;
try {
    xml = XmlObject.Factory.parse(f);
} catch (XmlException e) {
    e.printStackTrace();
    return false;
} catch (IOException e) {   // parse(File) also throws IOException
    e.printStackTrace();
    return false;
}
MetsDocument metsDoc = (MetsDocument) xml;

METSbeans

DivType:

methods for accessing the <mptr> element

getMptrArray()
getMptrArray(int i)
sizeOfMptrArray()
setMptrArray(Mptr[] mptrArray)
setMptrArray(int i, Mptr mptr)
insertNewMptr(int i)
addNewMptr()
removeMptr(int i)



METSbeans

DivType:

methods for accessing <div> element

getDivArray()

getDivArray(int i)
sizeOfDivArray()
setDivArray(DivType[] divArray)
setDivArray(int i, DivType div)

insertNewDiv(int i)

addNewDiv()

removeDiv(int i)

METSbeans

DivType:

very similar methods for handling file pointers (<fptr> elements)

METSbeans

DivType:

methods to access attributes (here the ID attribute)

getID()
isSetID()
setID(String id)
unsetID()
xsetID(org.apache.xmlbeans.XmlID id)
xgetID()

METSbeans

snippet:

create a new <div> element

MetsDocument mets = MetsDocument.Factory.newInstance();
MetsType myMets = mets.addNewMets();
StructMapType sm = myMets.addNewStructMap();
DivType div = sm.addNewDiv();
div.setTYPE("Monograph");
DivType firstchild = div.addNewDiv();
firstchild.setTYPE("TitlePage");

METSbeans

snippet:

saving a METS document

HashMap suggestedPrefixes = new HashMap();
suggestedPrefixes.put("http://www.loc.gov/METS/", "mets");
suggestedPrefixes.put("http://www.w3.org/1999/xlink", "xlink");
XmlOptions opts = new XmlOptions();
opts.setSaveSuggestedPrefixes(suggestedPrefixes);
File outputFile = new File(filename);
mets.save(outputFile, opts);

METSbeans

MdSecType

represents the METS elements

<dmdSec>
<techMD>
<digiprovMD>
<rightsMD>
<sourceMD>

but not:

<amdSec>

may contain:

MdRef or MdWrap object

METSbeans

snippet:

create an MdSecType object

MetsDocument mets = MetsDocument.Factory.newInstance();
MetsType myMets = mets.addNewMets();
MdSecType dmdSec = myMets.addNewDmdSec();
dmdSec.setID("DMDID01");
MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
MdSecType.MdWrap.XmlData xmldata = mdwrap.addNewXmlData();
xmldata.set(modsObject);   // any XmlObject, e.g. an XmlString

METSbeans

snippet:

create an MdSecType object

Document:

ModsDocument modsObject = ModsDocument.Factory.newInstance();
ModsType myMods = modsObject.addNewMods();
IdentifierType identifier = myMods.addNewIdentifier();
....
xmldata.set(modsObject);

String:

XmlString xs = XmlString.Factory.newValue("<mydata/>");
xmldata.set(xs);

METSbeans

parse mets data:

the API provides several parse methods:

parse(java.lang.String xmlAsString)
parse(java.io.File file)
parse(java.net.URL u)
parse(java.io.InputStream is)
parse(org.w3c.dom.Node node)

if the parsed data is NOT valid METS, an XmlException is thrown.

METSbeans

snippet:

parse mets data

File f = new File(filename);
XmlObject xml = null;   // stays null if parsing fails
try {
    xml = XmlObject.Factory.parse(f);
} catch (XmlException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
MetsDocument metsDoc = (MetsDocument) xml;

METSbeans

snippet:

get a DivType

MetsDocument metsDoc = (MetsDocument) xml;
MetsType mets = metsDoc.getMets();
StructMapType[] structs = mets.getStructMapArray();
for (int i = 0; i < structs.length; i++) {
    StructMapType struct = structs[i];
    String structtype = struct.getTYPE();
    // return the type of the top-level <div> of the logical structMap
    if ((structtype != null) && structtype.equals("LOGICAL")) {
        DivType div = struct.getDiv();
        String divtype = div.getTYPE();
        return divtype;
    }
}


METSbeans

easy to create and parse valid METS data (much easier than parsing DOM trees)

easy to combine with other xml data

quite fast compared to DOM

Drawback:

as it is based on xmlbeans, it is only available for java; the php-java bridge or the Inline::Java module is needed for php/perl

Helper-class

Functions:

Though the METSbeans allow access to every single METS element, it is still a complex task to do simple things, e.g. adding metadata to a <div>

Need for additional high-level functions:

A helper-class is needed, which sits on top of METSbeans

Helper-class

Functions:

No official implementation, just an excerpt of functions which a level 2 API could provide

The following examples are based on experiences from working with METSbeans

Helper-class

Functions:

Create DMDSec for common METS objects:

createDMDSec(XmlObject inMetadata, DivType inDiv)

createDMDSec(XmlObject inMetadata, FileType inFile)

...
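What such a createDMDSec helper has to do can be sketched on a reduced in-memory model: create the section, give it an ID, and link that ID from the `<div>`. This is only an illustration under stated assumptions and not the authors' implementation. `Mets` and `Div` are hypothetical stand-ins for MetsType and DivType, metadata is passed as a plain string, and the ID pattern follows the "DMDID01" value used in the snippets.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CreateDmdSecSketch {
    // Stand-in for DivType: only the list of referenced dmdSec IDs (DMDID attribute).
    public static class Div {
        public final List<String> dmdIds = new ArrayList<>();
    }

    // Stand-in for MetsType: dmdSec ID -> wrapped metadata.
    public static class Mets {
        public final Map<String, String> dmdSecs = new LinkedHashMap<>();
    }

    // Creates a dmdSec, assigns the next free ID and links it from the div.
    // The counter is derived from the map size, so sections must not be removed.
    public static String createDmdSec(Mets mets, String metadataXml, Div div) {
        String id = String.format("DMDID%02d", mets.dmdSecs.size() + 1);
        mets.dmdSecs.put(id, metadataXml);
        div.dmdIds.add(id);
        return id;
    }

    public static void main(String[] args) {
        Mets mets = new Mets();
        Div monograph = new Div();
        String id = createDmdSec(mets, "<mods/>", monograph);
        System.out.println(id + " linked from div: " + monograph.dmdIds);
    }
}
```

With METSbeans alone, the same task means creating the section, the MdWrap, the XmlData and the reference on the `<div>` by hand, which is the motivation for a helper level.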


Helper-class

Functions:

Create administrative metadata for common METS objects, e.g.:

createMDSectionInAMDSec(XmlObject inMetadata, String type, DivType inDiv, AmdSecType inAmdSec)

...


Helper-class

Functions:

functions to retrieve specific metadata sections by ID or TYPE:

getMDSecTypeByID(String inID)

getMDSecTypeByType(String inType)

...


Helper-class

Functions:

functions to get the files related to a <div> element:

getAllFilesForDivType(DivType inDiv)

getAllFilesForFileGroup(FileGrpType inGrp)

...
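A function like getAllFilesForDivType could be sketched as a recursive walk over the `<div>` tree. The sketch assumes that files of nested `<div>` elements count as related and that each file is reported once. Whether the authors' helper behaves this way is not stated. `Div`, `fptr`, `child` and `getAllFileIdsForDiv` are hypothetical names, and files are represented only by their FILEID strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FilesForDivSketch {
    // Reduced stand-in for DivType: file pointers (FILEID values) and child divs.
    public static class Div {
        final List<String> fileIds = new ArrayList<>();
        final List<Div> children = new ArrayList<>();

        public Div fptr(String... ids) { fileIds.addAll(Arrays.asList(ids)); return this; }
        public Div child(Div d)        { children.add(d); return this; }
    }

    // Collects the FILEIDs of the div and of all its descendants,
    // in document order and without duplicates.
    public static Set<String> getAllFileIdsForDiv(Div div) {
        Set<String> result = new LinkedHashSet<>();
        collect(div, result);
        return result;
    }

    private static void collect(Div div, Set<String> out) {
        out.addAll(div.fileIds);
        for (Div c : div.children) {
            collect(c, out);
        }
    }

    public static void main(String[] args) {
        Div monograph = new Div().fptr("FILE_PDF")
            .child(new Div().fptr("FILE_0001"))
            .child(new Div().fptr("FILE_0002", "FILE_0001"));
        System.out.println(getAllFileIdsForDiv(monograph));
    }
}
```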


Extension schema

Integration of extension schemas:

Export MetsBeans objects as a DOM tree.

Create Beans for the extension schemas as well: PREMIS, MODS, MIX Beans.

Extension schema

Example: create MODS data

MdSecType dmdSec = mets.addNewDmdSec();
dmdSec.setID(dmdid_string);
MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();

ModsDocument mods = ModsDocument.Factory.newInstance();
ModsType myMods = mods.addNewMods();
xml.set(mods);

Extension schema

Example: create <premis:object> data

MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();

ObjectDocument objdoc = ObjectDocument.Factory.newInstance();
ObjectDocument.Object premis_object = objdoc.addNewObject();
xml.set(objdoc);

Extension schema

Example: parse MODS data

MdSecType dmdSec;
....
MdSecType.MdWrap mdw = dmdSec.getMdWrap();
MdSecType.MdWrap.XmlData xml_data = mdw.getXmlData();
String result = xml_data.xmlText();

ModsDocument mods = ModsDocument.Factory.parse(result);

Problems?!

Quality of the API

The API depends on the XML schema; the quality of the API depends on the quality of the schema.

MetsType for <mets>

DivType for <div>

MdSecType for <dmdSec>, ...

but no type for the METS header <metsHdr>, as it is defined inline




Problems?!

Integration of extension schemas

Problematic if an extension schema does not have a top-level element (e.g. simple Dublin Core); parsing is especially difficult:

String result = xml_data.xmlText();
ModsDocument mods = ModsDocument.Factory.parse(result);

result must always contain a valid XML document!
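The underlying difficulty can be reproduced with the JDK's own DOM parser, independent of xmlbeans: a sequence of sibling elements without a common root is not an XML document. The sketch below is an illustration only. `isDocument` and the `<wrapper>` element are hypothetical, and wrapping is shown merely to make the contrast visible, not as the recommended fix.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class FragmentParseSketch {
    static final String DC = "http://purl.org/dc/elements/1.1/";

    // Returns true if the text parses as one well-formed XML document.
    public static boolean isDocument(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // not well-formed, e.g. more than one root element
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Simple Dublin Core as it may appear inside <xmlData>: siblings, no root.
        String fragment =
              "<dc:title xmlns:dc=\"" + DC + "\">A title</dc:title>"
            + "<dc:creator xmlns:dc=\"" + DC + "\">A creator</dc:creator>";
        String wrapped = "<wrapper>" + fragment + "</wrapper>";

        System.out.println("fragment is a document: " + isDocument(fragment));
        System.out.println("wrapped is a document:  " + isDocument(wrapped));
    }
}
```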

How to continue

Work with METSbeans

Everybody can create METSbeans by him/herself -> see Apache xmlbeans

Downloadable from the GDZ website

We will provide a primer as an incomplete documentation for METSbeans.

How to continue

Identify necessary functions for the helper-class

Over time we will identify additional methods which might be useful and should be integrated into the "helper-class".

Application Layer

can be built on top of METSbeans

Profile-specific implementations can be built on top of METSbeans and provide an API to the underlying document/content model.

Application Layer

[Diagram with the boxes: XML serialization, METS API, helper class, API for content model, Application]