METS - API: application programming interface

Markus Enders, SUB Göttingen
Jens Ludwig, SUB Göttingen

METS Implementors Meeting, May 8th, 2007

Why? Necessity of an API

Why?

METS has a complex data model:

the most common instantiation of METS is its XML form

an API should be based on the data model and is (theoretically) independent of its XML representation


Why?

API should support creation of METS as well:

creation of invalid data should not be possible (e.g. wrong order of elements ...); the goal is 100% valid METS data

API should be focused on METS elements and their appropriate attributes and relationships
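The point about element order can be illustrated with a minimal sketch. It assumes a hypothetical class `OrderedMetsSketch` (not part of METSbeans or any METS library) that owns serialization, so callers may add sections in any order and the output still follows the order of the METS schema. Only element order is shown; the output is not a complete, valid METS document.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not METSbeans): sections are added in any order, output follows schema order. */
final class OrderedMetsSketch {
    /** Top-level METS sections, declared in the order the METS schema requires. */
    enum Section { metsHdr, dmdSec, amdSec, fileSec, structMap, structLink, behaviorSec }

    private final Map<Section, List<String>> ids = new EnumMap<>(Section.class);

    /** Records a section by ID; the order of calls across sections does not matter. */
    OrderedMetsSketch add(Section section, String id) {
        ids.computeIfAbsent(section, s -> new ArrayList<>()).add(id);
        return this;
    }

    /** Serializes in schema order, because an EnumMap iterates in declaration order. */
    String toXml() {
        StringBuilder sb = new StringBuilder("<mets xmlns=\"http://www.loc.gov/METS/\">");
        for (Map.Entry<Section, List<String>> e : ids.entrySet()) {
            for (String id : e.getValue()) {
                sb.append('<').append(e.getKey()).append(" ID=\"").append(id).append("\"/>");
            }
        }
        return sb.append("</mets>").toString();
    }

    public static void main(String[] args) {
        String xml = new OrderedMetsSketch()
                .add(Section.structMap, "SM1")  // added first ...
                .add(Section.dmdSec, "DMD1")    // ... but serialized before structMap
                .toXml();
        System.out.println(xml);
    }
}
```

The design choice here is that the caller never decides where an element goes, so a wrong element order cannot be expressed through the API at all.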


Why?

API connects application with serialization level.

API as a framework for METS creation / parsing

Multi-Tier Applications:

[Diagram: the METS API sits between the application and the storage level (database, repository, XML)]

Implementation Issues

Maintenance:

Changes in the METS schema must be reflected by the API

Programming language: more than one language should be supported

Multi-level access:

• Granularity of access

Derive classes from the XML schema, e.g. with Apache XMLBeans or Sun JAXB:

provides Java classes for the XML schema

php-java bridge: http://php-java-bridge.sourceforge.net

Inline-Java Perl module: http://search.cpan.org/~patl/Inline-Java/

• access to single elements / attributes

• higher level for more widespread functionality

Implementation Issues: Apache XMLBeans-based API for Java

Creates an interface for each schema object and an implementation to read / write this object to XML

Other implementations are possible (e.g. for a repository)

Can create a DOM tree at any time, e.g. if non-schema-based XML data needs to be stored.

Implementation Issues

level one: METSbeans

XMLBeans-based API for Java

allows access to single METS elements, attributes and their relationships

level two: more complex functions which are based on the METSbeans

METSbeans

every type from the schema becomes one class

classes are generated automatically from the XML schema

additional APIs can be generated and integrated for any XML-schema-based data format (e.g. MODS, PREMIS etc.)

METSbeans: internal architecture

for every type in the XML schema, an appropriate Java interface exists

every interface is implemented during the automatic generation process

additional implementations of an interface are possible, which gives high flexibility to access METS data outside a file system

METSbeans: internal architecture

schema type: <xsd:complexType name="divType">

interface: DivType

class: DivTypeImpl
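The split between an interface such as DivType and an implementation such as DivTypeImpl can be sketched with plain JDK classes. The names `Div`, `DomDiv` and `MapDiv` below are hypothetical and do not come from METSbeans; the sketch only shows why application code written against the interface keeps working when the XML-backed implementation is swapped for one backed by another store.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch of the interface/implementation split; names are not from METSbeans. */
final class DivBackendsSketch {
    /** Data-model view of a <div>: callers depend only on this interface. */
    interface Div {
        String getType();
        void setType(String type);
    }

    /** XML-backed implementation: reads and writes the TYPE attribute of a DOM element. */
    static final class DomDiv implements Div {
        private final Element element;
        DomDiv(Element element) { this.element = element; }
        public String getType() { return element.hasAttribute("TYPE") ? element.getAttribute("TYPE") : null; }
        public void setType(String type) { element.setAttribute("TYPE", type); }
    }

    /** Stand-in for a repository or database backend: values live in a map, no XML involved. */
    static final class MapDiv implements Div {
        private final Map<String, String> row = new HashMap<>();
        public String getType() { return row.get("TYPE"); }
        public void setType(String type) { row.put("TYPE", type); }
    }

    /** Application code written against the interface works with either backend. */
    static String label(Div div) {
        div.setType("Monograph");
        return "div of type " + div.getType();
    }

    /** Creates an empty, namespaced <mets:div> element to back a DomDiv. */
    static Element newDivElement() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            return doc.createElementNS("http://www.loc.gov/METS/", "mets:div");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(label(new DomDiv(newDivElement())));
        System.out.println(label(new MapDiv()));
    }
}
```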

METSbeans: internal architecture

XMLBeans has a set of native data types: XmlObject, XmlString, XmlShort, XmlTime etc.

METSbeans: internal architecture

MetsDocument, as the topmost class, instantiates the document.

No other object can be created without this object.

An instance can be created by:

• parsing a file

• using a factory class to create a new document

METSbeans snippet: MetsDocument

example using the factory class:

    MetsDocument mets = MetsDocument.Factory.newInstance();

example parsing a file:

    // f is a java.io.File pointing at the METS file
    XmlObject xml;
    try {
        xml = XmlObject.Factory.parse(f);
    } catch (XmlException e) {
        e.printStackTrace();
        return false;
    } catch (IOException e) {
        // parse(File) also declares IOException
        e.printStackTrace();
        return false;
    }
    MetsDocument metsDoc = (MetsDocument) xml;

METSbeans DivType: methods for accessing the <mptr> element

    getMptrArray()
    getMptrArray(int i)
    sizeOfMptrArray()
    setMptrArray(Mptr[] mptrArray)
    setMptrArray(int i, Mptr mptr)
    insertNewMptr(int i)
    addNewMptr()
    removeMptr(int i)

METSbeans DivType: methods for accessing the <div> element

    getDivArray()
    getDivArray(int i)
    sizeOfDivArray()
    setDivArray(DivType[] divArray)
    setDivArray(int i, DivType div)
    insertNewDiv(int i)
    addNewDiv()
    removeDiv(int i)

METSbeans DivType: very similar methods for handling file pointers (<fptr> elements)

METSbeans DivType: methods to set attributes (ID attribute)

    getID()
    isSetID()
    setID(String id)
    unsetID()
    xsetID(org.apache.xmlbeans.XmlID id)
    xgetID()

METSbeans snippet: create a new <div> element

    MetsDocument mets = MetsDocument.Factory.newInstance();
    MetsType myMets = mets.addNewMets();
    StructMapType sm = myMets.addNewStructMap();
    // top-level <div> of the structMap
    DivType div = sm.addNewDiv();
    div.setTYPE("Monograph");
    // nested <div> below the top-level one
    DivType firstchild = div.addNewDiv();
    firstchild.setTYPE("TitlePage");

METSbeans snippet: saving a METS document

    // map namespace URIs to the prefixes the serializer should prefer
    HashMap<String, String> suggestedPrefixes = new HashMap<String, String>();
    suggestedPrefixes.put("http://www.loc.gov/METS/", "mets");
    suggestedPrefixes.put("http://www.w3.org/1999/xlink", "xlink");
    XmlOptions opts = new XmlOptions();
    opts.setSaveSuggestedPrefixes(suggestedPrefixes);
    File outputFile = new File(filename);
    mets.save(outputFile, opts);

METSbeans MdSecType

represents the METS elements

<dmdSec>, <techMD>, <digiprovMD>, <rightsMD>, <sourceMD>

but not: <amdSec>

may contain: an MdRef or an MdWrap object

METSbeans snippet: create an MdSecType object

    MetsDocument mets = MetsDocument.Factory.newInstance();
    MetsType myMets = mets.addNewMets();
    MdSecType dmdSec = myMets.addNewDmdSec();
    dmdSec.setID("DMDID01");
    MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
    MdSecType.MdWrap.XmlData xmldata = mdwrap.addNewXmlData();
    // modsObject can be any XmlObject, e.g. an XmlString
    xmldata.set(modsObject);

METSbeans snippet: create an MdSecType object

Document:

    ModsDocument modsObject = ModsDocument.Factory.newInstance();
    ModsType myMods = modsObject.addNewMods();
    IdentifierType identifier = myMods.addNewIdentifier();
    ....
    xmldata.set(modsObject);

String:

    XmlString xs = XmlString.Factory.newValue("<mydata/>");
    xmldata.set(xs);

METSbeans: parse METS data

the API provides some parse methods:

    parse(java.lang.String xmlAsString)
    parse(java.io.File file)
    parse(java.net.URL u)
    parse(java.io.InputStream is)
    parse(org.w3c.dom.Node node)

if the parsed data is NOT valid METS, an XmlException is thrown.

METSbeans snippet: parse METS data

    File f = new File(filename);
    // initialized to null so the cast below compiles even if parsing fails
    XmlObject xml = null;
    try {
        xml = XmlObject.Factory.parse(f);
    } catch (XmlException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    // metsDoc is null if parsing failed
    MetsDocument metsDoc = (MetsDocument) xml;

METSbeans snippet: get a DivType

    MetsDocument metsDoc = (MetsDocument) xml;
    MetsType mets = metsDoc.getMets();
    StructMapType structs[] = mets.getStructMapArray();
    // look for the logical structMap and return the TYPE of its top-level <div>
    for (int i = 0; i < structs.length; i++) {
        StructMapType struct = structs[i];
        String structtype = struct.getTYPE();
        if ((structtype != null) && (structtype.equals("LOGICAL"))) {
            DivType div = struct.getDiv();
            String divtype = div.getTYPE();
            return divtype;
        }
    }

METSbeans

easy to create and parse valid METS data (much easier than parsing DOM trees)

easy to combine with other XML data

quite fast compared to DOM

Drawback: as it is based on XMLBeans, it is only available for Java; the php-java bridge or the Inline::Java module is needed for PHP / Perl
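For comparison with the "get a DivType" snippet, the same lookup against a plain DOM tree might look like the sketch below. It uses only the JDK's DOM API; the class name `DomLookupSketch` and the sample document are assumptions made for illustration. The typed getters (`getStructMapArray()`, `getDiv()`, `getTYPE()`) are replaced by namespace-aware name matching and manual child traversal.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical DOM-only counterpart to the typed lookup; not part of METSbeans. */
final class DomLookupSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    /** Made-up sample document with a physical and a logical structMap. */
    static final String SAMPLE = "<mets:mets xmlns:mets=\"" + METS_NS + "\">"
            + "<mets:structMap TYPE=\"PHYSICAL\"><mets:div TYPE=\"physSequence\"/></mets:structMap>"
            + "<mets:structMap TYPE=\"LOGICAL\"><mets:div TYPE=\"Monograph\"/></mets:structMap>"
            + "</mets:mets>";

    /** Returns the TYPE of the top-level <div> of a LOGICAL structMap, or null if there is none. */
    static String logicalDivType(String metsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(metsXml.getBytes(StandardCharsets.UTF_8)));
            NodeList structMaps = doc.getElementsByTagNameNS(METS_NS, "structMap");
            for (int i = 0; i < structMaps.getLength(); i++) {
                Element structMap = (Element) structMaps.item(i);
                if (!"LOGICAL".equals(structMap.getAttribute("TYPE"))) {
                    continue;
                }
                // No typed getDiv(): walk the child nodes and filter by name by hand.
                for (Node n = structMap.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n.getNodeType() == Node.ELEMENT_NODE
                            && METS_NS.equals(n.getNamespaceURI())
                            && "div".equals(n.getLocalName())) {
                        Element div = (Element) n;
                        return div.hasAttribute("TYPE") ? div.getAttribute("TYPE") : null;
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(logicalDivType(SAMPLE)); // prints "Monograph"
    }
}
```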

Helper class: functions

Though the METSbeans allow access to every single METS element, it is still a complex task to do simple things, e.g. adding metadata to a <div>

Need for additional high-level functions:

A helper class is needed, which sits on top of the METSbeans

Helper class: functions

No official implementation, just an excerpt of functions which a level-two API could provide

The following examples come from experience working with METSbeans and are based on METSbeans

Helper class: functions

Create a DMDSec for common METS objects:

    createDMDSec(XMLObject inMetadata, DivType inDiv)
    createDMDSec(XMLObject inMetadata, FileType inFile)
    ...
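What such a createDMDSec helper has to do can be sketched with the JDK's DOM API instead of METSbeans. Everything below is an assumption made for illustration: the class name `CreateDmdSecSketch`, the extra `id` parameter, and the choice of `MDTYPE="OTHER"`. The steps are: build the `<dmdSec>` with its `<mdWrap>` and `<xmlData>`, place it where the schema expects it, and link it from the `<div>` through the DMDID attribute.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical DOM-based sketch of a level-two helper; not part of METSbeans. */
final class CreateDmdSecSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    /** Wraps a copy of the metadata in a new <dmdSec> and links it from the <div> via DMDID. */
    static Element createDmdSec(Element metadata, Element div, String id) {
        Document doc = div.getOwnerDocument();
        Element root = doc.getDocumentElement();

        Element dmdSec = doc.createElementNS(METS_NS, "mets:dmdSec");
        dmdSec.setAttribute("ID", id);
        Element mdWrap = doc.createElementNS(METS_NS, "mets:mdWrap");
        mdWrap.setAttribute("MDTYPE", "OTHER"); // assumption: a real helper would pick the matching type
        Element xmlData = doc.createElementNS(METS_NS, "mets:xmlData");
        xmlData.appendChild(doc.importNode(metadata, true));
        mdWrap.appendChild(xmlData);
        dmdSec.appendChild(mdWrap);

        // dmdSec must precede amdSec, fileSec and structMap: insert before the first such sibling.
        Node ref = root.getFirstChild();
        while (ref != null && (!(ref instanceof Element) || isHeaderOrDmdSec((Element) ref))) {
            ref = ref.getNextSibling();
        }
        root.insertBefore(dmdSec, ref); // a null ref appends at the end

        // DMDID holds a list of IDs, so append rather than overwrite.
        String existing = div.getAttribute("DMDID");
        div.setAttribute("DMDID", existing.isEmpty() ? id : existing + " " + id);
        return dmdSec;
    }

    private static boolean isHeaderOrDmdSec(Element e) {
        return "metsHdr".equals(e.getLocalName()) || "dmdSec".equals(e.getLocalName());
    }

    /** Builds a tiny document, attaches metadata to its <div>, and reports the result. */
    static String demo() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element mets = doc.createElementNS(METS_NS, "mets:mets");
            Element structMap = doc.createElementNS(METS_NS, "mets:structMap");
            Element div = doc.createElementNS(METS_NS, "mets:div");
            doc.appendChild(mets);
            mets.appendChild(structMap);
            structMap.appendChild(div);
            Element mods = doc.createElementNS("http://www.loc.gov/mods/v3", "mods:mods");
            createDmdSec(mods, div, "DMDID01");
            // first child of <mets> and the DMDID now set on the <div>
            return mets.getFirstChild().getLocalName() + " " + div.getAttribute("DMDID");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "dmdSec DMDID01"
    }
}
```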

Helper class: functions

Create administrative metadata for common METS objects, e.g.:

    createMDSectionInAMDSec(XMLObject inMetadata, String type, DivType inDiv, AmdSecType inAmdSec)
    ...

Helper class: functions

functions to retrieve specific metadata sections by ID or TYPE:

    getMDSecTypeByID(String inID)
    getMDSecTypeByType(String inType)
    ...
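A lookup by ID could work roughly as in the following sketch, again written against the JDK's DOM API rather than METSbeans. The class name `MdSecLookupSketch`, the method signature and the sample document are assumptions; the list of element names reflects the elements that share the schema type mdSecType.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical DOM-based sketch of an ID lookup helper; not part of METSbeans. */
final class MdSecLookupSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    /** Elements that share the schema type mdSecType. */
    static final List<String> MD_SEC_NAMES =
            Arrays.asList("dmdSec", "techMD", "rightsMD", "sourceMD", "digiprovMD");

    /** Made-up sample document with one descriptive and one technical metadata section. */
    static final String SAMPLE = "<mets:mets xmlns:mets=\"" + METS_NS + "\">"
            + "<mets:dmdSec ID=\"DMDID01\"/>"
            + "<mets:amdSec><mets:techMD ID=\"TECH01\"/></mets:amdSec>"
            + "</mets:mets>";

    /** Returns the metadata section with the given ID, or null if there is none. */
    static Element getMdSecById(Document doc, String id) {
        for (String name : MD_SEC_NAMES) {
            NodeList candidates = doc.getElementsByTagNameNS(METS_NS, name);
            for (int i = 0; i < candidates.getLength(); i++) {
                Element candidate = (Element) candidates.item(i);
                if (id.equals(candidate.getAttribute("ID"))) {
                    return candidate;
                }
            }
        }
        return null;
    }

    /** Parses a string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        Element found = getMdSecById(parse(SAMPLE), "TECH01");
        System.out.println(found.getLocalName()); // prints "techMD"
    }
}
```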

Helper class: functions

functions to get related files (e.g. for a <div> element):

    getAllFilesForDivType(DivType inDiv)
    getAllFilesForFileGroup(FileGrpType inGrp)
    ...
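The file lookup for a `<div>` involves following references: each `<fptr>` carries a FILEID that points at a `<file>` in the file section. A minimal DOM-based sketch of that resolution follows. The class name `DivFilesSketch` and the sample document are assumptions, and the sketch only follows `<fptr>` elements that are direct children of the `<div>` and carry a FILEID attribute; nested `<area>`, `<seq>` or `<par>` references and child `<div>` elements are ignored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical DOM-based sketch of a file lookup helper; not part of METSbeans. */
final class DivFilesSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";

    /** Made-up sample: two files, of which only F2 is referenced by the <div>. */
    static final String SAMPLE = "<mets:mets xmlns:mets=\"" + METS_NS + "\">"
            + "<mets:fileSec><mets:fileGrp>"
            + "<mets:file ID=\"F1\"/><mets:file ID=\"F2\"/>"
            + "</mets:fileGrp></mets:fileSec>"
            + "<mets:structMap><mets:div TYPE=\"TitlePage\">"
            + "<mets:fptr FILEID=\"F2\"/>"
            + "</mets:div></mets:structMap>"
            + "</mets:mets>";

    /** Resolves the direct <fptr FILEID="..."> children of a <div> to their <file> elements. */
    static List<Element> getAllFilesForDiv(Element div) {
        NodeList allFiles = div.getOwnerDocument().getElementsByTagNameNS(METS_NS, "file");
        List<Element> files = new ArrayList<>();
        for (Node n = div.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)
                    || !METS_NS.equals(n.getNamespaceURI())
                    || !"fptr".equals(n.getLocalName())) {
                continue;
            }
            String fileId = ((Element) n).getAttribute("FILEID");
            if (fileId.isEmpty()) {
                continue; // the pointer uses nested area/seq/par instead of FILEID
            }
            for (int i = 0; i < allFiles.getLength(); i++) {
                Element file = (Element) allFiles.item(i);
                if (fileId.equals(file.getAttribute("ID"))) {
                    files.add(file);
                }
            }
        }
        return files;
    }

    /** Parses the XML and returns the IDs of the files linked from the first <div>. */
    static List<String> fileIdsForFirstDiv(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            Element div = (Element) doc.getElementsByTagNameNS(METS_NS, "div").item(0);
            List<String> ids = new ArrayList<>();
            for (Element file : getAllFilesForDiv(div)) {
                ids.add(file.getAttribute("ID"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fileIdsForFirstDiv(SAMPLE)); // prints "[F2]"
    }
}
```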

Extension schema: integration of extension schema

Export METSbeans objects as a DOM tree.

Create beans for the extension schemas as well: PREMIS, MODS, MIX beans.

Extension schema example: create MODS data

    MdSecType dmdSec = mets.addNewDmdSec();
    dmdSec.setID(dmdid_string);
    MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
    MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();

    ModsDocument mods = ModsDocument.Factory.newInstance();
    ModsType myMods = mods.addNewMods();
    // store the MODS document inside <xmlData>
    xml.set(mods);

Extension schema example: create <premis:object> data

    MdSecType.MdWrap mdwrap = dmdSec.addNewMdWrap();
    MdSecType.MdWrap.XmlData xml = mdwrap.addNewXmlData();

    ObjectDocument objdoc = ObjectDocument.Factory.newInstance();
    ObjectDocument.Object premis_object = objdoc.addNewObject();
    // store the PREMIS object document inside <xmlData>
    xml.set(objdoc);

Extension schema example: parse MODS data

    MdSecType dmdSec;
    ....
    MdSecType.MdWrap mdw = dmdSec.getMdWrap();
    MdSecType.MdWrap.XmlData xml_data = mdw.getXmlData();
    // serialize the content of <xmlData> and re-parse it with the MODS beans
    String result = xml_data.xmlText();

    ModsDocument mods = ModsDocument.Factory.parse(result);

Problems?! Quality of the API

The API depends on the XML schema; the quality of the API depends on the quality of the schema.

MetsType for <mets>, DivType for <div>, MdSecType for <dmdSec>, ....

but there is no type for the METS header <metsHdr>, as it is defined inline

Problems?! Integration of extension schema

Problematic if an extension schema does not have a top-level element, e.g. simple Dublin Core; parsing in particular is difficult:

    String result = xml_data.xmlText();
    ModsDocument mods = ModsDocument.Factory.parse(result);

result must always contain a valid XML document!
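The requirement that the parsed string be a complete XML document can be demonstrated with the JDK parser alone. In the sketch below, the class name `RootlessFragmentSketch` and the `wrapper` element are hypothetical: two sibling Dublin Core elements without a common root are rejected, while the same content inside an arbitrary wrapper element parses. It shows the well-formedness constraint only and does not suggest how METSbeans itself should handle such data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch of the single-root constraint; not part of METSbeans. */
final class RootlessFragmentSketch {
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Two Dublin Core elements side by side, with no enclosing top-level element. */
    static final String SIMPLE_DC =
            "<dc:title xmlns:dc=\"" + DC_NS + "\">A title</dc:title>"
            + "<dc:creator xmlns:dc=\"" + DC_NS + "\">A creator</dc:creator>";

    /** Returns true if the text parses as a well-formed XML document (exactly one root element). */
    static boolean isDocument(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep parse errors off stderr
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, e.g. content after the root element
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps a fragment in a made-up root element so that it becomes a document. */
    static String wrap(String fragment) {
        return "<wrapper>" + fragment + "</wrapper>";
    }

    public static void main(String[] args) {
        System.out.println(isDocument(SIMPLE_DC));       // prints "false"
        System.out.println(isDocument(wrap(SIMPLE_DC))); // prints "true"
    }
}
```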

How to continue: work with METSbeans

Everybody can create METSbeans themselves -> see Apache XMLBeans

Downloadable from the GDZ website

We will provide a primer as (incomplete) documentation for METSbeans.

How to continue: identify necessary functions for the helper class

Over time we will identify additional methods which might be useful and should be integrated in the "helper-class".

Application Layer: can be built on top of METSbeans

Profile-specific implementations can be built on top of METSbeans and provide an API to the underlying document/content model.

[Diagram: layers XML serialization, METS API, helper class, API for content model, and Application]