The OAI-PMH Harvester Plugin for The Omeka Content Management System

LIS 654 BUILDING DIGITAL LIBRARIES, FALL 2011, NOVEMBER 03, 2011, JAMES R. GRIFFIN III 100356891

Page 1: The OAI-PMH Harvester Plugin for The Omeka Content Management System

LIS 654 BUILDING DIGITAL LIBRARIES

FALL 2011

NOVEMBER 03, 2011

The OAI-PMH Harvester Plugin for The Omeka Content Management System

JAMES R. GRIFFIN III 100356891

Page 2: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Defining the OAI-PMH

• "The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP."[1]

• Thus, the OAI-PMH enables digital repositories to openly and freely share metadata describing their collections with the world.

[1] Open Archives Initiative Protocol for Metadata Harvesting. (2011). Retrieved from http://www.openarchives.org/pmh/

Page 3: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Installing the OAI-PMH Harvester Plugin for Omeka

1. Download the plug-in from the following source: http://omeka.org/add-ons/plugins/oai-pmh-harvester/

(Note: This is a ZIP archive [like other plug-ins for Omeka])

2. Upload the ZIP archive to the server wotan (Note: This can be done using any scp client such as WinSCP)

3. Decompress the archive into the appropriate directory for your installation of Omeka

(Note: This is typically the path /home/[USER NAME]/omeka/plugins/)

4. Using the web interface, install the harvester plug-in
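Step 3 above can also be scripted. The following is a minimal sketch in Python, assuming Python is available on the server; the function name `install_plugin_archive` and the example paths are hypothetical, and uploading the archive (step 2) and installing via the web interface (step 4) remain manual.

```python
import zipfile
from pathlib import Path


def install_plugin_archive(archive_path, plugins_dir):
    """Extract an Omeka plugin ZIP archive into the plugins directory.

    Returns the sorted names of the top-level entries found in the archive.
    Both arguments are supplied by the caller; nothing here is Omeka-specific.
    """
    archive = Path(archive_path)
    target = Path(plugins_dir)
    if not target.is_dir():
        raise FileNotFoundError(f"plugins directory not found: {target}")

    with zipfile.ZipFile(archive) as bundle:
        # Collect the top-level folder names (normally one folder per plugin).
        top_level = sorted({name.split("/")[0] for name in bundle.namelist() if name})
        bundle.extractall(target)
    return top_level
```

A call might look like `install_plugin_archive("harvester.zip", "/home/[USER NAME]/omeka/plugins/")`, where the archive file name is only a placeholder for whatever was downloaded in step 1.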

Page 4: The OAI-PMH Harvester Plugin for The Omeka Content Management System

The Purpose Behind the OAI-PMH

Metadata shared using the OAI-PMH is structured in a uniform manner, ensuring that metadata for all collections shared on the World Wide Web can be harvested regardless of the specific application

For example, one institution can archive content using the Drupal application as a repository, while another institution can archive content using Omeka

Using the OAI-PMH protocol, both repositories can be configured to exchange information detailing the contents of their archived collections.

Page 5: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Repository Interoperability

Unfortunately, not every digital repository has been developed using the same framework (or even the same programming language[s])

Thus, if OAI-PMH were to attempt to institute language-specific standards for exchanging metadata, inevitably some repository application would be developed in an unsupported language

The solution to this is the software object

Page 6: The OAI-PMH Harvester Plugin for The Omeka Content Management System

OAI-PMH Metadata Objects

For the purposes of this presentation, a software object is a means by which to structure data in a language-independent manner

Because the Open Archives Initiative seeks to establish the OAI-PMH as the definitive standard for the exchange of repository metadata, this language independence increases the likelihood that future repository applications (some of which will be written in languages that do not yet exist) will still be able to employ the protocol

Page 7: The OAI-PMH Harvester Plugin for The Omeka Content Management System

OAI-PMH Metadata Objects

The metadata objects are transferred over the HyperText Transfer Protocol (HTTP)

This means that no platform-specific binaries are needed in order to harvest OAI-PMH-compliant metadata

(e.g. anyone can access information detailing the contents of these archived collections using a web browser – you do not need to purchase or install any additional software)
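Because the transport is plain HTTP, a few lines of standard-library code are enough to retrieve a response. The sketch below is illustrative only; the function name `fetch_oai_response` and the 30-second timeout are assumptions, not part of the protocol or of the Omeka plugin.

```python
from urllib.request import urlopen


def fetch_oai_response(url, timeout=30):
    """Return the body of an OAI-PMH response as text, using only the standard library."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Passing any OAI-PMH request URL to `fetch_oai_response` should return the same XML that a web browser would display for that address.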

Page 8: The OAI-PMH Harvester Plugin for The Omeka Content Management System

OAI-PMH Metadata Objects

The metadata objects are bound to/serialized using the eXtensible Markup Language (XML)

This is mentioned for the sake of those who are enrolled in LIS650, those who have previously taken LIS650, or those who are familiar with web design

For those unfamiliar with XML or web design itself, this simply means that this metadata can be extended and manipulated easily by web designers as well as developers

Page 9: The OAI-PMH Harvester Plugin for The Omeka Content Management System

An Instance of an OAI-PMH Metadata Object

In order to generate OAI-PMH-compliant metadata objects for one’s collection, one must first install and configure another plugin:

The OAI-PMH Repository (http://omeka.org/add-ons/plugins/oai-pmh-repository/)

Retrieving metadata from the repository: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc

The parameter “verb” specifies to wotan precisely what is being requested (e.g. a list of the records in my repository – “ListRecords”)

The parameter “metadataPrefix” specifies to wotan precisely which metadata framework to use in the formatting of the response (e.g. “oai_dc” is the OAI’s format which is based upon the Dublin Core framework)
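The request URL can be assembled programmatically from these two parameters. The sketch below is a minimal illustration: the helper name `build_oai_request` is hypothetical, while the six verb names are those defined by OAI-PMH 2.0. Checking the verb up front catches slips such as writing “ListRecord” for “ListRecords”.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = frozenset({
    "GetRecord",
    "Identify",
    "ListIdentifiers",
    "ListMetadataFormats",
    "ListRecords",
    "ListSets",
})


def build_oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a base URL, a verb, and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    # The verb comes first, followed by the remaining arguments in the order given.
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"
```

For example, `build_oai_request("http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request", "ListRecords", metadataPrefix="oai_dc")` reproduces the request shown above.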

Page 10: The OAI-PMH Harvester Plugin for The Omeka Content Management System

An Instance of an OAI-PMH Metadata Object

This was retrieved by requesting the following resource: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2011-11-03T19:46:59Z</responseDate> <!-- When I requested this object -->
  <!-- Which parameters were passed to wotan -->
  <request verb="ListRecords" metadataPrefix="oai_dc">http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request</request>
  <ListRecords> <!-- A detailed listing of the collection records -->
    <record>
      <header>
        <identifier>oai:wotan.liu.edu/omeka/jgriffin/:5</identifier>
        <datestamp>2011-10-22T00:48:49Z</datestamp> <!-- Record creation time -->
        <setSpec>6</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.ope[...]">
          <!-- The Dublin Core Elements -->
          <dc:title>/src/bin/psql/psql.c</dc:title>
          <dc:creator>Regents of the University of California</dc:creator>
          <dc:publisher>[…]
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
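A response like this one can be read with any XML parser. The following sketch uses Python's standard library to pull the identifier, datestamp, titles, and creators out of each record. The sample document is a shortened, well-formed stand-in for the response above; its `xmlns` declarations are an assumption based on the standard OAI-PMH and Dublin Core namespaces, since the slide excerpt does not show them, and the function name `parse_records` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs (assumed; the slide excerpt omits the declarations).
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:wotan.liu.edu/omeka/jgriffin/:5</identifier>
        <datestamp>2011-10-22T00:48:49Z</datestamp>
        <setSpec>6</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>/src/bin/psql/psql.c</dc:title>
          <dc:creator>Regents of the University of California</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one summary dict per <record> in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NAMESPACES):
        header = record.find("oai:header", NAMESPACES)
        records.append({
            "identifier": header.findtext("oai:identifier", default="", namespaces=NAMESPACES),
            "datestamp": header.findtext("oai:datestamp", default="", namespaces=NAMESPACES),
            # Dublin Core elements are repeatable, so collect them as lists.
            "titles": [e.text for e in record.iterfind(".//dc:title", NAMESPACES)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NAMESPACES)],
        })
    return records
```

Calling `parse_records(SAMPLE_RESPONSE)` yields a single dictionary describing the psql.c item shown on the slide.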

Page 11: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Harvesting Metadata from Remote Repositories in Omeka

The plugin’s utility lies in its ability to import metadata describing items archived in a remote repository directly into one’s own repository

Conceptually, the mechanisms underlying this process are similar to those used in the practice of “copy cataloging”

Page 12: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Harvesting Metadata from Remote Repositories in Omeka

As noted previously, the remote server must expose its archived collections through an OAI-PMH repository

In order to demonstrate this, I can harvest from my own OAI-PMH repository: http://wotan.liu.edu/omeka/jgriffin/oai-pmh-repository/request

…as well as from the digital library (Bibliothèque numérique) of Université Rennes 2*:

http://bibnum.univ-rennes2.fr/oai-pmh-repository/request?verb=ListRecords&metadataPrefix=oai_dc

*This source was specified by Sheila Brennan of the Roy Rosenzweig Center for History and New Media. Please see http://omeka.org/blog/2011/08/29/do-you-share-your-data/
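At the protocol level, harvesting a repository amounts to issuing ListRecords requests until the repository stops returning a resumption token. The sketch below illustrates that loop in Python; it is not the Omeka plugin's own code (the plugin is written in PHP), and the names `harvest_records` and `fetch` are hypothetical. The `fetch` argument is any callable that takes a URL and returns the response body as text. Protocol-level `<error>` responses are not handled here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_records(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield every <record> element of a repository, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(f"{base_url}?{urlencode(params)}"))
        yield from root.iterfind(".//oai:record", OAI_NS)

        # A missing or empty resumptionToken marks the end of the list.
        token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
        if not token or not token.strip():
            break
        # A follow-up request carries only the verb and the resumption token.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
```

Taking the fetching function as a parameter keeps the loop independent of any particular HTTP library and makes it possible to exercise it against canned responses before pointing it at a live repository such as the two listed above.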

Page 13: The OAI-PMH Harvester Plugin for The Omeka Content Management System

Harvesting Metadata from Remote Repositories in Omeka

Metadata sets can be re-harvested or deleted

While a set of records is being harvested, one is offered the ability to “kill” the process

Should there be problems regarding the memory required by the harvester, one can modify the settings of the plugin

The “Memory Limit” field should only be modified if a harvest fails due to an error.

The path for the PHP binary should always be ‘/usr/bin/php5’ on wotan

Page 14: The OAI-PMH Harvester Plugin for The Omeka Content Management System

The OAI-PMH Harvester Plug-In for the Omeka Digital Archive

Questions?

Comments?