Archive Information Packages for NASA HDF-EOS Data

Archival Information Packages for NASA HDF-EOS Data
R. Duerr, Kent Yang, Azhar Sikander

One of the guiding concepts of the Reference Model for an Open Archival Information System, commonly referred to as the OAIS Reference Model, is the Archive Information Package (AIP). An AIP contains not just the data to be preserved for future access, but also the reference information needed to ensure that the data is understandable by its target audience, and the preservation description information that records the lineage of the data and ensures that an accurate, unaltered copy can be retrieved at any point in the future. While creating AIPs is simple in principle, it is not obvious that it will be as simple in practice. This talk reports the results of an experiment to develop AIPs for data in NASA's Earth Observing System (EOS) Data and Information System (EOSDIS).

Transcript of Archive Information Packages for NASA HDF-EOS Data

Page 1: Archive Information Packages for NASA HDF-EOS Data

Archival Information Packages for NASA HDF-EOS Data
R. Duerr, Kent Yang, Azhar Sikander

Page 2: Archive Information Packages for NASA HDF-EOS Data

Outline

• What is an Archival Information Package?
  - HDF-AIP
• Standards? What Standards?
  - METS
  - DIF/FGDC/ISO 19115-2
  - PREMIS
• Results
• Next Steps

Archival Information Packages for NASA HDF-EOS Data, presented 11/4/09 by R. Duerr HDF and HDF-EOS Workshop XIII

Page 3: Archive Information Packages for NASA HDF-EOS Data

OAIS Reference Model (1)

[Figure: OAIS Reference Model diagram; label: Archive Information Package]

(1) Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

Page 4: Archive Information Packages for NASA HDF-EOS Data

Archival Information Package Contents

• Content Information
  - The data object to be preserved
  - Information that describes the data object
    o Typically interpreted as the syntax and semantics of the file structure
• Preservation Description Information
  - Provenance – origin or source of the data, any changes that have taken place since, and who has had custody of it
  - Fixity – the authentication mechanisms (with keys) needed to ensure that the data object has not been altered in an undocumented manner
  - Reference – identification mechanisms and values
  - Context – relation of the object to its environment
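
The fixity item above can be illustrated with a checksum. The following is a minimal sketch, not the method used in the project: the function names `compute_fixity` and `verify_fixity` and the choice of SHA-256 are assumptions made for illustration. It records a digest when a file is archived and recomputes it later to detect undocumented changes.

```python
import hashlib
from pathlib import Path


def compute_fixity(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading in chunks so large granules are not loaded whole."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, expected: str, algorithm: str = "sha256") -> bool:
    """Compare a freshly computed digest with the value stored in the AIP metadata."""
    return compute_fixity(path, algorithm) == expected.lower()
```

In an AIP the stored digest and the name of the algorithm would travel with the metadata, so that a later custodian can repeat the check without knowing how the archive was originally built.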

Page 5: Archive Information Packages for NASA HDF-EOS Data

HDF-Archive Information Packages

• The HDF Group was funded to investigate and propose a design for a complete archival information package for HDF data files

• The result was a METS metadata file to accompany the HDF data file

http://www.hdfgroup.org/projects/hdf5_aip/hdf5_aip_wp.html

Page 6: Archive Information Packages for NASA HDF-EOS Data

Metadata Standards - METS

• Metadata Encoding and Transmission Standard
• An initiative of the Digital Library Federation
• Provides the means to convey the metadata necessary for
  - management of digital objects within a repository
  - exchange of objects between repositories (or between repositories and their users)
• Designed to facilitate
  - shared development of information management tools/services
  - interoperable exchange of digital materials

Page 7: Archive Information Packages for NASA HDF-EOS Data

METS - A very brief overview

• Describes the METS document itself, e.g., creator or editor
• Describes the object using some external standard, e.g., MARC, FGDC, Dublin Core
• Describes object creation, storage, intellectual property rights, source info, provenance, etc., e.g., PREMIS
• Provides an inventory of all of the files that are part of the object described
• A physical or logical map of the organization of the materials described
• Allows specification of hyperlinks between parts of the map (mostly useful when preserving websites)
• Used to associate executable code with parts of the content
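
The seven roles listed on this slide correspond, in order, to the seven top-level sections of the METS schema. The sketch below builds an empty skeleton to show that layout. The element names and namespace come from the METS schema rather than from the slide text, the helper name `build_mets_skeleton` is hypothetical, and the result is an outline only: it is not schema-valid (for example, a real `structMap` needs at least one `div`).

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Element names follow the METS schema; the roles paraphrase the slide, in the same order.
METS_SECTIONS = [
    ("metsHdr", "describes the METS document itself"),
    ("dmdSec", "describes the object using an external standard"),
    ("amdSec", "creation, storage, rights, source and provenance metadata"),
    ("fileSec", "inventory of the files that make up the object"),
    ("structMap", "physical or logical map of the materials"),
    ("structLink", "hyperlinks between parts of the map"),
    ("behaviorSec", "executable code associated with the content"),
]


def build_mets_skeleton() -> ET.Element:
    """Return a <mets> root with one empty child per section, each preceded by a comment."""
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name, role in METS_SECTIONS:
        root.append(ET.Comment(f" {role} "))
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


if __name__ == "__main__":
    print(ET.tostring(build_mets_skeleton(), encoding="unicode"))
```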

Page 8: Archive Information Packages for NASA HDF-EOS Data

Metadata Standards - Descriptive Metadata

• Discovery, Assess and Access Metadata
  - GCMD DIF
  - FGDC CSDGM
  - ISO 19115

[Figure: diagram relating the three standards; arrow label: Derived from]

Page 9: Archive Information Packages for NASA HDF-EOS Data

Metadata Standards - ISO 19115:2003

• The international equivalent of the FGDC standard

• Most fields can be mapped or generated from FGDC metadata

• The exception is the Dataset Topic Keywords

• Allows for national profiles
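
The points above can be sketched as a small crosswalk. This is an illustration under stated assumptions, not the mapping used in the project: the element paths are simplified, the dictionary covers only a handful of fields, the function name `fgdc_to_iso` is hypothetical, and the sketch assumes that the slide's "Dataset Topic Keywords" refers to the ISO 19115 topic category, which has no FGDC source and therefore has to be supplied by hand.

```python
from typing import Dict, Optional

# Simplified FGDC CSDGM short-name paths mapped to simplified ISO 19115 element paths.
# Illustrative subset only; a real crosswalk covers many more elements.
FGDC_TO_ISO: Dict[str, str] = {
    "idinfo/citation/citeinfo/title": "identificationInfo/citation/title",
    "idinfo/descript/abstract": "identificationInfo/abstract",
    "idinfo/descript/purpose": "identificationInfo/purpose",
    "idinfo/spdom/bounding/westbc": "identificationInfo/extent/westBoundLongitude",
    "idinfo/spdom/bounding/eastbc": "identificationInfo/extent/eastBoundLongitude",
    "idinfo/spdom/bounding/southbc": "identificationInfo/extent/southBoundLatitude",
    "idinfo/spdom/bounding/northbc": "identificationInfo/extent/northBoundLatitude",
}


def fgdc_to_iso(fgdc: Dict[str, str], topic_category: Optional[str] = None) -> Dict[str, str]:
    """Map known FGDC fields to ISO paths; the topic category must be given explicitly."""
    if topic_category is None:
        raise ValueError("ISO 19115 topic category has no FGDC source; supply it explicitly")
    iso = {FGDC_TO_ISO[key]: value for key, value in fgdc.items() if key in FGDC_TO_ISO}
    iso["identificationInfo/topicCategory"] = topic_category
    return iso
```

Raising an error when the topic category is missing, rather than guessing one, keeps the one field that cannot be derived visible to whoever curates the record.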

Page 10: Archive Information Packages for NASA HDF-EOS Data

Metadata Standards - ISO 19115:2003

Page 11: Archive Information Packages for NASA HDF-EOS Data

Is there a metadata standard for AIP information?

[Figure: OAIS Reference Model diagram (1); label: Archive Information Package]

(1) Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

Page 12: Archive Information Packages for NASA HDF-EOS Data

Preservation Metadata Implementation Strategies (PREMIS)

• Provide a core preservation metadata set with broad applicability across the digital preservation community
• Developed by an OCLC and RLG sponsored international working group
  - Representatives from libraries, museums, archives, government, and the private sector
• Maintained by the Library of Congress
• Based on the OAIS reference model

Page 13: Archive Information Packages for NASA HDF-EOS Data

PREMIS - Entity-Relationship Diagram

[Figure: entity-relationship diagram of the five PREMIS entities, annotated with the definitions below]

• Intellectual Entities: “a coherent set of content that is reasonably described as a unit”. For example, a web site, data set or collection of data sets
• Objects: “a discrete unit of information in digital form”. For example, a data file
• Rights: “assertions of one or more rights or permissions pertaining to an object or an agent”, e.g., copyright notice, legal statute, deposit agreement
• Events: “an action that involves at least one object or agent known to the preservation repository”, e.g., created, archived, migrated
• Agents: “a person, organization, or software program associated with preservation events in the life of an object”, e.g., Dr. Spock donated it
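
The entity definitions on this slide can be sketched as plain data classes. This is an illustration of the relationships only, not the PREMIS schema: the class and field names are hypothetical, and real PREMIS records carry many more semantic units. The sketch enforces the one constraint quoted in the Events definition, that an event involves at least one object or agent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    name: str
    agent_type: str  # "person", "organization" or "software"


@dataclass
class DigitalObject:
    identifier: str
    intellectual_entity: str  # the data set or collection this file belongs to


@dataclass
class RightsStatement:
    basis: str  # e.g. copyright notice, legal statute, deposit agreement
    applies_to: List[DigitalObject] = field(default_factory=list)


@dataclass
class Event:
    event_type: str  # e.g. "created", "archived", "migrated"
    date: str  # ISO 8601 date string, so lexical order is chronological order
    objects: List[DigitalObject] = field(default_factory=list)
    agents: List[Agent] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.objects and not self.agents:
            raise ValueError("an event must involve at least one object or agent")


def history(obj: DigitalObject, events: List[Event]) -> List[Event]:
    """Return the events that involve obj, oldest first."""
    return [e for e in sorted(events, key=lambda e: e.date) if obj in e.objects]
```

Keeping events separate from objects, and linking them by reference, is what lets a single migration event be recorded once and still appear in the history of every file it touched.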

Page 14: Archive Information Packages for NASA HDF-EOS Data

Is there a metadata standard for AIP information?

[Figure: OAIS Reference Model diagram (1); labels: PREMIS, ISO 19115]

(1) Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

Page 15: Archive Information Packages for NASA HDF-EOS Data

NOAA Data Stewardship Prototype

• NSIDC and The HDF Group (THG) demonstrated the feasibility of migrating NASA data to a standard HDF-AIP format
• Motivation:
  - Technologies change regularly, organizations come and go, but data must survive
  - But preserving data takes more than just preserving the bits; all the components of an AIP are critical

Page 16: Archive Information Packages for NASA HDF-EOS Data

Project Goals

• Prototype development of Archive Information Packages for HDF data:
  - For entire data sets
  - For individual “granules”

• Test usability of digital library standards with geospatial data

Page 17: Archive Information Packages for NASA HDF-EOS Data

Program Plan (Modified)

[Figure: flow diagram. Labels: NSIDC/ECS HDF4 data; H4toH5; NetCDF4/HDF5 data; CDM/NetCDF4; NSIDC/ECS metadata; ECS to METS (Data Set); ECS to METS (Granule); METS; ISO-19115; HDF5-AIP]

Page 18: Archive Information Packages for NASA HDF-EOS Data

HDF5 Granule Level Archive Information Packages

HDF5 AIP Components

• Data file: HDF5
• Metadata file: METS

Primary Schema                  Extension Schema

<mets>
|---<dmdSec>------------------- <ISO 19115>
|---<amdSec>
|      |--<techMD>
|      |--<rightsMD>            PREMIS
|      |--<sourceMD>
|---<fileGrp>
|---<structMap>

http://www.hdfgroup.uiuc.edu/papers/papers/AIP/HDF5_AIP_White_Paper.pdf
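
To show how the layout on this slide might be read back, the sketch below parses a small hand-written METS document and reports which extension schema is wrapped in each section. It rests on several assumptions: the sample document is hypothetical and not taken from the project, PREMIS is assumed to fill all three `amdSec` subsections, ISO 19115 is declared through `MDTYPE="OTHER"` with an `OTHERMDTYPE` value, and `fileGrp` is placed inside `fileSec` as the METS schema requires. The function name `extension_schemas` is also hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical granule-level metadata file; the xmlData payloads are left empty.
SAMPLE_AIP = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="dmd1">
    <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="ISO19115"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:amdSec ID="amd1">
    <mets:techMD ID="tech1"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData/></mets:mdWrap></mets:techMD>
    <mets:rightsMD ID="rights1"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData/></mets:mdWrap></mets:rightsMD>
    <mets:sourceMD ID="source1"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData/></mets:mdWrap></mets:sourceMD>
  </mets:amdSec>
  <mets:fileSec><mets:fileGrp/></mets:fileSec>
  <mets:structMap><mets:div/></mets:structMap>
</mets:mets>
"""


def extension_schemas(mets_xml: str) -> Dict[str, str]:
    """Map each METS section that directly wraps metadata to its declared metadata type."""
    root = ET.fromstring(mets_xml)
    found: Dict[str, str] = {}
    for parent in root.iter():
        wrap = parent.find("mets:mdWrap", NS)  # direct children only
        if wrap is None:
            continue
        section = parent.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        md_type = wrap.get("MDTYPE")
        if md_type == "OTHER":
            md_type = wrap.get("OTHERMDTYPE", "OTHER")
        found[section] = md_type
    return found


if __name__ == "__main__":
    for section, schema in extension_schemas(SAMPLE_AIP).items():
        print(f"{section}: {schema}")
```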

Page 19: Archive Information Packages for NASA HDF-EOS Data

File Level AIP Activity Status

• Developed a map from NSIDC/ECS metadata to METS/PREMIS/ISO 19115 components
• Prototype software completed
• Issues
  - What goes in PREMIS vs ISO 19115?
  - Auxiliary file handling - own AIP or not?
    o E.g., browse files, processing history, PGEs
  - Granules vs files

Page 20: Archive Information Packages for NASA HDF-EOS Data

Issues and Questions

• Inconsistent use of terminology between standards – for example, what is a data set?

• Many of the standards care about distribution formats
  - Are these even relevant concepts any more?
  - Do you really want to have to update the metadata record just because a new distribution format was added?
  - What about new access services?

Page 21: Archive Information Packages for NASA HDF-EOS Data

Next Steps

• NSIDC is updating our non-ECS data systems' handling of metadata, including support for PREMIS and related metadata on all holdings

• Work is underway to upgrade granule-level metadata for NSIDC flagship sea ice products (PREMIS/METS/ISO AIP packages)

• Work to improve the archivability of data stored in HDF formats is ongoing – NASA is implementing a standard XML description of contents across its archives

Page 22: Archive Information Packages for NASA HDF-EOS Data

Acknowledgement

This work was supported under NOAA Scientific Stewardship Program grant number NA07OAR4310286. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NOAA.
