Lifecycle Metadata for Digital Objects
November 15, 2004
Preservation Metadata
Definition from PADI: Preservation metadata is intended to store technical details on the format, structure and use of the digital content, the history of all actions performed on the resource including changes and decisions, the authenticity information such as technical features or custody history, and the responsibilities and rights information applicable to preservation actions.
What is Preservation Metadata?
Object stability (OAIS “content data object”)
– What elements of the object’s content should be preserved? What is it? What is it for?
– What functions of the object should be preserved?
– (i.e., how can it remain itself into the future, and what do we mean by “itself”?)
Environmental support (OAIS “environment”)
– What kind of environmental characteristics does the object need to stay alive (software, hardware)?
– (i.e., how do we specify its life support system?)
Object Stability I: Content
Authenticity revisited: stability for what?
– Access to genuine article
– Historical truth
– Guarantee of prior art
– Intellectual property guarantee
Range of attributes needed for each
– What does “content” mean?
– What is needed for it to remain “the same”?
Object Stability II: Functionality
Static objects (e.g. text)
– Is “content” enough?
– Look and feel
Dynamic objects (e.g. computer game)
– Look and feel (“experiential” elements)
– Connectivity
– Interactivity
Environmental Support I: Emulation
Making it possible to see the object as it was originally seen
Making it possible for the object to function as it originally did
Providing software support for that to happen
– Running the original program (in an environment that emulates the original environment)
– Running something that looks like (emulates) the original program
Environmental Support II: Migration
Deciding what to migrate (deciding what to lose)
Transformations to the object
– If reversible, no need to keep original object (this is no longer acceptable: note its connection with space terror)
– If not, retention of original object necessary
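The reversible/irreversible distinction above can be checked mechanically, at least at the bit level. The following is a minimal sketch, not a method taken from the slides: it assumes a hypothetical pair of functions, `migrate` and `reverse`, round-trips the object through them, and compares SHA-256 digests. The zlib and upper/lower-case transformations are stand-ins chosen only for illustration.

```python
import hashlib
import zlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the comparison value."""
    return hashlib.sha256(data).hexdigest()


def is_reversible(original: bytes, migrate, reverse) -> bool:
    """True if reverse(migrate(original)) reproduces the original bit for bit."""
    return digest(reverse(migrate(original))) == digest(original)


# Stand-in transformations (assumptions for illustration only).
lossless = (zlib.compress, zlib.decompress)         # round-trips exactly
lossy = (lambda b: b.upper(), lambda b: b.lower())  # case information is lost

sample = b"Lifecycle Metadata for Digital Objects"
print(is_reversible(sample, *lossless))  # True
print(is_reversible(sample, *lossy))     # False
```

A check like this only shows which kind of transformation was applied to one sample; it says nothing about look and feel or functionality, and per the slide it does not by itself justify discarding the original.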
Documentation requirements for preservation
What the object was
What the object is
What happened in between
OAIS metadata model I
OAIS metadata model II
SIP (send), AIP (archive), DIP (disseminate)
Parts of an object
– Content
– Preservation description
• Reference (unique identifier)
• Provenance (history in and out of repository)
• Context (archival bond)
• Fixity (message digest)
– Packaging
– Descriptive
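The four preservation description components listed above map naturally onto a small record type. This is a rough sketch under stated assumptions, not an OAIS-defined schema: the class name, field types, the `describe` and `verify` helpers, the identifier `obj-0001`, and the choice of SHA-256 as the message digest are all illustrative.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationDescription:
    reference: str                                       # unique identifier
    provenance: List[str] = field(default_factory=list)  # history in and out of repository
    context: List[str] = field(default_factory=list)     # archival bond: related objects
    fixity: str = ""                                     # message digest of the content


def describe(identifier: str, content: bytes) -> PreservationDescription:
    """Build a description whose fixity is the SHA-256 digest of the content."""
    return PreservationDescription(
        reference=identifier,
        provenance=["ingested"],
        fixity=hashlib.sha256(content).hexdigest(),
    )


def verify(pdi: PreservationDescription, content: bytes) -> bool:
    """Recompute the digest and compare it with the stored fixity value."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity


pdi = describe("obj-0001", b"hello")
print(verify(pdi, b"hello"))   # True
print(verify(pdi, b"hello!"))  # False
```

Keeping fixity next to reference and provenance means a later check can say not only that the bits changed but which object changed and what had been done to it.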
OAIS metadata model II
What is “representation information”?
– How much must be kept?
– Monitoring changes
What is the “knowledge base”?
– Designated User Community
– How do ontologies and the Semantic Web fit here?
Remember the DUC will need access through automated tools (cf. metadata and software registries)
– Where does bootstrapping stop?
– DUC as “the public” also means a much broader universe of discourse
NEDLIB I: object layers
Significant focus on emulation
Part of OAIS “context” here
OAIS model dictates layered view of original object (NEDLIB uses format + program)
– Physical (storage format + hardware dependencies)
– Binary (file system + operating system)
– Structure (representation of human-viewable object in digital environment + interpreter)
– Object (format of object + routines to interpret)
– Application (needed to render the object)
NEDLIB II: adding rest of OAIS
Reference (identifier)
Fixity (how to know it’s the same)
Context (parts) + provenance = change history
Most of this summed in: change history
– Date
– Old version
– New version
– Tool
– Reverse
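The five change history elements above can be sketched as one record per change. This is an illustrative reading, not the NEDLIB specification: the class and function names are hypothetical, and treating "Reverse" as the name of a tool that undoes the change (or `None` when there is none) is an assumption.

```python
import datetime
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChangeEvent:
    date: datetime.date
    old_version: str
    new_version: str
    tool: str
    reverse: Optional[str]  # assumed: tool that undoes the change, None if irreversible


def versions(history: List[ChangeEvent]) -> List[str]:
    """Walk the history in date order and list each version the object has had."""
    ordered = sorted(history, key=lambda e: e.date)
    if not ordered:
        return []
    return [ordered[0].old_version] + [e.new_version for e in ordered]


def fully_reversible(history: List[ChangeEvent]) -> bool:
    """True only if every recorded change names a reversing tool."""
    return all(e.reverse is not None for e in history)


# Hypothetical version and tool names, deliberately listed out of date order.
history = [
    ChangeEvent(datetime.date(2004, 11, 15), "v2", "v3", "tool-b", None),
    ChangeEvent(datetime.date(2003, 1, 10), "v1", "v2", "tool-a", "tool-a-undo"),
]
print(versions(history))          # ['v1', 'v2', 'v3']
print(fully_reversible(history))  # False
```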
CEDARS preservation metadata thinking
Distributed archives preservation project
Development of representation network
Formal development of “significant properties” idea
– Functionality required by viewers
– Always retain original!!
Migration on request for end users
National Library of Australia preservation metadata
See table containing 25 elements
Note that many elements have subelements
Note influence of notion of versioning
OCLC/RLG preservation metadata thinking
Attempt to provide summary/unification of state of play
See recommended metadata set: http://www.oclc.org/research/projects/pmwg/pm_framework.pdf
Ultimately served to underlie OCLC implementation
How does METS fit here?
Harvard Digital Repository Service XML example
OCLC/RLG 2001 white paper examples
– DRS instruction file
• Note that the file contains “instructions” in the form of names of actions
• This instruction file assumes that instructions are executed at ingest
– DRS DTD for images
Link to the DRS overview description: http://hul.harvard.edu/ois/systems/drs/policyguide.html
DSpace and friends
DSpace as a framework
Version 1.2
Roadmap
SDSC alliance
PREMIS project (OCLC/RLG)
PREMIS (Preservation Metadata: Implementation Strategies), 2003
Elements working group for preservation metadata core (report due end 2004): http://www.oclc.org/research/projects/pmwg/core_elements.htm
Implementation subgroup polled for best practices
22 institutions have implemented repositories; only 11 have preservation strategies in place (see summary report RLG Diginews October 15, 2004)
Other ideas
Why has there been so little progress?
Thinking through loss in the cultural record
– Where/how have greatest losses happened?
– What happened as a result?
– What does it mean to have an adequate record?
“Good enough” fixity
– Peer-to-peer schemes (LOCKSS)
– Using “evidence” concept to restrict authenticity requirements
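One way to picture “good enough” fixity across peers is a simple vote over the digests of each peer's copy. The sketch below is only loosely in the spirit of peer-to-peer schemes and is not the LOCKSS protocol; the `poll` function, the peer names, and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def poll(copies: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Return the most common digest and the peers whose copy disagrees with it."""
    digests = {peer: hashlib.sha256(data).hexdigest() for peer, data in copies.items()}
    # On a tie, Counter.most_common returns the digest encountered first.
    winner, _ = Counter(digests.values()).most_common(1)[0]
    dissenters = sorted(peer for peer, d in digests.items() if d != winner)
    return winner, dissenters


# Hypothetical peers; peer-c holds a damaged copy.
copies = {"peer-a": b"record", "peer-b": b"record", "peer-c": b"rec0rd"}
winner, dissenters = poll(copies)
print(dissenters)  # ['peer-c']
```

No single peer's stored digest is treated as authoritative here; agreement among copies stands in for a guarantee, which is the sense in which the fixity is only “good enough”.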