Lifecycle Metadata for Digital Objects


Page 1: Lifecycle Metadata for Digital Objects

Lifecycle Metadata for Digital Objects

November 15, 2004. Preservation Metadata

Page 2: Lifecycle Metadata for Digital Objects

Definition from PADI

Preservation metadata is intended to store technical details on the format, structure and use of the digital content, the history of all actions performed on the resource including changes and decisions, the authenticity information such as technical features or custody history, and the responsibilities and rights information applicable to preservation actions.

Page 3: Lifecycle Metadata for Digital Objects

What is Preservation Metadata?

Object stability (OAIS “content data object”)
– What elements of the object’s content should be preserved? What is it? What is it for?
– What functions of the object should be preserved?
– (i.e., how can it remain itself into the future, and what do we mean by “itself”?)
Environmental support (OAIS “environment”)
– What kind of environmental characteristics does the object need to stay alive (software, hardware)?
– (i.e., how do we specify its life support system?)

Page 4: Lifecycle Metadata for Digital Objects

Object Stability I: Content

Authenticity revisited: stability for what?
– Access to genuine article
– Historical truth
– Guarantee of prior art
– Intellectual property guarantee
Range of attributes needed for each
– What does “content” mean?
– What is needed for it to remain “the same”

Page 5: Lifecycle Metadata for Digital Objects

Object Stability II: Functionality

Static objects (e.g. text)
– Is “content” enough?
– Look and feel
Dynamic objects (e.g. computer game)
– Look and feel (“experiential” elements)
– Connectivity
– Interactivity

Page 6: Lifecycle Metadata for Digital Objects

Environmental Support I: Emulation

Making it possible to see the object as it was originally seen
Making it possible for the object to function as it originally did
Providing software support for that to happen
– Running the original program (in an environment that emulates the original environment)
– Running something that looks like (emulates) the original program

Page 7: Lifecycle Metadata for Digital Objects

Environmental Support II: Migration

Deciding what to migrate (deciding what to lose)
Transformations to the object
– If reversible, no need to keep original object (this is no longer acceptable: note its connection with space terror)
– If not, retention of original object necessary
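The reversible/irreversible distinction above can be illustrated with a round-trip check. This is only a minimal sketch: the function names and the two text-encoding "migrations" are hypothetical stand-ins chosen for illustration, and passing the check for one sample shows reversibility for that sample only, not for the migration in general.

```python
from typing import Callable

Transform = Callable[[bytes], bytes]


def round_trips(original: bytes, migrate: Transform, reverse: Transform) -> bool:
    """Return True if reversing the migration restores the original bytes."""
    return reverse(migrate(original)) == original


# Hypothetical reversible migration: Latin-1 text re-encoded as UTF-8.
def latin1_to_utf8(data: bytes) -> bytes:
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    return data.decode("utf-8").encode("latin-1")


# Hypothetical lossy migration: characters outside ASCII are replaced by "?".
def latin1_to_ascii(data: bytes) -> bytes:
    return data.decode("latin-1").encode("ascii", errors="replace")


def ascii_to_latin1(data: bytes) -> bytes:
    return data.decode("ascii").encode("latin-1")


sample = b"caf\xe9"  # "café" in Latin-1
print(round_trips(sample, latin1_to_utf8, utf8_to_latin1))    # reversible for this sample
print(round_trips(sample, latin1_to_ascii, ascii_to_latin1))  # information was lost
```

Even when a check like this passes, the slide's point stands: current practice is to retain the original object regardless.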

Page 8: Lifecycle Metadata for Digital Objects

Documentation requirements for preservation

What the object was
What the object is
What happened in between

Page 9: Lifecycle Metadata for Digital Objects

OAIS metadata model I

Page 10: Lifecycle Metadata for Digital Objects

OAIS metadata model II

SIP (send), AIP (archive), DIP (disseminate)
Parts of an object
– Content
– Preservation description
• Reference (unique identifier)
• Provenance (history in and out of repository)
• Context (archival bond)
• Fixity (message digest)
– Packaging
– Descriptive
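As a rough illustration of the four preservation description parts listed above, the sketch below models them as a record and fills in fixity with a message digest. The class name, field layout, and the choice of SHA-256 are assumptions made for the example; OAIS itself does not prescribe a digest algorithm or a data structure.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationDescription:
    reference: str                                        # unique identifier
    provenance: List[str] = field(default_factory=list)   # history in and out of repository
    context: str = ""                                     # archival bond to related objects
    fixity: str = ""                                      # message digest of the content


def digest(content: bytes) -> str:
    """Compute the message digest used as fixity (SHA-256 is an assumption)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def describe(identifier: str, content: bytes) -> PreservationDescription:
    """Build a description at ingest, recording the digest of the content."""
    return PreservationDescription(
        reference=identifier,
        provenance=["ingested"],
        fixity=digest(content),
    )


def is_unchanged(desc: PreservationDescription, content: bytes) -> bool:
    """Fixity check: does the content still match the recorded digest?"""
    return desc.fixity == digest(content)


desc = describe("obj-0001", b"original bitstream")
print(is_unchanged(desc, b"original bitstream"))  # digest matches
print(is_unchanged(desc, b"altered bitstream"))   # digest differs
```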

Page 11: Lifecycle Metadata for Digital Objects

OAIS metadata model II

What is “representation information”?
– How much must be kept?
– Monitoring changes
What is the “knowledge base”?
– Designated User Community
– How do ontologies and the Semantic Web fit here?
Remember the DUC will need access through automated tools (cf. metadata and software registries)
– Where does bootstrapping stop?
– DUC as “the public” also means a much broader universe of discourse

Page 12: Lifecycle Metadata for Digital Objects

NEDLIB I: object layers

Significant focus on emulation
Part of OAIS “context” here
OAIS model dictates layered view of original object (NEDLIB uses format + program)
– Physical (storage format + hardware dependencies)
– Binary (file system + operating system)
– Structure (representation of human-viewable object in digital environment + interpreter)
– Object (format of object + routines to interpret)
– Application (needed to render the object)

Page 13: Lifecycle Metadata for Digital Objects

NEDLIB II: adding rest of OAIS

Reference (identifier)
Fixity (how to know it’s the same)
Context (parts) + provenance = change history
Most of this summed in: change history
– Date
– Old version
– New version
– Tool
– Reverse
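The five change-history elements above map naturally onto a small record type. The sketch below is illustrative only: the class and helper names are hypothetical, and "Reverse" is read here as the name of a tool that can undo the change (None if there is none), which is an assumption since the slide does not define it.

```python
import datetime
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    date: datetime.date
    old_version: str
    new_version: str
    tool: str
    reverse: Optional[str] = None  # assumed: tool that undoes the change, if any


def chain_is_consistent(history: List[ChangeEvent]) -> bool:
    """Each event must start from the version the previous event produced."""
    return all(prev.new_version == nxt.old_version
               for prev, nxt in zip(history, history[1:]))


def fully_reversible(history: List[ChangeEvent]) -> bool:
    """True only if every recorded change names a reversing tool."""
    return all(event.reverse is not None for event in history)


# Hypothetical history for one object; file and tool names are made up.
history = [
    ChangeEvent(datetime.date(2004, 1, 5), "report-v1.doc", "report-v2.pdf", "doc2pdf"),
    ChangeEvent(datetime.date(2004, 6, 1), "report-v2.pdf", "report-v3.pdf",
                "pdf-normalizer", reverse="pdf-denormalizer"),
]
print(chain_is_consistent(history))
print(fully_reversible(history))
```

A history that is not fully reversible is the case where, per the migration slide, the original object has to be retained.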

Page 14: Lifecycle Metadata for Digital Objects

CEDARS preservation metadata thinking

Distributed archives preservation project
Development of representation network
Formal development of “significant properties” idea
– Functionality required by viewers
– Always retain original!!
Migration on request for end users

Page 15: Lifecycle Metadata for Digital Objects

National Library of Australia preservation metadata

See table containing 25 elements
Note that many elements have subelements
Note influence of notion of versioning

Page 16: Lifecycle Metadata for Digital Objects

OCLC/RLG preservation metadata thinking

Attempt to provide summary/unification of state of play
See recommended metadata set: http://www.oclc.org/research/projects/pmwg/pm_framework.pdf
Ultimately served to underlie OCLC implementation

Page 17: Lifecycle Metadata for Digital Objects

How does METS fit here?

Page 18: Lifecycle Metadata for Digital Objects

Harvard Digital Repository Service XML example

OCLC/RLG 2001 white paper examples
– DRS instruction file
• Note that the file contains “instructions” in the form of names of actions
• This instruction file assumes that instructions are executed at ingest
– DRS DTD for images
Link to the DRS overview description: http://hul.harvard.edu/ois/systems/drs/policyguide.html

Page 25: Lifecycle Metadata for Digital Objects

DSpace and friends

DSpace as a framework Version 1.2 Roadmap SDSC alliance

Page 26: Lifecycle Metadata for Digital Objects

PREMIS project (OCLC/RLG)

PREMIS (Preservation Metadata: Implementation Strategies), 2003
Elements working group for preservation metadata core (report due end 2004): http://www.oclc.org/research/projects/pmwg/core_elements.htm
Implementation subgroup polled for best practices
22 institutions have implemented repositories; only 11 have preservation strategies in place (see summary report RLG Diginews October 15, 2004)

Page 27: Lifecycle Metadata for Digital Objects

Other ideas

Why has there been so little progress?
Thinking through loss in the cultural record
– Where/how have greatest losses happened?
– What happened as a result?
– What does it mean to have an adequate record?
“Good enough” fixity
– Peer-to-peer schemes (LOCKSS)
– Using “evidence” concept to restrict authenticity requirements
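To make the peer-to-peer idea concrete, the sketch below runs a simple majority poll over the digests of copies held by several peers. It is not the LOCKSS protocol, only a simplified illustration of agreement among copies standing in for a single trusted digest; the function name, the peer names, and the SHA-256 choice are all assumptions.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def poll(copies: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Return (agreed digest, dissenting peers), or (None, []) without a strict majority."""
    if not copies:
        return None, []
    digests = {peer: hashlib.sha256(data).hexdigest() for peer, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None, []  # no digest is held by more than half of the peers
    dissenters = sorted(peer for peer, d in digests.items() if d != winner)
    return winner, dissenters


# Hypothetical peers; library-c holds a damaged copy.
copies = {
    "library-a": b"journal issue 12",
    "library-b": b"journal issue 12",
    "library-c": b"journal issue 1Z",
}
agreed, damaged = poll(copies)
print(agreed is not None, damaged)
```

Peers named as dissenters would be candidates for repair from a copy that matches the agreed digest.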