Establishing the significant properties of digital research

Gareth Knight, Centre for e-Research, King’s College London. iPRES 2008, 29 September 2008


Slides for the talk on establishing the significant properties of digital research at iPRES 2008 on 29-30 September 2008.

Transcript of Establishing the significant properties of digital research

Page 1: Establishing the significant properties of digital research

Establishing the significant properties of digital research

Gareth Knight

Centre for e-Research
King’s College London

iPRES 2008
29 September 2008

Page 2: Establishing the significant properties of digital research

Overview

• Definitions
• Potential risks to significant properties
• Criteria for deciding significance
• Recording and comparing SPs
• General observations

Page 3: Establishing the significant properties of digital research

InSPECT Project

• Project: Investigating the Significant Properties of Electronic Content over Time
• Development Partners: Centre for e-Research, KCL; The National Archives; The British Library (advisory)
• Objectives
  – Expand and articulate the concept of ‘significant properties’
  – Determine the properties that are significant to the long-term accessibility of different types of digital object (email, presentation, structured text, audio, raster images)
  – Develop methods for expressing and measuring properties to:
    • validate the results of preservation actions
    • support the needs of user communities

Page 4: Establishing the significant properties of digital research

Many definitions…

“The characteristics of digital objects that must be preserved over time in order to ensure the continued accessibility, usability and meaning of the objects”
Wilson, 2007

“Significant properties are those properties of digital objects that affect their quality, usability, rendering, and behaviour.”
Hedstrom & Lee, 2002

“Those characteristics (both technical, intellectual, and aesthetic) agreed by the archive or by the collection manager to be the most important features to preserve over time.”
The Cedars Project Report, 2001

Closely tied to notions of authenticity and integrity

Page 5: Establishing the significant properties of digital research

Representation Information

RI consists of:

• Structure information that describes the encoding scheme in which data is stored, e.g. format, encoding algorithm
• Semantic information that indicates how the values are to be interpreted, e.g. documentation that indicates how numeric values in a CSV or tab-delimited format must be interpreted
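
The distinction can be illustrated with a small sketch that is not part of the talk. It assumes a hypothetical two-column CSV file: the structure information tells the reader how to decode and split the bytes, and the semantic information tells it what each value means. The column names, units and sample data are invented for illustration.

```python
import csv
import io

# Structure information: how the bytes are encoded and laid out (hypothetical values).
STRUCTURE_INFO = {"encoding": "utf-8", "delimiter": ","}

# Semantic information: how each column's values are to be interpreted (hypothetical).
SEMANTIC_INFO = {
    "site": {"type": str, "meaning": "site code"},
    "depth": {"type": float, "unit": "m", "meaning": "depth below surface"},
}


def interpret(raw_bytes):
    """Decode a CSV byte stream and attach a type and unit to each value."""
    text = raw_bytes.decode(STRUCTURE_INFO["encoding"])
    reader = csv.DictReader(io.StringIO(text), delimiter=STRUCTURE_INFO["delimiter"])
    rows = []
    for record in reader:
        row = {}
        for column, value in record.items():
            info = SEMANTIC_INFO[column]
            # Store the typed value together with its unit (None if unitless).
            row[column] = (info["type"](value), info.get("unit"))
        rows.append(row)
    return rows


sample = b"site,depth\nA1,1.5\nB2,0.75\n"
rows = interpret(sample)
```

Without `SEMANTIC_INFO`, the value `1.5` could still be parsed, but nothing would say that it is a depth in metres.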

Page 6: Establishing the significant properties of digital research

Interpreting SPs in abstract

[Diagram comparing two models]

NAA Performance Model: Source, interpreted via Process, yields Performance

OAIS Reference Model: Data Object, interpreted using Representation Information, yields Information Object

Diagram labels: Encoding Properties; Significant Properties

Page 7: Establishing the significant properties of digital research

Interpreting SPs in practice

[Diagram]

NAA Performance Model: Source, interpreted via Process, yields Performance

data + computer + OS + application = information content

Page 8: Establishing the significant properties of digital research


Risk scenarios

“traditionally, preserving things meant keeping them unchanged; however … if we hold on to digital information without modifications, accessing the information will become increasingly more difficult, if not impossible.”
Su-Shing Chen, “The Paradox of Digital Preservation”, Computer, March 2001, pp. 24-28.

1. Recreation of source data
  • Hardware (e.g. upgrades, virtual machines, emulators)
  • Operating system
  • Software application

2. Conversion – format normalisation/migration
  • File format
  • Encoding format

Page 9: Establishing the significant properties of digital research

Differences in rendering…

[Screenshot: previous slide in OpenOffice Impress 2.0]

Page 10: Establishing the significant properties of digital research

What is significant?

“A digital object’s Significant Properties are not empirical; archives will make judgments at levels appropriate to fulfil their preservation responsibilities and meet the needs of the archive’s user communities”
The Cedars Project Report, 2001

“Definitions of Significant Properties that affect the aesthetics, implied meaning, and affordances of digital objects tend to be much more subjective and tied to the context of creation and use.”
Hedstrom & Lee, 2002

Fundamental questions of digital preservation:

1. What must you retain to ensure the integrity and authenticity of the digital object?
2. What can you lose without potential implications?

Page 11: Establishing the significant properties of digital research

Frameworks

• Rothenberg & Bikson (1999)
  – Encouraged analysis of business functions, followed by technological capabilities
• Digital Diplomatics (2001)
  – Created by InterPARES project, based on archival diplomatics
  – Analysis of Records rather than Objects
  – Examines Documentary form, Annotations, Context of creation and use
• Utility Analysis (2004)
  – Developed during DELOS project and used in PLANETS
  – File characteristics, process characteristics

Page 12: Establishing the significant properties of digital research

Criteria for deciding significance

• Composition of the digital object
  – Form in which the idea is expressed
  – Expression method in a digital environment
• Purpose
  – Intended function (e.g. diplomatic analysis)
  – Type of user
• Organisational investment
  – Strategic
  – Financial
  – Expectation
• Capability
  – Tools
  – Legal
  – Financial

[Diagram of factors bearing on Significant Properties. Labels: purpose, capability, investment, composition, intended function, intended community, tools, financial, expression method, embodiment method, expectation, legal, version, policy]

Page 13: Establishing the significant properties of digital research

Significant property types

• Characteristics of the intellectual content itself
  – Length (duration of audio recording, no. of characters)
  – Placement (e.g. audio playback through left/right speaker, position and size of shape, sequential order of several paragraphs)
• Properties that indicate the environment in which the intellectual content may be reproduced
  – Quality level (no. of colours, audio quality)
  – Access status (viewing, editing)
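
As a rough illustration that is not taken from the talk, some of these property types can be measured directly from a file. The sketch below uses Python's standard `wave` module to read the duration (length), channel count (related to left/right placement) and sample rate and bit depth (quality level) of a WAV recording. The mapping of technical fields to property types is an assumption made for this example, and `make_silent_wav` is a hypothetical helper that only generates test data.

```python
import io
import wave


def measure_wav(data):
    """Return a few candidate significant properties of a WAV byte stream."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "duration_seconds": frames / rate,     # length
            "channels": wav.getnchannels(),        # placement (mono or stereo)
            "sample_rate_hz": rate,                # quality level
            "bit_depth": wav.getsampwidth() * 8,   # quality level
        }


def make_silent_wav(seconds, rate=8000, channels=2, sample_width=2):
    """Hypothetical helper: build an in-memory WAV file containing silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (seconds * rate * channels * sample_width))
    return buffer.getvalue()


properties = measure_wav(make_silent_wav(seconds=2))
```

Whether any of these measured values is actually significant still depends on the criteria for deciding significance; the code only shows that they can be extracted.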

Page 14: Establishing the significant properties of digital research

Cataloguing SPs (1)

Data Dictionary for Significant Properties:

  – Catalogue significant properties of a digital object that must be maintained
  – May be applied to a range of resource types, file formats and subject disciplines
  – Validate that information content is authentic with regard to its original meaning
  – Note any property constraints and their application to specific functions and designated communities

XML schema in the near future…

Page 15: Establishing the significant properties of digital research

Cataloguing SPs (2)

• identifier
• title
• description
• function
• genre
• preservation level
• specification registry
• measurement
• relationships
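
A minimal sketch of how one data dictionary entry might be held in code follows. Only the nine field names come from the slide; the field types, the example values and the class name `SignificantProperty` are assumptions for illustration and do not represent the InSPECT schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SignificantProperty:
    """One catalogue entry; field names follow the slide, types are assumed."""
    identifier: str
    title: str
    description: str
    function: str
    genre: str
    preservation_level: str
    specification_registry: str
    measurement: str
    relationships: list = field(default_factory=list)


# Hypothetical entry describing the duration of an audio recording.
duration = SignificantProperty(
    identifier="sp-audio-001",
    title="Duration",
    description="Length of the audio recording",
    function="playback",
    genre="audio",
    preservation_level="must be maintained",
    specification_registry="(hypothetical registry entry)",
    measurement="seconds",
    relationships=["sp-audio-002"],
)

# A plain dictionary form, convenient for serialising to XML or JSON later.
record = asdict(duration)
```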

Page 16: Establishing the significant properties of digital research

Compare and contrast

[Diagram: Information Object with Manifestation 1, described by property 1, property 2, ... property n]

Page 17: Establishing the significant properties of digital research

Compare and contrast

[Diagram: Information Object with Manifestation 1 (property 1 ... property n) and Manifestation 2 (property 2)]
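
The comparison in the diagram can be sketched as a property-by-property check between two manifestations of the same information object. This is an illustrative assumption about how such a comparison could work, not the project's tool: the function name, the dictionary layout and the sample values are all hypothetical.

```python
def compare_manifestations(original, derived, significant, tolerance=0.0):
    """Report, for each significant property, whether the derived copy keeps it."""
    report = {}
    for name in significant:
        before = original.get(name)
        after = derived.get(name)
        if before is None or after is None:
            # A property missing from either side cannot be shown to be maintained.
            kept = False
        elif isinstance(before, (int, float)) and isinstance(after, (int, float)):
            # Numeric properties may differ by at most the allowed tolerance.
            kept = abs(before - after) <= tolerance
        else:
            kept = before == after
        report[name] = {"before": before, "after": after, "maintained": kept}
    return report


# Hypothetical measurements of a recording before and after format migration.
manifestation_1 = {"duration_seconds": 120.0, "channels": 2, "format": "WAV"}
manifestation_2 = {"duration_seconds": 120.0, "channels": 1, "format": "MP3"}

report = compare_manifestations(
    manifestation_1,
    manifestation_2,
    significant=["duration_seconds", "channels"],
)
lost = [name for name, result in report.items() if not result["maintained"]]
```

Here the file format changes but is not listed as significant, so only the change from stereo to mono is flagged.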

Page 18: Establishing the significant properties of digital research

General Observations

• An understanding of significant properties is useful during creation and distribution of objects, in addition to long-term curation and preservation.
• Although some properties may be identified that are important to the form of resources, many decisions on significance require consideration of the context.
• Curators should begin to consider ways to capture and retain significant properties information on ingest into a repository.
• The success of a preservation activity should be evaluated on the basis of its ability to maintain significant properties.
• Further collaborative work between creators, archives/repositories and tool developers is required to provide a consistent approach.