Parul Sharma Sally Vermaaten Right Combination

The Right Combination: Using DDI and PREMIS for data preservation Parul Sharma & Sally Vermaaten March 2012


Page 1: Parul Sharma Sally Vermaaten Right Combination

The Right Combination: Using DDI and PREMIS for data preservation

Parul Sharma & Sally Vermaaten

March 2012

Page 2: Parul Sharma Sally Vermaaten Right Combination

Outline

1. The context – drivers for preservation
2. The problem – challenges faced when trying to re-use data
3. Our solution – metadata for data management & preservation
4. Our recommendations – strategies for making the right metadata choices

Page 3: Parul Sharma Sally Vermaaten Right Combination

1. THE CONTEXT: DRIVERS FOR PRESERVATION


Page 4: Parul Sharma Sally Vermaaten Right Combination

Data is a cross-domain concern

Geospatial data

Scientific data

Statistical data

Financial and commercial data

Page 5: Parul Sharma Sally Vermaaten Right Combination


There are many drivers for data preservation

Legal mandates

Verification

Uniqueness of data

Cost of data collection

Data re-use

Page 6: Parul Sharma Sally Vermaaten Right Combination


An example of data re-use at Statistics New Zealand

Page 7: Parul Sharma Sally Vermaaten Right Combination

2. THE PROBLEM: CHALLENGES FACED WHEN TRYING TO RE-USE DATA


Page 8: Parul Sharma Sally Vermaaten Right Combination

Common challenges to re-use/preservation of any type of digital object

Page 9: Parul Sharma Sally Vermaaten Right Combination

Common challenges to re-use/preservation of any type of digital object

– I can’t find it
– I can’t open it (wrong hardware/software)
– I’m not sure it is the right thing

Page 10: Parul Sharma Sally Vermaaten Right Combination

Unique challenges to re-use/preservation of structured data

Page 11: Parul Sharma Sally Vermaaten Right Combination

Unique challenges to re-use/preservation of structured data

– I’m not sure it is the authoritative data
– I don’t understand the meaning of the data - data is not self-descriptive
– I can’t use the data because I can’t harmonize it with other data

Page 12: Parul Sharma Sally Vermaaten Right Combination

3. OUR SOLUTION: METADATA FOR DATA MANAGEMENT & PRESERVATION


Page 13: Parul Sharma Sally Vermaaten Right Combination


Our solutions

Page 14: Parul Sharma Sally Vermaaten Right Combination


Our solutions

Page 15: Parul Sharma Sally Vermaaten Right Combination


Our solutions

Page 16: Parul Sharma Sally Vermaaten Right Combination


Our solutions

Page 17: Parul Sharma Sally Vermaaten Right Combination


Our solutions

Page 18: Parul Sharma Sally Vermaaten Right Combination

To support these processes… metadata is key.

We could invent our own standard for recording metadata, but there is a better way…

Page 19: Parul Sharma Sally Vermaaten Right Combination

How?

Discover! Dublin Core

+

Describe! Data Documentation Initiative (DDI)

+

Preserve! PREservation Metadata: Implementation Strategies (PREMIS)

Page 20: Parul Sharma Sally Vermaaten Right Combination

Comparison of standards coverage

Dublin Core
– Discovery information about a resource (e.g. Title, Creator, Publication date)

DDI
– Surveys and outputs (Series and Studies)
– Methodology & quality information
– Classifications used
– Dataset descriptions
– Variables used
– Links to documentation

PREMIS
– Objects (significant characteristics, checksums, basic identifying information)
– Events (preservation actions)
– Agents
– Rights
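
The coverage comparison above can be read as a lookup from a kind of information to the standard that records it. A minimal Python sketch, assuming the simplified groupings shown on the slide; the names `COVERAGE` and `standards_covering` are hypothetical and are not part of any of the three standards:

```python
# Simplified coverage table transcribed from the slide. It is an
# illustration only, not a complete account of each standard.
COVERAGE = {
    "Dublin Core": ["discovery information"],
    "DDI": [
        "surveys and outputs",
        "methodology and quality information",
        "classifications used",
        "dataset descriptions",
        "variables used",
        "links to documentation",
    ],
    "PREMIS": ["objects", "events", "agents", "rights"],
}


def standards_covering(topic):
    """Return the standards whose slide entry lists the given topic."""
    wanted = topic.strip().lower()
    return [name for name, topics in COVERAGE.items() if wanted in topics]


print(standards_covering("Variables used"))  # ['DDI']
print(standards_covering("Events"))          # ['PREMIS']
```

A topic that the slide does not list returns an empty list, which is a prompt to check the standards themselves rather than a statement that no standard covers it.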

Page 21: Parul Sharma Sally Vermaaten Right Combination

Metadata to support re-use

DDI

PREMIS

Page 22: Parul Sharma Sally Vermaaten Right Combination

4. OUR RECOMMENDATIONS: STRATEGIES FOR MAKING THE RIGHT METADATA CHOICES


Page 23: Parul Sharma Sally Vermaaten Right Combination

Metadata Top Tips

1. Create structures that will allow you to re-use metadata tools

2. Use standards that are fit for your content so users can re-use

3. Consider overlap between standards so you’re using the right standard for the right job

4. Provide standards-based tools and capture metadata at the point of creation to improve quality and efficiency

Page 24: Parul Sharma Sally Vermaaten Right Combination

1. Create structures that will allow you to re-use metadata tools

Set yourself up to be able to use the same tools to harvest and mine your metadata (e.g. handy reports, searching across content types) by:

– developing a standard structure that can support all your content types

– and recording generic information in generic metadata standards
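
One way to picture this tip is a single generic wrapper that every content type shares, with content-specific metadata kept alongside it. The following is a rough Python sketch under that assumption; `PreservedItem`, its fields, and `count_by_creator` are hypothetical names chosen for illustration and do not come from the presentation or from any standard:

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PreservedItem:
    """Hypothetical generic wrapper shared by every content type."""
    title: str         # generic, Dublin Core-style field
    creator: str       # generic, Dublin Core-style field
    content_type: str  # e.g. "statistical data", "image"
    # Content-specific metadata (e.g. a DDI or MIX record) is kept separately.
    specific: dict = field(default_factory=dict)


def count_by_creator(items):
    """One report that works across all content types."""
    return Counter(item.creator for item in items)


items = [
    PreservedItem("Example survey", "Example agency", "statistical data"),
    PreservedItem("Example photograph", "Example agency", "image"),
]
print(count_by_creator(items))  # Counter({'Example agency': 2})
```

Because the generic fields sit in the same place for every item, the same report or search code can run over surveys and images alike without knowing anything about their content-specific records.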


Page 26: Parul Sharma Sally Vermaaten Right Combination

2. Use standards that are fit for your content so users can re-use

Enable future re-use and understanding by recording format- or content-specific metadata in fit-for-purpose standards, e.g.

– DDI for statistical data
– SIARD for databases
– MIX for images

Page 27: Parul Sharma Sally Vermaaten Right Combination

3. Consider overlap between standards so you’re using the right standard for the right job

Information: Basic identifying information
– DDI: Title, Creator, PublicationDate, ID
– Dublin Core: Title, Creator, Date, Identifier
– Useful to duplicate? Yes

Information: Access information
– DDI: Access Conditions
– PREMIS: Rights entity
– Dublin Core: Rights
– Useful to duplicate? No – PREMIS is the most expressive and generic location
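
Where duplication is useful, as with the basic identifying information listed above, the overlapping fields can be copied from one record to the other rather than entered twice. A minimal Python sketch, assuming the field labels used on the slide (they are not exact element names from either schema); `DDI_TO_DUBLIN_CORE` and `to_dublin_core` are hypothetical names:

```python
# Field pairs taken from the basic identifying information on the slide.
DDI_TO_DUBLIN_CORE = {
    "Title": "Title",
    "Creator": "Creator",
    "PublicationDate": "Date",
    "ID": "Identifier",
}


def to_dublin_core(ddi_fields):
    """Copy the overlapping DDI fields into Dublin Core-style names."""
    return {
        dc_name: ddi_fields[ddi_name]
        for ddi_name, dc_name in DDI_TO_DUBLIN_CORE.items()
        if ddi_name in ddi_fields
    }


record = {"Title": "Example survey", "PublicationDate": "2012", "ID": "S-001"}
print(to_dublin_core(record))
# {'Title': 'Example survey', 'Date': '2012', 'Identifier': 'S-001'}
```

Access information is deliberately left out of the mapping, following the slide's advice to record it once in PREMIS.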

Page 28: Parul Sharma Sally Vermaaten Right Combination

4. Provide standards-based tools and capture metadata at the point of creation to improve quality and efficiency

At first, you may need to capture or collate all metadata about data yourself.

Think ahead about tools you might be able to provide to data experts so that, if possible, they can record the information directly in the standard.

Page 29: Parul Sharma Sally Vermaaten Right Combination


Page 30: Parul Sharma Sally Vermaaten Right Combination

Takeaways

1. Organisations have many reasons to re-use data over time
2. There are unique challenges to preserving data
3. Where possible, save yourself some work and make your metadata more harvestable and data more understandable by using international standards like DDI and PREMIS
4. When you use metadata standards like DDI and PREMIS together:
• create generic structures
• use fit-for-purpose standards for specific content
• consider information overlap
• ‘delegate’ metadata capture where possible