Parul Sharma Sally Vermaaten Right Combination

Post on 17-May-2015

The Right Combination:Using DDI and PREMIS for data preservation

Parul Sharma & Sally Vermaaten

March 2012

Outline

1. The context – drivers for preservation

2. The problem – challenges faced when trying to re-use data

3. Our solution – metadata for data management & preservation

4. Our recommendations – strategies for making the right metadata choices

1. THE CONTEXT: DRIVERS FOR PRESERVATION


Data is a cross-domain concern

Geospatial data

Scientific data

Statistical data

Financial and commercial data

There are many drivers for data preservation

Legal mandates

Verification

Uniqueness of data

Cost of data collection

Data re-use


An example of data re-use at Statistics New Zealand

2. THE PROBLEM: CHALLENGES FACED WHEN TRYING TO RE-USE DATA


Common challenges to re-use/preservation of any type of digital object

I can’t find it

I can’t open it (wrong hardware/software)

I’m not sure it is the right thing

Unique challenges to re-use/preservation of structured data

I’m not sure it is the authoritative data

I don’t understand the meaning of the data - data is not self-descriptive

I can’t use the data because I can’t harmonize it with other data

3. OUR SOLUTION: METADATA FOR DATA MANAGEMENT & PRESERVATION

Our solutions

To support these processes… metadata is key.

We could invent our own standard for recording metadata, but there is a better way…

How?

Discover! – Dublin Core

+

Describe! – Data Documentation Initiative (DDI)

+

Preserve! – PREservation Metadata: Implementation Strategies (PREMIS)

Comparison of standards coverage


Dublin Core
• Discovery information about a resource (e.g. Title, Creator, Publication date)

DDI
• Surveys and outputs (Series and Studies)
• Methodology & quality information
• Classifications used
• Dataset descriptions
• Variables used
• Links to documentation

PREMIS
• Objects (significant characteristics, checksums, basic identifying information)
• Events (preservation actions)
• Agents
• Rights
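As a rough illustration of the PREMIS side of this comparison, the Python sketch below records a checksum for an object and then logs a fixity check as an event. It is a simplified dictionary loosely modelled on PREMIS semantic unit names, not a conforming PREMIS record; the function names, the identifier and the sample bytes are all made up for the example.

```python
import hashlib


def describe_object(identifier, content):
    """Build a simplified, PREMIS-like description of an object (illustrative only)."""
    return {
        "objectIdentifier": identifier,
        "fixity": {
            "messageDigestAlgorithm": "SHA-256",
            "messageDigest": hashlib.sha256(content).hexdigest(),
        },
        "size": len(content),
    }


def fixity_check(record, content):
    """Recompute the checksum and return a simplified, PREMIS-like event."""
    matches = hashlib.sha256(content).hexdigest() == record["fixity"]["messageDigest"]
    return {
        "eventType": "fixity check",
        "linkingObjectIdentifier": record["objectIdentifier"],
        "eventOutcome": "pass" if matches else "fail",
    }


# Made-up sample data standing in for a preserved dataset file.
original = b"id,age\n1,34\n"
record = describe_object("example-dataset-001", original)

unchanged_event = fixity_check(record, original)
altered_event = fixity_check(record, original + b"2,41\n")
print(unchanged_event["eventOutcome"], altered_event["eventOutcome"])
```

Storing the checksum with the object and the check result as a separate event mirrors the split between Objects and Events listed above.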

Metadata to support re-use

DDI and PREMIS

4. OUR RECOMMENDATIONS: STRATEGIES FOR MAKING THE RIGHT METADATA CHOICES


Metadata Top Tips

1. Create structures that will allow you to re-use metadata tools

2. Use standards that are fit for your content so users can re-use

3. Consider overlap between standards so you’re using the right standard for the right job

4. Provide standards-based tools and capture at point of creation to improve quality and efficiency


1. Create structures that will allow you to re-use metadata tools

Set yourself up to be able to use the same tools to harvest and mine your metadata (e.g. handy reports, searching across content types) by:

– developing a standard structure that can support all your content types

– and recording generic information in generic metadata standards
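One way to picture this tip is a single record structure with generic fields shared by every content type and a separate slot for content-specific metadata. The Python sketch below is a hypothetical illustration, not the structure used at Statistics New Zealand; the class, field names and sample records are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical wrapper: generic fields plus a slot for content-specific metadata."""
    title: str
    creator: str
    content_type: str            # e.g. "statistical data", "image"
    specific_standard: str = ""  # e.g. "DDI", "MIX", "SIARD"
    specific: dict = field(default_factory=dict)


def search_by_creator(records, creator):
    """Search across content types using only the generic fields."""
    return [r for r in records if r.creator == creator]


# Made-up records of three different content types.
records = [
    Record("Household survey", "Agency A", "statistical data", "DDI",
           {"variables": ["age", "income"]}),
    Record("Site photograph", "Agency A", "image", "MIX", {"width_px": 1024}),
    Record("Registry export", "Agency B", "database", "SIARD"),
]

hits = search_by_creator(records, "Agency A")
print([r.title for r in hits])
```

Because the search touches only the generic fields, the same tool works whether the content-specific part is DDI, MIX or something else.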


2. Use standards that are fit for your content so users can re-use

Enable future re-use and understanding by recording format or content-specific metadata in fit-for-purpose standards, e.g.

DDI for statistical data

SIARD for databases

MIX for images

3. Consider overlap between standards so you’re using the right standard for the right job

Basic identifying information
• DDI: Title, Creator, PublicationDate, ID
• Dublin Core: Title, Creator, Date, Identifier
• Useful to duplicate? Yes

Access information
• DDI: Access Conditions
• PREMIS: Rights entity
• Dublin Core: Rights
• Useful to duplicate? No – PREMIS is most expressive and generic location
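Where duplication is useful, as with basic identifying information, the overlapping fields can be copied from one standard to the other with a simple crosswalk. The Python sketch below assumes the DDI and Dublin Core fields pair up in the order listed in the comparison (Title with Title, PublicationDate with Date, ID with Identifier); the mapping, function name and sample values are illustrative assumptions, not an official crosswalk.

```python
# Assumed pairing of DDI citation fields to Dublin Core elements,
# taken from the order of the fields in the comparison.
DDI_TO_DUBLIN_CORE = {
    "Title": "Title",
    "Creator": "Creator",
    "PublicationDate": "Date",
    "ID": "Identifier",
}


def ddi_to_dublin_core(ddi_fields):
    """Copy only the overlapping fields; DDI-specific fields are left behind."""
    return {
        DDI_TO_DUBLIN_CORE[name]: value
        for name, value in ddi_fields.items()
        if name in DDI_TO_DUBLIN_CORE
    }


# Made-up DDI-style description of a study.
ddi_record = {
    "Title": "Example survey",
    "Creator": "Example agency",
    "PublicationDate": "2012-03",
    "ID": "example-001",
    "Variables": ["age", "income"],  # no Dublin Core equivalent, so not copied
}

dc_record = ddi_to_dublin_core(ddi_record)
print(dc_record)
```

Access information is deliberately left out of the mapping, following the recommendation to record it once, in PREMIS.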

4. Provide standards-based tools and capture at point of creation to improve quality and efficiency

At first, you may need to capture or collate all metadata about data yourself.

Think ahead about tools you might be able to provide to data experts to allow them to record the information directly in the standard if possible.

Takeaways

1. Organisations have many reasons to re-use data over time

2. There are unique challenges to preserving data

3. Where possible, save yourself some work and make your metadata more harvestable and data more understandable by using international standards like DDI and PREMIS

4. When you use metadata standards like DDI and PREMIS together:
• create generic structures
• use fit-for-purpose standards for specific content
• consider information overlap
• ‘delegate’ metadata capture where possible

Thanks!

Sally Vermaaten sally.vermaaten@stats.govt.nz

Parul Sharma parul.sharma@stats.govt.nz