Data mediators experience with metadata – A national data centre view Peter Burnhill (Director) & Tony Mathys EDINA National Data Centre University of Edinburgh With contributions from David Medyckyj-Scott http://edina.ac.uk/ Activating metadata: the role of metadata in effective spatial data exploitation, Cambridge, 6–7th July 2005 NIEeS Metadata Workshop


Page 1

Data mediators experience with metadata

– A national data centre view

Peter Burnhill (Director) & Tony Mathys

EDINA National Data Centre

University of Edinburgh

With contributions from David Medyckyj-Scott

http://edina.ac.uk/

Activating metadata: the role of metadata in effective spatial data exploitation,

Cambridge, 6–7th July 2005

NIEeS Metadata Workshop

Page 2

Overview

• EDINA national data centre

• Acting as a mediator

• Internal use of metadata

• Issues and challenges

• Dataset publishing

Page 3

EDINA

• A National Data Centre for Tertiary Education since 1995
– based at the University of Edinburgh Data Library

• Our mission... to enhance the productivity of research, learning and teaching in UK higher and further education
• Focus is service but also undertake R&D projects to services
• Major content provider within academia
• Strategic move toward interoperability & shared services role
• Substantial experience in handling and delivering key geospatial data and geo-referenced information

Page 4

Existing Geo-data Services

Page 5

Services

Page 6

Our interest in metadata has a long history…

• Beginning in the 1980s, more than 25 years’ experience with geospatial metadata initiatives, policies, projects and services, e.g.

– ESRC Computer files cataloguing group (1980s)
– Register of spatially referenced data for Scotland (1991)
– “Metadata in the Geosciences” (published 1991)
– Global Environmental Network for Information Exchange (GENIE) 1990s
– Rawa Taio – environmental metadata service (NZ, 1996)
– MetroGIS, Minneapolis/Saint Paul Metropolitan Organisation for promoting spatial data sharing (1998)
– State representative on ANZLIC metadata WG
– Geo-data browser – Go-Geo! portal (2000+)
– Advisors to AskGiraffe and now hosting GIGateway service
– UK GEMINI (Geo-spatial Metadata Interoperability Initiative)

Page 7

Simplified workflow

Discover

Locate

Access

Use

Publish

Fit for purpose?

Preserve

Page 8

Metadata provided by EDINA

Discover

Locate

Access

Use

Support information e.g.
• OS user guides
• Map sheet metadata e.g. survey date
• Legend files
• Format descriptions
• Explanations of key concepts

Metadata records for OS products, DBDs and agcensus (114 metadata records created by EDINA and published on Go-Geo! – another 100+ still to produce)

EuroGlobalMap metadata records supplied by National Mapping Agencies

Page 9

What we are supplied with

• No metadata at all
• or partial
– e.g. sheet/tile level and not ‘collection’ level
• or incomplete. It lacks
– a product specification
– lineage information (history, differences between ‘editions’, why changed)
– quality statement
– descriptions of processing
– information on file formats
– coding book (definitions of attributes)
and so on…
• Not machine readable
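
The gaps listed on this slide could be checked for automatically once a supplied record is machine readable. The following is a minimal sketch under stated assumptions: the element names in `EXPECTED_ELEMENTS` are illustrative labels for the items on the slide, not taken from ISO 19115 or any other standard, and the `supplied` record is a made-up example.

```python
# Elements the slide lists as commonly missing.
# The key names are hypothetical labels, not drawn from any metadata standard.
EXPECTED_ELEMENTS = [
    "product_specification",
    "lineage",
    "quality_statement",
    "processing_description",
    "file_formats",
    "code_book",
]


def missing_elements(record: dict) -> list:
    """Return the expected elements that are absent or blank in a supplied record."""
    return [
        name
        for name in EXPECTED_ELEMENTS
        if not str(record.get(name) or "").strip()
    ]


# Hypothetical supplied record: one element filled in, one present but blank.
supplied = {"lineage": "example lineage text", "file_formats": "  "}
print(missing_elements(supplied))
```

A report like this does not replace human interpretation of the record, but it makes the "partial or incomplete" cases visible at the point of supply.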

Page 10

Internal metadata activity

• Organisational memory is important to EDINA
– “stored information from an organisation’s history that can be brought to bear on present decisions”
– distributed across different retention facilities and often informal, i.e. it’s in people’s heads
– now trying to formalise it – what, when, where, how, who and why

• Activities
– creating discovery level records
– documenting processing steps occurring through the life cycle of a dataset
– data quality statements which describe the completeness, consistency and accuracy of the dataset
– created an ISO 19115 data quality extension
– how do we code processing steps?
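
One possible answer to "how do we code processing steps?" is to record each step against the six questions named above. This is only a sketch, assuming a simple in-house log: the `ProcessingStep` class, the `steps_to_json` helper and the sample entries are all hypothetical, and the structure is not the ISO 19115 lineage model or the EDINA extension.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json


@dataclass
class ProcessingStep:
    """One step in a dataset's life cycle; fields mirror the six questions."""
    what: str
    when: date
    where: str
    how: str
    who: str
    why: str


def steps_to_json(steps):
    """Serialise steps in date order so the history reads chronologically."""
    ordered = sorted(steps, key=lambda step: step.when)
    rows = [{**asdict(step), "when": step.when.isoformat()} for step in ordered]
    return json.dumps(rows, indent=2)


# Hypothetical entries, for illustration only.
history = [
    ProcessingStep("Converted tiles to delivery format", date(2005, 3, 2),
                   "staging server", "batch script", "data team", "user request"),
    ProcessingStep("Received supplier update", date(2005, 2, 14),
                   "ingest area", "media transfer", "data team", "new edition"),
]
print(steps_to_json(history))
```

Writing the log as structured records rather than free text keeps it machine readable, which is the property the supplied metadata was said to lack.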

Page 11

Issues and challenges

• Motivating people to document datasets is a key challenge
– seen as an onerous task and left undone
– we were saying this in the ’80s and the situation is no better now
• Difficult to fully automate – requires human interpretation
• If we don’t do it, risk of data loss or expensive re-acquisition
• Greater ROI from re-use
• It’s a people and organisational problem
– but also concerns about IPR, copyright and mechanisms for sharing

Page 12

Dataset publishing

• Reintroduce the concept of Dataset Publishing (Callahan, Johnson, and Shelley 1996)
– analogous to publishing papers
– rewards people for publishing datasets (e.g. promotion, RAE)
– involves establishment of procedures (e.g. standards to use, peer review) & resources to manage procedures
* Should minimise time and effort required
– a dataset description is the equivalent of the bibliographic record
– need tools to assist in creation, maintenance and dissemination of dataset descriptions
• EDINA involved in two related activities
– Go-Geo! Portal Phase 4b
– GRADE (Geospatial Repository for Academic Deposit and Extraction)
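
To illustrate the analogy between a dataset description and a bibliographic record, a description could carry enough fields to generate a citation. This is a sketch under stated assumptions: the `DatasetDescription` class, its field names and the citation layout are hypothetical, and are not the Go-Geo! or GRADE data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetDescription:
    """Hypothetical minimal description, by analogy with a bibliographic record."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    edition: str = ""

    def citation(self) -> str:
        """Format the description as a one-line citation (layout is illustrative)."""
        authors = "; ".join(self.creators)
        edition = f" ({self.edition})" if self.edition else ""
        return f"{authors} ({self.year}). {self.title}{edition} [dataset]. {self.publisher}."


# Placeholder values, for illustration only.
example = DatasetDescription(
    creators=["Example, A."],
    year=2005,
    title="Example boundary dataset",
    publisher="Example University",
)
print(example.citation())
```

A citable form is one way a published dataset could earn the kind of credit the slide mentions for published papers.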

Page 13

EDINA data publishing support projects

Go-Geo! Portal – phase 4b
• JISC funded, 18 month project
• Go-Geo! portal serves as a discovery tool, now extending to become a publication tool
• Promote and encourage geospatial metadata creation within UK tertiary education
• A pilot study with 4 universities to establish a business model for metadata creation and maintenance based on the use of Go-Geo! resources as local data management tools

GRADE
• JISC funded project, 18 months
• Looking at utility of geospatial data repositories for storing and sharing of geospatial data
• Comparing thematic v. institutional v. informal
• Compendium of use cases of intended data sharing
• Assess interoperability aspects of geospatial data repositories

• www.gogeo.ac.uk
• www.gogeo.ac.uk/Phase4b.html
• edina.ac.uk/projects/grade/

Page 14

Comments and observations

• We need to understand better the life cycles of data and metadata as they are disseminated across the academic community
– Authorship of data and metadata as data are merged, generalised, augmented, new data derived, new editions published
– Tracking and recording digital rights as this happens
• Are we documenting what users really want to know?
– Subject and content
• On the annotation of datasets and metadata
• Thesauri v. controlled terms v. ontologies
• Making metadata actionable

Page 15

Conclusions

• Metadata creation should happen close to data creation

• Metadata population and maintenance must be viewed as an on-going long term process

• Need to think more about what happens once metadata and data are published

• Can we really call ourselves spatial data management professionals?

Page 16

Contact details

Peter Burnhill
Director, EDINA National Data Centre

[email protected]

Tony Mathys
Go-Geo Project

[email protected]

Tel.: +44 (0)131 650 3302
Fax: +44 (0)131 650 3308

EDINA web site: http://edina.ac.uk
Go-Geo!: www.gogeo.ac.uk