A centre of expertise in digital information management www.ukoln.ac.uk

An Introduction to Dublin Core

Making Sense of Metadata, Society of Archivists EAD/Data Exchange SIG

London, Thursday 17 November 2005

Pete Johnston

Research Officer, UKOLN, University of Bath

www.bath.ac.uk

An Introduction to Dublin Core

• A brief history
• What is Dublin Core, really?
• The DCMI Abstract Model
• Encoding Dublin Core metadata
• DC Application Profiles
• DC in practice

A Brief History

A brief history (1)

• Mid 1990s: rapid growth of World Wide Web
• Challenge of resource discovery
  – search engines providing many hits, but little precision
  – recognition that library approach to cataloguing could not scale to Web resources
• 1995 OCLC/NCSA Workshop in Dublin, Ohio
  – interdisciplinary consensus on 13 "metadata elements"
  – for discovery of "document-like objects"
  – relatively simple, usable by non-cataloguers
• 1996 OCLC/CNI Workshop in Dublin, Ohio
  – expand to 15 elements
  – explicitly cross-domain
  – for discovery of broad range of "resources"

The Dublin Core Metadata Element Set

– Title
– Subject
– Description
– Creator
– Publisher
– Contributor
– Date
– Type
– Format
– Identifier
– Source
– Language
– Relation
– Coverage
– Rights

A brief history (2)

• 1997-2000 Development of notion of "qualification"
  – tension between simplicity and complexity
  – element refinement
    • Narrow the meaning of a DC element
    • e.g. "date modified" v "date"
  – encoding scheme
    • Provide additional information about a value
    • e.g. that a subject is a Library of Congress Subject Heading
  – the "Dumb-Down" principle
    • Rules for transforming "qualified" description into "simple" description
  – the "One-to-One" rule
    • A DC description describes exactly one resource
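
The "Dumb-Down" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a DCMI-specified algorithm: the REFINES table and the dumb_down function are hypothetical names, the table lists just two refinements, and a statement is simplified to a (property, value string, encoding scheme) tuple.

```python
# Hypothetical lookup: element refinement -> the DC element it refines.
# Only two entries are shown; a real table would cover every refinement in use.
REFINES = {
    "dcterms:modified": "dc:date",
    "dcterms:created": "dc:date",
}


def dumb_down(statements):
    """Reduce a "qualified" description to a "simple" one.

    Each statement is a (property, value_string, encoding_scheme) tuple.
    A refinement is replaced by the element it refines and the encoding
    scheme is discarded, leaving (property, value_string) pairs.
    """
    simple = []
    for prop, value, _scheme in statements:
        simple.append((REFINES.get(prop, prop), value))
    return simple


# Illustrative values only.
qualified = [
    ("dc:title", "A guide to DC metadata", None),
    ("dcterms:modified", "2005-11-17", "dcterms:W3CDTF"),
    ("dc:subject", "Metadata", "dcterms:LCSH"),
]
simple = dumb_down(qualified)
```

After the call, "date modified" has become plain "date" and the LCSH marker on the subject is gone: the description is less precise but still usable by an application that understands only the 15 elements.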

A brief history (3)

• 1997-2000 What is a "resource"?
  – e.g. Can the DCMES be applied to people?
  – DCMI Type Vocabulary
    • Collection, Dataset, Event, Image (Still or Moving), Interactive Resource, Service, Software, Sound, Text, Physical Object
  – But still fairly non-prescriptive
• 1998- Emergence of Resource Description Framework (RDF)
• 2000-2001 "Grammatical Principles" as informal data model

A brief history (4)

• 2000-2005 Development of notion of DC "Application Profile"
  – tailoring metadata standards for context
  – providing local guidelines, constraints
  – combining components from different sources
• 2003-2005 Formalisation of DCMI Abstract Model
  – concepts used in DC metadata
  – different types of terms used in DC metadata
  – how those terms are used in combination to construct descriptions

What is Dublin Core, really?

Dublin Core is...

1. a conceptual framework/set of rules...
  – DCMI Abstract Model
  – describes how to use certain types of terms
  – ... to make statements...
  – ... that form descriptions (of resources)
2. a "core" vocabulary/set of terms...
  – managed by DCMI (Usage Board)
  – growing (relatively) slowly as new requirements arise
  – each identified by a Uniform Resource Identifier (URI)
3. a set of specifications for representing or encoding DC metadata descriptions in various formats

DCMI Abstract Model (a slightly simplified view)

DCMI Abstract Model

• A description
  – describes exactly one resource
  – may specify a resource URI
  – consists of a set of statements

[Diagram: DCMI Abstract Model: Descriptions. Labels: Description, Resource URI, Statement]

DCMI Abstract Model

• A statement must contain
  – a reference to a property
    • property URI
    • all DC "elements" are properties
    • properties may be defined by agencies other than DCMI
  – a reference to a second resource (value)
    • value URI, and/or
    • one or more value representations
      – value string
      – rich representation

[Diagram: DCMI Abstract Model: Statements. A Description with a Resource URI and three Statements: Property URI + Value URI; Property URI + Value string; Property URI + Rich representation]

DCMI Abstract Model

• A statement may contain
  – a reference to a vocabulary encoding scheme
    • vocabulary encoding scheme URI
    • type of value
  – a reference to a syntax encoding scheme
    • syntax encoding scheme URI
    • how value string is interpreted

[Diagram: DCMI Abstract Model: Statements. A Description with a Resource URI and three Statements: Property URI + Rich representation; Property URI + Value URI + Vocab Enc Scheme URI; Property URI + Value string + Syntax Enc Scheme URI]

DCMI Abstract Model

• A description describes one resource
• Applications typically based on description sets
  – groups of descriptions
  – where the described resources may be related in some way
• Description sets encoded or serialised as records
  – according to rules of binding

[Diagram: Description Set. Two Descriptions, each with a Resource URI and Statements of the forms Property URI + Rich representation; Property URI + Value URI + Vocab Enc Scheme URI; Property URI + Value string + Syntax Enc Scheme URI]
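
The simplified model on the preceding slides can be written down as a small set of Python data classes. This is a sketch for orientation only: the class and field names are assumptions made here, rich representations are left out, and only one value string per statement is modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """One property/value pair within a description."""
    property_uri: str                                  # always required
    value_uri: Optional[str] = None                    # identifies the value resource
    value_string: Optional[str] = None                 # a value representation
    vocab_encoding_scheme_uri: Optional[str] = None    # type of the value
    syntax_encoding_scheme_uri: Optional[str] = None   # how value_string is interpreted

    def __post_init__(self):
        # A statement must refer to a value: by URI, by representation, or both.
        if self.value_uri is None and self.value_string is None:
            raise ValueError("statement needs a value URI or a value string")


@dataclass
class Description:
    """Describes exactly one resource; the resource URI may be omitted."""
    resource_uri: Optional[str] = None
    statements: List[Statement] = field(default_factory=list)


@dataclass
class DescriptionSet:
    """A group of descriptions, typically serialised together as one record."""
    descriptions: List[Description] = field(default_factory=list)


doc = Description(
    resource_uri="http://example.org/doc/1234/",
    statements=[
        Statement("http://purl.org/dc/elements/1.1/title",
                  value_string="A Guide to DC Metadata"),
        Statement("http://purl.org/dc/elements/1.1/language",
                  value_string="eng",
                  syntax_encoding_scheme_uri="http://purl.org/dc/terms/ISO639-2"),
    ],
)
record = DescriptionSet([doc])
```

Keeping the encoding scheme references optional mirrors the "must contain" / "may contain" split in the model.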

Encoding Dublin Core metadata (a very brief introduction!)

DCMI Abstract Model and Bindings

• For transfer between applications, descriptions must be represented as digital objects
• Binding maps between constructs in conceptual model and components in a digital format
• Two way
  – encoding application: description set -> record
  – decoding application: record -> description set
• DCMI currently provides three "encoding guidelines" specifications
  – Other agencies may also provide bindings

Using X/HTML meta & link elements

<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DC.title" content="A guide to DC metadata" />
<meta name="DCTERMS.audience" content="information managers" />
<meta name="DC.language" scheme="DCTERMS.ISO639-2" content="eng" />
<link rel="DCTERMS.references" href="http://dublincore.org/documents/dcq-html" />

• The set of meta/link elements represents a single DC description.
• The resource described is the X/HTML document in which the metadata is embedded.
• Each meta/link element represents a single statement.
• Property and Encoding Scheme URIs encoded as prefixed names

[Diagram key: Property URI, Value string, Encoding Scheme URI, Value URI]
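
A decoding application for this binding might look roughly like the following Python sketch, which uses the standard library's html.parser. The class name DCMetaParser and its expand helper are hypothetical, and the sketch assumes every prefix used has a matching schema link in the same document.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect DC statements from X/HTML meta and link elements."""

    def __init__(self):
        super().__init__()
        self.schemas = {}      # prefix -> namespace URI, from rel="schema.PREFIX"
        self.statements = []   # (prefixed property name, value, encoding scheme)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = a.get("rel") or ""
        name = a.get("name") or ""
        if tag == "link" and rel.startswith("schema."):
            # Prefix declaration, e.g. schema.DC
            self.schemas[rel[len("schema."):]] = a.get("href")
        elif tag == "link" and "." in rel:
            # Statement whose value is given as a URI
            self.statements.append((rel, a.get("href"), None))
        elif tag == "meta" and "." in name:
            # Statement whose value is given as a string
            self.statements.append((name, a.get("content"), a.get("scheme")))

    def expand(self, prefixed_name):
        """Turn a prefixed name such as DC.title into a full URI."""
        prefix, local = prefixed_name.split(".", 1)
        return self.schemas[prefix] + local


html = """
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DC.title" content="A guide to DC metadata" />
<meta name="DC.language" scheme="DCTERMS.ISO639-2" content="eng" />
<link rel="DCTERMS.references" href="http://dublincore.org/documents/dcq-html" />
"""
parser = DCMetaParser()
parser.feed(html)
parser.close()
expanded = [(parser.expand(name), value) for name, value, _scheme in parser.statements]
```

Prefixes are expanded only after parsing has finished, so the schema links do not have to appear before the elements that use them.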

Using the DC-XML format

<?xml version="1.0"?>
<meta xmlns="http://www.ukoln.ac.uk/metadata/dcdot/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:identifier>http://example.org/doc/1234/</dc:identifier>
  <dc:title>A Guide to DC Metadata</dc:title>
  <dc:language xsi:type="dcterms:ISO639-2">eng</dc:language>
  <dcterms:references>http://dublincore.org/documents/dcq-html</dcterms:references>
</meta>

• Supports only limited subset of Abstract Model (revision forthcoming)
• The container element, here <meta>, represents a single DC description.
• Each child element represents a single statement.
• Property URIs and Encoding Scheme URIs encoded as XML QNames

[Diagram key: Property URI, Value string, Encoding Scheme URI]
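
Reading such a record back into statements can be sketched with Python's xml.etree.ElementTree. This is an illustration, not a reference decoder: the sample declares the dcterms prefix (an XML parser requires every element prefix to be declared), and the xsi:type value is kept as the literal QName string rather than resolved to a URI.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attribute names in "{namespace}local" form.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

record = """<?xml version="1.0"?>
<meta xmlns="http://www.ukoln.ac.uk/metadata/dcdot/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>A Guide to DC Metadata</dc:title>
  <dc:language xsi:type="dcterms:ISO639-2">eng</dc:language>
  <dcterms:references>http://dublincore.org/documents/dcq-html</dcterms:references>
</meta>"""

root = ET.fromstring(record)   # the container element: one description
statements = []
for child in root:             # each child element: one statement
    # Tags arrive as "{namespace}local"; joining the parts gives the property URI.
    namespace, local = child.tag[1:].split("}")
    statements.append((namespace + local, child.text, child.attrib.get(XSI_TYPE)))
```

Each tuple holds the property URI, the value string and, where present, the encoding scheme name.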

Using the Resource Description Framework (RDF)

• Specifications for DC in RDF do exist…
• … but work is currently in progress to
  – resolve ambiguities
  – revise in light of DCAM

Dublin Core Application Profiles

DC Application Profile

• Implementers adapt metadata standards to the context of their application
  – Tension between localisation and interoperability
• A DC Application Profile
  – specifies the terms (properties, vocabulary/syntax encoding schemes) used in a class of description sets
  – describes how those terms are used
    • supplementary information on how properties applied/interpreted in context
    • constraints on occurrence of properties
    • constraints on values and value representations (encoding schemes)

DC Application Profiles: Examples

• "Simple Dublin Core"
  – use of the 15 properties of the DCMES
  – all optional and repeatable
  – values represented by value strings
  – no vocabulary or syntax encoding schemes
• UK eGMS
  – use of selected properties from DCMI vocabularies, additional properties
  – guidelines on use of properties
  – some properties mandated/recommended
  – some vocabulary encoding schemes mandated/recommended
  – guidance on content of value strings
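
Occurrence constraints of the kind a profile imposes can be checked mechanically. The Python sketch below is hypothetical throughout: the PROFILE table is invented for illustration and does not reproduce the rules of eGMS or any other real profile, and a description is reduced to (property, value string) pairs.

```python
from collections import Counter

# Hypothetical profile: property -> (minimum occurrences, maximum or None for unlimited)
PROFILE = {
    "dc:title": (1, 1),        # mandatory, not repeatable
    "dc:creator": (0, None),   # optional, repeatable
    "dc:subject": (1, None),   # mandatory, repeatable
}


def check(description, profile):
    """Return a list of problems; an empty list means the description conforms."""
    counts = Counter(prop for prop, _value in description)
    problems = []
    for prop in counts:
        if prop not in profile:
            problems.append(f"{prop}: not part of this profile")
    for prop, (low, high) in profile.items():
        n = counts[prop]       # Counter returns 0 for properties never seen
        if n < low:
            problems.append(f"{prop}: expected at least {low}, found {n}")
        if high is not None and n > high:
            problems.append(f"{prop}: expected at most {high}, found {n}")
    return problems


good = [("dc:title", "A guide to DC metadata"), ("dc:subject", "Metadata")]
bad = [("dc:title", "One"), ("dc:title", "Two"), ("dc:format", "text/html")]
good_problems = check(good, PROFILE)
bad_problems = check(bad, PROFILE)
```

In these terms "Simple DC" would be a table of the 15 DCMES properties, each with the range (0, None).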

DC Application Profiles: Examples

• JISC Information Environment Service Registry (IESR) Metadata Schema
  – supports description of several related resources (Collection, Service, Agents)
  – use of selected properties from DCMI vocabularies, selected properties from RSLP CD vocabularies, some properties created for IESR
  – for each subject resource type, guidelines on use of properties
  – some properties mandated/recommended
  – many vocabulary encoding schemes mandated/recommended

[Diagram: DC Application Profile A: "Simple DC" and DC Application Profile B: IESR, shown alongside DCMI Properties, DCMI Vocab Encoding Schemes, IESR Properties and IESR Vocab Encoding Schemes]

DC in Practice

Dublin Core in X/HTML

• Initial implementation focused on DC-in-HTML
  – Robot crawls individual HTML pages to extract metadata
• But today little/no use by large Web search engines
  – Problems of spamming/trust
  – Lack of take-up by authors/publishers
  – Success of full-text crawling/indexing, esp. Google!
• However, some use in controlled domains
  – Intranets
  – Trusted groups of providers (e.g. eGMS)
• Embedding DC in XHTML useful if you know a search engine exploits it

[Diagram: a Harvester retrieving pages from Web Sites via HTTP GET]

Picture Australia
– images "related to all things Australian" from 40+ cultural agencies
– central search service based (initially at least) on crawling HTML-embedded DC metadata
– providers migrating to OAI-PMH
– currently hybrid approach?

http://www.pictureaustralia.org/

Dublin Core and OAI-PMH

• Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
  – Fairly simple mechanism for sharing metadata records between applications
  – Has origins in "e-prints" community
  – Built on HTTP, XML
  – Allows a harvester to ask a repository for all or some of its metadata records (in a specified metadata format)
    • i.e. supports "incremental harvesting"
    • "Give me all your records updated since yyyy-mm-dd"
• "OAI-DC" (Simple DC) is mandatory format
  – But no limitation on format that can be transferred (as long as it can be described by XML Schema)
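
The "records updated since" request maps onto a plain HTTP GET URL. The Python sketch below only builds that URL: verb, metadataPrefix and from are OAI-PMH request parameters and oai_dc is the prefix for the mandatory format, while the function name and the repository address are made up for illustration. Fetching the response and following resumption tokens are left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, since=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL.

    since is a date string (yyyy-mm-dd). When given, the repository is
    asked only for records created or changed on or after that date.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since is not None:
        params["from"] = since
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL.
url = list_records_url("http://repository.example.org/oai", since="2005-11-01")
```

Leaving out since asks for every record the repository exposes in that format, which is the usual first harvest; later runs pass the date of the previous harvest.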

[Diagram: a Harvester collecting metadata records from Repositories via OAI-PMH]

OAIster (University of Michigan)
– "academically-oriented digital resources"
– "5,947,627 records from 557 institutions" (2005-11-15)

http://oaister.umdl.umich.edu/

Summary

• DCMES/"Simple DC" as a "core" for discovery of wide range of resources
• "Simple DC" is, by definition, simple!
  – Limitations in terms of functions/services that can be offered
• DCMI Abstract Model provides a framework for extensibility and modularity
• A DC Application Profile describes a real-world usage of that model
