Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and...
Dublin Core Metadata Tutorial
July 9, 2007
Stuart Weibel
Senior Research Scientist
OCLC Programs and Research
Tutorial Roadmap
Principles of Metadata
Dublin Core Metadata Basics
The Dublin Core Abstract Model
Syntax Alternatives for DC Metadata
Mixing and Matching Metadata
History and workings of the Dublin Core Metadata Initiative
Acknowledgements: I have borrowed liberally from tutorial slide sets by Tom Baker, Diane Hillmann, Andy Powell, and Marty Kurth, available at dublincore.org
Basic Principles of Metadata
The Web as an information system
The Internet Commons
Interoperability is key
MARC lives
The varieties of metadata
Modularity
Some Challenges
State of the Web as an Information System
Search systems are motivated by business models, not functionality
Index coverage is broad, but unpredictable
Too much recall, too little precision
Index spam abounds
Resources (and their names) are volatile
What about versions, editions, back issues?
Archiving is presently unsolved
Authority and quality of service are spotty
Managing Intellectual Property Rights is difficult
Metadata: Part of a Solution
Structured data about other data
• helps to impose order on chaos
• enables automated discovery/manipulation
Full Text Web indexing is the dominant idiom for search
Metadata is more useful in structured collections, used in combination with applications designed to take advantage of structured descriptions
Internet Commons includes Multiple Communities
[Diagram: the Internet Commons at the center, shared by many communities: Scientific Data, Home Pages, Geo, Library, Museums, Commerce, Whatever...]
Interoperability requires conventions about:
Semantics
• The meaning of the elements
Structure
• human-readable
• machine-parseable
Syntax
• grammars to convey semantics and structure
Haven’t we done metadata already?
The MARC family of standards is the single most successful resource description standard in the world
MARC Cataloging…
Is really MARC-AACR2 cataloging
• MARC is the communications format
• AACR2 (Anglo-American Cataloging Rules) defines the cataloging rules (semantics)
MARC and AACR2 are evolving
• Closer alignment with XML as a syntax option
• RDA is an effort to modernize AACR2 and align it with networked environments
RDA and Dublin Core are cooperating on alignment of a common underlying data model.
What’s wrong with this model on the Web?
Expensive
• Complex
• Professional catalogers required
Bias towards bibliographic artifacts
• Fixed resources
• Incomplete handling of resource evolution and other resource relationships
Anglo-centric
• MARC 21 accounts for ¾ of MARC records, but there are many other varieties
Metadata Takes Many Forms
resource discovery
document administration
rights management
content rating
security and authentication
archival status
products and services
database schemas
process control or description
Warwick Framework: Modular Metadata
Conceptual Architecture for metadata from the Warwick Metadata Workshop (DC-2)
Conceptual architecture to support the specification, collection, encoding, and exchange of modular metadata
Provides context for metadata efforts (including Dublin Core)
• avoids the “black-hole” of comprehensive element sets
• focuses interoperability issues at package level
A conceptual framework, NOT an application
Modularity and Extensibility: the Lego metaphor
DC is a beginning, not an end
An architecture for modular, extensible metadata
The simplest common denominator
• Add stuff you need for
  • Local requirements
  • Domain specific functionality
  • Other dimensions of description
• E.g., cloud cover… management… structural metadata…
Descriptive Metadata Standards
IEEE LOM (Learning Object Metadata)
• Descriptive and structural metadata to support instructional systems
ONIX (Online Information Exchange) – bookseller metadata
FGDC – Federal Geographic Data Committee: rich descriptive and structural metadata for GIS applications
Encoded Archival Description – description of archival collections
MPEG Multimedia Metadata – large, complicated, still in progress – descriptive, structural, rights management
Dublin Core – core descriptive metadata
Metadata Creation
Metadata is expensive and error prone
• Creating one MARC record at the Library of Congress costs about $100 USD
• Competes with indexing at… $ 00.001 ???
Capture it as close to point of creation as possible
Capture as much automatically as possible
Should be designed with close attention to the functional requirements it serves
Re-use existing standards whenever possible
Always tension between completeness of description, intended purpose, and cost
Metadata Challenges
Accommodate multiple varieties of metadata
Tension: functionality and simplicity
Tension: extensibility and interoperability
Human and machine creation and use
Community-specific functionality, creation, administration, access work at cross purposes to global interoperability
Interoperability barriers cost time and money
A common data model helps avoid this
Dublin Core Basics
Design Philosophy – useful metaphors
Language and pidgins
Characteristics of DC metadata
The simple bucket (properties)
Resource Types
Metadata grammar
Dublin Core Principles
One-to-one
Dumb-down rule
Context appropriate values
Translations
Dublin Core: Starting Assumptions and Essential Features
Simple
• True to a point: the elements are simple, the underlying model is not
Consensus-based
• Crucial to early success, both in attracting expertise and deployment
Bottom up
• Based on the experience of practitioners, but hard to capture and capitalize on lessons learned
Cross-disciplinary and International
• Central success factor
Essential Features (continued)
The Web is the strategic application
• On the mark
International
• Also central success factor, but hard (20 languages in the Registry)
Lego-like modularity & extensibility
• Partially realized promise
• Application Profiles are the means
Syntax independence
• An ongoing nightmare (HTML…XML…RDF/XML)
Authors will describe their own works
• Laughably naïve
A Pidgin for Digital Tourists
Metadata is language
Dublin Core is a small and simple language -- a pidgin -- for finding resources across domains
Speakers of different languages naturally "pidginize" to communicate
• E.g., tourists using simple phrases to order beer ("zwei Bier bitte" "dva pivo" "biru o san bai"...)
We are all "tourists" on the Internet.
A Grammar of Dublin Core
By design not as rich as mother tongues, but easy to learn and useful in practice
Pidgins: small vocabularies (Dublin Core: fifteen special nouns and lots of optional adjectives)
Simple grammars: sentences (statements) follow a simple fixed pattern...
http://www.dlib.org/dlib/october00/baker/10baker.html
Basic Structures in Dublin Core Metadata
The basic unit of metadata is a statement:
• Statements consist of a property (a metadata element) and a value
• Metadata statements describe resources
• More about the Dublin Core Abstract Model later
[Diagram: a statement about a resource pairs a property with a value]
What are the properties and values in the following metadata statements?
245 00 $a Amores perros $h [videorecording]
<title> Nueve reinas </title>
<type> MovingImage </type>
• Different models for conveying related information
• Dublin Core syntax fits in more naturally with the structure of the Web
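Resource has property X
[Diagram: the sentence pattern, annotated. "Resource" is the implied subject; "has" is the implied verb; "property" is one of 15 properties (DC:Creator, DC:Title, DC:Subject, DC:Date, ...); "X" is the property value (an appropriate literal). Two optional qualifiers (adjectives) may be attached.]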
The fifteen elements (properties)
Creator Title Subject
Contributor Date Description
Publisher Type Format
Coverage Rights Relation
Source Language Identifier
Varieties of Qualifiers: Element Refinements
Make the meaning of an element narrower or more specific.
• a Date Created versus a Date Modified
• an IsReplacedBy Relation versus a Replaces Relation
If your software does not understand the qualifier, you can safely ignore it.
Varieties of Qualifiers: Value Encoding Schemes
Says that the value is
• a term from a controlled vocabulary (e.g., Library of Congress Subject Headings)
• a string formatted in a standard way (e.g., "2001-05-02" means May 2, not February 5)
Even if a scheme is not known by software, the value should be "appropriate" and usable for resource discovery.
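A minimal sketch of what a syntax encoding scheme buys you, assuming the value string follows the ISO 8601 year-month-day form. It uses only Python's standard `datetime` module; the variable names are illustrative, not part of any Dublin Core specification.

```python
from datetime import date

# Once the scheme is known to be ISO 8601, the field order is fixed
# as year-month-day, so the string cannot be misread as day-month.
value_string = "2001-05-02"
parsed = date.fromisoformat(value_string)

print(parsed.month, parsed.day)  # 5 2, i.e. May 2
```

Software that does not know the scheme can still treat the value as a plain string, which is why the value should remain usable on its own.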
Resource has Date "2000-06-13" [qualifiers: Revised, ISO8601]
Resource has Subject "Languages -- Grammar" [qualifier: LCSH]
Dumb-Down Principle for Qualifiers
Simple DC does not use element refinements or encoding schemes – statements contain only value strings
Qualified DC uses features of the DCMI Abstract Model, including element refinements and encoding schemes
Dumbing-down is translating Qualified DC to simple DC
Qualifiers refine meaning (but may be harder to understand)
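The dumb-down translation can be sketched as follows. This is an illustration under stated assumptions, not a DCMI-defined algorithm: the `QualifiedStatement` class and its field names are hypothetical, and a real record would carry more structure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class QualifiedStatement:
    element: str                      # one of the 15 base elements, e.g. "date"
    value: str                        # the value string
    refinement: Optional[str] = None  # element refinement, e.g. "revised"
    scheme: Optional[str] = None      # encoding scheme, e.g. "ISO8601"


def dumb_down(stmt: QualifiedStatement) -> Tuple[str, str]:
    """Drop both qualifiers, keeping only the base element and value string."""
    return (stmt.element, stmt.value)


qualified = QualifiedStatement("date", "2000-06-13",
                               refinement="revised", scheme="ISO8601")
simple = dumb_down(qualified)
print(simple)  # ('date', '2000-06-13')
```

The result is less precise (it no longer says the date is a revision date) but is still a true statement about the resource, which is the point of the principle.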
The One to One Principle
Each resource should have one metadata description
• For example, do not describe a digital image of the Mona Lisa as if it were the original painting
Group related descriptions into description sets
• Describe an artist and his or her work separately, not in a single description
Appropriate Values
There are generally tradeoffs between local requirements and global requirements
Use elements and qualifiers to meet the needs of your local context, but…
Keep in mind that machines and people use and interpret metadata, so…
Consider whether the values used will help discovery outside your local context
Dublin Core as a multilingual metadata language
Dublin Core has been translated into 20+ languages
• machine-readable tokens are shared by all
• human-readable labels are defined in different languages
• translations are distributed, maintained in many countries
• eventually linked in DCMI registry
One token – labels in many languages
[Diagram: the single token dc:creator carries the label “Creator” on the DCMI server, the label “Verfasser” on a server in Germany, and the label “Pencipta” on a server in Jakarta]
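The token-versus-label split can be illustrated with a toy lookup table. This is only a sketch: the `LABELS` table and `label_for` function are hypothetical, the language codes are assumed to be ISO 639-1, and a real registry would be distributed across servers rather than held in one dictionary.

```python
# One machine-readable token, several human-readable labels.
LABELS = {
    "dc:creator": {"en": "Creator", "de": "Verfasser", "id": "Pencipta"},
}


def label_for(token: str, lang: str) -> str:
    """Return the label for a token in the given language,
    falling back to the token itself when no label is known."""
    return LABELS.get(token, {}).get(lang, token)


print(label_for("dc:creator", "de"))  # Verfasser
print(label_for("dc:creator", "fi"))  # dc:creator (no Finnish label in this toy table)
```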
Metadata languages are "multilingual"
Metadata is not a spoken language
The words of metadata -- "elements" -- are symbols that stand for concepts expressible in multiple natural languages
Standards may have dozens of translations
Are concepts like "title", "author", or "subject" used the same way in English, Finnish, and Korean?
DCMI Open Metadata Registry
Managing vocabularies defined by the DCMI
• Languages
• Versioning
• Controlled vocabularies
Foundation for modular, incremental integration and evolution
The Registry working group is a Dublin Core Community with participants around the world
The Dublin Core Abstract Model
Terminology
• Simple versus Qualified DC
• Resources
• Descriptions
• Description sets
• Value Strings
• Element refinements
• Encoding Schemes
Graphical representation of the Abstract Model
Summary of general ideas
Important DCMI Documents concerning the Abstract Model and Syntax Alternatives
DCMI Abstract Model
http://dublincore.org/documents/abstract-model/
Expressing Dublin Core in HTML/XHTML meta and link elements
http://dublincore.org/documents/dcq-html/
Expressing Dublin Core metadata using the Resource Description Framework (RDF)
http://dublincore.org/documents/dc-rdf/
Expressing Dublin Core metadata using XML
http://dublincore.org/documents/dc-xml/
Simple versus Qualified DC
Simple DC supports single descriptions using the 15 base elements and value strings
Qualified DC supports the richer features of the Abstract Model, and allows the use of all DCMI terms as well as other, non-DCMI terms.
An application profile is used to specify a metadata application that includes DCMI terms in combination with non-DCMI terms (mix & match metadata).
The DCMI Abstract Model
A data model for Dublin Core
Agreed upon underlying structure for metadata statements
Many years in the making -- long term contention
Describes the structure of statements about resources that we make in our metadata language:
[Diagram: a statement about a resource pairs a property with a value]
What is a resource?
W3C definition:
• “anything that has identity… electronic document, an image, a service”
• “not all resources are network retrievable; e.g. human beings, corporations, and bound books can also be considered resources”
In other words, a resource is anything we can identify:
• Physical things (books, people, airplanes…)
• Digital things (images, web pages, services…)
• Concepts (colors, subjects, eras, places)
In the DC context, the DCMI Type list describes the stuff we describe with DC metadata
Resource types for which DC is often used
DCMI Type Vocabulary:
Collection
Dataset
Event
Image
Interactive Resource
Moving Image
Physical Object
Service
Software
Sound
Still Image
Text
Abstract Model: Descriptions
A description is composed of:
• One or more statements about a single resource
• Optionally, the URI of the resource being described
Each statement is made up of
• A property URI (that identifies a property)
• A value URI (that identifies a value) and/or one or more representations of the value (a value string)
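The composition rules above can be sketched as two small data classes. This is an illustrative model only, assuming the rules exactly as stated on the slide; the class and field names are hypothetical and the resource URI in the example is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    property_uri: str                 # identifies the property
    value_uri: Optional[str] = None   # identifies the value, if it has a URI
    value_strings: List[str] = field(default_factory=list)

    def __post_init__(self):
        # A statement needs a value URI and/or at least one value string.
        if self.value_uri is None and not self.value_strings:
            raise ValueError("statement needs a value URI or a value string")


@dataclass
class Description:
    statements: List[Statement]
    resource_uri: Optional[str] = None  # optional URI of the described resource

    def __post_init__(self):
        # A description holds one or more statements about a single resource.
        if not self.statements:
            raise ValueError("description needs at least one statement")


desc = Description(
    resource_uri="http://example.org/tutorial",  # hypothetical resource
    statements=[
        Statement("http://purl.org/dc/elements/1.1/title",
                  value_strings=["DC Metadata Tutorial"]),
    ],
)
print(len(desc.statements))  # 1
```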
Terminology: Value Strings
A value string is a human-readable string that represents the value of the property
Each value string may have an associated value string language that is an ISO language tag (e.g., pt-BR)
Terminology: Element Refinements
Elements are the same as properties
Element refinements are the same as sub-properties
An element refinement is a special case of an element that shares the meaning of its ‘parent’, but has narrower semantics
• Paulo is illustrator of a book, therefore he is also a contributor to the book
• Illustrator is an element refinement of contributor
Terminology: Encoding Schemes
Values and value strings can be ‘qualified’ by encoding schemes in order to clarify their meaning
• A vocabulary encoding scheme is used to indicate a terminology set from which a value is taken:
  • Stem cells—Research is a value from LCSH
  • 616.02774 is a value from DDC-22
• A syntax encoding scheme is used to indicate the structure of a value string:
  • 2004-10-12 is structured according to the W3CDTF rules for date encoding
Terminology: Description Sets
The 1:1 principle dictates that each description describes one, and only one, resource
We often need to describe grouped sets of descriptions, which are known in the abstract model as description sets• An article and its authors• A painting and its artist
When description sets are exchanged between software applications, they are generally encoded according to a particular syntax in a metadata record
Abstract Model summary (after Andy Powell)
[Diagram: a record (encoded as HTML, XML, or RDF/XML) contains a description set; the description set contains descriptions, each with an optional resource URI; each description contains statements; each statement has a property (URI) and a value URI and/or value string. A value string may carry a language (e.g., pt-BR) and a syntax encoding scheme; a value may carry a vocabulary encoding scheme.]
General Ideas
DC is not just the 15 elements, though they comprise the foundation for simple DC
50+ properties (elements) have been approved by DCMI
The model supports local declarations of additional properties
The model supports application profiles (mixing DC elements with those of other sets)
The model allows the grouping of descriptions to create more complex description entities
Syntax Alternatives
Choosing among alternatives
HTML
XML
RDF/XML
Syntax Alternatives: HTML… XML… RDF/XML
Three Web-based models for deploying metadata
Each has advantages and disadvantages
What is ‘best’ depends on local constraints
• What is the objective of the system? How do these syntax alternatives support local functional requirements?
• Are there services and software to ‘consume’ the metadata created?
• Are trained practitioners available to create and support the systems?
Syntax Alternatives: HTML
Advantages:
• Simple – META tags embedded in content
• Widely deployed tools and knowledge
• Resource carries its metadata around with it
• Metadata is openly harvestable
Syntax Alternatives: HTML (continued)
Disadvantages:
• Limited structural richness (does not support hierarchical, tree-structured data)
• Management of metadata is less reliable (the metadata is out in the wild)
Describe one thing (the HTML document) and no more!
Dublin Core in HTML (example)
<head>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">
<meta name="DC.title" content="DC Metadata Tutorial">
<meta name="DC.creator" content="Stuart L. Weibel">
<meta name="DC.subject" xml:lang="en-US" content="Metadata">
<meta name="DC.date" scheme="DCTERMS.W3CDTF" content="2007-07-08">
<meta name="DCTERMS.audience" content="technical librarians">
</head>
<body>
… [ rest of html document ]
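One way a harvester might pull such embedded statements out of a page is sketched below, using Python's standard `html.parser`. The `DCMetaParser` class is hypothetical, and the sample document is a shortened stand-in for the slide's example with one unrelated META tag added to show the filtering.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (name, content) pairs from META tags whose name
    starts with the DC. or DCTERMS. prefix."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.startswith(("DC.", "DCTERMS.")):
            self.statements.append((name, attr.get("content")))


html_doc = """<head>
<meta name="DC.title" content="DC Metadata Tutorial">
<meta name="DC.creator" content="Stuart L. Weibel">
<meta name="DC.date" scheme="DCTERMS.W3CDTF" content="2007-07-08">
<meta name="viewport" content="width=device-width">
</head>"""

parser = DCMetaParser()
parser.feed(html_doc)
for name, content in parser.statements:
    print(name, "=", content)
```

The sketch ignores the `scheme` and language attributes; a fuller harvester would keep them so that qualified statements survive extraction.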
The namespaces for HTML encoding
All DCMI terms (elements, element refinements, and encoding schemes) are found in:
DCMI Metadata Terms
http://dublincore.org/documents/dcmi-terms/
The namespaces are a result of historical developments
• DC: [original elements]
• DCTERMS: [later elements]
Syntax Alternatives: XML
XML = eXtensible Markup Language
The standard for networked text and data
Widespread tool support
• Parsers are widely available
• Extensibility (XML namespaces)
• Type definitions (XML Schema)
• Transformation and Rendering (XSLT)
• Rich linking semantics (XLINK)
XML Schema
Rich XML-based language for expressing data-type semantics
Replaces arcane and limited DTD (origin in SGML)
Facilities:
• Data typing (both complex and primitive)
• Constraints (ranges, cardinality…)
• Defaults (specify defaults for certain properties)
Dublin Core fragment in XML
<metadata xmlns:dc="http://www.openarchives.org/OAI/dc.xsd">
  <dc:creator>Carl Lagoze</dc:creator>
  <dc:title>Accommodating Simplicity and Complexity in Metadata</dc:title>
  <dc:date>2000-07-01</dc:date>
  <dc:publisher>Cornell University, Computer Science</dc:publisher>
</metadata>
Where is the rest of the stuff? In the schema!
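A consumer of this fragment might read it as sketched below, using Python's standard `xml.etree.ElementTree`. The namespace URI is copied from the slide as-is; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_doc = """<metadata xmlns:dc="http://www.openarchives.org/OAI/dc.xsd">
  <dc:creator>Carl Lagoze</dc:creator>
  <dc:title>Accommodating Simplicity and Complexity in Metadata</dc:title>
  <dc:date>2000-07-01</dc:date>
  <dc:publisher>Cornell University, Computer Science</dc:publisher>
</metadata>"""

root = ET.fromstring(xml_doc)

pairs = []
for child in root:
    # ElementTree expands prefixed names to "{namespace-uri}local-name",
    # so strip the namespace part to recover the element name.
    local_name = child.tag.split("}", 1)[1]
    pairs.append((local_name, child.text))

for name, value in pairs:
    print(name, "=", value)
```

Each pair is one property/value statement; the parser gives back the structure, while constraints such as data types would come from the schema.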
Case Study: OAI-PMH
OAI Protocol for Metadata Harvesting
Open Archives Initiative
http://www.openarchives.org
Simple protocol for sharing metadata records
Based on HTTP, XML, XML Schema, and XML namespaces
Allows a harvester to query a remote repository for some or all of its metadata records
Unqualified DC (oai_dc) is the one metadata format that every OAI-PMH repository must support
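A harvesting request is an ordinary HTTP GET with a verb parameter, and the reply is XML. The sketch below builds a ListRecords request and reads titles out of a reply, assuming OAI-PMH 2.0. The base URL, the record identifier, and the sample reply are made up for illustration, and the reply is trimmed (a real one also carries responseDate and request elements). No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, from_date=None, set_spec=None):
    """Build a ListRecords request for unqualified Dublin Core (oai_dc)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec    # selective harvesting by set
    return base_url + "?" + urlencode(params)


# Trimmed, hypothetical reply from a repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Accommodating Simplicity and Complexity in Metadata</dc:title>
          <dc:creator>Carl Lagoze</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def titles_by_identifier(response_text):
    """Map each record identifier to the dc:title values it carries."""
    root = ET.fromstring(response_text)
    found = {}
    for rec in root.iterfind("oai:ListRecords/oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        found[ident] = [t.text for t in rec.iterfind(".//dc:title", NS)]
    return found


url = list_records_url("http://example.org/oai", from_date="2007-01-01")
titles = titles_by_identifier(SAMPLE_RESPONSE)
print(url)
print(titles)
```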
![Page 61: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/61.jpg)
Syntax Alternatives: RDF
RDF (Resource Description Framework)
Syntax expressed in XML
W3C recommendation for encoding metadata (a Semantic Web technology)
Enabling technology for richly structured metadata
Rich data model (the DC Abstract Model is a constrained version of RDF)
Metadata can be shared easily among independent applications that understand RDF
W3C – Resource Description Framework (RDF)
http://www.w3.org/RDF/
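Underneath any RDF syntax, the data model is a set of subject, predicate, object statements. As a rough sketch, the record from the earlier XML slide can be written as triples and serialized as N-Triples, a line-oriented RDF syntax used here instead of RDF/XML only because it is easier to read. The subject URI http://example.org/docs/1 and the function name are assumptions for this example, and only plain string literals are handled.

```python
DC = "http://purl.org/dc/elements/1.1/"


def to_ntriples(subject_uri, properties):
    """Serialize (element name, literal value) pairs as N-Triples lines."""
    lines = []
    for name, value in properties:
        # Escape the characters that N-Triples string literals require.
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append(f'<{subject_uri}> <{DC}{name}> "{escaped}" .')
    return "\n".join(lines)


statements = to_ntriples(
    "http://example.org/docs/1",  # hypothetical identifier for the resource
    [
        ("creator", "Carl Lagoze"),
        ("title", "Accommodating Simplicity and Complexity in Metadata"),
        ("date", "2000-07-01"),
    ],
)
print(statements)
```

Each property becomes a full URI, which is what lets independent applications agree on what a statement means.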
![Page 62: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/62.jpg)
Summary: Syntax alternatives
Choices should be driven by local requirements and objectives
• Available expertise
• Costs of deployment
• Objectives and functional requirements
![Page 63: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/63.jpg)
Association Models
Where do we keep the metadata?
Embedded
• HTML META tags, XML, or RDF/XML can be embedded in the resource, and hence travel with the resource
• Simple, but limited in structural richness
Loosely coupled
• Shadow files (like Adobe’s XMP sidecar files)
• Require a system to manage them and ensure that they stay in sync with the resource
• RDF or XML descriptions
Third-party metadata
• Stored in repositories such as library catalogs
• Easier to manage and maintain, and to provide services on
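For the embedded case, a small sketch that generates HTML META tags for a page head. It follows the DCMI convention of a "DC." name prefix declared by a schema.DC link element. The function name and the sample values are assumptions for illustration.

```python
from html import escape


def dc_meta_tags(record):
    """Render (element name, value) pairs as HTML head elements."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for name, value in record:
        # Escape the value so it is safe inside a double-quoted attribute.
        content = escape(value, quote=True)
        lines.append(f'<meta name="DC.{name}" content="{content}">')
    return "\n".join(lines)


head = dc_meta_tags([
    ("title", "Simplicity & Complexity in Metadata"),
    ("creator", "Carl Lagoze"),
    ("date", "2000-07-01"),
])
print(head)
```

The output is flat name and value pairs, which illustrates the "simple, but limited in structural richness" trade-off.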
![Page 64: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/64.jpg)
Questions about syntax alternatives?
![Page 65: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/65.jpg)
Application Profiles: Mixing and Matching Metadata
What is an Application Profile?
Why bother?
Creating new properties
Documenting and declaring new properties
Some examples
![Page 66: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/66.jpg)
Application Profiles: Mixing and Matching Metadata
• The mixing and matching of elements (properties) from separate metadata sets
• An expression of metadata modularity
• Implementers can benefit from peer applications
• Communities can harmonize their metadata, picking complementary properties
• Promotes convergence over time
• For application profiles to work, there must be public declarations of properties that conform to a common data model (or nearly so)
![Page 67: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/67.jpg)
Application Profile: Definition
Declaration of metadata properties used in a given organization or application or community
Documentation of encodings, constraints, and creation guidelines
Implies formal schemas (XML Schemas or RDF Schemas)
Should promote both human understanding and machine interoperability
The concept of application profiles applies to any metadata community of practice, not just DC
DC has promoted their use and leads by example
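One way to picture the definition above is as a table of property URIs with constraints, against which records can be checked. The sketch below is a toy, not a DCMI format: the http://example.org/terms/ namespace, the shelfMark property, and the min/max rule layout are all invented for illustration. It mixes three Dublin Core elements with one locally declared property.

```python
DC = "http://purl.org/dc/elements/1.1/"
LOCAL = "http://example.org/terms/"  # hypothetical local namespace

# Each declared property carries cardinality constraints; max None = unbounded.
PROFILE = {
    DC + "title": {"min": 1, "max": 1},
    DC + "creator": {"min": 0, "max": None},
    DC + "date": {"min": 0, "max": 1},
    LOCAL + "shelfMark": {"min": 0, "max": 1},
}


def check_record(record, profile=PROFILE):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for uri in record:
        if uri not in profile:
            problems.append(f"undeclared property: {uri}")
    for uri, rule in profile.items():
        count = len(record.get(uri, []))
        if count < rule["min"]:
            problems.append(f"too few values for {uri}")
        if rule["max"] is not None and count > rule["max"]:
            problems.append(f"too many values for {uri}")
    return problems


good = {
    DC + "title": ["Accommodating Simplicity and Complexity in Metadata"],
    DC + "creator": ["Carl Lagoze"],
    LOCAL + "shelfMark": ["A-17"],
}
bad = {DC + "creator": ["Carl Lagoze"], LOCAL + "colour": ["blue"]}

print(check_record(good))
print(check_record(bad))
```

A real profile would also document encodings and creation guidelines for human readers, which a table like this does not capture.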
![Page 68: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/68.jpg)
Why bother?
One-size-fits-all metadata results in bloated, unmanageable specifications and applications
APs allow implementers to tailor the element set of a metadata application to specific functional requirements, based on local or community needs, while retaining interoperability with the larger metadata community
![Page 69: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/69.jpg)
Creating an Application Profile
Find out what others have done… don’t re-invent wheels!
Develop community consensus
Define name, label, definition, and relationships (see the DCMI Usage Board guidelines)
Determine an appropriate URI (a home on the Web)
Dublin Core Application Profile Guidelines
http://dublincore.org/usage/documents/profile-guidelines/
![Page 70: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/70.jpg)
Document New Properties
At the very least: a Web page with relevant information
Better: a web page with a public schema using new terms in an application profile
Better still: all properties available as part of a metadata registry
![Page 71: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/71.jpg)
Example Application Profiles
DC-Library AP
DC-Collection Description AP
DC-Government AP
DC-Education AP
![Page 72: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/72.jpg)
Some History of the Dublin Core and How the Initiative Works
• The Beginnings
• Landmarks
• Workshops and conference series
• What the initiative does
• Standardization
• Some example applications
![Page 73: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/73.jpg)
Dublin Core: The Beginning
A casual discussion at WWW-2 in Chicago, October 1994
• How to make things on the Web easier to find?
OCLC & NCSA co-sponsored an invitational workshop in March 1995
The workshop became a workshop series, and eventually a conference series
DCMI: Dublin Core Metadata Initiative
• Governance and process evolved over time
• De facto standards maintenance body
![Page 74: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/74.jpg)
Dublin Core Landmarks
1994: Simple tags to describe Web pages
1995: The Dublin Core is one of many vocabularies needed ("Warwick Framework")
1996: The Dublin Core: 13 elements expanded to 15, appropriate for text and images
1997: WF needs formal expression in a Resource Description Framework (RDF)
![Page 75: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/75.jpg)
Dublin Core Landmarks (continued)
2000: Dublin Core Metadata Initiative recommends qualifiers, broadens its organizational scope beyond the Core
2001: Workshop series becomes a conference series
DCMI Affiliates and a board of trustees
2005: Abstract Model (finally)
![Page 76: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/76.jpg)
The Dublin Core Workshop Series
Workshop venues:
US: DC 1, 3, 6
UK: DC 2
Australia: DC 4
Finland: DC 5
Germany: DC 7
Canada: DC 8
Conferences:
Tokyo (2001)
Florence (2002)
Seattle (2003)
China (2004)
Spain (2005)
Mexico (2006)
![Page 78: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/78.jpg)
DCMI Activities
Standards development and maintenance
Metadata registry and infrastructure
Technical working groups and periodic workshops
Tutorial materials and user guides
Education and training
Open source software
Liaisons with other standards or user communities
![Page 79: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/79.jpg)
Governance of DCMI
DCMI has a Board of Trustees that oversees the operation and goals of the initiative
Managing Director
• Makx Dekkers
Director of Specifications and Documentation
• Tom Baker
An Advisory Board of metadata experts provides guidance on metadata issues
![Page 80: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/80.jpg)
The DCMI Usage Board
The Usage Board is an editorial committee that evaluates proposals for new elements or revisions
International selection of metadata experts
Meets twice yearly
Documents its decisions and updates the DCTERMS document
![Page 81: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/81.jpg)
Affiliate Program
DCMI has National Affiliates which support the Initiative and are represented on the Board of Trustees
• Finland
• UK
• Singapore
• New Zealand
• Korea
OCLC has been the Host from the start
![Page 82: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/82.jpg)
The Three I’s
Independent: DCMI is not controlled by specific commercial or other interests and is not biased towards specific domains nor does it mandate specific technical solutions
International: DCMI encourages participation from organizations anywhere in the world, respecting linguistic and cultural differences
Influenceable: DCMI is an open organization aiming at building consensus among the participating organizations; there are no prerequisites for participation
![Page 83: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/83.jpg)
The work gets done by communities and task groups
Accessibility Community
Collection Description Community
Education Community
Environment Community
Global Corporate Circle
Government Community
Kernel Community
Libraries Community
Localization and Internationalization Community
Preservation Community
Registry Community
Social Tagging Community
Standards Community
Tools Community
![Page 84: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/84.jpg)
Standardization of the Dublin Core
IETF RFC 2413
• http://www.ietf.org/rfc/rfc2413.txt
CEN Workshop Agreement (Europe)
• Endorses Dublin Core elements as CWA 13874
NISO Z39.85
• National Information Standards Organization, an ANSI affiliate
ISO 15836
![Page 85: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/85.jpg)
Metadata Applications - examples
Governments
• 7 governments have adopted DC metadata
Adobe products
• XMP – Adobe’s variant of RDF
• Dublin Core is a base schema
IPTC – International Press Telecommunications Council
• Dublin Core-based standard for journalism
Knowledge management systems commonly use DC metadata
Visual materials require metadata for findability
Library systems (mostly MARC cataloging, but increasingly other metadata as well)
![Page 86: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/86.jpg)
Metadata applications (continued)
Search systems
• Full-text indexing is enormously useful
• Structured metadata improves search
• The Amazoogles are all aggressively courting metadata aggregators
Cameras
• Automatically create metadata for each image
• Some even include GPS data
Commerce systems require metadata
Social software applications are largely about enriching resource information with tags, reviews, and automated linking
![Page 87: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/87.jpg)
To Sum Up…
Many purpose-built metadata standards
Few have explicit data models
Few interoperate
Some will survive, others will not
The Web demands convergence
• Break down silos between domains and communities of practice
• RDF should help promote convergence, but we are not there yet
Expect more metadata standards, but hope for fewer
![Page 88: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/88.jpg)
How to Participate
Join the DC-General mailing list
Join a working group
Information on lists and working groups is available at http://dublincore.org
![Page 89: Dublin Core Metadata Tutorial July 9, 2007 Stuart Weibel Senior Research Scientist OCLC Programs and Research.](https://reader035.fdocuments.net/reader035/viewer/2022070305/5514e839550346b0478b5ac8/html5/thumbnails/89.jpg)
Stuart L. Weibel
Visit me at: http://weibel-lines.typepad.com
Contact me at: [email protected]
Thank you for your attention