A Registry for controlled vocabularies at the Library of Congress
Rebecca Guenther
Network Development & MARC Standards Office,
Library of Congress
October 29, 2008
Oct. 29, 2008 ASIST 2008
Outline of presentation
Types of controlled vocabularies
Vocabularies maintained at LC
An introduction to SKOS
Establishing concept databases at LC
Examples of concept schemes: ISO 639-2 and PREMIS event type
Providing the registry as a web service
Why establish controlled vocabularies?
Control values that occur in metadata
Document and publish for reuse
Reduce ambiguity
Control synonyms
Establish formal relationships among terms (where appropriate)
Test and validate terms
Types of Controlled Vocabularies used in metadata standards
Lists of enumerated values
Code lists (e.g. language, country)
Taxonomies
Formal Thesauri
Locally controlled enumerated lists
Enumerated lists
Simple list of terms used in a pull-down menu or Web site pick list
Values enumerated in an XML schema
Little additional information or structure about each value
Examples:
– Code and value from a MARC 21 fixed field, e.g. code “e” in Leader/06 is “cartographic material”
– Enumerated value “MD5” for METS CHECKSUMTYPE
– Enumerated value “born digital” in MODS digitalOrigin
Code lists
Some established as ISO standards and used worldwide in many communities for many purposes
The standard standardizes the code, not a particular name for it
Codes are used as identifiers
Examples (maintained by LC):
– ISO 639-2 (language codes)
– MARC relator codes
– MARC country codes
Thesauri
A thesaurus is a controlled vocabulary with multiple types of relationships
Example:
Rice
  UF paddy
  BT Cereals
  BT Plant products
  NT Brown rice
  RT Rice straw
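As a rough illustration that is not part of the original slides, the Rice entry can be held in a small Python structure keyed by the relationship codes (UF, BT, NT, RT). The `Term` class, the `RECIPROCAL` table and `build_reciprocals` are hypothetical names chosen for this sketch. It derives the reciprocal entries that a thesaurus conventionally maintains (BT/NT, RT/RT, UF/USE).

```python
from dataclasses import dataclass, field

# Conventional reciprocal relationship codes in a thesaurus:
# broader/narrower term, related term, used for / use.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


@dataclass
class Term:
    label: str
    relations: dict = field(default_factory=dict)  # code -> list of labels

    def add(self, code, other):
        self.relations.setdefault(code, []).append(other)


def build_reciprocals(term):
    """Return the entries implied for the other terms by this term's relations."""
    implied = {}
    for code, others in term.relations.items():
        for other in others:
            entry = implied.setdefault(other, Term(other))
            entry.add(RECIPROCAL[code], term.label)
    return implied


# The entry from the slide.
rice = Term("Rice")
rice.add("UF", "paddy")
rice.add("BT", "Cereals")
rice.add("BT", "Plant products")
rice.add("NT", "Brown rice")
rice.add("RT", "Rice straw")

implied = build_reciprocals(rice)
```

Here `implied["Cereals"]` carries an NT link back to Rice, and `implied["paddy"]` carries a USE reference to the preferred term.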
Standards maintained at LC that use controlled vocabularies
MARC (including code lists)
MODS
METS
MIX (XML schema for Z39.87 Technical metadata for digital still images)
PREMIS
ISO 639-2 (language codes)
Thesaurus of Graphic Materials
LCSH
… and some others
SKOS: What is it?
Simple Knowledge Organisation System(s)
SKOS is …
– for declaring and publishing taxonomies, thesauri or classification schemes, for use in a distributed, decentralised information system (i.e. a semantic web)
– for describing Concepts and creating relationships between Concepts and Terms
– a practical application of RDF
– a formal language for representing controlled, structured vocabularies
The SKOS data model
…views a knowledge organization system as a concept scheme comprising a set of conceptual resources (concepts).
– These concept schemes and conceptual resources are identified by URIs.
– The model is multilingual and extensible
Concepts can be…
labeled with any number of strings. In any given language, one label can be marked as the "preferred" label for that language and others as "alternative" or "hidden" labels; a concept can also carry a notation:
– skos:prefLabel
– skos:altLabel
– skos:hiddenLabel
– skos:notation
Concepts can be…
linked to other concepts within the same concept scheme.
Hierarchical links:
– skos:broader and skos:narrower
– skos:broaderTransitive and skos:narrowerTransitive
Associative links:
– skos:related
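The difference between skos:broader and skos:broaderTransitive can be sketched in a few lines: skos:broader asserts a direct link only, while skos:broaderTransitive covers every ancestor reachable through a chain of such links. The following is a minimal Python sketch, not taken from the slides; the function name `broader_transitive` and the dictionary layout are assumptions, and the sample data reuses the Rice example from the thesaurus slide.

```python
def broader_transitive(concept, broader):
    """Return all ancestors reachable by following direct 'broader' links
    one or more times (the transitive closure for one concept)."""
    seen = set()
    stack = list(broader.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, ()))
    return seen


# Direct (asserted) broader links only.
broader = {
    "Brown rice": ["Rice"],
    "Rice": ["Cereals", "Plant products"],
}

ancestors = broader_transitive("Brown rice", broader)
```

With these assertions, "Brown rice" has one direct broader concept (Rice) but three transitive ones (Rice, Cereals, Plant products).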
Concepts can be…
grouped into collections, which can be labeled and/or ordered. A concept can be in one or more collections.
– skos:Collection
– skos:OrderedCollection
– skos:member
– skos:memberList
Concepts can be…
mapped to other concepts in different concept schemes.
Hierarchical mapping:
– skos:broadMatch
– skos:narrowMatch
Associative mapping:
– skos:relatedMatch
– skos:closeMatch
– skos:exactMatch
Advantages to using SKOS
SKOS has a defined element set which is particularly relevant for controlled vocabularies
Relationships between entries in a thesaurus can be expressed (broader, narrower, etc.)
Relationships between entries in different thesauri can be expressed (exactMatch, relatedMatch)
Having a dereferenceable URI for concepts and their concept schemes enhances the ability to provide web services for consumers of these standards
Controlled vocabularies registry at LC
Library of Congress is establishing databases with controlled vocabulary values for standards that it maintains
Controlled lists are represented using SKOS as well as alternative syntaxes
Lists currently in progress:
– ISO 639-2 and MARC language code list
– MARC geographic area codes
– MARC country code list
– MARC relators
– PREMIS controlled value lists
– Thesaurus of Graphic Materials
Other possibilities:
– Enumerated values in MODS schema
– Coded and uncoded value lists in MARC
Reasons for developing a registry
Facilitate development and maintenance process
Make controlled lists openly available
Develop a web service where comprehensive information about controlled terms is available
Experiment with semantic web technologies
Expose vocabularies to wider communities
http://www.loc.gov:8081/standards/registry/lists.html
Example: ISO 639-2 vocabulary
One in the family of ISO 639 language coding standards
Has a close relationship with other language coding standards (ISO 639-1 and -3, MARC)
LC is maintenance agency
The standard is the CODE, not the language name; multiple names are given
ISO 639-2 language code example
<rdf:Description rdf:about="http://www.loc.gov/standards/registry/vocabulary/iso639-2/por">
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos#Concept"/>
  <skos:prefLabel xml:lang="x-notation">por</skos:prefLabel>
  <skos:altLabel xml:lang="en-Latn">Portuguese</skos:altLabel>
  <skos:altLabel xml:lang="fr-Latn">portugais</skos:altLabel>
  <skos:notation rdf:datatype="xs:string">por</skos:notation>
  <skos:definition xml:lang="en-Latn">This Concept has not yet been defined.</skos:definition>
  <skos:inScheme rdf:resource="http://www.loc.gov/standards/registry/vocabulary/iso639-2"/>
  <vs:term_status>stable</vs:term_status>
  <skos:historyNote rdf:datatype="xs:dateTime">2006-07-19T08:41:54.000-05:00</skos:historyNote>
  <skos:exactMatch rdf:resource="http://www.loc.gov/standards/registry/vocabulary/iso639-1/pt"/>
  <skos:changeNote rdf:datatype="xs:dateTime">2008-07-09T13:49:05.321-04:00</skos:changeNote>
</rdf:Description>
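A consumer of such a record might read it roughly as follows. This is a minimal Python sketch using only the standard library, not code from the registry itself. It assumes the fragment is wrapped in an rdf:RDF element that declares the namespaces (the slide omits the declarations), it uses the draft SKOS namespace shown on the slide, and it copies only a subset of the record's properties. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2008/05/skos#"  # draft namespace used on the slide
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: the slide's fragment wrapped in rdf:RDF with namespace declarations.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="http://www.loc.gov/standards/registry/vocabulary/iso639-2/por">
    <skos:prefLabel xml:lang="x-notation">por</skos:prefLabel>
    <skos:altLabel xml:lang="en-Latn">Portuguese</skos:altLabel>
    <skos:altLabel xml:lang="fr-Latn">portugais</skos:altLabel>
    <skos:exactMatch rdf:resource="http://www.loc.gov/standards/registry/vocabulary/iso639-1/pt"/>
  </rdf:Description>
</rdf:RDF>"""


def summarize(xml_text):
    """Extract the code, the language names and the exactMatch target."""
    root = ET.fromstring(xml_text)
    desc = root.find(f"{{{RDF}}}Description")
    # Language names are altLabels, keyed here by their xml:lang tag.
    names = {el.get(XML_LANG): el.text
             for el in desc.findall(f"{{{SKOS}}}altLabel")}
    match = desc.find(f"{{{SKOS}}}exactMatch")
    return {
        "uri": desc.get(f"{{{RDF}}}about"),
        "code": desc.findtext(f"{{{SKOS}}}prefLabel"),
        "names": names,
        "exact_match": match.get(f"{{{RDF}}}resource") if match is not None else None,
    }


record = summarize(DOC)
```

The result mirrors the slide's point that the code is the preferred label, while the language names are alternative labels: `record["code"]` is the code and `record["names"]` maps language tags to names.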
PREMIS controlled lists
PREMIS Data Dictionary for Preservation Metadata
Some semantic units call for controlled vocabularies and have suggested lists
A central registry could document and make them available
Users could submit their own terms
PREMIS schema could be enhanced with enumerated values for validation generated dynamically
PREMIS event type example
<rdf:Description rdf:about="http://www.loc.gov/standards/registry/vocabulary/preservationEvents/creation">
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos#Concept"/>
  <skos:prefLabel xml:lang="en-latn">creation</skos:prefLabel>
  <skos:narrower rdf:resource="http://www.loc.gov/standards/registry/vocabulary/preservationEvents/migration"/>
  <skos:narrower rdf:resource="http://www.loc.gov/standards/registry/vocabulary/preservationEvents/normalization"/>
  <skos:definition xml:lang="en-latn">the act of creating a new object</skos:definition>
  <skos:inScheme rdf:resource="http://www.loc.gov/standards/registry/vocabulary/preservationEvents"/>
</rdf:Description>
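The idea of generating enumerated values for schema validation from the registry could look roughly like the Python sketch below. It is an illustration only: the type name `eventTypeValues` and the function `enumeration_type` are hypothetical, and taking the last path segment of each concept URI as the enumerated value is an assumption rather than documented registry behaviour.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def enumeration_type(type_name, concept_uris):
    """Build an xs:simpleType that restricts xs:string to the last
    path segment of each concept URI."""
    ET.register_namespace("xs", XS)
    simple = ET.Element(f"{{{XS}}}simpleType", {"name": type_name})
    restriction = ET.SubElement(
        simple, f"{{{XS}}}restriction", {"base": "xs:string"}
    )
    for uri in concept_uris:
        value = uri.rstrip("/").rsplit("/", 1)[-1]
        ET.SubElement(restriction, f"{{{XS}}}enumeration", {"value": value})
    return ET.tostring(simple, encoding="unicode")


BASE = "http://www.loc.gov/standards/registry/vocabulary/preservationEvents/"
schema_fragment = enumeration_type(
    "eventTypeValues",  # hypothetical type name
    [BASE + term for term in ("creation", "migration", "normalization")],
)
```

The returned string is an XML Schema fragment with one xs:enumeration element per concept, which could be regenerated whenever the list in the registry changes.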
[Diagram: registry web service architecture]
Components: User; Registry Web service; XML database using XQuery (eXist); RDF triple store (Sesame)
Flow labels: HTTP request; interprets URI; formulates SPARQL query; runs query; gets results; sends back to database and then to user
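The "interprets URI, formulates SPARQL query" step in the diagram can be sketched as below. The slides do not show the actual implementation (which the diagram places in XQuery on eXist), so this Python version is purely illustrative: the function names, the path prefix check and the shape of the queries are all assumptions, based only on the URI patterns visible in the examples.

```python
from urllib.parse import urlparse

REGISTRY_PREFIX = "/standards/registry/vocabulary/"
SKOS = "http://www.w3.org/2008/05/skos#"  # draft namespace used on the slides


def parse_registry_uri(uri):
    """Split a registry URI into (scheme_name, concept_id).
    concept_id is None when the URI names a whole concept scheme."""
    path = urlparse(uri).path
    if not path.startswith(REGISTRY_PREFIX):
        raise ValueError(f"not a registry vocabulary URI: {uri}")
    parts = [p for p in path[len(REGISTRY_PREFIX):].split("/") if p]
    if not parts or len(parts) > 2:
        raise ValueError(f"unexpected registry path: {path}")
    return parts[0], (parts[1] if len(parts) == 2 else None)


def sparql_for(uri):
    """Formulate a SPARQL query for a concept or a concept scheme URI."""
    _scheme, concept = parse_registry_uri(uri)
    if concept is None:
        # Whole scheme: list the concepts that declare membership in it.
        return (
            f"PREFIX skos: <{SKOS}>\n"
            f"SELECT ?concept WHERE {{ ?concept skos:inScheme <{uri}> }}"
        )
    # Single concept: return every property/value pair stored for it.
    return f"SELECT ?property ?value WHERE {{ <{uri}> ?property ?value }}"


concept_uri = "http://www.loc.gov/standards/registry/vocabulary/iso639-2/por"
scheme_uri = "http://www.loc.gov/standards/registry/vocabulary/iso639-2"
concept_query = sparql_for(concept_uri)
scheme_query = sparql_for(scheme_uri)
```

In this sketch the query string would then be passed to the triple store, and the results returned to the user through the web service, as in the flow labels of the diagram.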
Further development
Consider programming changes to improve speed
Develop mechanisms to output all public documentation from database
Include additional coding about relationships to other concept schemes and controlled vocabularies (facilitating crosswalks)
Encourage experimentation
Questions?
Contacts:
– Rebecca Guenther: [email protected]
– Clay Redding: [email protected]