INLS 520
INLS 520 – Erik Mitchell
INLS 520
Information Organization
Review
• Controlled vocabularies
  – Term Lists, Hierarchies, Trees, Paradigms, Facets, Folksonomies
• Knowledge organization systems
  – Term Lists, Thesauri, Taxonomies, Ontologies
Today
• Poster topic review
• Poster creation concepts
• Automation in metadata & organization
  – RDF
  – OWL
• Guest Speaker – Barrie Hayes
Creating poster presentations
• Content
  – Methodological approach
    • Title, question, overview, methods, findings, observations
  – Conceptual approach
    • Title, question, model or framework
• Structure
  – Balancing words & images
  – Document flow
• Examples - 1, 2, 3, 4
Creating poster presentations
• Technology
  – Creation tools
    • PowerPoint templates in Blackboard
    • An image editor
    • HTML
• Display
  – Flickr, Printed
• Some guidelines 1, 2, 3, 4
Automation
• Group discussion
  – Based on the articles by Hlava, Hearst, and Stearns, what are some of the primary uses of automation in information organization?
  – Is automation primarily a tool to create representations of resources or to enable retrieval of resources?
Automatic Indexing
• Automatic Extraction/representation
  – Lancaster calls this: “Words or phrases appearing in a text are extracted and used to represent the content of the text as a whole” (Lancaster 284)
• Automatic Classification/Categorization
  – The computer compares terms in the document against thesauri or controlled vocabularies to map a word in the resource to an accepted index term (Lancaster 287)
• Automatic Abstract/Surrogate Generation
  – The extraction of sentences from documents to create an abstract (Lancaster 298)
• Automatic index/tool Generation
  – Creating spelling dictionaries, word lists, indexes, etc. to be used for other automation techniques
    • Latent Semantic Indexing - Concepts are extracted and analyzed to detect relationships. These relationships are then used to help identify & rank documents (overview)
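A minimal sketch of the first two techniques on this slide, extraction followed by classification. The stopword list, the controlled vocabulary, and the sample text are all made up for illustration and do not come from Lancaster; real systems use far larger lists and more careful term weighting.

```python
import re
from collections import Counter

# Hypothetical stopword list and controlled vocabulary, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "are", "for", "on"}
VOCABULARY = {
    "cataloging": "Cataloging",
    "cataloguing": "Cataloging",
    "metadata": "Metadata",
    "indexes": "Indexing",
    "indexing": "Indexing",
}

def extract_terms(text, limit=5):
    """Automatic extraction: return the most frequent non-stopword words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]

def classify(terms):
    """Automatic classification: map extracted words to accepted index terms."""
    return sorted({VOCABULARY[t] for t in terms if t in VOCABULARY})

sample = ("Metadata and cataloguing: indexing metadata for retrieval. "
          "Metadata is the basis of indexing.")
terms = extract_terms(sample)
print(terms)
print(classify(terms))
```

Note that the variant spelling "cataloguing" is mapped to the single accepted term "Cataloging", which is the point of comparing extracted words against a controlled vocabulary.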
Automatic metadata use
• Metadata Harvesting
  – Using automated techniques to discover and extract metadata from documents
    • OAI-PMH
• Knowledge representation & discovery
  – Using structured ontologies to make statements and inferences about resources
  – OWL Ontologies
• Metadata interoperability
  – Using structured metadata to automatically connect with other metadata systems
  – OpenURL, XML Schema
• Transformation
  – Creating new documents through the automatic extraction, analysis, and processing of resources
  – XML/XSL
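As a rough illustration of metadata harvesting, the sketch below builds an OAI-PMH ListRecords request and pulls identifiers and Dublin Core titles out of a response. The endpoint URL and the sample response are invented for this example, and nothing is fetched over the network; only the `verb` and `metadataPrefix` parameters and the namespaces come from the OAI-PMH protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical repository endpoint; a real harvester would fetch this URL.
BASE_URL = "http://repository.example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def harvest(xml_text):
    """Return (identifier, title) pairs for every record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records

print(list_records_url(BASE_URL))
print(harvest(SAMPLE_RESPONSE))
```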
Metadata standards
• XML
  – Structure & syntax but no ‘semantic constraints’
• XML Schema
  – Restricts the structure of XML, extends XML with datatypes
• RDF
  – Data model showing relationships among resources
• RDF Schema
  – ‘Vocabulary for describing properties and classes of RDF resources’
• OWL
  – Added vocabulary for describing properties & classes
    • Relationships, cardinality, equality, characteristics
RDF
• Subject, property, object triples
• Transmitted in XML
• RDFS extends RDF with an ontology language
  – Properties, specialization
• OWL – More powerful extension of RDFS
  – Uses the same syntax as RDF
RDF Model
Subject (Resource): Webpage http://www.stuff.com
Predicate (Property type): Author
Object (Value): “Saki Knafo”
“The author of the stuff webpage is Saki Knafo”
- A literal, a triple, a statement
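The statement on this slide can be sketched as a three-part data structure. The `Triple` class and `as_sentence` helper are hypothetical names chosen for this illustration; RDF itself does not define them.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF statement: a resource, a property type, and a value."""
    subject: str    # the resource being described
    predicate: str  # the property type
    obj: str        # the value (here a literal)

statement = Triple("http://www.stuff.com", "Author", "Saki Knafo")

def as_sentence(triple):
    """Read a triple back as a plain-language statement."""
    return f"The {triple.predicate.lower()} of {triple.subject} is {triple.obj}"

print(as_sentence(statement))
```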
How is RDF different?
• RDF is a descriptive model that
  – Allows variable contextualized description
  – Deconstructs the descriptive process
  – Allows more granular automated processing of data
  – Uses exact markup to indicate the context of values (namespaces, schemas)
  – Bags, Sequences, Alternative values, parseType
Encoding RDF in XML
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://www.stuff.com/">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator>SAKI KNAFO</dc:creator>
    <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
    <dc:description>descriptive content</dc:description>
  </rdf:Description>
</rdf:RDF>
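To show how markup like this breaks down into triples, here is a small sketch that reads a shortened copy of the record with Python's standard XML parser. It handles only the simple pattern used on this slide (an rdf:Description with literal-valued child elements) and is not a general RDF/XML parser; the `triples` function is a name invented for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's record.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.stuff.com/">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator>SAKI KNAFO</dc:creator>
  </rdf:Description>
</rdf:RDF>"""

def triples(xml_text):
    """Yield (subject, predicate, object) for each literal-valued property."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree names look like "{namespace}local"; joining the two
            # parts gives the full property URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            yield (subject, predicate, (prop.text or "").strip())

for triple in triples(RDF_XML):
    print(triple)
```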
Iterative RDF description
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vcard="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/vcard.xsd"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://www.stuff.com">
    <dc:title>The Hang: The Island of Black Jeans</dc:title>
    <dc:creator rdf:resource="#Creator_001"/>
    <dc:identifier>http://www.stuff.com</dc:identifier>
    <dc:date>Sun, 16 Sep 2007 01:04:40 GMT</dc:date>
    <dc:description>descriptive content</dc:description>
  </rdf:Description>
  <!-- rdf:ID names this node so that #Creator_001 resolves to it. An rdf:about URI
       (http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_,,,) is the
       alternative; a node cannot carry both. -->
  <rdf:Description rdf:ID="Creator_001">
    <vcard:given>Saki</vcard:given>
    <vcard:family>Knafo</vcard:family>
    <vcard:email>
      <vcard:userid>[email protected]</vcard:userid>
    </vcard:email>
  </rdf:Description>
</rdf:RDF>
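The idea of an iterative description is that one description points at another, which a program can then follow. The sketch below assumes the link is expressed with rdf:resource and the target is named with rdf:ID. The `ex:` namespace and the `creator_name` helper are hypothetical stand-ins for this illustration.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Shortened record; the ex: namespace is a made-up stand-in for a person vocabulary.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:ex="http://example.org/person#">
  <rdf:Description rdf:about="http://www.stuff.com">
    <dc:creator rdf:resource="#Creator_001"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Creator_001">
    <ex:given>Saki</ex:given>
    <ex:family>Knafo</ex:family>
  </rdf:Description>
</rdf:RDF>"""

def creator_name(xml_text, page):
    """Follow dc:creator from a page to the description it references."""
    root = ET.fromstring(xml_text)
    nodes = root.findall(RDF + "Description")
    # Index the descriptions that are named with rdf:ID.
    by_id = {n.get(RDF + "ID"): n for n in nodes if n.get(RDF + "ID")}
    for node in nodes:
        if node.get(RDF + "about") != page:
            continue
        reference = node.find(DC + "creator").get(RDF + "resource")
        person = by_id[reference.lstrip("#")]
        return " ".join(child.text for child in person)
    return None

print(creator_name(DOC, "http://www.stuff.com"))
```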
RDFS
• RDF Schema
  – Defines additional RDF elements that help type relationships
• Special Classes
  – Based on RDF Classes / Properties / Attributes with additional
    • http://www.w3schools.com/rdf/rdf_reference.asp
• Allows the creation of vocabularies / ontologies
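One thing typed relationships make possible is simple inference: if a resource belongs to a class, it also belongs to every superclass. The sketch below assumes a tiny invented hierarchy (Poster, Document, Resource) stored as plain Python pairs that stand in for rdfs:subClassOf and rdf:type statements.

```python
# Hypothetical statements, for illustration only.
SUBCLASS_OF = {("Poster", "Document"), ("Document", "Resource")}  # rdfs:subClassOf
TYPES = {("poster42", "Poster")}                                  # rdf:type

def superclasses(cls, subclass_of):
    """All classes reachable from cls by following subclass links upward."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found

def inferred_types(instance, types, subclass_of):
    """Stated classes of an instance plus every class they specialize."""
    direct = {cls for inst, cls in types if inst == instance}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, subclass_of)
    return result

print(sorted(inferred_types("poster42", TYPES, SUBCLASS_OF)))
```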
Ontology Definitions
• “The study of being or existence”
• “A specification of a conceptualization” (Gruber)
• “An ontology formally defines a common set of terms that are used to describe and represent a domain.” (OWL)
Ontology Concepts
• Classes – Names of objects in the domain
• Relationships between classes
  – Connections between classes
• Properties of classes
  – Background or identifying knowledge of these objects
• Constraints on these properties & relationships
  – Limits and parameters of the relationships
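A minimal sketch of how classes, properties, and constraints fit together, assuming a made-up one-class ontology held in a Python dictionary. The class name, property names, and the `violations` helper are hypothetical; the only constraints checked are datatype and maximum number of values.

```python
# Hypothetical mini-ontology: one class, two properties, each with constraints.
ONTOLOGY = {
    "Document": {
        "hasTitle": {"type": str, "max": 1},      # at most one title
        "hasAuthor": {"type": str, "max": None},  # any number of authors
    },
}

def violations(cls, instance):
    """List the ways an instance description breaks its class's constraints."""
    problems = []
    allowed = ONTOLOGY[cls]
    for prop, values in instance.items():
        if prop not in allowed:
            problems.append(f"{prop}: not defined for {cls}")
            continue
        rule = allowed[prop]
        if rule["max"] is not None and len(values) > rule["max"]:
            problems.append(f"{prop}: at most {rule['max']} value(s) allowed")
        if not all(isinstance(v, rule["type"]) for v in values):
            problems.append(f"{prop}: values must be {rule['type'].__name__}")
    return problems

good = {"hasTitle": ["The Hang"], "hasAuthor": ["Saki Knafo"]}
bad = {"hasTitle": ["The Hang", "Another title"], "hasPrice": [10]}
print(violations("Document", good))
print(violations("Document", bad))
```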
A good ontology has
• Features:
  – Meaningful – all classes have instances
  – Accurate / correct
  – Non-redundant – each class/instance is represented in a single way
  – Rich in description – context, content
• Enabled functionality:
  – Able to use queries to connect new pieces of information
  – Use XML & definitions to integrate knowledge across domains
Ontology Continuum
• Keyword Lists
• Basic Thesauri
• Complex Thesauri
• Taxonomies
• Simple Ontologies (WordNet)
• Complex Ontologies (OWL)
SHOE Ontology project
• Possible to build an ontology for anything
  – Simple HTML Ontology Extensions (SHOE) Project
    • http://www.cs.umd.edu/projects/plus/SHOE/
    • http://www.cs.umd.edu/projects/plus/SHOE/html-pages.html
• Sample projects
  – Beer Ontology
    • http://www.cs.umd.edu/projects/plus/SHOE/onts/index.html#beer
  – Document Ontology
    • http://www.cs.umd.edu/projects/plus/SHOE/onts/docmnt1.0.html
Ontology Concepts
• Multiple inheritance
• Vertical and horizontal relationships
• Decomposed subject/object
• Predicate based description (isRelatedto, hasVersion)
Creating an Ontology
• Determine Scope of field, define boundaries
• Check for existing ontologies, vocabularies
• Select a top-down/bottom-up approach
  – Identify concepts, vocabulary, parameters, constraints
• Identify relationships
  – Multiple hierarchies, inheritance
• Build, test, maintain
OWL (Web Ontology Language)
• An ontology language geared towards representing information on the web
  – Classes, properties, and relationships that describe URIs and their facets
• Based on the triple concept
  – Subject, Predicate, Object
  – 3 versions: OWL-Lite, OWL-DL, OWL-Full
• Formatted in RDF/XML
  – Uses RDF and RDFS as a foundation
  – Adds new elements in the owl namespace
OWL Versions
• OWL-Lite
  – Simple hierarchies, constraints
• OWL-DL
  – Uses description logics
    • Logic-based semantic markup based on first-order predicate logic
  – Still guarantees finite relationship processing
  – Adds ‘reasoning’ capacity to infer information/relationships
• OWL-Full
  – Most complex
  – Open ended, possible to get into infinite processing
OWL Example
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl#"
    xml:base="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl">
  <owl:Ontology rdf:about="">
    <owl:versionInfo rdf:datatype="http://www.w3.org/2001/X...">1.0</owl:versionInfo>
    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
      An ontology containing the basic part relations: partOf, hasPart, partOf_directly, and hasPart_directly. These are described in the accompanying note. Author: Chris Welty
    </rdfs:comment>
  </owl:Ontology>
  <owl:TransitiveProperty rdf:ID="partOf">
    <owl:inverseOf>
      <owl:TransitiveProperty rdf:ID="hasPart"/>
    </owl:inverseOf>
  </owl:TransitiveProperty>
  <owl:ObjectProperty rdf:ID="hasPart_directly">
    <rdfs:subPropertyOf rdf:resource="#hasPart"/>
    <owl:inverseOf>
      <owl:ObjectProperty rdf:ID="partOf_directly">
        <rdfs:subPropertyOf rdf:resource="#partOf"/>
      </owl:ObjectProperty>
    </owl:inverseOf>
  </owl:ObjectProperty>
</rdf:RDF>
(Chris Welty)
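The part-whole ontology above declares partOf as transitive and hasPart as its inverse. The sketch below shows, under simplifying assumptions, what a reasoner can derive from those two declarations. The piston/engine/car facts are invented for illustration, and the functions are plain Python stand-ins, not an OWL reasoner.

```python
# Hypothetical stated facts: (part, whole) pairs for the partOf property.
PART_OF = {("piston", "engine"), ("engine", "car")}

def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def inverse(pairs):
    """Swap each pair, as owl:inverseOf implies."""
    return {(b, a) for a, b in pairs}

part_of = transitive_closure(PART_OF)  # partOf is transitive
has_part = inverse(part_of)            # hasPart is the inverse of partOf

print(sorted(part_of))
print(sorted(has_part))
```

Only two facts were stated, yet the piston is inferred to be part of the car and the car is inferred to have the piston as a part.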
OWL – Lite features
• Class
  – A collection of things related to each other by properties
• rdfs:subClassOf
  – A way of showing hierarchical class relationships
• rdf:Property
  – A stated relationship between a thing and a value (hasChild, hasRelative, hasSibling, hasAge)
  – Bi-directional, Transitive (hasAncestor)
• rdfs:subPropertyOf
  – Similar to subClassOf, a way of showing property hierarchies
• Individual
  – Instances of classes (objects)
OWL relationships
Practical guide to OWL ontologies
Some OWL Examples
• Airport
• Pizza
Next Week(s)
• 10/28 – No class
• 11/4 – Metadata based services, guest speaker
• 11/11 – no class
• 11/18 – no class
• 11/25 – semantic web
• 12/2 – final projects due