Reflections from the FACET Project Doug Tudhope Hypermedia Research Unit University of Glamorgan NKOS Workshop, JCDL 2005

Reflections from the FACET Project

Doug Tudhope

Hypermedia Research Unit

University of Glamorgan

NKOS Workshop, JCDL 2005

Presentation

• FACET Project
– Faceted Knowledge Organisation Systems (KOS)
– Semantic expansion
– Web Demonstrator

• Reflections / Current work
– Need for standard representations and API
– Pilot Terminology Services
– KOS and Semantic Web
– Cost/Benefit issues

FACET - Faceted Access to Cultural hEritage Terminology

FACET: a collaborative project investigating the potential of semantic term expansion in retrieval

Aims:
• Integration of thesaurus into the interface
• Semantic term expansion and matching function taking advantage of facet structure

http://www.comp.glam.ac.uk/~FACET/

FACET Collaborators

• Research Council Funding: EPSRC 3 years

• National Museum of Science and Industry (NMSI):

National Railway Museum and Science Museum Collections Database

• J. Paul Getty Trust

Art and Architecture Thesaurus (AAT)

• Museum Documentation Association (MDA)

Railway Thesaurus

• Canadian Heritage Information Network (CHIN)

Advisors

NRM Collection examples of free text object descriptor fields

• Chair, London Midland & Scottish Railway, straight wooden back initials carved on back, green leatherette seat.

• Chair, Railway Clearing House, Curved back with blue leather inset & blue leather seat. R. C.H. carved on back

• Chair, M.S. & L.R., Straight back, blue leather seat with M.S. & L.R. carved across back

• Armchair, Pullman, green plush, fringed from Pullman section.

• Carver chair, Oak with oval brocade seat. Prince of Wales crest on back from Royal Saloon of 1876

• Armchair, Upholstered in blue maquette with curved, buttoned back & scroll arms. Wooden legs

• Occasional table, Oak with drawer, ornately carved. From Royal Saloon of 1876

• Set of 4 chairs, High-backed carver chairs upholstered in floral maquette

• Clock, made by Jno Walker, 250 Regent Street. Metal face/Roman numerals. Carved wooden square case. 20"x18"x10"

Semantic Term Expansion

Reasoning over thesaurus semantic relationships allows the system to play an active role:

• Ranking of matching items in a result set
• Automatic suggestion of terms to be considered for query
• Query reformulation and ‘more like this’ option
• Augmented browsing tools: semantic expansion

Underpinning technologies:
• Measures of distance over the semantic index space
• Matching function for sets of terms
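
One way to picture a distance measure over the semantic index space is a cheapest-path traversal of thesaurus relationships, where each relationship type carries a traversal cost and expansion stops at a threshold. The sketch below is illustrative only: the toy thesaurus fragment, the cost values and the function name `expand` are assumptions, not the FACET engine's actual data or parameters.

```python
import heapq

# Assumed traversal costs per relationship type (illustrative values only).
COST = {"NT": 1.0, "BT": 1.5, "RT": 2.0}

# Toy thesaurus fragment: term -> list of (relationship, neighbouring term).
THESAURUS = {
    "chairs": [("NT", "armchairs"), ("NT", "carver chairs"),
               ("BT", "seating furniture")],
    "armchairs": [("BT", "chairs")],
    "carver chairs": [("BT", "chairs")],
    "seating furniture": [("NT", "chairs"), ("NT", "benches"),
                          ("RT", "upholstery")],
    "benches": [("BT", "seating furniture")],
    "upholstery": [("RT", "seating furniture")],
}


def expand(term, threshold):
    """Return {term: distance} for every term whose cheapest path from
    `term` costs no more than `threshold` (Dijkstra-style traversal)."""
    best = {term: 0.0}
    queue = [(0.0, term)]
    while queue:
        dist, current = heapq.heappop(queue)
        if dist > best.get(current, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for relationship, neighbour in THESAURUS.get(current, []):
            candidate = dist + COST[relationship]
            if candidate <= threshold and candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best


# Terms ordered from semantically closest to furthest.
ranked = sorted(expand("armchairs", 3.0).items(), key=lambda item: item[1])
```

The same distances can drive ranking (closer terms score higher) or term suggestion (offer the nearest neighbours to the user).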

FACET Prototype

• SQL Server database: collections DB and thesaurus
• C++ thesaurus term expansion engine
• Dual thesaurus representations
– database
– in-memory data structure
• Visual Basic and Web client interfaces
– ‘Find Term’ mapping to terms, alternates, scope notes
– Browse hierarchies
– Semantic browsing
– Query Builder
– Ranked results

Faceted Knowledge Organisation Systems

Faceted classifications are based on primary division into fundamental, high-level categories (facets).

Compound descriptors (multi-concept headings) are synthesised by combining terms from a limited number of fundamental facets.

In constructing the AAT, adjectival noun phrases were very common, e.g. painted oak furniture.

“Rather than enumerate the nearly infinite number of object and subject descriptions needed by thesaurus users, the AAT decided to pursue the building blocks of these descriptors in the form of a faceted vocabulary”

(Guide to Indexing and Cataloging with the Art & Architecture Thesaurus)

Matching Problem

“The major problem lies in developing a system whereby individual parts of subject headings containing multiple AAT terms are broken apart, individually exploded hierarchically, and then reintegrated to answer a query with relevance”

(Toni Petersen, AAT Director)

Query: mahogany, dark yellow, brocading, Edwardian, armchair

Descriptor: oak, light yellow, crests, ovals, brocade, Victorian, Carver chair

Potentially extra / missing / partially and non-matching terms
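
To make the matching problem concrete, the sketch below scores the query against the descriptor by taking, for each query term, its best partial match in the descriptor and averaging the results. This is a simplified illustration, not the FACET matching function itself: the pairwise closeness values are invented for the example, and in a real system they would come from distances over the thesaurus.

```python
# Invented closeness values in [0, 1] for illustration only.
CLOSENESS = {
    ("mahogany", "oak"): 0.6,
    ("dark yellow", "light yellow"): 0.8,
    ("brocading", "brocade"): 0.7,
    ("Edwardian", "Victorian"): 0.5,
    ("armchair", "Carver chair"): 0.9,
}


def closeness(a, b):
    """Symmetric lookup: identical terms score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return CLOSENESS.get((a, b), CLOSENESS.get((b, a), 0.0))


def match(query, descriptor):
    """Average, over the query terms, of each term's best match in the descriptor."""
    if not query:
        return 0.0
    total = sum(max((closeness(q, d) for d in descriptor), default=0.0)
                for q in query)
    return total / len(query)


query = ["mahogany", "dark yellow", "brocading", "Edwardian", "armchair"]
descriptor = ["oak", "light yellow", "crests", "ovals", "brocade",
              "Victorian", "Carver chair"]
score = match(query, descriptor)
```

In this simplified form, extra descriptor terms (crests, ovals) are ignored and a query term with no counterpart contributes zero; whether and how to penalise extra or missing terms is a design decision the sketch leaves open.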

System Architecture

[Architecture diagram; recoverable component labels:]
• SQL Server database: museum collection & indexing thesaurus
• Transact SQL stored procedures
• Active-X Data Objects (ADO) data access components
• Database interaction module
• Application data objects
• Term expansion engine and data structure
• Query and matching functions
• Application interfaces: compiled VB client interface and web browser interface
• Persistent XML data: queries, parameters etc.

FACET standalone system

http://www.comp.glam.ac.uk/~facet/webdemo/

[email protected]

FACET Web Demonstrator

• Illustrates thesaurus content and semantic expansion in a fairly realistic Web prototype application

• Intended more as an exploration of FACET research outcomes as dynamically generated Web components than as a general interface, but suggestive of possible interface components

• Does not rely on pre-built static HTML pages: thesaurus content is generated dynamically

http://www.comp.glam.ac.uk/~FACET/webdemo/

FACET Web Demonstrator implementation

• Browser-based interface (ASP application), using a combination of server-side scripting and compiled components

• Persistence of state information between page requests is problematic, since HTTP is (by design) stateless

• Solution adopted for the current demonstrator: small 'scriptlet' interface components that communicate with the server without causing the browser to refresh the entire page

• Side effect: introduces some (IE) platform dependence

FACET Web Demonstrator

Some lessons learned

• Results from FACET show potential of faceted KOS for
– Query expansion (ranked results based on semantic closeness)
– Semantic expansion as a browsing tool when wishing to use KOS behind the scenes

• Web demonstrator a first step
– Based on custom API
– KOS and database on same server (but need not be)
– How to generalise these techniques?

• Need for common KOS representations and APIs for general terminology (KOS) services

KOS integration into DL services from Hill et al Research Agenda (SigCR Workshop 2002)

Taxonomy of KOS - KOS types linked to DL service protocols

Registries of KOS and KOS-level metadata to represent them

RDF/XML KOS representations - customisable

Core set of relationship types across all KOS

General KOS service protocol from which protocols for specific types of KOS can be derived

Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships)

Visualization tools that fully use and display the rich semantics embedded in KOS

Towards Terminology Services

• KOS-based services as elements of applications with some form of search/indexing component

• Next phase of work looks at common KOS representation formats and API protocols - making content available via programmatic interfaces

• E.g. SKOS Core (RDF/XML) Schema and SKOS API deliverables of SWAD-Europe Thesaurus Activity - http://www.w3.org/2001/sw/Europe/reports/thes

• Experiments with XPath-based KOS interfaces (using XML and SKOS schemas) are promising for relatively small KOS held within the web browser
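
As a rough illustration of path-based access to a small KOS held in memory, the sketch below queries a SKOS RDF/XML fragment with the limited XPath subset in Python's standard `xml.etree.ElementTree`. The FACET experiments ran in the web browser; the Python setting, the `example.org` URIs, the three-concept thesaurus and the helper names here are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical three-concept thesaurus in SKOS RDF/XML.
SKOS_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/kos/seating">
    <skos:prefLabel>seating furniture</skos:prefLabel>
    <skos:narrower rdf:resource="http://example.org/kos/chairs"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/kos/chairs">
    <skos:prefLabel>chairs</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/kos/seating"/>
    <skos:narrower rdf:resource="http://example.org/kos/armchairs"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/kos/armchairs">
    <skos:prefLabel>armchairs</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/kos/chairs"/>
  </skos:Concept>
</rdf:RDF>"""

root = ET.fromstring(SKOS_XML)


def find_concept(uri):
    """Return the skos:Concept element identified by `uri`, or None."""
    for concept in root.findall("skos:Concept", NS):
        if concept.get(RDF_ABOUT) == uri:
            return concept
    return None


def pref_label(uri):
    """Preferred label of the concept, or None if the concept is unknown."""
    concept = find_concept(uri)
    return None if concept is None else concept.findtext("skos:prefLabel", namespaces=NS)


def linked_labels(uri, relation):
    """Labels of concepts reached via skos:broader or skos:narrower."""
    concept = find_concept(uri)
    if concept is None:
        return []
    targets = [link.get(RDF_RESOURCE) for link in concept.findall("skos:" + relation, NS)]
    return [pref_label(target) for target in targets]
```

Scanning every concept on each lookup is acceptable for a small vocabulary, which is consistent with the slide's caveat about relatively small KOS.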

Pilot KOS Browser Client Web Service

• SKOS API designed to provide programmatic access to thesauri and related KOS via the web
– Builds on Zthes, ADL Protocols

• DREFT demonstration web services server based on SKOS API available(?) at ILRT http://www.w3.org/2001/sw/Europe/reports/thes/dreft/

• Only a subset of SKOS API calls was available at the time of the work, so we investigated possibilities with just 2 API calls: the pilot SKOS API browsing client demonstrates browsing of an online thesaurus (GEMET - GEneral Multilingual Environmental Thesaurus) via web service calls

• GEMET thesaurus also has its own work on a web service API

Page 19: Reflections from the FACET Project Doug Tudhope Hypermedia Research Unit University of Glamorgan NKOS Workshop, JCDL 2005.

Pilot SKOS API Web Service Browser

• getConcept
• getAllConceptRelatives
• Shows semantically connected concepts but not relationships
• Navigation history and local cache of retrieved concepts implemented
• API needs more work but is a basis for web services
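
The browsing pattern described here (two calls, a navigation history and a local cache) can be sketched as follows. The real SKOS API was a web service; the in-memory `StubService`, its method signatures, the concept identifiers and the labels below are all hypothetical stand-ins so that the example runs without a network.

```python
class StubService:
    """In-memory stand-in for a remote terminology service (hypothetical data)."""

    def __init__(self):
        self.calls = 0  # counts simulated round trips
        self._labels = {"c1": "soil", "c2": "soil pollution", "c3": "soil erosion"}
        self._relatives = {"c1": ["c2", "c3"], "c2": ["c1"], "c3": ["c1"]}

    def get_concept(self, concept_id):
        self.calls += 1
        return {"id": concept_id, "label": self._labels[concept_id]}

    def get_all_concept_relatives(self, concept_id):
        # Returns connected concepts only, not the type of relationship.
        self.calls += 1
        return list(self._relatives[concept_id])


class Browser:
    """Browsing client with a local cache and a navigation history."""

    def __init__(self, service):
        self.service = service
        self.cache = {}
        self.history = []

    def visit(self, concept_id):
        if concept_id not in self.cache:  # only call the service on a cache miss
            concept = self.service.get_concept(concept_id)
            concept["relatives"] = self.service.get_all_concept_relatives(concept_id)
            self.cache[concept_id] = concept
        self.history.append(concept_id)
        return self.cache[concept_id]

    def back(self):
        """Step back one entry in the history; served entirely from the cache."""
        if not self.history:
            return None
        if len(self.history) > 1:
            self.history.pop()
        return self.cache[self.history[-1]]
```

Revisiting a concept or stepping back costs no further service calls, which matters when each call is a remote round trip.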

Semantic Expansion Service

• API should reflect use patterns and include composite calls in addition to returning atomic KOS data elements

• Ongoing work: semantic expansion as a service, as an API protocol element, would yield
– different configurations of KOS interface displays by a single call
– novel interfaces, such as navigation via semantic expansion
– query expansion for various ranked result query services
– term suggestion to assist indexing/annotation

• More details: KOS at your Service: Programmatic Access to Knowledge Organisation Systems http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/
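
The difference between atomic and composite calls can be sketched as a single `expand_concept` call that returns a concept together with everything within a given number of links, so that a client can build a display from one response. The call name, the response shape and the toy data are assumptions for illustration and do not describe an existing protocol.

```python
from collections import deque

# Toy KOS: identifier -> (label, identifiers of directly connected concepts).
KOS = {
    "chairs": ("chairs", ["armchairs", "seating"]),
    "armchairs": ("armchairs", ["chairs"]),
    "seating": ("seating furniture", ["chairs", "benches"]),
    "benches": ("benches", ["seating"]),
}


def get_concept(concept_id):
    """Atomic call: one concept per request."""
    label, _ = KOS[concept_id]
    return {"id": concept_id, "label": label}


def expand_concept(concept_id, max_hops):
    """Composite call: the concept plus all concepts within `max_hops` links,
    ordered by hop count (breadth-first traversal)."""
    hops = {concept_id: 0}
    queue = deque([concept_id])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the requested radius
        for neighbour in KOS[current][1]:
            if neighbour not in hops:
                hops[neighbour] = hops[current] + 1
                queue.append(neighbour)
    ordered = sorted(hops.items(), key=lambda item: (item[1], item[0]))
    expansion = [{"id": cid, "label": KOS[cid][0], "hops": n}
                 for cid, n in ordered if cid != concept_id]
    return {"concept": get_concept(concept_id), "expansion": expansion}
```

With atomic calls only, the client would need one request per concept to assemble the same view; the composite call moves that traversal to the server.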

Future work - KOS and Semantic Web?

• Important to provide a bridge/migration between KOS and ontologies. KOS can be an element of higher-level ontologies and schemas and can help leverage them.
– E.g. utilising SKOS RDF/XML Schemas
– E.g. DELOS JPA semantic interoperability project mapping a thesaurus to CRM Upper Ontology

• Ontologies, as formal precise definitions of relationships, can be combined with inference rules and automated systems: many useful applications (e.g. e-Science) where objects and operations are well defined

but also

• Take advantage of existing KOS in Semantic Web
– Some confusion as to how KOS are intended to be used
– Need for education as to KOS design context/purpose

The ‘ontological ideology’ (Adorno)

• Assumption that allocation of instances to categories is unproblematic (in everyday life)

– tendency to make invisible the ‘interpretive work’ in assigning objects to concepts, the bending of categories and evolution of the meaning of concepts through use

• DL application of concepts to ‘documents’ in indexing/search is also not unproblematic

– Related via “aboutness”, not a clear-cut instance relationship
– Indexer - Searcher (and Indexer) variation in concept selection
– Use of results based on probable relevance judgements

KOS (intellectual) usually

• Designed in order to assist generalised retrieval

• Basis of construction is perceived assistance in indexing/ searching/browsing as much as logical properties of attributes

• Recognition that the semantic structure is to some extent ‘conventional’, with different possible cognitive viewpoints, but that users can be assisted to explore a given structure and make use of it for own purposes

How to apply KOS?

• Domain-dependent level of precision in concept use
– Important to take into account how applications will process concepts

• Current KOS relationships at a useful level of generality for many applications (with some specialisation?) where results are based on probable relevance judgements
– E.g. thesaurus as pragmatic tool: includes semantics, domain lexicon (UF/ALTs, Scope Notes)

• Cost/benefit issues for KOS applications in granularity of relationships and degree of formalisation

• Role for knowledge-based interactive tools in semantic web
– old debates on Expert Systems vs Systems for Experts

NKOS Workshop at ECDL 2005 on a related theme to this workshop

• NKOS Workshop: Mapping Knowledge Organisation Systems: User-centred Strategies
ECDL 2005, September 22nd, Vienna
see http://www2.db.dk/nkos2005/

• Selected papers from the NKOS workshop will be considered for a forthcoming special issue of the journal New Review of Hypermedia and Multimedia, along with an open call for papers.

References

Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/

FACET Case Study, DigiCult Thematic Issue 6: Resource Discovery Technologies for the Heritage Sector, http://www.digicult.info/pages/Themiss.php

FACET website. http://www.comp.glam.ac.uk/~FACET/

FACET Web demonstrator http://www.comp.glam.ac.uk/~FACET/webdemo/

FACET XPath work http://www.comp.glam.ac.uk/~FACET/formats/

Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR - http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc

Tudhope D., Binding C., Blocks D., Cunliffe D. 2002. Compound Descriptors in Context: A Matching Function for Classifications and Thesauri. JCDL 2002, 84-93.

Contact Information

Doug Tudhope

School of Computing

University of Glamorgan

Pontypridd CF37 1DL

Wales, UK

[email protected]

http://www.comp.glam.ac.uk/pages/staff/dstudhope