Collection & Service Description and the NISO Metasearch Initiative
Juha Hakala, Director (IT), Helsinki University Library
Chair, NISO Metasearch Initiative Task Group 2
Pete Johnston, UKOLN, University of Bath
Member, NISO Metasearch Initiative Task Group 2
Special Session, DC-2004,
Shanghai, China, Wednesday 13 October 2004
http://www.ukoln.ac.uk/
Collection & Service Description for the NISO Metasearch Initiative
• The Metasearch problem
• The NISO Metasearch Initiative
• Collections and Services
• Collection Description & Service Description
The Metasearch problem
The problem
• Content providers make their collections available through their own separate “presentation services”
• User wants to access/use items from multiple content providers
• User has to discover, access and interact with multiple presentation services
• But
  – Each service has different user interface for discovery
  – Results human-readable (HTML), but difficult to merge, reuse, manipulate
  – Different authentication/access requirements
The solution (ideally…)
• The provision of "Metasearch" services that
  – enable user to search across the metadata databases of multiple content providers from a single interface
  – manage multiple result sets and present to user
  – manage authentication/access
  – (etc!)
• Technologies exist e.g.
  – (Real-time) cross-searching (Z39.50, SRW/U, service-specific APIs)
  – Harvesting (OAI-PMH)
• Seamless (to the user) discovery of and access to heterogeneous, distributed resources!
• However…
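As a rough illustration of the metasearch idea on this slide, the following Python sketch sends one query to several content providers and merges the result sets into a single list. The provider functions, the record fields and the choice to de-duplicate on an "id" field are all hypothetical stand-ins; a real service would sit on top of Z39.50, SRW/U or a provider-specific API.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
SearchFn = Callable[[str], List[Record]]

# Hypothetical in-memory "providers"; real ones would be reached via
# Z39.50, SRW/U or a service-specific API.
def search_provider_a(query: str) -> List[Record]:
    data = [{"id": "a1", "title": "Metadata basics"},
            {"id": "x9", "title": "Shared record"}]
    return [r for r in data if query.lower() in r["title"].lower()]

def search_provider_b(query: str) -> List[Record]:
    data = [{"id": "x9", "title": "Shared record"},
            {"id": "b2", "title": "Record keeping"}]
    return [r for r in data if query.lower() in r["title"].lower()]

def metasearch(query: str, providers: Dict[str, SearchFn]) -> List[Record]:
    """Query every provider and merge the result sets, de-duplicating on 'id'."""
    merged: Dict[str, Record] = {}
    for name, search in providers.items():
        for record in search(query):
            # Keep the first copy seen and note every provider that returned it.
            entry = merged.setdefault(record["id"], {**record, "sources": name})
            if name not in entry["sources"].split(","):
                entry["sources"] += "," + name
    return list(merged.values())

PROVIDERS = {"A": search_provider_a, "B": search_provider_b}
results = metasearch("record", PROVIDERS)
```

Even in this toy form the merge step depends on every provider returning records in an agreed shape, which is the kind of agreement the rest of the presentation is concerned with.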
The problems with Metasearch today
• User requires/expects resources from increasing range of content providers
• Many content providers have not implemented standards-based search interfaces
  – Many proprietary APIs
  – Some "screen scraping" (parsing of HTML)
• Metasearch services do work, but
  – fragile, susceptible to changes by content provider
  – labour-intensive, scalability issues
  – duplication of effort
• Also content provider concerns about
  – efficiency/effectiveness of search
  – access management, logging etc
  – branding/IPR/presentation of results
What is needed
• For effective Metasearch services, content providers and service providers need agreement on (at least…)
  – Transport protocol(s)
  – Query language(s)
    • syntax and semantics
  – Metadata schemas
    • syntax and semantics
  – Intellectual property rights issues
    • how metadata records and resources are presented, used
  – Authorisation / authentication
  – Disclosure / discovery of collections and services
The NISO Metasearch Initiative
http://www.lib.ncsu.edu/niso-mi/
The NISO Metasearch Initiative
• Response to content provider/service provider concerns
• Bring together
  – Content providers
  – System vendors
  – Library service providers
  – Standards developers
• "To identify, develop, and frame the standards and other common understandings that are needed to enable an efficient and robust information environment"
The NISO Metasearch Initiative
• Aims to enable
  – metasearch service providers to offer more effective and responsive services
  – content providers to deliver enhanced content and protect their intellectual property
  – libraries to deliver services that distinguish their offerings from other free web services
The NISO Metasearch Initiative
• Standardisation of metasearch applications (portals) must be accomplished
  – Traditional integrated library systems are fairly well standardised
    • ISO 2709 Exchange format (since early 1970s)
    • MARC formats & AACR2 cataloguing rules
    • Z39.50 Information retrieval protocol
    • ISO ILL, NCIP
  – For metasearch applications, many relevant standards will be developed in the NISO MI
• This will e.g. enable libraries and other users to exchange metadata between them
The NISO Metasearch Initiative: Current Activity
• Task Group 1: Access Management
  – Gather requirements for access/authentication
  – Describe existing processes
  – Develop use cases
• Task Group 2: Collection Description
  – Establish metasearch services' requirements for description of
    • Collections
    • Services which provide access to Collections ("Informational Services")
  – Select/develop metadata schemas
  – Recommend syntax for representation & data exchange
• Task Group 3: Search & Retrieval
  – Describe existing practice
  – Metadata to describe result sets
  – Metadata to describe article-level citations
Collections and Services
Collections and Services
• Item
  – A physical or digital entity
• Collection
  – An aggregation of one or more items
• Service
  – The provision of, or system of supplying, one or more functions of interest to an end-user or software application.
  – Physical or digital
  – Digital services may be "structured" or "unstructured"
• Informational services
  – Services that provide access to, or metadata about, items and/or collections
  – JISC Information Environment Architecture: Glossary
http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/
[Figure: Informational services. A collection of digital or physical items and a collection of digital metadata records are exposed through structured network services (OAI repository: harvest via OAI-PMH; Z39.50 target: search/retrieve via Z39.50; RSS channel: alert via RSS/HTTP) and an unstructured network service (Website: "screen-scrape").]
Functional Model: “Surveying the landscape”
• Agent
  – "Enters" information landscape
    • Views a default set of collections, based on information about the agent
  – "Surveys" landscape
    • Modifies landscape by adding/removing collections, based on information about the collections
  – "Discovers" items of interest within collections
    • "Drills down" into selected collections
• N.B. Agent may be
  – Human researcher
  – Human administrator of presentation service
  – Software application acting on behalf of human researcher
http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/functional-model/
http://ccinterop.cdlr.strath.ac.uk/
[Figure: Surveying the information landscape. "My default landscape" (Coll B, Coll C, Coll D, Coll E) is modified to give a landscape of Coll A, Coll B, Coll C, Coll D.]
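The "surveying the landscape" model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Landscape` class and `default_landscape` function are hypothetical names, and the default set is hard-coded here, whereas the model derives it from information about the agent. Adding Coll A and removing Coll E is one reading of the change shown on the slide.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class Landscape:
    """The set of collections an agent currently has in view."""
    collections: Set[str] = field(default_factory=set)

    def add(self, collection: str) -> None:
        self.collections.add(collection)

    def remove(self, collection: str) -> None:
        # discard() is a no-op if the collection is not in view.
        self.collections.discard(collection)

def default_landscape() -> Landscape:
    # Hypothetical default; the model bases it on information about the agent.
    return Landscape({"Coll B", "Coll C", "Coll D", "Coll E"})

landscape = default_landscape()
landscape.add("Coll A")     # agent adds a collection after surveying
landscape.remove("Coll E")  # and drops one that is not of interest
modified = sorted(landscape.collections)
```

The decision to add or remove a collection is where collection-level descriptions come in: the agent needs enough information about each collection to make that choice.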
Functional Requirements
• Allow an agent to
  – Discover collections of potential interest
  – Identify a collection
  – Select one or more collections from amongst a number of discovered collections
  – Identify the informational services that provide access to the collection
  – Select a service with which to interact
  – Interact with service
    • Subject to "knowledge" of interface semantics
[Side labels on slide: Collection description; Service description]
Relations between Collections and Services
• Relationships exist
  – Between collections and services
  – Between collections
• In NISO MI conceptual model
  – A collection is-made-available-by zero or more services
  – A service makes-available exactly one collection
  – A collection is-part-of zero or more (super-)collections (parent)
  – A collection has-parts zero or more (sub-)collections (child)
[Figure: Collections linked to parent collections by is-Part-Of, and to services by is-Made-Available-By.]
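The cardinalities in the conceptual model above can be made concrete with a small Python sketch. The class and attribute names are illustrative assumptions, not part of any NISO MI specification; the point is only that a service holds exactly one collection, while a collection may have any number of services, parents and children.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)  # compare by identity; the object graph is cyclic
class Collection:
    name: str
    parents: List["Collection"] = field(default_factory=list)   # is-part-of
    children: List["Collection"] = field(default_factory=list)  # has-parts
    services: List["Service"] = field(default_factory=list)     # is-made-available-by

    def add_part(self, child: "Collection") -> None:
        """Record both directions of the is-part-of / has-parts relation."""
        self.children.append(child)
        child.parents.append(self)

@dataclass(eq=False)
class Service:
    name: str
    collection: Collection  # makes-available exactly one collection

    def __post_init__(self) -> None:
        # Keep the inverse relation in step with the service's single collection.
        self.collection.services.append(self)

catalogue = Collection("Library catalogue")
maps = Collection("Map collection")
catalogue.add_part(maps)
z_target = Service("Z39.50 target", catalogue)
oai_repo = Service("OAI repository", catalogue)
```

Making `collection` a required, single-valued field is what enforces "exactly one" here; the three list fields default to empty, which matches "zero or more".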
Collection Description & Service Description
Collection Description & Service Description
• NISO MI TG2 specifying metadata for collections & services
  – Data model
  – Metadata semantics
  – Syntax(es) for representation and data exchange
  – Guidelines for use
• Should build on/reuse existing work where possible
• Make recommendations for future work
• N.B. TG2 is not
  – building a service; or
  – specifying the architecture within which a service might operate
  – specifying the protocols for the exchange of collection/service metadata
Collection Description
Collection Description
• Collection as “an aggregation of one or more items”
  – "functional granularity"
• Collection-level description (CLD)
  – Description of the collection as a whole
  – Unitary finding-aid
• Considerable recent work on collection-level description
• Research Support Libraries Programme (UK, 1999-2002)
  – support for academic research
  – improve disclosure/discovery of library/archive collections
  – also collaborative collection management
  – recognition of CLD as important mechanism for disclosure/discovery
RSLP Collection Description Project, 1999-2000
• Funded by RSLP, OCLC
• RSLP CD Model
  – Entity-Relation model (Michael Heaney, University of Oxford)
  – Implementation independent
  – Intended to be applicable to wide range of collections
  – Informed by IFLA FRBR approach as well as existing descriptive standards
• RSLP CD Schema
  – DC-based metadata schema (Andy Powell, UKOLN)
  – Expresses subset of RSLP model
  – Simplification of model
• Significant influence on other initiatives
• But concerns over status, ownership, visibility, persistence, maintenance, etc
http://www.ukoln.ac.uk/metadata/rslp/
RSLP CD Schema v Model
[Figure: RSLP CD model. Entities: Content, Item, Collection, Location, and the agent roles Creator, Producer, Collector, Owner, Administrator. Relationships: creates, produces, collects, owns, administers, is-embodied-in, is-gathered-into, is-located-in.]
DC Collection Description Working Group
• Active 2001 (really 2003!) -
• Provide forum for sharing information about CLD activity
• Develop a DC Application Profile for collection-level description
  – Specification of how DC (and other) properties are used for describing collections
• Develop supporting materials for use of AP
• Informed by experience of RSLP CD implementers and other CLD initiatives
  – RSLP projects, TEL, JISC IESR, IMLS, others
http://dublincore.org/groups/collections/
DC Collection Description Application Profile (DC CD AP)
• A "core" set of collection description properties
  – For simple collection-level descriptions
  – Suitable for a broad range of collections
  – Primarily to support discovery of collections
• Examine collection attributes (only) of RSLP CD Schema as starting point
• DC CD AP building on Heaney E-R model
  – introduces Service as entity-type
  – describes Collection-Location, Collection-Service, Collection-Agent relationships
  – but excludes Location, Service, Agent description
http://www.ukoln.ac.uk/metadata/dcmi/collection-ap-summary/
[Figure: DC CD AP entity-relationship diagram. Entities: Item, Collection, Location, Service, Agent, Collection Description. Relationships: is-gathered-into, is-located-in, is-Made-Available-By, provides, administers, collects, owns, is-described-by.]
DC Collection Description Application Profile (DC CD AP)
• Draft 2004-08-20 covers
  – Identification of collection
  – Content of items in collection
  – Form of items in collection
  – Process by which items gathered into collection
  – Ownership of collection
  – Rights of access to/use of collection
  – Location of collection
  – Services that provide access to collection
  – Relationships between collections
• Instances can be represented using DC guidelines (RDF/XML, DC-in-XML)
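To give a feel for what an XML representation of a collection-level description might look like, here is a minimal Python sketch that serialises a few simple Dublin Core elements. It is not the DC CD AP itself: the wrapper element `collectionDescription`, the choice of four properties and the example values are assumptions made purely for illustration. Only the Dublin Core element namespace and the DCMI type value "Collection" are taken from DCMI.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def collection_description(identifier: str, title: str, rights: str) -> str:
    """Serialise a minimal collection-level description as DC-style XML."""
    # "collectionDescription" is a hypothetical wrapper, not a DCMI element.
    root = ET.Element("collectionDescription")
    for name, value in (("identifier", identifier),
                        ("title", title),
                        ("type", "Collection"),
                        ("rights", rights)):
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = collection_description(
    "http://example.org/collections/maps",  # placeholder identifier
    "Map collection",
    "Open to registered readers",
)
```

A fuller description would also need to cover the other areas listed in the draft, such as location, services and relationships between collections, using whichever properties the profile finally assigns to them.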
NISO MI TG2 & DC CD AP
• DC CD AP still work in progress
  – Some issues
    • data model (Location/Service)
    • one-to-one rule
  – Some terms still to be assigned URIrefs
• Scope and specificity of DC CD AP
  – NISO MI addressing (primarily) library service providers
  – Some library-specific requirements
    • e.g. completeness of collection
  – NISO MI may require superset of DC CD AP
  – May require non-DCMI naming authority for some metadata terms
Service Description
Service Description
• Led by Larry Dixson, Library of Congress
• (Informational) Service
  – Means of accessing collection
• Service description must provide
  – Indication of protocol used
  – Access point for service
  – Authentication/authorisation information
  – Operations/queries supported
• N.B. does not describe the "syntax" of service
  – Assumption that protocols described elsewhere
• NISO MI evaluating use of Zeerex
Zeerex background
• Z39.50-based specification
• Based on earlier work, including Z39.50 Explain Service and Explain Lite, developed in the ONE2 project
• Relatively easy to implement, yet allows detailed description of services (e.g. Z39.50 servers)
• Sufficiently expressive/flexible to describe similar types of service
  – Services that provide access to a “database”
Zeerex
• ServerInfo
  – protocol
  – host/IP
  – port
  – database/service
  – authentication*
• DatabaseInfo
  – access restrictions
• IndexInfo
  – search
  – scan (browse)
  – sort
• RecordInfo
  – record syntax
  – element set name
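The four groups of information listed above can be pictured as a simple record structure. The Python sketch below mirrors the slide's grouping only; the class and field names are assumptions and do not reproduce the actual Zeerex XML schema, and the example host, port and index names are invented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ServerInfo:
    protocol: str
    host: str
    port: int
    database: str
    authentication: str = ""  # optional on the slide (marked with *)

@dataclass
class IndexInfo:
    search: List[str] = field(default_factory=list)  # searchable indexes
    scan: List[str] = field(default_factory=list)    # browsable indexes
    sort: List[str] = field(default_factory=list)    # sortable indexes

@dataclass
class RecordInfo:
    record_syntaxes: List[str] = field(default_factory=list)
    element_set_names: List[str] = field(default_factory=list)

@dataclass
class ServiceDescription:
    server: ServerInfo
    access_restrictions: str = ""  # the slide's DatabaseInfo group
    indexes: IndexInfo = field(default_factory=IndexInfo)
    records: RecordInfo = field(default_factory=RecordInfo)

    def supports_search(self, index: str) -> bool:
        """True if the service advertises the given index as searchable."""
        return index in self.indexes.search

description = ServiceDescription(
    server=ServerInfo("Z39.50", "z3950.example.org", 210, "catalogue"),
    indexes=IndexInfo(search=["title", "author"], scan=["title"]),
    records=RecordInfo(record_syntaxes=["MARC21"], element_set_names=["F", "B"]),
)
```

With a description like this a metasearch client could check, before sending a query, whether a target supports the index it wants to search, rather than discovering the limitation from an error response.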
Zeerex and access protocols
• Access protocols “in scope” for NISO MI
  – Z39.50
  – SRW/SRU
  – OAI-PMH
  – HTTP
  – LDAP/X.500?
  – GRID metasearch?
  – GIS search facility?
• Can Zeerex support description of services using all of these protocols?
• Tests currently in progress
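One way to see why a service description has to name the protocol is that the same kind of access point leads to quite different requests. The Python sketch below builds an SRU searchRetrieve URL and an OAI-PMH ListRecords URL from a base URL. The request parameters follow the SRU 1.1 and OAI-PMH 2.0 conventions; the base URLs, the `build_request` dispatcher and the example query are hypothetical, and Z39.50 is left out because it is not an HTTP protocol.

```python
from urllib.parse import urlencode

def sru_search_url(base_url: str, cql_query: str, maximum_records: int = 10) -> str:
    """Build an SRU searchRetrieve request URL (SRU 1.1 parameter names)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)

def oai_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def build_request(protocol: str, base_url: str, query: str = "") -> str:
    """Hypothetical dispatcher: pick a request builder from the described protocol."""
    if protocol == "SRU":
        return sru_search_url(base_url, query)
    if protocol == "OAI-PMH":
        return oai_list_records_url(base_url)
    raise ValueError(f"no request builder for protocol {protocol!r}")

sru_url = build_request("SRU", "http://sru.example.org/catalogue", "dc.title=metasearch")
oai_url = build_request("OAI-PMH", "http://oai.example.org/oai")
```

The dispatcher raises an error for any protocol it has no builder for, which is roughly the situation a metasearch service is in when a service description names a protocol it does not implement.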
Summary
• NISO MI bringing together different stakeholders to develop shared approaches to common problems
• Disclosure/discovery of collections and informational services critical to effective metasearch services
Acknowledgements
• UKOLN is funded by the UK Museums, Libraries and Archives Council (MLA), the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
• http://www.ukoln.ac.uk/