Bieber et al., NJIT ©2005 - Slide 1
Lightweight Integration and Recommendation of Documents and Services
Digital Library Service Integration, IntegraL and IntLib Projects
Michael Bieber*, Il Im*, Vincent Oria**, Richard Sweeney***, Yi-Fang Wu*
* Information Systems Department
** Computer Science Department
*** Robert Van Houten Library
College of Computing Sciences
New Jersey Institute of Technology
http://is.njit.edu/integral
April 2005
Bieber et al., NJIT ©2005 - Slide 2
Outline
• Motivation
• Illustrations
• Structural Relationships
• 3 Types of Integration
• Personalizing Links
• Federated Metasearch
• Recommendations
• Contributions and Vision
• Call for Collaboration
• Project Details
Bieber et al., NJIT ©2005 - Slide 3
Challenges for Library Users
• Need to know what resources to use before they can access them
• Finding related information outside current system
• Need to leave current page to do related tasks
Why?
• Library resources aren’t integrated well
==> Project Goal: Bring relevant resources directly to the user
Library resources: databases (e.g., EBSCOhost, ACM Digital Library), external digital libraries, on-line catalog, special collections, library services (e.g., interlibrary loan)...
Bieber et al., NJIT ©2005 - Slide 4
Integration through Linking
• automatically generate link anchors on recognized elements, based on:
– structural relationships
– lexical relationships
• automatically generate links
– to related information
– to relevant services
==> lightweight integration of
– documents containing links and
– documents/services the links point to
Bieber et al., NJIT ©2005 - Slide 5
Prototype
Services for a launch-date element:
- search by launch date
- search by month and year
- search by year
Bieber et al., NJIT ©2005 - Slide 6
Prototype
Services for a document element:
- open
- summarize in 3 sentences
Bieber et al., NJIT ©2005 - Slide 7
Mock-up for a library database
Services from multiple systems (customized to user tasks/preferences)
Bieber et al., NJIT ©2005 - Slide 8
Benefits of Integration for a system (collection/service)
• Users get direct access to related systems
– enlarges a system’s feature set
• Links lead users to a system
– systems gain wider use
• Users become aware of other systems
– systems gain wider awareness
• Direct access to a system’s features
– streamlined access (bypassing menus)
Bieber et al., NJIT ©2005 - Slide 9
Two Types of Links:
(1) structural, based on element type
* title, author, source
(2) lexical (found in a glossary)
Screenshot callouts: structural elements and links; lexical elements and links
Bieber et al., NJIT ©2005 - Slide 10
Structural Relationships
• Links generated based on application structure, not search or lexical analysis
– You cannot do a search on the display text “$127,322.12” to find related information…
– But you can find relationships for the element Sales[2002]
Example display:
2002 Expenses   2002 Sales
$85,101.99      $127,322.12
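The point of this slide can be sketched in a few lines of Python. This is an illustration only, not the project's code: the `Element` class, the `RELATIONSHIPS` table and the link labels are all hypothetical. Relationships are looked up by the element's structural type and key, so the display text plays no part in finding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str      # structural type, e.g. "Sales"
    key: str       # instance identifier, e.g. "2002"
    display: str   # text shown on screen, e.g. "$127,322.12"


# Hypothetical relationships registered per element type.
RELATIONSHIPS = {
    "Sales": ["Compare with expenses for {key}", "Show sales by region for {key}"],
    "Expenses": ["Compare with sales for {key}"],
}


def structural_links(element):
    """Return link labels derived from the element's type, never its display text."""
    templates = RELATIONSHIPS.get(element.kind, [])
    return [template.format(key=element.key) for template in templates]


sales_2002 = Element(kind="Sales", key="2002", display="$127,322.12")
print(structural_links(sales_2002))
```

Changing `display` to any other amount leaves the generated links unchanged, which is the behavior the slide describes.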
Bieber et al., NJIT ©2005 - Slide 12
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
Type (1) requires a document schema mapper to recognize structural elements:
- wrapper
- fixed template
- XML markup
- etc.
Bieber et al., NJIT ©2005 - Slide 13
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
For type (2), Linking Rules represent
* every service
* that a system can provide
* for each kind of element.
Bieber et al., NJIT ©2005 - Slide 15
Example Linking Rule from the AskNSDL system
– a) element type (“concept”)
– b) link display label (“Ask an expert about this”)
– c) relationship metadata
– d) destination collection or service (“Ask NSDL”)
– e) the exact command to send to the destination system
• (logs the user into AskNSDL, opens the question template, fills in the element instance (e.g., “physics teaching”) as the subject, and places the cursor in the question area)
– f) any relevant conditions for including this relationship
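The six parts (a) through (f) of a linking rule map naturally onto a small record type. The sketch below is one possible representation, not the project's actual schema: the `LinkingRule` class, the `asknsdl.example` URL template and the metadata value are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional
from urllib.parse import quote


@dataclass
class LinkingRule:
    element_type: str                  # (a) kind of element the rule applies to
    label: str                         # (b) link display label
    destination: str                   # (d) destination collection or service
    command_template: str              # (e) command sent to the destination (hypothetical URL)
    metadata: dict = field(default_factory=dict)         # (c) relationship metadata
    condition: Optional[Callable[[str], bool]] = None     # (f) optional inclusion test

    def applies_to(self, element_type, instance):
        if element_type != self.element_type:
            return False
        return self.condition is None or self.condition(instance)

    def command(self, instance):
        # Fill the element instance into the command, URL-encoded.
        return self.command_template.format(subject=quote(instance))


ask_rule = LinkingRule(
    element_type="concept",
    label="Ask an expert about this",
    destination="Ask NSDL",
    command_template="https://asknsdl.example/ask?subject={subject}",  # placeholder, not a real endpoint
    metadata={"relationship": "expert-help"},
    condition=lambda instance: bool(instance.strip()),
)


def rule_links(element_type, instance, rules):
    """Return (label, command) pairs for every rule that applies to the element."""
    return [(r.label, r.command(instance)) for r in rules if r.applies_to(element_type, instance)]


print(rule_links("concept", "physics teaching", [ask_rule]))
```

Because each rule is self-contained, a new service can be added by appending one more rule to the list without touching the others.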
Bieber et al., NJIT ©2005 - Slide 16
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
For type (3), lexical analysis by:
• NJIT Noun Phrase Extractor
• NJIT Ontology Developer
Bieber et al., NJIT ©2005 - Slide 17
Each system is integrated independently:
(1) Schema mappers for individual systems
(2) Linking rules are “plugged in” independently for each service
(3) Glossaries and thesauri can be independent of other systems
Bieber et al., NJIT ©2005 - Slide 19
Personalizing the Links
Customize the list of links according to:
• Collaborative Filtering
– Matching user’s “click stream” to other users’
• time spent at each destination
• asking users to rate links
• user task information
Bieber et al., NJIT ©2005 - Slide 20
Federated Metasearch
• Searches, merges & ranks
Bieber et al., NJIT ©2005 - Slide 21
Federated Metasearch
• Searches, merges & ranks
• Clusters results by concept
Bieber et al., NJIT ©2005 - Slide 22
Federated Metasearch: Clustering by Concept
Screenshot callouts: concept hierarchy; search results are clustered by concept
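The "searches, merges & ranks" and "clusters by concept" steps can be sketched as follows. The slides do not say how the project merges or ranks results, so everything here is an assumption: the result fields (`title`, `score`, `concept`), the per-source score normalization, and the de-duplication by lower-cased title.

```python
from collections import defaultdict


def merge_and_rank(results_by_source):
    """Merge per-source result lists into a single ranked list.

    Each source's scores are divided by that source's top score so that
    sources with different scoring scales become comparable. When the same
    title appears in several sources, the best normalized score is kept.
    """
    best = {}
    for source, results in results_by_source.items():
        top = max((r["score"] for r in results), default=0) or 1
        for r in results:
            normalized = r["score"] / top
            key = r["title"].lower()
            if key not in best or normalized > best[key]["score"]:
                best[key] = {"title": r["title"], "concept": r["concept"],
                             "score": normalized, "source": source}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(best.values(), key=lambda r: (-r["score"], r["title"]))


def cluster_by_concept(ranked):
    """Group ranked titles under their concept, preserving rank order."""
    clusters = defaultdict(list)
    for r in ranked:
        clusters[r["concept"]].append(r["title"])
    return dict(clusters)


results = {
    "catalog": [{"title": "Mars Rovers", "score": 8.0, "concept": "space"},
                {"title": "Tide Pools", "score": 4.0, "concept": "oceans"}],
    "database": [{"title": "mars rovers", "score": 0.5, "concept": "space"},
                 {"title": "Solar Wind", "score": 0.9, "concept": "space"}],
}
ranked = merge_and_rank(results)
print([r["title"] for r in ranked])
print(cluster_by_concept(ranked))
```

In this sketch the concept label arrives with each result; in a real system it would come from a concept hierarchy such as the one shown on the slide.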
Bieber et al., NJIT ©2005 - Slide 23
Outline
• Motivation
• Illustrations
• Structural Relationships
• 3 Types of Integration
• Personalizing Links
• Federated Metasearch
• Recommendations: General Recommendation Engine
• Contributions and Vision
• Call for Collaboration
• Project Details
Bieber et al., NJIT ©2005 - Slide 24
Integration With Partner Libraries
[Figure: two integration options, labeled “Minimum integration” and “More complex integration”]
Bieber et al., NJIT ©2005 - Slide 25
[Figure: General Recommendation Engine (GRE) architecture. Components shown: GRE Manager with a User Task Database; CF Engine with a Clickstream/Evaluation Database; CB Engine with a Document Database; KB Engine with an Ontology Database. Users and Collections A through N send user info, clickstreams, documents and evaluations to the GRE and receive final recommendations. The engines produce CF, CB and KB recommendations, which feed the final recommendations.]
Bieber et al., NJIT ©2005 - Slide 26
General Recommendation Engine Research Goals
• Integrate three major recommendation technologies
– Collaborative filtering (CF), Content-based (CB), and Knowledge-based (KB) recommendation
• Automatically identify users’ current task (search mode)
• Study the impacts of the recommendations on information search
Bieber et al., NJIT ©2005 - Slide 27
Collaborative Filtering (CF)
• Recommendations based on similarities of people
• Traditional CF requires direct user inputs
• Clickstream-based CF (CCF) does not require direct user inputs
• Works well for preference goods
• Does not work so well for information-intensive items
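A minimal sketch of clickstream-based collaborative filtering, assuming each user's clickstream is reduced to the set of documents visited. The Jaccard similarity measure and the scoring scheme are choices made for this illustration; the slides do not specify which measures the CCF engine uses.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, clickstreams, top_n=3):
    """Recommend documents visited by users whose clickstreams resemble the target's.

    No ratings are needed: the only input is which documents each user visited.
    """
    seen = clickstreams[target]
    scores = {}
    for user, visited in clickstreams.items():
        if user == target:
            continue
        similarity = jaccard(seen, visited)
        if similarity == 0:
            continue
        # Credit each unseen document with the similarity of the user who visited it.
        for doc in visited - seen:
            scores[doc] = scores.get(doc, 0.0) + similarity
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


clickstreams = {
    "ann": {"doc1", "doc2", "doc3"},
    "bob": {"doc2", "doc3", "doc4"},
    "cat": {"doc3", "doc5"},
    "dan": {"doc9"},
}
print(recommend("ann", clickstreams))  # ['doc4', 'doc5']
```

Here `dan` shares nothing with `ann`, so `doc9` is never suggested, while `bob`'s greater overlap ranks `doc4` ahead of `doc5`.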
Bieber et al., NJIT ©2005 - Slide 28
Content-based Filtering (CB)
• Recommendations based on similarities of contents
– titles, authors, abstracts, or full texts
• Information retrieval (IR) techniques are used
– e.g., tf.idf value
• Documents similar in content are recommended
• Demo: http://highlight.njit.edu/ais/
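The tf.idf idea mentioned above can be illustrated with a short, dependency-free sketch. The whitespace tokenizer, the specific weighting (term frequency times log of inverse document frequency) and the cosine comparison are common textbook choices assumed here, not necessarily what the demo system implements.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Map each document id to a {term: tf * idf} vector."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[doc_id] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def most_similar(doc_id, docs):
    """Return the id of the other document whose content is closest to doc_id."""
    vectors = tfidf_vectors(docs)
    others = [d for d in docs if d != doc_id]
    return max(others, key=lambda d: cosine(vectors[doc_id], vectors[d]))


docs = {
    "a": "digital library link integration",
    "b": "digital library metasearch integration",
    "c": "collaborative filtering recommendations",
}
print(most_similar("a", docs))  # b
```

A term that occurs in every document gets an idf of zero under this weighting, so it contributes nothing to the similarity.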
Bieber et al., NJIT ©2005 - Slide 29
Knowledge-based Recommendation (KB)
• CF and CB lack a holistic view
– why a document is relevant for a user
• KB recommends items based on a certain knowledge structure (ontology)
• KB requires knowledge engineering
• Goal: to build an automated (or semi-automated) ontology engine based on the ‘Self-organizing tree’ algorithm (Khan and Luo, 2002)
Bieber et al., NJIT ©2005 - Slide 30
Automatic User Profile Extractor
• Records a user’s recent documents
• The user’s profile is represented by a set of keywords from those documents
• As the user visits more documents, his/her profile is updated
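The three bullets above can be sketched as a small class. The window size, the stopword list and the use of raw word counts to pick keywords are assumptions for illustration; the slides do not describe how the extractor selects keywords.

```python
from collections import Counter, deque

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


class UserProfile:
    """Keeps a user's most recent documents and derives keywords from them."""

    def __init__(self, max_documents=5, max_keywords=3):
        # deque with maxlen drops the oldest document automatically.
        self.recent = deque(maxlen=max_documents)
        self.max_keywords = max_keywords

    def visit(self, text):
        self.recent.append(text)

    def keywords(self):
        counts = Counter(
            word
            for text in self.recent
            for word in text.lower().split()
            if word not in STOPWORDS
        )
        # Most frequent first; ties broken alphabetically for a stable result.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        return [word for word, _ in ranked[: self.max_keywords]]


profile = UserProfile(max_documents=2, max_keywords=2)
profile.visit("history of the space shuttle")
profile.visit("space shuttle launch dates")
print(profile.keywords())  # ['shuttle', 'space']
profile.visit("shuttle launch weather")
print(profile.keywords())  # ['launch', 'shuttle']
```

The second call shows the profile shifting as the oldest document falls out of the window.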
Bieber et al., NJIT ©2005 - Slide 32
Contributions
• straightforward, sustainable approach for integrating documents and services
– Lightweight integration through linking
• combining structural links with content-based links
• next-generation collaborative filtering
• federated metasearch
• next-generation recommendations
• integrating traditional and digital libraries
• widespread dissemination
Bieber et al., NJIT ©2005 - Slide 33
Vision
A nationwide virtual library
• to and from
– your local library
– other physical libraries
– digital libraries
• incorporating
– traditional library resources
– digital library resources
Bringing relevant resources directly to the user!
Bieber et al., NJIT ©2005 - Slide 34
Looking for Collaboration
• Additional document systems, digital library collections, services and glossaries to integrate
• Physical library partners
• Digital library partners
• Web services to integrate
• Other suggestions welcome!
Bieber et al., NJIT ©2005 - Slide 36
Outline
• Motivation
• Illustrations
• Structural Relationships
• 3 Types of Integration
• Personalizing Links
• Federated Metasearch
• Contributions and Vision
• Call for Collaboration
• Project Details
Bieber et al., NJIT ©2005 - Slide 37
Digital Library Service Integration
NSF National Science Digital Library Award #DUE-0226075; 2002-2005
Tasks
• Develop Integration Infrastructure
• Integrate digital library collections and services
• Collaborative filtering
• Evaluation
Partners
• NASA GSFC Library
• AskNSDL
• Earth Science Picture of the Day System
• Atmospheric Visualization Collection
• Metis Workflow (University of Colorado, Boulder)
• University of Arizona
Bieber et al., NJIT ©2005 - Slide 38
IntLib
Institute of Museum and Library Services Award #LG-02-04-0002-04; 2004-2007
Tasks - to integrate:
• EBSCOhost
• Gale’s Discovery Collection
• ProQuest
• On-line Catalog Systems
• New Jersey Digital Highway
The IntLib Project focuses on integrating the resources of public libraries primarily (and university libraries secondarily) with digital libraries.
Additional Partner
• Newark Public Library
Bieber et al., NJIT ©2005 - Slide 39
IntegraL
NSF National Science Digital Library Award #DUE-0434581; 2004-2007
Tasks - to integrate:
• ACM Digital Library
• Elsevier Science Direct (permission pending)
• NJIT Electronic Thesis collection
• JerseyClicks
• StartingPoint
• Digital Library for Earth Science Education (DLESE)
• Science@NASA
• NSDL Core Integration features
• an on-line bookstore
The IntegraL project focuses on integrating specific resources of college libraries with those of the NSDL.
Additional Partners
• Cumberland C.C.
• Ramapo College
• Olin College of Engineering
Bieber et al., NJIT ©2005 - Slide 40
General Recommendation Engine
NSF National Science Digital Library Award; 2004-2007
Tasks - to integrate:
• Collaborative filtering recommendations
• Content-based recommendations
• Knowledge-based recommendations
Partners
• Digital Library for Earth Science Education (DLESE)
• Eisenhower National Clearinghouse for Mathematics and Science Education
• Computer Vision Education Digital Library