Logging in Digital Libraries
Last week ….
• Introduction to quality indicators and the way in which these are formalized and made computable, according to one view.
• Making a digital library as good as it can be requires understanding what it is and how it is being used.
• Information comes from logs
Another aspect
• A category of quality indicator that comes from seeing what happens when users visit the library
• An important tool: the logs
• All web-based systems have logs of interaction from the outside world to the web server
  – Not specifically designed for digital libraries
• We will look at a proposed standard for digital-library-specific log analysis
This work
Done by Marcos André Gonçalves, Ming Luo, Rao Shen, Mir Farooq Ali, and Edward A. Fox - Virginia Tech
and
Lillian N. Cassel, Filip Jagodzinski - Villanova
Motivation
Log analysis
Source of information about:
• How patrons really use DL services
• How systems behave while supporting user information-seeking activities
• Examples: patterns
Used to:
• Evaluate
• Enhance services
• Help design user interfaces
• Better allocate resources
Common practice in the web setting
• Supported by web servers, proxy caching
Motivation (cont.)
• DLs differ from the web
  – DL collections are explicitly organized, described, managed, and preserved
  – Users have more specific tasks and needs
  – Digital objects and collections are more structured
• DL logging should offer much richer information and opportunities
  – Tradeoff: user privacy
• Current DL logs
  – Differences in formats and recorded information
  – Problems:
    • Lack of interoperability
    • No reuse of analysis tools
    • Log analysis results are hard to compare
Related Work
• Problems with existing DL logs
  – Incompatibility
  – Incompleteness
  – Complexity of analysis
  – Lack of organization
  – Ambiguity
  – Inflexibility
  – Verboseness
• Generally, no global view of what is needed to understand how the DL is or is not serving its users
The Digital Library Standardized Log Format
• Comprehensive
• Reflective of the actual DL system behavior
• Easily readable
• Precise
• Flexible to accommodate varying systems
• Succinct enough to be implemented
• Concern: user privacy
DL Standardized Log Format Design
Capture high-level user and system behaviors
  – Hierarchical organization
  – Encapsulated in transactions
• Interactions between the users and the system or among the system components
• Log format designed to record a number of different kinds of transactions
• Examples:
  – Login to the system
  – Submission of search query
  – Browsing a result list
  – Recording of a user failure
Log format design (cont.)
Design
  – Reflective of DL function
  – Based on the 5S formal theory
• Unifying, mathematical theory to describe formally the semantics of DL components
• Guidance for how to organize the log structure
Log design and 5S

| 5S | Definition | Use in Log Design |
|---|---|---|
| Streams | Represent static and dynamic multimedia content | Temporal events, types of digital objects |
| Structures | Labeled directed graphs; provide organization within the DL | Structured documents and metadata; structured searches, collection, metadata catalog; hypertext, classification scheme |
| Spaces | Sets, properties and operations on those sets | Retrieval mode, presentation information |
| Scenarios | Sequences of events that modify states of a computation in order to accomplish some functional requirement | Organization of the user and system actions into transactions, statements, events and actions; DL services as sets of scenarios |
| Societies | Sets of communities and relationships among them | User information |
DL Log Format Specification
• Organization in a structured, logical way: XML, XML Schema
• Standard syntax
• Guarantees quality, correctness
• Rich set of basic types helps standardization
• Abundance of XML parsers helps construction of analysis tools
Log Format - Structure
• Top-level hierarchy
[Figure: tree diagram. A Log is made of log entries; each entry is a Transaction containing SessionId, MachineInfo, TimeStamp, and Statement, among other elements.]
DL Log Format - Structure (cont)
• Decomposition of statement into different types
[Figure: tree diagram. Statement decomposes into AdmInfo, SessionInfo, Event, ErrorInfo, HelpInfo, and RegisterInfo.]
Log Format - Structure (cont.)
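• Decomposition of event
[Figure: the same Statement tree (AdmInfo, SessionInfo, Event, ErrorInfo, HelpInfo, RegisterInfo), with Event decomposed into Action and StatusInfo, and Action decomposed into Search, Browse, StoreSysInfo, and Update.]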
DL Log Format Structure (cont)
• Search attributes
[Figure: tree diagram. Search carries Collection, Catalog, SearchBy, QueryString, TimeFrame, and PresentationInfo; PresentationInfo carries Format, NumberOfResults, SortBy, and CutOff.]
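The search attributes above map naturally onto a small record type. The following is a minimal sketch, not the authors' tool: it assumes Python with the standard `xml.etree.ElementTree` module, `SearchRecord` and `to_element` are hypothetical names, and the element names are taken from the slide and from Example 2 (using the `Cutoff` spelling of the example) rather than from the actual schema. `Catalog` is left out for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SearchRecord:
    """Hypothetical holder for the Search attributes listed on the slide."""
    collection: str
    search_by: str
    query_string: str
    start_time: str
    end_time: str
    result_format: str
    sort_by: str
    number_of_results: int
    cutoff: int

    def to_element(self) -> ET.Element:
        """Build a <Search> element; tag names are assumed from the slides."""
        search = ET.Element("Search")
        ET.SubElement(search, "Collection").text = self.collection
        ET.SubElement(search, "SearchBy").text = self.search_by
        ET.SubElement(search, "QueryString").text = self.query_string
        frame = ET.SubElement(search, "TimeFrame")
        ET.SubElement(frame, "StartTime").text = self.start_time
        ET.SubElement(frame, "EndTime").text = self.end_time
        pres = ET.SubElement(search, "PresentationInfo")
        ET.SubElement(pres, "Format").text = self.result_format
        ET.SubElement(pres, "SortBy").text = self.sort_by
        ET.SubElement(pres, "NumberOfResults").text = str(self.number_of_results)
        ET.SubElement(pres, "Cutoff").text = str(self.cutoff)
        return search


# Values copied from Example 2 in this deck.
record = SearchRecord(
    collection="Dirline",
    search_by="SearchByAnyParts",
    query_string="low back pain",
    start_time="2002-05-31T20:11:07.000-05:00",
    end_time="2002-05-31T20:11:09.000-05:00",
    result_format="List",
    sort_by="ByRank",
    number_of_results=217,
    cutoff=20,
)
xml_text = ET.tostring(record.to_element(), encoding="unicode")
```

Keeping the nesting in one place (`to_element`) means every search entry written to the log has the same shape, which is the property the standardized format is after.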
DL Log Tool Implementation
[Figure: architecture diagram. Labels: Digital Library, User Layer, Log middleware (XMLLogManager with writeLogEntry(parameters), XMLLogData; connectors c1 and c2), System event, User event, storelogData(parameters), getLogData(parameters), logData, Analysis tool, Analysis request, result, DL patron, DL analyst.]
Log Tool example: login
• Example 1: Login to the system

    <Transaction ID="3452">
      <SessionId>987654usr3</SessionId>
      <SessionInfo>
        <SessionStart>Start</SessionStart>
        <LoginInfo>
          <UserId>mhabib</UserId>
        </LoginInfo>
      </SessionInfo>
      <TimeStamp>2002-05-31T20:10:55.000-05:00</TimeStamp>
      <MachineInfo>
        <IPAddress>128.173.244.56</IPAddress>
        <Port>8000</Port>
      </MachineInfo>
    </Transaction>
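To show how an analysis tool might read such an entry, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes a well-formed version of the login entry with a `Transaction` root and an `ID` attribute; `summarize_login` and the dictionary keys are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed rendering of the login entry from Example 1 (assumed layout).
LOGIN_ENTRY = """
<Transaction ID="3452">
  <SessionId>987654usr3</SessionId>
  <SessionInfo>
    <SessionStart>Start</SessionStart>
    <LoginInfo>
      <UserId>mhabib</UserId>
    </LoginInfo>
  </SessionInfo>
  <TimeStamp>2002-05-31T20:10:55.000-05:00</TimeStamp>
  <MachineInfo>
    <IPAddress>128.173.244.56</IPAddress>
    <Port>8000</Port>
  </MachineInfo>
</Transaction>
"""


def summarize_login(xml_text: str) -> dict:
    """Pull the fields an analyst is likely to want out of one login entry."""
    root = ET.fromstring(xml_text.strip())

    def text(path: str):
        # Return the stripped text of the first match, or None if absent.
        node = root.find(path)
        return node.text.strip() if node is not None and node.text else None

    port = text("MachineInfo/Port")
    return {
        "transaction_id": root.get("ID"),
        "session_id": text("SessionId"),
        "user_id": text("SessionInfo/LoginInfo/UserId"),
        "timestamp": text("TimeStamp"),
        "ip_address": text("MachineInfo/IPAddress"),
        "port": int(port) if port is not None else None,
    }


summary = summarize_login(LOGIN_ENTRY)
```

Because the format is hierarchical, each field is addressed by its path rather than by position on a line, which is what makes analysis tools reusable across systems that log in this format.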
Log tool example: query a collection
Example 2: query all Dirline records about “low back pain”.

    <Event>
      <Action>
        <Search>
          <Collection>Dirline</Collection>
          <ObjectType>CommunityRecord</ObjectType>
          <SearchBy>SearchByAnyParts</SearchBy>
          <SearchType>NonPersistant</SearchType>
          <QueryString>low back pain</QueryString>
          <TimeFrame>
            <StartTime>2002-05-31T20:11:07.000-05:00</StartTime>
            <EndTime>2002-05-31T20:11:09.000-05:00</EndTime>
          </TimeFrame>
          <PresentationInfo>
            <Format>List</Format>
            <SortBy>ByRank</SortBy>
            <NumberOfResults>217</NumberOfResults>
            <Cutoff>20</Cutoff>
          </PresentationInfo>
          ...

Note: the -05:00 suffix on each timestamp is the offset relative to GMT.
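Because the timestamps carry an explicit offset from GMT, the time a search took and its GMT equivalent can be computed directly. A minimal sketch using Python's standard `datetime` module follows; the function names are hypothetical, and the input values are the `StartTime` and `EndTime` from Example 2.

```python
from datetime import datetime, timezone


def search_duration_seconds(start: str, end: str) -> float:
    """Elapsed time between two offset-aware ISO 8601 timestamps."""
    started = datetime.fromisoformat(start)
    finished = datetime.fromisoformat(end)
    return (finished - started).total_seconds()


def to_gmt(stamp: str) -> datetime:
    """Convert an offset-aware timestamp to GMT (UTC)."""
    return datetime.fromisoformat(stamp).astimezone(timezone.utc)


elapsed = search_duration_seconds(
    "2002-05-31T20:11:07.000-05:00",
    "2002-05-31T20:11:09.000-05:00",
)
start_gmt = to_gmt("2002-05-31T20:11:07.000-05:00")
```

Normalizing to GMT before aggregating lets logs from libraries in different time zones be compared on one clock.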
Log Analyzer Overview
[Figure: log analyzer pipeline. The XML log feeds a Log Data Parser / Error Checker Routine, which sends each log line to the <TimeStamp>, <Session ID>, <Query String>, and <Error> modules; these populate the databases that produce the final report and statistics. Extracted items shown: Browse, Query String, User ID, Search, Error, Doc ID.]
Sample log entry from the figure:

    <Transaction ID="3452">
      <SessionId>987654usr3</SessionId>
      <TimeStamp>2002-05-31T20:10:55.0-05:00</TimeStamp>
      <Statement>
        …
        <QueryString>low back pain</QueryString>
        …
        <DocID>5114</DocID>
      </Statement>
    </Transaction>

Step 1: Extract log data; a SAX parser can be employed here.
Step 2: Parse the log data and check for log errors (e.g., the server stalls and an incomplete log line is written to the XML log).
Step 3: The different modules populate various databases and/or increment the appropriate counters. Each module can adjust several databases; for example, the <TimeStamp> module increments the appropriate month hit counter and records that user 987654usr3 made a request at time T.
Step 4: Aggregate the data and output final statistics; all databases are made available.
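The four steps can be sketched with Python's standard `xml.sax` module. This is an illustrative outline under stated assumptions, not the analyzer described on the slide: `LogStatsHandler` and `analyze` are hypothetical names, in-memory counters stand in for the databases, a `<Log>` root element is assumed, `SessionId` is assumed to precede `TimeStamp` within a transaction, and the second transaction in the sample (ID 3453) is made up for the demonstration.

```python
import xml.sax
from collections import Counter


class LogStatsHandler(xml.sax.ContentHandler):
    """Steps 1 and 3: receive SAX events and update per-module counters."""

    def __init__(self):
        super().__init__()
        self.transactions = 0
        self.monthly_hits = Counter()      # "YYYY-MM" -> number of requests
        self.queries = Counter()           # query string -> times submitted
        self.requests_by_session = {}      # session id -> list of timestamps
        self._buffer = []
        self._session = None

    def startElement(self, name, attrs):
        self._buffer = []
        if name == "Transaction":
            self.transactions += 1
            self._session = None

    def characters(self, content):
        self._buffer.append(content)

    def endElement(self, name):
        text = "".join(self._buffer).strip()
        self._buffer = []
        if name == "SessionId":
            self._session = text
        elif name == "TimeStamp":
            # Timestamp "module": bump the month counter, record the request.
            self.monthly_hits[text[:7]] += 1
            self.requests_by_session.setdefault(self._session, []).append(text)
        elif name == "QueryString":
            self.queries[text] += 1


def analyze(xml_text: str):
    """Step 2: parse and flag a truncated log; returns (handler, complete)."""
    handler = LogStatsHandler()
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
        complete = True
    except xml.sax.SAXParseException:
        complete = False  # e.g. the server stalled mid-entry
    return handler, complete


SAMPLE_LOG = """<Log>
  <Transaction ID="3452">
    <SessionId>987654usr3</SessionId>
    <TimeStamp>2002-05-31T20:10:55.000-05:00</TimeStamp>
    <Statement><QueryString>low back pain</QueryString></Statement>
  </Transaction>
  <Transaction ID="3453">
    <SessionId>987654usr3</SessionId>
    <TimeStamp>2002-05-31T20:11:07.000-05:00</TimeStamp>
    <Statement><QueryString>low back pain</QueryString></Statement>
  </Transaction>
</Log>"""

# Step 4: the aggregated counters are the "final statistics".
stats, complete = analyze(SAMPLE_LOG)
truncated_stats, truncated_complete = analyze(SAMPLE_LOG[:200])
```

SAX fits here because it streams: counters are updated as each element closes, so a large log never has to be held in memory, and entries read before a truncation point are still counted.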
Summarizing this class and last week
• Looked at a view of DL quality
  – By examining the components of the DL independent of usage (explicit computation)
  – By looking at the view of the DL obtained by a visitor (log analysis)
• Each is a view that has been widely promulgated and well received, but is not an industry standard
Next week
• Joseph Lucia, Director of Villanova’s Falvey Library, will talk about what is happening in this very innovative and significant digital library.
• Come prepared with questions and ready to comment and discuss what he presents.
References
• Gonçalves, M. A., Luo, M., Shen, R., Ali, M. F., and Fox, E. A. “An XML Log Standard and Tool for Digital Library Logging Analysis.” In Research and Advanced Technology for Digital Libraries, 6th European Conference, ECDL 2002, Rome, Italy, September 16-18, 2002, Proceedings.
• Klas, C., et al. “A Logging Scheme for Comparative Digital Library Evaluation.” In Research and Advanced Technology for Digital Libraries, 10th European Conference, ECDL 2006, Alicante, Spain, September 17-22, 2006, Proceedings.