
The 5S: A Formal Model of the Digital Library

Biblioteca Central, Universidad Nacional del Sur

Bahia Blanca, Argentina, May 17-18, 2004

Edward A. Fox

fox@vt.edu http://fox.cs.vt.edu

Acknowledgements (Selected)

• Sponsors: ACM, Adobe, AOL, IBM, Microsoft, NASA, NLM, NSF, OCLC, SUN, US Dept. of Ed.

• VT Faculty/Staff: Debra Dudley, Weiguo Fan, Gail McMillan, Manuel Perez, Naren Ramakrishnan, Layne Watson, …

• VT Students: Yuxin Chen, Shahrooz Feizabadi, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Bing Liu, Paul Mather, Fernando Das Neves, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …

ACKNOWLEDGEMENTS (NDLTD)

• NDLTD Board of Directors, previous Steering Committee + other NDLTD committees; those running Electronic Thesis & Dissertation (ETD) initiatives in universities, regions, countries

• Helpful sponsorship by many organizations, especially Adobe (new initiative!), CONACyT, DFG, FIPSE (US Dept. Education), IBM, Microsoft, NSF (IIS-9986089, 0086227, 0080748, 0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, VTLS, many governments (Australia, Germany, India, …), …

• Colleagues at Virginia Tech (faculty, staff, students), and collaborators at many universities

• Slides included from: Vinod Chachra, Thom Hickey, Joan Lippincott, Gail McMillan, Axel Plathe, Hussein Suleman, …

Other Collaborators (Selected)

• Brazil: FUA, UFMG, UNICAMP
• Case Western Reserve University
• Emory, Notre Dame, Oregon State
• Germany: Univ. Oldenburg
• Mexico: UDLA (Puebla), Monterrey
• College of NJ, Hofstra, Penn State, Villanova
• University of Arizona
• University of Florida, Univ. of Illinois
• University of Virginia

• Endowment: VTLS

UNESCO

• Cláudio Menezes [cmenezes@unesco.org.uy]
• Purpose:
  • Reinforce local solutions, commitments
• Emphasize:
  • ETD does not need many resources.
  • Open source and free software is available.
  • International cooperation can help.
  • Local training is crucial.
• => Inclusion of ETD in practices, processes
• => Schedule for ETD projects

Part 2

The 5S Model:

A Formal Model for the

Digital Library

Motivation

• DLs are not benefiting from formal theories as have other CS fields: DB, IR, PL, etc.

• DL construction: difficult, ad-hoc, lacking support for tailoring/customization

• Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.
  • Lack of specific DL models, formalisms, languages

5S Layers

Societies

Scenarios

Spaces

Structures

Streams

Definition: Digital Libraries are complex systems that

• help satisfy info needs of users (societies)

• provide info services (scenarios)

• organize info in usable ways (structures)

• present info in usable ways (spaces)

• communicate info with users (streams)

DL Student Research: Gonçalves

• 5S as a basis for developing digital libraries

• Theory

• Syntax, Semantics; Definitions, Relationships

• Specification of requirements

• Generation of systems

• Quality

DL Services/Activities Taxonomy (Gonçalves)

• Infrastructure Services
  • Repository-Building
    • Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
    • Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
  • Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
• Information Satisfaction Services: Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing

Defining Quality in Digital Libraries

DL Concept: Dimensions of Quality

• Digital object: Accessibility, Pertinence (*), Preservability (*), Relevance, Similarity, Significance, Timeliness (*)
• Metadata specification: Accuracy, Completeness, Conformance
• Collection: Completeness, Impact Factor
• Catalog: Completeness, Consistency
• Repository: Completeness, Consistency
• Structures for Navigation: Navigability (*)
• Services: Composability, Efficiency, Effectiveness, Extensibility, Reusability, Reliability

5S Model: Examples, Objectives

Model: Examples; Objectives

• Streams. Examples: text; video; audio; image. Objective: describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data.
• Structures. Examples: collection; catalog; hypertext; document; metadata; organization tools. Objective: specifies organizational aspects of the DL content.
• Spaces. Examples: measure; measurable, topological, vector, probabilistic. Objective: defines logical and presentational views of several DL components.
• Scenarios. Examples: searching, browsing, recommending. Objective: details the behavior of DL services.
• Societies. Examples: service managers, learners, teachers, etc. Objective: defines managers, responsible for running DL services; actors, who use those services; and relationships among them.

Document Models, Representations, and Accesses

• Doc = stream + structure + use-scenario; hybrid (paper/electronic), digital only

• Multilingual: content, summary, metadata
• Multimedia: structure, quality (QoS), search
• Structured: MARC, SGML, by user: MVD
• Distributed collection: Kleisli, CIMI, Z39.50
• Federated search: collecting, picking site(s), parallel search / fall-back, fusing results
• Access: IPR, payment, security, scenarios

Architectural Issues

• Internet middleware
• Independent system / part of federation
• Decompositions vary
  • search engine, browser, DBMS, MM support
  • repository, handle server, client
  • information resources + mediators, bus or agent collection + client with workspace/environment
• Metrics: e.g., for federated search

Standards

• Protocols/federation
  • Z39.50, CIMI
  • Dienst, NCSTRL
  • OAI protocol
• Metadata
  • TEI: inline, detailed (structure in stream)
  • MARC: two-level, fine-grained
  • Dublin Core: high-level, 15 elements
  • RDF: describing resources/collections, annotation
  • OAMS -> DC and others used in OAI
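To make the "Dublin Core: high-level, 15 elements" point concrete, here is a minimal Python sketch that lists the 15 elements of the unqualified Dublin Core element set and checks a record against them. The record contents and the helper name `unknown_elements` are illustrative assumptions, not part of the slides.

```python
# The 15 elements of the (unqualified) Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(record):
    """Return, sorted, the keys of `record` that are not Dublin Core elements."""
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Hypothetical ETD record, for illustration only.
etd_record = {
    "title": "A Sample Thesis",
    "creator": "Doe, Jane",
    "date": "2004",
    "type": "Text",
    "language": "en",
}

print(unknown_elements(etd_record))                      # no unknown keys
print(unknown_elements({"title": "x", "advisor": "y"}))  # 'advisor' is not a DC element
```

Because every element is optional and repeatable in Dublin Core, the check only flags unknown keys; it does not require any particular element to be present.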

Digital Library Courseware

• http://ei.cs.vt.edu/~dlib/
• WWW pages or large PDF copy files
• Online quizzes based on book by Michael Lesk (Morgan Kaufmann Publishers)
• Contents based on book, with several other popular topics added (e.g., agents)
• Separate pages to supplement: Definitions, Resources (People, Projects), and References
• UNC-CH proposal; book plans for 2005

Topical Outline - Foundations

• Early visions

• Definitions

• Resources

• References

• Projects

Topical Outline – IR Areas

• Search, Retrieval, Resource Discovery
• Information storage and retrieval
• Boolean vs. natural language
• Search engines
• Indexing, phrases, thesauri, concepts
• Federated search and harvesting, OAI
• Integrating links and ratings
• Crawlers, spiders, metasearch, fusion

• Details following – Li Wang indep. study

What is a Crawler?

• A Program
• An Important Module For Web Search Engine
• Crawls On The Web According To Its Algorithm
• Retrieves Web Pages
• Gets Useful Information
• Stores The Web Pages For Future Refining

Jobs For Threads

1. Get A New URL From Buffer
2. Contact The Server For File Type
3. Download The File
4. Parse The Web Page
5. Put New URLs Into Buffer
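The per-thread jobs listed on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the crawler from the independent study: the `fetch` callable, the `crawl` function, and the in-memory `FAKE_WEB` are hypothetical stand-ins so that the example runs without network access.

```python
import queue
import threading
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href targets of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, num_threads=4):
    """Crawl from `seed`. `fetch(url)` returns (content_type, body) or None."""
    buffer = queue.Queue()          # shared URL buffer
    seen = {seed}                   # URLs already queued
    pages = {}                      # url -> stored page body
    lock = threading.Lock()
    buffer.put(seed)

    def worker():
        while True:
            url = buffer.get()                      # get a new URL from the buffer
            try:
                result = fetch(url)                 # contact the server, download
                if result is None:
                    continue
                content_type, body = result
                if not content_type.startswith("text/html"):
                    continue                        # only HTML pages are parsed
                parser = LinkParser()
                parser.feed(body)                   # parse the Web page
                with lock:
                    pages[url] = body               # store the page
                    for href in parser.links:
                        link = urljoin(url, href)
                        if link not in seen:
                            seen.add(link)
                            buffer.put(link)        # put new URLs into the buffer
            finally:
                buffer.task_done()

    for _ in range(num_threads):
        threading.Thread(target=worker, daemon=True).start()
    buffer.join()                   # returns once every queued URL is processed
    return pages


# Hypothetical three-document "web" used instead of real HTTP requests.
FAKE_WEB = {
    "http://example.org/": ("text/html", '<a href="/a">A</a> <a href="/b.pdf">B</a>'),
    "http://example.org/a": ("text/html", '<a href="/">home</a>'),
    "http://example.org/b.pdf": ("application/pdf", ""),
}

pages = crawl("http://example.org/", FAKE_WEB.get)
print(sorted(pages))
```

Passing `fetch` as a parameter keeps the thread loop independent of any HTTP library; a real crawler would also need politeness delays and robots.txt handling, which are omitted here.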

Advanced Functions

• Backward Linkage Information Collector

A Web Page

Topical Outline - Multimedia

• Multiple media types, representations

• Text, audio, image, video, graphics, animation

• Capture, digitization, standards, interchange

• Compression, content-based retrieval

• Playback (Real), SMIL, QoS

• JPEG, MPEG (and versions)

Topical Outline - Architectures

• Distributed, centralized

• Modular, componentized

• Bus (InfoBus), hierarchical, star

• Mediators, wrappers (TSIMMIS)

• Light weight protocols

• Architecture of OAI and XOAI

Topical Outline – Interfaces

• Taxonomy of interface components

• Workflow

• Visualization

• Environments

• Design

• Usability testing

Topical Outline – Metadata

• MARC

• Dublin Core

• RDF

• IMS

• OAI (Open Archives Initiative)

• Crosswalks, mappings

• Ontologies

• Topics maps, concept maps

Topical Outline – Epub, SGML, XML

• Authoring

• Rendering, presenting

• Structure

• Tagging, Markup, DOM

• Semi-structured information

• Dual-publishing, eBooks

• Styles (XSL, XSLT)

• Structure queries

Topical Outline – Databases

• Extending database technology

• Structured and unstructured info

• Multimedia databases

• Link databases

• Performance

• Replicated storage, I2-DSI (details following)

Topical Outline – Agents

• Protocols

• Knowledge interchange

• Negotiation, registries

• Distributed issues

• Ontologies (standard upper)

• Webbots (automatic indexing)

Topical Outline – Economics

• E-commerce

• Sustainability

• Preservation and archiving
  • DLF, Besser, Lorie, Gladney

• Self-archiving

• Open collections

• Economic models, business plans

Topical Outline – IPR

• Intellectual property rights (IPR)

• Legal issues

• Terms and conditions

• Copyright

• Patents, trademarks

• Distributed rights management

• Security

Topical Outline – Social Issues

• Cooperation, collaboration
• Annotation, ratings
• Digital divide
• Educational applications
• Cultural heritage
• Museums (AMICO)
• Organizational acceptance
• Personalization
• Internationalization

5S Model: Definitions

5S: Definition

• Streams: sequences of elements of an arbitrary type
• Structures: labeled directed graphs
• Spaces: sets and operations on those sets
• Scenarios: sequences of events that modify states of a computation in order to accomplish some functional requirement
• Societies: sets of communities and relationships among them
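The five definitions map naturally onto simple data structures. The Python sketch below is one possible reading, offered only as an illustration: all class, field, and function names are assumptions introduced here and are not taken from 5SL or the formal model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Stream: a sequence of elements of an arbitrary type (characters here).
Stream = List[str]


@dataclass
class Structure:
    """Labeled directed graph: edges are (source, label, target) triples."""
    nodes: Set[str] = field(default_factory=set)
    edges: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_edge(self, source, label, target):
        self.nodes.update((source, target))
        self.edges.add((source, label, target))


@dataclass
class Space:
    """A set together with named operations on that set."""
    elements: Set[str]
    operations: Dict[str, Callable]


# Scenario: a sequence of events, each mapping a state to a new state.
State = Dict[str, object]
Event = Callable[[State], State]


def run_scenario(events: List[Event], state: State) -> State:
    for event in events:
        state = event(state)
    return state


@dataclass
class Society:
    """Communities plus (community, relationship, community) triples."""
    communities: Set[str]
    relationships: Set[Tuple[str, str, str]]


# Illustrative instances (hypothetical values).
stream = list("digital library")
doc = Structure()
doc.add_edge("document", "has_part", "title")
terms = Space({"digital", "library"}, {"union": set.union})
search = [
    lambda s: {**s, "query": "5S"},
    lambda s: {**s, "results": ["doc1"]},
]
final_state = run_scenario(search, {})
society = Society({"learners", "teachers"}, {("teachers", "instruct", "learners")})
print(final_state)
```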

Overview of 5S and DL formal definitions and compositions (Gonçalves)

[Figure: dependency diagram of the formal definitions; "d.N" is the definition number as printed on the slide.]

• Mathematical preliminaries: relation (d.1), function (d.2), sequence (d.3), tuple (d.4)*, language (d.5), graph (d.6), grammar (d.7)
• 5S: streams (d.9), structures (d.10), spaces (d.18), scenarios (d.21), societies (d.24)
• Spaces: measurable (d.12), measure (d.13), probability (d.14), vector (d.15), topological (d.16)
• Scenario-related: event (d.10), state (d.18), services (d.22), transmission (d.23)
• DL concepts: structural metadata specification (d.25), descriptive metadata specification (d.26), structured stream (d.29), digital object (d.30), collection (d.31), metadata catalog (d.32), repository (d.33), indexing service (d.34), searching service (d.35), hypertext (d.36), browsing service (d.37), digital library (minimal) (d.38)

Semantic relationships among DL concepts: Partial concept map (Gonçalves)

[Figure: concept map. Concepts: Society (actors, managers), Scenario, Service, Repository, Collection, Catalog, Digital Object, Descriptive Metadata, Metadata Format. Relationship labels: participates_in, has, use, runs, extend, contains, stores, is_described_by, conforms_with, built_over, is_version_of, translation_of. Cardinalities 1 and n appear on some edges.]

5S Framework and DL Development (Gonçalves)

[Figure: development phases Requirements, Analysis, Design, Implementation, Test. Labels: 5S; 5SL; OO Classes; Workflow; Components; DL Evaluation; 5SGraph; 5SLGen; Formal Theory/Metamodel; DL XML Log.]

5SLGen: Automatic DL Generation

[Figure: labels: 5S Meta Model; 5SLGraph; DL Expert; DL Designer; 5SL DL Model; 5SLGen; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …); Tailored DL Services; Practitioner; Researcher; Teacher. Phases: Requirements (1), Analysis (2), Design (3), Implementation (4).]

MARIAN DL Generation

[Figure: MARIAN Digital Library Generator. Labels: 5SL Design XML; Parsers: DOM, SAX; MARIAN API; Component Pool; Class managers; Loader; User interfaces; Indexing Classes; Resource Manager; Configuration and Processing Classes.]

Challenges with Approach

• The designer should know the 5S theory very well and be very familiar with the syntax and semantics of 5SL to be able to write correct 5SL files.

• It is difficult to get the big picture of a digital library just from a textual 5SL file.

• Overall objective of 5SGraph: Help users model their own instances of a digital library (DL) in the 5S language (5SL).

• A simple modeling process which enables rapid generation of digital libraries is needed.

• Support non-expert users.
• Speed up the development process.
• Increase the quality of the final product.

5SGraph: A DL Modeling Tool (Qinwei Zhu MS Thesis)

Goals of 5SGraph

• To help digital library designers understand the 5S model quickly and easily

• To help digital library designers build their own digital libraries without difficulty

• To help digital library designers transform their models into 5SL files automatically

• To help digital library designers understand, maintain, and upgrade existing digital library models conveniently

5SGraph

How does 5SGraph work?

• 5SGraph loads and displays a metamodel in a structured toolbox.

• The structured editor of 5SGraph provides a top-down visual environment for the DL designer.

• 5SGraph produces correct 5SL files according to the visual model built by the designer.

Overview of 5SGraph

[Figure: screenshot labeling the workspace (instance model) and the structured toolbox (metamodel).]

Overview of 5SGraph (cont.)

• Structured toolbox
  • Shows the available concepts in the metamodel and the relationships between those concepts.
  • Visualizes the metamodel.
  • Concepts in the structured toolbox can be added to the workspace.
• Workspace
  • Visualizes the model.
  • The place where the user creates his/her model.

Visualization Features

• The structured toolbox
  • Visualization of the metamodel
  • Visual components that can be added
• Truncated display of trees
  • Node-link representation
  • Deep-node problem
• Icons
  • Type/Instance relationship
• Cardinality

Component Reuse

• Components can be loaded/saved.
  • Load and save sub-trees
• Component reuse saves time and effort.
  • Full reuse from component pool
  • Partial reuse: adapting components

Functionalities of 5SGraph

• Load/Close a metamodel
• Load/Save/Close a model
• Explore the structure of metamodel/model
• Add concepts from metamodel to model
• Delete concepts from model
• Change the properties of concepts
• Load/Save an existing concept
• Specify inter-model constraints

Open/Close metamodel

Load/Save/Close a model

Explore the structure of metamodel and model

Add a concept to user model

• Top-down: before adding a concept, make sure you have added its parent.
• You can only add a concept to its parent node.
  • Make sure the parent node is chosen before you add a new concept.
  • If the highlight color is blue, the concept has satisfied all the requirements and can be added.
  • If the highlight color is yellow, click the parent node in the workspace and then add the concept.

Add a concept to user model (cont.)

• Double-click the concept in toolbox

or

• Right-click and choose the item in the pop-up menu


Delete a concept

• If the concept has no child concepts, click the concept first, then press the “Delete” key.

• If the concept has child concepts, delete the child concepts first, and then delete this concept.
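The top-down add rule and the leaf-first delete rule described for 5SGraph amount to two invariants on a tree. The following Python sketch illustrates them; it is not 5SGraph's implementation, and the class and concept names are hypothetical.

```python
class ConceptNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []


class InstanceModel:
    """Toy instance model enforcing top-down adds and leaf-first deletes."""

    def __init__(self, root_name):
        self.root = ConceptNode(root_name)
        self.nodes = {root_name: self.root}

    def add(self, name, parent_name):
        # A concept can only be added under a parent that already exists.
        if parent_name not in self.nodes:
            raise ValueError(f"add the parent {parent_name!r} first")
        if name in self.nodes:
            raise ValueError(f"{name!r} is already in the model")
        parent = self.nodes[parent_name]
        node = ConceptNode(name, parent)
        parent.children.append(node)
        self.nodes[name] = node
        return node

    def delete(self, name):
        node = self.nodes[name]
        # A concept with children cannot be deleted until they are removed.
        if node.children:
            raise ValueError(f"delete the children of {name!r} first")
        if node.parent is None:
            raise ValueError("the root concept cannot be deleted")
        node.parent.children.remove(node)
        del self.nodes[name]


model = InstanceModel("Digital Library")
model.add("Structure Model", "Digital Library")
model.add("Collection", "Structure Model")
print(sorted(model.nodes))
```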

Change the name and properties of concepts


Load/Save concepts

Semantic Constraints

• There are inherent semantic constraints in the hierarchical structure of the 5S model.

• 5SGraph maintains the constraints and enforces these constraints over the instance model to ensure correctness.

Example 1 (Constraint Enforcement)

• An actor can only participate in the services that have been defined in the Scenario Model.

Example 2 (Constraint Enforcement)

• A catalog has descriptive metadata for digital objects in a specific collection.

• Therefore, a catalog must have a 1:1 relationship with an existing collection.

• Thus, a catalog is not independent.
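The two constraint examples can be expressed as checks over an instance model. The Python sketch below assumes a simple dictionary layout for the model; that layout, the function name `check_constraints`, and the sample data are illustrative assumptions, not 5SGraph's internal representation.

```python
def check_constraints(model):
    """Return human-readable violations found in a toy instance model."""
    errors = []

    # Example 1: actors may only participate in services that are defined.
    services = set(model["services"])
    for actor, used in model["actors"].items():
        for service in used:
            if service not in services:
                errors.append(f"actor {actor!r} uses undefined service {service!r}")

    # Example 2: each catalog describes exactly one existing collection,
    # and no collection is described by two catalogs (1:1).
    collections = set(model["collections"])
    catalog_for = {}
    for catalog, collection in model["catalogs"].items():
        if collection not in collections:
            errors.append(f"catalog {catalog!r} refers to missing collection {collection!r}")
        elif collection in catalog_for:
            errors.append(f"collection {collection!r} already has catalog {catalog_for[collection]!r}")
        else:
            catalog_for[collection] = catalog
    return errors


# Hypothetical models for illustration.
good = {
    "services": ["searching", "browsing"],
    "actors": {"learner": ["searching"]},
    "collections": ["theses"],
    "catalogs": {"theses_catalog": "theses"},
}
bad = {
    "services": ["searching"],
    "actors": {"learner": ["rating"]},
    "collections": ["theses"],
    "catalogs": {"orphan_catalog": "maps"},
}

print(check_constraints(good))
for message in check_constraints(bad):
    print(message)
```

A visual editor can avoid such violations by construction, for example by only offering existing collections when a catalog is created, rather than validating after the fact.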

The Preliminary Test of 5SGraph

• Research Questions
  • Does the tool help users understand and use the 5S model to build their own digital libraries?
  • Does the tool help users efficiently describe digital library models in the 5SL language?
  • Are users satisfied with the tool?

The Preliminary Test of 5SGraph: Experimental Design

• Three tasks
  1. Build a simple digital library using existing components.
  2. Complete the partial model for CITIDEL.
  3. Build a model for NDLTD from scratch.
• Three measures
  • Effectiveness
  • Efficiency
  • User satisfaction
• 17 subjects

Measures

• Effectiveness
  • Completion rate
  • Goal achievement
• Efficiency
  • Task completion time
  • Closeness to expertise: minimum task time divided by task time
• Satisfaction
  • Subjective rating
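The "closeness to expertise" measure is a simple ratio, sketched below in Python. The sketch assumes that the minimum is taken over the observed task times of all subjects; the slide does not say whether an expert's reference time was used instead. The sample times are made up for illustration.

```python
def closeness_to_expertise(task_times):
    """Minimum task time divided by each subject's task time (1.0 = fastest)."""
    fastest = min(task_times)
    return [fastest / t for t in task_times]


# Hypothetical completion times in minutes for three subjects.
times = [5.0, 10.0, 20.0]
scores = closeness_to_expertise(times)
print(scores)
print(sum(scores) / len(scores))  # mean closeness to expertise
```

Values near 1 indicate that a subject worked almost as fast as the fastest observed time, so a rising mean across tasks suggests that subjects are learning the tool.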

Test Results

Measure                        Task 1   Task 2   Task 3
Completion Rate (%)            100      100      100
Mean Task Time (min)           11.3     11.4     15.1
Mean Closeness to Expertise    0.483    0.752    0.712
Mean Goal Achievement (%)      97.4     97.4     98.2

Satisfaction and Usefulness

• The average rating of user satisfaction is 91%.

• The average rating of usefulness of the tool is 92%.

• Statistical analysis shows that the mean value of post-understanding of the 5S model is significantly greater than that of pre-understanding.

Educational Use

[Chart: Understanding of 5S Theory, Pre vs. Post; vertical axis 0.0 to 10.0.]

[Chart: Learnability; vertical axis 0 to 1; horizontal axis 1 to 17; series Task1, Task2, Task3.]

Semantic Modeling of Digital Library with Concept Maps

• Customized “plugin” tool to model scenarios and societies

• Tools with common principles, abstractions, graphical notations, and operations

• Solution: Concept Maps
  • Conceptual tools for organizing knowledge and representation

Conclusions

• Presented a domain specific visual modeling tool for DLs.

• Evaluated the tool and proved efficiency, effectiveness, and learnability.

• Built new tools based on concept maps for scenario and societies modeling.

Future work on 5SGraph

• Integration of tools

• Further usability studies with “digital librarians”

• Using the tools as educational aids for teaching about digital libraries

Motivating Problems – Toward 5SLGen (MS Thesis of Rohit Kelapure)

• Lack of general models for Digital Libraries (DLs)

• Little focus on simplifying the process of modeling and building DLs

• Divergent DL architectures
  • Monolithic: Tightly integrated and generally inflexible
  • Componentized: A network of interoperable components aggregated without a design methodology

Problems (contd.)

• Lack of DL-specific modeling languages, software toolkits, prototyping and CASE tools

• Lack of a scenario-based requirements analysis and design approach to DLs

• Implication: Problems with
  • Interoperability
  • Customizability

Approach

• Based on the formal 5S theory
  • Streams, Structures, Spaces, Scenarios and Societies

• Use of
  • Domain-specific declarative languages (5SL)
  • Scenario-based requirements analysis and design
  • Componentized architectures

• Automatic transformations/mappings from models to code

• Special attention paid to issues of flexibility, reusability, and extensibility

Approach: 5SLGen

• 5SLGen is a new generic digital library generator.
  • It has been developed, implemented, and deployed in several applications.

• 5SLGen yields implementations of digital library services from models of DL “societies” and “scenarios” (and from the other “Ss”).

5S Model / 5SL

Model: Objective; Primitives in 5SL

• Streams. Objective: describes properties of the DL content. Primitives: text, audio, video, pictures, …
• Structures. Objective: specifies organizational aspects of the DL content. Primitives: digital object, metadata schema, collection, …
• Spaces. Objective: defines logical properties and presentational views of a DL. Primitives: vector, probabilistic, boolean, …
• Scenarios. Objective: details the behavior of DL services. Primitives: service, event, message, condition, action, state, …
• Societies. Objective: defines managers (responsible for running DL services), actors (those who use services), and their relationships. Primitives: service managers, actors (e.g., learners, teachers, naïve users)

5SLGen: Model

[Figure: model relating the 5S to services. Labels: Workflow; Service; Composite Service; Elementary Service; Collecting Service; Linking Service; Indexing Service; Rating Service; Manager; User Interface (component); Actors; Scenarios; Societies; Spaces; Streams/Structures. Relationship labels: has, composed of, binding, generates, displays, communicates, participates.]

5SLSocieties Model

• Service Manager characteristics:
  • Name, attributes, operations, type, visibility

• Service Manager relationships:
  • Associations, generalizations (extends), dependencies

5SLScenarios Model

Overview Architecture for DL Modeling and Generation

[Figure: labels: 5S Meta Model; 5SGraph; DL Expert; DL Designer; 5SL DL Models; 5SLGen; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …); Tailored DL Services; Practitioner; Researcher; Teacher.]

5SLGen: Architecture

[Figure: labels: DL Designer; 5SLScenarios Model; 5SLSocieties Model; 5SLGen; Societies converter; Scenarios converter; XMI Serialized 5SLSocieties model; Java Classes Model; Synthesized Statechart; Java Controller Class; import; Component Pool (ODLBrowse, ODLSearch, each with Java Wrapping); Web Designer; JSP User Interface View; DL Services Implementation; 5SFramework.]

Societies-converter: Workflow

[Figure: labels: DL Designer; 5SLSocieties Model; Societies-converter; JDOM Transform; XMI Serializer; XMI: Class Model; Xmi2Java; Java Mapper; Java Classes Model; Java Representation.]

5SLGen: Architecture

Scenarios-Converter: Workflow

[Figure: labels: DL Designer; 5SLScenarios Model; Scenarios-converter; JDOM Transform; Scenario Synthesizer; Synthesized Statechart; State Machine Compiler; Java Controller Class; State design pattern.]
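The State design pattern named in this workflow can be illustrated with a small controller whose behavior depends on its current state object. The sketch below is in Python rather than the Java that 5SLGen generates, and the states and events (a simplified search session with relevance feedback) are hypothetical; it shows the pattern only, not the generated code.

```python
class State:
    """Base class: each state decides which state follows an event."""
    name = "state"

    def on_event(self, event, controller):
        raise NotImplementedError


class Idle(State):
    name = "idle"

    def on_event(self, event, controller):
        if event == "submit_query":
            return ShowingResults()
        return self  # ignore events that are not valid in this state


class ShowingResults(State):
    name = "showing_results"

    def on_event(self, event, controller):
        if event == "mark_relevant":
            controller.feedback_rounds += 1   # action attached to the transition
            return self
        if event == "submit_query":
            return ShowingResults()
        if event == "quit":
            return Idle()
        return self


class Controller:
    """Delegates every event to the current state object."""

    def __init__(self):
        self.state = Idle()
        self.feedback_rounds = 0

    def dispatch(self, event):
        self.state = self.state.on_event(event, self)
        return self.state.name


controller = Controller()
trace = [controller.dispatch(e)
         for e in ["mark_relevant", "submit_query", "mark_relevant", "quit"]]
print(trace, controller.feedback_rounds)
```

Keeping one class per state means that a statechart synthesized from scenarios can be compiled state by state, without a large conditional in the controller.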

Relevance Feedback Search Service UML Sequence Diagram

[Figure annotation: Event seq. no. = 3]

5SLScenarios instance

Scenarios-converter: Scenario-Synthesis

Scenarios-converter: Scenario-Synthesis (contd.)

Synthesized-Statechart

Component statecharts

Generated DLs

• Union Catalog
  • Simple DL with maximum reuse
  • 2 components used: Search and Browse

• CITIDEL, including VIADUCT
  • Aggregates all the 5SLSocieties and 5SLScenarios models for its elementary services

Generated DL Services

• CITIDEL: Relevance Feedback Search Service
  • Demonstrate extensibility with the ODL Search component

• CITIDEL: Profile Based Filtering Service
  • Demonstrate reusability with the ODL Browse component

• CITIDEL: Multi-Classification Browsing Service
  • Generate complex services without any component reuse

• CITIDEL: Binding Service
  • Complete the set of CITIDEL services

Profile Based Filtering (PBF) Service

[Figure: 5SFramework diagram. Labels: Model; Controller; View; ODL-Browse component.]

Conclusion

• Introduced a scenario-based approach to the generation of componentized DLs

• Applied the 5SFramework for generation of DLs

• Partially validated the theory of 5S

• Demonstrated that complex DLs can be built on the basis of a formal theory

• Adherence to open standards (OAI-PMH, ODL, XMI, UML) and established design patterns (MVC, GOF’s State) ensures relevance and extensibility of our work.

Future Work

• Integration of 5SLGen with 5SGraph

• Improvements to 5SFramework architecture
  • Scalability of the generated DLs and DL services

• Automated construction of user-interfaces with statecharts

• Support for transaction scoping and error handling

• Web services support

• Incorporating the uPortal framework

• Model Validation

• Personalization of the 5S approach using PIPE

DL Student Research: Torres

• Search in collections of fish images

• using combination of

• image properties (CBIR) and

• textual descriptions

Textual information retrieval

Query on Google using Sunset and Rio de Janeiro

Query result

Content-Based Information Retrieval

Torres: Visualizations

Spiral Pattern

Concentric Rings Pattern

DL Student Research: Shen

• 5S and component architecture to allow handling of very complex DL applications: archaeology

• Information visualization, clustering

• Mappings across streams, structure, spaces

Case Study (Archaeology): ETANA

• NSF ITR with CWRU (and Vanderbilt …)

• Faster DL development
  • for complex application domains,
  • with suitable tailoring

• Approach
  • ODL – pool of components
  • 5S – theory-based generation of systems

ETANA Website

Lahav Website

Megiddo Opening Screen

Locus Screen: Pictures

View all

Area Screen: Distribution of Artifacts

ETANA-DL Website

Archaeology DL – Approach

• Solve the following DL problems:
  • interoperability,
  • making primary data available,
  • data preservation

• Modeling archaeological information systems
  • using 5S theory to design system and services

• Rapidly prototyping DLs that handle
  • heterogeneous archaeological data using
  • componentized frameworks

ETANA-DL Schema Design

[Figure: schema tree. Labels: ETANA-DL Object; Owner; Collection; Partition; Subpartition; Locus; Container; ID; object types Bone, Seed, Figurine; attributes including Count, Animal, Species, Name, Description, Dimensions, …]

Data Mapping

ETANA-DL Architecture

[Figure: labels: Users; Services; Data; ETANA-DL; Union; DigBase; DigKit.]

ETANA-DL Architecture: DigBase and DigKit

[Figure: labels: sites Lahav, Nimrin, Umayri, Hisban, Megiddo, Jalul, New Sites; Database Wrappers; ETANA-DL Union Catalog; User Interface; services Search, Browse, Recommend, Note, Personalize, Review, Visualizations, Archaeology Specific; "Work in progress".]

Architecture

[Figure: labels: Archaeological Site; DigBase DB; DigKit; Data Mapping Component; OAI Data Provider; OAI; XOAI; ETANA-DL; Union Catalog; Inverted Files; Services DB; Index; Browse Component; Search Component; Browse DB; Other ETANA-DL Services; Web Interface; Configure.]

Searching – Search Results

Searching – Advanced Search

Searching – Advanced Search Results

Review of Gonçalves Achievements in Past Year

• Book Chapters

1. Fox, E. A., Gonçalves, M. A., Luo, M., Chen, Y., Krowne, A., Zhang, B., McDevitt, K., Pérez-Quiñones, M., Cassel, L. N. Harvesting: Broadening the Field of Distributed Information Retrieval. In Multimedia Distributed Information Retrieval, eds. Fabio Crestani, Mark Sanderson, and Jamie Callan, 2003.

2. Fox, E., McMillan, G., Suleman, H., Gonçalves, M., Networked Digital Library of Theses and Dissertations. Invited chapter for “Digital Libraries: Policy, Planning, and Practice”, eds. Judith Andrews and Derek Law, Ashgate Publishing, 2003.

• Journal papers

1. 5S TOIS paper (April 2004 issue)

2. S. Perugini, M. A. Gonçalves, and E. A. Fox. A Connection-Centric Survey of Recommender Systems Research. Journal of Intelligent Information Systems, Jun, 2004.

3. Zhu, Q., Gonçalves, M. A., Fox, E. A. 5SGraph: A Domain-Specific Visual Modeling Tool for Digital Libraries. Journal of the American Society for Information Science and Technology, submitted 2003, in revision.

4. Baoping Zhang, Marcos Andre Goncalves, Yuxin Chen, Edward A. Fox, and Pavel Calado, "Combining Support Vector Machines and Structural Rules for Effective Filtering of OAI-Based Repositories", submitted to Journal of Digital Libraries (Springer Verlag) Special Issue on Asian Digital Libraries, 2004.

• Conference papers

1. Pável P. Calado, Marcos André Gonçalves, Edward A. Fox, Berthier Ribeiro-Neto, Alberto H. F. Laender, Altigran S. da Silva, Davi C. Reis, Pablo A. Roberto, Monique V. Vieira, and Juliano P. Lage. The Web-DL Environment for Building Digital Libraries from the Web. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.

2. Marcos André Gonçalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, and Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.

3. Qinwei Zhu, Marcos André Gonçalves, Rao Shen, Lillian Cassel, Edward A. Fox. Visual Semantic Modeling of Digital Libraries. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, 17-22 August, 2003, Trondheim, Norway.

4. Rohit Kelapure, Marcos André Gonçalves, Edward A. Fox. Scenario-Based Generation of Digital Library Services. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, 17-22 August, Trondheim, Norway.

5. Marco Cristo, Pavel Calado, Edleno Moura, Nivio Ziviani, Berthier Ribeiro-Neto, and Marcos André Gonçalves. Combining Link-Based and Content-Based Methods for Web Document Classification. CIKM 2003, 3-8 November, New Orleans, Louisiana, USA, 2003.

6. Baoping Zhang, Marcos Andre Goncalves, and Edward A. Fox. An OAI-based Filtering Service for CITIDEL from NDLTD. ICADL 2003, 6th International Conference of Asian Digital Libraries, 8-11 December, Kuala Lumpur, Malaysia, 2003.

7. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: A Digital Library for Integrated Handling of Heterogeneous Archaeological Data. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

8. M. A. Goncalves, E. A. Fox, A. Krowne, P. Calado, A. H. F. Laender, A. S. da Silva, and B. Ribeiro-Neto. The Effectiveness of Automatically Structured Queries in Digital Libraries. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

9. Alberto H. F. Laender, M. A. Goncalves, Pablo A. Roberto. BDBComp: Building a Digital Library for the Brazilian Computer Science Community. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

10. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL Case Study. European Conference on Digital Libraries (ECDL 2004), Bath, UK, September 12-17, 2004. (submitted)

• Other publications

1. R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI-based Digital Library Framework for Biodiversity Information Systems. Department of Computer Science, Virginia Tech, Technical Report No. TR-04-01, 2004.

2. R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI Compliant Content-Based Image Search Component. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

3. R. da S. Torres, C. B. Medeiros, Renata Q. Dividino, Mauricio A. Figueiredo, M. A. Goncalves, E. A. Fox, and R. Richardson. Using Digital Library Components for Biodiversity Systems. Poster to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

4. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: Managing Complex Information Applications – An Archaeology Digital Library. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

5. Qinwei Zhu, Marcos André Gonçalves, E. Fox. 5SGraph Demo: A Graphical Modeling Tool for Digital Libraries. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.

Proposed Outline of Dissertation (Marcos André Gonçalves)

• Chapter 1 – Introduction and Motivation
• Chapter 2 – Background and Related Work
• Chapter 3 – Streams, Structures, Spaces, Scenarios and Societies: the 5S Formal Model for Digital Libraries
• Chapter 4 – Towards a Digital Library Theory: A Formal Digital Library Ontology based on 5S
• Chapter 5 – Applications of the 5S Model/Ontology
  • 5.1 Declarative Specification of DLs: the 5S Language
  • 5.2 Semantic Visual Modeling of DLs: the 5SGraph Tool
  • 5.3 (Semi-) Automatic Generation of Componentized DLs: The 5SGen Tool
  • 5.4 Evaluating DLs: The XML Log Standard for DLs
  • 5.5 Formally comparing Architectures: Fedora and Buckets (time permitting)
• Chapter 6 – Defining Quality in Digital Libraries
• Chapter 7 – Conclusions and Future Work
• Appendix 1 – Mathematical Preliminaries

Questions/Discussion?