
Digital Libraries: Extending and Applying Information Science and Technology

ProLISSA, October 26-27, 2000

Edward A. Fox

fox@vt.edu http://fox.cs.vt.edu

CS DLRL Internet TIC

Virginia Tech, Blacksburg, VA, USA

Thanks!

Theo Bothma Petrina Bothma Peter Ingwersen Irene Wormell ProLISSA and DISSAnet staff DANIDA

Acknowledgements (Selected)

Mentors: JCR Licklider, Michael Kessler, Gerard Salton

Sponsors: Adobe, IBM, Microsoft, NLM, NSF, OCLC, SOLINET, SURA, UNESCO, US Dept. of Ed. (FIPSE), …

VT Faculty/Staff: Tony Atkins, Thomas Dunbar, Debra Dudley, John Eaton, Gwen Ewing, Peter Haggerty, Gary Hooper, Gail McMillan, Len Peters, James Powell, …

VT Students: Emilio Arce, Fernando Das Neves, Brian DeVane, Robert France, Marcos Goncalves, Scott Guyer, Robert Hall, Neill Kipp, Paul Mather, Tim McGonigle, Todd Miller, Constantinos Phanouriou, William Schweiker, Ohm Sornil, Hussein Suleman, Patrick Van Metre, Laura Weiss, …

JCDL 2001

First Joint ACM/IEEE Conference on Digital Libraries

http://www.jcdl.org

June 24-28, 2001 in Roanoke, VA

Conference Committee:
General Chair: Edward A. Fox, Virginia Tech
Program Chair: Christine Borgman, UCLA
Treasurer: Neil Rowe, Naval Postgraduate School
Posters Chair: Craig Nevill-Manning, Rutgers U.

Why this topic today?

Many users (patrons) prefer digital libraries to traditional libraries or the Web

Digital library collections often are free or less expensive, so are heavily used

Most publishers are working toward digital libraries to allow access to their content

Library and information science professionals are key players in building digital libraries

Outline

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

Libraries of the Future

JCR Licklider, 1965, MIT Press

World

Nation

Province

City

Community

Licklider – Unified Theory?

Not ready in 1960s
Analog – unified field theory in physics
“Mess” today – segmented field, specialities
– Database <-> Knowledge <-> Content Mgmt
– Multimedia, Hypermedia, Hypertext
– Logic, Algebra, Artificial Intelligence, …
Expensive, annoying for users
– Don’t know where to look
– Don’t know how to use services

Digital Library Content

[Diagram: Content Types]
Articles, Reports, Books
Text Documents
Speech, Music
Video, Audio
(Aerial) Photos
Geographic Information
Models, Simulations
Software, Programs
Genome: Human, animal, plant
Bio Information
2D, 3D, VR, CAT
Images and Graphics

Locating Digital Libraries in Computing and Communications Technology Space

[Figure: axes are Computing (flops), Digital content, and Communications (bandwidth, connectivity), each running from less to more. Digital Libraries technology trajectory: intellectual access to globally distributed information.]

Grand Challenges Can

Mobilize the community
Spur creativity
Lead to important benefits in society
Push researchers to develop relevant theories
Force people to work in teams/groups
Convince funding agencies to invest
Help bring about integration of systems, interoperability, and seamless interfaces

DLs: Why of Global Interest?

National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
Knowledge and information are essential to economic and technological growth, education
DL - a domain for international collaboration
– wherein all can contribute and benefit
– which leverages investment in networking
– which provides useful content on Internet & WWW
– which will tie nations and peoples together more strongly and through deeper understanding

Digital Libraries --- Objectives

World Lit.: 24hr / 7day / from desktop
Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies
Ubiquitous, Higher Quality, Lower Cost
Education, Knowledge Sharing, Discovery
Disintermediation -> Collaboration
Universities Reclaim Property
Interactive Courseware, Student Works
Scalable, Sustainable, Usable, Useful

MARIAN Layers

Database Layer

Search Engine Layer

User Information Layer

User Interface Layer

User User User User

DL Components

User Interfaces

Workflow Mgr

DBMS

Search Engines

Data, MM Info

Gateways

Repository

Rights Mgr

MM/ HT Renderer

Digital Libraries Shorten the Chain from

Editor

Publisher

A&I

Consolidator

Library

Reviewer

DLs Shorten the Chain to

Author

Reader

Digital Library

Editor

Reviewer

Teacher

Learner

Librarian

Dr.

Patient

Benefits

Ease of use
Effectiveness

“The benefits of digital libraries will not be appreciated unless they are easy to use effectively.” - IITA Workshop report

Definitions

Library ++ (library+archive+museum+…)
Distributed information system + organization + effective interface
User community + collection + services
Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation

Outline

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

PetaPlex Top View: 4 ft. side

PetaPlex Side View: 4 ft. wide, 8 ft. high, 15 shelves

Roles:
* Support
* Cooling
* Power

PetaPlex Complex

[Diagram: a front end machine (RS/6000, 1G RAM, 4 Proc.), Service Machines 1 through 4, and an array of Nanoservers (14 shown)]

PetaPlex

Digital Library Machine (“super” object store): Parallel computer / storage utility
Research: inverted files, video server, …
Knowledge Systems Incorporated is supplying VT-PetaPlex-1 with 2.5 terabytes through 100 nodes: Net connection + 25GB disk + 233 MHz Pentium + Linux
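The capacity figure follows directly from the node count. A quick check, assuming decimal units (1 TB = 1000 GB), which the slide does not state:

```python
# Back-of-the-envelope check of the VT-PetaPlex-1 figures quoted on the slide.
# Assumption: decimal units (1 TB = 1000 GB); the slide does not say.
NODES = 100            # nodes in VT-PetaPlex-1
DISK_GB_PER_NODE = 25  # disk per node

total_gb = NODES * DISK_GB_PER_NODE
total_tb = total_gb / 1000

print(f"{NODES} nodes x {DISK_GB_PER_NODE} GB = {total_tb} TB")
```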

Structured Video Browser (making video into hypermedia)

www.learn.umd.edu

IBrowse

Expository multimedia Narrative Structures

ICU: Information and Communication University

MPEG-7 Image Library Systems Tech.

[Architecture diagram with numbered steps 1-5 and 3’-5’: Users, Web Search Engines, WWW, Web Server, Servlet Engine (three Servlets), Search Server, OSDB, MPEG-7 Description Module. The slide margins are labeled MPEG-7 Image Library Systems Tech. and MPEG-7 Video Library Systems Tech.]

ICU: Information and Communication University

MPEG-7 Video Library Systems Tech.: Architecture

[Diagram components: Video Data, Description Generator, Description Schemes Design Tool, Description Scheme, MetaDatabase, VideoDatabase, Retrieval Server Module, Player Presentation Module]

LMDS offers a LOT of bandwidth (comparison to previous auctions)

[Bar chart, horizontal axis 0 to 1200 MHz. Services compared: Interactive & Video Data, Wireless Communications Service, PCS D-F Block, Digital Audio Radio Service, Cellular Unserved, PCS A-C Block, DBS, MMDS, LMDS]

LMDS is:
- 1300 MHz in two “Blocks” (28-31 GHz)
- Over 2X bandwidth of AM/FM radio, VHF/UHF television, and Cellular telephone combined
- More than sum of previous 16 auctions

LMDS Hub Site at Slusher Hall

Radio Hut

Wavtrace Tower 1 Wavtrace Tower 2

Eventually LMDS could be used in combination with other wireless and wireline technologies to reach individual homes

SPIRE Visualization

CAVE-ETD

CAVE-ETD is a simulation of a library that runs in a CAVE (VR environment).
Populated with a subset of ETD records.

[CAVE-ETD views: Main Foyer with four rooms; Reading Book Abstract]

Integrated CCLINC Translingual Information System

DARPA

[Diagram: a CCLINC SERVER providing Extraction, Info Detection, Summarization, Translation, and 2-way Speech Translation]

Queries shown:
What is the north korean movement in the front line?
What is the status of nk missile launch against japan?

Summarization output shown:
It seems that North Korea launch a missile again
After North Korea launched a Daipodong missile last month, NK is perceived to proceed to an additional test launch. Korea, US and Japan enter into an alert state, and prepare for a joint response policy. Korea estimates that the additional launch will be on 09/05. Japan estimates that NK’s missile range is short. US information says that there is no sign of launch yet.

Translation output shown (romanized Korean):
BugHanI IlBonE Ddo MiSaIlEul BalSaHan Deus HaDa

Outline

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

Definitions

Library ++ (library+archive+museum+…)
Distributed information system + organization + effective interface
User community + collection + services
Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation

Definition: Digital Libraries are complex systems that

help satisfy info needs of users (societies)
provide info services (scenarios)
organize info in usable ways (structures)
present info in usable ways (spaces)
communicate info with users (streams)

5S Layers

Societies

Scenarios

Spaces

Structures

Streams

Definition: 5S Framework

Societies: interacting people (and computers)
Scenarios: services, functions, operations, methods
Spaces: domains + constraints (e.g., distance, adjacency): 2D, vector, probability
Structures: relations, trees, nodes and arcs
Streams: sequences of items (text, audio, video, network traffic)
(5 Element System: Fire, Wood, Earth, Metal, Water)

5S: Components

Societies: roles, rituals, reasons, relationships, artifacts
Scenarios: acquire, index, consult, administer, preserve
Spaces: physical, temporal, functional, presentational, conceptual
Structures: architectures, taxonomies, schema, grammars, links, objects
Streams: granularities, protocols, paths, flows, turbulences

5S: Combinations

Societies + Scenarios = user model
Societies + Scenarios + Spaces = user interface
Streams + Structures = markup
Streams + Structures + Scenarios = object
Structures + Scenarios = DBMS
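One way to read these combinations is as a lookup from a set of S's to the construct it yields. The sketch below is only an illustration of that reading; the names `S`, `COMBINATIONS`, and `combine` are hypothetical and not part of the 5S framework itself.

```python
# Illustrative only: the five S's as an enum, and the slide's combinations
# as a lookup from an unordered set of S's to the construct they yield.
from enum import Enum


class S(Enum):
    SOCIETIES = "societies"
    SCENARIOS = "scenarios"
    SPACES = "spaces"
    STRUCTURES = "structures"
    STREAMS = "streams"


COMBINATIONS = {
    frozenset({S.SOCIETIES, S.SCENARIOS}): "user model",
    frozenset({S.SOCIETIES, S.SCENARIOS, S.SPACES}): "user interface",
    frozenset({S.STREAMS, S.STRUCTURES}): "markup",
    frozenset({S.STREAMS, S.STRUCTURES, S.SCENARIOS}): "object",
    frozenset({S.STRUCTURES, S.SCENARIOS}): "DBMS",
}


def combine(*parts):
    """Return the construct the slide names for this combination, or None."""
    return COMBINATIONS.get(frozenset(parts))


print(combine(S.STREAMS, S.STRUCTURES))
print(combine(S.SCENARIOS, S.STRUCTURES))
```

Because the keys are frozensets, the order in which the S's are given does not matter, which matches the commutative "+" on the slide.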

Outline

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

Complex to Simple

MARC ($50) Dublin Core (DC)

Author’s tools: www.physik.uni-oldenburg.de/EPS/mmm
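To show how simple a Dublin Core description is compared with a full MARC record, here is a minimal sketch that builds one with Python's standard library. The `record` wrapper element, the `dc_record` helper, and all field values are hypothetical; only the element names and the namespace URI come from the Dublin Core element set.

```python
# A minimal sketch of an unqualified Dublin Core description.
# The wrapper element name "record" and the sample values are made up.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_record(fields):
    """Build an XML element holding one dc:* child per (name, value) pair."""
    root = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


etd = dc_record({
    "title": "An Example Thesis",              # hypothetical values
    "creator": "Doe, Jane",
    "date": "2000-10-26",
    "type": "Text",
    "identifier": "http://example.org/etd/1",
})

xml_text = ET.tostring(etd, encoding="unicode")
print(xml_text)
```

Every Dublin Core element is optional and repeatable, so a usable description can be as short as the five fields above.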

DL Components

User Interfaces

Workflow Mgr

DBMS

Search Engines

Data, MM Info

Gateways

Repository

Rights Mgr

MM/ HT Renderer

Open Archives Initiative

OAI: www.openarchives.org

openarchives@openarchives.org

OAI Philosophy

Self-archiving = submission mechanism
Long-term storage system = archive
Open interface = harvesting mechanism
Data provider + service provider
Start with “gray literature”
– e-prints/pre-prints, reports, dissertations, …
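The split between data providers and service providers can be sketched as two small classes. This is a toy in-memory model under stated assumptions, not the Open Archives protocol: the class names, method names, and the `since` parameter are all hypothetical.

```python
# Toy model of the data provider / service provider split.
# Not the real Open Archives protocol; all names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Record:
    identifier: str
    datestamp: str  # ISO date such as "2000-10-26", so strings sort by time
    metadata: dict


@dataclass
class DataProvider:
    """An archive that accepts submissions and exposes its metadata."""
    name: str
    records: list = field(default_factory=list)

    def submit(self, record):
        # Self-archiving: authors deposit directly into the archive.
        self.records.append(record)

    def list_records(self, since=""):
        # Open interface: return records stamped on or after `since`.
        return [r for r in self.records if r.datestamp >= since]


@dataclass
class ServiceProvider:
    """Harvests metadata from archives and offers search over it."""
    index: dict = field(default_factory=dict)

    def harvest(self, provider, since=""):
        harvested = provider.list_records(since)
        for r in harvested:
            self.index[r.identifier] = r
        return len(harvested)

    def search(self, word):
        word = word.lower()
        return sorted(i for i, r in self.index.items()
                      if word in r.metadata.get("title", "").lower())


vt = DataProvider("VT-ETD")
vt.submit(Record("vt:1", "2000-09-01", {"title": "Digital Libraries"}))
vt.submit(Record("vt:2", "2000-10-20", {"title": "Video Retrieval"}))

sp = ServiceProvider()
print(sp.harvest(vt))                      # full harvest
print(sp.harvest(vt, since="2000-10-01"))  # incremental harvest
print(sp.search("digital"))
```

The point of the design is that the archive only has to answer one simple listing request; indexing and search live entirely on the service provider side.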

Archive of Digital Objects

Archive Access Protocol

Handle(ID)

Digital object

terms and conditions

OAI – Repository Perspective

Required: Protocol

[Diagram: a repository containing several DOs and MDOs]

OAI – Black Box Perspective

[Diagram: OA 1, OA 2, OA 3, OA 4, OA 5, OA 6, OA 7]

Black Box OAI-ETD Perspective

ISTEC (Ibero America)
PhysDis
NSYSU (Taiwan)
ADT (Australia)
BN.PT (Portugal)
www.theses.org
CyberTheses (Francophone)
VT
Dissert.Online (Germany)
MIT
OhioLINK
CBUC (Catalunya)
NDC (Greece)
SEALS (S. Africa)
CIC
U. Bergen (Norway)

Outline

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

Presidential Directive - 12/17/1999

Subject: Use of Information Technology to Improve Our Society

“13. The Secretary of the Smithsonian Institution, the Director of the National Science Foundation, the Director of the National Park Service, and the Director of the Institute of Museum and Library Services shall work with the private sector and cultural and educational institutions across the country to create a Digital Library of Education to house this country's cultural and educational resources.”

Programmatic History

Digital Libraries Initiative (DLI 1) - NSF/NASA/ARPA, FY 94-97

DLI 2 - NSF, et al., initiated in FY 98, continuing

DLI 2 Special Emphasis in UG Education, FY 98-99

DLs & UG Earth Systems Education, initiated FY 99, continuing

NSDL Program, NSF: FY 00-02

DL Operational: Fall, 2002

Vision

A Learning Environments and Resources Network for SMET Education (LEARNS)

Designed to meet the needs of learners, in both individual and collaborative settings

Constructed to enable dynamic use of a broad array of materials for learning, primarily in digital format

Managed actively to promote reliable anytime - anywhere access to quality collections and services, available both within and without the network

“The network is the library.”

LEARNS Connects:

Users: students, educators, life-long learners

Content: structured learning materials; large real-time or archived datasets; audio, images, animations; primary sources; digital learning objects (e.g. applets); interactive (virtual, remote) laboratories; ...

Tools: search; refer; validate; integrate; create; customize; publish; share; notify; collaborate; ...

Expectations of Tracks

Core Integration: to coordinate a distributed alliance of resource collection and service providers, and to ensure reliable and extensible access to and usability of the resulting network of learning environments and resources

Collections: to aggregate and actively manage a subset of the digital library’s content within a coherent theme or specialty

Services: to increase the impact, reach, efficiency, and value of the digital library in its fully operational form

Targeted Research: to have immediate impact on one or more of the other three tracks

CS Teaching Center (CSTC)

Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.

Learners benefit from having well-crafted modules that have been reviewed and tested.

Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.

ACM Education Board and SIG support, new NSF grant with COLLEGIS Research Institute and others …

Browsing (1)

Browsing (2)

A Digital Library Case Study

Domain: graduate education, research

Genre: ETDs = electronic theses & dissertations

Submission: http://etd.vt.edu

Collection: http://www.theses.org

Project: Networked Digital Library of Theses & Dissertations http://www.ndltd.org

(NDLTD – remember: ND LTD / NDL TD)

(also, newer NUDL: Networked University Digital Library, with e-courseware, etc.)

Grad Program, Library, IT, Ed. (Tech)

The Networked Digital Library of Theses and Dissertations

www.NDLTD.org

Leader of the Worldwide ETD (Electronic Thesis and Dissertation) Initiative

Training Authors
Expanding Access
Preserving Knowledge
Improving Graduate Education
Enhancing Scholarly Communication
Empowering Students & Universities

What are the long term goals?

Attract all TDs/yr: 50K D-US, 25K D-Germany, 10K TD-Canada, …

>200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, …)

Dramatic increase in knowledge sharing: literature reviews, bibliographies, …

Services providing lifelong access for students: browse, search, prior searches, citation links

Hundreds/thousands of downloads / year / work

Student Gets Committee Signatures and Submits ETD

Signed

Grad School

Graduate School Approves ETD, Student is Graduated

Ph.D.

Library Catalogs ETD, Access is Opened to the New Research

WWW

NDLTD

[Diagram: a User Interface with User Search Support (multilingual, XML) feeds the NDLTD World Federated Search, which reaches: Virginia Tech ... (univ); Dissertations Online (Germany); OhioLink (lib / univ group); Portuguese NL ... (national lib); Australia (regional); OAS, ISTEC (Latin America)]

Note: All groups shown are connected with NDLTD.

Access Possibilities

[Diagram: Web search engines, library catalog clients, www.theses.org, www.openarchives.org, and 3rd Party Services (e.g., UMI) give access to collections at Virginia Tech, National Library of Portugal, CBUC (Spain), OhioLink, MIT, and National Projects: AU, GE, …]

Status of the Local Project

Approved by university governance Spring 1996; required starting 1/1/97
Submission & access software in place
Submission workshops for students (and faculty) occur often: beginner/adv.
Faculty training as part of Faculty Development Initiative
Over 3000 ETDs in collection – some have audio, video, large images, software, …

US University Members (44)

Air University (Alabama)
Baylor University
Brigham Young University (part, whole)
Caltech
Clemson University
College of William & Mary
Concordia University (Illinois)
East Carolina University
East Tenn. State U. – require fall 2000
Florida Institute of Technology
Florida International University
George Washington University
Louisiana State University
Marshall University (W. Va.)
Miami University of Ohio
Michigan Tech
Mississippi State University
MIT
Naval Postgraduate School (CA)
New Mexico Tech
North Carolina State University
Penn. State University
Rochester Institute of Tech.
U. of Colorado Health Science Center
U. of Florida
U. of Georgia
University of Hawaii, Manoa
U. of Iowa
U. of Kentucky
U. of Maine
U. of North Texas – required since 8/99
U. of Oklahoma
U. of South Florida
U. of Tennessee, Knoxville
U. of Tennessee, Memphis
U. of Texas at Austin – required in 2001
U. of Virginia
U. Wisconsin - Madison
Vanderbilt U.
Virginia Commonwealth U.
Virginia Tech - required since 1/97
West Virginia U. - required fall 1998
Western Michigan U.
Worcester Polytechnic Inst.

OhioLINK

Statewide Consortium
Represents 79 colleges, universities, libraries
Public Universities
Private Universities and Colleges
2-Year Colleges
Only a few (e.g., Miami U. of Ohio) are also NDLTD members on their own

National / Regional Projects

Australia
– U. New South Wales (lead)
– U. of Melbourne
– U. of Queensland
– U. of Sydney
– Australian National U.
– Curtin U. of Technology
– Griffith U.
Germany
– Humboldt University (lead)
– 3 other universities
– 5 learned societies: Math, Physics, Chemistry, Sociology, Education
– 1 computing center
– 2 major libraries
Consorci de Biblioteques Universitàries de Catalunya, as group, www.cbuc.es:
– Universitat de Barcelona
– Universitat Autònoma de Barcelona
– Universitat Politècnica de Catalunya
– Universitat Pompeu Fabra
– Universitat de Girona
– Universitat de Lleida
– Universitat Rovira i Virgili
– Universitat Oberta de Catalunya
– Biblioteca de Catalunya
India, Portugal, …
South Africa: SEALS, …

Other Countries with Members

Belgium Brazil Canada Germany Hong Kong India Italy Korea Mexico

Netherlands Norway Russia Singapore S. Africa S. Korea Spain Taiwan UK

ECHEA / SEALS (S. Africa)

Mellon Foundation $80K
Eastern Cape Higher Education Association
South East Academic Library System
Border Technikon
Eastern Cape Technikon
University of Fort Hare
Port Elizabeth Technikon
Rhodes University (first to require outside US)
University of Port Elizabeth
University of Transkei
(and members elsewhere, e.g., University of Pretoria)

MARIAN/DEByE Mediation Middleware

[Architecture diagram. Collections, each reached through a wrapper: German PhysDis Collection, VT OAI Collection, MIT ETD Collection, Greek Hellenic Dissertations Collection, ... Protocols shown: Harvest, Open Archives, Dienst, Z39.50. Formats shown: SOIF, Dublin Core, RFC1807, MARC. A Wrapper Generator works from a 5SL Source Description. Above the wrappers sit a Fusion Layer and a Belief Network Layer, fed by Additional Evidential Information. A Local Data Store supports Analysis, Indexing, Linking, and Search Services, Recommendation Services, etc. The NDLTD/NUDL/Digital Library User exchanges Queries + Results with the middleware.]
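The wrapper-plus-fusion idea in the diagram can be sketched in a few lines. This is a simplified stand-in, not MARIAN or DEByE: the `Wrapper` class and `federated_search` function are hypothetical, each "collection" is just an in-memory dictionary, and the fusion step is a plain score sort rather than a belief network.

```python
# Simplified stand-in for wrapper-based mediation with result fusion.
# Not MARIAN/DEByE; names, data, and scoring are illustrative assumptions.
class Wrapper:
    """Hides one collection's native protocol behind a common search() call."""

    def __init__(self, source, docs):
        self.source = source
        self.docs = docs  # identifier -> title

    def search(self, query):
        # Score = fraction of query terms that appear in the title.
        terms = query.lower().split()
        hits = []
        for ident, title in self.docs.items():
            words = title.lower().split()
            matched = sum(1 for t in terms if t in words)
            if matched:
                hits.append((ident, matched / len(terms)))
        return hits


def federated_search(wrappers, query):
    """Send the query to every wrapper and fuse the results into one ranking."""
    merged = []
    for w in wrappers:
        merged.extend((score, w.source, ident)
                      for ident, score in w.search(query))
    # Fusion by plain score sort: best score first, ties by source then id.
    merged.sort(key=lambda h: (-h[0], h[1], h[2]))
    return merged


vt = Wrapper("VT", {"vt:1": "Digital library interoperability",
                    "vt:2": "Video servers"})
mit = Wrapper("MIT", {"mit:7": "Digital video library"})

results = federated_search([vt, mit], "digital library interoperability")
for score, source, ident in results:
    print(f"{score:.2f} {source} {ident}")
```

Each new collection only needs its own wrapper; the fusion step and the user-facing query interface stay unchanged.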

Build Local ETD Site

Digital Library

Policies

Inspection/Approval

Workshop/Training

ETD

ETD

In South Africa

DISSAnet papers
Library of Parliament
Howard Pim Africana Library, University of Fort Hare
Collections in 11 Languages
Cultural Heritage
…

Remember

Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications

Conclusions

Consider DLs (like the poetry project/paper) in South Africa
Education is one important application
Cultural heritage and linguistic diversity are important to preserve
Technology opens up exciting opportunities
Having a framework and theory may lead to better systems and broader applicability