NDLTD Resources & Projects ETD 2006 U.S. Regional Conference St. Louis, Missouri October 27-28, 2006...
NDLTD Resources & Projects
ETD 2006 U.S. Regional Conference, St. Louis, Missouri
October 27-28, 2006
Edward A. Fox, Executive Director, NDLTD
[email protected] http://fox.cs.vt.edu
Outline
• Acknowledgements
• Info Life Cycle, DLs, 5S, DL Curric, OAI
• ETDs and NDLTD
• Partnerships, Union Catalog, Access, Preservation, Research
• Summary and Conclusions
Acknowledgements
• All those working with ETDs
• NDLTD, including Board, Committees, and Members
• ETD 2006 US Regional Conference Team
• Sponsors
• Presenters, Attendees
Acknowledgements: ETD Mtgs
• 1987 mtg in Ann Arbor: UMI, VT, …
• 1992 mtg in Washington, DC: CNI, CGS, UMI, VT and 10 universities with 3 reps each
• 1993 mtg in Atlanta to start Monticello Electronic Library (regional, US Southeast): SURA, SOLINET
• 1994 mtg at VT: std: PDF + SGML + multimedia objects
• 1996 funding by SURA, US Dept. of Education (FIPSE)
• 1997 meetings in UK, Germany, ...
• 1998 – 1st symposium – Memphis (20)
• 1999 – 2nd symposium – Blacksburg (70)
• 2000 – 3rd symposium – St. Petersburg (225)
• 2001 – 4th symposium – Caltech (200)
• 2002 – 5th symposium – BYU, Provo, Utah
• 2003 – 6th symposium – Berlin (215)
• 2004 – 7th symposium – U. Kentucky
• 2005 – 8th symposium – Sydney, Australia
• 2006 – 9th symposium – Quebec City, Canada
Acknowledgements: Future ETD Conferences
• 2007 – 10th symposium
– Uppsala University, Sweden
– 13-16 June
• 2008 – 11th symposium
– Dartington College of Arts, Devon, UK
– 29 June – 2 July (tentative)
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/
Information Life Cycle
[Figure: cycle of paired activities: Authoring/Modifying, Organizing/Indexing, Storing/Retrieving, Distributing/Networking, Retention/Mining, Accessing/Filtering, Using/Creating]
Quality and the Information Life Cycle
[Figure: information life cycle annotated with quality dimensions. Phases: Creation, Distribution, Seeking, Utilization. Activities: Authoring, Modifying, Describing, Organizing, Indexing, Storing, Archiving, Networking, Accessing, Filtering, Searching, Browsing, Recommending, Retention, Mining, Discard. States: Active, Semi-Active, Inactive. Quality dimensions: Significance, Similarity, Pertinence, Accuracy, Completeness, Conformance, Relevance, Timeliness, Accessibility, Preservability.]
[Figure: the same life cycle (creation, distribution, seeking, utilization) aligned with two information-seeking models and with DL Success Constructs.
E: Ellis’ model: E1: starting, E2: chaining, E3: browsing, E4: differentiating, E5: monitoring, E6: extracting.
K: Kuhlthau’s model: K1: initiation, K2: selection, K3: exploration, K4: formulation, K5: collection, K6: presentation.]
Digital Libraries (DLs) -- Objectives
• World Lit.: 24hr / 7day / from desktop
• Ubiquitous
• Integrated “super” information systems
• Usable, Useful
• Higher Quality, Lower Cost
• Education, Knowledge Sharing, Discovery
• Disintermediation -> Collaboration
• Universities Reclaim Property
• Interactive Courseware, Student Works
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
A Minimal DL in the 5S Framework
[Figure: concept map deriving a Minimal DL from the 5S: Streams, Structures, Spaces, Scenarios, Societies. Other labels: Structured Stream, Structural Metadata Specification, Descriptive Metadata Specification, Digital Object, Collection, Metadata Catalog, Repository, hypertext, services (indexing, browsing, searching).]
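As a rough illustration of the minimal DL components named in this figure, the sketch below combines digital objects, a collection, a metadata catalog, and the indexing, searching, and browsing services in one small structure. All class, field, and method names are hypothetical, chosen for this sketch only; they are not taken from the formal 5S definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    handle: str  # unique identifier of the object (hypothetical naming)
    stream: str  # the content stream, simplified here to plain text


@dataclass
class MinimalDL:
    collection: dict = field(default_factory=dict)  # handle -> DigitalObject
    catalog: dict = field(default_factory=dict)     # handle -> descriptive metadata
    index: dict = field(default_factory=dict)       # term -> set of handles

    def submit(self, obj, metadata):
        """Store an object with its metadata and index its words."""
        self.collection[obj.handle] = obj
        self.catalog[obj.handle] = dict(metadata)
        for term in obj.stream.lower().split():
            self.index.setdefault(term, set()).add(obj.handle)

    def search(self, term):
        """Searching service: handles whose stream contains the term."""
        return sorted(self.index.get(term.lower(), set()))

    def browse(self, key):
        """Browsing service: handles ordered by one metadata field."""
        return sorted(self.catalog, key=lambda h: self.catalog[h].get(key, ""))


dl = MinimalDL()
dl.submit(DigitalObject("etd-1", "open archives harvesting"),
          {"title": "Harvesting", "year": "2002"})
dl.submit(DigitalObject("etd-2", "concept maps for archives"),
          {"title": "Concept Maps", "year": "2006"})
```

Here `dl.search("archives")` finds both objects, while `dl.browse("title")` orders them by title. A real DL would separate the repository from the services; they are merged here only to keep the sketch short.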
Infrastructure Services
• Repository-Building, Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
• Repository-Building, Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
DL Curriculum Development Project
• Collaborative Research launched by:
– Department of Computer Science, Virginia Tech
– School of Information and Library Science, University of North Carolina, Chapel Hill
• Three year (2006 - 2008) funded project
DL Topics in 19 Modules (original)
OAI = Technical Umbrella for Practical Interoperability…
…that can be exploited by different communities
[Figure: communities under the OAI umbrella: Reference Libraries, Publishers, E-Print Archives, Museums]
OAI – Repository Perspective
Required: Protocol
[Figure: a repository holding objects labeled DO, each with one or more metadata records labeled MDO]
OAI – Black Box Perspective
[Figure: archives OA 1 through OA 7 shown as black boxes]
The World According to OAI
[Figure: Data Providers feed Service Providers (Discovery, Current Awareness, Preservation) through metadata harvesting]
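To make the data provider / service provider split concrete, here is a minimal sketch of the harvesting side using the OAI-PMH `ListRecords` verb with the `oai_dc` metadata prefix. The base URL, set name, and sample response are made up for illustration, and the sketch parses a canned response instead of contacting a server; a real harvester would also follow resumption tokens and handle error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response from an imaginary data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai", set_spec="etd")
records = parse_records(SAMPLE)
```

A service provider would run this kind of harvest against each data provider and then build discovery, current awareness, or preservation services on the collected metadata.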
ETDs: History of Rationales
• EPub: SGML, Electronic Manuscript Project
• Graduate Education: Reach next generation of researchers, educators, leaders
• DL: Testbed, demonstration, case study
• Institutional Repository: Good place to start since it is easy, inexpensive, and beneficial, and can be extended to lead to other beneficial activities
The Networked Digital Library of Theses and Dissertations
www.NDLTD.org
Leader of the Worldwide ETD (Electronic Thesis and Dissertation) Initiative
Training Authors
Expanding Access
Preserving Knowledge
Improving Graduate Education
Enhancing Scholarly Communication
Empowering Students & Universities
http://scholar.lib.vt.edu/theses/available/etd-2227102539751141/
NDLTD: How can a university get involved?
• Select planning/implementation team
– Graduate School
– Library
– Computing / Information Technology
– Institutional Research / Educ. Tech.
• Join as a member
• Adapt a proven approach
– Build interest and consensus
– Start trial / allow optional submission
NDLTD Goals
• For Students:
– Gain knowledge and skills for the Information Age, especially about Digital Libraries
– Richer communication (digital info, multimedia, …)
• For Universities:
– Easy way to enter the digital library field and benefit
• For the World:
– Global digital library – large, useful, many services
• Generally:
– Save time and money
– Increased visibility for all associated with university research results
Some Countries
• Argentina
• Australia
• Belgium
• Brazil
• Canada
• Chile
• China, Hong Kong
• Colombia
• Finland
• France
• Germany
• Greece
• India
• Italy
• Jamaica
• Korea
• Lithuania
• Malaysia
• Mexico
• Namibia
• Netherlands
• Norway
• Peru
• Poland
• Russia
• Singapore
• S. Africa
• S. Korea
• Spain
• Sudan
• Sweden
• Switzerland
• Taiwan
• Thailand
• Turkey
• UK
• Ukraine
• United Arab Emirates
• USA
• Venezuela
• Yugoslavia
Selected Projects / Sponsors
• Australia (ADT)
• Brazil (BDT, IBICT)
• Canada
• Catalunya
• Chile (Cybertesis)
• China (CALIS)
• Germany
• India (Vidyanidhi)
• Korea
• OhioLINK: 79 colleges/univs
• Portugal (National Library)
• South Africa
• Texas Digital Library
• UK (British Library, JISC, Edinburgh, …)
• UNESCO (especially Latin America, Eastern Europe, Africa)
• …
NDLTD Members - 1
Association of Research Libraries
Ball State University
Brigham Young University
California Institute of Tech.
Consorci de Biblioteques Universitàries de Catalunya
Georg August Universität Göttingen
George Washington University
Georgetown University
Georgia Institute of Technology
Georgia Southern University
Georgia State University
Government of Canada
Griffith University
Johns Hopkins University
Kauno Technologijos Universitetas
Louisiana State University
L'Université du Québec à Rimouski
McGill University
New Jersey Institute of Technology
Ohio University
Oregon State U. Library
NDLTD Members - 2
Penn State University
Pontifícia U. Católica do Rio de Janeiro
Portugal National Library
Rhodes University
Rita Chu (individual)
Simon Fraser University
State of Kansas
Texas Tech University
Triangle Research Lib. Net.
U. de las Américas, Puebla
Universität St. Gallen
U. Alabama at Birmingham
U. Arizona
U. Glasgow
U. Hong Kong
U. Kentucky
U. Maine
U. Missouri
U. New Orleans
U. North Texas
U. Pittsburgh
U. Pretoria
U. Southern Florida
U. Tennessee
U. Waterloo
Uppsala Universitet
Utah Academic Library Assn.
Virginia Commonwealth U.
Virginia Tech
West Virginia U. Libraries
Worcester Polytechnic Inst.
Yale University
NDLTD Member Support
• Annual conference (…, Germany, …, Sweden, UK)
• ETD-L – listserv for discussion
• Union catalog
• Services for access: VT, OCLC, VTLS, Scirus, Google Scholar, …
• Information for ETD projects
– Standards, documentation (Guide, Marcel Dekker book)
• Advocacy for ETD activities worldwide
• …
NDLTD Incorporation
• Networked Digital Library of Theses and Dissertations incorporated May 20, 2003 in Virginia, USA
• Charitable and educational purposes (501(c)(3))
• Officers
– Executive Director (Ed Fox)
– Secretary (Gail McMillan)
– Treasurer (Scott Eldredge)
Board of Directors
• Suzie Allard (ETD2004, U. Kentucky)
• Denise A. D. Bedford (World Bank)
• Julia C. Blixrud (ARL, SPARC)
• José Luis Borbinha (Natl Lib Portugal)
• Alex Byrne (ETD2005, ADT: Australia)
• Tony Cargnelutti (ETD2005, Australia)
• Vinod Chachra (VTLS)
• William Clark (Ohio State U.)
• Susan Copeland (RGU, UK)
• Jude Edminster (Bowling Green St. U.)
• Scott Eldredge (Treasurer, ETD2002, BYU)
• Edward A. Fox (Exec Director, Virginia Tech)
• John H. Hagen (West Virginia U.)
• Thomas B. Hickey (OCLC)
• Christine Jewell (U. Waterloo, Canada)
• Joan K. Lippincott (CNI)
• Austin McLean (ProQuest)
• Gail McMillan (Secretary, Virginia Tech)
• Joseph Moxley (ETD2000, USF)
• Eva Müller (U. Uppsala, Sweden)
• Ana Pavani (PUC Rio, Brazil)
• Sharon Reeves (Nat’l Library Canada)
• Janice Rickards (chair of ADT)
• Peter Schirmbacher (ETD2003, Humboldt)
• Samson Soong (Hong Kong U. Science & Technology)
• Hussein Suleman (U. Cape Town, S. Africa)
• Shalini R. Urs (U. Mysore, India)
• Eric F. Van de Velde (ETD2001, Caltech)
• Ellen Wagner (Adobe)
NDLTD Committees (Chairs)
• Awards (John Hagen)
• Conferences (Sharon Reeves)
• Development (Peter Schirmbacher)
• Executive (Edward Fox)
• Finance (Scott Eldredge)
• Implementation (Ana Pavani)
• Membership (Eric F. Van de Velde)
• Nominating (Joan Lippincott)
• Standards (Thomas B. Hickey)
• Union Catalog (Vinod Chachra)
NDLTD Committee Activities
• Implementation
– How to apply standards?
– How to coordinate a national program?
– How to launch a pilot project?
– How to train students regarding copyright, digital libraries, electronic publishing, preservation, …?
• Membership, Member Support, Public Relations
– How to clarify and publicize member services?
– How to double membership in 2 years?
– Sub-committees for regional member support?
• Please join / update your Member Info!
Standards
• PDF -> PDF/A
• SGML, XML, XML DTDs, XML Schema
• Multimedia
• References -> Reference List for CrossRef
• Packaging -> METS -> Ease Role of Search Engines
Partnerships
• UMI/ProQuest
• Adobe, IBM, Microsoft, …
• UNESCO
• OCLC
• VTLS
• Ex Libris
• Scirus
• Google
UNESCO and ETDs (by Axel Plathe at ETD2003)
• Promoting the use of the Internet as a tool for disseminating scientific knowledge
• Facilitating the transfer of ETD expertise from developed to developing countries
• 1998: Member of the NDLTD Steering Committee
• 1999: First UNESCO ETD meeting on ETD internationalisation
• 2002: “UNESCO Guide to Electronic Theses and Dissertations”
• 2003: Model training programmes and training courses
• 2003: Sponsor pilot projects
• 2003: Pilot projects (Africa, Europe, Latin-America)
Union catalog: OCLC
• OCLC runs OAI data provider on TDs.
• Is getting data from WorldCat (so, from many sites!).
• Harvests from all others who contact them (see Thom Hickey).
• Need DC and either ETD-MS or MARC.
• Need a set for ETDs, or separate data provider.
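The metadata requirement above (DC plus either ETD-MS or MARC) can be checked against a data provider's OAI-PMH `ListMetadataFormats` response. The sketch below does this on a canned response. The prefix names `oai_etdms` and `marc21` are assumptions about how a site labels those formats (only `oai_dc` is fixed by the protocol), and the sample response is invented for illustration.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC_PREFIX = "oai_dc"                     # required of every OAI-PMH repository
ETD_PREFIXES = {"oai_etdms", "marc21"}   # assumed names; sites may differ


def advertised_prefixes(xml_text):
    """Collect the metadataPrefix values in a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter(OAI + "metadataPrefix") if el.text}


def meets_union_catalog_needs(prefixes):
    """True if DC is offered together with ETD-MS or MARC."""
    return DC_PREFIX in prefixes and bool(ETD_PREFIXES & prefixes)


# Hypothetical response from an imaginary data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>oai_etdms</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

prefixes = advertised_prefixes(SAMPLE)
ok = meets_union_catalog_needs(prefixes)
```

The second requirement, a dedicated set for ETDs, corresponds to the `set` argument of a `ListRecords` request; a site without such a set would expose its ETDs through a separate data provider instead.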
OCLC SRU Interface
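For readers unfamiliar with SRU (Search/Retrieve via URL), the sketch below builds a `searchRetrieve` request using standard SRU 1.1 parameters and a CQL query. The base URL is a placeholder, not the OCLC endpoint, and the `dc` record schema name and the `dc.title` index are assumptions about what a given server supports.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base_url, cql_query, start=1, maximum=10, schema="dc"):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,         # CQL, e.g. an index, a relation, and a term
        "startRecord": start,       # 1-based position of the first record
        "maximumRecords": maximum,  # page size requested from the server
        "recordSchema": schema,     # assumed short name for Dublin Core
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server address, used only to show the URL shape.
url = sru_search_url("http://sru.example.org/etd", 'dc.title = "digital library"')
decoded = parse_qs(urlsplit(url).query)
```

Decoding the query string again, as in the last line, is a quick way to confirm that the CQL query survives URL encoding unchanged.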
ETD Union Search Mirror Site in China (CALIS)
(http://ndltd.calis.edu.cn – popular site!)
VTLS
• VTLS offers its free VALET system to manage ETDs at institutions, building upon Fedora, as well as VTLS software.
• VTLS runs a service provider atop the Union Catalog. Its interface supports multilingual access to the metadata.
VTLS Content Languages
The VTLS service has data in 6 different languages: English, German, Greek, Korean, Portuguese, and Spanish.
Examples follow
Language = German; hits = 137
Full record display
Expansion of Full-text Services
• Running since Sept 2005: Scirus
• In beta test: Google Scholar
• Next: Microsoft ?
• Challenges:
– Broadening the coverage since OAI use has not spread as widely as we would like
– Understanding use, throughout life cycle
– Data and DL services quality problems
– Inconsistency in way to get from metadata to the full-text file(s)
– Cross-language information retrieval
Google Co-op, Custom Search Engines
Preservation Information
• Henry M. Gladney, Ph.D. (408)867-5454 http://home.pacbell.net/hgladney
• A book, "Preserving Digital Information" is to be available approximately February 2007. See
• http://www.springer.com/east/home?SGWID=5-102-22-173677919-0&changeHeader=true&SHORTCUT=www.springer.com/3-540-37886-3
• http://home.pacbell.net/hgladney/PDI_front.pdf
LOCKSS for ETDs
• Lots of copies keep stuff safe
• Stanford (Vicky Reich)
• Initial content: journals
• Experiments, studies of Int’l ETD service
– Humboldt, PUC Rio, U. Cape Town, VT, …
• Production service?
User Expertise Years
[Figure: chart “Users' Expertise in Years”; axes: Years, Users (scale 0 to 200)]
Date Stamp of ETD
[Figure: chart of ETD date stamps; axis: Year (scale 0 to 60,000)]
Supply-Demand Comparison
[Figure: chart “ETD Resources and User Demands (Number of Queries) in NDLTD”; series: ETDs, Demands; x-axis: Academic Categories 1 to 8; y-axis: 0% to 50%]
Academic Categories:
1 Architecture and Design
2 Law
3 Medicine, Nursing and Veterinary Medicine
4 Arts and Science
5 Engineering and Applied Science
6 Business and Commerce
7 Education
8 Others (unclassifiable)
NDLTD cross-language problem
Language      Number
English       123,696
Portuguese     11,434
German          4,131
French          3,868
Spanish         1,561
Chinese         1,463
Catalan           804
Others         19,962 (most unclassified)
Total         166,919 (summer ’05)
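A short calculation makes the scale of the problem easier to see. The sketch below uses only the counts in the table above; the variable names are made up for this illustration.

```python
# Record counts per language, copied from the table (summer '05).
counts = {
    "English": 123696,
    "Portuguese": 11434,
    "German": 4131,
    "French": 3868,
    "Spanish": 1561,
    "Chinese": 1463,
    "Catalan": 804,
    "Others": 19962,  # most of these are unclassified
}

total = sum(counts.values())

# Percentage share of each language, rounded to one decimal place.
shares = {lang: round(100 * n / total, 1) for lang, n in counts.items()}

# Records identified as being in a language other than English.
non_english_classified = total - counts["English"] - counts["Others"]
non_english_share = round(100 * non_english_classified / total, 1)
```

The counts sum to the stated total of 166,919. English accounts for about 74.1% of the records, and the 23,261 records identified as non-English make up about 13.9%, which is the material an English-only search interface serves poorly.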
Example concept map
Ryan Richardson solution to NDLTD cross-language problem
The Concept Map: From learning tool to cross-language knowledge discovery tool
4. More advanced techniques for concept map creation
• ETD of Hussein Suleman’s ETD Chapter 5
• Concept map with 65 nodes
• Same concept map pruned down to 15 nodes
• Detail of ToC maps for ETD by Suleman
• Detail of Relex map showing relationship between ‘OAI’ and ‘bandwidth’
• Detail of ToC maps showing 1st sentence of section 2.8.
Summary and Conclusions
(Editorial: ETDs & NDLTD progressing well!)
• Summary Words and Phrases
• Crossing the Chasm
• Steps
• Spirit of NDLTD
• Selected Links
Conference Summary Words - 1
accessibility aggregation alert
annotate archive arts
attitudes authentication authoring
authorization automation browse
catalog collaboration community
components context conversion
customer decentralized digitize
discourse discovery dissemination
DSpace federated Fedora
global grid economic
harvesting ingest innovation
institutional integrity interaction
Conference Summary Words - 2
interchange interoperability knowledge
LOCKSS management metadata
national OCR organization
partnership PDF (/A) podcasting
portal preservation provider
regional repository retrieval
scalability Scirus search
server service sharing
standardization strategic student
summarization sustainable testimonial
toolkit training tutorial
Unicode usable VALET
XML XSLT workflow
Conference Summary Phrases - 1
alumni development always on
business model concept map
content management copyright compliance
cost effective Creative Commons
creative material cross language
dark archive developing country
digital library digital rights management
digital signature disruptive technology
document model Dublin Core
Conference Summary Phrases - 2
e-knowledge e-publishing
e-research e-science
full text Google Scholar
institutional repository LDAP server
learning object mandatory deposit
Million Book Project national initiative
Net Gen OAI PMH
online digital studio open access
Open Archives Initiative open source
Conference Summary Phrases - 3
persistent identifiers postgraduate research
public domain restricted access
retrospective conversion scholarly communication
server log service oriented architecture
social network stepping stone
subject gateway survey data
union catalog unlocking IP
user centered value added
voluntary participation walking the talk
web based web services
Journal?
• Joseph Moxley [[email protected]]
• JET / JED: Journal of Electronic Theses and Dissertations
• JODLAR: Journal of Digital Libraries and Repositories
• Publisher: ?
• Columns / Sections
– Project Case Studies
– Standards, Technologies, Best Practices
– Software, Systems, Services
– Statistics, Surveys, Analytical Studies
Steps
• Join NDLTD
• Launch initiative, dialog, encourage
• Pilot -> requirement
• OAI data provider
• Log, survey, analyze, improve
• Attend ETD xx
• Help other sites
• Serve on NDLTD committees
• Extend services: preservation, inst. rep., …
Spirit of NDLTD
• Help make a better (smaller) world
• Win-win-win (everyone can benefit)
• Have fun helping others
• Helpers/teachers learn more than those they work with
• Build on standards
• ETDs are preservable, popular, expressive, “better”
• Doable, feasible, learnable, affordable, sharable
• Please join/support NDLTD!
Selected Links - http://fox.cs.vt.edu
• NDLTD (electronic theses and dissertations worldwide)
– www.ndltd.org and etdguide.org
– http://fox.cs.vt.edu/etd-search.htm
• DL curriculum - http://curric.dlib.vt.edu/wiki, http://curric.dlib.vt.edu/DLcurric.html
• Virginia Tech Digital Library Research Laboratory (DLRL, www.dlib.vt.edu)