Digital Library Technologies at the Grainger Library

William H. Mischo, Timothy W. Cole, Tom Habing

w-mischo@uiuc.edu

Grainger Engineering Library Information Center University of Illinois at Urbana-Champaign

National Digital Archives Project Office of Taiwan

March 25, 2002

Outline

• IR Tools and Full-Text

• Distributed Information Environment.

• Illinois Projects.

• XML Technologies.

• Metadata Technologies.

• DOIs, Linking, Local Resolver

• OAI

• Portals, Simultaneous Search, Linking

• Issues & Trends.

Overview

• We now have the tools to pursue the grand challenges of information retrieval:

– Standard retrieval environment (Web) and interface/client (Web browser).

– Standardized search/retrieval mechanisms (HTTP Post/Get, SQL, Z39.50).

– Standard language for describing and transforming content and metadata (XML, XSLT, DC, DCQ, RDF, Schemas).

– Standard transport mechanisms to connect heterogeneous content (HTTP, SOAP, OAI).

• Candidate set of ‘best practices’ for IR.

The Digital Library

• ‘Digital’, ‘Virtual’, ‘Electronic’ Library as network-based library without regard to place and time.

• Tendency to apply term to collections and resources.

• Digital Collections vs. Digital Library.

• Emphasis on the integration of collections and services (NSDL).

• Application of standards and protocols is important.

Full-Text Technologies

• Continuum of Web-Enabled technologies -- all presently being utilized.

• Evolving technologies and standards.

• Role and history of markup.

• XML: its role and importance.

• The Smart Document.

Scholarly Communication Overview

• E-Resources are Web-based and publisher-centric.

• Growth of Heterogeneous Distributed Repositories.

• Value-added services and ‘branding’ of journals.

• Prestige of Journals and Publishers

• Reciprocal linking relationships between publishers.

• Cooperation on linking standards (DOI, CrossRef).

• Alternative publishing models - Academia, Preprint Servers, disintermediation.

Distributed Information Model

• Diverse information environment in which we operate.

• Multiple elements, relationships and nodes.

• Need for gateway, interface, and navigation tools.

• Need for document representation, transmission, linking, and retrieval middleware tools and standards.

• Role of A & I Services.

Distributed Repository Issues

• Integration of discrete publisher repositories, locally loaded full-text, local and remote A & I services, OPAC, Web resources, and local data.

• Issues for user access:

– Need to identify appropriate publisher repository, but presently interfaces are different and full-text and controlled vocabulary searching often not offered.

– A & Is: not full-text but offer controlled vocabulary, no links to full-text repositories.

Distributed Repository - Needs

• Integration of discrete publisher repositories, locally loaded full-text, local and remote A & I services, OPAC, Web resources, and local data.

• Support simultaneous searching of A & I Services, Distributed Repositories, OPACs, Web search engines, local files. Integrate TOC, full-text.

• Remote Reference 24 X 7.

• Metadata harvesting, archiving.

• Local Resolver services for locally loaded or Aggregator Resources.

Illinois Testbed Project

• Funded under DLI-I by NSF, DARPA, and NASA, 1994--1998. Awards made to 6 universities.

• Large-scale Testbed, Distributed Repository models, evaluation, Web software.

• Funded under CNRI D-Lib Test Suite Program, 1998--2001.

• Collaborating Partners Program. AIP, APS, ASCE, IEE, NRL, ASM, ACM, NTT Learning Systems, Elsevier.

• All XML Journal -- AIP, APS, ACM.

Illinois Testbed

• American Institute of Physics--APL, JAP, RSI

– 18,000+ articles, 1995--.

• American Physical Society--PRL

– 14,000+ articles, 1995--, weekly updates.

• ASCE Journals (25 titles)

– 10,000+ articles, 1995--.

• IEE Proceedings and Electronics Letters

– 8,500+ articles, 1993--.

• IEEE Computer Society.

• ASM (American Society for Materials) Handbook.

• ACM (Association for Computing Machinery) Transactions.

• Elsevier Science.

Project Issues

• Evolution of the Document.

• Distributed information environment.

• Use of Metalanguages & Transformations (SGML, XML).

• Searching over full-text of journals vs. document surrogates in A & I format.

• Rendering and styling (SGML, XML, MathML).

• Dynamic metadata for normalization, linking.

• Breadth and depth of collections.

• User needs.

Accomplishments

• Process & retrieve from multiple publishers & heterogeneous DTDs.

• Metadata specification that uses RDF, Dublin Core (DCQ, DC Agents) Schemas, IDLI Namespace.

• Cross-repository searching (Testbed & D-LIB Test Suite). Full-Text and Metadata.

• SGML to XML Conversion.

• XSLT, CSS, for transformation & rendering, including Mathematics.

Accomplishments (2)

• Linking: Forward/Backward within Testbed, from/to A & I Services.

• Conversion of ISO 12083 math markup to MathML.

• Enhanced Web retrieval mechanisms: Author Word Wheels, Co-Occurrence Matrices.

• Detailed user transaction logs, gathered at the search argument level, with identification of characteristics of each user search session.

• Local Link Server for DOIs, Context-Sensitive linking.

Accomplishments (3)

• CSS/DHTML Math rendering techniques, TechExplorer integration. Two international math conferences.

• Simultaneous search within DeLiver of Testbed repositories, A & Is, NCSTRL.

• Local Link Server and Appropriate Copy Issues.

• Simultaneous search of A & Is, OPAC, Google, Local resources with integrated reference linking using OpenURL and DOIs from A & Is.

• Open Archives Initiative (OAI).

Ongoing Investigations (1)

• Support simultaneous searching of A & I Services, Distributed Repositories, enhanced navigation, expanded gateway functions.

• Interoperability models, e.g., Metadata harvesting vs. Federated (Broadcast).

• OAI Provider and Harvesting software. OAI EAD and Cultural Heritage collection and retrieval system.

• HTTP harvesting, Spider technology (gathering).

Ongoing Investigations (2)

• Archiving.

• Local Link Server with context-sensitive resources.

• Reference Linking integration built on OpenURL and DOI.

• NSDL presence.

• Reference Assistant software with simultaneous search, point-of-contact assistance, and remote reference capability.

XML (eXtensible Markup Language)

• Like SGML, a Data Description Language (Metalanguage).

• Subset/version of SGML.

• Allows fine-granularity markup of content and structure. Author can create their own elements (extensible).

• Tags define the structure of document not presentation format.

• Validated vs. “well-formed” - separation of authoring process from representation & presentation.

• Either validated in DTD/Schema or well-formed.

• Compatible with relational DBs.
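The distinction between a well-formed and a validated document can be illustrated with a minimal sketch. It assumes Python's standard-library ElementTree parser, which checks well-formedness only; validation against a DTD or Schema would need a separate validating parser. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML (well-formed); no DTD/Schema validation is done."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Properly nested tags are well-formed; overlapping tags are not.
good = "<au><fn>Tom</fn><sn>Habing</sn></au>"
bad = "<au><fn>Tom</au></fn>"
print(is_well_formed(good), is_well_formed(bad))
```

A well-formed document may still be invalid for a given DTD, for example if it uses an element the DTD does not declare.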

XML and Publishers

• Seybold Seminars Publishing 2000, Boston, February 2000.

• Tim Gill of Quark, “…the use of XML could lead to a drop in the cost of Web publishing by 30% to 50% and a significant reduction in the time it takes to produce sites.”

• Gill: “I don’t believe that there is any innovation in print that is going to save us even 10% in costs.”

• Issues and Challenges remain.

• Publishers are looking at the all-XML journal.

XML Features

• The milestones in document description and transmission: ASCII, TCP/IP, HTTP and HTML, XML. Web Programmability.

• DTD not required with XML. Needed if the document uses internal entities.

• Use of Document Object Model (DOM).

• Technology approach from Web developer’s standpoint: XML data, CSS presentation layer, XSLT to transform the structure (‘view’) of the data/document.

Role of XML

• “If you ask 20 people in the industry, ‘what is XML?’ You’ll get 20 different answers” -- Dale Fuller, CEO, Inprise Corporation.

• Vendor-Neutral, platform-independent structured information standard.

• Document representation and interchange Standard.

• Applications can externalize their data/metadata as XML.

• Issues with full-text representation: PDF, XML/HTML. Value in indexing, retrieval.

XML Parser APIs: Tree-Based and Event-Based

• DOM (Document Object Model).

– DOM Level 1 and Level 2 W3C recommendation. Widely implemented, Tree-Based. Hierarchy of nodes. Loads entire document into memory. Level 2 adds namespace support, traversal, stylesheets, events, triggers. Level 3 working draft. DOM HTML candidate. Parsers allow developers to iterate through documents, change document content.

• SAX (Simple API for XML).

– Open-source, XML-DEV, not W3C. Event-based, fires events as it reads document, need not load entire document into memory. Good for single-pass processing. Xerces, XML4C, Sun Project X (Crimson).
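The two API styles can be compared with a small sketch that pulls the same author names out of one document, first through a DOM tree and then through SAX events. It assumes Python's standard-library xml.dom.minidom and xml.sax modules rather than the parsers named on the slide; the sample document and the function names are illustrative only.

```python
import xml.sax
from xml.dom import minidom

DOC = "<ag><au>Tom</au><au>Tim</au><au>Bob</au></ag>"


def authors_dom(text):
    """Tree-based: build the whole document in memory, then walk the nodes."""
    dom = minidom.parseString(text)
    return [au.firstChild.data for au in dom.getElementsByTagName("au")]


class AuthorHandler(xml.sax.ContentHandler):
    """Event-based: collect text while inside an <au> element."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self._in_au = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "au":
            self._in_au = True
            self._buf = []

    def characters(self, content):
        if self._in_au:
            self._buf.append(content)

    def endElement(self, name):
        if name == "au":
            self.authors.append("".join(self._buf))
            self._in_au = False


def authors_sax(text):
    """Single pass over the input; no tree is kept after parsing."""
    handler = AuthorHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.authors


print(authors_dom(DOC))
print(authors_sax(DOC))
```

The DOM version is shorter and allows the tree to be modified afterwards; the SAX version keeps only the state the handler chooses to store, which is why it suits large documents read once.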

XML Linking

• XML Base http://www.w3.org/TR/xmlbase

– Permits use of relative URI path prefixes. Can then shorten references.

• XLink http://www.w3.org/TR/xlink/

– Method for specifying navigational links. Allows enforcement of specific path order through links. xlink:type="simple" corresponds to HTML <a> or <img> tags.

• XInclude http://www.w3.org/TR/xinclude

– Copies entire XML documents or selected portions into current document. Candidate recommendation. Uses XPath and XPointer to specify document elements to include.

• XPointer http://www.w3.org/TR/xptr

– Uses XPath to identify portion of a document. Permits string searches and range specifiers.

XML Schema and Structure

• DTD

– Original schema representation, defines structural rules for a class of XML documents.

• XML Schema http://www.w3.org/XML/Schema

– Also sets out standardized structure for class of XML documents. Is coded in XML, can be parsed and edited with standard software. Two separate parts: structures and datatypes.

• Namespaces http://www.w3.org/TR/REC-xml-names/

– Allows developers to qualify element and attribute names with unique URIs, avoids recognition errors.
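How namespaces avoid recognition errors can be shown with a short sketch: two elements share the local name "title" but are told apart by their namespace URIs. It assumes Python's standard-library ElementTree; the Dublin Core element-set URI is the published one, while the "pub" namespace and the sample record are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for lookups; "pub" is a hypothetical namespace.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "pub": "http://example.org/publisher",
}

DOC = """\
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:pub="http://example.org/publisher">
  <dc:title>Digital Library Technologies</dc:title>
  <pub:title>Dr.</pub:title>
</record>
"""

root = ET.fromstring(DOC)

# Same local name, different namespaces: each lookup finds only its own element.
dc_title = root.find("dc:title", NS).text
pub_title = root.find("pub:title", NS).text

# Internally the parser stores the qualified name as {URI}localname.
qualified_tags = [child.tag for child in root]
print(dc_title, pub_title, qualified_tags)
```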

XML Implementations

• XHTML, SVG (Scalable Vector Graphics), XForms (similar to HTML forms).

• MathML http://www.w3.org/Math/

– Markup language for describing mathematics, both presentation and content.

• RDF http://www.w3.org/RDF/

– Resource Description Framework. Defines structure for encoding object metadata. Facilitates metadata interchange & harvesting. RDF Schemas.

• Others: DocBook, XML ISO12083, Open eBook, WAP/WML.

Searching and Transformation

• XPath http://www.w3.org/TR/xpath

– Defines pattern-matching syntax used by XSLT and XPointer. Method for selecting data in a document. MSXML 3.0 supports XPath. Supersedes XPatterns. Example: /descendant-or-self::node()/child::name

• XSL

– Includes transformative and FO formatting objects. FO will replace CSS for document formatting.

• XSLT http://www.w3.org/TR/xslt

– Mechanism for encoding style rules, ensures consistent rendering of XML documents of the same type.

• XML Query http://www.w3.org/XML/Query

– Response to limitations of XPath. Would bring database-style queries to XML documents.
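The XPath expression on this slide, /descendant-or-self::node()/child::name, is the unabbreviated form of //name: it selects every name element at any depth. A minimal sketch follows, assuming Python's standard-library ElementTree, which supports only a limited XPath subset (the abbreviated ".//name" form is used here); the sample document is invented.

```python
import xml.etree.ElementTree as ET

DOC = """\
<article>
  <front>
    <author><name>Tom Habing</name></author>
    <author><name>Tim Cole</name></author>
  </front>
  <body>
    <cite><name>Bill Mischo</name></cite>
  </body>
</article>
"""

root = ET.fromstring(DOC)

# Every <name> element anywhere below the root, in document order.
all_names = [e.text for e in root.findall(".//name")]

# A fully specified path selects only the author names in <front>.
author_names = [e.text for e in root.findall("./front/author/name")]

print(all_names)
print(author_names)
```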

Remote Object Access

• SOAP (Simple Object Access Protocol)

– Microsoft, IBM, Sun. Allows applications to invoke objects or functions residing on remote servers. Creates request block in XML.

• XML-RPC http://www.xmlrpc.com/

– Remote procedure calling using HTTP as the transport and XML as the encoding. Open, but not standard protocol; widely adopted.

• Web Services.
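The "XML as the encoding" part of XML-RPC can be seen without any network traffic. The sketch below assumes Python's standard-library xmlrpc.client module; the method name resolver.lookup and its single argument are hypothetical, chosen only to show what a request body looks like before it would be sent over HTTP POST.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method as an XML-RPC request body.
request_xml = xmlrpc.client.dumps(("10.1063/333",), methodname="resolver.lookup")
print(request_xml)

# Decoding the same body recovers the arguments and the method name,
# which is what a server would do on receipt.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The printed request is a methodCall document containing a methodName element and one param per argument.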

Remote Object Access

• Web Services:

– Based on XML, SOAP, UDDI (Universal Description, Discovery, and Integration), and WSDL (Web Services Description Language). Applications are assembled on the fly in XML, exposed to the world, and accessed via the Web from different devices.

– Supported by Microsoft .net, IBM WebSphere, SUN ONE.

XML, XSLT, and CSS

• Use XML full-text articles as ordered hierarchy of content objects.

• Generate item-level metadata in XML, using RDF and Dublin Core syntax and semantics.

• XSLT and CSS used to present metadata and articles in either XML or HTML format depending on Browser.

• Mathematics rendering using MathML tools (conversion from ISO 12083 to MathML).

• Real-time transformation between XML and HTML using XSLT (scalability issues).

XSLT Where Should It Happen

• Client-side

– IE5+ only

    • Not Netscape 6 or Mozilla (yet)

    • IE5 not yet fully compliant w/ XSLT and XPath standard

– Can reduce the load on your servers

– But performance on low-end clients can be BAD

• Server-side

– Performance could be a problem on busy servers, serving large, complex documents

– More control & flexibility over the conversion (metamerge)

• Offline Preconversion

– Best performance

– Not best for dynamic documents (metamerge)

Converting XML to HTML (XSLT)

• Simple one-to-one conversions: <sect> becomes <span class="sect">

– span.sect {display:block;margin-left:2em}

• Attribute based conversions: <emph type="1"> becomes <span class="emph_1">

– span.emph_1 {font-style:italic}

• Generated text, such as punctuation: <ag><au>Tom</au><au>Tim</au><au>Bob</au></ag> becomes Tom, Tim, Bob.

• Rearranged children: <au><sn>Habing</sn><fn>Tom</fn></au> becomes Tom Habing
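The four conversion patterns above can be sketched in a few lines. This is not XSLT: Python's standard library has no XSLT processor, so the sketch swaps in ElementTree and plain functions to mimic what the stylesheet rules do. The function names are hypothetical, and the trailing period in "Tom, Tim, Bob." is treated as sentence punctuation and not generated.

```python
import xml.etree.ElementTree as ET


def to_span(elem):
    """One-to-one and attribute-based rules: the tag (plus any type attribute)
    becomes the CSS class of an HTML <span>."""
    css_class = elem.tag
    if "type" in elem.attrib:
        css_class += "_" + elem.get("type")
    span = ET.Element("span", {"class": css_class})
    span.text = elem.text
    return ET.tostring(span, encoding="unicode")


def author_name(au):
    """Rearranged children: surname-first markup is emitted as 'first last'."""
    fn, sn = au.findtext("fn"), au.findtext("sn")
    if fn and sn:
        return f"{fn} {sn}"
    return (au.text or "").strip()


def author_list(ag):
    """Generated text: insert the commas that the source markup leaves implicit."""
    return ", ".join(author_name(au) for au in ag.findall("au"))


print(to_span(ET.fromstring("<sect>Introduction</sect>")))
print(to_span(ET.fromstring('<emph type="1">note</emph>')))
print(author_list(ET.fromstring("<ag><au>Tom</au><au>Tim</au><au>Bob</au></ag>")))
print(author_name(ET.fromstring("<au><sn>Habing</sn><fn>Tom</fn></au>")))
```

In an XSLT stylesheet each function would correspond roughly to one template rule matched on the element name.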

Converting XML to HTML (cont.)

• Some elements are converted into HTML elements other than <span> or <div>

– Figures are converted to <img src="…"> tags.

– Internal links with ID and IDREF attributes are usually converted into HTML anchor tags.

– Table elements are converted into corresponding HTML <table>, <tr>, or <td> tags.

• ‘Real’ DTDs require some fairly complex processing.

– So far XSLT seems to be able to handle nearly every case we have come across

– However, some cases have required JScript extensions to XSLT

Schemas vs. DTDs

• Both are systems of representing a data model that defines the data’s elements and attributes, and the relationship among elements.

• Schema addresses limitations of DTDs and the increasingly data-oriented role of XML.

• Initial Arbortext, DataChannel, Inso, Microsoft, and Univ of Edinburgh proposal: XML-Data.

• W3C XML Schema Working Group: two documents: XML structures and datatypes.

Schema Justification

• Description of document type’s structure should be in an XML document instead of written in special syntax (DTD).

• Schemas are in XML: easier to edit and process using standard XML DOM manipulation tools.

• DTD notation doesn’t allow schema designers the power to impose strong data typing -- for example, the ability to say that a certain element type must always have a positive integer value, that it may not be empty, or that it must be one of a list of possible choices.

Metadata and Linking Standards

• Digital Object Identifier (DOI) and Persistent Object Identifiers.

• OpenURL and Value-Added Service Components (SFX).

• Open Archives Initiative (OAI), Dublin Core and Qualifiers.

• Local Resolver Servers.

Metadata in DLI

• To normalize & augment presentation.

• To normalize searching (e.g. Names).

• To store dynamic links.

• Types of links:

– Articles referenced by item (Backward).

– Articles that reference the item (Forward).

– A & I Records for references and items.

– Other relationships (TOC, Other items by Author, Collaborative Data).

– Known item and presumptive linking.

DLI Metadata Schema

• Maintained as XML files using RDF and Qualified Dublin Core syntax and semantics.

• Example:

<dcq:issued> <!-- subproperty/refinement of DC Date -->
  <dcq:W3CDTF> <!-- DC Date encoding -->
    <rdf:value>1999-09</rdf:value>
  </dcq:W3CDTF>
</dcq:issued>

• Application of XML DOM for processing at DC or idli level.
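Processing such a record at the DC level can be sketched as follows. The slide's fragment does not show its namespace declarations, so this sketch wraps it in an rdf:RDF element and supplies them: the RDF URI is the standard one, but the dcq URI is a placeholder assumption, not the project's actual namespace. ElementTree is used here in place of a full DOM, and the function name issued_date is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcq": "http://example.org/dcq#",  # placeholder URI, assumed for illustration
}

RECORD = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcq="http://example.org/dcq#">
  <rdf:Description>
    <dcq:issued>
      <dcq:W3CDTF>
        <rdf:value>1999-09</rdf:value>
      </dcq:W3CDTF>
    </dcq:issued>
  </rdf:Description>
</rdf:RDF>
"""


def issued_date(xml_text):
    """Return the W3CDTF-encoded issue date, or None if the record has none."""
    root = ET.fromstring(xml_text)
    node = root.find(".//dcq:issued/dcq:W3CDTF/rdf:value", NS)
    return node.text if node is not None else None


print(issued_date(RECORD))
```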

New DLI Metadata Schema

<dc:creator>
  <rdf:Seq>
    <rdf:li>
      <dca:Person rdf:ID="AUTHOR-1">
        <dca:agentname>
          <dca:FNF>
            <rdf:value>L'Ecuyer, Pierre</rdf:value>
          </dca:FNF>
        </dca:agentname>
        <dca:agentaffiliation>Université de Montréal Département...</dca:agentaffiliation>
        <dca:agentidentifier rdf:resource="mailto:[email protected]" />
      </dca:Person>
    </rdf:li>
    …..
  </rdf:Seq>
</dc:creator>

Digital Object Identifier (DOI)

• DOI is both a unique identifier of a piece of digital content AND a system to access that content digitally. Persistent object identifier.

• ‘The ISBN for the 21st Century’ -- Norman Paskin.

• DOI system has two main parts (the identifier and a directory system) and a third logical component, a database.

• Developed by AAP (Association of American Publishers), now managed by International DOI Foundation.

DOI Construction

• First real open standard for content identification.

• DOI is a number that identifies a digital object:

– 10.1063/S000369519903216

• 10 Registration Agency Prefix

• 1063 Publisher Prefix

• S000369519903216 Suffix (Publisher-assigned ID)

• Suffix can be SICI or PII.

• The DOI, together with the URL pointing to the digital object, is registered with the International DOI Foundation, e.g.:

– 10.1063/333 | http://www.pubsite.org/apr99/artl1.pdf
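The DOI structure described on this slide can be taken apart mechanically. The following is a minimal sketch in Python; the function name split_doi and the dictionary keys are hypothetical labels that follow the slide's terminology, and the only syntax assumed is "prefix/suffix" with a dot inside the prefix.

```python
def split_doi(doi):
    """Split a DOI into the three parts named on the slide."""
    prefix, slash, suffix = doi.partition("/")  # the suffix may itself contain "/"
    if not slash or not suffix:
        raise ValueError(f"not a DOI (missing suffix): {doi!r}")
    agency, dot, publisher = prefix.partition(".")
    if not dot or not publisher:
        raise ValueError(f"not a DOI (malformed prefix): {doi!r}")
    return {
        "agency_prefix": agency,        # "10" in the slide's example
        "publisher_prefix": publisher,  # "1063"
        "suffix": suffix,               # publisher-assigned ID
    }


print(split_doi("10.1063/S000369519903216"))
```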

Using a DOI

• DOIs are resolved using the Handle System technology from CNRI (Corporation for National Research Initiatives).

• Retrieval of object is two step process: link is sent to central directory where current Web address is stored, location is sent back to browser with special message to redirect to address, e.g.:

– dx.doi.org/10.1063/333 redirects to www.pubsite.org/apr99/artl1.pdf
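The two-step retrieval can be simulated offline with a toy directory. This sketch is not the Handle System: the in-memory DIRECTORY dictionary stands in for the central directory, the resolve function is hypothetical, and the HTTP status codes (302 for the redirect, 404 for an unknown DOI) are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Stand-in for the central directory: DOI -> current Web address.
DIRECTORY = {
    "10.1063/333": "http://www.pubsite.org/apr99/artl1.pdf",
}


def resolve(proxy_url, directory=DIRECTORY):
    """Step 1: look the DOI up in the directory.
    Step 2: answer with a redirect to the registered location."""
    doi = urlsplit(proxy_url).path.lstrip("/")
    location = directory.get(doi)
    if location is None:
        return 404, None
    return 302, location


print(resolve("http://dx.doi.org/10.1063/333"))
```

Because only the directory entry changes when a publisher moves an article, links built on the DOI stay valid, which is the persistence property the slides emphasize.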

Reference Linking

• Alternatives to DOI:

– PubMed/PubRef (National Library of Medicine)

– PubSCIENCE (DOE/OSTI)

– Proprietary Link Managers (AIP, APS)

• CrossRef Project: major Sci-Tech professional societies and commercial publishers.

• System design calls for one URL for each DOI; underlying technology can handle multiple URLs however.

Local Resolver

• Issue: Directing users to locally held or licensed version of Digital Object (locally loaded or from Aggregator).

• Harvard problem, Appropriate Copy problem.

• Additional desire to direct users to local value-added services: local print holdings, interlibrary borrowing, other articles in A & I Services.

Local Resolver

• Local Resolver Servers

– OpenURL Protocol, CookiePusher vs. IP Addresses.

• Demonstration Project at Illinois, OhioLink (Ex Libris SFX), Los Alamos.

– Localizing Name Resolution for AIP, ASCE, Elsevier, other publishers.

– Use of CrossRef Metadata Database for identifying Publisher from DOI and linking to Local Copy, A & I Services, Library Assistance.
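An OpenURL is an ordinary HTTP query string that carries citation metadata to a local resolver. The sketch below builds one with Python's standard library. The resolver address is hypothetical, and the parameter names (id=doi:..., genre, issn, volume, spage) follow the early OpenURL draft syntax as an assumption; this is not a description of the Illinois server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(base_url, doi=None, **metadata):
    """Return an OpenURL pointing at a local resolver (hypothetical base_url)."""
    params = {}
    if doi:
        params["id"] = "doi:" + doi
    for key in sorted(metadata):  # stable ordering for readability
        params[key] = metadata[key]
    return base_url + "?" + urlencode(params)


url = build_openurl(
    "http://resolver.example.edu/openurl",  # hypothetical local link server
    doi="10.1063/333",
    genre="article",
    issn="0003-6951",
    volume="74",
    spage="100",
)
print(url)

# The resolver would parse the query back into metadata fields.
fields = parse_qs(urlsplit(url).query)
print(fields)
```

Given these fields, a local resolver could choose between the publisher copy, a locally loaded copy, or library services, which is the appropriate-copy decision discussed above.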

[Diagram labels: Client (Web Browser) with cookie on client; DOI Proxy (dx.doi.org/10.1063/1234) and Handle Server; Illinois Local Link Server (OpenURL-aware); CrossRef Metadata Database; UIUC Metadata Registry; publishers AIP, IEE, Elsevier; DOI, Metadata, and OpenURL flows; Local Value-Added services; Nosfx=y.]

Grainger Search Aid

• Development of Portal and Gateway sites featuring:

– robust search/navigation;

– ability to link everywhere from anywhere.

• Simultaneous search of heterogeneous resources to assist in database selection.

• Article level and e-journal Web site access to full-text repositories.

• Utilize OpenURL and DOI.

Open Archives Initiative (OAI)

• Released version 1.0 of metadata harvesting protocols. Frozen through second quarter 2001.

• Mechanism for data providers to expose their metadata through an HTTP protocol and a mechanism for harvesting records containing metadata from repositories.

• Roots in e-print archives.

• Lightweight, low-barrier. Easy to implement Web server to handle OAI protocol requests; need to develop procedures to access and extract your metadata.
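The low barrier comes from the fact that an OAI harvesting request is just an HTTP URL with a verb and a few arguments. A minimal sketch of how a harvester might form such requests follows; the repository address is hypothetical, the helper oai_request is illustrative, and ListRecords with metadataPrefix=oai_dc is used as the example request for Dublin Core records.

```python
from urllib.parse import urlencode


def oai_request(base_url, verb, **args):
    """Build the URL for one OAI protocol request (nothing is sent)."""
    params = {"verb": verb}
    params.update(args)
    return base_url + "?" + urlencode(params)


BASE = "http://repository.example.edu/oai"  # hypothetical data provider

# Ask the repository to describe itself, then ask for its Dublin Core records.
identify_url = oai_request(BASE, "Identify")
records_url = oai_request(BASE, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(records_url)
```

The response to each request is an XML document, so the harvesting side can reuse the same XML parsing tools discussed earlier.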

OAI Continued

• Requires repositories to support the Dublin Core elements.

• Allows communities to expose metadata in other formats as long as records are structured as XML data with corresponding XML schema.

• Registration mechanism provides publicly accessible list of OAI conformants.

• Alpha testing phase completed.

Publishing Trends

• Publishers will continue to add value to online journal articles.

• Digital version will become version of record.

• Virtual journals (both publisher-based and cross-publisher) will become common.

• Next-generation knowledge environments will evolve. Multimedia, data exposed, live equations with in-place calculations.

Publishing Trends (Continued)

• Personalized services will be available -- agent technology, alerting services.

• Different economic and subscription models will be introduced.

• Deconstruction of Journal (Bob Kelly, APS); article at a time publishing.

• Journal branding or perhaps publisher branding.

• Academia issues: publishing, tenure.

Closing Issues

• Role of Authors, Academic Institutions, Libraries, Publishers, Abstracting & Indexing Services.

• Disintermediation may affect both Libraries and Publishers.

• Information as Function not Place.

• Provide a ‘Digital Library’ out of digital collections.

• Role of XML technology.

• Service mechanisms: processing & archiving, search and discovery, presentation, linking.