The Future of Access to Articles

Steve Toub

Friday the 13th, 2007

A&I Will Die

• The rise of full-text content providers and the introduction of non-traditional aggregators (e.g., Google Scholar, Microsoft Live Search Academic) reduce the value of traditional aggregators and indexes.

• The economic model for sustaining the terminology services (thesaurus creation and maintenance, indexing) that traditional A&I vendors have provided is not yet clear.

MetaSearch Sucks

• In an environment where patrons expect instant answers from Google, a platform where folks wait 32 seconds to get back “last-in, first-out” results, 10 at a time, is a transitional solution, at best.

• Where possible, it’s always better to aggregate than to federate
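A minimal sketch of the difference, assuming two made-up sources with simulated response times (the source names, records, and timings are hypothetical; only the 32-second figure echoes the slide). Federating queries every source at search time, so the patron waits on the slowest one; aggregating loads everything into one index ahead of time and answers locally.

```python
from collections import defaultdict

# Hypothetical sources: name -> (simulated response time in seconds, records)
SOURCES = {
    "vendor_a": (2.0, ["Metadata quality in A&I databases", "Open access trends"]),
    "vendor_b": (32.0, ["Federated search latency", "Metadata harvesting at scale"]),
}

def federated_search(term):
    """Query every source at search time; the patron waits for the slowest one."""
    hits, wait = [], 0.0
    for name, (seconds, records) in SOURCES.items():
        # Even with parallel requests, the full result set blocks on the slowest source.
        wait = max(wait, seconds)
        hits += [(name, r) for r in records if term.lower() in r.lower()]
    return hits, wait

def build_aggregate_index():
    """Load all records once, ahead of time, into a single inverted index."""
    index = defaultdict(list)
    for name, (_, records) in SOURCES.items():
        for record in records:
            for word in set(record.lower().split()):
                index[word].append((name, record))
    return index

def aggregated_search(index, term):
    """Answer from the local index; no remote source is contacted at search time."""
    return index.get(term.lower(), [])

if __name__ == "__main__":
    hits, wait = federated_search("metadata")
    print(f"federated: {len(hits)} hits after waiting {wait} simulated seconds")
    index = build_aggregate_index()
    print(f"aggregated: {len(aggregated_search(index, 'metadata'))} hits from the local index")
```

The aggregated index also makes consistent relevance ranking possible, since all records sit in one place instead of arriving in whatever order the sources respond.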

BSTF & Aggregating Articles

• In early Oct 2006, SOPAG reported to BSTF that “the ULs did look with favor on the talks with the Internet Search Engines to address the ‘articles’ side of next generation discovery.”

• Initial discussions with Google Scholar and Microsoft Live Search Academic in December/January surfaced some areas of possible collaboration.

• Currently in a bit of a holding pattern

Google Scholar

• Broad coverage
– But incomplete
– And GS is not willing to pay for content

• Not likely to better leverage metadata

• Interested in:
– Us pressuring vendors who won’t let GS in
– Books interface and integration between books and articles and more.
– Our bib/holdings records

Microsoft Live Search Academic

• Narrow coverage
– Started with CompSci and added BioMed/Health
– Will add topics incrementally over time

• Seems like a “runner-up” product

• Interested in:
– Input on collection development priorities
– Improved linking from Academic to UC holdings
– Integrated interface with books, articles, etc.

WorldCat Local

• Broad coverage (30M articles at launch)
– ArticleFirst & some gov’t DBs: ERIC, Medline, etc.
– Talking with others

• Could be integrated with Melvyl 2.0 interface

• Interested in:
– ???
– Not a good time for CDL to be discussing articles with OCLC

From reality to pure speculation

• “Caveat management”

• Ideas shaped over email with some other institutions and at a breakout session at Code4Lib

• “At the network level” meme echoed at the RLG D2D symposium

GOOG/MSFT fall short; I want:

• An API, allowing institutions to:
– Control the UI
– Build aggregations that combine this “article store” with local indexes (e.g., courseware, institutional repositories, networked drives) if they so choose.
– Enhance metadata (e.g., subject categorization)
– Perform text mining operations

• Better leverage metadata for fielded searching, sorting, faceted browse, etc.

• A service level agreement, perpetual access

• Known (and comprehensive) coverage.
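To make "better leverage metadata" concrete, here is a rough sketch of fielded searching, sorting, and facet counts over structured article records. The records, field names, and function names are all made up for illustration; no real service's API is implied.

```python
from collections import Counter

# Hypothetical article records; titles, journals, and field names are illustrative only.
ARTICLES = [
    {"title": "Metasearch usability", "journal": "Journal A", "year": 2006, "subject": "Discovery"},
    {"title": "Thesaurus maintenance costs", "journal": "Journal A", "year": 2005, "subject": "Indexing"},
    {"title": "Faceted browsing in catalogs", "journal": "Journal B", "year": 2006, "subject": "Discovery"},
]

def fielded_search(records, **criteria):
    """Return the records whose named fields exactly match the given values."""
    return [r for r in records if all(r.get(f) == v for f, v in criteria.items())]

def sort_by(records, field, descending=False):
    """Sort records on a single metadata field."""
    return sorted(records, key=lambda r: r[field], reverse=descending)

def facet_counts(records, field):
    """Count how many records fall under each value of a field (a browse facet)."""
    return Counter(r[field] for r in records)

if __name__ == "__main__":
    discovery = fielded_search(ARTICLES, subject="Discovery")
    print([r["title"] for r in sort_by(discovery, "title")])
    print(dict(facet_counts(ARTICLES, "journal")))
```

None of this works without clean, consistent fields in the underlying records, which is why the metadata point and the API point go together.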

Components of a Hypothetical Centralized Service

1. Hammer out terms and conditions and handle acquisitions through a single broker.

2. Obtain the right to use the content. Full-text formats are preferred, so that indexing and text mining can run on the full text rather than just the metadata (even though we prefer to go to the vended site for delivery).

3. A central repository should store, manage and preserve the raw data.

4. Scrubbing/reformatting tools as well as tools that load data should be developed/shared.

Components, continued…

5. The central service should provide indexing and querying of the data. It should provide a web interface as well as an API.

• Partners would need to be able to share their ERM KBs with the central service in order to index institution-specific permissions and display the appropriate delivery options in the discovery UI.

6. Partners could grab from the common repository & deploy in their local discovery services (may include other collections: metadata of library holdings, full-text of digitized books, special collections, etc.)
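The ERM KB sharing under component 5 could be sketched roughly as below. This assumes each partner exports a simple mapping of institution to licensed sources; the institution names, source ids, record layout, and the fallback option are all hypothetical.

```python
# Hypothetical knowledge-base export: institution -> set of licensed source ids.
ERM_KB = {
    "campus_a": {"vendor_x", "vendor_y"},
    "campus_b": {"vendor_y"},
}

# Hypothetical article records held by the central service.
ARTICLES = [
    {"id": "art-1", "sources": ["vendor_x"]},
    {"id": "art-2", "sources": ["vendor_x", "vendor_y"]},
]

# Assumed fallback when an institution has no licensed source for an article.
FALLBACK = "Request a copy"

def delivery_options(article, institution, kb=ERM_KB):
    """List the delivery options an institution's patrons should see for one article."""
    licensed = kb.get(institution, set())
    options = [f"Full text via {s}" for s in article["sources"] if s in licensed]
    return options or [FALLBACK]

if __name__ == "__main__":
    for article in ARTICLES:
        for campus in ERM_KB:
            print(article["id"], campus, delivery_options(article, campus))
```

The same shared index serves every partner; only the permissions lookup varies by institution, which is what lets the discovery UI show each patron the right delivery links.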

Negotiating Tips (Code4Lib-ers)

• Need to hone message on the incentives for content providers to participate. MLA Bibliography told OhioLINK that, after a certain point, they wouldn't sell it to them at any price. The threat of canceling our subscriptions isn't very credible.
– Possible additional $ for them at little additional cost
– Additional $ for their pay-per-view services
– Possible savings (if they spend less $ on their discovery services)
– Possible savings (less $ to support Z39.50/SRU)

• Determine how much leverage we have.
– What % of the customer base of vendors is academic libraries? How much leverage do we have?
– What if a database is only available from one vendor?

So… Whaddya Think?
