OCLC Online Computer Library Center

OCLC Research

Eric Childress

OCLC Research

SHARES Meeting

NYU

New York, NY

2006-10-20

Agenda

Overview

Areas of activity

Sample projects

OCLC Research

Unit established 1978; currently 30 staff (8 scientists)

Mission: To expand knowledge that advances OCLC's public purpose to:

Reduce costs

Further access to the world’s information

Activities: experimentation, prototypes, software, academic research (papers, studies), standards work

Community support:

LITA/OCLC Kilgour award

OCLC/ALISE LIS Research Grants

Software contest, in-kind grants, more…

Areas of major activity

Metadata management

FRBR, large-scale catalogs, metrics & analysis

Knowledge organization

Terminology, classification, authority files

Content management

Preservation, collection management

Management intelligence

Data mining, collection attributes

Interoperability

Standards, frameworks, models

User behavior

Sample projects: Metadata

FRBR

FictionFinder

xISBN

Functional Requirements for Bibliographic Records (FRBR) Group 1 Entities

Work: A distinct intellectual or artistic creation

Is realized through

Expression: The intellectual or artistic realization of a work

Is embodied in

Manifestation: The physical embodiment of an expression

Is exemplified by

Item: A single exemplar of a manifestation
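
The four Group 1 entities and the relationships between them can be sketched as nested data classes. This is an illustrative model only: the field names (`title`, `label`, `isbn`, `barcode`) are assumptions, and placing the two Dune ISBNs under a single expression is done purely for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    barcode: str  # hypothetical copy identifier


@dataclass
class Manifestation:
    """The physical embodiment of an expression."""
    isbn: str
    items: List[Item] = field(default_factory=list)  # "is exemplified by"


@dataclass
class Expression:
    """The intellectual or artistic realization of a work."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)  # "is embodied in"


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)  # "is realized through"


def count_items(work: Work) -> int:
    """Walk Work -> Expression -> Manifestation -> Item and count the items."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Illustrative data: one work, one expression, two manifestations, three items.
dune = Work(
    title="Dune",
    expressions=[
        Expression(
            label="original English text",
            manifestations=[
                Manifestation("0441172717", [Item("copy-1"), Item("copy-2")]),
                Manifestation("0425027066", [Item("copy-3")]),
            ],
        )
    ],
)
```

Calling `count_items(dune)` walks the whole hierarchy, which is the same traversal needed to roll holdings up from items to works.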

OCLC FRBR work set algorithm is used to group related records

Original Illustrated edition

Spanish edition

Abridged edition

Video

Expressions

Work¹ Work²

e¹ e² e³ e¹

Mr. Collins... protested that he never read novels
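
The slide does not say how the work set algorithm decides that records belong together. One common approach to work-level clustering is to group records that share a normalized author/title key. The sketch below assumes that approach; the record fields (`id`, `author`, `uniform_title`) and the key format are hypothetical, and this is not presented as OCLC's implementation.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record: dict) -> str:
    """Hypothetical work key: normalized author plus normalized title."""
    return normalize(record["author"]) + "/" + normalize(record["uniform_title"])


def group_into_work_sets(records) -> dict:
    """Group record ids by work key, preserving input order within each set."""
    work_sets = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record["id"])
    return dict(work_sets)


# Illustrative records; the ids and field values are made up.
records = [
    {"id": "r1", "author": "Austen, Jane", "uniform_title": "Pride and prejudice"},
    {"id": "r2", "author": "AUSTEN, JANE.", "uniform_title": "Pride and Prejudice."},
    {"id": "r3", "author": "Austen, Jane", "uniform_title": "Emma"},
]
work_sets = group_into_work_sets(records)
```

Records `r1` and `r2` differ only in case and punctuation, so they fall into the same work set, while `r3` forms its own.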

WorldCat (FRBR stats)

              Manifestations   Works        Items (est: holdings*1.5)
Total         59,879,322       47,423,810   1,531,400,969
Print books   35,372,459       28,542,021   1,194,751,352

Top 10 works in WorldCat by holdings

FictionFinder

Employs FRBR to:

Build a “work” view & cluster related records

Support the creation of special indexes

Supports searching & browsing of fiction materials cataloged in WorldCat

Fiction records — 2.8 million

Unique works — 1.4 million

Total holdings — 130 million

OCLC Research prototype

OCLC Research team:

Diane Vizine-Goetz (lead)

Roger Thompson

Carol Hickey

J.D. Shipengrover

xISBN

OCLC Research prototype

Reveals all ISBNs associated with individual works in WorldCat

Web service:

URL syntax query (submit an ISBN)

Simple XML response (all ISBNs in workset)

Ex: Dune http://labs.oclc.org/xisbn/0441172717

Users:

Various, loosely-coupled look-it-up applications

Copyright Clearance Center

OCLC Research team:

Thom Hickey (lead)

Jenny Toves

Jeff Young

xISBN output for Dune

<idlist>

<isbn>0441172717</isbn>

<isbn>0801950775</isbn>

<isbn>0441102670</isbn>

<isbn>1556909330</isbn>

<isbn>0425027066</isbn>

<isbn>0425036987</isbn>

<isbn>0425046877</isbn>

<isbn>0425092143</isbn>

<isbn>0553580302</isbn>

<isbn>042507160x</isbn>

<isbn>0736643443</isbn>

<isbn>0736692401</isbn>

<isbn>0441172660</isbn>

<isbn>042505313x</isbn>

<isbn>0425080021</isbn>

<isbn>1402553242</isbn>

<isbn>0736689591</isbn>…..

For any given resource, the full complement of ISBNs can be generated

Provides convenient mechanism for expanding searches to look for multiple manifestations…
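
The request and response shapes shown above can be sketched as follows. The base URL is the one from the Dune example. The sample response is a shortened, well-formed version of the slide's output (the closing `</idlist>` tag is an assumption, since the slide truncates the list), and the `isbn:` query syntax is hypothetical. No network call is made.

```python
import xml.etree.ElementTree as ET

XISBN_BASE = "http://labs.oclc.org/xisbn/"  # URL pattern shown on the slide


def xisbn_url(isbn: str) -> str:
    """Build a query URL by appending the ISBN, as in the Dune example."""
    return XISBN_BASE + isbn.replace("-", "").strip()


def parse_idlist(xml_text: str) -> list:
    """Return every ISBN in an <idlist> response, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [el.text.strip() for el in root.iter("isbn") if el.text]


# Shortened sample built from ISBNs on the slide; closing tag assumed.
SAMPLE_RESPONSE = """
<idlist>
  <isbn>0441172717</isbn>
  <isbn>0801950775</isbn>
  <isbn>0441102670</isbn>
  <isbn>042507160x</isbn>
</idlist>
"""

isbns = parse_idlist(SAMPLE_RESPONSE)

# Expand a single-ISBN search into an OR query over the whole work set.
# The "isbn:" prefix is a made-up query syntax for illustration.
expanded_query = " OR ".join("isbn:" + isbn for isbn in isbns)
```

A look-it-up application would submit one ISBN, parse the list, and then search its own catalog for any of the returned ISBNs rather than only the one the user supplied.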

Sample projects: Knowledge Organization

DeweyBrowser

Terminology Services

DeweyBrowser

OCLC Research prototype

Supports searching & browsing collections organized by DDC

Presents search results at three levels corresponding to the three main summaries of Dewey

Collections available:

wcat – 2.2 million of the most widely held WorldCat records

abr14 – selected data from the Abridged Edition 14 of DDC

ebooks – 210,000+ electronic book records from WorldCat

Summaries can be displayed in:

English

French

German

Spanish

Swedish

OCLC Research team:

Diane Vizine-Goetz (lead)

Thom Hickey (lead)

Carol Hickey

Harry Wagner
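
The three levels that DeweyBrowser presents correspond to the three DDC summaries: the ten main classes, the hundred divisions, and the thousand sections. A minimal sketch of mapping a DDC number onto those levels is shown below; the function name and return shape are assumptions, not part of the prototype.

```python
def dewey_levels(ddc_number: str) -> dict:
    """Map a DDC number to its main class, division, and section."""
    digits = ddc_number.strip().split(".")[0]
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"not a three-digit DDC number: {ddc_number!r}")
    return {
        "main_class": digits[0] + "00",  # first summary: ten main classes
        "division": digits[:2] + "0",    # second summary: hundred divisions
        "section": digits,               # third summary: thousand sections
    }


levels = dewey_levels("025.04")
```

For `025.04` the result is main class `000`, division `020`, and section `025`; anything after the decimal point is ignored because the summaries stop at three digits.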

Terminology Services Project

OCLC Research prototype

Explores Semantic Web value of vocabularies

Enriched versions of controlled vocabularies & classification schemes

Multiple formats (MARCXML, SKOS, Zthes)

Machine-friendly (e.g., web services)

Product version released July 2006

OCLC Research team:

Diane Vizine-Goetz (lead)

Carol Hickey

Andrew Houghton

Roger Thompson

Terminology Services Architecture (diagram)

Applications shown: Browser Sidebar, Metadata Editing Application, Microsoft Office Research Pane

Application Protocol Layer: SRW/U, REST, SOAP

Web Service Proxy: registration, query handling, markup translation, authorization/authentication

Storage Technology Layer: Full Text, SQL, XML

Sample projects: Management intelligence

Data Mining

G5 Study

WorldMap

Audience Level

“G5” Study

Identified the overlap within the Google 5 collections and against WorldCat

Looked at various metrics for the system-wide collection
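
An overlap analysis of this kind can be sketched with set operations over record identifiers. The example uses three toy collections rather than five to stay short; the library names, record ids, and the choice of metrics are all hypothetical and do not reproduce the study's data or method.

```python
from itertools import combinations


def overlap_report(collections: dict, worldcat: set) -> dict:
    """Summarize overlap among collections and coverage of WorldCat."""
    combined = set().union(*collections.values())
    held_by_all = set.intersection(*collections.values())
    pairwise = {
        (a, b): len(collections[a] & collections[b])
        for a, b in combinations(sorted(collections), 2)
    }
    return {
        "combined_size": len(combined),      # distinct records across all libraries
        "held_by_all": len(held_by_all),     # records every library holds
        "pairwise_overlap": pairwise,        # shared records per pair of libraries
        "share_of_worldcat": len(combined & worldcat) / len(worldcat),
    }


# Toy data: integers stand in for record numbers.
collections = {
    "lib_a": {1, 2, 3, 4},
    "lib_b": {3, 4, 5},
    "lib_c": {4, 5, 6},
}
worldcat = set(range(1, 11))
report = overlap_report(collections, worldcat)
```

Here the combined collection holds six distinct records, one of which is held by all three libraries, and together they cover 60% of the toy WorldCat set.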

The system-wide print book collection as represented in WorldCat (January 2005)

Total WorldCat records: ~55 million

Language-based monographs: ~41 million

Language-based monographs, excluding government documents and theses/dissertations: ~35 million

Language-based monographs, excluding government documents and theses/dissertations, in print format only: ~32 million print books

From “Anatomy of Aggregate Collections: The Example of Google Print for Libraries” in D-Lib (Sept 2005) [link]

OCLC WorldMap

Visual, geographic representation of publishing- and library collection-related data

Interactive

Uses data from varied sources (e.g., WorldCat, NCES, UNESCO)

OCLC Research team:

Lynn Silipigni Connaway (lead)

Jeremy Browning

Other team members:

Larry Olszewski

Audience Level

An OCLC Research prototype

A two-step process for assigning a relative “audience level”:

Use MARC “Target Audience” if present

If not, calculate the audience based on weighted holdings

Features:

Human- and machine-readable interfaces

Resolves OCLC record number or ISBN to probable “audience level”

OCLC Research team:

Lynn Connaway (lead)

Brian Lavoie

Ed O’Neill

Cliff Snyder

Akeisha Heard

Calculating “audience level”

Library Type   Weight   Holdings   Holdings Wgt.
ARL            1.00     72         72.00
Academic       .67      97         64.99
Public         .33      8          2.64
School         .00      0          0
Total                   177        139.63

Sum of Holding Weight ÷ Total Holdings

139.63 ÷ 177 = 0.78

Operations research for libraries and information agencies : techniques for the evaluation of management decision alternatives, by Donald H Kraft & Bert R Boyce [San Diego : Academic Press, ©1991]
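
The weighted-holdings calculation above can be sketched in a few lines. The weights and holdings counts are taken from the table; the function and variable names are assumptions. The slide reports 0.78, while the unrounded quotient is about 0.789.

```python
# Weights per library type, as given in the table.
WEIGHTS = {"ARL": 1.00, "Academic": 0.67, "Public": 0.33, "School": 0.00}


def audience_level(holdings: dict) -> float:
    """Sum of weighted holdings divided by total holdings."""
    total = sum(holdings.values())
    if total == 0:
        raise ValueError("no holdings to weight")
    weighted = sum(WEIGHTS[lib_type] * count for lib_type, count in holdings.items())
    return weighted / total


# Holdings for the Kraft & Boyce title used in the worked example.
kraft_boyce = {"ARL": 72, "Academic": 97, "Public": 8, "School": 0}
level = audience_level(kraft_boyce)
```

Because ARL and academic libraries carry the highest weights, a title held mostly by those libraries scores closer to 1, and a title held mostly by public and school libraries scores closer to 0.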

[Screenshot: human-readable interface showing a work and the manifestations in its workset; value shown: 0.62]

[Screenshot: human-readable interface showing a work and the manifestations in its workset; value shown: 0.45]

Thank you!

OCLC Research http://www.oclc.org/research

Project pages http://www.oclc.org/research/projects

ResearchWorks http://www.oclc.org/research/researchworks