A Systemwide View of Library Collections

Ithaka A Systemwide View of Library Collections Brian Lavoie, OCLC Research Roger C. Schonfeld, Ithaka CNI Spring Task Force Meeting April 5, 2005

Transcript of A Systemwide View of Library Collections

Page 1: A Systemwide View of  Library Collections

Ithaka

A Systemwide View of Library Collections

Brian Lavoie, OCLC Research

Roger C. Schonfeld, Ithaka

CNI Spring Task Force Meeting, April 5, 2005

Page 2: A Systemwide View of  Library Collections

Systemwide View of Library Collections

Print collections have been changing, as the distinction between local and external resources is increasingly blurred due to resource sharing

Digitization combined with network technologies creates opportunities for one “copy” of a resource to be shared across many libraries

These forces inevitably are going to lead to a shift in focus to the resources of the “system,” rather than individual library collections

Page 3: A Systemwide View of  Library Collections

Mass Digitization

Great deal of public and private investment in digitization programs, e.g., JSTOR and ARTstor, and of course mass digitization spearheaded by GooglePrint

Digitization opportunities unlimited; resources are not …
• How to determine priorities? What programs of digitization will be necessary to meet the needs of the scholarly community?

Page 4: A Systemwide View of  Library Collections

Print Preservation

From a systemwide perspective, what preservation framework makes most sense for print resources?

How have preservation frameworks changed over time?

As retrospective materials become increasingly available in digital form, will new frameworks for print preservation be necessary?

Page 5: A Systemwide View of  Library Collections

What Are We Going to Do Today?

The kinds of collaborations necessary to begin to take advantage of a systemwide perspective are very hard, from both economic and political standpoints

We will not be proposing any answers!

Instead, we thought to take advantage of the WorldCat resource – which affords the broadest view of print collections – to build a bridge from a local perspective to the beginnings of a systemwide perspective

Today’s presentation focuses on print books

Page 6: A Systemwide View of  Library Collections

Data Sources

WorldCat: world’s largest and most comprehensive bibliographic database
• > 20,000 libraries worldwide have contributed to the development of WorldCat

Copy of WorldCat from January 2005:
• ~55 million records

Copy of WorldCat holdings file from January 2005:
• ~950 million holdings

Page 7: A Systemwide View of  Library Collections

Data Source Limitations

Not all published materials are cataloged in WorldCat

Not all library holdings are represented in WorldCat

Largely reflects North American library collections

So … WorldCat does not embody the whole universe of library collections and holdings – but it’s a very good approximation!

Page 8: A Systemwide View of  Library Collections

1. The “Systemwide Collection”

Size

Age

Page 9: A Systemwide View of  Library Collections

How Many “Books” Are Held in the Systemwide Collection?

[Bar chart, records on an axis from 0 to 60,000,000]
• Total WorldCat Records: 54,831,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: (no value shown on this slide)

Page 10: A Systemwide View of  Library Collections

How Many “Books” Are Held in the Systemwide Collection?

[Bar chart, records on an axis from 0 to 60,000,000]
• Total WorldCat Records: 54,831,000
• Language-based or manuscript monographs: 45,269,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: (no value shown on this slide)

Page 11: A Systemwide View of  Library Collections

How Many “Books” Are Held in the Systemwide Collection?

[Bar chart, records on an axis from 0 to 60,000,000]
• Total WorldCat Records: 54,831,000
• Language-based or manuscript monographs: 45,269,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations: 35,251,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: (no value shown on this slide)

Page 12: A Systemwide View of  Library Collections

How Many “Books” Are Held in the Systemwide Collection?

[Bar chart, records on an axis from 0 to 60,000,000]
• Total WorldCat Records: 54,831,000
• Language-based or manuscript monographs: 45,269,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations: 35,251,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: 31,923,000

Page 13: A Systemwide View of  Library Collections

Works and Manifestations

FRBR (Functional Requirements for Bibliographic Records):
• Hierarchy of bibliographic entities
• Works, Expressions, Manifestations, Items

Work: distinct intellectual or artistic creation
• e.g., Macbeth

Manifestation: physical embodiment of an expression of a work
• e.g., Macbeth, Folger Shakespeare Library edition, published in paperback by Washington Square Press (2004)

WorldCat records describe FRBR manifestations

Works identified using OCLC “FRBRization” algorithm
• Converts MARC21 bibliographic databases into FRBR “work-sets”
• http://www.oclc.org/research/software/frbr/
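To make the work/manifestation distinction concrete, here is a toy Python sketch of grouping manifestation-level records into work-sets. It assumes, purely for illustration, that two records belong to the same work when their normalized author and title match; the slides do not describe the matching rules of the OCLC algorithm, so `work_key`, `group_into_work_sets`, the record fields, and "Hypothetical Press" are all made up for this example.

```python
import re
import unicodedata
from collections import defaultdict


def work_key(author: str, title: str) -> str:
    """Build a crude matching key from author and title (an assumed rule, not OCLC's)."""
    def norm(text: str) -> str:
        # Strip accents, lowercase, and collapse punctuation and whitespace.
        text = unicodedata.normalize("NFKD", text)
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        text = re.sub(r"[^a-z0-9]+", " ", text.lower())
        return text.strip()

    return f"{norm(author)}/{norm(title)}"


def group_into_work_sets(records):
    """Group manifestation-level records into work-sets keyed by author/title."""
    work_sets = defaultdict(list)
    for rec in records:
        work_sets[work_key(rec["author"], rec["title"])].append(rec)
    return dict(work_sets)


# Illustrative records only; the second and third publishers are hypothetical.
records = [
    {"author": "Shakespeare, William", "title": "Macbeth",
     "publisher": "Washington Square Press", "year": 2004},
    {"author": "Shakespeare, William", "title": "Macbeth.",
     "publisher": "Hypothetical Press", "year": 1990},
    {"author": "Shakespeare, William", "title": "Hamlet",
     "publisher": "Hypothetical Press", "year": 1992},
]
work_sets = group_into_work_sets(records)

for key, members in work_sets.items():
    print(key, "->", len(members), "manifestation(s)")
```

In this sketch the two Macbeth records collapse into one work-set with two manifestations, while Hamlet forms its own. A production approach would need far more careful matching (uniform titles, authority control, translations) than a simple string key.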

Page 14: A Systemwide View of  Library Collections

Most Book Works Have Few Manifestations

[Bar chart, axis from 0 to 35,000,000]
• Manifestations: 31,923,000
• Works: 26,025,000

Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only

Page 15: A Systemwide View of  Library Collections

Print Book Manifestations and Works – and Digital Manifestations

[Bar chart, axis from 0 to 35,000,000]
• Manifestations: 31,923,000
• Works: 26,025,000
• Digital Manifestations: 121,689

Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only

Page 16: A Systemwide View of  Library Collections

How Old Are the Components of the Systemwide Collection? Cumulative Book Works/Manifestations Over Time

[Line chart: cumulative count (0 to 35,000,000) by year, 1700 to 2000; series: Manifestations, Works]

Page 17: A Systemwide View of  Library Collections

How Old Are the Components of the Systemwide Collection? Book Works/Manifestations per Year

[Line chart: count per year (0 to 800,000) by year, 1700 to 2000; series: Manifestations, Works]

Page 18: A Systemwide View of  Library Collections

Age of Works and Manifestations: Relative to 1923 (millions)

[Bar chart, axis from 0 to 30 million; bars for Manifestations and Works, each split into Pre-1923 and 1923 and After; data labels: 18% and 82%, 17% and 83%]

Page 19: A Systemwide View of  Library Collections

2. Individual Collections Cumulate to Form the System

How will digitization bring them together virtually?

Page 20: A Systemwide View of  Library Collections

Minimal Overlap

Book Works Held by X or More Libraries (in millions)

[Bar chart, axis from 0 to 30 million; x-axis: Number of Libraries, from 1 or more through 10 or more, then 100 or more]

Page 21: A Systemwide View of  Library Collections

Works Held Broadly

Book Works Held by X or More Libraries (in millions)

[Bar chart, axis from 0 to 7 million; x-axis: Number of Libraries, at 10, 50, 100, 200, 300, 400, and 500 or more]

Page 22: A Systemwide View of  Library Collections

Works Held Broadly

Book Works Held by X or More Libraries, as Percent of Total Book Works

[Bar chart, axis from 0% to 30%; x-axis: Number of Libraries]
• 10 or more: 24%
• 50 or more: 9%
• 100 or more: 6%
• 200 or more: 4%
• 300 or more: 2%
• 400 or more: 2%
• 500 or more: 1%
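The "held by X or more libraries" percentages are a cumulative count over per-work holdings. The Python sketch below shows one way such shares could be computed; the function name `share_held_by_at_least` and the ten sample holdings counts are invented for illustration and are not drawn from WorldCat.

```python
from bisect import bisect_left


def share_held_by_at_least(holdings_per_work, thresholds):
    """Return, for each threshold X, the fraction of works held by X or more libraries."""
    ordered = sorted(holdings_per_work)
    total = len(ordered)
    shares = {}
    for x in thresholds:
        # bisect_left gives the index of the first count that is >= x,
        # so everything from that index onward meets the threshold.
        at_least = total - bisect_left(ordered, x)
        shares[x] = at_least / total
    return shares


# Made-up holdings counts for ten works (one number per work).
holdings = [1, 1, 1, 2, 3, 12, 60, 150, 520, 7]
shares = share_held_by_at_least(holdings, [10, 50, 100, 500])

for x, share in shares.items():
    print(f"{x} or more libraries: {share:.0%} of works")
```

Sorting once and using binary search keeps the calculation cheap even when the input is tens of millions of works, which is the scale discussed in this deck.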

Page 23: A Systemwide View of  Library Collections

The Virtual System in Practice

GooglePrint digitization initiative

Questions:
• How many print books does this initiative potentially impact?
• What proportion of “systemwide print book collection” does this represent?
• Overlap (how much held broadly? how much held uniquely?)

Forthcoming paper from OCLC researchers that will offer some perspective on these questions

Hopefully, work like this will help to establish a set of important questions/metrics that need to be addressed when:
• Considering digitization initiatives
• Considering implications of a changing world of research and learning for collections

Page 24: A Systemwide View of  Library Collections

3. How Is Rareness Distributed through the System?

Page 25: A Systemwide View of  Library Collections

Systemwide Holdings of Print Works

[Pie chart]
• 1 holding: 37%
• 2 holdings: 14%
• 3-5 holdings: 16%
• More than 5 holdings: 33%

Page 26: A Systemwide View of  Library Collections

More than 9 million works are held only once

[Bar chart: works (axis from 0 to 12,000,000) by number of holdings: 1, 2, 3, 4, 5, 6 to 10, 11 to 20, 21-50, 51-100, 100+]
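As a rough cross-check, the 37% single-holding share from the pie chart can be applied to the 26,025,000 print book works reported earlier in the deck. The Python sketch below does that arithmetic; because the pie-chart shares are rounded to whole percentages, the results are approximations rather than figures from the presentation.

```python
# Print book works reported on the "Most Book Works Have Few Manifestations" slide.
TOTAL_PRINT_BOOK_WORKS = 26_025_000

# Shares from the "Systemwide Holdings of Print Works" pie chart (rounded percentages).
holding_shares = {
    "1 holding": 0.37,
    "2 holdings": 0.14,
    "3-5 holdings": 0.16,
    "More than 5 holdings": 0.33,
}

# Approximate number of works in each bucket.
estimated_works = {
    bucket: round(share * TOTAL_PRINT_BOOK_WORKS)
    for bucket, share in holding_shares.items()
}

for bucket, count in estimated_works.items():
    print(f"{bucket}: about {count / 1_000_000:.1f} million works")
```

The single-holding estimate comes out at about 9.6 million works, which agrees with the "more than 9 million" headline.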

Page 27: A Systemwide View of  Library Collections

4. What Systemwide Preservation Frameworks Have Served Us?

Page 28: A Systemwide View of  Library Collections

The Growth and Peak in Average Holdings Over Time

[Line chart: Average Holdings (0 to 45) against Age in Years (0 to 200); series: Manifestations, Works]

Page 29: A Systemwide View of  Library Collections

Steady, Gradual Nineteenth Century Growth in Works Held Many Times…

[Chart: count (0 to 200,000) by decade of publication, 1801-1810 through 1891-1900; series by number of holdings: 2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+]

Page 30: A Systemwide View of  Library Collections

…Rapid Postwar Increase in Works Held Many Times

[Chart: count (0 to 2,500,000) by decade of publication, 1911-1920 through 1991-2000; series by number of holdings: 2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+]

Page 31: A Systemwide View of  Library Collections

Of Works with Multiple Holdings, Steady Increase Through the 1960s in the Proportion Held Many Times

[Chart: share of works (0% to 100%) by decade of publication, 1801-1810 through 1991-2000; series by number of holdings: 1000+, 400 to 1000, 201 to 400, 101 to 200, 51 to 100, 11 to 50, 2 to 10]

Page 32: A Systemwide View of  Library Collections

Summary and Discussion

Page 33: A Systemwide View of  Library Collections

Summary: Findings

1. Roughly 26 million print title works, represented in 32 million print title manifestations, are held by OCLC member libraries. This should be seen as a minimum in considering the number of printed books over time. Half of the books date from the period since 1977. How can a mass digitization strategy effectively manage the intellectual property ramifications of this finding?

2. Publications are distributed across a large number of libraries, and any mass digitization strategy that ignores this distributional reality is likely to omit numerous works. How should this finding shape the library system’s planning for a massive format migration?

Page 34: A Systemwide View of  Library Collections

Summary: Findings

3. Rareness is very common within the system. This has been recognized by many librarians but is not always taken into account in policy development. How will any future print preservation strategy address this reality? Can data on rareness help to inform digitization strategies?

4. Redundancy in holdings across the system has changed over time. Has this made our framework for preservation more or less secure? What lessons should be drawn as we consider other print preservation strategies, such as paper repositories, particularly in the era of mass digitization? What lessons might there be for digital preservation?

Page 35: A Systemwide View of  Library Collections

More information …

More in-depth article forthcoming …

Contact us with comments and questions:
• Brian Lavoie: [email protected]
• Roger C. Schonfeld: [email protected]