A Systemwide View of Library Collections
Ithaka
Brian Lavoie, OCLC Research
Roger C. Schonfeld, Ithaka
CNI Spring Task Force Meeting, April 5, 2005
Systemwide View of Library Collections
Print collections have been changing, as the distinction between local and external resources is increasingly blurred due to resource sharing
Digitization combined with network technologies creates opportunities for one “copy” of a resource to be shared across many libraries
These forces will inevitably shift the focus to the resources of the “system,” rather than individual library collections
Mass Digitization
Great deal of public and private investment in digitization programs … e.g., JSTOR, ARTstor - and of course mass digitization spearheaded via GooglePrint
Digitization opportunities unlimited; resources are not …
• How to determine priorities? What programs of digitization will be necessary to meet the needs of the scholarly community?
Print Preservation
From a systemwide perspective, what preservation framework makes most sense for print resources?
How have preservation frameworks changed over time?
As retrospective materials become increasingly available in digital form, will new frameworks for print preservation be necessary?
What Are We Going to Do Today?
The kinds of collaborations necessary to begin to take advantage of a systemwide perspective are very hard, both from economic and political standpoints
We will not be proposing any answers!
Instead, we thought to take advantage of the WorldCat resource – which affords the broadest view of print collections – to build a bridge from a local perspective to the beginnings of a systemwide perspective
Today’s presentation focuses on print books
Data Sources
WorldCat: world’s largest and most comprehensive bibliographic database
• > 20,000 libraries worldwide have contributed to the development of WorldCat
Copy of WorldCat from January 2005:
• ~55 million records
Copy of WorldCat holdings file from January 2005:
• ~950 million holdings
Data Source Limitations
Not all published materials are cataloged in WorldCat
Not all library holdings are represented in WorldCat
Largely reflects North American library collections
So … WorldCat does not embody the whole universe of library collections and holdings – but it’s a very good approximation!
1. The “Systemwide Collection”
• Size
• Age
How Many “Books” Are Held in the Systemwide Collection?
[Chart, built up over four slides; axis from 0 to 60,000,000 records]
• Total WorldCat Records: 54,831,000
• Language-based or manuscript monographs: 45,269,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations: 35,251,000
• Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only: 31,923,000
Works and Manifestations
FRBR (Functional Requirements for Bibliographic Records):
• Hierarchy of bibliographic entities
• Works, Expressions, Manifestations, Items
Work: distinct intellectual or artistic creation
• e.g., Macbeth
Manifestation: physical embodiment of an expression of a work
• e.g., Macbeth, Folger Shakespeare Library edition, published in paperback by Washington Square Press (2004)
WorldCat records describe FRBR manifestations
Works identified using OCLC “FRBRization” algorithm
• Converts MARC21 bibliographic databases into FRBR “work-sets”
• http://www.oclc.org/research/software/frbr/
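As a rough illustration of what grouping manifestation-level records into work-sets involves, the sketch below clusters records that share a normalized author/title key. It is a simplified stand-in, not the OCLC FRBRization algorithm itself; the `Record` fields, the `work_key` normalization, and the sample records (including "Hypothetical Press") are assumptions made for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical manifestation-level record (not a real MARC21 structure)."""
    author: str
    title: str
    publisher: str
    year: int


def work_key(record):
    """Build a crude author/title key: lowercase, drop punctuation, collapse spaces."""
    def norm(text):
        text = re.sub(r"[^a-z0-9 ]", "", text.lower())
        return " ".join(text.split())
    return norm(record.author), norm(record.title)


def group_into_work_sets(records):
    """Map each author/title key to the list of manifestations sharing it."""
    work_sets = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record)
    return dict(work_sets)


# Invented sample data: two editions of one work plus a second work.
records = [
    Record("Shakespeare, William", "Macbeth", "Washington Square Press", 2004),
    Record("Shakespeare, William", "Macbeth.", "Hypothetical Press", 1990),
    Record("Shakespeare, William", "Hamlet", "Hypothetical Press", 1992),
]
work_sets = group_into_work_sets(records)
print(len(records), "manifestations ->", len(work_sets), "works")
```

A real work-set algorithm has to cope with variant author names, uniform titles, and translations, which a plain string key like this one does not attempt.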
Most Book Works Have Few Manifestations
[Chart; axis from 0 to 35,000,000]
• Manifestations: 31,923,000
• Works: 26,025,000
Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only
Print Book Manifestations and Works – and Digital Manifestations
[Chart; axis from 0 to 35,000,000]
• Manifestations: 31,923,000
• Works: 26,025,000
• Digital Manifestations: 121,689
Language-based or manuscript monographs, excluding government documents and theses/dissertations, in print format only
How Old Are the Components of the Systemwide Collection? Cumulative Book Works/Manifestations Over Time
[Chart: cumulative number of book manifestations and works (axis 0 to 35,000,000) by year, 1700 to 2000]
How Old Are the Components of the Systemwide Collection? Book Works/Manifestations per Year
[Chart: number of book manifestations and works per year (axis 0 to 800,000), 1700 to 2000]
Age of Works and Manifestations: Relative to 1923 (millions)
[Chart: Manifestations and Works (axis 0 to 30 million), each split into Pre-1923 and 1923 and After. Data labels: Pre-1923, 18% and 17%; 1923 and After, 82% and 83%]
2. Individual Collections Cumulate to Form the System
How will digitization bring them together virtually?
Minimal Overlap
Book Works Held by X or More Libraries (in millions)
[Chart: number of book works (axis 0 to 30 million) held by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 100 or more libraries; x-axis: Number of Libraries]
Works Held Broadly
Book Works Held by X or More Libraries (in millions)
[Chart: number of book works (axis 0 to 7 million) held by 10, 50, 100, 200, 300, 400, and 500 or more libraries; x-axis: Number of Libraries]
Works Held Broadly
Book Works Held by X or More Libraries, as Percent of Total Book Works
[Chart; axis from 0% to 30%; x-axis: Number of Libraries]
• 10 or more: 24%
• 50 or more: 9%
• 100 or more: 6%
• 200 or more: 4%
• 300 or more: 2%
• 400 or more: 2%
• 500 or more: 1%
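The “held by X or more libraries” figures are cumulative counts over a per-work tally of holding libraries. A minimal sketch of that calculation follows, assuming a hypothetical mapping from work identifier to number of holding libraries; the function name and the sample data are invented for illustration and are not drawn from WorldCat.

```python
def share_held_by_at_least(holdings_per_work, thresholds):
    """Return {threshold: percent of works held by at least that many libraries}."""
    total = len(holdings_per_work)
    if total == 0:
        return {t: 0.0 for t in thresholds}
    shares = {}
    for t in thresholds:
        # Count works whose number of holding libraries meets the threshold.
        count = sum(1 for n in holdings_per_work.values() if n >= t)
        shares[t] = 100.0 * count / total
    return shares


# Invented sample: work id -> number of libraries holding it.
sample = {"w1": 1, "w2": 1, "w3": 2, "w4": 12, "w5": 60, "w6": 150, "w7": 1, "w8": 5}
shares = share_held_by_at_least(sample, [10, 50, 100])
for threshold, pct in shares.items():
    print(f"{threshold} or more: {pct:.1f}%")
```

Because each threshold counts every work at or above it, the percentages can only fall as the threshold rises, which matches the shape of the chart.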
The Virtual System in Practice
GooglePrint digitization initiative
Questions:
• How many print books does this initiative potentially impact?
• What proportion of “systemwide print book collection” does this represent?
• Overlap (how much held broadly? how much held uniquely?)
Forthcoming paper from OCLC researchers that will offer some perspective on these questions
Hopefully, work like this will help to establish a set of important questions/metrics that need to be addressed when:
• Considering digitization initiatives
• Considering implications of a changing world of research and learning for collections
3. How Is Rareness Distributed through the System?
Systemwide Holdings of Print Works
• 1 holding: 37%
• 2 holdings: 14%
• 3-5 holdings: 16%
• More than 5 holdings: 33%
More than 9 million works are held only once
[Chart: number of works (axis 0 to 12,000,000) by number of holdings: 1, 2, 3, 4, 5, 6 to 10, 11 to 20, 21-50, 51-100, 100+]
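As a quick consistency check, the rounded shares from the “Systemwide Holdings of Print Works” slide can be applied to the 26,025,000 works reported earlier. This is only a sketch: it assumes the shares are taken over that works total, and because the percentages are rounded the resulting counts are approximate.

```python
TOTAL_WORKS = 26_025_000  # print book works reported in the presentation

# Rounded shares from the "Systemwide Holdings of Print Works" slide.
SHARES_PERCENT = {
    "1 holding": 37,
    "2 holdings": 14,
    "3-5 holdings": 16,
    "More than 5 holdings": 33,
}

# Integer arithmetic; counts are approximate because the shares are rounded.
estimated = {
    label: TOTAL_WORKS * pct // 100 for label, pct in SHARES_PERCENT.items()
}

for label, count in estimated.items():
    print(f"{label}: about {count:,} works")
```

Under these assumptions the single-holding share comes to about 9.6 million works, in line with the “more than 9 million” figure.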
4. What Systemwide Preservation Frameworks Have Served Us?
The Growth and Peak in Average Holdings Over Time
[Chart: average holdings (axis 0 to 45) by age in years (0 to 200), for Manifestations and Works]
Steady, Gradual Nineteenth Century Growth in Works Held Many Times…
[Chart: number of works (axis 0 to 200,000) by decade, 1801-1810 through 1891-1900; series by number of holdings: 2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+]
…Rapid Postwar Increase in Works Held Many Times
[Chart: number of works (axis 0 to 2,500,000) by decade, 1911-1920 through 1991-2000; series by number of holdings: 2 to 10, 11 to 50, 51 to 100, 101 to 200, 201 to 400, 400 to 1000, 1000+]
Of Works with Multiple Holdings, Steady Increase Through the 1960s in the Proportion Held Many Times
[Chart: share of works (axis 0% to 100%) by decade, 1801-1810 through 1991-2000; series by number of holdings: 1000+, 400 to 1000, 201 to 400, 101 to 200, 51 to 100, 11 to 50, 2 to 10]
Summary and Discussion
Summary: Findings
1. Roughly 26 million print title works, represented in 32 million print title manifestations, are held by OCLC member libraries. This should be seen as a minimum in considering the number of printed books over time. Half of the books date from the period since 1977. How can a mass digitization strategy effectively manage the intellectual property ramifications of this finding?
2. Publications are distributed across a large number of libraries, and any mass digitization strategy that ignores this distributional reality is likely to omit numerous works. How should this finding impact the library system’s planning for a massive format migration?
Summary: Findings
3. Rareness is very common within the system. This has been recognized by many librarians but is not always taken into account in policy development. How will any future print preservation strategy address this reality? Can data on rareness help to inform digitization strategies?
4. Redundancy in holdings across the system has changed over time. Has this made our framework for preservation more or less secure? What lessons should be drawn as we consider other print preservation strategies, such as paper repositories, particularly in the era of mass digitization? What lessons might there be for digital preservation?
More information …
More in-depth article forthcoming …
Contact us with comments and questions:
• Brian Lavoie: [email protected]
• Roger C. Schonfeld: [email protected]