
Million Book Project: Dreams and Realities

Dr. Gloriana St. Clair

University Librarian, Carnegie Mellon

Thesis

• Project has great potential to make good information available to more scholars.

• The future of libraries is digital.

Main Points

• Genesis: How the project began

• Dreams: Positive comments about the vision

• Realities: Collections and logistics

Genesis

Chinese Universities Participating

• Chinese Academy of Sciences

• Chinese Ministry of Education

• Fudan University

• Nanjing University

• Peking University

• Tsinghua University

• Zhejiang University

Project Personnel

People at Carnegie Mellon

Directors of the Universal Library Project will serve as the main consultants:

• Dr. Raj Reddy

• Dr. Michael Shamos

• Dr. Jaime Carbonell

• Dr. Robert Thibadeau

• Dr. Gloriana St. Clair 

Additional library personnel include:

• Ms. Erika Linke

• Ms. Denise Troll

• Ms. Gabrielle Michalek

People Elsewhere

• Michael Lesk, National Science Foundation

• Brewster Kahle and Niall O’Driscoll, Internet Archive

Research Initiatives

• Machine translation

• Massive distributed database

• Storage formats

• Use of digital libraries

• Distribution and sustainability

• Security

Distribution and Sustainability

• Library of Congress, Digital Preservation

• OCLC

• RLG

• STOR family

• University presses

• Commercial vendors

Dreams

• “Libraries have played a vital role in the advance of human society. They have supplemented the formal education system by making human knowledge available to anyone who can read and has access to them. Human advance, including especially science and engineering, has depended on young people having access to books via libraries.”

• “Libraries are very unevenly distributed across the world and within countries. Even in the US there are enormous differences. Now technology makes possible a universal world library in which every person has access to anything written.”

• “The idea is compelling, the enhancements to education and learning can be intuited -- and few, if anyone, would object to the conceptual framework.”

• “In the end, this will be Vannevar Bush’s Memex.”

- Michael Lesk, Internet Archive

Bush, “As We May Think,” Atlantic Monthly (July 1945). http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm

• “This project alone can change how education is conducted in much of the world. Further, if this project is carried further, it can change how publishing and reading is done in the more general public.”

Realities

Copyright is the biggest reality of the project.

Copyright

• U. S. copyright now lasts 95 years from publication for many works.

• Many books are out of print but restricted by copyright for ~93 years.

• Copyright can be cleared by asking permission to scan.

A Collection of Collections

• Books will represent a variety of languages including material originating in China and India, our partners.

• Partners will select and include collections of cultural importance.

• Copyright-cleared Books for College Libraries.

• Technical reports.

• University press publications.

• Government documents.

• Pre-1923 books, now in depositories.

U. S. Collection Partners

• Indiana University

• Pennsylvania State University

• Stanford University

• TriColleges (Swarthmore, Haverford, Bryn Mawr)

• University of California, Berkeley

• University of Pittsburgh

• University of Washington

Best Books Feature

• Books for College Libraries: 60,000 best books, published in 1988.

• $80,000 needed for a copyright clearance project.

Subject Collections

• Sending proposals to small foundations for money to support copyright clearance for resources in subjects such as history and environment.

• Discussed putting up a segment of materials to support literacy.

University Press Negotiations

• National Academy Press: We will scan early materials and exchange them for some 2,500 titles they are scanning.

• MIT Press: Discussed scanning some of their backlog.

Government Documents

• The U. S. government produces about 100,000 documents per year, mostly in the public domain.

• Have 30 boxes of Dept of Education materials.

• Negotiating with other agencies.

• Some highly desired collections.

• British Parliamentary Papers: thousands; copyright o.k. from 1950 back.

Logistics

Optimum Scanner Throughput

• One Minolta scanner running two shifts daily = 16 books per day.

• 250 work days per year = 4,000 books per year.

• With currently supplied (18) scanners = 72,000 books per year.

• With additional (82) scanners from this grant, 100 in total = 400,000 books per year.

• Allowing a generous 50% deterioration in throughput, 100 scanners can complete the project in five years.
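The throughput figures above can be checked with a short calculation. This is only a sketch of the arithmetic on the slides: the per-scanner rate, work days, scanner counts, and 50% deterioration come from the slides, while the function names and the reading of "deterioration" as a simple multiplier on output are assumptions made for illustration.

```python
import math

# Figures taken from the slides.
BOOKS_PER_SCANNER_PER_DAY = 16   # one scanner running two shifts daily
WORK_DAYS_PER_YEAR = 250
TARGET_BOOKS = 1_000_000         # project goal

# Hypothetical helper names; "efficiency" models deterioration as a
# plain multiplier on optimum output (an assumption, not from the slides).
def books_per_year(scanners: int, efficiency: float = 1.0) -> float:
    """Annual output of a scanner fleet at a fraction of optimum throughput."""
    return scanners * BOOKS_PER_SCANNER_PER_DAY * WORK_DAYS_PER_YEAR * efficiency


def years_to_complete(scanners: int, efficiency: float = 1.0) -> int:
    """Whole years needed to scan TARGET_BOOKS at the given output."""
    return math.ceil(TARGET_BOOKS / books_per_year(scanners, efficiency))


if __name__ == "__main__":
    print("1 scanner:", books_per_year(1))
    print("18 scanners:", books_per_year(18))
    print("100 scanners:", books_per_year(100))
    print("Years at 50% throughput:", years_to_complete(100, efficiency=0.5))
```

Changing the `efficiency` argument shows how sensitive the five-year estimate is to scanner downtime.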

Transportation

• 1,000,000… Imagine sending every book at Carnegie Mellon to the other side of the world.

• By sea? (slow) Or by air? (expensive)

• Return? (a perpetual argument with computer science)

• Many points here and there.

Conclusions

• The collection must be composed of many sub-collections.

• Copyright is a serious barrier to an effective effort.

• Librarians will be brought into the picture to ensure solid selection criteria.

Thank You