Encyclopedia of Life Informatics (Data Model) Workshop: Engaging Partners

Posted on 18-May-2015

Presentation given at the Encyclopedia of Life Informatics workshop at the Marine Biological Laboratory (Woods Hole), February 9, 2007

Transcript of Encyclopedia of Life Informatics (Data Model) Workshop: Engaging Partners

Encyclopaedia of Life: Informatics Workshop, February 9-10, 2007, Marine Biological Laboratory, Woods Hole, Mass.

Martin R. Kalfatovic, Smithsonian Institution Libraries

Encyclopedia of Life Informatics (Data Model) Workshop

Engaging Partners

The Biodiversity Heritage Library

Martin R. Kalfatovic, Smithsonian Institution Libraries

Engaging Partners Round Table

• Martin R. Kalfatovic
– Head, New Media Office and Preservation Services Department, Smithsonian Institution Libraries
• Areas of Interest
– Digital conversion technologies
– Network information discovery and retrieval
– Technology review editor, Library and Information Technology Association (LITA)
• Areas of Work
– Writing purchase orders and annoying staff in the contracts office

Vast, But Not Infinite

• 100 characters (Western European languages, plus spaces and some punctuation)
• Each line has 50 spaces
• Each page is 40 lines long
• Each book is 500 pages long
• Total books: 100^1,000,000
• Googolplex: 1 followed by a googol (10^100) zeros

Kurd Lasswitz, “The Universal Library.” 1901

Jorge Luis Borges, “The Library of Babel.” 1941

Daniel Dennett, Darwin's Dangerous Idea. 1995
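The "Total books" figure on this slide follows from simple arithmetic on the other bullets, and a minimal sketch of that arithmetic may help. It assumes every one of the 50 × 40 × 500 character positions in a book can hold any of the 100 characters independently; the constant and function names are illustrative only and do not come from the presentation.

```python
import math

# Figures taken from the slide.
ALPHABET_SIZE = 100    # distinct characters available
CHARS_PER_LINE = 50    # "spaces" per line
LINES_PER_PAGE = 40
PAGES_PER_BOOK = 500


def chars_per_book() -> int:
    """Number of character positions in a single book."""
    return CHARS_PER_LINE * LINES_PER_PAGE * PAGES_PER_BOOK


def log10_total_books() -> float:
    """log10 of ALPHABET_SIZE ** chars_per_book().

    Working with the logarithm avoids printing a number
    that is about two million digits long.
    """
    return chars_per_book() * math.log10(ALPHABET_SIZE)


def is_between_googol_and_googolplex() -> bool:
    """True if the library holds more than a googol of books
    but fewer than a googolplex.

    googol = 10**100, so its log10 is 100.
    googolplex = 10**(10**100), so its log10 is 10**100.
    """
    exponent = log10_total_books()
    return 100 < exponent < 10**100


if __name__ == "__main__":
    print("characters per book:", chars_per_book())
    print("total books is 10 to the power", log10_total_books())
    print("between googol and googolplex:", is_between_googol_and_googolplex())
```

Under these assumptions each book holds 1,000,000 characters, so the library contains 100^1,000,000 = 10^2,000,000 books: far more than a googol, yet far fewer than a googolplex, which is the sense in which it is vast but not infinite.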

But where can we put it?

• Compressed (at today’s standards) this would be about 50 petabytes (about the size of a small-town library building)

Digitization Projects

• Amazon: Search Inside the Book
• Google Book Search
• Microsoft Live Text!
• Open Content Alliance
• Biodiversity Heritage Library

Challenges/Opportunities

• Cheap Scanning
– Internet Archive Scribe
– Kirtas APT 2400 Scanner

What Do You Do With a Million …?

• Books?
• Name strings?
• Images?
• Specimens?
• Web pages?

Engaging Partners

• Can the layered architecture accommodate input from and contribute to your project?
– Yes

Engaging Partners

• Do you have any software / components that you would like to integrate within the EoL informatics package?
– Chris Freeland

Engaging Partners

• Can we build crosswalks with your project?
– BHL data will provide the historic groundwork for a significant amount of EoL species content

Engaging Partners

• What changes to the data model or additional WorkBench modules would help meet your needs ...
– Within the WorkBench layer, building bibliographic citation scraping tools that can round-trip between EoL and other tools, e.g. Zotero and Connotea

Engaging Partners

“The world has arrived at an age of cheap complex devices of great reliability; and something is bound to come of it” - Vannevar Bush (1945)