Chris Freeland Director, Center for Biodiversity Informatics, Missouri Botanical Garden

Scribbles & Scraps: Darwin’s Library & the Online Display of Annotated Biodiversity Literature
http://biodiversitylibrary.org/collection/darwinlibrary
Chris Freeland, Director, Center for Biodiversity Informatics, Missouri Botanical Garden; Technical Director, Biodiversity Heritage Library
@chrisfreeland #bhlib #tdwg


Transcript of Chris Freeland Director, Center for Biodiversity Informatics, Missouri Botanical Garden

Page 1: Chris Freeland Director, Center for Biodiversity Informatics,  Missouri Botanical Garden

Scribbles & Scraps: Darwin’s Library & the Online Display of Annotated Biodiversity Literature

http://biodiversitylibrary.org/collection/darwinlibrary

Chris Freeland
Director, Center for Biodiversity Informatics, Missouri Botanical Garden
Technical Director, Biodiversity Heritage Library

@chrisfreeland #bhlib #tdwg

Page 2

About Darwin’s Library

• Digital edition & virtual reconstruction of surviving books owned by Charles Darwin

• Darwin’s son Francis transferred “Darwin’s Library” to the Botany School at Cambridge University in 1908

• More detail about the collection at: http://biodiversitylibrary.org/collection/darwinlibrary

Page 3

Funded Project

• Digitize most heavily annotated volumes in Darwin’s Library

• Funded via JISC / NEH: Transatlantic Digitisation Collaboration Grant

• Partners:
  – Cambridge University Library
  – Natural History Museum, London
  – American Museum of Natural History

• Subaward to BHL via Missouri Botanical Garden

Page 4

http://www.biodiversitylibrary.org/page/34117347 Agassiz, L. Contributions to the natural history of the United States of North America.

Page 5

The scribbly bits

Charles Darwin’s Marginalia, vol. 1 (1990)

• Compiled by Mario Di Gregorio & Nicholas Gill

• Painstaking work to:
  – transcribe Darwin’s annotations & markings
  – assign subjects & concepts
  – crosslink marginalia & related annotations on loose slips or end papers

• Data encoded in purpose-driven form & format intended for print

Page 6

Digitization Considerations

• BHL had already digitized some of the titles through mass scanning

• Some materials couldn’t be scanned at CUL for fear of damage
  – Truly unique documents

• Higher per-page scanning cost at CUL for special handling

Compromise:

• Scan the most heavily annotated volumes at CUL
• Include existing content in BHL
• Scan “surrogates” via BHL mass scanning from NHM

Page 7

And then the fun began…

• Originally envisioned simple Flickr-like notes

http://www.flickr.com/photos/chrisfreeland/4928966390/

Page 8

And then the reality

• Realized true complexity of data parsing after getting Di Gregorio & Gill’s data

\n0015.v01.p01.c0117
|n0015.v01.p01.c0117:m01= 10—14 m / 13 w $ mere analogy $ / from_\n0015.v01.p01.f1000 w \m4a $ 117    On combinations of characters in old Forms $
|n0015.v01.p01.c0117:m01i an /b
|n0015.v01.p01.c0117:m01i fos /B/c
|n0015.v01.p01.c0117:m01i rsa- /b
|n0015.v01.p01.c0117:m01i tm /B/c
|n0015.v01.p01.c0117:m01j faz:bird /e
|n0015.v01.p01.c0117:m01j faz:dolphin /e
|n0015.v01.p01.c0117:m01j faz:fish, sauroid /e
|n0015.v01.p01.c0117:m01j faz:Ichthyosauri /e
|n0015.v01.p01.c0117:m01j faz:Pterodactyl /e
|n0015.v01.p01.c0117:m01j faz:reptile /e
|n0015.v01.p01.c0117:m01j tiz:ancient /D

Page 9

Structural markup instructions

+n[4-digits] starts a book; then
=a author
=t title
=e edition data
=v volume data
=p publication details
=d date
=l location [CUL (Cambridge University Library) or Down (Down House, Kent)]
=b Beagle-era / on board
=x book has a Darwin signature (S) and/or is inscribed to him
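The structural markup above could be read with a small parser along these lines. This is only a sketch, not the project's parsing code: the slide lists the field codes but not how a book header is laid out, so the one-line layout, the `parse_book_header` name, the output field names, and the sample header are all assumptions made for illustration.

```python
import re

# Field codes as listed on the slide; the output names are made up here.
FIELD_NAMES = {
    "a": "author",
    "t": "title",
    "e": "edition",
    "v": "volume",
    "p": "publication",
    "d": "date",
    "l": "location",
    "b": "beagle_era",
    "x": "signature_or_inscription",
}

# "+n" followed by four digits starts a book.
BOOK_START = re.compile(r"\+n(\d{4})")
# Assumption: a field begins with "=" plus one code letter, set off by whitespace.
FIELD_SPLIT = re.compile(r"(?:^|\s)=([atevpdlbx])(?=\s|$)")


def parse_book_header(text):
    """Return a dict for one book header (layout is an assumption)."""
    start = BOOK_START.match(text)
    if start is None:
        raise ValueError("header does not start with +n and four digits")
    record = {"title_number": start.group(1)}
    # re.split with a capture group yields [before, code, value, code, value, ...]
    parts = FIELD_SPLIT.split(text[start.end():])
    for code, value in zip(parts[1::2], parts[2::2]):
        record[FIELD_NAMES[code]] = value.strip()
    return record


# Hypothetical header, invented for this example; not taken from the real data.
SAMPLE_HEADER = "+n0015 =a Agassiz, L. =t Contributions to the natural history =l CUL =x S"
```

A value that itself contains something like " =a " would be split wrongly, so real data would likely need a stricter rule than this.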

Page 10

In the body of the item:

\n starts a new page or other piece
|n flags the other data entries

The 4-digit title number is followed by
.v?? volume number (00 = 'only')
.p?? part number

page or other piece designator
.b???r [roman-numbered front-matter]
.c???? [arabic-numbered page-count]
.d??? [end-matter with its own pagination]
.f???? [Darwin's final end-notes/slips:
    f0? end-note (f00 'only')
    f1? end-slip
    last 2 digits numbering the sides]

[A note or slip can have a 'head-note', flagged \H, describing the physical characteristics of the piece.]
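The designator rules above map naturally onto one regular expression. The sketch below is illustrative and is not the project's code: it assumes every "?" in the slide's patterns stands for a single digit, and the `parse_locator` name and the keys of the returned dict are invented here.

```python
import re

# Assumption: each "?" on the slide is exactly one digit.
LOCATOR = re.compile(
    r"n(?P<title>\d{4})"                          # 4-digit title number
    r"\.v(?P<volume>\d{2})"                       # volume number, 00 = 'only'
    r"\.p(?P<part>\d{2})"                         # part number
    r"\.(?P<piece>b\d{3}r|c\d{4}|d\d{3}|f\d{4})"  # page or other piece
)

PIECE_KINDS = {
    "b": "front matter",
    "c": "page",
    "d": "end matter",
    "f": "end-note or end-slip",
}


def parse_locator(text):
    """Split a locator such as n0015.v01.p01.c0117 into named parts."""
    match = LOCATOR.fullmatch(text)
    if match is None:
        raise ValueError(f"not a locator: {text!r}")
    piece = match.group("piece")
    kind = piece[0]
    digits = piece[1:].rstrip("r")  # drop the trailing r of .b???r
    result = {
        "title": match.group("title"),
        "volume": int(match.group("volume")),
        "part": int(match.group("part")),
        "piece_kind": PIECE_KINDS[kind],
        "piece_digits": digits,
    }
    if kind == "f":
        # f0? is an end-note, f1? an end-slip; the last two digits number the sides.
        result["piece_kind"] = {"0": "end-note", "1": "end-slip"}.get(
            digits[0], PIECE_KINDS["f"]
        )
        result["side"] = digits[2:]
    return result
```

For example, `parse_locator("n0015.v01.p01.c0117")` reports title 0015, volume 1, part 1, and a page piece with digits 0117, while `n0015.v01.p01.f1000` is classified as an end-slip.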

Page 11
Page 12

Subject indexing instructions

Page 13

\n0015.v01.p01.c0117
|n0015.v01.p01.c0117:m01= 10—14 m / 13 w $ mere analogy $ / from_\n0015.v01.p01.f1000 w \m4a $ 117    On combinations of characters in old Forms $
|n0015.v01.p01.c0117:m01i an /b
|n0015.v01.p01.c0117:m01i fos /B/c
|n0015.v01.p01.c0117:m01i rsa- /b
|n0015.v01.p01.c0117:m01i tm /B/c
|n0015.v01.p01.c0117:m01j faz:bird /e
|n0015.v01.p01.c0117:m01j faz:dolphin /e
|n0015.v01.p01.c0117:m01j faz:fish, sauroid /e
|n0015.v01.p01.c0117:m01j faz:Ichthyosauri /e
|n0015.v01.p01.c0117:m01j faz:Pterodactyl /e
|n0015.v01.p01.c0117:m01j faz:reptile /e
|n0015.v01.p01.c0117:m01j tiz:ancient /D
|n0015.v01.p01.c0117:m02= 14 u "Ichthyosauri" / @14 w $ ⸮ $
|n0015.v01.p01.c0117:m02i rsq /b
|n0015.v01.p01.c0117:m02j faz:Ichthyosauri /e
|n0015.v01.p01.c0117:m03= 22—24m / from_\n0015.v01.p01.f1000 w \m4a $ 117    On combinations of characters in old Forms $
|n0015.v01.p01.c0117:m03i fos /B/c
|n0015.v01.p01.c0117:m03i sph /c
|n0015.v01.p01.c0117:m03i tm /B/c
|n0015.v01.p01.c0117:m03j faz:fish /e
|n0015.v01.p01.c0117:m03j faz:reptile /e
|n0015.v01.p01.c0117:m03j tiz:ancient /D
|n0015.v01.p01.c0117:m04= 25 c "Crustacea" \a[corrected to `Cetacea']\c
|n0015.v01.p01.c0117:m04i rsz /b
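To give a feel for the parsing problem, here is a hedged sketch of splitting a run of this data into entries and grouping them by mark. The sample is copied from the marks m02 and m03 shown above. Reading ":mNN" as a mark number, "=" as the line carrying the transcription, and "i" / "j" as index lines is an interpretation of the sample, not something the slides define, and the `group_marks` name is made up.

```python
import re
from collections import defaultdict

# One "|n" entry: locator, ":m" plus two digits, one kind character, then a body
# that runs up to the next "|n" entry or the end of the input.
ENTRY = re.compile(
    r"\|(?P<locator>n\d{4}\.v\d{2}\.p\d{2}\.[bcdf]\d{3,4}r?)"
    r":m(?P<mark>\d{2})(?P<kind>[=ij])"
    r"\s*(?P<body>.*?)\s*(?=\|n\d{4}\.|\Z)",
    re.DOTALL,
)

# Copied from the slide (marks m02 and m03 only).
SAMPLE = (
    r'\n0015.v01.p01.c0117 '
    r'|n0015.v01.p01.c0117:m02= 14 u "Ichthyosauri" / @14 w $ ⸮ $ '
    r'|n0015.v01.p01.c0117:m02i rsq /b '
    r'|n0015.v01.p01.c0117:m02j faz:Ichthyosauri /e '
    r'|n0015.v01.p01.c0117:m03= 22—24m / from_\n0015.v01.p01.f1000 w \m4a '
    r'$ 117    On combinations of characters in old Forms $ '
    r'|n0015.v01.p01.c0117:m03i fos /B/c '
    r'|n0015.v01.p01.c0117:m03i sph /c '
    r'|n0015.v01.p01.c0117:m03i tm /B/c '
    r'|n0015.v01.p01.c0117:m03j faz:fish /e '
    r'|n0015.v01.p01.c0117:m03j faz:reptile /e '
    r'|n0015.v01.p01.c0117:m03j tiz:ancient /D'
)


def group_marks(stream):
    """Group entries by (locator, mark); the meaning of each kind is assumed."""
    marks = defaultdict(lambda: {"text": None, "index": []})
    for entry in ENTRY.finditer(stream):
        key = (entry["locator"], entry["mark"])
        if entry["kind"] == "=":
            marks[key]["text"] = entry["body"]
        else:
            marks[key]["index"].append((entry["kind"], entry["body"]))
    return dict(marks)
```

Splitting only on "|n" matters here: the "=" lines contain cross-references written as "from_\n0015...", so a naive split on "\n" would cut an annotation in half.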

Page 14

http://www.biodiversitylibrary.org/page/34117347 Agassiz, L. Contributions to the natural history of the United States of North America.

Page 15

<insert magic here>

Page 16

http://www.biodiversitylibrary.org/page/34117347

Page 17

<magic> = code

• Regular expressions & SQL inserts

• No UI for adding annotations, all data driven

• Parsing completed by:
  – Scholars & programmers, not generalists & enthusiasts

• Parsing code is reusable within project, unlikely to be useful outside due to data input
  – Purpose-driven, specific
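The "regular expressions & SQL inserts" approach can be sketched roughly as follows. This is not the BHL code and does not reflect the project's data model: the `annotation_index` table, its columns, the `load_index_entries` name, and the use of SQLite are assumptions chosen to keep the example self-contained. The sample lines are copied from the data shown earlier in the talk, and treating the trailing "/x" groups as separate codes is a guess at the structure.

```python
import re
import sqlite3

# One index entry: locator, mark number, kind (i or j), a term, then "/x" codes.
INDEX_ENTRY = re.compile(
    r"\|(?P<locator>n\d{4}\.v\d{2}\.p\d{2}\.[bcdf]\d{3,4}r?)"
    r":m(?P<mark>\d{2})(?P<kind>[ij])\s+"
    r"(?P<term>[^|]*?)\s+(?P<codes>(?:/\w)+)\s*(?=\||\Z)"
)

# Copied from the slide data (a few entries of mark m01).
SAMPLE = (
    "|n0015.v01.p01.c0117:m01i an /b "
    "|n0015.v01.p01.c0117:m01i fos /B/c "
    "|n0015.v01.p01.c0117:m01j faz:bird /e "
    "|n0015.v01.p01.c0117:m01j faz:fish, sauroid /e "
    "|n0015.v01.p01.c0117:m01j tiz:ancient /D"
)


def load_index_entries(conn, stream):
    """Parse index entries and insert them; returns the number of rows added."""
    # Hypothetical table, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotation_index "
        "(locator TEXT, mark TEXT, kind TEXT, term TEXT, codes TEXT)"
    )
    rows = [
        entry.group("locator", "mark", "kind", "term", "codes")
        for entry in INDEX_ENTRY.finditer(stream)
    ]
    # Parameterized inserts avoid quoting problems with terms like "fish, sauroid".
    conn.executemany("INSERT INTO annotation_index VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

With `conn = sqlite3.connect(":memory:")`, calling `load_index_entries(conn, SAMPLE)` loads the five sample entries, which can then be queried with ordinary SQL.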

Page 18

Data Model

http://code.google.com/p/bhl-bits/downloads

Page 19

Future

Extensibility

• Delivered the scholarly, nth degree option
• Can be reused for simpler annotations

Phase II

• New Darwin originals from CUL
• Replace surrogates with originals
• Refine user interface / user experience

Page 20

Outcomes & Perspective

• Incorporation of unique material of interest to many domains
  – Biology
  – Humanities
  – General public

• A glimpse into Darwin’s mind
  – And the minds of historians of science

Page 21

Acknowledgements

Transatlantic Digitisation Collaboration Grant, Phase 1 sponsored by:

• United Kingdom: JISC Joint Information Systems Committee of the Higher Education Funding Council of England & Wales (HEFCE) to Cambridge University Library and Natural History Museum (Award CCICP002)

• United States: NEH National Endowment for the Humanities to Darwin Manuscripts Project of the American Museum of Natural History, with subaward to the Missouri Botanical Garden (Award PX-50026-09)

Contributors

• Cambridge University Library (http://www.lib.cam.ac.uk)
• Natural History Museum, London (http://www.nhm.ac.uk)
• Darwin Manuscripts Project (http://darwin.amnh.org)

The project wishes to express its gratitude to William Huxley Darwin for permission to reproduce the Darwin manuscripts.

Page 22

Credits

• Edition of Darwin's annotations and other marks. Mario Di Gregorio and Nicholas Gill, updated by Gill and produced as part of the Darwin Manuscripts Project of the American Museum of Natural History. Adam Goldstein and Huw Jones served as bibliographers. David Kohn, PI

• Digitisation of original Darwin copies by Cambridge University Library. Grant Young, PI

• Digitisation of surrogate copies by the Library of the Natural History Museum (London). Jane Smith and Judith McGee

• Additional surrogates drawn from works digitised by member libraries of the Biodiversity Heritage Library and contributors to the Internet Archive.

• Transcription interface developed by the Biodiversity Heritage Library Technical Unit at Missouri Botanical Garden. Chris Freeland and Mike Lichtenberg


Page 23

http://www.biodiversitylibrary.org/page/33949861

Lyell, C. Principles of Geology. 1837.

Awesome

Page 24

Questions?

http://biodiversitylibrary.org/collection/darwinlibrary

Email: [email protected]

Twitter: @chrisfreeland

Chris Freeland
Director, Center for Biodiversity Informatics, Missouri Botanical Garden
Technical Director, Biodiversity Heritage Library