Organized Digital Library Development … from the Bottom Up

University of Alabama Libraries

Jody L. DeRidder

jlderidder@ua.edu

Image courtesy of Life Magazine


Libraries organize information… primarily books.

Trinity College Library, Dublin, as captured by Candida Höfer in her book Libraries (Thames and Hudson, UK: 2005).

Photo credit: Flickr user "Libby", used with permission (Creative Commons)

If libraries organize books… Why not digital files??

It’s all information!

A digital object may belong in MANY potential virtual collections…

… but it originated from ONE SINGLE ANALOG collection. Provenance trumps all!

Slavery, African Americans, Sheet Music, Tombigbee River, Southern History … and more

“Gum Tree Canoe,” Published by G.P. Reed (Boston: 1847). Wade Hall collection of Southern History and Culture, Hoole Special Collections, University of Alabama Libraries.

Bringing Order to Chaos

University of Alabama Libraries

Holder ID: u0003

Collection ID: 0000023

Item ID: 0000007

Sequence ID: 0005

Archival File: u0003_0000023_0000007_0005.tif

1) Clarity

2) Low cost

3) Simple

4) Extensible

u0003_0001980_0000001 is the first digitized item in the MSS 1980 collection
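The identifier scheme above can be sketched in a few lines of Python. This is only an illustration: the segment widths (a letter plus four digits, seven digits, seven digits, four digits) are inferred from the example u0003_0000023_0000007_0005.tif and are an assumption, and the names `Identifier`, `parse_identifier` and `build_identifier` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Identifier(NamedTuple):
    holder: str
    collection: str
    item: str
    sequence: Optional[str]  # None for item-level identifiers


# Assumed widths, taken from the slide's example; not a published specification.
_PATTERN = re.compile(r"^([a-z]\d{4})_(\d{7})_(\d{7})(?:_(\d{4}))?$")


def parse_identifier(name: str) -> Identifier:
    """Split an identifier (with or without a file extension) into its segments."""
    stem = name.rsplit(".", 1)[0] if "." in name else name
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not a recognized identifier: {name!r}")
    return Identifier(*match.groups())


def build_identifier(holder: str, collection: str, item: str,
                     sequence: Optional[str] = None) -> str:
    """Join the segments back together with underscores."""
    parts = [holder, collection, item]
    if sequence is not None:
        parts.append(sequence)
    return "_".join(parts)
```

For example, `parse_identifier("u0003_0001980_0000001")` yields holder `u0003`, collection `0001980`, item `0000001`, and no sequence segment.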

HOLDER ID

COLLECTION ID

The Digitization Working Area…

Collection folders are named for the collection identifier. Allowed subfolders include:

Admin, Metadata, Scans, Transcripts

Compound objects have their own subfolders for pages, named for the item.
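A minimal sketch of how the working-area convention could be set up and checked with a script. The function names are hypothetical, and creating all four allowed subfolders up front is an assumption; the slides only say which subfolders are allowed.

```python
from pathlib import Path
from typing import List

# Subfolder names listed on the slide.
ALLOWED_SUBFOLDERS = {"Admin", "Metadata", "Scans", "Transcripts"}


def create_collection_folder(working_area: Path, collection_id: str) -> Path:
    """Create a collection folder, named for its identifier, with the allowed subfolders."""
    collection_dir = working_area / collection_id
    for name in sorted(ALLOWED_SUBFOLDERS):
        (collection_dir / name).mkdir(parents=True, exist_ok=True)
    return collection_dir


def unexpected_subfolders(collection_dir: Path) -> List[str]:
    """Return top-level subfolder names that are not on the allowed list."""
    return sorted(
        entry.name
        for entry in collection_dir.iterdir()
        if entry.is_dir() and entry.name not in ALLOWED_SUBFOLDERS
    )
```

Running `unexpected_subfolders` before content is moved out of the working area is one cheap way to catch stray folders early.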

And a Collection Folder in the Working Area

Bringing Content Up to the Level Of the WEB!!! Greater Usability and Access == Longer Life

Images … ImageMagick: http://www.imagemagick.org (it’s free!)

Protected archive area: u0003/0000023/0000007/0005/u0003_0000023_0000007_0005.tif

Web accessible area: u0003/0000023/0000007/0005/ (thumb and large-size derivatives)
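The mirrored layout of the archive and web areas can be sketched as a function that builds ImageMagick command lines for the two derivatives. The pixel sizes, the `_thumb`/`_large` file suffixes and the function name are hypothetical; the slides do not specify them. The sketch only builds the commands and leaves running them (for example with `subprocess.run`) to the caller.

```python
from pathlib import PurePath
from typing import List

# Hypothetical sizes and suffixes; the slides only mention "thumb and large-size".
DERIVATIVES = {"thumb": "128x128", "large": "2048x2048"}


def derivative_commands(archive_file: PurePath, archive_root: PurePath,
                        web_root: PurePath) -> List[List[str]]:
    """Build one ImageMagick command per derivative, mirroring the archive path."""
    # Same holder/collection/item/sequence directories, under the web root.
    out_dir = web_root / archive_file.relative_to(archive_root).parent
    commands = []
    for suffix, geometry in DERIVATIVES.items():
        target = out_dir / f"{archive_file.stem}_{suffix}.jpg"
        # "convert" is the classic ImageMagick entry point; ImageMagick 7 also offers "magick".
        commands.append(["convert", str(archive_file), "-resize", geometry, str(target)])
    return commands
```

Because the web path is derived from the archive path, no lookup table is needed to find the derivatives for a given archival file.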

Audio … LAME: http://lame.sourceforge.net

OCR … TESSERACT: http://code.google.com/p/tesseract-ocr/


Identification, Organization and Consistency

Each segment of numbers (Holder ID, Collection ID, Item ID, Sequence ID) is used in the directory structure.

The directory for u0003_0000003_0002_001.tif is simply:

u0003/0000003/0002/001/
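The filename-to-directory rule is simple enough to express in a few lines. A minimal sketch, with `directory_for` as a hypothetical name; it splits on underscores and makes no assumption about segment widths.

```python
from pathlib import PurePosixPath


def directory_for(filename: str) -> PurePosixPath:
    """Map a filename to its directory: one path level per underscore-separated segment."""
    stem = PurePosixPath(filename).stem  # drop the extension, e.g. ".tif"
    return PurePosixPath(*stem.split("_"))
```

Called with `"u0003_0000003_0002_001.tif"`, this returns the relative path `u0003/0000003/0002/001`, which a script can join onto whichever storage root applies.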

Dropping the Technical Metadata in… where it belongs

Makes METS creation a Piece of Cake!

(and redundant!)

Using FITS, the File Information Tool Set developed by Harvard which encapsulates JHOVE, DROID, ExifTool and other tools: http://code.google.com/p/fits/
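One way to drop the technical metadata next to the file it describes is to have a script build a FITS command per archival file. This is a sketch under assumptions: it relies on the FITS command-line launcher accepting `-i` (input) and `-o` (output), the launcher name `fits.sh` may differ per installation, and the `.fits.xml` output name is hypothetical rather than the convention used at Alabama.

```python
from pathlib import PurePath
from typing import List


def fits_command(archive_file: PurePath, fits_script: str = "fits.sh") -> List[str]:
    """Build a FITS command that writes its XML output beside the archival file."""
    # Hypothetical naming: same directory, same stem, ".fits.xml" suffix.
    output = archive_file.with_name(archive_file.stem + ".fits.xml")
    return [fits_script, "-i", str(archive_file), "-o", str(output)]
```

Keeping the output in the same directory means the metadata travels with the file whenever the directory is copied or preserved.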

An Example of the Lowest-Cost Model: The Alabama Digital Preservation Network http://www.adpn.org/

http://www.lockss.org/

Lots of Copies Keeps Stuff Safe!!

storage area

Simple, Clear Hierarchical Organization:

Holder ID / Collection ID / Item ID / Sequence ID

http://acumen.lib.ua.edu

ACCESS! Via Acumen

(also free!)

XML agnostic

No ingest

No metadata modifications

All content easily accessible

Open to search engines

Now it’s organized. But can users find what they need?

Trinity College Library, Dublin, as captured by Candida Höfer in her book Libraries (Thames and Hudson, UK: 2005).

Usability Testing

Participant number                        1  2  3  4  5  6  7  8  9 11 12 13 14 15 16 17 18 19 20 21
Educational status*                       G  G  U  G  G  G  S  G  U  G  U  U  U  U  U  U PG  G  G  G
Educational background in history         -  X  -  X  -  X  -  -  -  -  -  -  -  -  -  X  -  -  -  X
Previous special collections experience   X  X  -  X  X  X  X  -  -  -  X  -  X  -  X  X  X  -  -  X
Previous digital collections experience   X  X  -  X  X  X  X  X  X  X  X  X  X  X  X  X  -  X  -  X
English as a second language              -  -  -  X  -  -  -  -  -  -  X  X  -  -  X  -  -  -  X  -

* U = Undergraduate, G = Graduate student, PG = Post-graduate volunteer, S = University staff

http://transcribe.lib.ua.edu

http://tagit.lib.ua.edu


S. R. Ranganathan (1931), paraphrased:

Information is for use.

Every user his / her information.

Every information its user.

Save the time of the user.

The library is a growing organism.

Remember why we’re here…

Image: jscreationzs / FreeDigitalPhotos.net