Cold, Dark, and Lonely: An Archive Moves Online

Bryan Beecher, IT Director, ICPSR

Slide deck for my presentation at the All About Repositories webinar on 10/14/2009.

Page 1: Cold, Dark, and Lonely: An Archive Moves Online

Cold, Dark, and Lonely
An Archive Moves Online
Bryan Beecher
IT Director
ICPSR

Page 2: Cold, Dark, and Lonely: An Archive Moves Online

What’s ICPSR?

• Inter-university Consortium for Political and Social Research

• Clients
  – Higher education
  – US Government
  – Our “hot, flat, and crowded” world

• In business since 1962

Page 3: Cold, Dark, and Lonely: An Archive Moves Online

What do we do?

• Acquire, curate, and deliver social science data to researchers, students, policy-makers, etc.
  – JSTOR of data

• Cover many different fields
  – Political Science, Economics, Sociology, Demography, Criminal Justice, and many more

Page 4: Cold, Dark, and Lonely: An Archive Moves Online

What content do we curate?

• Primarily survey data
  – Also aggregate government data (such as Census data)

• Tabular
  – Rows = respondents
  – Columns = variables

• SAS, SPSS, Stata, even Excel

Page 5: Cold, Dark, and Lonely: An Archive Moves Online

ICPSR and OAIS

• Clients deposit data (ingest)

• ICPSR normalizes content into plain text data (ASCII, Unicode) and “setups” for stat pkgs + adds metadata (ingest + data mgmt)

• Preserves content (archival storage)

• Makes it available to others (access)
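
The normalization step above can be illustrated with a minimal sketch. It assumes a tab-delimited, one-row-per-respondent text layout; the slides do not say which plain-text layout ICPSR actually uses, and the variable names and values here are made up for illustration.

```python
import csv
import io


def normalize_to_text(variables, respondents):
    """Write a table (rows = respondents, columns = variables) as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(variables)  # header row: one column per variable
    for row in respondents:
        if len(row) != len(variables):
            raise ValueError("row width does not match the number of variables")
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical survey extract: three variables, two respondents.
variables = ["CASEID", "AGE", "VOTED"]
respondents = [[1, 34, "yes"], [2, 58, "no"]]
text = normalize_to_text(variables, respondents)
print(text)
```

A plain-text copy like this does not depend on any one statistical package, which is presumably why it is paired with per-package “setups” that describe how to read it.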

Page 6: Cold, Dark, and Lonely: An Archive Moves Online

Access

• Mechanisms have evolved over time
• Tapes + USPS
• FTP
• Gopher
• Web

Page 7: Cold, Dark, and Lonely: An Archive Moves Online

Archival Storage

• Historically kept two copies on tape
  – Off-line, local (Ann Arbor, MI)

• Worked, but
  – Expensive
  – Cannot browse
  – Are the bits OK?

• “The Warehouse”…
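
Once content is online, the “Are the bits OK?” question can be answered routinely with a fixity check. The sketch below is one common approach, not necessarily the one ICPSR uses: record a SHA-256 digest per file, then recompute and compare later. The manifest structure and the file name are assumptions for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Return paths that are missing or whose digest differs from the manifest."""
    return [
        path
        for path, expected in manifest.items()
        if not os.path.isfile(path) or sha256_of(path) != expected
    ]


# Demonstration on a throwaway file (hypothetical name).
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "study-0001-data.txt")
    with open(path, "wb") as handle:
        handle.write(b"1\t34\tyes\n")
    manifest = {path: sha256_of(path)}   # digest recorded at ingest time
    clean = check_fixity(manifest)       # nothing has changed yet
    with open(path, "ab") as handle:
        handle.write(b"x")               # simulate silent corruption
    damaged = check_fixity(manifest)     # the altered file is reported
```

Offline tape makes this kind of check expensive, since every verification means mounting and reading the media; on disk it can run as a scheduled job.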

Page 8: Cold, Dark, and Lonely: An Archive Moves Online
Page 9: Cold, Dark, and Lonely: An Archive Moves Online

But then in 2006…

• Created Chief Preservation Officer role
  – Nancy McGovern

• Assigned Archival Storage engineering and operations to the IT shop
  – Bryan Beecher

Page 10: Cold, Dark, and Lonely: An Archive Moves Online

2006 - 2008

• Digital Preservation Management program begins

• Warehouse cleared, closed
• Tapes read, checked, destroyed
• 6TB of content over 600k unique files
• Lots of files
  – Not so “cold” and “dark” any more…

Page 11: Cold, Dark, and Lonely: An Archive Moves Online
Page 12: Cold, Dark, and Lonely: An Archive Moves Online

Fedora, Part 1

• Lots of files, not so much metadata

• Always know the aggregate object (“study” number)

• Use simple Fedora Content Model (the data “keepsake”) to store the content

• Small step from “files” to “objects”
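
The step from “files” to “objects” can be pictured as grouping files under the one piece of metadata that is always known, the study number. This is a rough sketch only: `StudyObject`, the study numbers, and the file names are hypothetical, and it does not show the Fedora content model itself.

```python
from dataclasses import dataclass, field


@dataclass
class StudyObject:
    """A simple aggregate: one study number and the files that belong to it."""
    study_number: str
    files: list = field(default_factory=list)


def group_by_study(records):
    """Group (study_number, filename) pairs into one StudyObject per study."""
    objects = {}
    for study_number, filename in records:
        obj = objects.setdefault(study_number, StudyObject(study_number))
        obj.files.append(filename)
    return objects


# Hypothetical inventory of files read back from storage.
records = [
    ("0001", "0001-data.txt"),
    ("0001", "0001-setup.sps"),
    ("0002", "0002-data.txt"),
]
objects = group_by_study(records)
```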

Page 13: Cold, Dark, and Lonely: An Archive Moves Online

Fedora, Part 2

• Would really like “smarter” objects
  – Strongly typed
  – Well defined relationships
  – Rich services

• Definitely possible, particularly for more modern content (post 2002)

• If only we had the time and money…

Page 14: Cold, Dark, and Lonely: An Archive Moves Online

NSF EAGER grant

• EArly-concept Grants for Exploratory Research
  – Eighteen months for 1.5 people

• Deliverables
  – CMA for social science data and docs
  – Packaging tools to create FOXML
  – Nifty SDeps and SDefs
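
To give a feel for what a FOXML packaging tool produces, here is a simplified sketch that builds a skeletal FOXML 1.1 document for one study. It is not the tool from the grant. The PID, label, datastream ID, and URL are hypothetical, and the element and attribute names follow the FOXML 1.1 schema as commonly documented; validate against the schema before ingesting anything built this way.

```python
import xml.etree.ElementTree as ET

FOXML_NS = "info:fedora/fedora-system:def/foxml#"
MODEL_NS = "info:fedora/fedora-system:def/model#"
ET.register_namespace("foxml", FOXML_NS)


def q(tag):
    """Qualify a tag name with the FOXML namespace."""
    return f"{{{FOXML_NS}}}{tag}"


def build_foxml(pid, label, datastreams):
    """Build a skeletal FOXML document; datastreams are (id, mime type, URL) tuples."""
    root = ET.Element(q("digitalObject"), {"VERSION": "1.1", "PID": pid})
    props = ET.SubElement(root, q("objectProperties"))
    ET.SubElement(props, q("property"), {"NAME": MODEL_NS + "state", "VALUE": "Active"})
    ET.SubElement(props, q("property"), {"NAME": MODEL_NS + "label", "VALUE": label})
    for ds_id, mime_type, url in datastreams:
        # CONTROL_GROUP "M" asks the repository to manage a copy of the content.
        stream = ET.SubElement(
            root,
            q("datastream"),
            {"ID": ds_id, "STATE": "A", "CONTROL_GROUP": "M", "VERSIONABLE": "true"},
        )
        version = ET.SubElement(
            stream,
            q("datastreamVersion"),
            {"ID": f"{ds_id}.0", "MIMETYPE": mime_type, "LABEL": ds_id},
        )
        ET.SubElement(version, q("contentLocation"), {"TYPE": "URL", "REF": url})
    return ET.tostring(root, encoding="unicode")


# Hypothetical study object with a single plain-text data file.
xml_text = build_foxml(
    "demo:study-0001",
    "Example study",
    [("DATA", "text/plain", "http://example.org/study-0001/data.txt")],
)
print(xml_text)
```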

Page 15: Cold, Dark, and Lonely: An Archive Moves Online

Thank you!

Bryan Beecher
techaticpsr.blogspot.com