Preserving the Smithsonian Institution’s Web Presence
Lynda Schmitz Fuhrig and Jennifer Wright
Oct. 14, 2011
Smithsonian Institution Archives Fair
The Mission of SI Archives
Appraise, acquire, and preserve the records of the Institution
Offer a range of research and reference services
Establish policy and provide expert guidance on record keeping practices
Create and promote products and services that broaden understanding of the Smithsonian
Provide professional archival and conservation expertise
Smithsonian’s First Home Page, 1995
The Smithsonian Today
Website and Social Media Registry
A “record” is any official recorded information, regardless of medium or characteristics, created, received, and maintained by a Smithsonian museum, office, or employee
Websites and social media accounts must be managed as records
Registry allows staff from across the Smithsonian to add and update information about all of their websites and social media accounts
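As a rough illustration of what such a registry might store, the sketch below models one entry per website or account and flags entries that have not been updated recently. The field names, the `Registry` class, and the 365-day staleness threshold are all hypothetical; the slides do not describe the registry's actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RegistryEntry:
    """One website or social media account (hypothetical fields)."""
    unit: str           # museum or office responsible for the site
    name: str           # human-readable name of the site or account
    url: str            # address of the site or account
    kind: str           # "website" or "social media"
    last_updated: date  # when staff last confirmed this information


class Registry:
    """In-memory stand-in for the registry, keyed by URL."""

    def __init__(self):
        self._entries = {}

    def add_or_update(self, entry):
        # Adding an entry with an existing URL replaces the old one.
        self._entries[entry.url] = entry

    def stale(self, today, max_age_days=365):
        # Entries not confirmed within max_age_days (assumed threshold).
        return [
            e for e in self._entries.values()
            if (today - e.last_updated).days > max_age_days
        ]
```

Keying on the URL means an update from staff overwrites the earlier entry rather than creating a duplicate.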
Appraising Records
All records must be appraised to determine their ultimate disposition
Records appraised based on administrative, legal, historical, and research value
Records with long-term value are transferred to Archives
Appraising Traditional Websites
Websites are public face of Smithsonian
Significant historical and research value
Constantly changing
Crawl annually and before and after major redesigns
Work with webmasters to determine if crawls should be more or less frequent
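The crawl policy above (one crawl a year, plus one before and one after each major redesign) can be sketched as a small scheduling function. The function name, the annual crawl date, and the seven-day margin around a redesign are assumptions made for illustration, not values given in the presentation.

```python
from datetime import date, timedelta


def planned_crawls(year, redesigns, annual_day=(1, 15), margin_days=7):
    """Return sorted crawl dates for one site in a given year.

    redesigns   -- launch dates of major redesigns (supplied by webmasters)
    annual_day  -- (month, day) of the yearly crawl; an assumed default
    margin_days -- days before and after a redesign to crawl; assumed
    """
    crawls = {date(year, *annual_day)}
    for launch in redesigns:
        crawls.add(launch - timedelta(days=margin_days))  # capture old design
        crawls.add(launch + timedelta(days=margin_days))  # capture new design
    return sorted(crawls)
```

For a site redesigned on June 1, 2011, `planned_crawls(2011, [date(2011, 6, 1)])` yields the annual crawl plus one crawl a week before and one a week after the launch.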
Appraising Social Media Accounts
All social media accounts are used differently
Each account appraised individually based on content
Accounts containing significant original content will be fully captured each year
Accounts consisting mostly of links to other resources will be captured occasionally to document existence
Method and frequency of capture may depend on terms of service and ability to avoid capturing non-Smithsonian content
Past Web Archiving Procedures
Files transferred from the Smithsonian’s IT office
HTTrack web crawler
Scripts used to create XHTML preservation files but very manual and time-consuming
Heritrix
Archival web crawler
Open source
Java
Developed by Internet Archive, National Library of Norway and National and University Library of Iceland
WARC
WARC – Web ARChive file format
International standard – ISO 28500:2009
Extension of the ARC format in use since 1996
Container format
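To make "container format" concrete: a WARC file is a sequence of records, each consisting of a version line, named header fields, a blank line, and a content block. The minimal sketch below builds and reads back a single `resource` record using only the Python standard library. The helper names are hypothetical, it skips compression and most optional fields, and it is not how Heritrix itself writes WARC files.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, payload, content_type="text/html"):
    """Serialize one WARC/1.0 'resource' record as bytes (simplified)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Each record ends with two CRLFs after the content block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def read_warc_record(data):
    """Parse the first record in data; returns (version, fields, payload)."""
    head, _, rest = data.partition(b"\r\n\r\n")  # headers end at first blank line
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return lines[0], fields, rest[:length]
```

Because every record carries its own `Content-Length`, many captured pages can be concatenated into one file and still be read back one at a time.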
Crawling in Heritrix
STRI website in 1995
SIA Accession 05-032
Viewing a Crawl
More To Do
Social Media
Third-party issues
Privacy concerns
Different tools
Lessons Learned
In-house archiving takes time
No one-size-fits-all solution
Master site registry requires regular updating
Contacts and Resources
Lynda Schmitz Fuhrig
Digital Services [email protected]
Jennifer Wright
Archives and Information Management [email protected]
Smithsonian Institution Archives website: http://siarchives.si.edu