Archiving PDFs at the Orange County Register
Sharon Clairemont
Director, Content Research
Freedom Media Enterprises
Archiving PDFs at the Orange County Register
Why? How? How much?
The Orange County Register is a daily newspaper with a circulation of 360,000. We also have 25 weekly community papers and a weekly Spanish-language paper.
Introduction
Our purpose is to give you useful information for planning PDF archiving.
Archiving PDFs at the Orange County Register will serve as an example.
Topics of Discussion
• Why archive PDFs?
• How do you archive PDFs?
• Issues in archiving PDFs
• How to search PDFs
• Linking PDFs to page elements
Why archive PDFs?
• Revenue
• PhotoAccess / Showcase (www.ocregister.com/photos)
• Historical record
• Easy access to advertisements
• Contest entries
• Easy to share, transmit
• Links to related items on the page
In the beginning...
Choosing a pagination system will involve many others in your organization, but it needs you to represent archiving issues and requirements.
The process will be unique at your company.
The process at the Orange County Register...
• The Register uses CCI as a complete publishing system / editorial environment
CCI to archive process
• CCI archive script creates PDFs every morning
• HTML extract tells Distiller how to build the page; determines size/dpi; links are created between the PDF and elements on the page
• CCI script moves PDFs to “BasketsOutArchive”
• BasketsOutArchive has subfolders by product (section) and by date
• Perl script from archive picks up PDFs from CCI and moves them into archive
• Archivists’ quality control steps verify presence of PDFs as they archive each section.
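The hand-off from BasketsOutArchive into the archive can be sketched roughly as follows. The Register's own script is written in Perl and its details are not given in these slides, so this Python version is only an illustration. The product/date/file layout under the root folder, the function name `collect_pdfs`, and the example file names are all assumptions.

```python
import shutil
from pathlib import Path


def collect_pdfs(baskets_root, archive_root):
    """Move every PDF under baskets_root into archive_root.

    Assumes (hypothetically) a layout of product/date/page.pdf and
    preserves that relative path in the archive. Returns the new paths.
    """
    baskets_root = Path(baskets_root)
    archive_root = Path(archive_root)
    moved = []
    # sorted() builds the full listing before any file is moved
    for pdf in sorted(baskets_root.rglob("*.pdf")):
        relative = pdf.relative_to(baskets_root)  # e.g. News/2002-03-17/A01.pdf
        target = archive_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

Returning the list of moved files gives the archivist something concrete to check against when verifying that each section arrived.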
Issues in PDF archiving
• Size and storage
• Quality control
• Corrections
• Copyright
Size and storage
The Register stores PDFs at 200 dpi.
Total PDFs: 120,234 (since 10/00)
Avg. per day: 204
Total size: ~100 gigabytes
Avg. size / day: ~171 megabytes
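For planning purposes, the figures above can be turned into a rough per-page size and a yearly growth estimate. The sketch below uses only the numbers on this slide; the function names, the choice of 1,000 MB per GB, and the 365 publishing days per year are assumptions made for illustration.

```python
TOTAL_PDFS = 120_234    # from the slide (since 10/00)
PDFS_PER_DAY = 204      # from the slide
TOTAL_SIZE_GB = 100     # approximate, from the slide


def average_pdf_mb(total_gb, total_pdfs, mb_per_gb=1000):
    """Average size of one page PDF in megabytes."""
    return total_gb * mb_per_gb / total_pdfs


def daily_volume_mb(total_gb, total_pdfs, pdfs_per_day, mb_per_gb=1000):
    """Estimated megabytes added to the archive per day."""
    return average_pdf_mb(total_gb, total_pdfs, mb_per_gb) * pdfs_per_day


def yearly_growth_gb(daily_mb, days_per_year=365, mb_per_gb=1000):
    """Estimated gigabytes added per year (365 days is an assumption)."""
    return daily_mb * days_per_year / mb_per_gb


per_pdf = average_pdf_mb(TOTAL_SIZE_GB, TOTAL_PDFS)
per_day = daily_volume_mb(TOTAL_SIZE_GB, TOTAL_PDFS, PDFS_PER_DAY)
per_year = yearly_growth_gb(per_day)
print(f"{per_pdf:.2f} MB per PDF, {per_day:.0f} MB per day, {per_year:.0f} GB per year")
```

Under these assumptions each page PDF comes to a little under 1 MB and the daily volume lands close to the ~171 megabytes quoted on the slide, which suggests growth on the order of 60 gigabytes a year.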
Quality Control: Why and How
• Do it (no one else will)
• Assume the responsibility as you do for other data.
• Archivists can check pages as they archive each section.
• Missing or incorrect pages are easier to find and correct sooner rather than later.
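A presence check of this kind could be automated along the following lines. This is a minimal sketch, not the Register's procedure: it assumes a list of expected page file names is available for each section (for example from the page plan), and the function name `missing_pages` and the file names are hypothetical.

```python
def missing_pages(expected, archived):
    """Return the expected page names that are absent from archived.

    Comparison ignores case; results keep the order of `expected`.
    """
    present = {name.lower() for name in archived}
    return [name for name in expected if name.lower() not in present]


# Hypothetical section with three planned pages, one of which never arrived
expected_pages = ["A01.pdf", "A02.pdf", "A03.pdf"]
archived_pages = ["A01.pdf", "a03.pdf"]
print(missing_pages(expected_pages, archived_pages))  # ['A02.pdf']
```

Running such a check while each section is being archived surfaces a missing page the same morning, when it is still easy to regenerate.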
Corrections and copyright
• Page PDFs are treated the same as microfilm when it comes to corrections.
• The Register considers the page and its representations our own creation.
Searching PDFs at the Orange County Register
March 17, 2002
Questions?
dpi
Customers
Contests
Storage
Quality control
Process