Digitization of Visual Resources
Jenn Riley
Metadata Librarian
IU Digital Library Program
October 21, 2004 L597: Humanities Computing
Technical overview
Analog to digital conversion
Resolution
Bit depth
Color representation
Reflectivity and polarity
Compression
Analog to digital conversion
Image is converted to a series of pixels laid out in a grid
Each pixel has a specific color, represented by a sequence of 1s and 0s
Pixel-based images are called “raster” images or “bitmaps”
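As a rough illustration of the raster idea above, a tiny bitonal image can be modeled as a grid of 1s and 0s. The 5 x 5 pattern and the helper name `render` are invented for this sketch and do not come from the slides.

```python
# A hypothetical 5 x 5 bitonal raster: each pixel is one bit (1 = black, 0 = white).
GRID = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]


def render(grid):
    """Return the grid as text, one character per pixel ('#' = 1, '.' = 0)."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in grid)


print(render(GRID))
```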
Resolution (1)
Often referred to as “dpi” or “ppi”
RATIO of number of pixels captured per inch of original photo size
8x10 print scanned at 300ppi = 2400 x 3000 pixels
35mm slide (24x36mm!) scanned at 300ppi ≈ 283 x 425 pixels
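The arithmetic behind these figures can be sketched as follows: pixel dimensions are the original's size in inches multiplied by the scanning resolution. The function names are illustrative only.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm):
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def pixel_dimensions(width_in, height_in, ppi):
    """Pixel dimensions of a scan, given the original's size in inches and the ppi."""
    return round(width_in * ppi), round(height_in * ppi)


# 8x10 inch print at 300 ppi
print(pixel_dimensions(8, 10, 300))  # (2400, 3000)

# 35mm slide frame (24 x 36 mm) at 300 ppi
print(pixel_dimensions(mm_to_inches(24), mm_to_inches(36), 300))  # (283, 425)
```

The same ppi setting yields far fewer pixels from a small original, which is why slides and negatives need much higher scanning resolutions than prints.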
Resolution (2)
“Spatial resolution” refers to pixel dimensions of image, e.g., 3000 x 2400 pixels
Flatbed and film scanners have a fixed focus, so they know how big the original is; digital cameras don’t
Bit depth (1)
Refers to number of bits (binary digits, places for zeroes and ones) devoted to storing color information about each pixel
1 bit (1) = 2^1 = 2 shades (“bitonal”)
2 bit (01) = 2^2 = 4 shades
4 bit (0010) = 2^4 = 16 shades
8 bit (11010001) = 2^8 = 256 shades (“grayscale”)
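The relationship above is simply 2 raised to the number of bits. A minimal sketch, with `shades` as an illustrative name:

```python
def shades(bit_depth):
    """Number of distinct values a pixel can take at the given bit depth."""
    return 2 ** bit_depth


for bits in (1, 2, 4, 8):
    print(f"{bits}-bit: {shades(bits)} shades")
```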
Bit depth (2)
1 bit (black & white)
2 bit (4 colors)
4 bit (16 colors)
8 bit (256 colors)
Color representation
RGB
  Scanners generally have sensors for Red, Green, and Blue
  Each of these “channels” is stored separately in the digital file
  8 bits for each of 3 channels = 24 bit color
CMYK (Cyan, Magenta, Yellow and Black) is used for high-end “pre-press” printing purposes
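To make "8 bits for each of 3 channels = 24 bit color" concrete, the sketch below packs three 8-bit channel values into one 24-bit number and estimates the size of uncompressed pixel data. Packing into a single integer is a common in-memory convention used here for illustration, not a description of how any particular file format lays out its channels; the size estimate ignores file headers and row padding.

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit channel values into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 8 bits (0-255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel):
    """Split a 24-bit integer back into its (red, green, blue) channels."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Approximate size of raw pixel data (no headers, no row padding)."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


print(hex(pack_rgb(255, 128, 0)))            # 0xff8000
print(uncompressed_bytes(2400, 3000, 24))    # 21600000 bytes, about 21.6 MB
```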
Reflectivity and polarity
               Positive                      Negative
Reflective     Paper, photographic prints
Transmissive   Slide film                    Negative film
Compression
Makes files smaller for storage
Files must be decompressed for viewing
Lossless
Lossy
  “visually lossless”
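A small sketch of the lossless case, using Python's standard `zlib` module as a stand-in for whatever lossless scheme an image format actually uses. The sample bytes are made up: long runs of identical 8-bit values, which compress well.

```python
import zlib

# Hypothetical 8-bit grayscale pixel data with long runs of identical values.
raw = bytes([255] * 2000 + [0] * 500 + [128] * 1500)

compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)

print(len(raw), "bytes raw ->", len(compressed), "bytes compressed")
print("identical after round trip:", restored == raw)
```

With lossless compression the decompressed data matches the original byte for byte. Lossy compression gives up that guarantee in exchange for smaller files.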
Technical questions?
Analog to digital conversion
Resolution
Bit depth
Color representation
Reflectivity and polarity
Compression
Setting specifications
Standards & best practices
Capture once, use many
Determine purpose
Resolution
Bit depth & color
File formats
Quality control
Standards & best practices
NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access
Moving Theory into Practice
  Book
  Online tutorial
California Digital Library Digital Image Format Standards
Western States Digital Imaging Best Practices
Capture once, use many
Create master image when scanning
  Capture all “important” information
  Meets all foreseeable needs
  For long-term storage and later use
Create derivatives for specific uses later
  Web delivery
  Printing
  Publication
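One way to picture the master/derivative relationship: the master's pixel dimensions are fixed at capture, and each derivative is scaled down from it. The derivative names and target sizes below are hypothetical examples, not values from the slides.

```python
# Hypothetical derivative targets, given as maximum pixels on the long side.
DERIVATIVES = {"thumbnail": 150, "web": 600, "print": 2400}


def derivative_size(width_px, height_px, max_long_side_px):
    """Scale dimensions down to fit the target long side, keeping the aspect ratio.

    Masters that are already small enough are returned unchanged (no upscaling).
    """
    long_side = max(width_px, height_px)
    if long_side <= max_long_side_px:
        return width_px, height_px
    new_width = max(1, round(width_px * max_long_side_px / long_side))
    new_height = max(1, round(height_px * max_long_side_px / long_side))
    return new_width, new_height


master = (2400, 3000)
for name, target in DERIVATIVES.items():
    print(name, derivative_size(*master, target))
```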
Determine purpose
Nature of materials
  Artifactual
  For content only
Capture all important information
  But what’s “important”?
  Not always “what people can see”
Resolution (1)
Higher is not always better
Scan at highest resolution necessary to achieve your stated purpose, no higher
chart from Cornell’s online digital imaging tutorial: <http://www.library.cornell.edu/preservation/tutorial/conversion/conversion-03.html>
Resolution (2)
Color photographic materials
  3000-6000 pixels on the long side
  24-bit RGB
B/W photographic materials
  3000-6000 pixels on the long side
  8-bit grayscale (unless sepia tone is “important”)
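Because the target is stated in pixels on the long side, the scanner's ppi setting has to be worked out from the size of each original. A rough sketch, with `required_ppi` as an illustrative helper name:

```python
import math

MM_PER_INCH = 25.4


def required_ppi(long_side_inches, target_long_side_px):
    """Smallest whole-number ppi that yields at least the target pixel count."""
    return math.ceil(target_long_side_px / long_side_inches)


# 8x10 inch print, 3000 pixels on the long side
print(required_ppi(10, 3000))                 # 300

# 35mm slide (36 mm long side), 3000 and 6000 pixels on the long side
print(required_ppi(36 / MM_PER_INCH, 3000))   # 2117
print(required_ppi(36 / MM_PER_INCH, 6000))   # 4234
```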
Resolution comparison (1)
Resolution comparison (2)
600 dpi 300 dpi
Bit depth & color
Match current photo or match original scene
Final master images should be 8 bits per channel (8-bit grayscale, 24-bit RGB); some specialized projects use higher bit depths
Any color adjustments & other processing should be done in scanning software before final scan is done
Use almost the full tonal range; avoid “clipping”
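A simple way to check for clipping in an 8-bit channel is to count how many pixels sit at the extreme values 0 and 255. This is a sketch under that assumption; what fraction counts as acceptable is a project decision, not something the slides specify.

```python
def tonal_range(pixels):
    """Darkest and lightest values present in a list of 8-bit pixel values."""
    return min(pixels), max(pixels)


def clipped_fraction(pixels, low=0, high=255):
    """Fraction of pixels at or beyond the ends of the tonal range."""
    if not pixels:
        raise ValueError("no pixel values given")
    clipped = sum(1 for value in pixels if value <= low or value >= high)
    return clipped / len(pixels)


# Hypothetical scans: one uses most of the range, one is badly clipped.
good_scan = list(range(5, 250))
bad_scan = [0] * 10 + [255] * 30 + [100] * 60

print(tonal_range(good_scan), clipped_fraction(good_scan))  # (5, 249) 0.0
print(tonal_range(bad_scan), clipped_fraction(bad_scan))    # (0, 255) 0.4
```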
Master file formats
TIFF (uncompressed)
  Virtually unanimously recommended by digital imaging best practices
  “De facto” standard
JPEG2000
  ISO/IEC IS 15444-1 | ITU-T T.800
  Not patent-free
  Up-and-coming but not quite there yet
  Supports embedded metadata
  Uses wavelet-based compression
Why not JPEG?
JPEG files are lossy-compressed again every time they are saved
low compression, high quality
high compression, low quality
Delivery file formats
Photographic materials: JPEG
Text, line drawings: GIF
PNG?
Quality control
Essential part of every digitization project
Objective criteria
  Can be automated
  Can check all items
Subjective criteria
  Require human checks
  Must sample
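The split between the two kinds of criteria might look like this in practice: objective checks run against every image, while a random sample is pulled for human review. The specification values, record fields, and function names are all hypothetical.

```python
import random

# Hypothetical project specification for master images.
SPEC = {"min_long_side_px": 3000, "bits_per_channel": 8, "format": "TIFF"}


def objective_failures(image, spec=SPEC):
    """Return the list of spec violations for one image record (a plain dict)."""
    failures = []
    if max(image["width_px"], image["height_px"]) < spec["min_long_side_px"]:
        failures.append("long side too small")
    if image["bits_per_channel"] != spec["bits_per_channel"]:
        failures.append("wrong bit depth")
    if image["format"] != spec["format"]:
        failures.append("wrong file format")
    return failures


def sample_for_review(images, fraction=0.1, seed=None):
    """Pick a random subset of images for human (subjective) inspection."""
    if not images:
        return []
    count = max(1, round(len(images) * fraction))
    return random.Random(seed).sample(images, count)


good = {"width_px": 2400, "height_px": 3000, "bits_per_channel": 8, "format": "TIFF"}
bad = {"width_px": 800, "height_px": 1000, "bits_per_channel": 8, "format": "JPEG"}

print(objective_failures(good))  # []
print(objective_failures(bad))   # ['long side too small', 'wrong file format']
print(len(sample_for_review([good] * 20, fraction=0.1, seed=1)))  # 2
```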
Specifications questions?
Standards & best practices
Capture once, use many
Determine purpose
Resolution
Bit depth & color
File formats
Quality control
More information
NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access
Moving Theory into Practice
  Book
  Online tutorial
IU DLP Use of Digital Imaging Standards & Best Practices
Visual Resources Metadata in Libraries
Jenn Riley
Metadata Librarian
IU Digital Library Program
What is metadata?
“Data about data”
A better definition
Other characteristics
  Structure
  Control
Origin
  Machine-generated
  Human-generated
In practice, often grows to cover data and meta-metadata
Levels of control
Data structure standards
Data content standards
Creating metadata
HTML <meta> tags
Spreadsheets
Databases
XML
Digital library content management systems
  CONTENTdm
  Greenstone
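As an example of the first option, a few descriptive fields can be written out as HTML <meta> tags. The `DC.` prefix follows the common convention for embedding Dublin Core in HTML; the record values are taken from this presentation's own title slide, and the helper name `meta_tags` is illustrative.

```python
from html import escape


def meta_tags(record):
    """Render a dict of name/value pairs as HTML <meta> elements, one per line."""
    lines = []
    for name, value in record.items():
        lines.append(
            f'<meta name="{escape(name, quote=True)}" '
            f'content="{escape(value, quote=True)}">'
        )
    return "\n".join(lines)


record = {
    "DC.title": "Digitization of Visual Resources",
    "DC.creator": "Jenn Riley",
    "DC.date": "2004-10-21",
}
print(meta_tags(record))
```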
Types of metadata
Descriptive metadata
Administrative metadata
  Technical metadata
  Preservation metadata
  Rights metadata
Structural metadata
Purposes of descriptive metadata
Description
Discovery
General descriptive metadata standards
Dublin Core [example]
  Unqualified
  Qualified
MARC/AACR2 [example]
  MARCXML
MODS [example]
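A minimal sketch of an unqualified Dublin Core record built with Python's standard XML library. The `dc` namespace URI is the published one for the 15 core elements; the `record` wrapper element and the sample values are assumptions made for illustration, since Dublin Core itself does not prescribe a container.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def simple_dc_record(fields):
    """Build an unqualified Dublin Core record.

    `fields` is a list of (element, value) pairs, so elements can repeat.
    """
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


record = simple_dc_record([
    ("title", "Digitization of Visual Resources"),
    ("creator", "Jenn Riley"),
    ("date", "2004-10-21"),
    ("subject", "Digitization"),
    ("subject", "Metadata"),
])
print(ET.tostring(record, encoding="unicode"))
```

Passing the fields as a list of pairs rather than a dict keeps repeated elements such as `subject` intact.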
Visual resources metadata standards
CDWA [example]
VRA Core [example]
CCO
Technical metadata
For recording technical aspects of digital objects
Of use for long-term maintenance of data
Standards for still images
  NISO Z39.87: Data Dictionary – Technical Metadata for Digital Still Images
  MIX
Structural metadata
For creating a logical structure between digital objects
  Multiple copies of same bibliographic item
  Multiple pages within item
  Grouping of pages into sections
  Multiple sizes of each page
METS is the current primary schema
METS
metsHdr
dmdSec
amdSec
  techMD
  rightsMD
  sourceMD
  digiprovMD
fileSec
structMap
structLink
behaviorSec
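The section names above nest as shown in this skeleton, built with Python's standard XML library. It only creates empty containers in the METS namespace to show the overall shape; it is not a schema-valid METS document, and the ID values are placeholders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(name):
    """Qualify an element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def mets_skeleton():
    """Create the top-level METS sections as empty containers."""
    mets = ET.Element(q("mets"))
    ET.SubElement(mets, q("metsHdr"))
    ET.SubElement(mets, q("dmdSec"), ID="DMD1")  # placeholder ID
    amd = ET.SubElement(mets, q("amdSec"))
    # The four kinds of administrative metadata live inside amdSec.
    for name in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, q(name), ID=name.upper() + "1")  # placeholder IDs
    ET.SubElement(mets, q("fileSec"))
    ET.SubElement(mets, q("structMap"))
    ET.SubElement(mets, q("structLink"))
    ET.SubElement(mets, q("behaviorSec"))
    return mets


skeleton = mets_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```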
Some other specialized metadata formats
TEI
EAD [example]
GILS
CSDGM [example]
GEM
CIDOC
Vocabularies
TGM I
TGM II
TGN
GeoNet
AAT
LCSH
LCNAF
DCMI Type
MIME Types
Other considerations
Standard formatting
Repeatability of elements
Describing original vs. digitized item
Relationships between records
Interoperability
Crosswalks
For transforming between metadata formats
Usually refers to transforming between content standards rather than structure standards, but not always
Good practice to create and store most robust metadata format possible, then create other views for specific needs
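At its simplest, a crosswalk is a lookup table from one format's fields to another's. The sketch below maps a made-up local field set to unqualified Dublin Core element names. The local field names and the sample record are hypothetical, and real crosswalks also have to handle splitting, merging, and reformatting of values.

```python
# Hypothetical mapping from local field names to Dublin Core elements.
CROSSWALK = {
    "work_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "keywords": "subject",
}


def to_dublin_core(local_record, crosswalk=CROSSWALK):
    """Return (dc_record, unmapped_fields) for one local record.

    Values are collected into lists because Dublin Core elements can repeat.
    Fields with no mapping are reported rather than silently dropped.
    """
    dc_record = {}
    unmapped = []
    for field, value in local_record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped.append(field)
            continue
        values = value if isinstance(value, list) else [value]
        dc_record.setdefault(target, []).extend(values)
    return dc_record, unmapped


local = {
    "work_title": "Courthouse square",
    "photographer": "Unknown",
    "keywords": ["Courthouses", "Streets"],
    "shelf_location": "Box 12",
}
dc, leftover = to_dublin_core(local)
print(dc)
print("not mapped:", leftover)
```

Reporting the unmapped fields makes the information lost in the simpler view visible, which is one reason to store the most robust format and derive the others from it.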
The bottom line
Many concepts from a century and a half of library cataloging inform good metadata practices
But some re-examination needed for new environment
Need for automation
Need for smart people contributing!
More information
Individual schema documentation
Caplan: Metadata Fundamentals for All Librarians, 2003
NISO Press: Understanding Metadata
IFLA Functional Requirements for Bibliographic Records
Thank you!
[email protected]
These presentation slides:
<http://www.dlib.indiana.edu/~jenlrile/presentations/slis/04fall/l597walsh/L597.ppt>