Digitization of Visual Resources Jenn Riley Metadata Librarian IU Digital Library Program.

Digitization of Visual Resources

Jenn Riley

Metadata Librarian

IU Digital Library Program

October 21, 2004 L597: Humanities Computing

Technical overview

Analog to digital conversion

Resolution

Bit depth

Color representation

Reflectivity and polarity

Compression

Analog to digital conversion

Image is converted to a series of pixels laid out in a grid

Each pixel has a specific color, represented by a sequence of 1s and 0s

Pixel-based images are called “raster” images or “bitmaps”
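
As a minimal sketch of the idea on this slide, a raster image can be modeled as a grid of pixel values. The 4x4 bitmap and the helper names `render` and `to_bits` are invented for illustration and do not come from the slides.

```python
# A tiny 1-bit raster image: each row is a list of pixels (1 = black, 0 = white).
# The pixel values are invented purely for illustration.
bitmap = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]


def render(grid):
    """Draw the grid as text, one character per pixel."""
    return "\n".join("".join("#" if px else "." for px in row) for row in grid)


def to_bits(grid):
    """Flatten the grid row by row into the sequence of 1s and 0s that would be stored."""
    return "".join(str(px) for row in grid for px in row)


print(render(bitmap))
print(to_bits(bitmap))  # 0110100111111001
```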

Resolution (1)

Often referred to as “dpi” or “ppi”

RATIO of number of pixels captured per inch of original photo size

8x10 print scanned at 300ppi = 2400 x 3000 pixels

35mm slide (24x36mm!) scanned at 300ppi ≈ 283 x 425 pixels
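
The arithmetic behind these figures can be sketched as follows. The function names are hypothetical; the only facts assumed are that pixel count is size in inches times ppi, and that one inch is 25.4 mm.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm):
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def pixel_dimensions(width_in, height_in, ppi):
    """Pixel dimensions produced by scanning an original of the given size (in inches) at ppi."""
    return round(width_in * ppi), round(height_in * ppi)


# 8x10 inch print at 300 ppi
print(pixel_dimensions(8, 10, 300))  # (2400, 3000)

# 35mm slide frame (24x36 mm) at 300 ppi
print(pixel_dimensions(mm_to_inches(24), mm_to_inches(36), 300))
```

The same ppi setting gives far fewer pixels for a small original, which is why slides and negatives need much higher scanning resolutions than prints.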

Resolution (2)

“Spatial resolution” refers to pixel dimensions of image, e.g., 3000 x 2400 pixels

Flatbed and film scanners have a fixed focus, so they know how big the original is; digital cameras don’t

Bit depth (1)

Refers to number of bits (binary digits, places for zeroes and ones) devoted to storing color information about each pixel

1 bit (1) = 2^1 = 2 shades (“bitonal”)

2 bit (01) = 2^2 = 4 shades

4 bit (0010) = 2^4 = 16 shades

8 bit (11010001) = 2^8 = 256 shades (“grayscale”)
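
A short sketch of the relationship on this slide: the number of shades is 2 raised to the bit depth. The function name `shades` is made up for illustration.

```python
def shades(bit_depth):
    """Number of distinct values a pixel can take at the given bit depth."""
    return 2 ** bit_depth


for bits in (1, 2, 4, 8):
    print(f"{bits}-bit: {shades(bits)} shades")

# The 8-bit pattern from the slide, read as a binary number, is one of the 256 gray levels.
print(int("11010001", 2))  # 209
```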

Bit depth (2)

1 bit (black & white)

2 bit (4 colors)

4 bit (16 colors)

8 bit (256 colors)

Color representation

RGB

Scanners generally have sensors for Red, Green, and Blue

Each of these “channels” is stored separately in the digital file

8 bits for each of 3 channels = 24 bit color

CMYK (Cyan, Magenta, Yellow and Black) is used for high-end “pre-press” printing purposes
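
To illustrate how three 8-bit channels add up to 24 bits per pixel, here is a small sketch that packs one pixel into a single 24-bit integer. Packing into one integer is only one possible layout and is an assumption here; real file formats arrange channel data in various ways. The function names are hypothetical.

```python
def pack_rgb(red, green, blue):
    """Combine three 8-bit channel values (0-255) into one 24-bit integer."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must fit in 8 bits (0-255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel):
    """Split a 24-bit integer back into its red, green, and blue channels."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


orange = pack_rgb(255, 165, 0)
print(f"{orange:024b}")    # 111111111010010100000000
print(unpack_rgb(orange))  # (255, 165, 0)
print(2 ** 24)             # 16777216 possible colors
```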

Reflectivity and polarity

Reflective, positive: paper, photographic prints

Transmissive, positive: slide film

Transmissive, negative: negative film

Compression

Makes files smaller for storage

Files must be decompressed for viewing

Lossless

Lossy: “visually lossless”
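
A minimal sketch of lossless compression, using Python's standard `zlib` module as a stand-in for whatever scheme an image format actually uses. The "image" is invented data: 100 rows of 300 identical white pixels.

```python
import zlib

# Fake image data: 100 rows of 300 white pixels (one byte each), invented for illustration.
raw = bytes([255]) * (300 * 100)

compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)

print(len(raw), "bytes uncompressed")
print(len(compressed), "bytes compressed")
print(restored == raw)  # True: lossless compression returns every byte unchanged
```

A lossy scheme would not pass the final equality check, because it discards some information to get smaller files.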

Technical questions?

Analog to digital conversion

Resolution

Bit depth

Color representation

Reflectivity and polarity

Compression

Setting specifications

Standards & best practices

Capture once, use many

Determine purpose

Resolution

Bit depth & color

File formats

Quality control

Standards & best practices

NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access

Moving Theory into Practice: book and online tutorial

California Digital Library Digital Image Format Standards

Western States Digital Imaging Best Practices

Capture once, use many

Create master image when scanning: capture all “important” information; meets all foreseeable needs; for long-term storage and later use

Create derivatives for specific uses later: Web delivery; printing; publication
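
The "capture once, use many" workflow can be sketched as one master size from which each derivative size is computed. The derivative names and pixel targets below are hypothetical, chosen only to illustrate the idea.

```python
def derivative_size(master_width, master_height, long_side):
    """Scale master dimensions so the longer side equals long_side, keeping the aspect ratio."""
    scale = long_side / max(master_width, master_height)
    return round(master_width * scale), round(master_height * scale)


master = (2400, 3000)  # e.g. an 8x10 print scanned at 300 ppi

# Hypothetical derivative targets; the names and sizes are illustrative only.
targets = {"thumbnail": 200, "web": 1000, "print": 3000}

for name, long_side in targets.items():
    print(name, derivative_size(*master, long_side))
```

Derivatives are always computed from the master, never from another derivative, so the master has to be at least as large as the biggest foreseeable use.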

Determine purpose

Nature of materials: artifactual, or for content only

Capture all important information. But what’s “important”? Not always “what people can see”

Resolution (1)

Higher is not always better

Scan at highest resolution necessary to achieve your stated purpose, no higher

chart from Cornell’s online digital imaging tutorial: <http://www.library.cornell.edu/preservation/tutorial/conversion/conversion-03.html>

Resolution (2)

Color photographic materials: 3000-6000 pixels on the long side; 24-bit RGB

B/W photographic materials: 3000-6000 pixels on the long side; 8-bit grayscale (unless sepia tone is “important”)
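
As a rough sketch of what these targets mean in practice, the following computes the scanning resolution needed to reach a pixel target on the long side, and the size of the resulting uncompressed pixel data. The function names are hypothetical, and file headers and metadata are ignored.

```python
MM_PER_INCH = 25.4


def required_ppi(long_side_inches, target_pixels):
    """Scanning resolution needed for the long side of the original to yield target_pixels."""
    return target_pixels / long_side_inches


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Size of the raw pixel data, ignoring file headers and metadata."""
    return width_px * height_px * bits_per_pixel // 8


# 8x10 inch print, 3000 pixels on the long side
print(required_ppi(10, 3000))                       # 300.0

# 35mm slide (36 mm long side), 3000 pixels on the long side
print(round(required_ppi(36 / MM_PER_INCH, 3000)))  # 2117

# 3000 x 2400 pixels in 24-bit RGB versus 8-bit grayscale
print(uncompressed_bytes(3000, 2400, 24))           # 21600000
print(uncompressed_bytes(3000, 2400, 8))            # 7200000
```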

Resolution comparison (1)

Resolution comparison (2)

600 dpi 300 dpi

Bit depth & color

Match current photo or match original scene

Final master images should be 8 bits per channel (8-bit grayscale, 24-bit RGB); some specialized projects using higher bit depths

Any color adjustments & other processing should be done in scanning software before final scan is done

Use almost the full tonal range; avoid “clipping”
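
One way to check for clipping, sketched under the assumption of 8-bit values, is to measure how many pixels sit at the very ends of the tonal range. The sample data and the function name `clipped_fraction` are invented for illustration; what fraction counts as acceptable is a project decision.

```python
def clipped_fraction(pixels, low=0, high=255):
    """Fraction of 8-bit pixel values sitting at the extreme ends of the tonal range."""
    if not pixels:
        return 0.0
    clipped = sum(1 for p in pixels if p <= low or p >= high)
    return clipped / len(pixels)


# Invented sample data: a well-exposed scan and one with blown-out highlights.
good_scan = [12, 40, 97, 128, 180, 221, 243, 250]
blown_scan = [30, 90, 200, 255, 255, 255, 255, 255]

print(clipped_fraction(good_scan))   # 0.0
print(clipped_fraction(blown_scan))  # 0.625
```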

Master file formats

TIFF (uncompressed): virtually unanimously recommended by digital imaging best practices; “de facto” standard

JPEG2000: ISO/IEC IS 15444-1 | ITU-T T.800; not patent-free; up-and-coming but not quite there yet; supports embedded metadata; uses wavelet-based compression

Why not JPEG?

JPEG files are lossy-compressed every time they are saved

low compression, high quality

high compression, low quality

Delivery file formats

Photographic materials: JPEG

Text, line drawings: GIF

PNG?

Quality control

Essential part of every digitization project

Objective criteria: can be automated; can check all items

Subjective criteria: require human checks; must sample
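
The split between objective and subjective checks might look like the sketch below. The specification values, record fields, and function name are all hypothetical; a real project would read these properties from the image files themselves.

```python
import random

# Hypothetical spec and scan records; the field names are illustrative only.
SPEC = {"min_long_side": 3000, "bit_depth": 24, "format": "TIFF"}

scans = [
    {"id": "img001", "width": 3000, "height": 2400, "bit_depth": 24, "format": "TIFF"},
    {"id": "img002", "width": 1500, "height": 1200, "bit_depth": 24, "format": "TIFF"},
    {"id": "img003", "width": 3600, "height": 2400, "bit_depth": 8, "format": "TIFF"},
]


def objective_failures(scan, spec):
    """Return a list of objective spec violations for one scan."""
    problems = []
    if max(scan["width"], scan["height"]) < spec["min_long_side"]:
        problems.append("long side too short")
    if scan["bit_depth"] != spec["bit_depth"]:
        problems.append("wrong bit depth")
    if scan["format"] != spec["format"]:
        problems.append("wrong file format")
    return problems


# Objective criteria: automated, so every item can be checked.
report = {scan["id"]: objective_failures(scan, SPEC) for scan in scans}
print(report)

# Subjective criteria: pick a random sample for a person to inspect.
sample = random.sample([scan["id"] for scan in scans], k=2)
print(sample)
```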

Specifications questions?

Standards & best practices

Capture once, use many

Determine purpose

Resolution

Bit depth & color

File formats

Quality control

More information

NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access

Moving Theory into Practice: book and online tutorial

IU DLP Use of Digital Imaging Standards & Best Practices

Visual Resources Metadata in Libraries

Jenn Riley

Metadata Librarian

IU Digital Library Program

What is metadata?

“Data about data”

A better definition

Other characteristics: structure; control

Origin: machine-generated; human-generated

In practice, often grows to cover data and meta-metadata

Levels of control

Data structure standards

Data content standards

Creating metadata

HTML <meta> tags

Spreadsheets

Databases

XML

Digital library content management systems: ContentDM, Greenstone
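
As a small sketch of the first option, descriptive fields can be rendered as HTML `<meta>` tags. The record values are invented, and the `DC.` prefix follows a common convention for Dublin Core in HTML that the slide itself does not specify.

```python
from html import escape

# Hypothetical record; the values are invented for illustration.
record = {
    "DC.title": "Courthouse square, looking north",
    "DC.creator": "Unknown photographer",
    "DC.date": "1925",
}


def meta_tags(fields):
    """Render each name/value pair as an HTML <meta> tag, escaping special characters."""
    return "\n".join(
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in fields.items()
    )


print(meta_tags(record))
```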

Types of metadata

Descriptive metadata

Administrative metadata: technical metadata; preservation metadata; rights metadata

Structural metadata

Purposes of descriptive metadata

Description

Discovery

General descriptive metadata standards

Dublin Core [example]: Unqualified; Qualified

MARC/AACR2 [example]

MARCXML

MODS [example]
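
For a sense of what an unqualified Dublin Core description looks like in XML, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The photograph described is invented, and the `record` wrapper element is an arbitrary choice, not part of Dublin Core.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical description of a photograph; the values are invented.
fields = [
    ("title", "Courthouse square, looking north"),
    ("creator", "Unknown photographer"),
    ("date", "1925"),
    ("type", "StillImage"),
    ("subject", "Courthouses"),
    ("subject", "Streets"),  # elements may repeat
]

# "record" is an arbitrary wrapper element, not part of Dublin Core itself.
record = ET.Element("record")
for name, value in fields:
    ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value

print(ET.tostring(record, encoding="unicode"))
```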

Visual resources metadata standards

CDWA [example]

VRA Core [example]

CCO

Technical metadata

For recording technical aspects of digital objects

Of use for long-term maintenance of data

Standards for still images

NISO Z39.87: Data Dictionary – Technical Metadata for Digital Still Images

MIX

Structural metadata

For creating a logical structure between digital objects: multiple copies of same bibliographic item; multiple pages within item; grouping of pages into sections; multiple sizes of each page

METS is the current primary schema

METS

metsHdr

dmdSec

amdSec: techMD, rightsMD, sourceMD, digiprovMD

fileSec

structMap

structLink

behaviorSec
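
The sections named on this slide nest roughly as in the sketch below, built with Python's standard `xml.etree.ElementTree`. This is only an empty skeleton to show the layout: the ID values and the `TYPE` value are invented, the actual metadata and file entries are omitted, and the result is not claimed to validate against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def el(parent, tag, **attrs):
    """Create a METS-namespaced child element (or a root element if parent is None)."""
    name = f"{{{METS_NS}}}{tag}"
    return ET.Element(name, attrs) if parent is None else ET.SubElement(parent, name, attrs)


mets = el(None, "mets")
el(mets, "metsHdr")                  # header: information about the METS document itself
el(mets, "dmdSec", ID="dmd1")        # descriptive metadata
amd = el(mets, "amdSec", ID="amd1")  # administrative metadata
for section in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
    el(amd, section, ID=f"{section}1")
el(mets, "fileSec")                  # inventory of files
struct = el(mets, "structMap")       # how the pieces fit together
el(struct, "div", TYPE="photograph")
el(mets, "structLink")
el(mets, "behaviorSec")

print([child.tag.split("}")[1] for child in mets])
print(ET.tostring(mets, encoding="unicode"))
```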

Some other specialized metadata formats

TEI

EAD [example]

GILS

CSDGM [example]

GEM

CIDOC

Vocabularies

TGM I

TGM II

TGN

GeoNet

AAT

LCSH

LCNAF

DCMI Type

MIME Types

Other considerations

Standard formatting

Repeatability of elements

Describing original vs. digitized item

Relationships between records

Interoperability

Crosswalks

For transforming between metadata formats

Usually refers to transforming between content standards rather than structure standards, but not always

Good practice to create and store most robust metadata format possible, then create other views for specific needs
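
A crosswalk can be sketched as a simple field mapping from a richer stored record to a simpler view. The source field names, the mapping, and the sample record below are all hypothetical; published crosswalks between real schemas are considerably more detailed.

```python
# Simplified, hypothetical crosswalk from a richer local record to unqualified Dublin Core.
# The source field names are invented for illustration.
CROSSWALK = {
    "title": "title",
    "photographer": "creator",
    "date_created": "date",
    "dimensions": "format",
    "scanner_operator": None,  # no Dublin Core equivalent here, so it is dropped
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Map a source record onto Dublin Core element names, dropping unmapped fields."""
    output = {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is not None:
            output.setdefault(target, []).append(value)
    return output


master_record = {
    "title": "Courthouse square, looking north",
    "photographer": "Unknown photographer",
    "date_created": "1925",
    "dimensions": "8 x 10 in.",
    "scanner_operator": "Student assistant",
}

print(to_dublin_core(master_record))
```

The mapping loses information in one direction only, which is why the richest format is the one worth storing.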

The bottom line

Many concepts from a century and a half of library cataloging inform good metadata practices

But some re-examination needed for new environment

Need for automation

Need for smart people contributing!

More information

Individual schema documentation

Caplan: Metadata Fundamentals for All Librarians, 2003

NISO Press: Understanding Metadata

IFLA Functional Requirements for Bibliographic Records

Thank you!

[email protected]

These presentation slides:

<http://www.dlib.indiana.edu/~jenlrile/presentations/slis/04fall/l597walsh/L597.ppt>