2002.09.10 - SLIDE 1 IS 202 - Fall 2002 Lecture 05: Metadata: Introduction Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 am Fall 2002 SIMS 202: Information Organization and Retrieval

2002.09.10 - SLIDE 1 IS 202 - Fall 2002

Lecture 05: Metadata: Introduction

Prof. Ray Larson & Prof. Marc Davis

UC Berkeley SIMS

Tuesday and Thursday 10:30 am - 12:00 am

Fall 2002

SIMS 202: Information Organization and Retrieval

2002.09.10 - SLIDE 2 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 3 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 4 IS 202 - Fall 2002

Categorization

• Classical categorization
  – Necessary and sufficient conditions for membership
  – Generic-to-specific monohierarchical structure

• Modern categorization
  – Characteristic features (family resemblances)
  – Centrality/typicality (prototypes)
  – Basic-level categories

2002.09.10 - SLIDE 5 IS 202 - Fall 2002

Properties of Categorization

• Family Resemblance
  – Members of a category may be related to one another without all members having any property in common

• Prototypes
  – Some members of a category may be “better examples” than others, i.e., “prototypical” members

2002.09.10 - SLIDE 6 IS 202 - Fall 2002

Furnas: The Vocabulary Problem

• People use different words to describe the same things
  – “If one person assigns the name of an item, other untutored people will fail to access it on 80 to 90 percent of their attempts.”
  – “Simply stated, the data tell us there is no one good access term for most objects.”

2002.09.10 - SLIDE 7 IS 202 - Fall 2002

Vocabulary Problem Solutions?

• Furnas et al.
  – Make the user memorize precise system meanings
  – Have the user and system interact to identify the precise referent

• Minsky and Lenat
  – Give the system “commonsense” so it can understand what the user’s words can mean

2002.09.10 - SLIDE 8 IS 202 - Fall 2002

Calling Things Names

• Impromptu Study by Nathan Good

– Asked people to identify 3 common objects

– Although the objects were fairly common, people came up with widely different names for them

– Found 14 people from four different contexts (Soda hall, my home, HP labs, bus stop)

2002.09.10 - SLIDE 9 IS 202 - Fall 2002

Results

• 2 - Vanilla Coke Bottle
• 1 - Vanilla Coke Can
• 1 - Coke
• 1 - Empty bottle
• 2 - Bottle
• 2 - Coke bottle
• 1 - Bottle of coke
• 2 - Plastic bottle
• 1 - Empty vanilla coke bottle
• 1 - 20 oz coke bottle

• 7 - Pen
• 1 - A horizontal line
• 1 - Blue Ball point Pen
• 1 - Ink Pen
• 1 - Pencil
• 1 - Pentel Pen
• 1 - Transparent Pen
• 1 - Pentel pen with blue rubber grip

2002.09.10 - SLIDE 10 IS 202 - Fall 2002

Results

• 9 - Notebook
• 1 - A black object
• 1 - Black Media Star Notebook
• 1 - Black Notebook
• 1 - Binder
• 1 - Spiral notebook

[Chart: number of people (axis 0 to 9) who answered “notebook”]
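The tallies on the two Results slides can be turned into a rough agreement measure: the chance that two different participants, picked at random, gave the same name for an object. This is a minimal sketch, not part of the original study. The grouping of answers by object (bottle, pen, notebook) is inferred from the answers themselves, and the function name `agreement_probability` is hypothetical.

```python
def agreement_probability(counts):
    """Chance that two distinct participants, drawn at random, gave the same name."""
    n = sum(counts)
    same_pairs = sum(c * (c - 1) for c in counts)  # ordered pairs that agree
    return same_pairs / (n * (n - 1))              # out of all ordered pairs


# Tallies copied from the Results slides (14 participants per object).
bottle = [2, 1, 1, 1, 2, 2, 1, 2, 1, 1]
pen = [7, 1, 1, 1, 1, 1, 1, 1]
notebook = [9, 1, 1, 1, 1, 1]

for label, counts in [("bottle", bottle), ("pen", pen), ("notebook", notebook)]:
    print(f"{label}: {agreement_probability(counts):.2f}")
# bottle: 0.04
# pen: 0.23
# notebook: 0.40
```

Even for the notebook, where nine of fourteen people agreed, two random participants match well under half the time, which is in line with the Furnas quotation.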

2002.09.10 - SLIDE 11 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 12 IS 202 - Fall 2002

Organization of Information

• Is there a basic human need to put things into some sort of order?
  – Much of natural language concerns categories of things rather than individual things
  – Why do we organize things and information?
    • Why do spoons go in THAT drawer in the kitchen and not in a can in the garage?
    • Why do your favorite books go on one shelf and not-so-favorite on another?

2002.09.10 - SLIDE 13 IS 202 - Fall 2002

Why Organize Information?

• The main reason
  – So that you can find things more effectively

• I.e., effective retrieval is predicated on some sort of organization applied to information resources

• Historically there have been many institutions and tools devoted to information organization
  – Libraries
  – Museums
  – Archives
  – Indexes and catalogs, dictionaries, phone books, etc.

2002.09.10 - SLIDE 14 IS 202 - Fall 2002

Why Organize Information?

• A question of scale:
  – Using your own ad hoc set of categories and methods to organize your own collection of books seems to work fine…
  – What if your collection grew to
    • 10 times the size? How would you organize it?
    • 100 times?
    • 1000 times?
    • 100000 times?

2002.09.10 - SLIDE 15 IS 202 - Fall 2002

What is Information Organization?

• Identifying the existence of all types of information-bearing entities as they are made available

• Identifying the works contained within those information-bearing entities or as parts of them

• Systematically pulling together these information-bearing entities into collections in libraries, archives, museums, Internet communications files and other such depositories

From Hagler via Taylor, Chap. 1

2002.09.10 - SLIDE 16 IS 202 - Fall 2002

What is Information Organization?

• Producing lists of these information-bearing entities prepared according to standard rules for citation

• Providing name, title, subject and other useful access to these information-bearing entities

• Providing the means of locating each information-bearing entity or a copy of it

2002.09.10 - SLIDE 17 IS 202 - Fall 2002

Organizing Information

• Libraries

• Archives

• Museums and galleries

• Internet

• Corporate and office environments

2002.09.10 - SLIDE 18 IS 202 - Fall 2002

Key Issues in This Course

• How to describe information resources or information-bearing objects in ways so that they may be effectively used by those who need to use them
  – Organizing

• How to find the appropriate information resources or information-bearing objects for someone’s (or your own) needs
  – Retrieving

2002.09.10 - SLIDE 19 IS 202 - Fall 2002

Key Issues

[Diagram labels]
• Stages: Creation, Utilization, Searching, Active, Semi-Active, Inactive, Retention/Mining, Disposition, Discard
• Activities: Using/Creating, Authoring/Modifying, Organizing/Indexing, Storing/Retrieval, Distribution/Networking, Accessing/Filtering

2002.09.10 - SLIDE 20 IS 202 - Fall 2002

Organizing/Indexing

• Collecting and integrating information

• Affects data, information and metadata

• “Metadata” describes data and information
  – More on this later

• Organizing information
  – Types of organization?

• Indexing

2002.09.10 - SLIDE 21 IS 202 - Fall 2002

Accessing/Filtering

• Using the organization created in the O/I stage to:
  – Select desired (or relevant) information
  – Locate that information
  – Retrieve the information from its storage location (often via a network)

2002.09.10 - SLIDE 22 IS 202 - Fall 2002

Structure of an IR System

[Diagram: Information Storage and Retrieval System]
• Inputs: Interest profiles & Queries; Documents & data
• Rules of the game = Rules for subject indexing + Thesaurus (which consists of Lead-In Vocabulary and Indexing Language)
• Query side: Formulating query in terms of descriptors; Storage of profiles; Store 1: Profiles/Search requests
• Document side: Indexing (Descriptive and Subject); Storage of Documents; Store 2: Document representations
• Storage Line
• Comparison/Matching
• Output: Potentially Relevant Documents
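One way to read the diagram is as a small pipeline: users' words pass through a lead-in vocabulary into descriptors, and those descriptors are compared with stored document representations. The sketch below is only an illustration under stated assumptions. The vocabulary entries, document ids, and the overlap-count ranking are all hypothetical and are not taken from the lecture.

```python
# Hypothetical lead-in vocabulary: maps users' words to preferred descriptors.
LEAD_IN = {
    "cars": "automobiles",
    "autos": "automobiles",
    "automobiles": "automobiles",
    "films": "motion pictures",
    "movies": "motion pictures",
}

# Hypothetical "Store 2": descriptors assigned to each document at indexing time.
DOCUMENTS = {
    "doc1": {"automobiles", "safety"},
    "doc2": {"motion pictures", "history"},
    "doc3": {"automobiles", "history"},
}


def formulate_query(words):
    """Translate free-text words into descriptors; unknown words are dropped."""
    return {LEAD_IN[w] for w in words if w in LEAD_IN}


def match(query_descriptors, documents):
    """Return ids of documents sharing a descriptor with the query, best overlap first."""
    scored = [(len(query_descriptors & descs), doc_id)
              for doc_id, descs in documents.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


print(match(formulate_query(["cars"]), DOCUMENTS))  # ['doc1', 'doc3']
```

The result is a list of potentially relevant documents, not a guarantee of relevance: matching happens between representations, not between the user's need and the documents themselves.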

2002.09.10 - SLIDE 23 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 24 IS 202 - Fall 2002

Metadata

• Metadata is
  – “Data about Data” (database systems)
  – Information about Information

• First used (to the best we can discover) in 1978 (meta-data)

• Used for databases in (Meta-Data Base)
  – “a data base which itself contains the structural and semantic data of other data bases”
    » Thomas R. Cousins & Wayne D. Dominick, “The Management of Data Bases of Data Bases” ASIS Proceedings, 1978.

2002.09.10 - SLIDE 25 IS 202 - Fall 2002

Metadata

• Structures and languages for the description of information resources and their elements (components or features)

• “Metadata is information on the organization of the data, the various data domains, and the relationship between them” (Baeza-Yates p. 142)

2002.09.10 - SLIDE 26 IS 202 - Fall 2002

Metadata

• Often two main types of metadata are distinguished:
  – Descriptive metadata
    • Describes the information/data object and its properties
    • May use a variety of descriptive formats and rules
  – Topical metadata
    • Describes the topic or “aboutness” of an information/data object
    • May include a variety of vocabularies for describing subjects, topics, categories, etc.

2002.09.10 - SLIDE 27 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 28 IS 202 - Fall 2002

Types of Metadata

• Element names

• Element description

• Element representation

• Element coding

• Element semantics

• Element classification

2002.09.10 - SLIDE 29 IS 202 - Fall 2002

How Can You Describe an Information-Bearing Object?

2002.09.10 - SLIDE 30 IS 202 - Fall 2002

Goals of Descriptive Cataloging

• To enable a person to find a document of which
  – The author, or
  – The title, or
  – The subject is known

• To show what a library has
  – By a given author
  – On a given subject (and related subjects)
  – In a given kind (or form) of literature

• To assist in the choice of a document
  – As to its edition (bibliographically)
  – As to its character (literary or topical)

Charles A. Cutter, 1876

2002.09.10 - SLIDE 31 IS 202 - Fall 2002

Rules for Descriptive Cataloging

• ISBD

• AACR

• AACR II

2002.09.10 - SLIDE 32 IS 202 - Fall 2002

AACR II

• Sources of Information

• ISBD areas

• Choice of Access Points

2002.09.10 - SLIDE 33 IS 202 - Fall 2002

Sources of Information

• Each different type of material has a preferred location for deriving information about it
  – Books and printed material
    • Title page
  – Cartographic materials (maps, globes, etc.)
    • The map itself, or containers, stands, etc.
  – Sound recordings
    • Disc label, cassette label, etc.

2002.09.10 - SLIDE 34 IS 202 - Fall 2002

ISBD Areas

• Title and statement of responsibility

• Edition

• Material or type of publication specification

• Publication, distribution (etc.)

• Physical description

• Series

• Notes

• Standard numbers

2002.09.10 - SLIDE 35 IS 202 - Fall 2002

ISBD Punctuation

• Title Proper (GMD) = Parallel title : other title info / First statement of responsibility ; others. -- Edition information. -- Material. -- Place of Publication : Publisher Name, Date. -- Material designation and extent ; Dimensions of item. -- (Title of Series / Statement of responsibility). -- Notes. -- Standard numbers: terms of availability (qualifications).

2002.09.10 - SLIDE 36 IS 202 - Fall 2002

Bibliographic Record

• Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992. -- (Library science text series).
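The record above follows the ISBD punctuation pattern area by area. A minimal sketch of that assembly is shown here, covering only the areas this one record uses (title and responsibility, edition, publication, series). The function name `isbd_description` and its parameters are hypothetical, and the sketch is not a full implementation of ISBD.

```python
def isbd_description(title, responsibility, edition=None, edition_responsibility=None,
                     place=None, publisher=None, date=None, series=None):
    """Join the supplied areas with the ISBD area separator '. -- '."""
    areas = [f"{title} / {responsibility}"]
    if edition:
        area = edition
        if edition_responsibility:
            area += f" / {edition_responsibility}"
        areas.append(area)
    if place and publisher and date:
        areas.append(f"{place} : {publisher}, {date}")
    if series:
        areas.append(f"({series})")
    # Strip a trailing full stop from each area so the separator does not double it.
    return ". -- ".join(area.rstrip(".") for area in areas) + "."


print(isbd_description(
    title="Introduction to cataloging and classification",
    responsibility="Bohdan S. Wynar",
    edition="8th ed.",
    edition_responsibility="Arlene G. Taylor",
    place="Englewood, Colo.",
    publisher="Libraries Unlimited",
    date="1992",
    series="Library science text series",
))
```

With these inputs the output reproduces the record on the slide character for character.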

2002.09.10 - SLIDE 37 IS 202 - Fall 2002

Choice of Access Points

• Title(s) (Always main title)

• Main Entry??

• Added Entries

• Series Titles

• Identifying Numbers
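Access points amount to index keys that lead back to a record. The sketch below builds such an index for the Wynar record used elsewhere in this lecture. It is an illustration only: the field names are hypothetical, and treating Wynar as the main entry is an assumption (the slide itself leaves "Main Entry??" open).

```python
from collections import defaultdict


def build_access_points(records):
    """Map (kind, normalized value) to the ids of records reachable through it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        index[("title", rec["title"].lower())].add(rec_id)
        for name in [rec["main_entry"], *rec.get("added_entries", [])]:
            index[("name", name.lower())].add(rec_id)
        for series in rec.get("series", []):
            index[("series", series.lower())].add(rec_id)
        for number in rec.get("numbers", []):
            index[("number", number)].add(rec_id)
    return index


records = {
    "rec1": {
        "title": "Introduction to cataloging and classification",
        "main_entry": "Wynar, Bohdan S.",          # assumed main entry
        "added_entries": ["Taylor, Arlene G."],
        "series": ["Library science text series"],
        "numbers": [],  # no standard number appears on the slide, so none is supplied
    }
}

index = build_access_points(records)
print(index[("name", "taylor, arlene g.")])  # {'rec1'}
```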

2002.09.10 - SLIDE 38 IS 202 - Fall 2002

More Metadata Systems

• The following are a sample of metadata systems for a variety of special types of data/documents/objects

2002.09.10 - SLIDE 39 IS 202 - Fall 2002

Metadata Systems and Standards

• Naming and ID systems
• Bibliographic description
  – Texts
• Music
• Images and objects
• Numeric data
• Geospatial data
• Collections
• Video and motion pictures

2002.09.10 - SLIDE 40 IS 202 - Fall 2002

Naming and ID Systems

• URLs (Uniform Resource Locators)
  – URIs (Uniform Resource Identifiers)

• URNs (Uniform Resource Names)

• URCs (Uniform Resource Characteristics)

• Kahn/Wilensky Handles

• SICI (Serial Item and Content Identifiers)

• ISBN

• ISSN

2002.09.10 - SLIDE 41 IS 202 - Fall 2002

Bibliographic Description

• MARC (Machine Readable Cataloging)

• DUBLIN CORE– Warwick Framework for Dublin Core Metadata

• GILS (Government Information Locator Service)

• RFC 1807 (Format for Bibliographic Records)

• RDF (Resource Description Framework)

2002.09.10 - SLIDE 42 IS 202 - Fall 2002

More Bibliographic Descriptors

• TEI Headers (Text Encoding Initiative)

• BibTeX

• PICS (Platform for Internet Content Selection)

• SOIF (Summary Object Interchange Format)

2002.09.10 - SLIDE 43 IS 202 - Fall 2002

Music

• Standard Music Description Language (SMDL)

2002.09.10 - SLIDE 44 IS 202 - Fall 2002

Numeric Data

• ICPSR Data Documentation Initiative (SGML DTD development)

• Standard for Survey Design and Statistical Methodology Metadata (SDSM)

2002.09.10 - SLIDE 45 IS 202 - Fall 2002

Images and Objects

• Categories for the Description of Works of Art (Getty Art Institute)

• Consortium for the Computer Interchange of Museum Information (CIMI)

• RLG REACH Element Set (for Shared Description of Museum Objects)

• VRA Core Categories (Visual Resources Association)

2002.09.10 - SLIDE 46 IS 202 - Fall 2002

Geospatial Data

• Content Standards for Digital Geospatial Metadata

• FGDC (Federal Geographic Data Committee)

• ASTM Section D18.01.05 Draft Specification: Content Specification for Digital Geospatial Metadata (American Society for Testing and Materials (ASTM))

2002.09.10 - SLIDE 47 IS 202 - Fall 2002

Collection Level Descriptors

• EAD (Encoded Archival Description)

• Z39.50 Profile for Access to Digital Collections

• RSLP Collection Description (Research Support Libraries Programme)

2002.09.10 - SLIDE 48 IS 202 - Fall 2002

Video and Motion Pictures: Multimedia

• MPEG-7 (more on this later)

• Video Development Initiative (ViDe) User's Guide: Dublin Core Application Profile for Digital Video

• Data Dictionary for Audio/Video Metadata (Library of Congress Digital Audio-Visual Preservation Prototyping project)

2002.09.10 - SLIDE 49 IS 202 - Fall 2002

Mega-Metadata Standards

• METS - Metadata Encoding and Transmission Standard
  – Developed by the Digital Library Federation as an implementation strategy for preservation metadata
  – “XML document format for encoding metadata necessary for both management of digital library objects within a repository and exchange of such objects between repositories (or between repositories and their users)”
  – Provides a flexible mechanism for encoding descriptive, administrative, and structural metadata for a digital library object, and for expressing the complex links between these various forms of metadata

2002.09.10 - SLIDE 50 IS 202 - Fall 2002

Lecture Contents

• Review
  – Categories
  – The Vocabulary Problem

• Organization of Information

• Metadata

• Kinds of Metadata

• Dublin Core

2002.09.10 - SLIDE 51 IS 202 - Fall 2002

Dublin Core

• Simple metadata for describing internet resources

• For “Document-Like Objects”

• 15 Elements (in base DC)

2002.09.10 - SLIDE 52 IS 202 - Fall 2002

Dublin Core Elements

• Title

• Creator

• Subject

• Description

• Publisher

• Other Contributors

• Date

• Resource Type

• Format

• Resource Identifier

• Source

• Language

• Relation

• Coverage

• Rights Management
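Because the base element set is closed at fifteen, a record can be checked against it mechanically. The sketch below uses the element labels as they are given on the individual element slides of this lecture (TITLE through RIGHTS). The function name `unknown_labels` and the sample record are hypothetical.

```python
# The fifteen element labels as given on the element slides of this lecture.
DC_LABELS = (
    "TITLE", "CREATOR", "SUBJECT", "DESCRIPTION", "PUBLISHER",
    "CONTRIBUTORS", "DATE", "TYPE", "FORMAT", "IDENTIFIER",
    "SOURCE", "LANGUAGE", "RELATION", "COVERAGE", "RIGHTS",
)


def unknown_labels(record):
    """Return the keys of a record that are not Dublin Core labels (case-insensitive)."""
    return sorted(key for key in record if key.upper() not in DC_LABELS)


sample = {"title": "Lecture 05: Metadata: Introduction", "author": "Ray Larson"}
print(unknown_labels(sample))  # ['author']
```

The check flags `author` because the element is labelled CREATOR, a small instance of the vocabulary problem discussed earlier in the lecture.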

2002.09.10 - SLIDE 53 IS 202 - Fall 2002

Title

• Label: TITLE

• The name given to the resource by the CREATOR or PUBLISHER

2002.09.10 - SLIDE 54 IS 202 - Fall 2002

Author or Creator

• Label: CREATOR

• The person(s) or organization(s) primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources.

2002.09.10 - SLIDE 55 IS 202 - Fall 2002

Subject and Keywords

• Label: SUBJECT

• The topic of the resource, or keywords or phrases that describe the subject or content of the resource. The intent of the specification of this element is to promote the use of controlled vocabularies and keywords. This element might well include scheme-qualified classification data (for example, Library of Congress Classification Numbers or Dewey Decimal numbers) or scheme-qualified controlled vocabularies (such as Medical Subject Headings or Art and Architecture Thesaurus descriptors) as well.

2002.09.10 - SLIDE 56 IS 202 - Fall 2002

Description

• Label: DESCRIPTION

• A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. Future metadata collections might well include computational content description (spectral analysis of a visual resource, for example) that may not be embeddable in current network systems. In such a case this field might contain a link to such a description rather than the description itself.

2002.09.10 - SLIDE 57 IS 202 - Fall 2002

Publisher

• Label: PUBLISHER

• The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.

2002.09.10 - SLIDE 58 IS 202 - Fall 2002

Other Contributors

• Label: CONTRIBUTORS

• Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element (for example, editors, transcribers, illustrators, and convenors).

2002.09.10 - SLIDE 59 IS 202 - Fall 2002

Date

• Label: DATE

• The date the resource was made available in its present form. The recommended best practice is an 8 digit number in the form YYYYMMDD as defined by ANSI X3.30-1985. In this scheme, the date element for the day this is written would be 19961203, or December 3, 1996. Many other schema are possible, but if used, they should be identified in an unambiguous manner.
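The YYYYMMDD form maps directly onto a standard date format string. A minimal sketch using Python's standard `datetime` module follows; the helper names `to_dc_date` and `from_dc_date` are hypothetical.

```python
from datetime import date, datetime


def to_dc_date(d):
    """Format a date as the 8 digit YYYYMMDD string recommended for DATE."""
    return d.strftime("%Y%m%d")


def from_dc_date(text):
    """Parse a YYYYMMDD string; raises ValueError if it is not a real date."""
    return datetime.strptime(text, "%Y%m%d").date()


print(to_dc_date(date(1996, 12, 3)))  # 19961203
print(from_dc_date("19961203"))       # 1996-12-03
```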

2002.09.10 - SLIDE 60 IS 202 - Fall 2002

Resource Type

• Label: TYPE

• The category of the resource, such as home page, novel, poem, working paper, preprint, technical report, essay, dictionary. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types. One preliminary set of such types can be found at the following URL (now out of date): http://www.roads.lut.ac.uk/Metadata/DC-ObjectTypes.html

2002.09.10 - SLIDE 61 IS 202 - Fall 2002

Format

• Label: FORMAT

• The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. The intent of specifying this element is to provide information necessary to allow people or machines to make decisions about the usability of the encoded data (what hardware and software might be required to display or execute it, for example). As with RESOURCE TYPE, FORMAT will be assigned from enumerated lists such as registered Internet Media Types (MIME types). In principle, formats can include physical media such as books, serials, or other non-electronic media.
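For electronic resources, a FORMAT value drawn from the Internet Media Types can often be guessed from a file name. The sketch below uses Python's standard `mimetypes` module. The helper name `dc_format`, the file names, and the fallback value for unrecognized extensions are assumptions made for illustration.

```python
import mimetypes


def dc_format(filename):
    """Guess an Internet Media Type for FORMAT from the file extension."""
    media_type, _encoding = mimetypes.guess_type(filename)
    # Fallback for unrecognized extensions is an assumption, not part of Dublin Core.
    return media_type or "application/octet-stream"


print(dc_format("sonnet.html"))  # text/html
print(dc_format("photo.jpeg"))   # image/jpeg
```

A guess based on the extension says nothing about the actual bytes, so a cataloger would still want to confirm the value.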

2002.09.10 - SLIDE 62 IS 202 - Fall 2002

Resource Identifier

• Label: IDENTIFIER

• String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). Other globally-unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names would also be candidates for this element.
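Some identifier schemes carry a built-in consistency check. As an illustration that goes beyond the slide, the 10 digit ISBN ends in a check character chosen so that the weighted sum of its digits (weights 10 down to 1) is divisible by 11, with "X" standing for 10 in the last position. The function name `is_valid_isbn10` is hypothetical.

```python
def is_valid_isbn10(text):
    """Check the mod-11 check character of a 10 digit ISBN (hyphens and spaces ignored)."""
    chars = text.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10                      # 'X' is allowed only as the check character
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - position) * value    # weights run 10, 9, ..., 1
    return total % 11 == 0


print(is_valid_isbn10("0-306-40615-2"))  # True
print(is_valid_isbn10("0-306-40615-3"))  # False
```

Passing the check shows only that the string is well formed, not that a resource with that identifier exists.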

2002.09.10 - SLIDE 63 IS 202 - Fall 2002

Source

• Label: SOURCE

• The work, either print or electronic, from which this resource is derived, if applicable. For example, an html encoding of a Shakespearean sonnet might identify the paper version of the sonnet from which the electronic version was transcribed.

2002.09.10 - SLIDE 64 IS 202 - Fall 2002

Language

• Label: LANGUAGE

• Language(s) of the intellectual content of the resource. Where practical, the content of this field should coincide with the Z39.53 three character codes for written languages. See: http://www.sil.org/sgml/nisoLang3-1994.html

2002.09.10 - SLIDE 65 IS 202 - Fall 2002

Relation

• Label: RELATION

• Relationship to other resources. The intent of specifying this element is to provide a means to express relationships among resources that have formal relationships to others, but exist as discrete resources themselves. For example, images in a document, chapters in a book, or items in a collection. A formal specification of RELATION is currently under development. Users and developers should understand that use of this element should be currently considered experimental.

2002.09.10 - SLIDE 66 IS 202 - Fall 2002

Coverage

• Label: COVERAGE

• The spatial locations and temporal duration characteristic of the resource. Formal specification of COVERAGE is currently under development. Users and developers should understand that use of this element should be currently considered experimental.

2002.09.10 - SLIDE 67 - IS 202 - Fall 2002

Rights Management

• Label: RIGHTS

• The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement, or perhaps a server that would provide such information in a dynamic way. The intent of specifying this field is to allow providers a means to associate terms and conditions or copyright statements with a resource or collection of resources. No assumptions should be made by users if such a field is empty or not present.

2002.09.10 - SLIDE 68 - IS 202 - Fall 2002

The Same Item in Different Metadata Systems

• ISBD

• Dublin Core

• RFC 1807

• TEI Header

• MARC Record

2002.09.10 - SLIDE 69 - IS 202 - Fall 2002

ISBD Punctuation

• Title Proper (GMD) = Parallel title : other title info / First statement of responsibility ; others. -- Edition information. -- Material. -- Place of Publication : Publisher Name, Date. -- Material designation and extent ; Dimensions of item. -- (Title of Series / Statement of responsibility). -- Notes. -- Standard numbers: terms of availability (qualifications).

2002.09.10 - SLIDE 70 - IS 202 - Fall 2002

Bibliographic Record

• Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992. -- (Library science text series).
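The ISBD punctuation pattern can be sketched in a few lines of Python. This is a minimal illustration, not a full ISBD implementation: the helper names (`statement`, `publication_area`, `isbd_record`) are hypothetical, and only the separators needed for the record above (" / ", " : ", ", " and ". -- ") are handled.

```python
def statement(text, responsibility=None):
    """Join a title or edition statement to its statement of responsibility."""
    if responsibility:
        return f"{text} / {responsibility}"
    return text


def publication_area(place, publisher, date):
    """Place of publication : Publisher name, Date."""
    return f"{place} : {publisher}, {date}"


def isbd_record(areas):
    """Separate the non-empty areas with '. -- ' and close with a full stop."""
    return ". -- ".join(area for area in areas if area) + "."


# Hypothetical assembly of the bibliographic record shown above.
record = isbd_record([
    statement("Introduction to cataloging and classification", "Bohdan S. Wynar"),
    statement("8th ed.", "Arlene G. Taylor"),
    publication_area("Englewood, Colo.", "Libraries Unlimited", "1992"),
    "(Library science text series)",
])
print(record)
```

Areas that are absent (for example, no parallel title or notes) are simply skipped, which is why the short record above has only four areas.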

2002.09.10 - SLIDE 71 - IS 202 - Fall 2002

Dublin Core

• TITLE: Introduction to cataloging and classification
• CREATOR: Taylor, Arlene G.
• OTHER CONTRIBUTOR: Wynar, Bohdan S.
• DATE: 1992
• FORMAT: BOOK
• LANGUAGE: ENG
• PAGES: 633
• PUBLISHER: Libraries Unlimited
• SUBJECT: Cataloging.
• SUBJECT: subject cataloging.
• SUBJECT: Classification -- Books
• DESCRIPTION: Textbook on cataloging and classification
• RESOURCE TYPE: text.monograph
• RESOURCE IDENTIFIER: (ISBN) 0872879674
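One common way to carry a record like this with a Web page is as HTML `<meta>` tags whose names start with `DC.`. The sketch below assumes that convention and covers only a subset of the elements above; the `to_meta_tags` helper and the lowercase element names are illustrative choices, not something the slides prescribe. Elements are kept as a list of pairs because Dublin Core elements such as SUBJECT may repeat.

```python
from html import escape

# A subset of the record above as (element, value) pairs.
# A list of pairs, rather than a dict, keeps repeated elements such as subject.
record = [
    ("title", "Introduction to cataloging and classification"),
    ("creator", "Taylor, Arlene G."),
    ("date", "1992"),
    ("language", "ENG"),
    ("publisher", "Libraries Unlimited"),
    ("subject", "Cataloging."),
    ("subject", "subject cataloging."),
    ("subject", "Classification -- Books"),
]


def to_meta_tags(pairs):
    """Render each (element, value) pair as an HTML meta tag named DC.<element>."""
    return [
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    ]


meta_tags = to_meta_tags(record)
print("\n".join(meta_tags))
```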

2002.09.10 - SLIDE 72 - IS 202 - Fall 2002

RFC 1807

• BIB-VERSION:: CS-TR-v2.1
• ID:: UCB//123456
• ENTRY:: September 9, 1997
• TYPE:: BOOK
• TITLE:: Introduction to cataloging and classification
• AUTHOR:: Wynar, Bohdan S.
• AUTHOR:: Taylor, Arlene G.
• DATE:: 1992
• PAGES:: 633
• COPYRIGHT:: Libraries Unlimited, 1992
• SERIES:: Library Science Text Series
• END:: UCB//123456
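Because each line of the record has the shape `FIELD:: value`, it can be read with very little code. The following is a rough sketch only: `parse_rfc1807` is a hypothetical helper, it assumes one field per line, and it ignores the continuation lines that a full RFC 1807 parser would need to handle. Values are collected in lists because fields such as AUTHOR repeat.

```python
RECORD_TEXT = """\
BIB-VERSION:: CS-TR-v2.1
ID:: UCB//123456
ENTRY:: September 9, 1997
TYPE:: BOOK
TITLE:: Introduction to cataloging and classification
AUTHOR:: Wynar, Bohdan S.
AUTHOR:: Taylor, Arlene G.
DATE:: 1992
PAGES:: 633
COPYRIGHT:: Libraries Unlimited, 1992
SERIES:: Library Science Text Series
END:: UCB//123456
"""


def parse_rfc1807(text):
    """Map each field name to the list of its values, in order of appearance."""
    fields = {}
    for line in text.splitlines():
        if "::" not in line:
            continue  # skip blank or malformed lines in this simplified sketch
        name, value = line.split("::", 1)
        fields.setdefault(name.strip(), []).append(value.strip())
    return fields


parsed = parse_rfc1807(RECORD_TEXT)
print(parsed["AUTHOR"])
```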

2002.09.10 - SLIDE 73 - IS 202 - Fall 2002

Minimal TEI Header

• <teiHeader>
• <fileDesc>
• <titleStmt>
• <title>Introduction to cataloging and classification</title>
• <respStmt><name>Bohdan S. Wynar</name><resp>8th edition by</resp>
• <name>Arlene G. Taylor</name>
• </respStmt>
• </titleStmt>
• <publicationStmt>
• <distributor>Libraries Unlimited</distributor>
• </publicationStmt>
• <sourceDesc>
• <bibl>Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992.
• </bibl>
• </sourceDesc>
• </fileDesc>
• </teiHeader>

2002.09.10 - SLIDE 74 - IS 202 - Fall 2002

MARC Record (Display)

• ID:DCLC9124851-B RTYP:c ST:p FRN: MS:c EL: AD:06-20-91
• CC:9110 BLT:am DCF:a CSC: MOD: SNR: ATC: UD:04-11-92
• CP:cou L:eng INT: GPC: BIO: FIC:0 CON:b
• PC:s PD:1992/ REP: CPI:0 FSI:0 ILC:a II:1
• MMD: OR: POL: DM: RR: COL: EML: GEN: BSE:
• 010 9124851
• 020 0872878112 (cloth)
• 020 0872879674 (paper)
• 040 DLC$cDLC$dDLC
• 050 00 Z693$b.W94 1991
• 082 00 025.3$220
• 100 1 Wynar, Bohdan S.
• 245 10 Introduction to cataloging and classification /$cBohdan S. Wynar.
• 250 8th ed. /$bArlene G. Taylor.
• 260 Englewood, Colo. :$bLibraries Unlimited,$c1992.
• 300 xvii, 633 p. :$bill. ;$c24 cm.
• 440 0 Library science text series
• 504 Includes bibliographical references (p. 591-599) and index.
• 650 0 Cataloging.
• 650 0 Subject cataloging.
• 650 0 Classification$xBooks.
• 630 00 Anglo-American cataloguing rules.
• 700 10 Taylor, Arlene G.,$d1941-
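In this display, `$` followed by a single character marks the start of a subfield, and the text before the first `$` appears to be an implicit subfield `a`. A minimal sketch of splitting the data portion of a field under those assumptions follows; `split_subfields` is a hypothetical helper, and it works on the field data only, leaving the tag and indicators aside because their spacing is not reliable in this display.

```python
import re


def split_subfields(data, first_code="a"):
    """Split MARC display data such as 'x :$by,$cz' into (code, value) pairs.

    Assumes '$' plus one character introduces a subfield and that any text
    before the first '$' belongs to an implicit subfield (default code 'a').
    """
    parts = re.split(r"\$(\w)", data)
    subfields = [(first_code, parts[0])] if parts[0] else []
    # After the leading text, parts alternates: code, value, code, value, ...
    subfields.extend(zip(parts[1::2], parts[2::2]))
    return subfields


publication = split_subfields("Englewood, Colo. :$bLibraries Unlimited,$c1992.")
print(publication)
```

Applied to the 260 field above, this separates place, publisher, and date, the same three pieces that ISBD joins with " : " and ", ".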

2002.09.10 - SLIDE 75 - IS 202 - Fall 2002

Metadata Resources

• Check the Links section from the class home page

• Best site is the “Digital Library: Metadata Resources” page from IFLA at http://www.ifla.org/II/metadata.htm

• For another good source of information on metadata standards see http://www.chin.gc.ca/English/Standards

2002.09.10 - SLIDE 76 - IS 202 - Fall 2002

Next Time

• Controlled vocabularies (Introduction)

• Readings for next time (in Protected)
– Paper by Chris Borgman on online catalogs
– Paper by Marcia Bates on a design model for access
– Paper by Elaine Svenonius on controlled vocabularies