Digital Object Identifier workshop doi> Norman Paskin The International DOI Foundation.


Page 1: Digital Object Identifier workshop

doi>

Norman Paskin, The International DOI Foundation

Page 2: DOI - outline of talk

• Background: why DOI
• What the DOI system consists of
• What DOI does

Page 3: Background - why now?

• Identifiers enable us to manage content
• Physical world: ISBN, ISSN, ISMN, SICI, etc
  – good systems for publishers
• Digital world: ? URL?
  – poor systems for publishers (e.g. e-books)
  – how to use existing identifier systems?
• Make WWW transactions as invisible as telephone transactions
  – machine to machine,
  – not machine to people to machine

Page 4: The intellectual property background

• Digital world enables both use and protection
• Aim is to maximise value of information objects:
  – reduce copy infringement and
  – increase accessibility;
  – need to identify what it is you are managing
• Mass production → mass customisation
  – components must be clearly identifiable
  – and terms defined

Page 5: DOI - organisation

• International DOI Foundation: founded 1998
  – following demonstration of prototype in 1997
• Not-for-profit; paid membership support
  – similar principles to World Wide Web Consortium
• Open to all interested parties
• Democratic: board elected from members
• Full time staff (Director)
• 40+ organisations (growing)
  – Content owners (text publishers, music, etc)
  – Technology companies
  – Content intermediaries (etc)

Page 6: DOI: aim

• Establish a way of identifying content in the digital environment
  – actionable identifier
• Which can be the basis of rights management
  – extensible; can be developed further

Page 7: DOI requirements

• Identification of content
  – intellectual property in any form
  – precisely
• Actionable identification
  – automation; “click to do something”
  – services
• Interoperability, extensibility
• Open standard

Page 8: Key issues:

• Must be consistent
• Must be extensible:
  – technology: changes, e.g. PC netC P2P …?; e-books; WAP
  – multimedia: needed, e.g. music clip and image in e-book with web update (“media convergence”)
  – applications: cannot be known in advance

Page 9: DOI: development in three tracks

[Diagram: three parallel tracks]
• Initial implementation: single redirection (persistent identifier)
• Full implementation: metadata; multiple resolution
• Activity tracking: W3C, WIPO, NISO, ISO, UDDI etc.

A continuing development activity

Page 10: DOI: components

• An analogy: the telephone system

Page 11: DOI: components

• A number (or “name”)
  – assign a number to something
  – (compare: telephone number)

Page 12: DOI: components

• A number (or “name”)
  – assign a number to something
  – (compare: telephone number)
• A description
  – what the number is assigned to
  – (compare: directory entry)

Page 13: DOI: components

• A number (or “name”)
  – assign a number to something
  – (compare: telephone number)
• A description
  – what the number is assigned to
  – (compare: directory entry)
• An action
  – make the number do something
  – (compare: the telephone system)

Page 14: DOI: components

• A number (or “name”)
  – assign a number to something
  – (compare: telephone number)
• A description
  – what the number is assigned to
  – (compare: directory entry)
• An action
  – make the number do something
  – (compare: the telephone system)
• Policies
  – how to get a phone number; billing
  – (compare: social structures)

Page 15: [Diagram: the four DOI components]

• NUMBERING: Syntax, e.g. 10.1234/5678
• DESCRIPTION: Metadata. Pieces of data which describe uniquely that which is identified
• ACTION: Resolution. System able to link the number to something useful
• POLICIES: Deployment

Page 16: doi> extensible

• NUMBERING: Any form of identifier
• DESCRIPTION: <indecs> framework: DOI can describe any form of intellectual property, at any level of granularity
• ACTION: Handle resolution allows a DOI to link to any and multiple pieces of current data
• POLICIES

Page 17: 1. Numbering

• DOI syntax: how the number is made up
  – NISO standard (Z39.84)
  – 10.1000/12345
    • 10.1000 = prefix (e.g. publisher, journal, etc)
    • 12345 = suffix (combination is unique)
• Suffix can be anything (CrossRef example)
• An opaque string (“a dumb number”)
  – parts do not have separate meaning
• Permanent
  – stays the same if ownership or location changes
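The prefix/suffix syntax on this slide can be illustrated with a short sketch. It assumes only what the slide states (a prefix, a "/", and an opaque suffix) plus the "10." start seen in every example in this talk; the function name `parse_doi` and the optional "doi:" label handling are illustrative, not part of any standard library.

```python
from typing import NamedTuple


class ParsedDoi(NamedTuple):
    prefix: str
    suffix: str


def parse_doi(doi: str) -> ParsedDoi:
    """Split a DOI string into prefix and suffix at the first "/"."""
    text = doi.strip()
    # Accept an optional "doi:" label, as in "DOI:10.1000/123".
    if text.lower().startswith("doi:"):
        text = text[4:]
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix.startswith("10.") or len(prefix) <= 3 or not suffix:
        raise ValueError(f"not of the form 10.xxxx/suffix: {doi!r}")
    # The suffix is opaque: it is kept whole and never interpreted.
    return ParsedDoi(prefix, suffix)


print(parse_doi("10.1000/12345"))
```

Because the suffix is treated as a "dumb number", the sketch splits only at the first slash and leaves any later slashes inside the suffix untouched.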

Page 18: 2. Description

• “What is numbered?”
• Not as simple as you might think:
  1. Not only digital files, but physical things and intangible things.
  2. Not only things, but parts of things.

Page 19: Not only digital things...

• Manuscript: mss #ABC123
• Paper: journal/volume/page; ISBN, ISSN, etc.

Page 20: [Diagram labels]

• MS
• Vol/page; ISBN; SICI, etc
• URL
• “intangible abstraction”: ISTC?

Page 21: Not only things, but parts of things

• Components
• Book
  – Chapter
    • Section
      – Figure

Page 22: Not only things, but parts of things

• Components
• Book
  – Chapter
    • Section
      – Figure
• “Granularity”

Page 23: Not only things, but parts of things

• Components
• Book
  – Chapter
    • Section
      – Figure
• “Granularity”
• Must be able to identify at whatever level is appropriate: functional granularity

Page 24: Description is by metadata

• Metadata is: Data
• Relationships between data
  – Book: ISBN 0864426437 (data)
  – Price: $12.95 (metadata)
  – Subject: Buenos Aires (metadata)
• One man’s metadata is another man’s data

Page 25: Description is by metadata

• Not sufficient to assign an identifier without specifying precisely what the entity is
  – “a paper” or “a book” is not precise
  – must be precise, because:
• In an automated world, that specification must be by metadata (able to be used by machines)
• In an interoperable world, that metadata must be
  – unambiguous (“well-formed”)
  – follow a data model (able to be used consistently by machines)

Page 26: DOI used indecs framework

Interoperability of data in e-commerce systems
• Broad in scope: generic intellectual property management
  – description, transaction, rights
• Based on tested “real world” models
  – CIS (music industry); IFLA (library cataloguing)
• Wide endorsement of this approach
  – see recent papers Lagoze, Caplan (links at www.doi.org)
• Now in use in applications
  – note especially EPICS/ONIX dictionary
• Extensible, structured, open standard

Page 27: DOI kernel metadata

• A few (7-8) key pieces of data
  – title, type of content, origin, etc
  – varies according to what is needed (video, book, etc)
• about the object
  – does not include rights metadata
• but interoperates with rights data
  – because based on same data model
  – uses the same terms to mean the same thing
• DOI “Genre” defines key metadata for a family
  – see DOI Handbook
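As a rough illustration of a small, machine-checkable kernel record, the sketch below models only the three elements the slide names (title, type of content, origin) alongside the DOI itself. The class name, field names and the emptiness check are assumptions made for this example; the actual kernel is defined in the DOI Handbook, not here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class KernelRecord:
    # Hypothetical field names; the slide names only title, type of content and origin.
    doi: str
    title: str
    content_type: str
    origin: str


def missing_fields(record: KernelRecord) -> list[str]:
    """Return the names of empty fields, so a machine can reject an incomplete record."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


record = KernelRecord(
    doi="10.1000/123",
    title="Example title",
    content_type="text",
    origin="",  # deliberately left empty to show the check
)
print(missing_fields(record))  # ['origin']
```

No rights data appears in the record, matching the slide's point that the kernel describes the object only.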

Page 28: 3. Actions

[Diagram labels: User; Web Browser; Actionable identifier doi> 10.1000/123; Specified Action; etc.]

Page 29: Example issue: getting the appropriate copy

• I have found what I want to link to, but:
  – I have a copy locally; or
  – I use an aggregator; or
  – The publisher provides alternative sources; or
  – I am linked to an authorised E-print archive; or
  – It is available in a public archive (etc)
• so I want to go to the “appropriate copy”
  – rights issues (access control) are implicit
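One way to picture the "appropriate copy" problem is as a choice among several known copies according to the reader's context. The sketch below is a simplified assumption: the source names, URLs and preference list are hypothetical, and access control, which the slide notes is implicit, is not modelled.

```python
from typing import Optional

# Hypothetical copies of one item, keyed by the kind of source named on the slide.
COPIES = {
    "publisher": "http://srv1.pub.example/item",
    "aggregator": "http://aggregator.example/item",
    "public_archive": "http://archive.example/item",
}


def appropriate_copy(copies: dict[str, str], preference: list[str]) -> Optional[str]:
    """Return the URL of the first preferred source that actually holds a copy."""
    for source in preference:
        if source in copies:
            return copies[source]
    return None


# A reader whose institution prefers a local copy, then an aggregator.
print(appropriate_copy(COPIES, ["local", "aggregator", "publisher"]))
```

The same identifier leads different readers to different copies; only the preference list changes.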

Page 30: DOI uses Handle System®

• Open Standard using internet
• Distributed, scalable, fast and reliable
• In use now in several places (e.g. Lib. of Congress)
• Very simple concept, powerful applications
• Fits with other standards (URL, URN, etc)
• Associates a name with “values” (e.g. URL)
  – input DOI
  – output URL (or some other defined value)
• Work by CNRI (Robert Kahn)

Page 31: [Diagram labels: Web Browser; Local Client; Global Handle System; DOI?; URL; www.pub.com; abc; abc.doc]

Page 32: Background: DOIs resolve to Typed Data

DOI Handle data
[Table; the Index column values are not legible in this transcript]

DOI          Data type   Handle data
10.123/456   URL         http://srv1.pub.com/.....
             URL         http://srv2.pub.com/.....
             URL         http://srv3.pub.com/.....
             MD          http://lu.cr.com/10.123..
             EM          [email protected]
             IP          10.456/789

Page 33: DOIs resolve to Typed Data

[Same handle data table as on Page 32]
Multiple typed values per DOI

Page 34: DOIs resolve to Typed Data

[Same handle data table as on Page 32]
Extensible typing

Page 35: DOIs resolve to Typed Data

[Same handle data table as on Page 32]
Query by type
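The three points made across these slides (multiple typed values per DOI, extensible typing, query by type) can be sketched with a small in-memory table. This is not the Handle System API: the record layout, the index numbers and the hosts under `.example` are placeholders chosen for illustration, and only the type names URL and MD are taken from the slide.

```python
from typing import NamedTuple, Optional


class TypedValue(NamedTuple):
    index: int       # position within the record (illustrative numbers)
    data_type: str   # a plain string, so new types need no code change
    data: str


# Hypothetical record for the slide's example DOI.
RECORDS = {
    "10.123/456": [
        TypedValue(1, "URL", "http://srv1.pub.example/item"),
        TypedValue(2, "URL", "http://srv2.pub.example/item"),
        TypedValue(3, "MD", "http://metadata.example/10.123/456"),
    ]
}


def resolve(doi: str, data_type: Optional[str] = None) -> list[TypedValue]:
    """Return every value held for a DOI, or only the values of one data type."""
    values = RECORDS.get(doi, [])
    if data_type is None:
        return list(values)
    return [v for v in values if v.data_type == data_type]


print([v.data for v in resolve("10.123/456", "URL")])
```

Asking for "URL" returns both locations, while asking for "MD" returns only the metadata pointer; an unknown DOI yields an empty list in this sketch.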

Page 36: For convenience we re-draw like this:

[Diagram: INPUT doi> 10.1000/123; OUTPUT URL, URL2, RAP, XYZ, etc.]

Page 37: 4. Policies

• DOI free to use
  – costs paid by assigner
• DOI applies to any Intellectual Property entity
  – copyright focus (Berne/WCT etc)
• Registration agencies to deal with assigning DOIs (and metadata/resolution) for publishers etc
• Business models determined by agencies
• Policies for agencies are now evolving

Page 38: What is DOI?

Digital Object Identifier
• A unique persistent identifier….
  – of a piece of intellectual property
  – in any form (tangible, intangible)
  – defined by some key metadata
  – an opaque string e.g. DOI:10.1000/123

Page 39: What is DOI?

• “resolvable..”
  – routing, via proven internet technology,
• “to associated state data”….
  – one or more current values of specified types of data (e.g. URL);
  – these data may be, or link to, services

Page 40: What is DOI?

• “in an information management substrate…”
  – once the (meta)data has been obtained, it can interoperate with other data
  – e.g. about context (subscription etc)
  – to construct services and transactions
  – because (meta)data follows a generic interoperable architecture

Page 41: What is DOI?

“A unique resolvable identifier and multiple pieces of associated state data in an information management substrate” achieved by:
• Technical implementation + policies
• Two underlying technical tools:
  1. intellectual property: <indecs> framework
  2. resolution: Handle System

Page 42: What are the advantages of DOI?

1. Identify the item of intellectual property
  • not its location, because:
  • if the location changes the identifier should stay the same (persistence)
  • the same “resource” can be at several locations at the same time (“multiple copies”)

DOI does this

Page 43: What are the advantages of DOI?

2. Able to deal with relationships:
  – “this item is a manifestation of that work”
  – “this item is a part of that item”

DOI does this:
• Metadata can express relationships
  – “is part of…” etc
• DOIs can resolve to other DOIs
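Relationships such as "is part of" can be pictured as links from one DOI to another. The sketch below stores them as simple triples and follows the "is part of" links upward; the DOIs, the relation names spelled with underscores and both helper functions are hypothetical, chosen only to mirror the two examples on the slide.

```python
# Hypothetical DOIs and relation names, modelled on the slide's two examples.
RELATIONS = [
    ("10.1000/fig7", "is_part_of", "10.1000/chapter3"),
    ("10.1000/chapter3", "is_part_of", "10.1000/book"),
    ("10.1000/book", "is_manifestation_of", "10.1000/work"),
]


def related(doi: str, relation: str) -> list[str]:
    """Return the DOIs that `doi` points to through one named relation."""
    return [obj for subj, rel, obj in RELATIONS if subj == doi and rel == relation]


def containers(doi: str) -> list[str]:
    """Follow "is_part_of" links upward, nearest container first."""
    chain: list[str] = []
    seen = {doi}
    current = doi
    while True:
        parents = related(current, "is_part_of")
        # Stop at the top of the hierarchy, or if the data contains a loop.
        if not parents or parents[0] in seen:
            return chain
        current = parents[0]
        seen.add(current)
        chain.append(current)


print(containers("10.1000/fig7"))  # ['10.1000/chapter3', '10.1000/book']
```

Each level (figure, chapter, book) carries its own identifier, so a service can start from any part and find the items that contain it.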

Page 44: What are the advantages of DOI?

3. Apply to any intellectual property entity
  – any format (digital convergence)
  – any granularity (any part of something)
4. Enable complex actions
  – can express relationships between entities
  – interact with data from other sources
  – enables services (automated, predictable) to be constructed

Page 45: What are the advantages of DOI?

5. Extensible
  • resolution system has capability for trusted transactions (p.k.i.)
  • metadata framework has capability for full rights management architecture
6. Not limited to current environments
  • not just the Web (other Internet applications)
  • not just digital (intangibles etc)

Page 46: Identifiers on the web

[Diagram labels: User; Web Browser; URL; “404 not found”; “...has moved to…”]

1. URL is not a persistent identifier - it refers to Location, not content
2. Same content at two different URLs has two different identifiers - cannot use as common reference

“One in five Web links >1yr old may be out of date” (Alta Vista)

Page 47: Identifiers on the web

[Diagram labels: User; Web Browser; URL]

1. Don’t change the URL; “persistence is a social, not a technology, problem”
  – People do change URLs
  – There are good reasons to change URLs
  – Does not deal with multiple copies

Page 48: Identifiers on the web

[Diagram labels: User; Web Browser; name; http; URL]

2. Assign a Name and use http redirect
  – Bookmarks and caches save the end point, not the name (in current browsers)
  – Does not deal with multiple copies

Page 49: Identifiers on the web

[Diagram labels: User; Web Browser; doi>; URL]

3. Assign a Name and use resolver
  – DOI provides name
  – Multiple resolution

Page 50: DOI initial implementation

[Diagram labels: User; Web Browser; doi> 10.1000/123; Resolution; URL]

1. DOI is a persistent identifier
2. DOI identifies the content, irrespective of the location
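The difference between saving a location and saving a name can be shown with a toy resolver. This is a minimal sketch, not the DOI resolution service: the `Resolver` class and the URLs under `.example` are invented for illustration.

```python
class Resolver:
    """Toy name-to-location table standing in for a resolution service."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        # Registering the same DOI again replaces its current location.
        self._locations[doi] = url

    def resolve(self, doi: str) -> str:
        return self._locations[doi]


resolver = Resolver()
resolver.register("10.1000/123", "http://old.pub.example/abc.doc")

bookmarked_url = resolver.resolve("10.1000/123")  # one reader saves the end point
bookmarked_doi = "10.1000/123"                    # another reader saves the name

# The content moves; only the resolver entry is updated.
resolver.register("10.1000/123", "http://new.pub.example/abc.doc")

print(bookmarked_url)                    # the old location, now stale
print(resolver.resolve(bookmarked_doi))  # the current location
```

The saved URL goes stale when the content moves, while the saved DOI keeps working because the identifier itself never changed.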

Page 51: Full DOI implementation

[Diagram labels: User; Web Browser; Actionable identifier doi> 10.1000/123; Multiple Resolution; URL; URL2; Data 1; Data 2; etc.]

Identifier resolves to any piece of data

Page 52: [Diagram labels: User; Web Browser; Actionable identifier doi> 10.1000/123; Resolution service; URL; URL2; Data 1; Data 2; etc.; Specified Action; Service 1 @ 10.1000/123]

Page 53: Digital Object Identifier workshop

doi>

Norman Paskin, The International DOI Foundation