A Domain Model for Digital Curation

Stephen Abrams

UC Berkeley School of Information, Friday, October 16, 2015

UC Curation Center, California Digital Library

www.accademia.org/explore-museum/artworks/michelangelos-david

A domain model for digital curation

Justification

sept

Roadmap

Transition from ad hoc and idiosyncratic to rigorous and systematic analysis, planning, deployment, and assessment

Build up incrementally from first principles

Synthesize and extend prior community efforts

View curation as an inherently semiotic activity

Benefits

Better understand and express nuanced curation intentions and outcomes

Set realistic stakeholder expectations

Gain greater confidence that activities are comprehensive

Programmatic change and aging infrastructure

Imminent retirement of executive and program directors; mounting technical debt

genesis

An opportunity for strategic reassessment and planning

Hoping for a “short” background paper to guide analysis

The more investigation, the less confidence

The more questions, the fewer persuasive answers

Pragmatic advancement, but no robust and comprehensive conceptual underpinnings

Two decades of progress

state-of-the-art

There is a model – explicit or tacit – underlying all of these:

Fedora, OAIS, Portico, DIAS, LOCKSS, JHOVE, PREMIS, Dioscuri, TRAC, Plato, 4C, DPN, DSpace, PRONOM, AIHT, PDF/A, Chronopolis, iRODS, Ace, NDSA, FIDO, Olive, BitCurator, PCDM

Do they all fit together? Are we thinking about the right things and defining them properly?

There is no more overloaded and under-formalized term of practice than “digital object”

Prior object modeling

crosswalk

Models compared: Sender/receiver, Buckland, Kahn-Wilensky, FRBR, NAA, OAIS, PREMIS, BRM, ICO

Axis label: fineness of granularity

Note: not fully populated

Crosswalk terms, in slide order:

source, info-as-knowledge, work, essence, intellectual entity, propositional content, intellectual entity

encoding, info-as-thing, data, expression, source, data object / digital object, bitstream / filestream, symbol structure, symbol structure, manifestation, file / representation, item, bits, patterned matter/energy, information carrier

frame-of-reference, key-metadata, representation information, auxiliary information

channel, info-as-process, process, projection

signal, performance, sensory impression

content, knowledge base

decoding

effect, info-as-knowledge, essence, intellectual entity, propositional content, intellectual entity

Sept object modeling

crosswalk

The same crosswalk, with Sept terms added: message, structure, form, carrier, annotation, behavior, stimuli, ground, interpretation, experience

Digital Curation Centre

Maintaining, preserving and adding value to digital research data throughout its lifecycle

digital curation

Cui bono?

Process-centric: explains the what, but not the why or for whom

www.dcc.ac.uk

UC Curation Center

Complex of actors, policies, practices, and technologies enabling successful consumer engagement with authentic content of interest across space and time

digital curation

Distinguishable through consumer criteria

Is what it purports to be

Spanning production, management, and exploitation

Use is feasible and beneficial

Equally dependent on human competencies, institutional mission and resources, and technology

www.cdlib.org/uc3

Communication with the future

A digital object is the unit of communication

digital curation

An object encapsulates the message to be communicated, but not its meaning

Meaning is an emergent epistemic state of the consumer

Content is realized by physical stimuli …

Perceived by a sense modality …

Interpreted in a context …

Experienced as cognitive meaning or psychological affect

The final, crucial transition from perception to cognition is an inherently semiotic act

Signs and systems of signification

Charles Sanders Peirce (1839 – 1914)

semiotics

A sign is something that “stands in” for something else, for someone, in some manner.

Semiosis is a triadic relation between a representation, its referent, and its experiential effect

Interpretation takes place with respect to a subjective contextual ground

referent

representation

ground

effect

stands in for

contextualizes

stimulates

(re)presents

objective subjective

Object-mediated communication

Modes of understanding

semiosis

Denotative

Connotative

Diagram labels: object; curatorial presentation; contextual ground; frames-of-reference; experienced meaning; intended meaning; interpretation; codification; stimulus; owner, curator, creator, consumer; objective, subjective; feasible, beneficial; contextual noise; channel noise

Object modeling

Dimensions

analysis

Semantics: Meaning

Syntactics: Symbolic expression

Empirics: Physical representation

Pragmatics: Realizing behavior

Diplomatics: Evidential authenticity

Dynamics: Persistence and evolution

SSEPDD

SEPT: A subsidiary group or division of an extended family or clan

Object modeling

Components

analysis

Semantics, Syntactics, Empirics, Pragmatics, Diplomatics, Dynamics

Diagram labels: object; carrier, message, behavior, encoding; inscribed, realized, expressed, describes; semantics, syntactics, empirics, pragmatics; verification, intervention; diplomatics, dynamics; annotation

Object typology

Types

analysis

Dimension | Type | Adds | Example
Empirics | Blob | bits | SSD
Syntactics (morphology) | Artifact | identity: | File
Syntactics (structure) | Exemplar | type: | .pptx file
Semantics | Product | description: | Topical presentation
Pragmatics | Asset | behavior: | Presentation (in PowerPoint)
Diplomatics | Record | verification: | Presentation (really)
Dynamics | Heirloom | intervention: | Presentation (tomorrow)
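The typology above reads as a ladder: each type adds one kind of annotation to the type before it. A minimal Python sketch of that reading follows. The names `ObjectType`, `ADDED_ANNOTATION`, and `classify` are hypothetical, and the rule that a missing annotation stops the climb is an assumption for illustration, not something the talk states.

```python
from __future__ import annotations

from enum import IntEnum


class ObjectType(IntEnum):
    """Object types from the typology slide, in slide order."""
    BLOB = 0
    ARTIFACT = 1
    EXEMPLAR = 2
    PRODUCT = 3
    ASSET = 4
    RECORD = 5
    HEIRLOOM = 6


# Annotation that each type adds on top of the previous one (from the slide).
# A Blob needs only bits, so it has no entry here.
ADDED_ANNOTATION = {
    ObjectType.ARTIFACT: "identity",
    ObjectType.EXEMPLAR: "type",
    ObjectType.PRODUCT: "description",
    ObjectType.ASSET: "behavior",
    ObjectType.RECORD: "verification",
    ObjectType.HEIRLOOM: "intervention",
}


def classify(annotations: set[str]) -> ObjectType:
    """Return the highest type whose cumulative annotations are all present.

    Assumption (not from the talk): levels are strictly cumulative, so a gap
    at one level stops the climb even if later annotations exist.
    """
    level = ObjectType.BLOB
    for candidate in sorted(ADDED_ANNOTATION):
        if ADDED_ANNOTATION[candidate] not in annotations:
            break
        level = candidate
    return level
```

Under these assumptions, `classify({"identity", "type"})` yields `ObjectType.EXEMPLAR`, while `classify({"identity", "description"})` stays at `ObjectType.ARTIFACT` because the `type` annotation is missing.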

Object typology

analysis

Differentia | Blob | Artifact | Exemplar | Product | Asset | Record | Heirloom
Dimension | empirics | syntactics | syntactics | semantics | pragmatics | diplomatics | dynamics
Mode | formative | informative | informative | informative | performative | evaluative | reformative
Act | inscription | identification | characterization | description | realization | verification | intervention
Concern | media | (outer) encoding | (inner) encoding | meaning / affect | experience | authenticity | persistence
Abstraction | carrier | form | structure | message | behavior | evidence | action
Quality | existential | intentional | purposeful | interpretable | useful | trustworthy | resilient
Utility | nascent | incipient | potential | theoretical | practical | assured | enduring
Annotation | provenancial / administrative / permissive | provenancial / relational / associational | structural | intellectual | instrumental | provenancial | provenancial

Modes of engagement

Continuum, not lifecycle

continuum

Role, Locus, Concern

Lifecycle implies a prescribed progression through well-demarcated and distinguishable states

Continuum allows adaptive navigation among overlapping and interdependent activities

Creator Curator Consumer Owner

Origination Organization Pluralization

Production Management Exploitation

Modes of engagement

continuum

Locus | Originate | Organize | Pluralize
Production | observe, simulate, create, derive | identify, classify, clean, annotate, package | license, submit, publish, cite, aggregate
Management | appraise, select, harvest, collect | normalize, characterize, arrange, annotate, store, index, plan, watch, intervene, administer | replicate, audit, notify, syndicate, resolve, authorize, report
Exploitation | search, discover, retrieve, subselect | analyze, correlate, synthesize, interpret, transform, annotate | summarize, validate, assert, refute
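The engagement grid above can be held as a nested mapping, which makes the overlap between loci easy to query. This is a minimal Python sketch; the names `ENGAGEMENT` and `where` are hypothetical, and the assignment of activities to cells follows the slide layout as best it can be recovered from the transcript.

```python
# Activities by locus (rows) and by column heading on the slide.
ENGAGEMENT = {
    "Production": {
        "Originate": ["observe", "simulate", "create", "derive"],
        "Organize": ["identify", "classify", "clean", "annotate", "package"],
        "Pluralize": ["license", "submit", "publish", "cite", "aggregate"],
    },
    "Management": {
        "Originate": ["appraise", "select", "harvest", "collect"],
        "Organize": ["normalize", "characterize", "arrange", "annotate",
                     "store", "index", "plan", "watch", "intervene",
                     "administer"],
        "Pluralize": ["replicate", "audit", "notify", "syndicate", "resolve",
                      "authorize", "report"],
    },
    "Exploitation": {
        "Originate": ["search", "discover", "retrieve", "subselect"],
        "Organize": ["analyze", "correlate", "synthesize", "interpret",
                     "transform", "annotate"],
        "Pluralize": ["summarize", "validate", "assert", "refute"],
    },
}


def where(activity: str) -> list[tuple[str, str]]:
    """Return every (locus, column) cell in which the activity appears."""
    return [
        (locus, column)
        for locus, columns in ENGAGEMENT.items()
        for column, activities in columns.items()
        if activity in activities
    ]
```

For example, `where("annotate")` returns one cell in each locus, which illustrates the point that activities overlap instead of belonging to a single lifecycle stage.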

Policy and strategy

Imperatives

rubric

Predilect: Decide what you intend

Collect: Obtain (or do) what you intend

Protect: Preserve (or sustain) what you obtain

Introspect: Know what you protect

Project: Offer what you know

Connect: Deliver what you offer

Policy and strategy

rubric

Imperative | Blob | Artifact | Exemplar | Product | Asset | Record | Heirloom
Predilect | service level agreement | disaster recovery / business continuity | format action plans | collection development policy | outreach and training | evidentiary standards | sustainability / succession planning
Collect | annotation | packaging, submission | normalization / canonicalization | workflow / tool integration | code / workflow repositories, aggregation | chain of custody | preservation planning
Protect | environmental control, redundancy, media refresh | administrative control, fixity audit, malware detection / sanitation | technical control, migration | bibliographic control | access control, emulation | archival control | change control, preservation watch
Introspect | forensic characterization | morphological characterization, PID minting | structural characterization, ontologies, format registries | intellectual characterization, entity extraction, sentiment analysis, PID binding | behavioral characterization, software registries, analytics | archival characterization, master registry | provenance, annotation
Project | media inventory | file inventory, PID resolution | object index | work catalog | transcoding, syndication, discovery | documentary form | versioned change history
Connect | legacy / emulated computational environments | file delivery | format-aware processing | disciplinary-specific processing | search / browse, hosted tools, annotation | authenticity-dependent workflows | consortial collaboration
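One way to use the rubric is as a lookup from a target object type to the measures it implies. The Python sketch below covers only the Protect row. The names `OBJECT_TYPES`, `PROTECT`, and `protect_measures` are hypothetical, and the idea that measures accumulate from Blob up to the target type is an assumption made for illustration; the talk does not say so explicitly.

```python
from __future__ import annotations

# Object types in slide order.
OBJECT_TYPES = ["Blob", "Artifact", "Exemplar", "Product", "Asset",
                "Record", "Heirloom"]

# Protect row of the rubric, one cell per object type (from the slide).
PROTECT = {
    "Blob": ["environmental control", "redundancy", "media refresh"],
    "Artifact": ["administrative control", "fixity audit",
                 "malware detection / sanitation"],
    "Exemplar": ["technical control", "migration"],
    "Product": ["bibliographic control"],
    "Asset": ["access control", "emulation"],
    "Record": ["archival control"],
    "Heirloom": ["change control", "preservation watch"],
}


def protect_measures(target: str) -> list[str]:
    """List Protect measures for every type up to and including target.

    Assumption (not from the talk): protecting a higher type also requires
    the measures listed for all lower types.
    """
    if target not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {target}")
    measures: list[str] = []
    for object_type in OBJECT_TYPES:
        measures.extend(PROTECT[object_type])
        if object_type == target:
            break
    return measures
```

Under that assumption, a program that aims only at Blob-level protection needs the three media-level measures, while Exemplar-level protection adds fixity, administrative, and format-level measures on top.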

A domain model for digital curation

Next steps

sept

Respond to feedback

Continue development

Strategic planning for program and services

Use case analysis and requirements gathering for next generation repository

A domain model for digital curation

Summary

sept

Curation enables communication

Objects carry messages, not meanings

Consumer interpretation and experience are inherently subjective

Progress towards greater rigor in conceptualizing digital curation

Terminology for expressing nuanced intentions, actions, and outcomes

Object modeling concerns span six analytic dimensions

Object typology of increasing utility

Engagement entails a continuum of roles, activities, and concerns

Rubric for strategic and policy imperatives

Thank you

Stephen Abrams

sept domain model for digital curation

UC Curation Center, California Digital Library

[email protected]

wiki.ucop.edu/display/Curation/Foundations

ipres2015.web.unc.edu/ipres-2015-program/

www.flickr.com/photos/manroland_web_systems/8548753246