ForgetIT Project, GA 600826
Towards Concise Preservation by Managed Forgetting
iPRES-2013 Conference Lisbon, Portugal 5 September 2013
Nattiya Kanhabua, Claudia Niederée, and Wolf Siberski
L3S Research Center / Leibniz Universität Hannover
Hannover, Germany
ForgetIT Project Consortium
An interdisciplinary team of experts in:
– Preservation, information management, information extraction
– Multimedia analysis, storage computing, cognitive psychology
Outline
Motivation & Vision
Pilot Applications
Approaches: Overview
Integration Framework
Inspiration
A computer that forgets? Intentionally? And in the context of preservation?
However, we are facing
– dramatic increase in content creation (e.g. digital photography)
– information overload and changing professional + private lives
– increasing storage costs for long-term storage (>10 years)
– increasing use of mobile devices with restricted capacity
– inadvertent forgetting for lack of systematic preservation
And: Forgetting plays a crucial role for human remembering and life in general (focus, stress on important information, forgetting of details)
So: “Shouldn’t there be something like forgetting in digital memories as well?”
ForgetIT: Complementing Human Memory
V. Mayer-Schönberger. Delete - The Virtue of Forgetting in the Digital Age. Morgan Kaufmann Publishers, 2009.
Motivation
Opportunities
– major progress in preservation technology
– maturing information extraction technology
– storage as a service (e.g. clouds)
Needs
– increasing amount of digital content handled over decades
– more or less systematic backup strategies used
– non-paper practices for long-term perspective required
Major Obstacles
– large gap for adoption
– high up-front cost
– no established practices
– lack of understanding of benefit
– reluctance to invest
Vision: Building a Bridge
Opportunities
– major progress in preservation technology
– maturing information extraction technology
– storage as a service (e.g. clouds)
Needs
– increasing amount of digital content handled over decades
– more or less systematic backup strategies used
– non-paper practices for long-term perspective required
ForgetIT
– Enabling smooth transition to preservation
– Creating immediate benefit + reducing effort
– Opening alternatives to “keep it all” and “forgetting by accident”
– Easing interpretation in the long run
– Taking inspiration from and complementing human memory
Major Obstacles
– large gap for adoption
– high up-front cost
– no established practices
– lack of understanding of benefit
– reluctance to invest
Building the Bridge
Managed Forgetting
• as opposed to the current “forgetting by accident”
• inspired by human forgetting
Synergetic Preservation
• couples information management and preservation management
Contextualized Remembering
• bringing back information into active use in a meaningful way
Simple Example: Holidays
Timeline: trip, +1 month, +1 year after trip, +5-10 years, +20 years
• Trip to Paris with friends
• Thousands of pictures
• High awareness of trip details
• Showing of pictures
• Sorting out redundant pictures
• Sub-grouping and sorting
• Life goes on
• Pictures go out of focus
• Creation of a small diverse subset for showing occasionally
• Creation of summary page (e.g. “February 2015 Paris Team: Me, Mary Christine, Tom”)
• Addition of context info
• Further reduction of redundancy
• Rest of pictures into archive
• Changes in life (e.g. marriage)
• Addition/update of context information (e.g. girlfriend → wife)
• Dealing with preservation issues
• Revisiting of trip photos
• Re-integration into overall photo collection (link into context)
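The step "creation of a small diverse subset" can be illustrated with a minimal sketch. It assumes each picture is already described by a numeric feature vector (how those features are obtained is not covered in the slides), and it uses greedy farthest-point selection, which is one common heuristic and not necessarily the project's method. The function name `diverse_subset` is hypothetical.

```python
def diverse_subset(features: list[tuple[float, ...]], k: int) -> list[int]:
    """Greedy farthest-point selection.

    Each pick is the item whose nearest already-chosen item is farthest away,
    so near-duplicates of chosen pictures are picked last.
    Returns indices into `features`.
    """
    if not features or k <= 0:
        return []

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    chosen = [0]  # start from the first item (an arbitrary choice)
    while len(chosen) < min(k, len(features)):
        best = max(
            (i for i in range(len(features)) if i not in chosen),
            key=lambda i: min(dist2(features[i], features[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen
```

With two clusters of near-identical pictures, asking for two items returns one picture from each cluster, which matches the idea of keeping a small subset for occasional showing while the rest goes to the archive.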
Application: Personal Preservation
Starting point:
– Tremendous growth of information in personal sphere
– Diversity and fast evolution of devices, platforms and formats
– Keeping info sustainably available: Only ad hoc solutions for mid-term, long-term solutions
ForgetIT approach:
– Preservation solution for personal information space
– Based on concept of Semantic Desktop
– Consideration of social web content, multimedia content, other types of personal content, knowledge structures
– Additional short/mid-term benefit: de-cluttering information space by managed forgetting
– Consideration of multi-level infrastructures (e.g. mobile, PC, cloud)
Dissemination/Exploitation: Personal Preservation as a service (e.g. to customers of a telco company)
Application: Organizational Preservation
Starting point: existing and popular CMS (TYPO3)
– Sophisticated workflows for content creation and publication
– But: separation of publication and preservation/archival
– Access to archived content is difficult and costly
– Obsolete and even outdated information stays online
ForgetIT approach:
– Preservation as integral part (binary model → gradual managed forgetting)
– Bolder attitude towards removing content possible
– Automated support of cleaning-up processes
– Support of many stages of archiving, e.g. offline but still in index, aggregates online/content in archive, only aggregates kept, etc.
Dissemination/Exploitation: Involvement of TYPO3 community, TYPO3 with preservation extension as open source project to TYPO3 community
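The "many stages of archiving" can be sketched as an ordered enumeration, which makes the contrast with a binary online/archived model concrete. This is an illustration only: the stage names are taken from the examples on the slide, while `ArchiveStage` and `next_stage` are hypothetical names and not part of TYPO3 or any ForgetIT API.

```python
from enum import IntEnum


class ArchiveStage(IntEnum):
    """Gradual stages between 'fully online' and 'mostly forgotten'."""
    ONLINE = 0             # content fully published
    OFFLINE_INDEXED = 1    # offline but still in index
    AGGREGATES_ONLINE = 2  # aggregates online, content in archive
    AGGREGATES_ONLY = 3    # only aggregates kept


def next_stage(stage: ArchiveStage) -> ArchiveStage:
    """Move one step further towards forgetting; the last stage is terminal."""
    return ArchiveStage(min(stage + 1, ArchiveStage.AGGREGATES_ONLY))
```

Because the stages are ordered, a clean-up process can move content one step at a time instead of making a single keep-or-delete decision.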
Variables & Dimensions (Personal vs. Organization)
Scenarios
• Personal events (years at school, holidays, social events, graduations, marriage, etc.)
• Public events
• Work-related events (project starts/closing, business trips, new products, etc.)
Data Type, Personal
• Local: photos, contacts, SMS
• Online: user-generated content
• Features: 1. user behaviors 2. social context
Data Type, Organization
• Local: textual documents, files
• Online: web pages
• Features: 1. user roles 2. policies
Interaction: search/retrieve, re-find, organize, explore
Action: summarization, aggregation, delete
Information Value Assessment
Memory Buoyancy: short-term relevance/interests, e.g. current meeting documents
Preservation Value: long-term interests, e.g. important life events
Subjective metrics: usage logs (views, edits, modifications), social context, influences
Objective metrics: diversity, coverage, quality
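A minimal sketch of how the two values could be computed follows. It rests on several assumptions not stated on the slide: that usage logs drive memory buoyancy, that the objective metrics drive preservation value, that usage decays exponentially with age, and that all weights and the half-life are placeholders. The names `UsageEvent`, `memory_buoyancy` and `preservation_value` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class UsageEvent:
    kind: str        # "view" or "edit", as found in usage logs
    age_days: float  # how long ago the event happened


# Assumed weights and half-life; the slides do not specify any of these.
EVENT_WEIGHTS = {"view": 1.0, "edit": 2.0}
HALF_LIFE_DAYS = 30.0


def memory_buoyancy(events: list[UsageEvent]) -> float:
    """Short-term relevance: recent usage counts more than old usage."""
    decay = math.log(2) / HALF_LIFE_DAYS
    raw = sum(
        EVENT_WEIGHTS.get(e.kind, 0.0) * math.exp(-decay * e.age_days)
        for e in events
    )
    return 1.0 - math.exp(-raw)  # squash into the range [0, 1)


def preservation_value(diversity: float, coverage: float, quality: float) -> float:
    """Long-term value: plain average of objective metrics, each in [0, 1]."""
    return (diversity + coverage + quality) / 3.0
```

A document viewed yesterday gets a higher buoyancy than one last viewed a year ago, while its preservation value stays independent of recent usage.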
Managed Forgetting
Automatic deletion?
Managed Forgetting
Inspired by central role of human forgetting
Aim:
– help in identifying and focusing on relevant information
– supporting preservation content selection
Will replace inadvertent forgetting
Managed forgetting ≠ automatic deletion
Instead: range of forgetting options, e.g.
– resource condensation
– change of indexing & ranking
– reduction of redundancy
Based on:
– Careful information value assessment
– Forgetting strategies via policies
– Forgetting options that integrate final manual checking before deletion
– Combination with multi-tier storage solution possible
[Figure: use of storage tiers with decreasing memory buoyancy]
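The idea of forgetting strategies expressed as policies can be sketched as a function that maps the two assessed values to a forgetting option and a storage tier. The thresholds, tier names and action labels below are illustrative assumptions, not values from the project; the one property taken from the slides is that deletion is only proposed and flagged for manual checking, never performed automatically.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str         # forgetting option to apply
    tier: str           # storage tier the resource should live on
    needs_review: bool  # True when a human must confirm before deletion


def decide(memory_buoyancy: float, preservation_value: float) -> Decision:
    """Map assessed values (both in [0, 1]) to a forgetting option."""
    if memory_buoyancy >= 0.6:
        return Decision("keep", "fast", False)
    if memory_buoyancy >= 0.3:
        # Still somewhat relevant: keep it, but rank it lower in search.
        return Decision("lower ranking", "fast", False)
    if preservation_value >= 0.5:
        # Out of focus but valuable in the long run: condense and preserve.
        return Decision("condense and archive", "archive", False)
    # Low on both scales: propose deletion, subject to manual checking.
    return Decision("propose deletion", "slow", True)
```

For example, `decide(0.1, 0.8)` sends a resource to the archive tier, while `decide(0.1, 0.1)` only yields a deletion proposal with `needs_review` set.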
Contextualized Remembering
Aim:
– bringing back information into active use in a meaningful way even if a lot of time has passed
– aiming for semantic level of preservation
Based on:
– taking into account relevant parts of context when moving to archive
– increasing contextualization of preserved content
– considering context evolution over time (evolution-aware contextualization)
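The two steps named above, storing the relevant context with the content at archiving time and later comparing it with the current context, can be sketched as follows. The flat `dict` context model and the names `ArchivedItem`, `contextualize` and `recontextualize` are simplifying assumptions for illustration; the slides do not prescribe a context representation.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedItem:
    content: str
    # Context as it was at archive time, e.g. person -> role.
    context: dict[str, str] = field(default_factory=dict)


def contextualize(
    content: str, current_context: dict[str, str], mentioned: list[str]
) -> ArchivedItem:
    """Keep only the context entries relevant to this content when archiving."""
    relevant = {k: v for k, v in current_context.items() if k in mentioned}
    return ArchivedItem(content, relevant)


def recontextualize(
    item: ArchivedItem, todays_context: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Report entries whose meaning changed between archive time and today."""
    return {
        key: (old, todays_context[key])
        for key, old in item.context.items()
        if key in todays_context and todays_context[key] != old
    }
```

Using invented names: archiving trip photos with the context `{"Mary": "girlfriend", "Tom": "friend"}` and re-contextualizing against `{"Mary": "wife", "Tom": "friend"}` reports `{"Mary": ("girlfriend", "wife")}`, the kind of context update shown in the holiday example.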
Evolution-aware Contextualization & Re-contextualization
[Figure, built up over several slides: a document D and its context of interpretation C along a time axis t, shown for the Information System and the Archival Information System. Contextualization and context-aware preservation place Pres(D‘) with Pres(C‘) in the archive. Human forgetting, change in focus and structural changes move the context from C to C‘. Semantic evolution, structural evolution and terminology evolution lead to C‘‘; semantic evolution detection and evolution-aware contextualization yield Pres(D‘) with Pres(C‘‘). Re-contextualization brings D back into a later context C‘‘‘.]
Synergetic Preservation
smooth and step-wise transition between active information use and preservation
enables rich information flow in both directions
supports more informed preservation decisions
eases preservation adoption
ForgetIT Project, GA600826 - Kickoff Meeting, Hannover, February 2013
[Figure: Preserve-or-Forget Framework between an Archival Information System and an Information Management System. Archival Information System labels: Ingest, Data Management (Descr. Info.), Archival Storage (AIPs), Access, Administration, Preservation Planning. Framework labels: Synergetic Preservation, Extraction & Contextualization, Re-Contextualization, Managed Forgetting (Information Assessment, Condensation). Information Management System labels: Content Management, Access, Authoring, Administration, Adapter Layer.]
Integration Framework
Information Management System (IMS)
• Resources + metadata:
• ResourceID
• Content (size, tags, aging, geo)
• Context (folder/file usage)
• Social features
• Resource neighbours (graph)
Forgettor
• Assessor calculates: memory buoyancy, preservation value
• Analyzer: 1. classification of resources w.r.t. strategies 2. triggers forgetting actions
• Data: strategies, values, statistics
Data exchange
• Resources meta-info
• Resources values + decisions
Processing steps
– Input: strategy, meta-information (content, context, neighbours), previous values
– Forgetting strategies for different types of resources
– Processing resources based on strategies and information values
– Storing the new values and sending them back to IMS
Archives
– Store & access data
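The processing steps listed above can be sketched as a small class. This is a toy model under stated assumptions: the assessment formula is a placeholder, a single `days_since_use` field stands in for the richer meta-information, strategies are plain callables keyed by resource type, and the names `ResourceInfo`, `Strategy`, `assess` and `process` are hypothetical. Only the overall flow (meta-info in, values and decisions out, previous values stored) follows the slide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ResourceInfo:
    resource_id: str
    kind: str              # resource type, used to pick a strategy
    days_since_use: float  # stand-in for content/context/neighbour meta-info


# A strategy maps (memory buoyancy, preservation value) to a forgetting action.
Strategy = Callable[[float, float], str]


class Forgettor:
    def __init__(self, strategies: dict[str, Strategy]):
        self.strategies = strategies
        # Previous values per resource: (memory buoyancy, preservation value).
        self.values: dict[str, tuple[float, float]] = {}

    def assess(self, info: ResourceInfo) -> tuple[float, float]:
        """Assessor: toy formula, blended with the previous value if any."""
        mb = 1.0 / (1.0 + info.days_since_use / 30.0)
        prev_mb, prev_pv = self.values.get(info.resource_id, (mb, 0.5))
        return (0.5 * mb + 0.5 * prev_mb, prev_pv)

    def process(self, batch: list[ResourceInfo]) -> dict[str, tuple[float, float, str]]:
        """Analyzer: apply the strategy per resource type, store the new
        values, and return values plus decisions for the IMS."""
        out = {}
        for info in batch:
            mb, pv = self.assess(info)
            action = self.strategies[info.kind](mb, pv)
            self.values[info.resource_id] = (mb, pv)
            out[info.resource_id] = (mb, pv, action)
        return out
```

An IMS adapter would call `process` with the meta-information of its resources and apply the returned decisions itself, which keeps the Forgettor independent of any particular information management system.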
Thank you
http://ForgetIT-Project.eu/