Planning and Watch

SCAPE. Christoph Becker, Vienna University of Technology, www.ifs.tuwien.ac.at/~becker. SCAPE First year project review, Luxembourg, March 20-21, 2012. Planning and Watch: Review presentation.


Transcript of Planning and Watch

Page 1: Planning and Watch

SCAPE

Christoph Becker
Vienna University of Technology
www.ifs.tuwien.ac.at/~becker

SCAPE First year project review, Luxembourg
March 20-21, 2012

Planning and Watch
Review presentation

Page 2: Planning and Watch

Outline

• Objectives and overall progress
• Key results
  • Watch design (D12.1)
  • Decision factors analysis (D14.1)
  • Sneak preview: The knowledge browser
• Integration and outlook
• Time for questions

Page 3: Planning and Watch

Preservation Planning: Key concepts

• Repeatable, standardized planning workflow
• A weighted hierarchy of objectives
• Measurable criteria on the leaf level of the tree
• Utility functions make criteria comparable
• Controlled experimentation on sample content
• Evidence-based decision making
• Standardized structure for plan specification
• Transparency and documentation
• Comparability across scenarios
• Planning tool Plato guides, validates, documents
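
The weighted hierarchy of objectives and the utility functions named on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class, the `linear_utility` helper, the criteria, the weights and the 0 to 5 target scale are hypothetical and do not reproduce Plato's actual data model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def linear_utility(worst: float, best: float) -> Callable[[float], float]:
    """Map a raw measurement linearly onto an assumed 0..5 utility scale."""
    def utility(value: float) -> float:
        fraction = (value - worst) / (best - worst)
        return 5.0 * min(1.0, max(0.0, fraction))
    return utility


@dataclass
class Node:
    """Objective-tree node; leaves are measurable criteria (hypothetical model)."""
    name: str
    weight: float
    children: List["Node"] = field(default_factory=list)
    utility: Optional[Callable[[float], float]] = None

    def score(self, measurements: Dict[str, float]) -> float:
        if not self.children:
            # Leaf: the utility function makes the raw measure comparable.
            return self.utility(measurements[self.name])
        # Inner node: weighted mean of child scores, weights normalised per level.
        total = sum(child.weight for child in self.children)
        return sum(child.weight / total * child.score(measurements)
                   for child in self.children)


# Made-up tree: 60% quality, 40% cost.
tree = Node("root", 1.0, [
    Node("quality", 0.6, [Node("pixel_equal_pct", 1.0, utility=linear_utility(90, 100))]),
    Node("cost", 0.4, [Node("seconds_per_object", 1.0, utility=linear_utility(10, 0))]),
])

# Made-up experiment results for two candidate tools.
alternatives = {
    "tool_a": {"pixel_equal_pct": 100.0, "seconds_per_object": 5.0},
    "tool_b": {"pixel_equal_pct": 95.0, "seconds_per_object": 0.0},
}
scores = {name: tree.score(m) for name, m in alternatives.items()}
best = max(scores, key=scores.get)
```

With these invented numbers, `tool_a` ranks first because the higher-weighted quality branch outweighs its slower runtime.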

Page 4: Planning and Watch

Scalability Challenges

• Creating a plan is effort-intensive
• Sharing experience is difficult
• Monitoring changes is manual
• Integrating context, strategies and operations is difficult

Page 5: Planning and Watch

Scalability Challenges

• Creating a plan is effort-intensive
  • Increase efficiency of planning
• Sharing experience is difficult
  • Increase standardisation and reusability
• Monitoring changes is manual
  • Introduce automation
• Integrating context, strategies and operations is difficult
  • Manage policies
  • Integrate systems

Page 6: Planning and Watch

Work packages and major goals

• PW.WP.1 (WP12): Automated Watch
  • Watch component for monitoring aspects of interest
  • Simulation component for prediction
• PW.WP.2 (WP13): Policies Representation
  • Catalogue of high-level policy statements
  • Machine-understandable model of low-level policy statements
  • Structural and procedural relations between these
• PW.WP.3 (WP14): Automated Planning
  • Refinement of the planning method
  • Analysis of decision factors and criteria
  • Planning component (integrated with repositories)

Page 7: Planning and Watch

Overall progress in year 1

• Startup phase
  • Conceptual advances
  • Development started a bit delayed
  • No major impact on delivery schedule
• Parallel interacting streams
  • Analysis of methods: planning, policies, monitoring
  • Prototype development: Plato4, analysis module, watch services
  • Integration experiments: Components and Taverna workflows
• Milestones and deliverables
  • MS58 Policy elements (m6)
  • D14.1 Decision factors analysis (m10)
  • MS59 Policy catalogue (m12)
  • D12.1 Watch design (m12)

Page 8: Planning and Watch

Status WP12: Watch

• Watch service definition completed
  • Clarification of goals, scope and key concepts
• Watch component design finalised: D12.1
  • Analysis of drivers and constraints
  • Analysis of events and triggers
  • Architecture design
• Development started
  • First milestone release in autumn 2012
  • Simulation environment: Preliminary work started

Page 9: Planning and Watch

D12.1: Key goals for Automated Watch

1. Enable the planning component to automatically monitor entities and properties of interest
2. Enable human users and software components to pose questions about entities and properties of interest
3. Act as a central place for collecting relevant knowledge that can be used to preserve an object or a collection
4. Collect information from different sources through adaptors
5. Enable human users to add specific knowledge
6. Notify interested agents when an important event occurs
7. Act as an extensible component

Page 10: Planning and Watch

Watch: Key concepts

• Knowledge base
  • Entities and their properties
  • Measures of properties over time
  • Triggers define conditions and events
• Flexibility and extensibility
  • A well-defined, flexible data model
  • Adaptors for different information sources
• Monitoring Capabilities
  • Internal Monitoring
  • External Monitoring
  • Monitor compliance, risks and opportunities
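
The knowledge base and adaptor concepts on this slide can be pictured with a small sketch. It assumes a deliberately simple model: `Measurement`, `SourceAdaptor`, `KnowledgeBase` and `MockRegistryAdaptor` are hypothetical names, and the registry figures are invented. The data model actually specified in D12.1 may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class Measurement:
    entity: str   # e.g. a format, a tool or a collection
    prop: str     # name of the measured property
    when: date    # time of measurement
    value: float


class SourceAdaptor(Protocol):
    """Anything that can deliver measurements from an information source."""
    def fetch(self) -> List[Measurement]: ...


class KnowledgeBase:
    """Keeps a time-ordered history per (entity, property) pair."""

    def __init__(self) -> None:
        self._history: Dict[Tuple[str, str], List[Measurement]] = {}

    def ingest(self, adaptor: SourceAdaptor) -> None:
        for m in adaptor.fetch():
            series = self._history.setdefault((m.entity, m.prop), [])
            series.append(m)
            series.sort(key=lambda item: item.when)

    def history(self, entity: str, prop: str) -> List[Measurement]:
        return list(self._history.get((entity, prop), []))

    def latest(self, entity: str, prop: str) -> Optional[Measurement]:
        series = self._history.get((entity, prop), [])
        return series[-1] if series else None


class MockRegistryAdaptor:
    """Mock data source standing in for a real format registry (made-up values)."""
    def fetch(self) -> List[Measurement]:
        return [
            Measurement("TIFF 6", "supporting_tools", date(2012, 3, 1), 12),
            Measurement("TIFF 6", "supporting_tools", date(2011, 3, 1), 15),
        ]


kb = KnowledgeBase()
kb.ingest(MockRegistryAdaptor())
latest = kb.latest("TIFF 6", "supporting_tools")
```

A new information source would only need another class with a `fetch` method; the knowledge base itself stays unchanged, which is the extensibility point the slide names.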

Page 11: Planning and Watch

Information sources and clients

[Figure: diagram of information sources and clients around the Watch core. Watch core: Source Adaptors, Knowledge base, Conditions, Notifications. Surrounding labels: Format registries, Content profiles, Planning, Operations, Component catalogue, Workflows, Policies, Watch Frontend, Browser snapshots, Experiments.]

Page 12: Planning and Watch

Example conditions and events

• Policies specify object properties, content profiles describe object properties
  • Policy violation (e.g. objects that are not well-formed)
• Plan specification includes tolerance levels for operations
  • QA measures on migration results outside specified boundaries
  • Migration performance below specified threshold
• Plan specification includes format properties
  • Number of tools supporting a certain format drops below threshold
• Plans specify criteria to be measured
  • New components developed/tested on platform that support desired QA measures
  • Experiments show risks related to tools in use
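
One of the examples above, the number of tools supporting a format dropping below a threshold, can be sketched as an event that fires only when a monitored series newly enters the violating state. The function name `crossings`, the yearly counts and the threshold are assumptions for illustration, not part of the Watch design.

```python
from typing import Callable, List, Sequence


def crossings(values: Sequence[float], violates: Callable[[float], bool]) -> List[int]:
    """Return the indexes at which a series newly enters the violating state."""
    events = []
    previously_violating = False
    for index, value in enumerate(values):
        now_violating = violates(value)
        if now_violating and not previously_violating:
            events.append(index)
        previously_violating = now_violating
    return events


# Hypothetical yearly counts of tools that support one format.
tool_counts = [5, 4, 2, 2, 3, 1]
threshold = 3
events = crossings(tool_counts, lambda count: count < threshold)
```

Firing on the transition rather than on every violating measurement keeps an interested agent from being notified repeatedly about the same condition.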

Page 13: Planning and Watch

Current status in Watch

• Proof-of-concept (May/June)
  • Full-circle architecture validation
  • Mockup data sources
• First iteration of Watch focuses on web content
  • Watch central service
  • Content profile adaptor
  • Focused vs. dispersed web crawls over time
• Incremental addition of information sources
  • New adaptors may reveal new requirements

Page 14: Planning and Watch

Content profile

• Global view of content
  • Distribution of file formats
  • Distribution of characteristics
  • Representative data sets
• Stages
  • Collect metadata
  • Combine and filter
  • Reason on the result
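
The three stages listed above can be sketched on a toy data set. The records, the function names and the 25% cut-off are invented for illustration; a real content profile would be built from the output of characterisation components.

```python
from collections import Counter
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def collect() -> List[Record]:
    """Stage 1: stand-in for collected characterisation metadata (made-up records)."""
    return [
        {"path": "a.tif", "format": "image/tiff"},
        {"path": "b.tif", "format": "image/tiff"},
        {"path": "c.gif", "format": "image/gif"},
        {"path": "d.bin", "format": None},  # identification failed
        {"path": "e.tif", "format": "image/tiff"},
    ]


def combine_and_filter(records: List[Record]) -> List[Record]:
    """Stage 2: keep only records with an identified format."""
    return [r for r in records if r["format"] is not None]


def format_distribution(records: List[Record]) -> Dict[str, float]:
    """Stage 3a: share of each format among the identified objects."""
    counts = Counter(r["format"] for r in records)
    total = sum(counts.values())
    return {fmt: n / total for fmt, n in counts.items()}


def minority_formats(distribution: Dict[str, float], max_share: float) -> List[str]:
    """Stage 3b: a very simple reasoning step that flags rare formats."""
    return sorted(fmt for fmt, share in distribution.items() if share <= max_share)


identified = combine_and_filter(collect())
distribution = format_distribution(identified)
rare = minority_formats(distribution, max_share=0.25)
```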

Page 15: Planning and Watch

Status WP13: Policies

• Policies are governance statements, not executable rules
• 3 levels of policy statements
  • High-level guidance: A Policy catalogue
  • Mid-level procedures and structures
  • Low-level control policies: A machine-understandable Policy model
• Milestone 59: Policy catalogue closed in March
• 1st semantic model of control policies in m15
• Further refinement in second iteration

Page 16: Planning and Watch

Status WP14: Planning

• Development baseline based on Plato 3
  • Removed: PLANETS and other legacy dependencies
  • Refactored: Modularise, decoupling, testing, ...
  • Upgraded: JBoss7, JSF2, Richfaces 4...
  • Moving: maven, github, continuous builds...
  • First milestone release in July (Policy model, repository integration, Taverna integration)
• Define interfaces and integration
  • Taverna experimentation
  • Requirements for components catalogue
  • Repository and platform interface
• Collect decision points to automate
• Analyse decision factors and criteria

Page 17: Planning and Watch

D14.1: Analysis of decision criteria

• PLATO, the Planning Tool
  • Evidence-based, well-documented plans
  • Hierarchy of objectives leading to quantified decision criteria
  • Traceability from decision factors to decisions
  • Case studies in and after Planets
• Challenges: Effort, sharing, automation, scalability
  • Analysis of the measurability and automation of criteria
  • Standardisation and alignment of criteria
  • Systematic assessment of the impact of certain criteria

Page 18: Planning and Watch

A method and tool for decision criteria analysis

Collect
• Preservation plans
• Decision criteria

Align
• Significant properties models
• ISO SQUARE Software quality attributes
• Format properties

Categorise
• Specify uniquely identified criteria
• Categorise all case study decision criteria

Develop
• Define and implement impact factors
• Visual analysis tools

Analyse
• Impact factors for criteria
• Impact factors for sets of criteria

Page 19: Planning and Watch

Collect: Some case study data from Plato

No. | Object type | (Original) object format | Organization type
1 | Databases | MS Access | Archive
2 | Documents | Word Perfect | Archive
3 | Documents | PDF (versions) | Library
4 | Documents | PDF (versions) | Library
5 | Documents | PDF (versions) | Research
6 | Documents | PDF (versions) | Archive
7 | Images | TIFF-6 | Library
8 | Images | TIFF-6 | Library
9 | Images | TIFF-5 | Library
10 | Images | NEF raw image files | Archive
11 | Images | Different raw image file formats | Research
12 | Images | GIF (versions) | Library
13 | Video Games | ROMs of SNES video games | Research
14 | Video Games | Media images of floppies and CD-ROMs | Research
... | ... | ... | ...

Page 20: Planning and Watch

Collect: Decision Criteria

• Objective Tree
• Utility Function
• Semantics
• Taxonomy of criteria measurements

[Figure: taxonomy of criteria. Criterion: Action (Runtime, Static, Judgment); Outcome (Object, Format, Effect)]

Page 21: Planning and Watch

Decision criteria: What to measure and how

• 13 case studies with 617 criteria
• Frequency distribution of criteria across taxonomy
• Taxonomy is complete
• Preservation of scanned images: distribution over four case studies
• But: no analysis of impact

[Figure: taxonomy of criteria. Criterion: Action (Runtime, Static, Judgment); Outcome (Object, Format, Effect)]

Page 22: Planning and Watch

Align models for decision factors

• Format Properties
  • Library of Congress format evaluation
  • PRONOM format evaluation
  • Actual decision criteria
• Software Quality
  • ISO SQUARE: Standardised software quality model
• Object properties
  • Formats
  • Representation Instances
  • Significant Properties

Page 23: Planning and Watch

A method and tool for decision criteria analysis

Collect
• Preservation plans
• Decision criteria

Align
• Significant properties models
• ISO SQUARE Software quality attributes
• Format properties

Categorise
• Specify uniquely identified criteria
• Categorise all case study decision criteria

Develop
• Define and implement impact factors
• Visual analysis tools

Analyse
• Impact factors for criteria
• Impact factors for sets of criteria

Page 24: Planning and Watch

Develop

• Impact factors for criteria and sets
• Frequency
• Weighting
• Utility function

• Impact
• Selectivity

• Measures
• Analysis tool

• Criteria browser, set builder and analyser
• Integrated in upcoming release of the SCAPE planning component

Page 25: Planning and Watch

Analyse a criterion (set) C

Goal: Understand key decision factors

Question:
• How often does C occur in scenario S?
• How important is C?
• How critical is C?

Metric: Coverage, Range, Criticality
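
The first question, how often a criterion set C occurs, can be illustrated with a rough sketch. The definitions below are one plausible reading only: the formal impact factors of D14.1 are not reproduced in this transcript, so `coverage` and `mean_weight`, the criterion identifiers and the weights are all assumptions. In particular, `mean_weight` is not claimed to be the Range or Criticality metric named on the slide.

```python
from typing import Dict, List, Set

# Hypothetical plans: criterion identifier -> relative weight in that plan.
plans: List[Dict[str, float]] = [
    {"format/standardised": 0.20, "action/runtime": 0.10},
    {"format/standardised": 0.30},
    {"action/runtime": 0.05, "object/image_width": 0.25},
    {"format/standardised": 0.10, "object/image_width": 0.15},
]


def uses(plan: Dict[str, float], criteria: Set[str]) -> bool:
    """True if at least one criterion of the set occurs in the plan."""
    return any(c in plan for c in criteria)


def coverage(criteria: Set[str], plans: List[Dict[str, float]]) -> float:
    """Fraction of plans in which the criterion set occurs (assumed definition)."""
    return sum(1 for plan in plans if uses(plan, criteria)) / len(plans)


def mean_weight(criteria: Set[str], plans: List[Dict[str, float]]) -> float:
    """Average total weight of the set over the plans that use it (assumed definition)."""
    totals = [sum(plan[c] for c in criteria if c in plan)
              for plan in plans if uses(plan, criteria)]
    return sum(totals) / len(totals) if totals else 0.0


single = {"format/standardised"}
cov = coverage(single, plans)
avg = mean_weight(single, plans)
```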

Page 26: Planning and Watch

Sneak preview: The Knowledge browser

• Analysis module for decision criteria
• Part of the planning component
• First milestone release: July 2012

Page 27: Planning and Watch

Conclusions

• Systematic approach for analysis of decision criteria in preservation planning
  • Standardisation, cross-referencing, reusability
  • Method and tool for quantitative impact assessment
• Enables SCAPE Planning and Watch to
  • Facilitate experience sharing and knowledge creation
  • Reduce complexity
  • Optimize decision making
  • Guide automation
• Integrated in upcoming planning component
  • Enable sharing and alignment
  • Real-time analysis over time (Watch)
  • Guidance and QA of planning activities

Page 28: Planning and Watch

Year 2 work plan for Planning and Watch (1)

• Watch
  • Proof of concept prototype
  • Content profile adaptor and monitor
  • Additional adaptors
  • Simulation environment prototype in September
  • Watch core services (version 1) in November
• Policies
  • Control policy model
  • Catalogue elaboration
  • Model refinement and validation

Page 29: Planning and Watch

Year 2 work plan for Planning and Watch (2)

• Planning
  • Automated planning component in July (Plato 4)
  • Scalability roadmap
• Integration
  • Content profiling
  • Repositories
  • Workflow discovery and execution
• Evaluation
  • Case studies in Testbeds
  • Key Performance Indicators

Page 30: Planning and Watch

Most critical technical dependencies

• Preservation components
  • Planning evaluates action components
  • Watch uses (the output of) characterisation components to create content profiles
  • Quality Assurance measures quality of preservation actions for evaluation (including as part of planning)
  • Web browser watch service uses QA components
• Platform
  • Planning and Watch query components and workflows
  • Planning runs experiments as Taverna workflows (directly in real-time)
  • Planning and Watch components interface with repositories
  • Plans specify workflows to be run on the platform
  • Watch monitors REF

Page 31: Planning and Watch

Other results and publications

• Lessons learned in Preservation Planning (JCDL)
• Automated planning experiments
  • Actions, characterisation, QA and results reporting (ICADL)
  • Workflow construction in Taverna, components discovery and invocation
  • Automation and crowdsourcing (CIKM)
• Decision making and governance
  • Relationship of preservation planning and IT Governance (ASIST, IPRES)
  • Maturity model for preservation planning and operations (ASIST)
• Repository simulation
  • Evolution of a repository over time, given starting point and rules (IPRES)

Page 32: Planning and Watch

It’s 2014. You have content, a mandate, no action plans defined. What do you do?

1. Deploy the content profiler (uses characterization components for identification and property extraction)
2. Sign up with SCAPE Planning and Watch
3. Connect your repository to SCAPE Planning and Watch
4. Specify your policy model
   • Watch component starts monitoring content and policies and detects policy violations
5. You quickly create preservation plans
   • by evaluating action components
   • using characterisation and QA components
   • in Taverna workflows, all integrated in planning
   • The finished Plans contain workflow specification including SLAs
6. Deploy plans to repository (running e.g. on SCAPE platform)

Page 33: Planning and Watch

In 2015...

• Watch monitors compliance of operations to plans and risks and opportunities connected to plans and policies
• Monitoring conditions are automatically generated
  • New content? Monitored
  • Changed policies? Monitored
  • Changed environment, format risks? Monitored
  • New, better tools? Monitored
  • New QA tools that measure critical features you had to check manually? Monitored
  • Need an outlook on the status in 2017? Run a simulation
  • Is there something else you want to have monitored? Write a watch adaptor and plug it in.
• Upon changes, you can swiftly adapt plans and redeploy

Page 34: Planning and Watch

Thank you!

• Questions?

"SCAPE is set to move forward the control of digital preservation operations from ad-hoc decision making to proactive, continuous preservation management, through a context-aware planning and monitoring cycle integrated with operational systems."

Page 35: Planning and Watch


Page 36: Planning and Watch


Page 37: Planning and Watch

What is a policy? Goals and constraints

• Goals and constraints are often not defined explicitly
• Policy definitions...
  • "an official expression of principles that direct an organization’s operations"
  • "Formal statement of direction or guidance as to how an organization will carry out its mandate, functions or activities"
• But: "Policies" are encountered on a variety of levels in DP
  • From TRAC statements to enforceable processing rules
• From the perspective of planning:
  • Preservation Policies are governance statements (about constraints, goals, preferences, directives) that constrain or drive operational planning, but may also have other effects outside of operational planning.
  • They are not directly enforceable (they are business policies)
  • Preservation planning translates them into concrete actions.

Page 38: Planning and Watch


Page 39: Planning and Watch


Page 40: Planning and Watch


Domain model for the Knowledge Base

Page 41: Planning and Watch

Compliance, risk and opportunities

[Figure: plan matrix of Alternatives 1 to 4 against criteria C1 to C4. Automated? C1: Yes, C2: Yes, C3: No, C4: No. Annotations: Compliance of operations to deployed plan (SLAs); Risks to operations (errors uncovered in QA tool); Opportunities for operations (new QA tool); Opportunities for operations (new action tool)]

• Planning will generate SLAs and monitoring conditions automatically
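
How monitoring conditions might be derived from a plan can be sketched as follows. This is an illustration under stated assumptions, not the SCAPE implementation: the `Criterion` class, its tolerance fields and the sample bounds are hypothetical, and only criteria marked as automated receive a condition, mirroring the "Automated?" row of the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Criterion:
    """A plan criterion with optional tolerance bounds (hypothetical model)."""
    name: str
    automated: bool
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def generate_conditions(criteria: List[Criterion]) -> Dict[str, Callable[[float], bool]]:
    """Per automated criterion, return a predicate that is True when a value complies."""
    conditions: Dict[str, Callable[[float], bool]] = {}
    for criterion in criteria:
        if not criterion.automated:
            continue  # manual criteria cannot be monitored automatically

        # Default arguments bind the bounds of this criterion to the predicate.
        def complies(value: float, lo=criterion.minimum, hi=criterion.maximum) -> bool:
            return (lo is None or value >= lo) and (hi is None or value <= hi)

        conditions[criterion.name] = complies
    return conditions


# Made-up plan: two automated criteria with tolerances, one manual criterion.
plan_criteria = [
    Criterion("C1", automated=True, minimum=0.95),   # e.g. a QA similarity score
    Criterion("C2", automated=True, maximum=2.0),    # e.g. seconds per object
    Criterion("C3", automated=False),
]
conditions = generate_conditions(plan_criteria)
```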

Page 42: Planning and Watch

Compliance, risk and opportunities

[Figure: plan matrix of Alternatives 1 to 4 against criteria C1 to C4, with an "Automated?" row. Annotations: Compliance of operations to deployed plan; Risks to operations (errors uncovered in QA tool); Opportunities for operations (new QA tool); Opportunities for operations (new action tool)]

• Monitor criteria: change in objectives (caused by driver or constraint)
• Add the policy context
  • Governance, Risk and Compliance

Page 43: Planning and Watch

Content-related triggers

Page 44: Planning and Watch

Environment-related triggers

Page 45: Planning and Watch

Community-related triggers

Page 46: Planning and Watch

Organisation-related triggers

Page 47: Planning and Watch

High-level design of the Watch component

Page 48: Planning and Watch

Four cases, three solutions: Scanned images

• Bavarian State Library, 72TB TIFF6: Leave and monitor
• British Library, 80TB TIFF5: Migrate to JP2 (ImageMagick)
• Royal Library of Denmark, ~10.000 aerial photographs in TIFF6: Leave and monitor
• State and University Library Denmark, scanned yearbooks in GIF: Migrate to TIFF 6

Scenario | Chosen action | Main reasons
72 TB scanned book pages in TIFF6 | Leave unchanged and monitor | Color profile complications, lack of JP2 browser support, process costs
80 TB scanned newspapers in TIFF5 | Migrate to JP2 | Storage costs, standardization
Aerial photographs in TIFF6 | Leave unchanged and monitor | Lack of JP2 browser support, process costs

Page 49: Planning and Watch


Scanned books requirements

Page 50: Planning and Watch


Scanned books results

Page 51: Planning and Watch


Scanned books requirements

Page 52: Planning and Watch


Page 53: Planning and Watch

Results summary

Factor group | Coverage | Impact | Criticality | Variance
Format | High | High | High | Low/Medium
Action: Performance | High | Medium | Medium | High
Action: other | High | High | Low/Medium | Low/Medium
Representation Instance Criteria | Medium | Low | Medium | High
Transformation Information Criteria | High | High | High | High

Page 54: Planning and Watch

Upcoming milestones

• Now: Catalogue of high-level policy elements
• M15: Machine-understandable model of control policies
• M18: First prototype of the Planning component
• M20: First prototype of the simulation environment
• M22: First prototype of the Watch core services

Page 55: Planning and Watch

Watch: Work done in year 1

• User group survey on watch current practice
• Testbed scenarios and watch relationships
• RODA repository measures
• Watch Component definition checkpoint (m6)
• Watch deliverable D12.1 (m12)
  • Concepts, design, architecture, usage scenarios, triggers, data model, technology discussion
• Preliminary work on Simulator
  • “Simulating the Effect of Preservation Actions on Repository Evolution” by Christian Weihs and Andreas Rauber, published at iPres’11

Page 56: Planning and Watch

Reconcile: Software Quality

• ISO 25010 SQUARE: Systems and Software Quality Requirements and Evaluation
  • Functional suitability (completeness, correctness, appropriateness)
  • Performance efficiency
  • Compatibility
  • Usability
  • Reliability
  • Maintainability
  • Portability

Page 57: Planning and Watch

Software quality and preservation decisions

• Business factors not part of SQUARE
• Some aspects are of very varying relevance
  • Portability
  • Maintainability
  • Usability
• Functional correctness = authenticity
  • Unified property model

Page 58: Planning and Watch

Format vs. Object properties

• Format (or Representation) Properties
• Representation Instance Properties
• Information Properties
• Significant Properties
  • aka Transformation Information Properties
  • Functional correctness of preservation actions

Page 59: Planning and Watch

Dissemination of results from PW

• D12.1: Watch design
• D14.1: Decision factors analysis
• Blog entries on OPF
• Published articles
  • Preservation Decisions: Terms and Conditions apply. Challenges, Misperceptions and Lessons Learned in Preservation Planning. JCDL 2011
  • Decision criteria in digital preservation: What to measure and how. JASIST 62/6, 2011
  • Impact Assessment of Decision Criteria in Preservation Planning. IPRES 2011
  • Automated Preservation: The Case of Digital Raw Photographs. ICADL 2011
  • Control Objectives for DP: Digital Preservation as an Integrated Part of IT Governance. ASIST-AM 2011
  • Simulating the Effect of Preservation Actions on Repository Evolution. IPRES 2011
  • Quality assurance in Document Conversion: A HIT? (BooksOnline@CIKM 2011)

Page 60: Planning and Watch

Analysis tools

• Criteria browser
  • Accesses knowledge base of PLATO
  • Quantitative impact factors of criteria
  • Browse, sort, filter
• Criteria set builder
  • Flexible configuration of criteria sets
  • Quantitative impact factors of sets

Page 61: Planning and Watch

Visualise

• Format Standardization: consistent preferences

Page 62: Planning and Watch


Visualise

• Format compression: differing preferences

Page 63: Planning and Watch

Core Preservation Capabilities

Preservation Planning: Monitor, steer and control the preservation operation of content
• Influencers and Decision making
• Options diagnosis
• Specification and delivery
• Monitoring

Preservation Operation: Control the deployment and execution of preservation plans.
• Analyze content
• Execute preservation actions
• Ensure adequate provenance trail
• Handle preservation metadata
• Conduct Quality Assurance
• Provide reports and statistics

Example: “Migrate this set of images (in TIFF-5) to JP2 using ImageMagick 6.3 with parameters a,b,c”
• Analyse original
• Migrate, analyse output
• Conduct quality assurance
• Provenance, metadata, Reporting

Page 64: Planning and Watch

COBIT processes...

• Driven by specific goals and controls
• Organized into activities with assigned responsibilities
• Related to other processes
• Measured on all levels: Internal vs. external goals and metrics

[Figure: IT Goals, Process goals, Activity goals; Key Performance Indicators, Process metrics, Activity metrics]

Page 65: Planning and Watch


Preservation Planning example

Ensure understandability…

Manage obsolescence threats at logical level…

Diagnose all options against requirements…

Number of objects with breach of understandability during time horizon…

Number of obsolescence issues successfully responded to…

Options diagnosis: Efficiency, completeness, correctness and timeliness…

Page 66: Planning and Watch


Preservation Planning Process

Page 67: Planning and Watch

A Capability Maturity Model for Preservation Planning

Dimensions: Awareness and Communication; Policies, Plans and Procedures; Tools and Automation; Skills and Expertise; Responsibility and Accountability; Goal Setting and Measurement

Levels:
1 Initial / ad-hoc
2 Repeatable, but Intuitive
3 Defined
4 Managed and Measurable
5 Optimized

Coming from Software Engineering, the CMM has been shown to be a powerful instrument for assessment and improvement

Page 68: Planning and Watch

Dimensions: Awareness and Communication; Policies, Plans and Procedures; Tools and Automation; Skills and Expertise; Responsibility and Accountability; Goal Setting and Measurement

Cells per maturity level, in slide order:

1: Some recognition of the need for control; Disorganised ad-hoc decisions; …; Not defined; Unclear goals, no measurement

2: Management recognizes the need for controlling and communicates issues; Planning process emerges, but informal and incident-driven; Sporadic tool usage without systematic integration; Some awareness of required skills, hands-on experience; People take ownership of issues based on their own initiative on a reactive basis

3: Importance of a planning approach is understood, accepted and communicated; Formal planning process in place, some strategy takes place; Automated tools, but processes defined by available services; …; Responsibilities assigned, documented and clearly communicated

4: Systematic planning is part of the organization’s culture; Planning fully supported by well-specified methods, internal best practice; Automated planning system + operational monitoring; …; …; …

5: Continuous improvement; Industry best practice; …; …; …; …

Page 69: Planning and Watch


Capability maturity increments