Preservation by Migration to XML
Dirk Roorda
Work on a preservation strategy
• positioning of the XML preservation strategy
• implementing the strategy in software
• pursuing a standard
• with international partners
• welcome to MIXED
Dynamics of preservation
• when the digital context changes
  • emulate: re-implement data and tools
  • migrate: re-represent data, use new tools
• whatever you do, do it smart
Data and tools
Smart strategies
• multiple related tasks?
• seek normalization
Diachronic and synchronic
• migration across time (diachronic)
  • original is nearly obsolete
  • original intention might be unclear
• migration within time (synchronic)
  • many different formats
  • vendor specific peculiarities
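One way to see why "many different formats" pushes toward a shared intermediate format is simple counting. The sketch below is illustrative arithmetic only, not taken from the slides; the function names are hypothetical. It assumes that direct migration needs one converter per ordered pair of formats, while migration via a single intermediate format needs one reader and one writer per format.

```python
def direct_converters(n: int) -> int:
    """Converters needed when every format converts directly to every other."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed when every format converts to and from one hub format."""
    return 2 * n


# Compare the two counts for a few collection sizes.
for n in (3, 5, 10):
    print(n, direct_converters(n), hub_converters(n))
```

Under these assumptions the direct count grows quadratically and the hub count linearly, so the gap widens as more formats enter the archive.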
Smart migration
MIXED explained
Migration to
Intermediate
XML for
Electronic
Data
MIXED scenario
Kinds of data
• documents: word, open(red)office
• spreadsheets: excel, openoffice
• databases: access, mysql
• statistical data: spss, sas
• images: photoshop, irfanview
Umbrella format
Making it work
• software
  • = framework
  • + modules
• standard
  • = wrapper
  • + metadata
  • + for each kind of data:
    • selected XML standard for that kind
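The "wrapper + metadata + per-kind XML" idea can be sketched as follows. This is a minimal illustration, not the MIXED format itself: the element names (`container`, `metadata`, `sourceFormat`, `payload`) and the toy `table` payload are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET


def wrap(kind: str, source_format: str, payload: ET.Element) -> ET.Element:
    """Wrap a kind-specific XML payload in a generic container with metadata.

    All element and attribute names here are hypothetical.
    """
    root = ET.Element("container", {"kind": kind})
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "sourceFormat").text = source_format
    body = ET.SubElement(root, "payload")
    body.append(payload)
    return root


# A toy stand-in for the XML standard selected for spreadsheets.
sheet = ET.Element("table")
row = ET.SubElement(sheet, "row")
ET.SubElement(row, "cell").text = "42"

doc = wrap("spreadsheet", "excel", sheet)
print(ET.tostring(doc, encoding="unicode"))
```

The wrapper stays the same for every kind of data; only the payload vocabulary changes with the kind.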
Making software
• make a good product as initial effort
  • generic framework
  • substantial number of conversion plugins
    • for spreadsheets and databases
    • for statistical data
  • connect to repositories, Fedora ready
• integrate efforts of all interested parties by
  • using an open architecture
    • webservices for framework and plugins
    • use third party plugins for SPSS and DDI
  • using an open source paradigm
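A generic framework with conversion plugins might be organized roughly as below. This is a sketch under stated assumptions, not the MIXED implementation (which the slides describe as Java-based): the registry, the `plugin` decorator, and the CSV plugin are hypothetical, and CSV is used only as a toy stand-in for a real spreadsheet format.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# A plugin takes source text and returns an XML element (hypothetical contract).
Plugin = Callable[[str], ET.Element]
PLUGINS: Dict[str, Plugin] = {}


def plugin(source_format: str):
    """Register a conversion plugin under the name of its source format."""
    def register(func: Plugin) -> Plugin:
        PLUGINS[source_format] = func
        return func
    return register


@plugin("csv")
def csv_to_xml(text: str) -> ET.Element:
    """Toy plugin: turn CSV text into a simple table/row/cell XML tree."""
    table = ET.Element("table")
    for record in csv.reader(io.StringIO(text)):
        row = ET.SubElement(table, "row")
        for value in record:
            ET.SubElement(row, "cell").text = value
    return table


def convert(source_format: str, text: str) -> ET.Element:
    """Framework entry point: dispatch to the plugin for the given format."""
    try:
        handler = PLUGINS[source_format]
    except KeyError:
        raise ValueError(f"no plugin registered for {source_format!r}") from None
    return handler(text)


result = convert("csv", "a,b\n1,2\n")
print(ET.tostring(result, encoding="unicode"))
```

The framework only knows the registry and the plugin contract, so third-party plugins can be added without changing the dispatch code.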
SPSS reader as plugin
Helping a standard emerge
• finding a name: Preferred Data Formats for Archives (PDFA); suggestions welcome
• using existing auxiliary standards
• connecting to open source software
• seeking a user base in the archiving world
Using MIXED
• repositories: preservation planning
• interested in file conversion: web services
• individual users: stand alone
Using standards
• XML (Schema, Unicode)
• OAIS (interface to repositories)
• ODF (spreadsheets)
• SOAP, ESB
• Java, Spring, OSGi
Co-operation
• DExT (Data Exchange Tools) (UKDA)
• ODaF (Open Data Foundation)
• PLANETS (Preservation and Long-term Access through Networked Services)
• you?
Trends
• end of vendor-specific binary formats in sight
• interchange formats more concerned with preservation
The future ...
• the major data kinds have a preferred preservation format
• the set of preservation formats is standardized
• easy-to-use software converts between preservation formats and custom formats
Discussion
• One preservation standard per data kind?
  • ODF <=/=> OOXML !
• The role of DDI in metadata and data
  • is there an interchange format for statistical data between SPSS, SAS, Stata, etc.?
• Are formatting and action relevant for preservation?
  • fonts, colors, formulas
• Usage of MIXED
  • it is a collection of quality converters
  • it will belong to a collection of preservation tools
• Plugin interoperability
  • how can we reuse other conversion plugins?