Pacific And Regional Archive for Digital Sources in Endangered Cultures
Project Goals
Archiving recorded field material from the region around Australia
- Preservation (longterm storage and backup)
- Description (metadata or cataloguing information)
- Rights (access and deposit conditions)
- Access (providing data online)
- Training in appropriate methods to support these goals
Rationale
Records of cultural material are themselves becoming endangered
- (reel to reel tapes are deteriorating, audio cassettes are becoming obsolete)
Such records are held in poor conditions, not catalogued and so unlocatable remotely
Policy issues
Research policies of participating universities with regard to safeguarding research data
Policies of engagement with the Australian region
University of Melbourne
“A core principle is that research data and records should be maintained for as long as they are of continuing value to the researcher and in accordance with the requirements set down by the University, research funding bodies, legislation, regulatory authorities and patent agencies. ..... Recognising that there is a requirement for researchers to retain research data and records, the University seeks to facilitate compliance by requiring each department or unit to make arrangements for facilities to be available for the storage of research data and records. Researchers may either use such storage space or make alternative arrangements acceptable to the supervising staff member or head of department.”
Records Management Policy & Procedures Manual CHAPTER SEVEN - Research Data and Records (7.2)
http://www.unimelb.edu.au/records/manual/chapter7.html
University of Melbourne
Encouraging the Melbourne Institute of Asian Languages and Societies and the Asialink Centre to work together to implement the 'Melbourne Agenda', particularly by promoting political, industry and community understanding of Asia in Australia, and positioning the University of Melbourne as a significant East Asian institution.
(MU Operational plan 2003)
Pacific ??
Implementation of PARADISEC
Currently ARC LIEF funded (1 year)
Steering Committee made up of representatives of University of Sydney, ANU and University of Melbourne
Three staff: admin officer and audio officer (Sydney), project manager (Melbourne)
Preservation (longterm storage and backup)
Conform to data standards to allow data migration
- Audio: Broadcast Wave Format (BWF)
- Text: pdf, rtf, ASCII, UNICODE
- Video: (still in progress, DigiBeta)
- Images: TIFF
Steering Committee
Linda Barwick (Team Leader, PARADISEC Sydney)
Jane Simpson (Linguistics, University of Sydney)
Allan Marett (Music, University of Sydney)
John Bowden (Linguistics, Research School of Pacific and Asian Studies, ANU)
Stuart Hungerford (ANU Internet Futures Project)
Ewan Maidment (Pacific Manuscripts Bureau Executive Officer, ANU)
Steven Bird (Computing Science, University of Melbourne)
Nick Evans (Linguistics and Applied Linguistics, University of Melbourne)
Preservation (longterm storage and backup)
Broadcast Wave Format (BWF) (encapsulated metadata)
24-bit, 96 kHz
Metadata extension chunk including unique permanent identifiers, coding history, and content descriptors
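
To illustrate how metadata can be encapsulated in a BWF file, here is a minimal sketch that packs and re-reads a broadcast extension ('bext') chunk. It assumes the version 1 field layout of EBU Tech 3285; the identifier, description and coding history values are hypothetical, and this is not PARADISEC's own tooling.

```python
import struct

# Fixed-size part of the 'bext' chunk (assumed EBU Tech 3285 version 1 layout):
# Description(256) Originator(32) OriginatorReference(32) OriginationDate(10)
# OriginationTime(8) TimeReference low/high (2 x uint32) Version(uint16)
# UMID(64) Reserved(190); a variable-length CodingHistory string follows.
BEXT_FIXED = struct.Struct("<256s32s32s10s8sIIH64s190s")


def build_bext(description, originator, reference, date, time, coding_history):
    """Return a complete 'bext' chunk: 8-byte RIFF chunk header plus body."""
    body = BEXT_FIXED.pack(
        description.encode("ascii"),   # content descriptor
        originator.encode("ascii"),
        reference.encode("ascii"),     # unique permanent identifier
        date.encode("ascii"),          # yyyy-mm-dd
        time.encode("ascii"),          # hh:mm:ss
        0, 0,                          # time reference, unused in this sketch
        1,                             # bext version
        b"",                           # UMID left empty (zero-filled by struct)
        b"",                           # reserved bytes (zero-filled by struct)
    ) + coding_history.encode("ascii")
    pad = b"\x00" if len(body) % 2 else b""   # RIFF chunks start on even offsets
    return b"bext" + struct.pack("<I", len(body)) + body + pad


def parse_bext(chunk):
    """Parse a chunk produced by build_bext back into a dictionary."""
    if chunk[:4] != b"bext":
        raise ValueError("not a bext chunk")
    (size,) = struct.unpack_from("<I", chunk, 4)
    body = chunk[8:8 + size]
    fields = BEXT_FIXED.unpack_from(body)

    def text(raw):
        return raw.rstrip(b"\x00").decode("ascii")

    return {
        "description": text(fields[0]),
        "originator": text(fields[1]),
        "originator_reference": text(fields[2]),
        "origination_date": text(fields[3]),
        "origination_time": text(fields[4]),
        "version": fields[7],
        "coding_history": text(body[BEXT_FIXED.size:]),
    }


chunk = build_bext(
    "Field recording, side A",          # hypothetical description
    "PARADISEC",
    "XX1-001-A",                        # hypothetical item identifier
    "2003-06-01", "10:30:00",
    "A=PCM,F=96000,W=24,M=stereo\r\n",  # hypothetical coding history line
)
print(parse_bext(chunk)["originator_reference"])
```

Because the chunk travels inside the audio file, the identifier and coding history stay with the recording when it is copied or migrated.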
Preservation (longterm storage and backup)
Distributed data
- Mirrored data in different locations using the Australian Partnership for Advanced Computing national facility (APAC) at the ANU.
- Moved between Sydney and ANU via GrangeNet
- Stored on tape and hard disk
- Downsampled version provided to depositor (MP3 or similar)
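
Mirroring is only useful if the copies can be shown to match. The following is a minimal sketch of one way to check that, assuming both copies are reachable as ordinary directories and using SHA-256 checksums. The function names are hypothetical, and nothing here describes how APAC or the Sydney store actually verify data.

```python
import hashlib
from pathlib import Path


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def compare_mirror(primary_dir, mirror_dir):
    """Return (relative path, problem) pairs for files missing or different in the mirror."""
    primary_dir, mirror_dir = Path(primary_dir), Path(mirror_dir)
    problems = []
    for source in sorted(primary_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_dir)
        copy = mirror_dir / relative
        if not copy.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source) != sha256_of(copy):
            problems.append((relative.as_posix(), "differs"))
    return problems
```

An empty result means every file in the primary location has an identical counterpart in the mirror; files that exist only in the mirror are not reported by this sketch.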
Metadata (cataloguing info)
For discovery / harvesting
For assessment
For preservation
Discovery
Conforms to Open Language Archives Community (OLAC), which itself conforms to Dublin Core
Export to XML
• PARADISEC is a static repository of OLAC conformant metadata.
• BWF encapsulated data
Use of controlled vocabularies
• Language names / codes, from Ethnologue
• Country names / codes, ISO
• Roles (OLAC)
• Etc
Database used to constrain entry (FMPro)
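
As a sketch of what constraining entry with controlled vocabularies can look like outside the database, the example below validates a catalogue record and exports it as plain Dublin Core XML. The record fields, the tiny vocabularies and the `record` wrapper element are assumptions for illustration. Language codes would be checked the same way, and a real OLAC export would additionally use the OLAC schema and its extensions, which are not reproduced here.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace

# Small illustrative vocabularies; a real catalogue would load the full lists.
COUNTRIES = {"AU": "Australia", "PG": "Papua New Guinea", "VU": "Vanuatu"}
ROLES = {"recorder", "speaker", "depositor"}   # illustrative subset of role terms


def validate(record):
    """Return a list of problems; an empty list means the record is acceptable."""
    problems = []
    if not record.get("title"):
        problems.append("title is required")
    if record.get("country") not in COUNTRIES:
        problems.append(f"unknown country code: {record.get('country')!r}")
    for name, role in record.get("contributors", []):
        if role not in ROLES:
            problems.append(f"unknown role {role!r} for {name}")
    return problems


def to_dublin_core(record):
    """Serialise a validated record as a flat Dublin Core XML fragment."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")   # hypothetical wrapper element
    ET.SubElement(root, f"{{{DC_NS}}}title").text = record["title"]
    ET.SubElement(root, f"{{{DC_NS}}}coverage").text = COUNTRIES[record["country"]]
    for name, _role in record.get("contributors", []):
        ET.SubElement(root, f"{{{DC_NS}}}contributor").text = name
    return ET.tostring(root, encoding="unicode")


record = {
    "title": "Tape 1",                               # hypothetical item
    "country": "VU",
    "contributors": [("A. Researcher", "recorder")],
}
if not validate(record):
    print(to_dublin_core(record))
```

Rejecting unknown codes at entry time is what keeps the exported metadata searchable by harvesters that rely on the same vocabularies.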
Discovery
Ability to locate data via a standard search mechanism.
Metadata in OLAC format
Allows a user to identify material using standard descriptors
Assessed (ie not just digitised) data also listed
Tapes: undiscoverable
Tapes + catalogue: discoverable locally
Tapes + catalogue + web: discoverable globally
Tapes + catalogue + web + digital objects: discoverable globally and available (subject to permissions)
Data assessment (metadata / cataloguing information)
Assessing tape collections for ingestion (= digitisation), triage based on:
- state of current media
- content (regional, endangered)
- state of current storage
- share of services between partner campuses
Selection based also on individual items within a collection that may have special value.
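
The triage criteria above could be combined into a simple priority score. This is a sketch only: the 0 to 3 scales, the weights and the field names are hypothetical, since the source lists the criteria without ranking them.

```python
from dataclasses import dataclass

# Hypothetical weights; the source names the criteria but not their priority.
WEIGHTS = {"media_risk": 3, "content_value": 3, "storage_risk": 2, "partner_capacity": 1}


@dataclass
class Collection:
    name: str
    media_risk: int        # 0 (stable) to 3 (actively deteriorating)
    content_value: int     # 0 to 3, regional relevance and endangerment
    storage_risk: int      # 0 (good conditions) to 3 (poor conditions)
    partner_capacity: int  # 0 to 3, spare capacity at the campus that would do the work

    def score(self):
        """Weighted sum of the four criteria; higher means more urgent."""
        return sum(weight * getattr(self, key) for key, weight in WEIGHTS.items())


def triage(collections):
    """Order collections from most to least urgent."""
    return sorted(collections, key=lambda c: c.score(), reverse=True)
```

Individual items of special value, as noted above, would still need to be flagged by hand; a score of this kind only orders whole collections.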
Data preservation (metadata / cataloguing information)
Tracking data through the process of ingestion and duplication for depositor’s copies.
Metadata exported to BWF to transcript level
Digital preservation copy stored on 100 GB digital access tapes in the MSS at the University of Sydney
Mirrored at the robotic mass storage system at the ANU (APAC)
Digital data in the mass storage system is automatically backed up and migrated to new tapes on a regular basis.
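
A rough worked estimate of what these figures mean for storage, assuming stereo transfers and decimal gigabytes (neither is stated in the slides) and ignoring file headers and metadata chunks:

```python
SAMPLE_RATE_HZ = 96_000
BYTES_PER_SAMPLE = 3          # 24 bits
CHANNELS = 2                  # assumption: stereo transfers
TAPE_BYTES = 100 * 10**9      # assumption: "100 GB" means decimal gigabytes


def bytes_per_hour(rate=SAMPLE_RATE_HZ, width=BYTES_PER_SAMPLE, channels=CHANNELS):
    """Uncompressed PCM data volume for one hour of audio."""
    return rate * width * channels * 3600


def hours_per_tape(tape_bytes=TAPE_BYTES):
    """How many hours of audio fit on one tape at the rate above."""
    return tape_bytes / bytes_per_hour()


print(bytes_per_hour())          # bytes of audio data per hour
print(round(hours_per_tape(), 1))
```

Under these assumptions one hour of audio is about 2.07 GB, a 100 GB tape holds about 48 hours, and 500 hours of tape comes to roughly 1.04 TB per copy.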
Technicalities - Hardware
• Quadriga AudioCube 4-II has 2 Pentium III 1 GHz processors, 256 MB RAM, 36 GB audio hard-disk (shuttle), and a 9 GB system hard disk.
• Digital Audio Denmark ADDA 2402 converter box, located between the analogue playback equipment and the AudioCube, handles analog-to-digital conversion.
Technicalities - Software
Wavelab sound editing software to record the audio data as 24-bit, 96 kHz .wav files (PC-standard PCM raw audio files).
Audio File-Inspector
• converts standard WAV-files into BWF
• analyses them for maximum and average level, signal-to-noise ratio, dynamics and phase correlation, with results being recorded automatically and saved as metadata in the ‘quality chunk’ of the BWF file.
• BWF files are then digitally 'sealed' with a File Security Code to maintain the integrity of the audio and metadata.
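
To make "maximum and average level" concrete, the sketch below computes peak and RMS level in dBFS for a block of samples. It assumes samples already scaled to the range -1.0 to 1.0 and is a generic calculation, not the algorithm used by Audio File-Inspector.

```python
import math


def level_dbfs(value):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return float("-inf") if value == 0 else 20 * math.log10(value)


def analyse(samples):
    """Return peak and RMS level in dBFS for samples scaled to -1.0 .. 1.0."""
    if not samples:
        raise ValueError("no samples to analyse")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {"peak_dbfs": level_dbfs(peak), "rms_dbfs": level_dbfs(rms)}


# Ten full cycles of a full-scale sine wave as test input.
sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
print(analyse(sine))
```

For a full-scale sine wave the peak sits at about 0 dBFS and the RMS level about 3 dB below it, which is a convenient sanity check for any level meter.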
[Workflow diagram. Labels recovered from the figure:]
- Researchers
- Content metadata
- Tapes assessed by Project admin
- Digitisation: Quadriga system, Sydney
- Preservation metadata
- Metadata database
- 24 bit 96 kHz BWF preservation master
- 128 kbps MP3 browser copies
- Azoulay: MSS Sydney
- Mirror: APAC (ANU)
- 16 bit 44.1 kHz CD x 2
- Researcher, Home community
- 24/96 BWF
- Authentication, Digital Rights Management System
- Authorised general users
- Streaming MP3
- Website: RSPAS (ANU)
- Public access to online metadata
Rights (access and deposit conditions)
Deposit / access forms
Embedding rights information in the processing system for eventual automated access or restriction of access
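
One way embedded rights information could drive automated access decisions is sketched below. The access levels, field names and user attributes are all hypothetical; the actual categories would come from the deposit and access forms.

```python
from enum import Enum


class Access(Enum):
    """Hypothetical access conditions recorded with each item."""
    OPEN = "open"              # anyone may listen
    REGISTERED = "registered"  # users who have accepted the access conditions
    RESTRICTED = "restricted"  # only people named by the depositor


def may_access(item, user):
    """Decide access from the rights fields stored with the item."""
    if item["access"] is Access.OPEN:
        return True
    if item["access"] is Access.REGISTERED:
        return bool(user.get("registered", False))
    return user.get("id") in item.get("permitted_users", ())


item = {"access": Access.RESTRICTED, "permitted_users": {"depositor-1"}}
print(may_access(item, {"id": "depositor-1"}))
print(may_access(item, {"id": "someone-else", "registered": True}))
```

Keeping the decision in one function that reads only stored rights fields means a change to a deposit condition takes effect without touching the delivery code.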
Access (providing data online)
Data store http://store.anu.edu.au:8081/PARADISEC
http://store.anu.edu.au/PARADISEC/researcher/R123-321/abcd.aiff
http://...PARADISEC/researcher/R123-321/abcd.aiff?start=00:01:24.00?end=00:02:43.01
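
The last address suggests requesting a segment of a recording by start and end time. Below is a minimal sketch of building such a request. It assumes the time codes are hours:minutes:seconds with a decimal fraction, and it joins the two parameters with "&" as in a standard query string, where the example above uses a second "?". How the data store actually parses these parameters is not known from the slides.

```python
from urllib.parse import urlencode


def timecode_to_seconds(timecode):
    """Convert 'hh:mm:ss.ff' to seconds, reading '.ff' as a decimal fraction."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def segment_url(base_url, start, end):
    """Append start and end offsets to an item URL as a query string."""
    if timecode_to_seconds(end) <= timecode_to_seconds(start):
        raise ValueError("end must be later than start")
    # safe=":" keeps the colons in the time codes readable instead of %3A.
    return base_url + "?" + urlencode({"start": start, "end": end}, safe=":")


url = segment_url(
    "http://store.anu.edu.au/PARADISEC/researcher/R123-321/abcd.aiff",
    "00:01:24.00",
    "00:02:43.01",
)
print(url)
```

Addressing segments by time is what would let a transcript line link directly to the stretch of audio it transcribes.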
Training in appropriate methods
Structuring data with best current tools
- Key concepts: reusability, open formats
Archiving achieved in everyday practice, not just an end point
- Documentation of recordings, notes, photographs etc
Schedule: end of the first year
Developed controlled vocabularies for use in metadata (language names, country names, data types)
Digitised versions of 500 hours of reel to reel tape with associated metadata descriptions.
Developed draft data deposit and access forms for rights management.
Developing collaborative arrangements with the Pacific Manuscripts Bureau (PAMBU, ANU), National Library of Australia (NLA) and ScreenSound.
Schedule: end of the first year
Established presence of PARADISEC as a service to researchers
Training of staff in standard formats, techniques for data transfer and digitisation
Developed local data entry mechanism for metadata
Developed data export function for metadata to XML and OLAC static repository.
Published metadata descriptions to the web
Presentation of methodology at national and international conferences
Established ongoing funding
Current Status
Equipment installed and operating in Sydney
Data transfer arrangements in place with APAC
Metadata draft revision #3
FMPro catalogue working well
Staff commenced work in Sydney (April)
Project Manager working in Melbourne
Static repository validated by OLAC June 2003
Current Status
Support from national institutions
- ScreenSound (National Film and Sound Archive)
- National Library
- AIATSIS
Current Status
Involvement with:
EMELD (Electronic Metastructure for Endangered Languages Data)
DELAN (Digital Endangered Languages Archives Network)
OLAC (Open Language Archives Community)
Related Projects in the region
E-archives
- ASEDA
P-archives
- Monash music archive
- Auckland - Maori and Pacific Music archive
- Regional museums and cultural centres
Aims for 2004
Get ARC LIEF funding again
Broaden focus to include text and images.
Development of a web-based data entry mechanism for metadata, to allow depositors to enter their own information.
Development of a web-based text-linked audio-delivery tool to facilitate access to audio material.
Digitised versions of 1000 hours of reel to reel tape with associated metadata descriptions.
Trial linkage between images of fieldnotes and digitised fieldtapes.
Aims for 2004
Developed data deposit and access forms for rights management.
Targeting of existing analog data archives (like the Monash music archive) for advice and potential collaborative work on digitisation
Presentation of methodology at national and international conferences.
Locating funding sources for establishment of a longterm repository for PARADISEC.
Contacts
www.paradisec.org.au
Team leader
Project manager