Australian Museum EMu Project
Or how to make 11 = 1
Dr Penny Berents
Background
- databasing of collections began in the early 1980s with the vertebrate collections
- TITAN
- the first databases were identical, but as the years went by Collection Managers modified their database(s)
- in 2000 the AM had 11 collection databases in Texpress on 4 servers
AM collection databases
- Birds
- Fishes
- Herpetology
- Mammals
- Marine Invertebrates
- Tissue
Choice of software
- KE EMu or something else?
- how much could we combine with the AM Anthropology EMu database?
Whose version of EMu?
- American versions: AMNH, NMNH?
- Australian natural history users?
- AM Anthropology?
our limited knowledge of product architecture and functionality before commencing customisation
data-free software
We analysed all collections for individual needs before customising the Catalogue and before migrating any data
Catalogue, Taxonomy, Sites, Collecting Events, Parties, Loans, Movements, Bibliography, Multimedia, Locations
Latitude and Longitude
Registration numbers and IRNs
Salinity data not displayed with float data
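A minimal sketch of why rounding or dropping the fractional part of a measurement matters during migration. It assumes the legacy values arrive as text; the function name and the sample value 35.47 are hypothetical and are not taken from the AM data or from EMu.

```python
from decimal import Decimal


def migrate_salinity(legacy_text):
    """Parse a legacy salinity value while keeping every recorded digit.

    Decimal keeps the value exactly as it was typed, so nothing is
    rounded or truncated on the way into the new system.
    """
    return Decimal(legacy_text.strip())


# Hypothetical legacy value, used only for illustration.
exact = migrate_salinity("35.47")

# What an integer-only field would hold instead.
dropped = int(exact)      # fractional part discarded
rounded = round(exact)    # rounded to the nearest whole number

print("migrated exactly:", exact)
print("lost by rounding:", exact - rounded)
```

Storing the value in a field that accepts fractional data avoids both losses, which is the effect of the field change described in the notes.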
- EMu logins
- Registry settings
- standard environments for EMu
- mapping fields from Texpress to EMu
- data mapping benchmarks
- establishing load benchmarks for testing each collection after the load
- range queries
- SQS (simple query screen)
- keeping track of different datasets at varying stages of implementation
- storage locations
- issue registration
- histories
- taxon/ type of
- Help/Manuals/Training
Where are we up to?
- 4 datasets loaded and operational: Marine Invertebrates, Herpetology, Birds & Spiders
- Mammals loaded to test database
- Mapping underway for next datasets
- due to finish migrating all datasets by end 2004
The column on the left lists those collections that have all of the registered collection databased; the column on the right lists those collections that still have some paper-based records.

Some databases were very complex, with linked databases (e.g. for station data, higher classification and bibliographic data), and most included a loans system.

In 2001, on our second attempt, we put a successful business case to the NSW government to develop an Integrated Collection Management System: Internet delivery of data, standardisation of data.

The project was to be a 3-year project from 2001 to 2004, with a $900K budget for a server, a project co-ordinator, design and development of the system, and migration of data.
How easy would it be next time!!!
Some time was spent investigating whether we could use our current Anthropology EMu Catalogue and adapt it to the Natural History collections. Even though tab switching and Registry settings are very advanced in EMu, we felt the two collections would not work well if they were integrated, as there was limited commonality in the core data for each.

We had limited access to relevant Australian-flavour software on which to make design changes. Our options were:
- American versions, or
- partially implemented Australian Natural History versions, or
- partially implemented Australian non-Natural History versions, or
- our existing AM Anthropology EMu dataset, the software for which varied so immensely from the Natural History collections that it was subsequently disregarded as a base model to work from.
Answer: we chose a hybrid of AMNH and NMNH with a few ideas from MV.

Having chosen a base Catalogue to work from (a cross between AMNH and NMNH), we had limited knowledge of the way in which the software worked and interacted between modules. This made customisation of the base product to our AM collection practices very difficult.

Again, having chosen a base Catalogue to work from, it was difficult to ascertain functionality with no real data in the system. We were given access to a relevant version of the software, which helped us greatly in the design phase, but without actual records in the different modules it proved difficult to understand the relationship between the modules and how the data interacted. We felt it would be beneficial to have a small subset of carefully chosen data per collection that displayed a good cross-section of records, so that knowledge of the functionality could be gained whilst undertaking the customisation process.

We decided to look at both the generic and the non-generic qualities of the 11 collections we were undertaking.
This step, although time-consuming, allowed us to make qualified decisions across the collections and to ensure standards were implemented on behalf of the organisation.

This process allowed us to find commonality in work practices:
- Legacy databases that were alike
- Specimen labelling similarities
- Core data normally recorded by each collection
- Outgoing loans

It also highlighted obvious non-generic qualities about the data from different collections:
- Collection sizes
- Taxonomic usage
- Registration numbering systems
- Station data (48 varying fields across the collections that use station data)
- Various coding standards between collections
- The manner in which different collections record Lats & Longs

There were 21 working modules offered in the base product (and another 6 administrative ones: task templates, admin, registry etc.). We only use (or have mapped data to) 10 of these: Catalogue, Taxonomy, Sites, Collecting Events, Parties, Loans, Movements, Bibliography, Multimedia and Storage Locations. The remaining 11 may or may not be useful, but only time and familiarity with the product will determine which other modules are utilised: Accession Lots, Programs/Exhibitions, Conservation, Insurance, Rights, Narratives, Condition Checks, Internal Movements, Valuations, Gazetteer, Thesaurus. This is not a decision that can be made lightly, and not until collection legacy data has been mapped.

Analysis of each collection was required to determine how people recorded Lats & Longs: Deg/Min/Sec or Deg/Dec. Min. EMu stores all permutations of Lat & Long, but a decision had to be made about how it was displayed, and therefore printed, as it had to be standard for everyone.

Registration numbers and IRNs

Salinity data was not originally displayed with float data. To round migrated data up or down is to render it inaccurate; to drop the float data also dilutes the accuracy of the data. KE subsequently altered the field parameters for us, so legacy data could be migrated with accuracy.

A structure was implemented to determine, document and maintain logins and access levels to the system. KE implemented a templated email method of setting up or changing registry settings, which has been quite useful.

After our first migration, we implemented a system of service numbers to cater for the different environments we needed to progressively migrate each collection, so it could be kept quarantined until the load was checked. We subsequently set up 3 service environments for each collection: Test, Live and Train. These 3 environments were then adopted across the 11 collections plus Anthropology plus the Integrated Collection, to enable us to progressively move live loads of data to an integrated environment.

At our request, KE devised a method of viewing EMu in a more usable way, by dumping the software structure to a spreadsheet. This gave us a basis to work from in a sortable, searchable form, so that we could initially document our customisations. This Field List is periodically regenerated when we have made major changes, so it stays relevant to the software. It has facilitated ongoing communication with KE for further software changes. It has allowed us to generate data maps for each collection, Texpress to EMu and EMu to Texpress, whilst retaining the relationship between modules, tabs, group boxes and fields.

Mapping fields from Texpress to EMu:
- How many databases per collection needed moving?
- Were there duplications in fields to be mapped?
- Which module was most appropriate to map the data to?
- Relevance of legacy data to both the collection and its placement in EMu (depth for Fish is different to depth for MI)

The load benchmarks have varied from collection to collection, but we've built up a set of criteria that gets tested by people from differing skill sets at the conclusion of a load: registration staff, personnel who handle loans, the database administrator (for login ability, correct registry settings and mapping integrity) and collection management. Printing functionality is checked across a number of usage parameters. This has ensured that any new load gets checked from a number of different viewpoints, and has on many occasions highlighted a variety of issues.
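The Deg/Min/Sec versus Deg/Dec. Min question raised in the Latitude and Longitude notes comes down to simple arithmetic, sketched below. This is an illustration only: the function names and the display format are assumptions made for the example, and do not describe how EMu stores or prints coordinates or which format the AM chose.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert Deg/Min/Sec plus a hemisphere letter to signed decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range 0 to 60")
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by the usual convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def ddm_to_decimal(degrees, decimal_minutes, hemisphere):
    """Convert Deg/Dec. Min plus a hemisphere letter to signed decimal degrees."""
    if not 0 <= decimal_minutes < 60:
        raise ValueError("decimal minutes must be in the range 0 to 60")
    value = degrees + decimal_minutes / 60
    return -value if hemisphere.upper() in ("S", "W") else value


def decimal_to_dms_text(value, is_latitude):
    """Render signed decimal degrees in one standard Deg Min Sec display form."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    # Work in whole seconds so minutes and seconds never display as 60.
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{degrees:02d} {minutes:02d} {seconds:02d} {hemisphere}"


# The same position recorded two ways by two hypothetical collections.
from_dms = dms_to_decimal(33, 52, 30, "S")
from_ddm = ddm_to_decimal(33, 52.5, "S")
print(decimal_to_dms_text(from_dms, is_latitude=True))
print(decimal_to_dms_text(from_ddm, is_latitude=True))
```

Both records reduce to the same decimal value, so a single display routine can give every collection the same printed form regardless of how the coordinate was originally entered.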
At this point I should explain that we used the Marine Invertebrate collection as the guinea pig collection. We went live on the 3rd load with this collection. Subsequent collections have had one test load and then the live load.

We had some difficulty coming to terms with the loss of functionality in querying compared to that which was available in Texpress. The lack of capacity for range queries on registration numbers is a problem, but this will be implemented by KE in a planned new version.

Implementation of the SQS was somewhat untimely. Analysis across collections needed to be undertaken to determine how many fields to include to cover collection needs for sear