The Data Documentation Initiative: more discussion
Chuck Humphrey
University of Alberta
Atlantic DLI Workshop 2005, Acadia University
2
Outline
• Two metadata challenges
• The evolution of DDI
• The story of the RDC / DDI project
3
Metadata Challenge #1
The correspondence between metadata and the tools that make use of it tends to be one-to-one.
Metadata            Tool
MARC                OPAC
SPSS syntax file    SPSS
PDF file            Acrobat
SAS commands        SAS
4
Metadata Challenge #1
And we have to create and re-create metadata for each application.
Metadata            Tool
MARC                OPAC
SPSS syntax file    SPSS
PDF file            Acrobat
SAS commands        SAS
5
Metadata Models for Data
• Here are some examples of historical metadata models for social science data. Notice that the characteristics of the metadata were bound by the tools of the day.
11
Metadata Challenge #2
Our application tools have tended to constrain our metadata.
Direction of control:
1. Desired service: title search
2. Choose a tool: card catalog
3. Tool defines the metadata format: 3x5 card
4. We create metadata to fit a format: small (e.g., 3 subject headings max.)
12
Metadata Challenge #2
Consider how the maximum length of variable labels in various statistical packages has constrained our metadata for brief variable descriptions.
Statistical Package    Max. Length of Var. Labels (characters)
SPSS 7.5               120
SPSS 12                255
SAS 6.12               40
SAS 8.0                256
STATA 6.0              80
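As a rough illustration of the constraint, the sketch below cuts one variable description down to each package's limit. The limits come from the table above; the description text and the helper name `fit_label` are invented for the example and do not come from any real survey or package.

```python
# Maximum variable-label lengths, in characters, taken from the table above.
LABEL_LIMITS = {
    "SPSS 7.5": 120,
    "SPSS 12": 255,
    "SAS 6.12": 40,
    "SAS 8.0": 256,
    "STATA 6.0": 80,
}


def fit_label(description: str, package: str) -> str:
    """Cut a description down to the label length the package allows."""
    return description[:LABEL_LIMITS[package]]


# Hypothetical variable description, invented for illustration only.
description = (
    "Number of hours per week the respondent usually spent at all jobs "
    "during the twelve months preceding the interview"
)

for package in LABEL_LIMITS:
    label = fit_label(description, package)
    lost = len(description) - len(label)
    print(f"{package:10} keeps {len(label):3} characters, loses {lost}")
```

Metadata written to fit the shortest limit would stay that short even after a package with a longer limit became available, which is the constraint this slide describes.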
13
Metadata Challenge #2
Limiting our metadata to what current tools accept creates a dilemma: when new tools arise or new services are sought that can make use of richer metadata, we will not have created it and will have to re-create the metadata.
14
Lessons from These Challenges
Metadata should be created to go beyond simple one-to-one use and should be reusable for more than one purpose.
Metadata should be created to describe data, not to meet the needs of one system, one service.
Applying These Lessons
Function: Proposal, Sample design, Questionnaire, Pre-test, Revisions, Collection, Processing, Dissemination
Tools: Blaise, SAS, IMDB, Word, Paper
Metadata: DDI
Uses: IMDB, Nesstar, Oracle, Library OPAC, Stat Software, PDF, print, html, RSS, DDI 3, 4 ...
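A minimal sketch of the "describe once, use many times" idea: one tool-neutral list of variable descriptions is rendered as both SPSS and SAS label commands. The variable names, labels, and function names are hypothetical, and the neutral record here is a plain Python dictionary rather than actual DDI.

```python
# One tool-neutral description per variable; names and labels are hypothetical.
variables = [
    {"name": "age", "label": "Age of respondent at time of interview"},
    {"name": "hrswrk", "label": "Usual hours worked per week"},
]


def quote(text: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def to_spss(variables) -> str:
    """Render each description as an SPSS VARIABLE LABELS command."""
    return "\n".join(
        f"VARIABLE LABELS {v['name']} {quote(v['label'])}." for v in variables
    )


def to_sas(variables) -> str:
    """Render the same descriptions as SAS LABEL statements."""
    return "\n".join(
        f"label {v['name']} = {quote(v['label'])};" for v in variables
    )


print(to_spss(variables))
print(to_sas(variables))
```

Supporting another use (an HTML codebook, say) would mean adding one more rendering function, not re-entering the descriptions.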
16
DDI Versions 1 & 2
1.0 Document Description
2.0 Study Description
3.0 Data Files Description
4.0 Variable Description
5.0 Other Study-related Materials
The first two versions of DDI were modeled after the traditional ‘codebook’ made up of a user’s guide, data dictionary and record layout.
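To make the five-part codebook layout concrete, here is a small sketch that builds an empty XML skeleton with one element per section and adds a single variable. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`, `var`, `labl`) are given as recalled from the DDI 2.x tag library and should be checked against the published specification. The variable is hypothetical, and the result is not a complete or validated DDI instance.

```python
import xml.etree.ElementTree as ET

# Section element names paired with the headings listed above.
# The element names are an assumption to verify against the DDI 2.x specification.
SECTIONS = [
    ("docDscr", "1.0 Document Description"),
    ("stdyDscr", "2.0 Study Description"),
    ("fileDscr", "3.0 Data Files Description"),
    ("dataDscr", "4.0 Variable Description"),
    ("otherMat", "5.0 Other Study-related Materials"),
]


def build_skeleton() -> ET.Element:
    """Create a codebook root with one empty child per section."""
    root = ET.Element("codeBook")
    for tag, _heading in SECTIONS:
        ET.SubElement(root, tag)
    return root


def add_variable(root: ET.Element, name: str, label: str) -> ET.Element:
    """Add a variable and its label under the variable description section."""
    var = ET.SubElement(root.find("dataDscr"), "var", name=name)
    ET.SubElement(var, "labl").text = label
    return var


root = build_skeleton()
add_variable(root, "age", "Age of respondent")  # hypothetical variable
print(ET.tostring(root, encoding="unicode"))
```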
17
DDI Version 3 (Draft)
The draft for Version 3 is based on a process model and attempts to describe the stages within data creation using a life cycle perspective.
1. Start up
2. Planning
3. Execution
4. Close Out
20
Project Partnerships
• RDC Network
  • RDCs in the pilot include McMaster, Prairie and Alberta
  • RDC Central has a Nesstar licence
• DLI
  • DLI Central shares the Nesstar licence and is working on converting PUMFs to DDI
  • DLI EAC approved joining the DDI Alliance
21
Project Partnerships
• General Social Survey
  • Permission to use Cycle 17 in the pilot
  • Provided a contact to assist with the data documentation
• Standards Division
  • Interested in a pilot that would expose the issues of using DDI to document data
22
Project Operation
• No formal budget at this point. All contributions to the project are in kind.
• Irene Wong is conducting the evaluation and creation of DDI documentation in the Alberta RDC.
• Sharon Neary, associated with the Prairie RDC, is coordinating training for end-users.
23
Project Operation
• Byron Spencer is coordinating an evaluation of the Nesstar application of DDI in the McMaster RDC with end-users.
We need data discovery tools in DLI and the RDCs.
24
Project Status
• The DDI-compliant documentation for the GSS Cycle 17 master file has been completed and is now being tested at McMaster’s RDC.
• Irene is completing a report describing the process of creating the DDI version of the documentation and an assessment of DDI strengths and weaknesses.
25
Metadata Life-Cycle Research
One outcome of this project will be to comment on the amount of metadata produced over the life cycle of a survey and to identify the existing tools in which this metadata has been created and stored.