Transcript
Page 1: CESSDA Question Databank

CESSDA Question Databank

Tender, results and future

Maarten Hoogerwerf, CESSDA expert seminar 2009

Page 2: CESSDA Question Databank

Introduction

• Data Archiving and Networked Services
  – Institute of both KNAW and NWO
  – Mission
  – Departments:
    • Archive and dissemination
    • Infrastructure
    • Software development

Page 3: CESSDA Question Databank

Outline

• Background
• Question Bank Tender
• Discussion of technical specifications
• Conclusion
• Approach

Page 4: CESSDA Question Databank

Background

• Cross-national survey programmes introduce comparability and harmonization issues.

• Supporting infrastructure:
  – Constructs, Classifications, Conversions Database (CCCDB or CHARMCATS)
  – Question Database (QDB)

• Pre- and post-harmonization

Page 5: CESSDA Question Databank

Tender

• Specification of tender
  – Requirements, use cases
  – Need for CESSDA-wide architecture

• Execution
  – Metadata Technology
  – Marratech sessions
  – Involvement of architecture WP

• Report and review

Page 6: CESSDA Question Databank

Report

• General
  – QDB should not function stand-alone
    • References to variables, questionnaire, etc.
    • DDI3 metadata model
    • Web service architecture
  – DDI v1 and v2 in use by CESSDA archives

• Discussion
  – Will tools be able to migrate to DDI v3?

Page 7: CESSDA Question Databank

Report

• Purpose and Functionality
  – Link questions via concepts, variables
  – Link additional survey metadata / physical data
  – Query questions based on references
  – QDB needs to include references

• Discussion
  – Either use DDI3
  – Use generic model
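To make "link questions via concepts, variables" and "query questions based on references" concrete, here is a minimal in-memory sketch. It is not the DDI3 model and not part of the tender report; the class names, field names and sample identifiers are all hypothetical and only illustrate the generic-model idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """A question plus the references it carries (hypothetical fields)."""
    question_id: str
    text: str
    concept_ids: frozenset = frozenset()
    variable_ids: frozenset = frozenset()


class QuestionIndex:
    """In-memory index that answers queries by reference."""

    def __init__(self):
        self._questions = {}
        self._by_concept = defaultdict(set)
        self._by_variable = defaultdict(set)

    def add(self, question):
        self._questions[question.question_id] = question
        for concept_id in question.concept_ids:
            self._by_concept[concept_id].add(question.question_id)
        for variable_id in question.variable_ids:
            self._by_variable[variable_id].add(question.question_id)

    def by_concept(self, concept_id):
        """Return ids of questions that reference the given concept."""
        return sorted(self._by_concept.get(concept_id, set()))

    def by_variable(self, variable_id):
        """Return ids of questions that reference the given variable."""
        return sorted(self._by_variable.get(variable_id, set()))

    def linked_to(self, question_id):
        """Return ids of other questions sharing at least one concept."""
        linked = set()
        for concept_id in self._questions[question_id].concept_ids:
            linked |= self._by_concept[concept_id]
        linked.discard(question_id)
        return sorted(linked)


# Sample data is invented for illustration only.
index = QuestionIndex()
index.add(Question("q1", "How old are you?",
                   frozenset({"age"}), frozenset({"v_age_nl"})))
index.add(Question("q2", "What is your year of birth?",
                   frozenset({"age"}), frozenset({"v_yob_de"})))
index.add(Question("q3", "Are you currently employed?",
                   frozenset({"employment"})))

print(index.by_concept("age"))  # ['q1', 'q2']
print(index.linked_to("q1"))    # ['q2']
```

Without the references stored on each question, neither query could be answered, which is why the QDB needs to include them.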

Page 8: CESSDA Question Databank

Report

• Architecture
  – Repositories provide content
  – Registry indexes content
  – 3CDB and QDB provide functionality
  – Increasing identification and communication

• Discussion
  – Question bank vs. QDB?
  – Identification designed for DDI3 context

Page 9: CESSDA Question Databank

Report

• Repository
  – Contains content from one or more archives
  – Contains one or more banks
    • Studies, variables, concepts, universes, questions, ...
  – Dedicated or on top of existing systems
  – Additional administration, logs, etc.

• Discussion
  – Existing systems fall short (identification, version, ...)
  – Quality essential for stability

Page 10: CESSDA Question Databank

Report

• Registry
  – Banks register content
  – Minimal metadata required for searching
  – Responsible for searching / locating, not for retrieval
  – Use SDMX approach

• Discussion
  – How much metadata is needed for proper functioning?
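The split between registry and bank can be sketched as follows. This is an illustration under assumptions, not the SDMX registry interface: `RegistryEntry`, `Bank`, `Registry`, their methods and the sample data are all hypothetical. The point shown is that the registry holds only minimal search metadata and returns locations, while retrieval goes to the bank that holds the content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Minimal search metadata; the content itself stays in the bank."""
    item_id: str
    item_type: str  # e.g. "question", "concept", "variable"
    label: str
    bank_name: str


class Registry:
    """Indexes registered entries and locates them; never returns content."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def locate(self, term, item_type=None):
        """Return (bank_name, item_id) pairs whose label contains the term."""
        term = term.lower()
        return [
            (entry.bank_name, entry.item_id)
            for entry in self._entries
            if term in entry.label.lower()
            and (item_type is None or entry.item_type == item_type)
        ]


class Bank:
    """A bank inside a repository: holds full content and registers it."""

    def __init__(self, name):
        self.name = name
        self._content = {}

    def store(self, item_id, item_type, label, content, registry):
        self._content[item_id] = content
        registry.register(RegistryEntry(item_id, item_type, label, self.name))

    def retrieve(self, item_id):
        return self._content[item_id]


# Sample banks and items are invented for illustration only.
registry = Registry()
banks = {name: Bank(name) for name in ("archive_a", "archive_b")}
banks["archive_a"].store("q1", "question", "Age of respondent",
                         {"text": "How old are you?"}, registry)
banks["archive_b"].store("c1", "concept", "Age",
                         {"definition": "Age in years"}, registry)

# Step 1: the registry only says where matching items live.
hits = registry.locate("age", item_type="question")
# Step 2: the caller retrieves the content from the bank that holds it.
results = [banks[bank_name].retrieve(item_id) for bank_name, item_id in hits]
print(hits)     # [('archive_a', 'q1')]
print(results)  # [{'text': 'How old are you?'}]
```

In this sketch the discussion point becomes a design question about `RegistryEntry`: every field added improves searching but also has to be kept in sync with the bank.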

Page 11: CESSDA Question Databank

Report

• QDB
  – Function as repository for local questions and proxy for non-local questions
  – Stores comparison information

• Discussion
  – Should QDB archive questions / comparison information?
  – Who is responsible for QDB (LTP)?
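A rough sketch of the two QDB roles named above: repository for local questions, proxy for non-local ones, plus a store for comparison information. Everything here is an assumption made for illustration: the class, the agency names, and the use of plain callables in place of real web service calls.

```python
class QuestionDatabank:
    """Serves local questions directly and forwards requests for others."""

    def __init__(self, agency, fetchers):
        self.agency = agency
        self._local = {}
        # Maps a remote agency to a callable(question_id) -> question text.
        # In a real deployment this would be a web service client.
        self._fetchers = fetchers
        self._comparisons = []

    def add_local(self, question_id, text):
        self._local[question_id] = text

    def get(self, agency, question_id):
        """Return the question text, acting as proxy for non-local agencies."""
        if agency == self.agency:
            return self._local[question_id]
        return self._fetchers[agency](question_id)

    def record_comparison(self, left, right, note):
        """Store comparison information about two (agency, id) references."""
        self._comparisons.append((left, right, note))

    def comparisons_for(self, ref):
        return [c for c in self._comparisons if ref in (c[0], c[1])]


# Stand-in for a remote archive; invented for illustration only.
remote_questions = {"q7": "Wie alt sind Sie?"}


def fetch_from_remote(question_id):
    return remote_questions[question_id]


qdb = QuestionDatabank("nl.example", {"de.example": fetch_from_remote})
qdb.add_local("q1", "Hoe oud bent u?")
qdb.record_comparison(("nl.example", "q1"), ("de.example", "q7"),
                      "same concept, different language")

print(qdb.get("nl.example", "q1"))  # served from the local store
print(qdb.get("de.example", "q7"))  # forwarded to the remote archive
```

The sketch keeps only references to remote questions, never copies, so it leaves the archiving question from the discussion open.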

Page 12: CESSDA Question Databank

Report

• Requirements and use cases
  – A ‘Gold Standard’ promotes the use of certain proven objects and increases comparability
  – Use registry for searching

• Discussion
  – Assign to existing questions or define them centrally?
  – Use registry or QDB for searching questions?

Page 13: CESSDA Question Databank

Report

• Metadata and technology overview
  – Many open source components
  – Database might require proprietary software

• Discussion
  – Start with open source database. Good design allows replacement when needed.

Page 14: CESSDA Question Databank

Report

• Implementation
  – Start prototype implementations to demonstrate functionality
  – Start improving legacy metadata
  – Use / extend SDMX registry

• Discussion
  – Deadlock situation: get tools to improve metadata, improve metadata to demonstrate functionality
  – How DDI3-compliant is improved metadata from Nesstar without workflow, versioning, identification? Is it DDI3-ready?

Page 15: CESSDA Question Databank

Alternative Solution

• MT approach is similar to / better than the intuitive solution
  – DDI3 metadata approach is essential
  – Web service is more flexible than harvesting
  – MT approach is more distributed

Page 16: CESSDA Question Databank

Conclusion

• DDI3 is an obvious choice, adopt it and improve it

• It will change workflow, infrastructure and responsibility

• How can archives justify, pay, risk and achieve this?

• What is the role of CESSDA?

Page 17: CESSDA Question Databank

Approach

Page 18: CESSDA Question Databank

Approach

• Phase 1: search, browse and access questions
  – Question text + response domain
  – Results in having some base material

• Phase 2: add references
  – To/from concepts and questionnaires
  – Implement registry to facilitate search
  – Explore organization, publishing issues
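Phase 1 needs nothing beyond question text and response domain. A minimal sketch of such a search, assuming a hypothetical `Question` record and a plain substring match; the sample questions are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """Phase 1 record: text and response domain only, no references yet."""
    question_id: str
    text: str
    response_domain: tuple


def search(questions, term):
    """Return ids of questions whose text or response categories match."""
    term = term.lower()
    return [
        q.question_id
        for q in questions
        if term in q.text.lower()
        or any(term in category.lower() for category in q.response_domain)
    ]


QUESTIONS = [
    Question("q1", "How satisfied are you with your life?",
             ("Very satisfied", "Satisfied", "Dissatisfied")),
    Question("q2", "The government should reduce income differences.",
             ("Agree", "Neither", "Disagree")),
    Question("q3", "What is your monthly income?", ("Numeric",)),
]

print(search(QUESTIONS, "income"))  # ['q2', 'q3']
print(search(QUESTIONS, "agree"))   # ['q2']
```

Phase 2 would extend the record with references to concepts and questionnaires, at which point a registry becomes useful for locating questions across archives.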

Page 19: CESSDA Question Databank

Approach

• Phase 3: Add QDB/3CDB
  – What functions do these provide?
  – What metadata functions do these require?

• Etc.
