Use of Standardized Metadata to Find, Select and Access Statistical Data
- Experience of Statistics Canada -
Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (METIS)
Geneva, February 9-11, 2004

Objective of Presentation
Answer the question:
“How can the corporate Metadatabase (IMDB) of Statistics Canada help users find, select and access statistical data held in its on-line database (CANSIM)?”

Contents of Presentation
- What is CANSIM?
- What is the IMDB?
- Accessing CANSIM data from IMDB
  - Demonstration
  - Naming and defining variables
  - Finding variable & accessing data
  - Implementation schedule

What is CANSIM?
Stands for: CANadian Socio-economic Information Management system
- Corporate data dissemination database
- Accessible on STC Web site
- 1.3K tables (+ 700 “terminated”)
- 18.3M time series (incl. 413K “terminated”) (over 14M for “health” alone)
- 800 variables (as defined in IMDB)

What is the IMDB?
- Corporate repository of information on over 350 surveys (+400 “discontinued”)
- Development began in 1999
- 4 pre-existing systems integrated
- Supports on-line dissemination activities:
  - The Daily
  - CANSIM
  - On-line catalogue
  - Canadian Statistic Tables

What is the IMDB content?
HTML pages generated from IMDB:
- Overview of survey (mandate, users, uses)
- Survey population & Questionnaire image
- Methodology description (10 components)
- Data accuracy measures
In the Fall of 2004:
- Variable names and definitions
- Link to classifications & CANSIM tables
- “Time Travel” from November 2000 on

Naming and defining variables
- Variable = Statistical unit + property + representation (as per ISO 11179 model)
- Statistical unit is agent, event or item about which data are produced
- Property is characteristic of statistical unit being measured
- Representation is form given to resulting data, e.g. Name, Index, Type

… Naming and defining variables
Naming convention: all three elements used to create name of variable
- Value of Sales of Establishment
- Type of Assets of Establishment
- Name of Geographic location of Person
- Type of Occupation of Person
- Value of GDP of Economy
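
The naming convention above can be sketched in a few lines of Python. This is only an illustration: the class and field names are assumptions, not IMDB structures, and the word order ("representation of property of statistical unit") is inferred from the five example names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Illustrative model of a variable built from its three elements."""
    statistical_unit: str  # agent, event or item, e.g. "Establishment"
    prop: str              # characteristic being measured, e.g. "Sales"
    representation: str    # form given to the data, e.g. "Value"

    @property
    def name(self) -> str:
        # Word order inferred from the examples on the slide.
        return f"{self.representation} of {self.prop} of {self.statistical_unit}"


# The five example variables from the slide.
examples = [
    Variable("Establishment", "Sales", "Value"),
    Variable("Establishment", "Assets", "Type"),
    Variable("Person", "Geographic location", "Name"),
    Variable("Person", "Occupation", "Type"),
    Variable("Economy", "GDP", "Value"),
]

for variable in examples:
    print(variable.name)
```

Because the name is derived from the three elements rather than stored separately, two variables that share a statistical unit, property and representation always get the same name.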

… Naming and defining variables
Definition of variable provided by joined definitions of its 3 components + specification of associated classifications (or units of measure)
Note about Variable - Classification relationship:
- ISO 11179: one-to-one relationship
- IMDB: one-to-many, but one-to-one between classification and variable in one CANSIM table
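
A minimal Python sketch of these two ideas follows: a definition assembled from the three component definitions, and a variable that may use several classifications overall but only one within a given CANSIM table. All names here (`Component`, `join_definitions`, `TableLinks`), the bracketed definition texts and the table identifiers are hypothetical placeholders, not IMDB content.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    definition: str  # placeholder text in this sketch, not IMDB wording


def join_definitions(unit: Component, prop: Component,
                     representation: Component,
                     classifications: Iterable[str]) -> str:
    """Join the three component definitions and list the classifications."""
    text = " ".join(c.definition for c in (representation, prop, unit))
    return f"{text} Classifications: {', '.join(classifications)}."


class TableLinks:
    """Records which classification a variable uses in each table."""

    def __init__(self) -> None:
        self._links: Dict[Tuple[str, str], str] = {}

    def link(self, table: str, variable: str, classification: str) -> None:
        key = (table, variable)
        existing = self._links.get(key)
        if existing is not None and existing != classification:
            # One-to-one between classification and variable within a table.
            raise ValueError(f"{variable!r} already uses {existing!r} in {table!r}")
        self._links[key] = classification

    def classifications_of(self, variable: str) -> Set[str]:
        # Across tables, a variable may have many classifications.
        return {c for (_, v), c in self._links.items() if v == variable}


links = TableLinks()
links.link("table-1", "Type of Occupation of Person", "Classification A")
links.link("table-2", "Type of Occupation of Person", "Classification B")
print(sorted(links.classifications_of("Type of Occupation of Person")))
```

Linking "Classification B" to the same variable in "table-1" would raise `ValueError`, which is how the sketch expresses the per-table one-to-one rule.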

Finding variable & accessing data
Browsing the list of 800 variables
- By variable topic (20) and sub-topic (156)
- By statistical unit (75)
- By classification domain (20)
Search engine to scan the list of:
- variable names in IMDB, and return the ones containing the word entered or its thesaurus equivalent; or
- class names/codes within classifications, for the word entered or its thesaurus equivalent, and return the variables and CANSIM table numbers associated with the matching codes
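
The two search modes can be sketched as follows. This is an assumption-laden illustration, not the IMDB search engine: the function names, the thesaurus entry, the class record and the table identifier are all hypothetical, and matching is a plain case-insensitive substring test.

```python
from typing import Dict, List, Sequence, Set, Tuple


def expand(term: str, thesaurus: Dict[str, Sequence[str]]) -> Set[str]:
    """Return the word entered plus its thesaurus equivalents, lower-cased."""
    term = term.lower()
    return {term} | {t.lower() for t in thesaurus.get(term, ())}


def search_variable_names(term: str, variable_names: Sequence[str],
                          thesaurus: Dict[str, Sequence[str]]) -> List[str]:
    """Mode 1: variable names containing the word or an equivalent."""
    terms = expand(term, thesaurus)
    return [n for n in variable_names if any(t in n.lower() for t in terms)]


def search_classes(term: str, classes: Sequence[Tuple[str, str, str, str]],
                   thesaurus: Dict[str, Sequence[str]]) -> List[Tuple[str, str]]:
    """Mode 2: match class codes/names, return (variable, table) pairs.

    Each record is (code, class name, variable name, table identifier).
    """
    terms = expand(term, thesaurus)
    hits = []
    for code, class_name, variable, table in classes:
        haystack = f"{code} {class_name}".lower()
        if any(t in haystack for t in terms):
            hits.append((variable, table))
    return hits


# Hypothetical data for illustration only.
thesaurus = {"job": ["occupation"]}
names = ["Value of Sales of Establishment", "Type of Occupation of Person"]
classes = [("X1", "Sample occupation class",
            "Type of Occupation of Person", "table-1")]

print(search_variable_names("job", names, thesaurus))
print(search_classes("job", classes, thesaurus))
```

Substring matching is the simplest choice here; a real search engine would more likely match on whole words or stems.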

Implementation schedule
- Winter 2004: loading variables and classifications in IMDB, implementing Browsing mechanism and “time travel”, finalizing re-design of web pages
- Spring 2004: display new pages with new features on Intranet to obtain feedback from survey managers
- Fall 2004: display on Internet
- Winter 2005: Implementation of Search mechanism