Use of Standardized Metadata to Find, Select and Access Statistical Data - Experience of Statistics...

Use of Standardized Metadata to Find, Select and Access Statistical Data - Experience of Statistics Canada - Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (METIS), Geneva, February 9-11, 2004


Transcript of Use of Standardized Metadata to Find, Select and Access Statistical Data - Experience of Statistics...

Page 1: Use of Standardized Metadata to Find, Select and Access Statistical Data - Experience of Statistics Canada - Joint UNECE/Eurostat/OECD Work Session on.

Use of Standardized Metadata to Find, Select and Access Statistical Data
- Experience of Statistics Canada -

Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (METIS)

Geneva, February 9-11, 2004

Page 2:

Objective of Presentation

Answer the question:

“How can the corporate Metadatabase (IMDB) of Statistics Canada help users find, select and access statistical data held in its on-line database (CANSIM)?”

Page 3:

Contents of Presentation

- What is CANSIM?
- What is the IMDB?
- Accessing CANSIM data from IMDB
  – Demonstration
  – Naming and defining variables
  – Finding variable & accessing data
  – Implementation schedule

Page 4:

What is CANSIM?

Stands for: CANadian Socio-economic Information Management system

- Corporate data dissemination database
- Accessible on STC Web site
- 1.3K tables (+ 700 “terminated”)
- 18.3M time series (incl. 413K “terminated”) (over 14M for “health” alone)
- 800 variables (as defined in IMDB)

Page 5:

What is the IMDB?

- Corporate repository of information on over 350 surveys (+400 “discontinued”)
- Development began in 1999
- 4 pre-existing systems integrated
- Supports on-line dissemination activities:
  – The Daily
  – CANSIM
  – On-line catalogue
  – Canadian Statistic Tables

Page 6:
Page 7:

What is the IMDB content?

HTML pages generated from IMDB:
- Overview of survey (mandate, users, uses)
- Survey population & Questionnaire image
- Methodology description (10 components)
- Data accuracy measures

In the Fall of 2004:
- Variable names and definitions
- Link to classifications & CANSIM tables
- “Time Travel” from November 2000 on

Page 8:
Page 9:
Page 10:
Page 11:
Page 12:
Page 13:
Page 14:
Page 15:

Naming and defining variables

Variable = Statistical unit + property + representation (as per ISO 11179 model)

Statistical unit is agent, event or item about which data are produced

Property is characteristic of statistical unit being measured

Representation is form given to resulting data, e.g. Name, Index, Type
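
The three-part model above can be sketched as a small record type. This is only an illustration, not the IMDB's actual schema: the class name `Variable` and its field names are assumptions chosen for readability, and the sample values are taken from one of the variable names used in this presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical record: statistical unit + property + representation."""

    statistical_unit: str  # agent, event or item about which data are produced
    prop: str              # characteristic of the statistical unit being measured
    representation: str    # form given to the resulting data, e.g. Name, Index, Type


# Components of "Value of Sales of Establishment".
sales = Variable(statistical_unit="Establishment", prop="Sales", representation="Value")
print(sales)
```

Keeping the three components as separate fields, rather than storing only the finished name, is what would let a list of variables be grouped by statistical unit later on.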

Page 16:

… Naming and defining variables

Naming convention: all three elements used to create name of variable

- Value of Sales of Establishment
- Type of Assets of Establishment
- Name of Geographic location of Person
- Type of Occupation of Person
- Value of GDP of Economy
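
The examples above all follow the pattern "representation of property of statistical unit". A minimal sketch of that convention follows; the function name `variable_name` is hypothetical, and the pattern is inferred from the five examples rather than taken from a documented IMDB rule.

```python
def variable_name(representation: str, prop: str, statistical_unit: str) -> str:
    """Join the three elements in the order the slide examples use."""
    return f"{representation} of {prop} of {statistical_unit}"


# (representation, property, statistical unit) for each example on the slide.
examples = [
    ("Value", "Sales", "Establishment"),
    ("Type", "Assets", "Establishment"),
    ("Name", "Geographic location", "Person"),
    ("Type", "Occupation", "Person"),
    ("Value", "GDP", "Economy"),
]

for parts in examples:
    print(variable_name(*parts))
```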

Page 17:

… Naming and defining variables

Definition of variable provided by joined definitions of its 3 components + specification of associated classifications (or units of measure)

Note about Variable – Classification relationship:
- ISO 11179: one-to-one relationship
- IMDB: one-to-many, but one-to-one between classification and variable in one CANSIM table
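
One way to picture the relationship noted above: a variable may be registered with several classifications overall, while each table pairs it with exactly one. The sketch below is an assumption about how that could be modelled. The classification labels, table identifiers and the `check_tables` helper are all invented for illustration and do not come from the IMDB or CANSIM.

```python
# Hypothetical data: one variable, several classifications (one-to-many).
variable_classifications = {
    "Type of Occupation of Person": {"Classification A", "Classification B"},
}

# Per table, a dict maps each variable to a single classification, so the
# one-to-one rule inside a table holds by construction.
table_variable_classification = {
    "table-1": {"Type of Occupation of Person": "Classification A"},
    "table-2": {"Type of Occupation of Person": "Classification B"},
}


def check_tables(var_cls, tables):
    """Return (table, variable, classification) triples whose classification
    is not registered for that variable."""
    problems = []
    for table, mapping in tables.items():
        for variable, classification in mapping.items():
            if classification not in var_cls.get(variable, set()):
                problems.append((table, variable, classification))
    return problems


print(check_tables(variable_classifications, table_variable_classification))
```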

Page 18:

Finding variable & accessing data

- Browsing the list of 800 variables
  – By variable topic (20) and sub-topic (156)
  – By statistical unit (75)
  – By classification domain (20)
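
Browsing by topic, statistical unit or classification domain amounts to grouping the variable list on one attribute at a time. A rough sketch, assuming each variable carries those attributes as plain fields: the topic and domain labels below are placeholders, and `browse_index` is a hypothetical helper, not part of the IMDB.

```python
from collections import defaultdict

# Placeholder records; only the variable names come from the presentation.
variables = [
    {"name": "Value of Sales of Establishment", "topic": "Topic A",
     "unit": "Establishment", "domain": "Domain X"},
    {"name": "Type of Occupation of Person", "topic": "Topic B",
     "unit": "Person", "domain": "Domain Y"},
    {"name": "Name of Geographic location of Person", "topic": "Topic C",
     "unit": "Person", "domain": "Domain Z"},
]


def browse_index(records, facet):
    """Group variable names under each value of the chosen facet."""
    index = defaultdict(list)
    for record in records:
        index[record[facet]].append(record["name"])
    return {key: sorted(names) for key, names in sorted(index.items())}


for unit, names in browse_index(variables, "unit").items():
    print(unit, names)
```

The same helper serves all three browsing paths by passing `"topic"`, `"unit"` or `"domain"` as the facet.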

- Search engine to scan the list of:
  – variable names in IMDB, and return the ones containing the word entered or its thesaurus equivalent; or
  – class names/codes within classifications for the word entered or its thesaurus equivalent, and return the variables and CANSIM table numbers associated with the matching codes
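
The two search modes described above can be sketched as follows. Everything here is illustrative: the thesaurus entries, class codes, class names and table identifiers are invented, simple substring matching stands in for whatever matching the real search engine uses, and the function names are hypothetical.

```python
# Hypothetical thesaurus: entered word -> equivalent terms.
THESAURUS = {"job": {"occupation"}, "revenue": {"sales"}}

NAMES = ["Value of Sales of Establishment", "Type of Occupation of Person"]

# Hypothetical class entries: (class code, class name, variable, table id).
CLASSES = [
    ("C1", "Nursing occupations", "Type of Occupation of Person", "T-001"),
    ("C2", "Retail sales", "Value of Sales of Establishment", "T-002"),
]


def expand(word):
    """Return the entered word plus its thesaurus equivalents, lower-cased."""
    w = word.lower()
    return {w} | THESAURUS.get(w, set())


def search_variable_names(word, names):
    """Mode 1: variable names containing the word or an equivalent."""
    terms = expand(word)
    return [n for n in names if any(t in n.lower() for t in terms)]


def search_classes(word, classes):
    """Mode 2: match class names/codes, return (variable, table id) pairs."""
    terms = expand(word)
    hits = []
    for code, class_name, variable, table in classes:
        haystack = f"{code} {class_name}".lower()
        if any(t in haystack for t in terms):
            hits.append((variable, table))
    return hits


print(search_variable_names("job", NAMES))
print(search_classes("revenue", CLASSES))
```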

Page 19:

Implementation schedule

- Winter 2004: loading variables and classifications in IMDB, implementing Browsing mechanism and “time travel”, finalizing re-design of web pages
- Spring 2004: display new pages with new features on Intranet to obtain feedback from survey managers
- Fall 2004: display on Internet
- Winter 2005: Implementation of Search mechanism