Development of metadata in the National Statistical Institute of Spain
Work Session on Statistical Metadata, Geneva, 6-8 May 2013
Ana Isabel Sánchez-Luengo Murcia
Background
INE has been working on metadata for 10 years
Closely following the developments of the METIS and Eurostat working groups
During these 10 years, prototypes have been developed for repositories of concepts, questions and classifications related to different surveys carried out by the INE
Before:
Metadata were not considered a part of the process.
Now:
• Process is defined in terms of the GSBPM
• Metadata will be considered by the Regulations.
Metadata are integrated in the process
Current project
Metadata Integrated System (SIM)
Statistical Operation, Database, Process
Reference, Structural, Process
Efficiency and re-use, Governance, Exchange of information
Reference metadata. Methodological sheets for dissemination on the web (ESMS)
Origin of the project
The Board of Directors approved it and enhanced it
3. Improvement and standardisation of the methodological information on the INE's website corresponding to the statistical operations carried out.
Reference metadata
• Part of the process
• Standardisation, Coordination, Quality, Dissemination, IT
• Approved by a standard (Board of Directors)
• Implementation: January-December
Some examples of reference metadata at the INE
Standardised Methodological Report
The structural metadata. The databases
Link to the microdata
Legal marital status
Legal marital status is defined as the (legal) conjugal status of each individual in relation to the marriage laws (or customs) of the country (i.e. de jure status).
Source: Core Social Variables
Microdata
Link to the macrodata
The structural metadata. The databases
SIM: Statistical Operations, Questionnaires, Questions, Classifications, Variables, Subjects, Microdata, Macrodata
Searching tools
- Which surveys include the variable 'Nationality'?
- How is it collected?
- Is there an INE standard for collecting this variable?
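The kind of query these questions imply can be sketched as follows. This is only a minimal illustration, assuming a small in-memory repository; the class names, field names and the three surveys are hypothetical and do not come from SIM itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variable:
    """A variable as collected by one survey (all values here are illustrative)."""
    name: str
    question_text: str
    classification: str
    follows_standard: bool


@dataclass
class Survey:
    """A statistical operation and the variables it collects."""
    name: str
    variables: list[Variable] = field(default_factory=list)


def surveys_with_variable(surveys: list[Survey], variable_name: str) -> list[tuple[str, Variable]]:
    """Return (survey name, variable) pairs for every survey collecting the variable."""
    wanted = variable_name.casefold()
    return [
        (survey.name, variable)
        for survey in surveys
        for variable in survey.variables
        if variable.name.casefold() == wanted
    ]


# Hypothetical repository content, invented for this sketch.
repository = [
    Survey("Survey A", [Variable("Nationality", "What is your nationality?", "Country list", True)]),
    Survey("Survey B", [Variable("Nationality", "Country of citizenship?", "Spanish / foreign", False)]),
    Survey("Survey C", [Variable("Turnover", "Monthly turnover?", "None", True)]),
]

hits = surveys_with_variable(repository, "nationality")
for survey_name, variable in hits:
    # Shows where the variable is collected, how it is asked, and whether a standard is followed.
    print(survey_name, "|", variable.question_text, "|", variable.classification,
          "| standard:", variable.follows_standard)
```

Each hit answers the three questions at once: which survey, how the variable is asked and coded, and whether that survey follows the standard definition.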
The structural metadata
Implementation plan
• April 2012: submitted to the High Council on Statistics
• Beginning of 2013: concept repository available on the INE's website
• End of 2013: classification repository
• Beginning of 2014: access to the structural metadata database becomes possible
The process metadata
Process metadata
The Board of Directors, at its meeting of 8 March 2012, approved a standard promoting the use of the GSBPM as the language for describing the production model of the different statistical operations.
Process metadata
Pilot tests have already been carried out, describing the production process for:
• Retail Trade Indices
• Survey on Equipment and Use of Information and Communication Technologies in Households
General remarks: Different kinds of tasks have to be distinguished in the production process:
A. Monthly tasks: basically data collection, editing, imputation and dissemination.
B. Yearly tasks:
B1. Yearly tasks related to the sample rotation.
B2. Yearly tasks related to other potential changes in the survey, such as data collection improvements.
C. Other sporadic tasks, such as a change of questionnaire.
Process metadata
Reflections:
1. The unit in charge of producing the index described the different tasks in terms of the GSBPM
• A lack of information on data, people involved, software used, time spent… was noticed
2. The metadata unit proposed a minimum set of items to cover this lack of information
• GSIM could solve this problem
Process metadata: Retail Trade Indices
GSBPM phases and sub-processes shown on the diagram:
4 Collect: 4.1 Select sample; 4.2 Set up collection; 4.3 Run collection; 4.4 Finalize collection
5 Process: 5.1 Integrate data; 5.2 Classify & code; 5.3 Review, validate & edit; 5.4 Impute; 5.5 Derive new variables & statistical units; 5.6 Calculate weights; 5.7 Calculate aggregates; 5.8 Finalize data files
6 Analyse: 6.1 Prepare draft outputs; 6.2 Validate outputs; 6.3 Scrutinize & explain; 6.4 Apply disclosure control; 6.5 Finalize outputs
7 Disseminate: 7.1 Update output systems; 7.2 Produce dissemination products; 7.3 Manage release of dissemination products; 7.4 Promote dissemination products; 7.5 Manage user support
8 Archive: 8.1 Define archive rules; 8.2 Manage archive repository; 8.3 Preserve data and associated metadata; 8.4 Dispose of data & associated metadata
Tasks marked on the diagram: M1, M2, M3 DDPP, M4, M5 SP, M6, M7, M8, M9, M10, M11, M12, M13, M14, M15, M16, M17, M18
Process metadata: Example
Subprocess: 2.4 Design frame & sample methodology
Actions:
Action periodicity (e.g. monthly, yearly, ...):
Starting date: / Final date:
Required input:
Initial file: / Final file:
Software (standard and/or tailor-made):
Documentation, manual, handbook:
Unit in charge:
Collaborating units:
Where in GSBPM is it located?
• Monthly and yearly tasks: aimed at the production of information (short-term and structural)
• Yearly tasks: related to samples, improvement of the software tools, questionnaires, ...
• Non-periodical tasks: new base year, change of classification, methodological change
Timetables
Work flows and their documentation
Databases produced. Which ones are part of the corporate system?
Makes it easy to reuse
Process metadata: Example
Subprocess: 4.3 Run collection
Actions: At the end of the reference period (day t), questionnaires are sent to the respondent units, which send them back before day t+7
Action periodicity (e.g. monthly, yearly, ...): Monthly
Starting date: t+1 / Final date: t+7
Required input: Questionnaires with updated postal addresses pre-printed on them
Initial file: Not applicable (N/A) / Final file: N/A
Software (standard and/or tailor-made): Software tool for collection via web (ARCE), software tool for paper collection
Documentation, manual, handbook: Questionnaires, passwords, data collection handbook, validation rules, labels
Unit in charge: Collection units located in the regional delegations
Collaborating units: Data collection unit and Retail Trade Index unit
Task: M1
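A template like this one maps naturally onto a structured record. The sketch below is one possible representation, not INE's actual implementation: the class name, the field names and the `missing_items` check are assumptions made for illustration, while the values are taken from the example above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProcessMetadata:
    """One filled-in template; field names are this sketch's own, not INE's."""
    task: str
    subprocess: str
    actions: str
    periodicity: str
    starting_date: str
    final_date: str
    required_input: str
    initial_file: Optional[str]   # None stands for "not applicable"
    final_file: Optional[str]     # None stands for "not applicable"
    software: list[str]
    documentation: list[str]
    unit_in_charge: str
    collaborating_units: list[str]

    def missing_items(self) -> list[str]:
        """Names of items left empty ("" or an empty list); None is not counted as missing."""
        return [f.name for f in fields(self) if getattr(self, f.name) in ("", [])]


# Values copied from the "4.3 Run collection" example in the slide.
run_collection = ProcessMetadata(
    task="M1",
    subprocess="4.3 Run collection",
    actions=("At the end of the reference period (day t), questionnaires are sent "
             "to the respondent units, which send them back before day t+7"),
    periodicity="Monthly",
    starting_date="t+1",
    final_date="t+7",
    required_input="Questionnaires with updated postal addresses pre-printed on them",
    initial_file=None,
    final_file=None,
    software=["Software tool for collection via web (ARCE)",
              "Software tool for paper collection"],
    documentation=["Questionnaires", "Passwords", "Data collection handbook",
                   "Validation rules", "Labels"],
    unit_in_charge="Collection units located in the regional delegations",
    collaborating_units=["Data collection unit", "Retail Trade Index unit"],
)

print(run_collection.missing_items())
```

A check such as `missing_items` is one way to flag the gaps in information on software, people involved and timing that the pilot tests reported.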
Process metadata
Adjusted series
• Series published
• Method used
• Software used
• Aggregation
• Reviews
• Quality indicators
Conclusion
• MODEL
Consideration of the whole process
Relevance and need of linking metadata to data
• WORK METHOD
Access to software tools by means of user name and password
Collaboration of the different units
Dissemination inside and outside the INE
• SOFTWARE TOOLS
Improvement of searching tools
More user-friendly and flexible
Aimed at promoting reuse
Conclusion
Positive aspects:
• It eases the change from vertical to thematic information
• Good documentation
• Well organised
Future
• Production of SIM
• Implementation of the reference metadata in the institution
• Dissemination of the metadata
• Inclusion of administrative sources
• Increased standardisation and efficiency
• Thank you for your attention
• Any questions?