Development of metadata in the National Statistical Institute of Spain

Work Session on Statistical Metadata, Genève, 6-8 May 2013

Ana Isabel Sánchez-Luengo Murcia

Background

INE has been working on metadata for 10 years

Closely following the developments of the METIS and Eurostat working groups

During these 10 years, prototypes have been developed for repositories of concepts, questions and classifications related to different surveys carried out by the INE

Before:

Metadata were not considered a part of the process.

Now:

Metadata are integrated in the process

• Process is defined in terms of GSBPM

• Metadata will be considered by the Regulations.

Current project

Metadata Integrated System (SIM)

Statistical Operation, Database, Process

Reference, Structural, Process

Efficiency and re-use, Governance, Exchange of information

Reference metadata. Methodological sheets for dissemination on the web (ESMS)

Origin of the project

The Board of Directors approved and enhanced it

3. Improvement and standardisation of the methodological information on the INE’s website for the statistical operations carried out.

Reference metadata

• Part of the process

• Standardisation, Coordination, Quality, Dissemination, IT

• Approved by a standard of the Board of Directors

• Implementation: January-December

Some examples of reference metadata at the INE

Standardised Methodological Report

The structural metadata. The databases

Link to the microdata

Legal marital status

Legal marital status is defined as the (legal) conjugal status of each individual in relation to the marriage laws (or customs) of the country (i.e. de jure status).

Source: Core Social Variables

Microdata

Link to the macrodata

The structural metadata. The databases

SIM: Statistical Operations, Questionnaires, Questions, Classifications, Variables, Subjects, Microdata, Macrodata

Searching tools

- Which surveys include the variable ‘Nationality’?

- How is it collected?

- Is there an INE standard for collecting this variable?
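
Questions of this kind could be answered by simple queries over a repository of variables. The following is a minimal sketch in Python, assuming a hypothetical in-memory structure: the record fields, the survey names and the `is_standard` flag are illustrative only and do not describe the actual SIM schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableUse:
    """One use of a variable in a survey (hypothetical record layout)."""
    variable: str        # variable name, e.g. "Nationality"
    survey: str          # survey that collects the variable
    question: str        # question wording used to collect it
    classification: str  # classification used for the answers
    is_standard: bool    # True if this use follows an institutional standard


# Illustrative entries only; these are not real INE metadata.
REPOSITORY = [
    VariableUse("Nationality", "Survey A", "What is your nationality?",
                "Country list", True),
    VariableUse("Nationality", "Survey B", "Country of nationality",
                "Country list", True),
    VariableUse("Legal marital status", "Survey A",
                "What is your legal marital status?",
                "Marital status codes", False),
]


def _uses(variable, repository):
    """Yield every recorded use of the variable (case-insensitive match)."""
    wanted = variable.casefold()
    return (u for u in repository if u.variable.casefold() == wanted)


def surveys_including(variable, repository=REPOSITORY):
    """Which surveys include the variable?"""
    return sorted({u.survey for u in _uses(variable, repository)})


def how_collected(variable, repository=REPOSITORY):
    """How is it collected? Returns (survey, question, classification)."""
    return [(u.survey, u.question, u.classification)
            for u in _uses(variable, repository)]


def has_standard(variable, repository=REPOSITORY):
    """Is at least one recorded use flagged as following a standard?"""
    return any(u.is_standard for u in _uses(variable, repository))
```

With the sample entries, `surveys_including("Nationality")` lists both illustrative surveys, and `has_standard("Nationality")` reports that a standard is recorded.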

The structural metadata

Implementation plan

• April 2012: submitted to the High Council on Statistics

• Beginning of 2013: concept repository available on INE’s website

• End of 2013: classification repository

• Beginning of 2014: it should be possible to access the structural metadata base

The process metadata

The Board of Directors, at its meeting of March 8th, 2012, approved a standard promoting the use of the GSBPM as the language for describing the production model of the different statistical operations.

Pilot tests have already been carried out, describing the production process of:

• Retail Trade Indices

• Survey on Equipment and Use of Information and Communication Technologies in Households

General remarks: Different kinds of tasks have to be distinguished in the production process:

A. Monthly tasks: These tasks are basically data collection, editing, imputation and dissemination.

B. Yearly tasks:

B1. Yearly tasks related to the sample rotation.

B2. Yearly tasks related to other potential changes in the survey, such as data collection improvements.

C. Other sporadic tasks, such as a change of questionnaire.

Reflections:

1. The unit in charge of producing the index described the different tasks in terms of GSBPM

• A lack of information on the data, people involved, software used, time spent, etc. was noticed

2. The metadata unit proposed some items (as a minimum) to cover this lack of information

• GSIM could solve this problem

Process metadata: Retail Trade Indices

4 Collect: 4.1 Select sample; 4.2 Set up collection; 4.3 Run collection; 4.4 Finalize collection

5 Process: 5.1 Integrate data; 5.2 Classify & code; 5.3 Review, validate & edit; 5.4 Impute; 5.5 Derive new variables & statistical units; 5.6 Calculate weights; 5.7 Calculate aggregates; 5.8 Finalize data files

6 Analyse: 6.1 Prepare draft outputs; 6.2 Validate outputs; 6.3 Scrutinize & explain; 6.4 Apply disclosure control; 6.5 Finalize outputs

7 Disseminate: 7.1 Update output systems; 7.2 Produce dissemination products; 7.3 Manage release of dissemination products; 7.4 Promote dissemination products; 7.5 Manage user support

8 Archive: 8.1 Define archive rules; 8.2 Manage archive repository; 8.3 Preserve data and associated metadata; 8.4 Dispose of data & associated metadata

Tasks marked on the diagram: M1, M2, M3 DDPP, M4, M5 SP, M6, M7, M8, M9, M10, M11, M12, M13, M14, M15, M16, M17, M18
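
The GSBPM phases and sub-processes listed above lend themselves to a simple lookup, so that each task can be located in the model. A minimal sketch in Python follows; the dictionary layout and the function names are assumptions made for illustration. Only the location of task M1 (sub-process 4.3) is stated in the presentation, so it is the only task mapped here.

```python
# GSBPM phases 4-8 with their sub-processes, as named on the slide.
GSBPM = {
    "4": ("Collect", {
        "4.1": "Select sample",
        "4.2": "Set up collection",
        "4.3": "Run collection",
        "4.4": "Finalize collection",
    }),
    "5": ("Process", {
        "5.1": "Integrate data",
        "5.2": "Classify & code",
        "5.3": "Review, validate & edit",
        "5.4": "Impute",
        "5.5": "Derive new variables & statistical units",
        "5.6": "Calculate weights",
        "5.7": "Calculate aggregates",
        "5.8": "Finalize data files",
    }),
    "6": ("Analyse", {
        "6.1": "Prepare draft outputs",
        "6.2": "Validate outputs",
        "6.3": "Scrutinize & explain",
        "6.4": "Apply disclosure control",
        "6.5": "Finalize outputs",
    }),
    "7": ("Disseminate", {
        "7.1": "Update output systems",
        "7.2": "Produce dissemination products",
        "7.3": "Manage release of dissemination products",
        "7.4": "Promote dissemination products",
        "7.5": "Manage user support",
    }),
    "8": ("Archive", {
        "8.1": "Define archive rules",
        "8.2": "Manage archive repository",
        "8.3": "Preserve data and associated metadata",
        "8.4": "Dispose of data & associated metadata",
    }),
}

# Only M1's location is given in the presentation; other tasks would be
# added by the unit that documents them.
TASK_LOCATION = {"M1": "4.3"}


def locate(code):
    """Return (phase name, sub-process name) for a code such as '4.3'."""
    phase_id = code.split(".", 1)[0]
    phase_name, subprocesses = GSBPM[phase_id]
    return phase_name, subprocesses[code]


def locate_task(task):
    """Return the GSBPM location of a task label such as 'M1'."""
    return locate(TASK_LOCATION[task])
```

An unknown code or task label raises `KeyError`, which keeps undocumented tasks visible instead of silently ignoring them.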

Process metadata: Example

Subprocess: 2.4. Sample frame & Design Methodology

Actions:

Action periodicity (e.g. monthly, yearly, ...):

Starting date:

Final date:

Required input:

Initial file:

Final file:

Software (standard and/or tailor-made):

Documentation, manual, handbook:

Unit in charge:

Collaborating units:

Where in GSBPM is it located?

• Monthly and yearly tasks: Aimed at the production of information (short-term and structural)

• Yearly tasks: Related to samples, improvement of the software tools, questionnaires, ...

• Non-periodical tasks: New base year, change of classification, methodological change

Timetables

Work flows and their documentation

Databases produced. Which ones are part of the corporate system?

Makes it easy to reuse

Process metadata: Example

Task: M1

Subprocess: 4.3 Run collection

Actions: At the end of the reference time (day t), questionnaires are sent to respondent units, which send them back before day t+7

Action periodicity (e.g. monthly, yearly, ...): Monthly

Starting date: t+1

Final date: t+7

Required input: Questionnaires with updated postal addresses pre-printed on them

Initial file: Not applicable (N/A)

Final file: N/A

Software (standard and/or tailor-made): Software tool for collection via web (ARCE), software tool for paper collection

Documentation, manual, handbook: Questionnaires, passwords, data collection handbook, validation rules, labels

Unit in charge: Collection units located in the regional delegations

Collaborating units: Data collection unit and Retail Trade Index unit
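
A filled-in sheet like the one for task M1 could be held as a structured record, which makes it searchable and lets calendar dates be derived from the offsets. The following is a minimal sketch in Python. The class name, the field names and the `window` helper are assumptions made for illustration, and reading t+1 and t+7 as day offsets from the end of the reference period is one possible interpretation of the slide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class ProcessAction:
    """One row of process metadata (hypothetical field names)."""
    task: str
    subprocess: str
    actions: str
    periodicity: str
    start_offset_days: int        # "t+1" is stored as 1
    end_offset_days: int          # "t+7" is stored as 7
    required_input: str
    initial_file: Optional[str]   # None stands for "N/A"
    final_file: Optional[str]
    software: List[str]
    documentation: List[str]
    unit_in_charge: str
    collaborating_units: List[str]

    def window(self, reference_end: date) -> Tuple[date, date]:
        """Return the start and end dates for a given reference day t."""
        return (reference_end + timedelta(days=self.start_offset_days),
                reference_end + timedelta(days=self.end_offset_days))


# Task M1, with the values taken from the example sheet.
M1 = ProcessAction(
    task="M1",
    subprocess="4.3 Run collection",
    actions=("Questionnaires are sent to respondent units, "
             "which send them back before day t+7"),
    periodicity="Monthly",
    start_offset_days=1,
    end_offset_days=7,
    required_input=("Questionnaires with updated postal addresses "
                    "pre-printed on them"),
    initial_file=None,
    final_file=None,
    software=["Software tool for collection via web (ARCE)",
              "Software tool for paper collection"],
    documentation=["Questionnaires", "Passwords", "Data collection handbook",
                   "Validation rules", "Labels"],
    unit_in_charge="Collection units located in the regional delegations",
    collaborating_units=["Data collection unit", "Retail Trade Index unit"],
)
```

For a reference period ending on 30 April 2013, `M1.window(date(2013, 4, 30))` gives a collection window from 1 May 2013 to 7 May 2013.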

Process metadata: Adjusted series

• Series published

• Method used

• Software used

• Aggregation

• Reviews

• Quality indicators

Conclusion

• MODEL

Consideration of the whole process

Relevance and need of linking metadata to data

• WORK METHOD

Access to software tools by means of user name and password

Collaboration of the different units

Dissemination inside and outside the INE

• SOFTWARE TOOLS

Improvement of searching tools

More user-friendly and flexible

Aimed at promoting reuse

Positive aspects:

It makes it easy to change from vertical to thematic information

It is good documentation

Well organised

Future

• Production of SIM

• Implementation of the reference metadata in the institution

• Dissemination of the metadata

• Inclusion of administrative sources

• Increase standardisation and efficiency

• Thank you for your attention

• Any questions?
