Pilot Census in Poland Some Quality Aspects Geneva, 7-9 July 2010 Janusz Dygaszewicz Central Statistical Office POLAND


Data processing infrastructure

[Figure: data processing infrastructure diagram. Labelled components: Registry 1, Registry 2, Registry n; XML and TXT files; Portal; CAxI; Questionnaires; ETL Tools; Operational Microdata Base; Golden Record; Statistical Files; Analytical Microdata Base; Metadata server; Metadata; SDMX.]

Census Quality

Key elements of census process in terms of census quality:
• Census planning - scope of census,
• Data sources,
• Data collecting,
• Data storing,
• Data processing,
• Development of census results,
• Dissemination of census results,
• Census Metadata System.

CENSUS PLANNING

Census Quality

Census planning
Quality aspects: relevance, accuracy, costs including the burden on respondents, information security

• Determining the data scope defined in Act including:
  • Compliance with needs of domestic and EU users,
  • Quality of data source,
  • Coherence and comparability of results from census 2011 and 2002,

DATA ACQUISITION

Data acquisition

[Figure: the data processing infrastructure diagram repeated, titled "Data acquisition".]

Data acquisition

Files format:
• Flat files,
• XML files,
• Local Databases XML files integration,
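As an illustration of handling the file formats listed above, the following minimal sketch reads a delimited flat file and an XML file into the same list-of-records shape. The delimiter, the tag name "person", the field names and the sample data are all assumptions made for the example; the slides do not describe the actual file layouts.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_flat(text, delimiter=";"):
    """Parse a delimited flat file (header row assumed) into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]


def read_xml(text, record_tag="person"):
    """Parse an XML document into a list of dicts, one per <record_tag> element."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]


# Hypothetical sample inputs, not taken from any real register.
flat_sample = "id;name\n1;Jan Kowalski\n"
xml_sample = "<register><person><id>2</id><name>Anna Nowak</name></person></register>"

records = read_flat(flat_sample) + read_xml(xml_sample)
```

Bringing both formats to one record structure early keeps later steps (validation, de-duplication) independent of how a given register delivers its data.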

Data acquisition - Portal

Census Quality - data acquisition

Data sources
Quality aspects: accuracy, timeliness and punctuality, comparability and coherence, costs including the burden on respondents, information security

• Assessment of data sources quality for census:
  • analyses of methodological compliance of concept definitions from registers with those adopted in statistics and in the UNECE and EUROSTAT Recommendations for the 2010 Censuses on Population and Housing,
  • developing methodology for compliance analyses,
  • constructing the IT system PiK for describing, comparing and assessing coherence level,

Census Quality - data acquisition

Registers
• developing methodology for assessing the quality: dimensions, quality indicators,
• evaluation and description of sources quality,
• MATRIX that represents the possibility of obtaining the values for the census from registers:
  • census variable compliance indicators (methodology compliance indicator),
  • register suitability indicators (population coverage indicator for data from the register),
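The matrix idea above can be sketched as follows. This is only an illustration: the slides do not define how the indicators are calculated, so the coverage formula (share of the target population found in a register), the 0-1 compliance scores, and all register and variable names are assumptions.

```python
def population_coverage(register_ids, target_ids):
    """Assumed definition: share of the target population present in the register."""
    if not target_ids:
        raise ValueError("target population is empty")
    return len(register_ids & target_ids) / len(target_ids)


def build_matrix(variables, coverage, compliance):
    """Rows are census variables, columns are registers.

    Each cell holds (compliance indicator, coverage indicator); a missing
    compliance score is treated as 0.0, i.e. the register cannot supply it.
    """
    return {
        var: {reg: (compliance.get((var, reg), 0.0), cov) for reg, cov in coverage.items()}
        for var in variables
    }


# Hypothetical identifiers and scores.
target = {"p1", "p2", "p3", "p4"}
registers = {
    "register_a": {"p1", "p2", "p3", "p9"},
    "register_b": {"p1", "p4"},
}
coverage = {name: population_coverage(ids, target) for name, ids in registers.items()}
compliance = {("sex", "register_a"): 1.0, ("education", "register_b"): 0.5}
matrix = build_matrix(["sex", "education"], coverage, compliance)
```

Keeping the two indicators side by side in each cell makes it easy to see whether a variable fails because of a definitional mismatch or because the register covers too little of the population.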

Census Quality - data acquisition

Data sets
• developing methodology for assessing the quality,
• evaluation and description of data sets quality,
• developing methodology for improving source data sets quality - rules for: standardization, normalization, de-duplication, editing, imputation, calibration

CENSUS FRAME PREPARATION

Census Frame preparation

Citizens, buildings and dwelling list preparing,
Citizens, buildings and dwelling list and statistical data integration,
Census Frame preparing.
Goal Frame Preparation,
Random Sample preparation,

Quality of Census Frame

Census frame pre-census revision - checking in field by enumerators
Census frame preparation - validation and updating in counties,

Enumerator tracking

Census Completeness Monitoring

TRANSFORMATION TO STATISTICAL REGISTER

Source data collection and preparation

[Figure: the data processing infrastructure diagram repeated, titled "Source data collection and preparation".]

Source data collection and preparation

Registers loading into data laboratory environment,
Denormalization,
Standardization,
Deduplication,
Validation,
Data completion,
Vocabulary validation and automatic correction,
Statistical files (register) generation,
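Three of the preparation steps listed above (standardization, deduplication and vocabulary validation) can be sketched as below. This is a simplified illustration under stated assumptions: the field names, the upper-casing rule, the exact-match duplicate key and the allowed vocabulary are all hypothetical, and real register cleaning would need richer rules.

```python
import unicodedata


def standardize(record):
    """Normalize Unicode, collapse whitespace and upper-case every text field."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = unicodedata.normalize("NFC", value)
            value = " ".join(value.split()).upper()
        out[key] = value
    return out


def deduplicate(records, key_fields):
    """Keep the first record for each combination of key field values."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def validate_vocabulary(record, vocabulary):
    """Return the names of fields whose value is not in the allowed vocabulary."""
    return [field for field, allowed in vocabulary.items() if record.get(field) not in allowed]


# Hypothetical raw register rows.
raw = [
    {"id": "1", "name": "  jan  kowalski ", "sex": "m"},
    {"id": "1", "name": "Jan Kowalski", "sex": "M"},
    {"id": "2", "name": "Anna Nowak", "sex": "x"},
]
clean = deduplicate([standardize(r) for r in raw], key_fields=("id", "name"))
errors = {r["id"]: validate_vocabulary(r, {"sex": {"M", "F"}}) for r in clean}
```

Standardizing before deduplicating matters here: the first two rows only become recognizable as duplicates once case and whitespace differences are removed.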

Census Quality - collection and preparation

Collecting data
Quality aspects: accuracy, costs including the burden on respondents, information security

• Collecting data from information systems
  • Central registers,
  • Distributed registers,

• format / file structure (XSD schemas),
• data transfer platform,
• application for encrypted data transfer,
• application for validation and data set control

Source data loading and correction

Data loading to Operational Microdatabase,
Validation,
Manual and automatic correction (cleaning),
Deduplication,
Variables calculating,

CAxI

[Figure: the data processing infrastructure diagram repeated, titled "CAxI".]

CAxI

• CAII - Computer Assisted Internet Interview,
• CAPI - Computer Assisted Personal Interview,
• CATI - Computer Assisted Telephone Interviewing.

Census Quality - data collection by electronic questionnaire

• Collecting data from respondents: CAII, CAPI, CATI;
• CAxI input validation:
  • Numerical data validation (answers within boundaries)
  • Cross question arithmetical validation
  • Hints and automatic answer completion
  • Dictionaries and drop down menus
• CAxI logical validation:
  • Answers determined by questions
  • Cross question logical validation
  • Data collection logical paths
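The kinds of checks listed above can be illustrated with a small sketch. The questions, the limits (age 0-120, employment asked only from age 15) and the dictionary of marital statuses are hypothetical examples chosen for the illustration, not rules taken from the actual census questionnaire.

```python
# Hypothetical answer dictionary for a drop-down question.
MARITAL_STATUS = {"single", "married", "widowed", "divorced"}


def validate_answers(answers):
    """Return a list of validation messages; an empty list means no errors found."""
    errors = []

    # Numerical validation: answer within boundaries.
    age = answers.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age: outside 0-120")

    # Cross-question arithmetical validation.
    adults = answers.get("adults", 0)
    children = answers.get("children", 0)
    if answers.get("household_size") != adults + children:
        errors.append("household_size: must equal adults + children")

    # Dictionary validation: answer must come from the allowed list.
    if answers.get("marital_status") not in MARITAL_STATUS:
        errors.append("marital_status: not in dictionary")

    # Logical path: the employment question is skipped for children.
    if isinstance(age, int) and age < 15 and answers.get("employed") is not None:
        errors.append("employed: not applicable under 15")

    return errors


ok = validate_answers({
    "age": 34, "household_size": 3, "adults": 2, "children": 1,
    "marital_status": "married", "employed": True,
})
bad = validate_answers({
    "age": 10, "household_size": 4, "adults": 2, "children": 1,
    "marital_status": "married", "employed": True,
})
```

Running such checks at entry time, rather than after loading, lets the respondent or interviewer correct the answer while the interview is still in progress.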

Census Quality

Data storing
Quality aspects: information security

• Data storing in Operational Microdata Base,
• Notification of the Operational Microdata Base for registration with the General Inspector for Protection of Personal Data,

GOLDEN RECORD

Golden Record generation

[Figure: the data processing infrastructure diagram repeated, titled "Golden Record generation".]

Export to Analytical Microdata Base

[Figure: the data processing infrastructure diagram repeated, titled "Export to Analytical Microdata Base".]

Golden Record generation

Integration with Census Frame and CAxI data,
Validation,
Correction,
Operational Imputation,
Transfer proper values to Golden Record,

[Figure: OMB Layers. Labels: Registers 1..n, CAxI, Golden Record.]
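One possible way to picture the transfer of values from the source layers into a Golden Record is a priority rule: for each census variable, take the first non-missing value from an ordered list of sources. The slides do not state how the "proper" value is selected, so the priority order, the layer names and the variables below are assumptions made purely for illustration.

```python
def build_golden_record(layers, priority, variables):
    """Pick each variable from the highest-priority layer that has a value.

    Returns the golden record and a provenance map recording which layer
    supplied each value. Variables found nowhere stay None, so a later
    imputation step can fill them.
    """
    golden = {}
    provenance = {}
    for var in variables:
        golden[var] = None
        provenance[var] = None
        for source in priority:
            value = layers.get(source, {}).get(var)
            if value is not None:
                golden[var] = value
                provenance[var] = source
                break
    return golden, provenance


# Hypothetical layers for one person.
layers = {
    "caxi": {"address": "Warszawa", "education": "higher"},
    "register_1": {"birth_year": 1975, "address": "Krakow"},
    "register_2": {"birth_year": 1976},
}
golden, provenance = build_golden_record(
    layers,
    priority=["caxi", "register_1", "register_2"],
    variables=["birth_year", "address", "education", "occupation"],
)
```

Recording provenance alongside the chosen value keeps the selection auditable, which supports the quality indicators the presentation calls for at each processing stage.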

Export to Analytical Microdata Base

Transition Tables Preparing,
Golden Records anonymisation,
Transfer to Analytical Microdatabase,

Census Quality

Data processing
Quality aspects: accuracy

• Developing quality indicators for data sets at each stage of data processing and the procedures for calculating their value,
• Developing procedures for bringing data from administrative sources to full compliance or minimum discrepancy with appropriate methodology adopted in statistics,
• Developing procedures for normalization, editing of data sets from the administrative systems, including the imputation of data (administrative data sets),
• Developing procedures for synchronization of data from administrative systems,
• Developing rules for linking data from different administrative systems,
• Developing rules for linking data from administrative systems with data from CAII, CAPI, CATI,
• Developing rules for calculation of Golden Record census variables,
• Developing rules for anonymisation of Golden Record census data.

ANALYTICAL MICRODATABASE

Analytical Microdata Base

[Figure: the data processing infrastructure diagram repeated, titled "Analytical Microdata Base".]

Analytical Microdata Base - process

[Figure: process diagram. Phases: Process data, Analyse, Disseminate, Archive. Steps under Process data: Load data and metadata; Integrate data; Classify and code data; Edit and validate data; Impute; Derive new variables; Weight; Aggregate; Create files. Overarching activities: Manage metainformation, Manage quality.]

Analytical Microdatabase

Functionality

Administration
Information Security Management
Data Processing
Information Analysis
Requirement and Product Management
Dissemination
Metadata
Quality Management

Census Quality

Development of census results
Quality aspects: relevance, accuracy, comparability and coherence

• Developing rules for missing data completion - imputation and calibration,
• Developing rules for creating derived objects - creation of new objects (households, families),
• Developing a model / method of data estimation with the use of the data from administrative systems and sample surveys,
• Developing rules for calculating data outputs.

DISSEMINATION

Census Quality - dissemination

Dissemination of census results
Quality aspects: relevance, timeliness and punctuality, accessibility and clarity, comparability and coherence, information security

• Designing Analytical Microdata Base features including compliance with users' needs, accessibility and clarity of census data.

METAINFORMATION MANAGEMENT

Metadata server

[Figure: the data processing infrastructure diagram repeated, titled "Metadata server".]

Metainformation management

[Figure: metainformation diagram. Labels: Definition, Business, Referential, Conceptual, Methodical, Quality, Structural, Technical, System, Postprocessing.]

Census Quality - metainformation

Census Metadata System
Quality aspects: accessibility and clarity

• Developing quality indicators at each stage of census and the procedures for calculating their value.
