Post on 22-Jul-2020

ENP Course Georgia

2018

Combining data from different sources and modes

Register system and some extra on Datafusion

Statistical unit centric model

Source: Holmberg, A Discussion on coverage in Administrative data, Journal of Official Statistics, Vol. 31, No. 3, 2015, pp. 515–525

Statistical Units in Practice

Domains with statistical registers

A system of statistical registers has:

1. Base or core registers – with important statistical sets

2. Other statistical sources – to access important variables

3. Linkage options between entities in different base registers, and between base registers and other statistical sources

4. Standard variables (fundamental variables)

5. Tailored statistical methods, quality assurance

6. Metadata

7. IT tools for processing and maintenance

8. Rules for protecting confidentiality and privacy
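
The linkage idea in item 3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the register contents, the identifiers (`P1`, `E1`, ...), and the helper `link_registers` are hypothetical, and a real register system would link on official unit identifiers under its own rules.

```python
# Hypothetical base registers, keyed by made-up unit identifiers.
population_register = {
    "P1": {"age": 34, "municipality": "Tbilisi"},
    "P2": {"age": 51, "municipality": "Batumi"},
    "P3": {"age": 27, "municipality": "Tbilisi"},
}
business_register = {
    "E1": {"industry": "Retail"},
    "E2": {"industry": "Construction"},
}

# Hypothetical linkage source: one row per job, connecting a person to an enterprise.
job_links = [
    {"person_id": "P1", "enterprise_id": "E1"},
    {"person_id": "P2", "enterprise_id": "E2"},
    {"person_id": "P3", "enterprise_id": "E9"},  # E9 is not in the business register
]


def link_registers(persons, enterprises, links):
    """Join the two base registers through the linkage rows.

    Returns (linked, unlinked): rows where both units were found, and
    rows where at least one identifier is missing from its register.
    """
    linked, unlinked = [], []
    for link in links:
        person = persons.get(link["person_id"])
        enterprise = enterprises.get(link["enterprise_id"])
        if person is None or enterprise is None:
            unlinked.append(link)
            continue
        # Combine the identifiers with the variables from both registers.
        linked.append({**link, **person, **enterprise})
    return linked, unlinked


linked_rows, unlinked_rows = link_registers(
    population_register, business_register, job_links
)
```

Keeping the unlinked rows rather than dropping them makes coverage problems visible, which is where quality assurance (item 5) would pick them up.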

Properties of a Base register

Phase 1 – Create an ‘augmented’ register

Phase 2 – Create Statistical Register (SR) for the targeted population

Typical situation of a statistical register

Source: Falorsi, Fortini, Di Zio, DIME&ITDG Steering Group, Hungary, 19 October 2016 ISTAT

Building a statistical register cont.

Phase 3 – Compute Estimates from the SR in Main Domains

Phase 4 – Quality feedback from maintenance and validation surveys

http://www.afdb.org/en/knowledge/publications/guidelines-for-building-statistical-business-registers-in-africa/

SBR Guidelines: Economic Units Model

Source: African Development Bank

Data Fusion or Statistical Matching

[Figure: one data set observing (X, Y) and one observing (Y, Z), combined into a single data set with (X, Y, Z)]

D’Orazio, M., Di Zio, M., and Scanu, M. (2006) Statistical Matching: Theory and Practice. Wiley and Sons, Chichester. http://www.wiley.com/go/matching

The microdata objective means creation of a synthetic dataset

Statistical matching aims at determining information on (X;Y;Z), or at least on the pairs of variables which are not observed jointly (X;Z)
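
Because X and Z are never observed together, their joint distribution cannot be identified from the two files without an extra assumption. As a sketch of one common choice, assuming X and Z are conditionally independent given the shared variable Y (the density notation f is introduced here for illustration and does not appear on the slides):

```latex
% Assumption: X and Z are conditionally independent given Y.
f(x, y, z) = f(y)\, f(x \mid y)\, f(z \mid y)

% The unobserved pair (X, Z) then follows by integrating out Y:
f(x, z) = \int f(x \mid y)\, f(z \mid y)\, f(y)\, dy
```

Each factor on the right can be estimated from one file alone: f(x | y) from the file with (X, Y), f(z | y) from the file with (Y, Z), and f(y) from either or both.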

Data Fusion or Statistical Matching

[Figure: File A holds X and Y, with Z to be filled in; File B, the donor, holds Y and Z]

Usually through imputation, using the Conditional Independence Assumption (CIA)

D’Orazio, M., Di Zio, M., and Scanu, M. (2006) Statistical Matching for Categorical Data: Displaying Uncertainty and Using Logical Constraints. Journal of Official Statistics, Vol. 22, No. 1, pp. 137–157.
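
As an illustration of imputation from a donor file, here is a minimal sketch of nearest-neighbour (distance hot deck) matching. This is one possible technique, not necessarily the one used in the course; the variable names, the values, and the helpers `nearest_donor` and `fuse` are all hypothetical.

```python
# Recipient file A observes (x, y); donor file B observes (y, z).
# All values are made up for illustration.
file_a = [
    {"x": "urban", "y": 23.0},
    {"x": "rural", "y": 41.5},
    {"x": "urban", "y": 60.0},
]
file_b = [
    {"y": 25.0, "z": 1200},
    {"y": 40.0, "z": 2100},
    {"y": 58.0, "z": 1800},
    {"y": 75.0, "z": 900},
]


def nearest_donor(recipient, donors):
    """Pick the donor whose shared variable y is closest to the recipient's."""
    return min(donors, key=lambda donor: abs(donor["y"] - recipient["y"]))


def fuse(recipients, donors):
    """Build a synthetic (x, y, z) file by copying z from the nearest donor."""
    fused = []
    for rec in recipients:
        donor = nearest_donor(rec, donors)
        fused.append({**rec, "z": donor["z"]})
    return fused


synthetic = fuse(file_a, file_b)
```

The donor is chosen using y only, so x plays no role in which z value is imputed. That is where the conditional independence assumption enters: any association between x and z in the synthetic file comes entirely through y.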

A ‘New’ Opportunity: Networks and the Semantic Web

• World Wide Web Consortium (W3C), RDF (Resource Description Framework), Neo4j, etc.

• Networks….

• Linked Open Data Initiative

• LOD and LOD2 (EU’s 7th Framework Programme)

The Semantic Web

Man-Made Technology Networks

Nature/Bio/Cognitive Networks

Information/Knowledge Networks

LESS STRUCTURED DATA, TRIPLETS AND RDFs

[Figure: graph of statistical units and the relations between them. Nodes: Person, Household, Dwelling, Location, Job, LKAU, Enterprise, Industry. Edge labels: Located in, Works at, Has job, Employee of, Member of, Lives in, Is owned by, Employees, Employs.]
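
To make the triple idea concrete, here is a minimal sketch that stores relations as (subject, predicate, object) tuples and follows them across units. The predicate names are taken from the edge labels in the figure, but which units each edge connects is an assumption made for this example, as are the identifiers and the helpers `objects` and `follow`. A real implementation would more likely use an RDF library or a graph database.

```python
# Each fact is a (subject, predicate, object) triple.
# The node pairings below are assumed for illustration only.
triples = [
    ("person:1", "member_of", "household:1"),
    ("household:1", "lives_in", "dwelling:1"),
    ("dwelling:1", "located_in", "location:1"),
    ("person:1", "has_job", "job:1"),
    ("job:1", "works_at", "lkau:1"),
    ("lkau:1", "is_owned_by", "enterprise:1"),
    ("lkau:1", "located_in", "location:2"),
]


def objects(subject, predicate, store):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in store if s == subject and p == predicate]


def follow(start, path, store):
    """Walk a chain of predicates from `start`, returning the end nodes."""
    frontier = [start]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(node, predicate, store)]
    return frontier


# Where does person:1 work, and where do they live?
workplace_location = follow("person:1", ["has_job", "works_at", "located_in"], triples)
home_location = follow("person:1", ["member_of", "lives_in", "located_in"], triples)
```

A new kind of relation is added by appending triples, with no change to a table schema, which is the sense in which such data is less structured.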