
On Implementing CSPA Specifications for Editing and Imputation Services

Donato Summa, Monica Scannapieco, Diego Zardetto, Istat, Italy

Istituto Nazionale di Statistica – ISTAT


The CSPA concept

• National Statistical Institutes (NSIs) produce official statistics with very similar goals

• Common activities are carried out independently, almost without relying on shared solutions

• Statistical organizations have attempted many times to share their processes, methodologies and software solutions, but integrating them has required significant work

Donato Summa, Work Session on Statistical Data Editing, Paris, 28-30/04/2014


• As part of the modernization effort in the field of official statistics, the High Level Group for the Modernization of Statistical Production and Services (HLG) has taken action to address these issues

• In particular, the HLG promotes the development and implementation of the Common Statistical Production Architecture (CSPA)


CSPA provides a template architecture for official statistics, describing:

• What the official statistical industry wants to achieve

• How the industry can achieve this, i.e. principles that guide how statistics are produced

• What the industry will have to do, i.e. comply with the CSPA


[Figure: diagram with the labels “editrules”, “CANCEIS”, “SCSTools”; “CSPA-compliant services”; “Platforms”]


The Error Localization service

• In the proof-of-concept (POC) initiative of the 2013 CSPA project, Istat took responsibility for developing the CSPA Error Localization service, acting as designer, builder and assembler

• It was decided to wrap the “localizeErrors” function contained in the “editrules” R package developed at Statistics Netherlands
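
To make the task concrete: error localization takes a record and a set of edits (consistency rules) and points to the fields that most plausibly need changing. The Python sketch below is a deliberately simplified illustration and is not the algorithm implemented by “localizeErrors”. It only looks for the smallest sets of variables that touch every violated edit, which is a necessary condition for a valid solution but not a sufficient one. The variable names and the edits are invented for the example.

```python
from itertools import combinations

# Hypothetical edits: (name, variables involved, predicate that must hold).
EDITS = [
    ("hours_nonneg", {"hours"}, lambda r: r["hours"] >= 0),
    ("pay_nonneg", {"pay"}, lambda r: r["pay"] >= 0),
    ("pay_le_cap", {"pay", "hours"}, lambda r: r["pay"] <= 100 * r["hours"]),
    ("bonus_le_pay", {"bonus", "pay"}, lambda r: r["bonus"] <= r["pay"]),
]


def violated_edits(record, edits=EDITS):
    """Return the edits whose predicate is false for this record."""
    return [edit for edit in edits if not edit[2](record)]


def candidate_error_sets(record, edits=EDITS):
    """Return the smallest variable sets that touch every violated edit.

    An empty list means that no edit is violated. This is a simplification:
    a full localization must also check that the chosen variables can
    actually be changed so that all edits hold at the same time.
    """
    failed = violated_edits(record, edits)
    if not failed:
        return []
    variables = sorted(set().union(*(used for _, used, _ in failed)))
    for size in range(1, len(variables) + 1):
        covers = [
            set(combo)
            for combo in combinations(variables, size)
            if all(set(combo) & used for _, used, _ in failed)
        ]
        if covers:
            return covers
    return []
```

For the invented record `{"hours": 10, "pay": 5000, "bonus": 6000}`, both `pay_le_cap` and `bonus_le_pay` fail, and `pay` is the only single variable involved in both, so it is the one candidate this sketch returns.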


• Data used for the test cases come from Istat’s Structure of Earnings Survey

• Input unit data sets involve 20 variables

• The rule set consists of 44 edits involving 17 numeric variables appearing in the unit data sets

• 3 different test cases with the same rule set:

– Data set 1: 1000 erroneous records

– Data set 2: 2000 exact records

– Data set 3: 3000 mixed records


• Technically, the service was implemented as a standalone Java application (executable jar file) that wraps the “localizeErrors” function of the “editrules” R package

• The jar can be called from a GUI or from the command line and is responsible for:

– Taking the input parameters from the user (or calling application)

– Invoking the execution of the R script in the R environment with the provided input parameters

– Returning the output parameters (output file generation)
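
The three responsibilities listed above can be sketched as follows. The actual service is a Java jar; this Python version only mirrors the same flow for illustration. The script name `localize_errors.R` and the order of its arguments are assumptions made for the example and are not taken from the Istat implementation.

```python
import subprocess
from pathlib import Path


def build_command(data_file, rules_file, output_file, script="localize_errors.R"):
    """Assemble the Rscript call; script name and argument order are hypothetical."""
    return ["Rscript", script, str(data_file), str(rules_file), str(output_file)]


def run_service(data_file, rules_file, output_file, script="localize_errors.R"):
    """Take the input parameters, run the R script, return the output file path."""
    # 1. Take the input parameters and check that the input files exist.
    for path in (data_file, rules_file):
        if not Path(path).is_file():
            raise FileNotFoundError(path)
    # 2. Invoke the R script in the R environment with the provided parameters.
    subprocess.run(
        build_command(data_file, rules_file, output_file, script),
        check=True,  # raise CalledProcessError if R exits with an error
    )
    # 3. Return the location of the generated output file.
    return Path(output_file)
```

Keeping `build_command` separate from `run_service` means the command line can be inspected or logged without R being installed, which also suits a GUI front end that wants to show the call before running it.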


• The Error Localization service wrapped by the Java program was then deployed on CORE, thus demonstrating the full compatibility of CSPA services with a specific NSI’s internal platform

• CORE (COmmon Reference Environment) is the Istat internal platform for executing statistical processes

[Figure: diagram with the labels “Tool”, “CSPA”, “Platform” and “Service”]


Conclusion

• Istat is currently involved in the 2014 CSPA Implementation project, with the role of developing the Error Correction service.

• The following activities are ongoing:

– studying how to extend such a service in order to perform a full editing and imputation process

– designing a CSPA specification, to be shared and agreed among the CSPA implementation project participants

– implementing the specifications through concrete CSPA services that wrap existing tools.


Thank you for your attention!
