The new multiple-source system for Italian Structural Business Statistics based on administrative and survey data Orietta Luzi, Ugo Guarnera, Paolo Righi Italian National Statistical Institute (Istat) Q2014 Conference - Vienna, 3-5 June, 2014


Page 2: Outline

- The new statistical information system «frame SBS»

- The sources of the «frame SBS»

- The estimation strategy

- Concluding remarks and future work

The new system for estimating structural economic statistics on enterprises based on the integrated use of survey data and administrative data – Istat, Rome, 11 January 2013

Page 3: The «frame SBS»

A multiple-source system for Italian Structural Business Statistics based on administrative and survey data.

The «frame SBS» is a statistical information system for estimating structural economic variables on business accounts (Turnover, Purchases of Goods and Services, Production Value, Value Added, …) for small and medium enterprises. It is based on the primary use of integrated administrative/fiscal data, “complemented” with survey data.

Until now, SBS for enterprises with fewer than 100 employees (~4.4 mln units in 2011) have been estimated from a direct sample survey (~100,000 units); administrative data were used as auxiliary information.

Page 4: The sources of the «frame SBS»

- Financial Statements (FS): statements of the corporate enterprises required to file a financial statement (about 800,000 enterprises each year)
- The Sector Studies survey (SS): a Fiscal Authority survey that each year covers about 3.5 mln enterprises with a turnover greater than 30,000 euros and lower than 7.5 mln euros, belonging to many economic activity sectors
- The Tax Return Data (Unico model), based on a unified model of tax declarations by legal form, and IRAP, the Italian regional tax on productive activities
- The Business Register (BR): used as population list and as auxiliary source of information
- The Social Security Data (SSD): firm-level and employee-level data on wages and labor cost; auxiliary source of information

Page 5: The sources of the «frame SBS»

[Diagram: unit-level layout of the «frame SBS». Rows: the N (~4.4 mln) SME units, some of which are SME Survey respondents. Column labels: Units ID, Ateco, N Emp, Turn (BR); N Emp, PC, WS, WH, SC (Social Security Data, SSD); target variables by source (Y1i, Y2i, …, Yki for sources i = 1, 2, 3, and Y1S, Y2S, …, YpS for the SME Survey). Coverage of SMEs by source: Financial Statements (~16%), Sector Studies Survey (~80%), Tax Returns Data (UNICO, IRAP) (~97%), not covered (~4%).]

Page 6: The estimation strategy

- Only the survey respondents provide information (YjS) on all the target variables Yj* (j = 1, …, p), based on the SBS Regulation definitions
- Information on a target variable Yj*, say Yji, may be available in one or more sources i, on either disjoint or overlapping sub-populations
- Two main steps for each source i and variable j:
  1) harmonization of the Yji definition with the one given by the SBS Regulation for the corresponding Yj*
  2) quality evaluation of the harmonized Yji, based on comparison with the corresponding YjS
- Only some harmonized Yji are considered reliable enough in terms of reported values (main economic aggregates). In case of overlap, sources are prioritized
- For the other target variables (components of the main economic aggregates), the only reliable information is that provided by the survey respondents
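The quality-evaluation step (comparing a harmonized source variable Yji with the survey value YjS on the units present in both) could be sketched as follows. This is only an illustration: the relative-difference measure, the `tolerance` and `min_share` thresholds, and all names and figures are assumptions, not the criteria actually used at Istat.

```python
def relative_differences(admin, survey):
    """Relative difference (admin - survey) / survey for units found in both sources."""
    diffs = {}
    for unit_id, y_survey in survey.items():
        # Only linked units with a non-zero survey value can be compared this way.
        if unit_id in admin and y_survey != 0:
            diffs[unit_id] = (admin[unit_id] - y_survey) / y_survey
    return diffs


def is_reliable(admin, survey, tolerance=0.05, min_share=0.9):
    """Hypothetical rule: the source variable is 'reliable' when at least
    min_share of the linked units differ from the survey by <= tolerance."""
    diffs = relative_differences(admin, survey)
    if not diffs:
        return False  # no linked units, nothing can be concluded
    close = sum(1 for d in diffs.values() if abs(d) <= tolerance)
    return close / len(diffs) >= min_share


# Toy data (invented): unit id -> reported value of one target variable.
survey_y = {1: 100.0, 2: 200.0, 3: 50.0, 4: 80.0}
admin_close = {1: 101.0, 2: 198.0, 3: 50.0, 5: 70.0}  # agrees with the survey
admin_far = {1: 150.0, 2: 200.0, 3: 10.0}             # often disagrees

print(is_reliable(admin_close, survey_y))
print(is_reliable(admin_far, survey_y))
```

In this toy run the first source agrees with the survey on every linked unit and passes, while the second fails; in practice the comparison would presumably be carried out by economic activity and size class.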

Page 7: Main economic aggregates

Section       Label  Target variable                                                    % missing data
Revenues      Y1     Income from sales and services (Turnover)                                     2.7
Revenues      Y2     Changes in stock of finished and semi-finished products                      58.3
Revenues      Y3     Changes in contract work in progress                                          8.2
Revenues      Y4     Changes in internal work capitalized under fixed assets                       4.7
Revenues      Y5     Other income and earnings (neither financial, nor extraordinary)              8.2
Costs         Y6     Purchases of goods                                                           19.2
Costs         Y7     Purchases of services                                                        19.2
Costs         Y8     Use of third party assets                                                    19.8
Costs         Y9     Changes in stocks of raw materials and for resale                            58.3
Costs         Y10    Other operating charges                                                      19.8
Costs         PC     Personnel Costs
Derived Var.  CS     Total Changes of Stocks                                                       6.7
Derived Var.  GS     Purchases of Goods and Services                                              15.6
Derived Var.  IC     Intermediate Costs                                                           15.6
Derived Var.  VA     Value Added

Page 8: Coverage rate (%) of the SME population by source and some main economic aggregates

Source         Number Units  Number Employees  Revenues  Value Added
FS                     16.1              38.2      66.2         54.1
SS                     64.0              49.2      24.5         36.4
Unico                  16.2               8.3       5.5          6.1
Total covered          96.3              95.7      96.2         96.6
Not covered             3.7               4.3       3.7          3.4
Total                 100.0             100.0     100.0        100.0
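A coverage table of this kind can be derived by assigning each unit to a single source and then summing a size measure by source. The sketch below assumes the priority order FS, then SS, then Unico (suggested by the row order, but not stated explicitly), and uses invented units; `assign_source` and `coverage_rates` are hypothetical helper names.

```python
# Assumed priority order when a unit appears in more than one source.
PRIORITY = ["FS", "SS", "Unico"]


def assign_source(available):
    """Return the highest-priority source available for a unit."""
    for source in PRIORITY:
        if source in available:
            return source
    return "Not covered"


def coverage_rates(units, measure):
    """Percentage of the total of `measure` accounted for by each assigned source."""
    total = sum(measure(u) for u in units)
    sums = {}
    for u in units:
        source = assign_source(u["sources"])
        sums[source] = sums.get(source, 0.0) + measure(u)
    return {source: 100.0 * value / total for source, value in sums.items()}


# Toy population (invented): sources in which each unit is found, and employees.
units = [
    {"id": 1, "sources": {"FS", "Unico"}, "employees": 40},
    {"id": 2, "sources": {"SS", "Unico"}, "employees": 5},
    {"id": 3, "sources": {"Unico"}, "employees": 3},
    {"id": 4, "sources": set(), "employees": 2},
]

by_units = coverage_rates(units, lambda u: 1)
by_employees = coverage_rates(units, lambda u: u["employees"])
print(by_units)
print(by_employees)
```

Because overlapping units are counted only under their highest-priority source, the rates by source add up to the total coverage, which is why a source's share here can be lower than its raw coverage of the population.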

Page 9: A “hybrid” estimation strategy

• Main economic aggregates: model-based (predictive) approach
  • “Mixed” unit-level mass imputation to compensate for units not covered and for missing variable values

• Components of the main economic aggregates: design-based/model-assisted estimation approach
  • Projection estimator to obtain domain estimates consistent with the estimates of the main economic aggregates

Page 10: Main economic aggregates: mass imputation

• Direct use of administrative and fiscal data

• Choice of methods driven by the relations among variables and the characteristics of their distributions (Predictive Mean Matching, Nearest Neighbor Donor, two-step logistic + regression models, deterministic imputation)

• Avoid inconsistencies between estimates at any domain level

Source  Y1  Y2  Y3  Y4  Y5  Y6  Y7  Y8  Y9  Y10  PC  GS  CS  IC  Coverage rate (%)
FS      X   X   X   X   X   X   X   X   X   X    X   X   X   X                16.1
SS-F    X   ?   X   X   X   X   X   X   ?   X    X   X   X   X                51.1
SS-G    X   ?   X   X   X   ?   ?   ?   X   X    X   ?   X   X                13.0
Unico1  X   X   X   X   X   ?   ?   ?   X   ?    X   X   X   ?                 0.8
Unico2  X   X   X   X   X   ?   ?   X   X   ?    X   X   X   ?                 0.0
Unico3  X   ?   X   X   X   ?   ?   ?   ?   ?    X   X   X   ?                 2.7
Unico4  X   ?   X   X   X   X   X   ?   ?   ?    X   X   X   ?                 0.8
Unico5  X   X   X   X   X   ?   ?   ?   X   X    X   X   ?   X                10.1
Unico6  X   ?   ?   ?   ?   ?   ?   ?   ?   ?    X   ?   ?   ?                 0.2
Unico7  X   ?   ?   ?   ?   ?   ?   ?   ?   ?    X   ?   ?   ?                 0.3
Unico8  X   ?   ?   ?   ?   ?   ?   ?   ?   ?    X   ?   ?   ?                 0.5
Unico9  X   X   X   X   X   X   X   X   X   X    X   X   X   X                 0.1
NA      ?   ?   ?   ?   ?   ?   ?   ?   ?   ?    X   ?   ?   ?                 3.7
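Of the methods listed, Nearest Neighbor Donor imputation is the simplest to illustrate. The sketch below is a generic version, not the implementation used for the «frame SBS»: the matching variable (turnover), the imputation class (Ateco code), the absolute-difference distance and all figures are assumptions made for the example.

```python
def nearest_neighbor_impute(units, target, match_var, class_var="ateco"):
    """Fill missing `target` values with the value of the closest donor.

    Donors are the units with an observed target; the closest donor is the one
    with the smallest absolute difference on `match_var`, searched first within
    the same imputation class and, if that class has no donor, among all donors.
    """
    donors = [u for u in units if u[target] is not None]
    if not donors:
        raise ValueError("no donor with an observed value for " + target)
    completed = []
    for u in units:
        filled = dict(u)  # leave the input records untouched
        if u[target] is None:
            same_class = [d for d in donors if d[class_var] == u[class_var]]
            pool = same_class or donors
            donor = min(pool, key=lambda d: abs(d[match_var] - u[match_var]))
            filled[target] = donor[target]
            filled["donor_id"] = donor["id"]  # keep track of the donor used
        completed.append(filled)
    return completed


# Toy data (invented): turnover observed for everyone, purchases missing for two units.
units = [
    {"id": 1, "ateco": "C", "turnover": 100.0, "purchases": 60.0},
    {"id": 2, "ateco": "C", "turnover": 500.0, "purchases": 310.0},
    {"id": 3, "ateco": "G", "turnover": 120.0, "purchases": 95.0},
    {"id": 4, "ateco": "C", "turnover": 130.0, "purchases": None},
    {"id": 5, "ateco": "G", "turnover": 400.0, "purchases": None},
]

completed = nearest_neighbor_impute(units, "purchases", "turnover")
for u in completed:
    print(u["id"], u["purchases"], u.get("donor_id"))
```

Copying a whole donor record, rather than predicting each variable separately, is one way to keep the imputed values of a unit mutually coherent, which relates to the goal of avoiding inconsistencies between estimates.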

Page 11: Components of the main economic aggregates: the projection estimator (*)

«Synthetic imputation» of the variable values not observed in the sample, based on weighted regression models estimated on the SME survey respondents (~40,000 enterprises)

• Auxiliary variables: main economic aggregates, structural information (BR)

• Consistency
  • Among components and their reference aggregates
  • Between estimates at the planned SBS+SEC estimation levels

• Approximately unbiased estimates at the level of the model estimation domain

• Trade-off between bias (high detail level) and variance (low sample size) of the parameter estimates

(*) Kim, J. K., Rao, J. N. K. (2011). Combining data from two independent surveys: a model-assisted approach. Biometrika. No.8, pp. 1–16.
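The idea of the projection estimator can be sketched as follows: a weighted regression of a component on an auxiliary variable is fitted on the survey respondents, the fitted model is used to predict the component for every unit in the frame, and the predictions are summed by domain. The sketch assumes a single auxiliary variable `x` (for instance a main economic aggregate), sampling weights `w`, and a frame that contains `x` for every unit; the models, domains and consistency constraints actually used are not described in the slides.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def projection_totals(frame, sample, domain_key="domain"):
    """Domain totals of the predicted ('synthetic') values over the whole frame."""
    intercept, slope = weighted_linear_fit(
        [s["x"] for s in sample], [s["y"] for s in sample], [s["w"] for s in sample]
    )
    totals = {}
    for u in frame:
        predicted = intercept + slope * u["x"]
        totals[u[domain_key]] = totals.get(u[domain_key], 0.0) + predicted
    return totals


# Toy survey respondents (invented): auxiliary x, component y, sampling weight w.
sample = [
    {"x": 100.0, "y": 50.0, "w": 10.0},
    {"x": 200.0, "y": 100.0, "w": 20.0},
    {"x": 400.0, "y": 200.0, "w": 5.0},
]
# Toy frame (invented): x is known for every unit, y is not.
frame = [
    {"x": 100.0, "domain": "A"},
    {"x": 300.0, "domain": "A"},
    {"x": 200.0, "domain": "B"},
]

totals = projection_totals(frame, sample)
print(totals)
```

Because the model includes an intercept, the weighted residuals sum to zero over the respondents, which is what makes the projected total approximately unbiased at the level where the model is fitted; finer domains borrow the same coefficients, hence the bias/variance trade-off noted above.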

Page 12: CVs for some components of main aggregates (year 2011)

Turnover
  C11101  C11102  C11103  C11104  C11105  C11106  C11107
  1.67%   1.05%   7.22%   9.81%   6.75%   13.62%  1.69%

Purchases of goods
  C12101  C12102  C12103
  1.81%   9.36%   1.10%

Purchases of services (1)
  C12201  C12202  C12203  C12205  C12206  C12207  C12208  C12209
  3.7%    4.7%    5.7%    5.3%    5.1%    11.6%   3.5%    6.1%

Purchases of services (2)
  C12210  C12211  C12212  C12213  C12214  C12245  C12246  C12247
  3.00%   9.55%   5.82%   2.45%   3.06%   2.98%   9.44%   2.15%

Use of third party assets
  C12301  C12302  C12304
  1.20%   2.44%   2.90%

Other operating charges
  C12903  C12905
  1.15%   6.13%

Page 13: Some results

Survey-based vs frame-based estimates on SMEs for the main economic aggregates, by size class (year 2011)

Size class  Turnover  Value Added  Labor Cost  Gross Operating Margin
0-9              6.4         -2.2         3.7                    -4.9
10-19            9.9          4.1         1.4                     9.4
20-49           10.2          4.9         0.8                    14.5
50-99            6.5         -0.6        -1.6                     1.7
Total            4.5          0.2         0.8                    -0.5

Page 14: Concluding remarks…

• Overcome some limitations of the current statistical production strategy (costs, burden, accuracy). Expected increase in the consistency of SBS over time

• Higher levels of consistency between annual statistics on enterprises and National Accounts, starting from the 2011 Benchmark

… and future work

• Managing unit identification problems over time (splits, mergers, …)

• Assessing the accuracy of the estimates for the main economic aggregates

• Improving inferences for some components of the main economic aggregates in specific economic sectors

• Consistent estimation w.r.t. the frame information in the different domains of statistics on enterprises (R&D, ICT, etc.)