The new multiple-source system for Italian Structural Business Statistics based on
administrative and survey data
Orietta Luzi, Ugo Guarnera, Paolo Righi
Italian National Statistical Institute (Istat)
Q2014 Conference - Vienna, 3-5 June, 2014
Outline
- The new statistical information system «frame SBS»
- The sources of the «frame SBS»
- The estimation strategy
- Concluding remarks and future work
The new system for estimating structural economic statistics on enterprises based on the integrated use of survey data and administrative data – Istat, Rome, 11 January 2013
The «frame SBS»: a multiple-source system for Italian Structural Business Statistics based on administrative and survey data
The «frame SBS» is a statistical information system for estimating structural economic variables on business accounts (Turnover, Purchases of Goods and Services, Production Value, Value Added, …) for small and medium enterprises. It is based on the primary use of integrated administrative/fiscal data, “complemented” with survey data.
Until now, SBS for enterprises with fewer than 100 employees (~4.4 million units in 2011) have been estimated from a direct sample survey (~100,000 units), with administrative data used as auxiliary information.
The sources of the «frame SBS»
- Financial Statements (FS) of corporate enterprises required to file a financial statement (about 800,000 enterprises each year)
- The Sector Studies survey (SS), a Fiscal Authority survey that each year includes about 3.5 million enterprises with a turnover above 30,000 euros and below 7.5 million euros, belonging to many economic activity sectors
- The Tax Return Data (Unico model), based on a unified model of tax declarations by legal form, and IRAP, the Italian regional tax on productive activities
- The Business Register (BR), used as the population list and as an auxiliary source of information
- The Social Security Data (SSD), which include firm-level data and employee data on wages and labor cost; used as an auxiliary source of information
The sources of the «frame SBS»
[Diagram: the N (~4.4 million) SME units, listed by row, against the information available from each source. Column labels shown: ID, Ateco, N Emp, Turn, N Emp, PC, WS, WH, SC (BR and Social Security Data), followed by the variables Y1, Y2, …, Yk reported by each administrative source and the variables Y1, Y2, …, Yp reported by the SME Survey for the sampled units.]
- Financial Statements: ~16% of SMEs
- Sector Studies Survey: ~80% of SMEs
- Tax Returns Data (UNICO, IRAP): ~97% of SMEs
- Not covered: ~4%
The estimation strategy
Only the survey respondents provide information (YjS) on all the target variables Yj* (j = 1, …, p), as defined by the SBS Regulation.
Information on a target variable Yj*, say Yji, may be available in one or more sources i, on either disjoint or overlapping sub-populations.
Two main steps for each source i and variable j:
1) harmonization of the Yji definition with the one given by the SBS Regulation for the corresponding Yj*
2) quality evaluation of the harmonized Yji, based on comparison with the corresponding YjS
Only some harmonized Yji are considered reliable enough in terms of reported values (main economic aggregates). Where sources overlap, they are prioritized.
For the other target variables (components of the main economic aggregates), the only reliable information is that provided by the survey respondents.
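The prioritization of overlapping sources can be illustrated with a minimal sketch. The slides do not state the priority order, so the order used here (FS, then SS, then Unico) and all unit values are assumptions made only for illustration.

```python
from typing import Dict, Optional, Tuple

# ASSUMPTION: illustrative priority order; the slides only say that
# overlapping sources "are prioritized", not in which order.
PRIORITY = ["FS", "SS", "Unico"]


def pick_value(
    values_by_source: Dict[str, Optional[float]]
) -> Tuple[Optional[str], Optional[float]]:
    """Return (source, value) from the highest-priority source reporting a value."""
    for source in PRIORITY:
        value = values_by_source.get(source)
        if value is not None:
            return source, value
    # No source covers this unit: the value is left for imputation.
    return None, None


# Hypothetical harmonized turnover values for three enterprises.
units = {
    "A": {"FS": 120.0, "SS": 118.5},   # covered by two sources
    "B": {"SS": 40.2, "Unico": 41.0},  # covered by two sources
    "C": {},                           # not covered by any source
}
chosen = {uid: pick_value(vals) for uid, vals in units.items()}
```

Unit C ends up with no value, which is the situation the imputation step of the strategy has to handle.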
Main economic aggregates

| Section | Label | Target variable | % missing data |
|---|---|---|---|
| Revenues | Y1 | Income from sales and services (Turnover) | 2.7 |
| Revenues | Y2 | Changes in stock of finished and semi-finished products | 58.3 |
| Revenues | Y3 | Changes in contract work in progress | 8.2 |
| Revenues | Y4 | Changes in internal work capitalized under fixed assets | 4.7 |
| Revenues | Y5 | Other income and earnings (neither financial, nor extraordinary) | 8.2 |
| Costs | Y6 | Purchases of goods | 19.2 |
| Costs | Y7 | Purchases of services | 19.2 |
| Costs | Y8 | Use of third party assets | 19.8 |
| Costs | Y9 | Changes in stocks of raw materials and for resale | 58.3 |
| Costs | Y10 | Other operating charges | 19.8 |
| Costs | PC | Personnel Costs | |
| Derived Var. | CS | Total Changes of Stocks | 6.7 |
| Derived Var. | GS | Purchases of Goods and Services | 15.6 |
| Derived Var. | IC | Intermediate Costs | 15.6 |
| Derived Var. | VA | Value Added | |
Coverage rate (%) of the SME population by source and some main economic aggregates

| Source | Number Units | Number Employees | Revenues | Value Added |
|---|---|---|---|---|
| FS | 16.1 | 38.2 | 66.2 | 54.1 |
| SS | 64.0 | 49.2 | 24.5 | 36.4 |
| Unico | 16.2 | 8.3 | 5.5 | 6.1 |
| Total covered | 96.3 | 95.7 | 96.2 | 96.6 |
| Not covered | 3.7 | 4.3 | 3.7 | 3.4 |
| Total | 100.0 | 100.0 | 100.0 | 100.0 |
A “hybrid” estimation strategy
- Main economic aggregates: model-based (predictive) approach
  - “Mixed” unit-level mass imputation to compensate for units and variable values not covered by the sources
- Components of the main economic aggregates: design-based/model-assisted estimation approach
  - Projection estimator to obtain domain estimates consistent with the estimates of the main economic aggregates
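Main economic aggregates: Mass imputation
• Direct use of administrative and fiscal data
• Choice of methods based on the relations among variables and the characteristics of their distributions (Predictive Mean Matching, Nearest Neighbor Donor, two-step logistic + regression models, deterministic imputation)
• Avoids inconsistencies between estimates at any domain level

| Source | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 | Y7 | Y8 | Y9 | Y10 | PC | GS | CS | IC | Coverage rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FS | X | X | X | X | X | X | X | X | X | X | X | X | X | X | 16.1 |
| SS-F | X | ? | X | X | X | X | X | X | ? | X | X | X | X | X | 51.1 |
| SS-G | X | ? | X | X | X | ? | ? | ? | X | X | X | ? | X | X | 13.0 |
| Unico1 | X | X | X | X | X | ? | ? | ? | X | ? | X | X | X | ? | 0.8 |
| Unico2 | X | X | X | X | X | ? | ? | X | X | ? | X | X | X | ? | 0.0 |
| Unico3 | X | ? | X | X | X | ? | ? | ? | ? | ? | X | X | X | ? | 2.7 |
| Unico4 | X | ? | X | X | X | X | X | ? | ? | ? | X | X | X | ? | 0.8 |
| Unico5 | X | X | X | X | X | ? | ? | ? | X | X | X | X | ? | X | 10.1 |
| Unico6 | X | ? | ? | ? | ? | ? | ? | ? | ? | ? | X | ? | ? | ? | 0.2 |
| Unico7 | X | ? | ? | ? | ? | ? | ? | ? | ? | ? | X | ? | ? | ? | 0.3 |
| Unico8 | X | ? | ? | ? | ? | ? | ? | ? | ? | ? | X | ? | ? | ? | 0.5 |
| Unico9 | X | X | X | X | X | X | X | X | X | X | X | X | X | X | 0.1 |
| NA | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | X | ? | ? | ? | 3.7 |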
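Of the imputation methods listed for this step, Predictive Mean Matching can be sketched as follows. This is a simplified, deterministic variant with a single predictor and the single closest donor; the slides do not describe the models actually used, and production implementations typically draw at random among several close donors. All function names and figures below are hypothetical.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx  # assumes x is not constant
    return my - b * mx, b


def pmm_impute(x_obs: List[float], y_obs: List[float],
               x_mis: List[float]) -> List[float]:
    """Impute y for units with missing y by borrowing the observed y of the
    donor whose predicted mean is closest to the recipient's predicted mean."""
    a, b = fit_line(x_obs, y_obs)
    pred_obs = [a + b * xi for xi in x_obs]
    imputed = []
    for xi in x_mis:
        target = a + b * xi
        donor = min(range(len(pred_obs)),
                    key=lambda k: abs(pred_obs[k] - target))
        imputed.append(y_obs[donor])  # an observed value, not a model prediction
    return imputed


# Hypothetical data: x = turnover (available for all units),
# y = purchases of goods (missing for the last two units).
x_obs = [10.0, 20.0, 30.0, 40.0]
y_obs = [6.0, 11.0, 19.0, 23.0]
x_mis = [28.0, 12.0]
imputed = pmm_impute(x_obs, y_obs, x_mis)
```

Because the imputed values are always values observed on a donor, the imputed data stay within the range of plausible reported figures, which is one reason donor methods are often preferred for skewed economic variables.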
Components of the main economic aggregates: the projection estimator (*)
«Synthetic imputation» of variable values not observed in the sample, based on weighted regression models estimated on the SME survey respondents (~40,000 enterprises)
• Auxiliary variables: main economic aggregates, structural information (BR)
• Consistency
  • among components and their reference aggregates
  • between estimates at the planned SBS+SEC estimation levels
• Approximately unbiased estimates at the level of the model estimation domain
• Trade-off between bias (high detail level) and variance (low sample size) of parameter estimates
(*) Kim, J. K., Rao, J. N. K. (2011). Combining data from two independent surveys: a model-assisted approach. Biometrika. No.8, pp. 1–16.
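The idea behind the projection estimator can be sketched in a few lines: fit a weighted regression on the survey respondents, predict the component for every unit of the frame, and sum the predictions by domain. This is a minimal illustration with one auxiliary variable, not the estimator as implemented at Istat; the function names, weights, and data are hypothetical, and the consistency constraints mentioned above are not enforced here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def weighted_fit(x: List[float], y: List[float],
                 w: List[float]) -> Tuple[float, float]:
    """Weighted least squares for y = a + b*x with survey weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx  # assumes x is not constant
    return my - b * mx, b


def projected_totals(frame: List[Tuple[str, float]],
                     a: float, b: float) -> Dict[str, float]:
    """Sum the synthetic values a + b*x over all frame units, by domain."""
    totals: Dict[str, float] = defaultdict(float)
    for domain, x in frame:
        totals[domain] += a + b * x
    return dict(totals)


# Hypothetical survey respondents: x = a main aggregate (e.g. turnover),
# y = one of its components, w = survey weights.
sx = [10.0, 20.0, 30.0, 40.0]
sy = [4.0, 7.0, 10.0, 13.0]
sw = [50.0, 40.0, 30.0, 20.0]
a, b = weighted_fit(sx, sy, sw)

# Hypothetical frame: (domain, x) for every unit, sampled or not.
frame = [("north", 10.0), ("north", 50.0), ("south", 20.0)]
totals = projected_totals(frame, a, b)
```

Fitting one model per estimation domain reduces bias but leaves fewer respondents per model, which is the bias/variance trade-off noted in the last bullet.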
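CVs for some components of main aggregates (year 2011)

| Aggregate | Component | CV |
|---|---|---|
| Turnover | C11101 | 1.67% |
| Turnover | C11102 | 1.05% |
| Turnover | C11103 | 7.22% |
| Turnover | C11104 | 9.81% |
| Turnover | C11105 | 6.75% |
| Turnover | C11106 | 13.62% |
| Turnover | C11107 | 1.69% |
| Purchases of goods | C12101 | 1.81% |
| Purchases of goods | C12102 | 9.36% |
| Purchases of goods | C12103 | 1.10% |
| Purchases of services | C12201 | 3.7% |
| Purchases of services | C12202 | 4.7% |
| Purchases of services | C12203 | 5.7% |
| Purchases of services | C12205 | 5.3% |
| Purchases of services | C12206 | 5.1% |
| Purchases of services | C12207 | 11.6% |
| Purchases of services | C12208 | 3.5% |
| Purchases of services | C12209 | 6.1% |
| Purchases of services | C12210 | 3.00% |
| Purchases of services | C12211 | 9.55% |
| Purchases of services | C12212 | 5.82% |
| Purchases of services | C12213 | 2.45% |
| Purchases of services | C12214 | 3.06% |
| Purchases of services | C12245 | 2.98% |
| Purchases of services | C12246 | 9.44% |
| Purchases of services | C12247 | 2.15% |
| Use of third party assets | C12301 | 1.20% |
| Use of third party assets | C12302 | 2.44% |
| Use of third party assets | C12304 | 2.90% |
| Other operating charges | C12903 | 1.15% |
| Other operating charges | C12905 | 6.13% |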
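For readers less familiar with the measure, a coefficient of variation (CV) is the standard error of an estimate divided by the estimate itself, usually expressed as a percentage. The short sketch below shows the calculation; the figures are hypothetical and are not taken from the table above.

```python
def cv_percent(estimate: float, std_error: float) -> float:
    """Coefficient of variation of an estimate, as a percentage."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return 100.0 * std_error / abs(estimate)


# Hypothetical estimated total and its standard error.
cv = cv_percent(estimate=2000.0, std_error=33.4)
```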
Some results: survey-based vs frame-based estimates on SMEs by main economic aggregates, by size class (year 2011)

| Size class | Turnover | Value Added | Labor Cost | Gross Operating Margin |
|---|---|---|---|---|
| 0-9 | 6.4 | -2.2 | 3.7 | -4.9 |
| 10-19 | 9.9 | 4.1 | 1.4 | 9.4 |
| 20-49 | 10.2 | 4.9 | 0.8 | 14.5 |
| 50-99 | 6.5 | -0.6 | -1.6 | 1.7 |
| Total | 4.5 | 0.2 | 0.8 | -0.5 |
Concluding remarks…
• Overcomes some limitations of the current statistical production strategy (costs, burden, accuracy); SBS consistency over time is expected to increase
• Higher levels of consistency between annual statistics on enterprises and National Accounts, starting from the 2011 Benchmark
… and future work
• Managing unit identification problems over time (splits, mergers, …)
• Assessing the accuracy of estimates for the main economic aggregates
• Improving inferences for some components of the main economic aggregates in specific economic sectors
• Consistent estimation with respect to the frame information in the different domains of statistics on enterprises (R&D, ICT, etc.)