Reliability of estimates in socio-demographic groups with small samples

Reliability of estimates in socio-demographic groups with small samples. D. Buono, Statistical Office of the European Union. 19 August 2016, SAE, Maastricht. All opinions expressed are those of the author.

Transcript of Reliability of estimates in socio-demographic groups with small samples

Page 1

Reliability of estimates in socio-demographic groups with small samples

D. Buono

Statistical Office of the European Union

19 August 2016, SAE, Maastricht

All opinions expressed are those of the author

Page 2

Facts and figures about Eurostat

• About 800 people with 28 different nationalities

• Small central methodology team

• TS, Econometrics, SDC, research & EA

• Plus domain methodologists networking

• Statistical office, but not an independent authority: a Directorate-General of the European Commission

• Subsidiarity principle!

Page 3

Eurostat core business

• Euro-zone (19) & EU (28) aggregates

• harmonization, best practices, guidelines, trainings & international cooperation

Page 4

Why interested in SAE?

• European regional policies

• Different sizes of Member States, primary data providers

• According to the EU 2011 Population Census there are 79,652,380 residents in DE and 512,353 in LU!!!

• Some dilemmas:

• How big is a small area?

• Can SAE help meet users' demand for data breakdowns?

Page 5

Outline

• Reliability of indicators

• At-risk-of-poverty indicators

• SAE techniques for Official Statistics

• Application for 2 EU countries

• Learnings and open questions

• ADS and EU research funds for SAE expertise

Page 6

Some notation

• U – finite population of size N

• D – number of socio-demographic groups in the target population

• s – sample of size n

• s_d – sub-sample from domain d, of size n_d

• r – non-sampled elements, of size N - n

• r_d – non-sampled elements from domain d, of size N_d - n_d

• y – target variable

• X – vector of auxiliary information

Page 7

Indicator of interest: ARPT
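
A minimal sketch of the indicator, assuming ARPT denotes the at-risk-of-poverty threshold in the usual Eurostat sense (60% of the weighted median equivalised disposable income) and that the at-risk-of-poverty rate is the weighted share of persons strictly below it. The function names, incomes and weights are illustrative and not taken from the slides.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("empty input")


def arpt(incomes, weights, share=0.6):
    """At-risk-of-poverty threshold: `share` times the weighted median income."""
    return share * weighted_median(incomes, weights)


def arp_rate(incomes, weights, threshold):
    """Weighted share of units with income strictly below the threshold."""
    poor = sum(w for y, w in zip(incomes, weights) if y < threshold)
    return poor / sum(weights)


# Hypothetical equivalised incomes and survey weights.
incomes = [8000, 11000, 15000, 20000, 22000, 30000, 45000]
weights = [120, 100, 90, 110, 100, 80, 60]

threshold = arpt(incomes, weights)
rate = arp_rate(incomes, weights, threshold)
print(threshold, rate)
```

In a small socio-demographic group the rate rests on few sampled units, which is what motivates the model-based estimators on the following slides.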

Page 8

Estimation methods

Page 9

Empirical Bayes (EB) method
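
As a rough illustration of the EB idea, the sketch below assumes a unit-level nested error model (Battese-Harter-Fuller type), y_dj = x_dj'β + u_d + e_dj, with the fixed part and the two variance components already estimated and plugged in. Non-sampled values are drawn from their conditional distribution given the sample, and the poverty indicator is averaged over the draws. It is not the implementation in the `sae` package; every name and number here is hypothetical.

```python
import math
import random


def eb_poverty_rate(y_sample, mu_sample, mu_nonsample, sigma_u, sigma_e,
                    threshold, n_draws=500, seed=1):
    """Monte Carlo EB estimate of the share of one domain below `threshold`.

    y_sample      observed target values in the domain sample s_d
    mu_sample     fitted fixed part x'beta for the sampled units
    mu_nonsample  fitted fixed part x'beta for the non-sampled units r_d
    sigma_u       plug-in standard deviation of the domain effect u_d
    sigma_e       plug-in standard deviation of the unit-level error
    """
    rng = random.Random(seed)
    n_d = len(y_sample)
    N_d = n_d + len(mu_nonsample)

    # Shrinkage factor: close to 1 for large n_d, close to 0 for small n_d.
    gamma = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2 / n_d)

    # Conditional distribution of the domain effect given the sample.
    mean_resid = sum(y - m for y, m in zip(y_sample, mu_sample)) / n_d
    u_mean = gamma * mean_resid
    u_sd = math.sqrt(sigma_u ** 2 * (1.0 - gamma))

    poor_in_sample = sum(1 for y in y_sample if y < threshold)

    total = 0.0
    for _ in range(n_draws):
        u = rng.gauss(u_mean, u_sd)
        # Simulate the unobserved part of the domain and count the poor.
        poor_out = sum(1 for m in mu_nonsample
                       if m + u + rng.gauss(0.0, sigma_e) < threshold)
        total += (poor_in_sample + poor_out) / N_d
    return total / n_draws


# Hypothetical domain: 4 sampled units, 96 non-sampled, log-income scale.
y_sample = [9.1, 9.4, 8.8, 9.9]
mu_sample = [9.3] * 4
mu_nonsample = [9.3] * 96

estimate = eb_poverty_rate(y_sample, mu_sample, mu_nonsample,
                           sigma_u=0.2, sigma_e=0.5, threshold=9.0)
print(estimate)
```

With only four sampled units the shrinkage factor is well below 1, so the estimate leans mostly on the model rather than on the domain's own residuals.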

Page 10

Hierarchical Bayes (HB) method
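
To make the contrast with EB concrete, here is a small HB sketch under assumptions that are mine, not the slides': an area-level model, direct_d = μ + u_d + e_d, with known sampling variances psi_d and flat priors on μ and on the between-domain variance. Instead of plugging in a single variance estimate, the posterior of that variance is evaluated on a grid and the domain means are averaged over it. This is not the `hbsae` implementation, and all inputs are hypothetical.

```python
import math


def hb_area_level(direct, psi, grid_max=None, grid_size=400):
    """Posterior means of the domain parameters, integrating over s2u on a grid.

    direct  direct estimates per domain
    psi     known sampling variances of the direct estimates
    """
    D = len(direct)
    if grid_max is None:
        # Heuristic upper bound for the between-domain variance grid.
        mean = sum(direct) / D
        spread = sum((y - mean) ** 2 for y in direct) / (D - 1)
        grid_max = 10.0 * (spread + max(psi))

    log_post = []
    cond_means = []
    for k in range(grid_size):
        s2u = grid_max * (k + 0.5) / grid_size  # midpoint of grid cell k
        v = [s2u + p for p in psi]              # marginal variances
        prec = sum(1.0 / vd for vd in v)
        mu_hat = sum(y / vd for y, vd in zip(direct, v)) / prec

        # Log marginal posterior of s2u (mu integrated out, flat priors).
        lp = (-0.5 * sum(math.log(vd) for vd in v)
              - 0.5 * math.log(prec)
              - 0.5 * sum((y - mu_hat) ** 2 / vd for y, vd in zip(direct, v)))
        log_post.append(lp)

        # Conditional posterior mean of each domain parameter given s2u.
        cond_means.append([(s2u / vd) * y + (p / vd) * mu_hat
                           for y, vd, p in zip(direct, v, psi)])

    # Normalise the grid weights in a numerically stable way.
    top = max(log_post)
    w = [math.exp(lp - top) for lp in log_post]
    total = sum(w)
    return [sum(wk * cm[d] for wk, cm in zip(w, cond_means)) / total
            for d in range(D)]


# Hypothetical direct poverty-rate estimates and their sampling variances.
direct = [0.12, 0.25, 0.18, 0.30, 0.15, 0.22]
psi = [0.0004, 0.0025, 0.0009, 0.0036, 0.0004, 0.0016]

posterior_means = hb_area_level(direct, psi)
print(posterior_means)
```

Domains with a large sampling variance are pulled more strongly towards the overall level, which is where the gain in reliability for small groups comes from.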

Page 11

Packages and functions used

• sae.R

• Functions: direct, ebBHF, pbmseBHF

• hbsae.R

• Functions: fSAE, fSAE.Area

Page 12

Application: Target and data

• Target: Calculate direct and indirect at-risk-of-poverty rate estimates by socio-demographic breakdowns

• Data sources: Survey on Income and Living Conditions (EU-SILC) and Census data of some EU countries in 2011

• Sample: divided into 18 disjoint socio-demographic groups of small and large sizes

• Auxiliary variables: unit level information on economic activity status and highest level of education attained
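
The direct side of this target can be sketched as a weighted rate per socio-demographic group, together with a crude coefficient of variation showing how reliability drops in small groups. The variance formula assumes simple random sampling within each group, which ignores the actual EU-SILC design. The records, group labels and threshold are hypothetical.

```python
import math
from collections import defaultdict


def direct_rates(records, threshold):
    """Direct at-risk-of-poverty rate per domain.

    records: iterable of (domain, income, weight) tuples.
    Returns {domain: {"n", "rate", "se", "cv"}}.
    """
    weight_sum = defaultdict(float)
    weight_poor = defaultdict(float)
    n = defaultdict(int)
    for domain, income, weight in records:
        weight_sum[domain] += weight
        n[domain] += 1
        if income < threshold:
            weight_poor[domain] += weight

    out = {}
    for domain in weight_sum:
        p = weight_poor[domain] / weight_sum[domain]
        # Assumption: SRS within domain, so Var(p) is about p(1 - p) / n_d.
        se = math.sqrt(p * (1.0 - p) / n[domain])
        cv = se / p if p > 0 else float("nan")
        out[domain] = {"n": n[domain], "rate": p, "se": se, "cv": cv}
    return out


# Hypothetical sample: group "A" has 4 respondents, group "B" only 2.
records = [
    ("A", 9000, 100), ("A", 15000, 100), ("A", 20000, 100), ("A", 11000, 100),
    ("B", 10000, 50), ("B", 30000, 150),
]

estimates = direct_rates(records, threshold=12000)
for domain, result in sorted(estimates.items()):
    print(domain, result)
```

The smaller group ends up with the larger coefficient of variation, which is the situation where indirect (EB or HB) estimates become attractive.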

Page 13

Application 1: Results

Page 14

Application 1: Results

Page 15

Application 2: Results

Page 16

Application 2: Results

Page 17

Learnings and future work

• By applying model-based SAE techniques, the reliability of estimates could be increased

• Enlarging the set of auxiliary variables

• Further investigation is needed to assess the most appropriate estimator (call for harmonization?)

• Extension to additional countries and socio-demographic groups

Page 18

Open questions on SAE

• EB vs. HB dichotomy calls for harmonised practices in Official Statistics?

• Design based to model based to algorithm based: maybe there is a possible link between SAE and statistical learning?

• Reversing the approach: starting from the data rather than from the goal?

• How about the use of SAE for data protection?

Page 19

Advertisement

CESS2016, Conference of European Statistics Stakeholders, Budapest, 20–21 Oct 16 (by ESTAT, ECB & HCSO), free!

• Session B3: Official statistics on cross-border phenomena

• Session C9: Small area estimation and weighting

NTTS2017, New Techniques and Technologies for Statistics, Brussels, 14–16 March 17 (by ESTAT), free!

• Abstract by 28 Oct 16, track C includes SAE

Page 20

Research funds under Horizon 2020

TOPIC: Towards a new growth strategy in Europe - Improved economic and social measurement, data and official statistics

Opening: 4 October 2016

Closing: 2 February 2017

For more info: here. To submit a proposal: here.

"Disaggregation of statistics - geographically, or by other domains (e.g. identifying vulnerable population groups) - to provide greater insights and providing evidence allowing more focused policy decisions should be covered. At the same time data protection concerns should be addressed. Small Area Estimation expertise could cover the geographical/domain disaggregation aspect"

Page 21

Thank you!

[email protected]

ec.europa.eu/eurostat