Small Area Estimation: An Appraisal
Nikos Tzavidis 1
Workshop on Measuring Progress at a Local Level, Pisa, May 28-29, 2013
1 Southampton Statistical Sciences Research Institute, University of Southampton ([email protected])
Small Area Estimation & Measuring Progress at Local Level
Acknowledgements (In alphabetical order)
Ray Chambers
Hukum Chandra
Emanuela Dreassi
Enrico Fabrizi
Caterina Giusti
Stefano Marchetti
Monica Pratesi
Giovanna Ranalli
Nicola Salvati
Outline
A Non-technical Introduction to SAE
Motivation & Definition
Data requirements
Small Area Methods & Case Studies
Direct Vs. indirect methods
Model-based Vs. design-based methods
Methodologies for continuous outcomes
Methodologies for discrete outcomes
Case studies
Income & poverty
Unemployment
Health outcomes
Concluding remarks
From the statistician’s desk to policy
Part I
Introduction to SAE
Motivation
Surveys are used to provide estimates for large domains
Estimates for smaller domains are important
Direct Estimation: Use only domain-specific data
Problems with direct estimation
1 Direct estimates may suffer from low precision
2 Not applicable with zero sample sizes
What is Small Area Estimation?
A definition
Small area estimation is concerned with the development of statistical procedures for producing efficient (precise) estimates for domains (planned or unplanned) with small or zero sample sizes. Domains are defined by the cross-classification of geographical districts by social/economic/demographic characteristics. The target is the estimation of a parameter (average/percentile/proportion/rate) and the estimation of the corresponding prediction error.
Is SAE Relevant for Policy Makers?
Letter from the House of Commons to ONS
The House of Commons has for many years produced a monthly report for Members on unemployment by constituency. This report has been highly valued by them. Over the last few years the Office for National Statistics has been developing and improving its portfolio of labour market statistics. We recognise that what we really need is a consistent and reliable set of up to date constituency level labour market statistics covering unemployment, employment and inactivity. I am aware that research is underway within ONS to achieve this goal. The purpose of this letter is to stress the importance of bringing this work to a conclusion so that both of our organisations can provide a common authoritative set of labour market information for Members of Parliament and others.
SAE - Data Requirements
Survey Data: Available for y and for x related to y
Census/Administrative Data: Available for x but not for y
SAE in 3 Steps
1 Use survey data to estimate models that link y to x
2 Combine the estimated model parameters with x, for out-of-sample units, to form predictions
3 Use these predictions to estimate the target parameters
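The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the case studies: the population, the sample, and the plain linear model fitted by least squares are all simulated assumptions, and no random area effects are included.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 3 areas, x known for every unit (census/admin data),
# y observed only for the sampled units (survey data).
N = 300
area = np.repeat([0, 1, 2], 100)
x = rng.normal(10.0, 2.0, size=N)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=N)
sampled = np.zeros(N, dtype=bool)
sampled[rng.choice(N, size=30, replace=False)] = True

# Step 1: use the survey data to estimate a model that links y to x.
X_s = np.column_stack([np.ones(sampled.sum()), x[sampled]])
beta, *_ = np.linalg.lstsq(X_s, y[sampled], rcond=None)

# Step 2: combine the estimated parameters with x to form predictions.
X_all = np.column_stack([np.ones(N), x])
y_pred = X_all @ beta

# Step 3: estimate the target parameter (here the area mean), using observed
# y for sampled units and predictions for out-of-sample units.
y_filled = np.where(sampled, y, y_pred)
area_means = np.array([y_filled[area == k].mean() for k in range(3)])
```

In practice the model in step 1 would usually be one of the small area models discussed in the later parts of the talk.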
SAE - Data Requirements (Cont’d)
Access to good auxiliary information is crucial
Data requirements depend on the target parameter
Case 1: Averages: Domain-level means/totals of auxiliary variables
Case 2: Percentiles: Auxiliary information available for every unit in the population
Direct Vs. Indirect Methods
Direct methods use only domain-specific data
Indirect methods borrow information from all data
Design-based Vs. Model-based methods
Model-based methods
Borrow strength by using a model
Estimation using frequentist or Bayesian approaches
Inference is under model conditional on the selected sample
Design-based (Model-assisted) methods
Direct estimation
Can allow for use of models (model-assisted)
Inference is under the randomization distribution
SAE: A Paradigm Shift
NSIs producing official statistics avoid the use of models
SAE: One area in official statistics where models are accepted
Presents a paradigm shift for NSIs
Impacts on the use of SAE methods in practice
Part II
SAE for Continuous Outcomes
Popular model-assisted estimators of domain averages
Synthetic estimator
$\hat{\bar{y}}_k = \bar{X}_k^T \hat{\beta}_w$
$\hat{\beta}_w$ is the probability-weighted estimator
Can be biased BUT
Fairly stable
Survey regression & Generalised regression estimators (GREG)
$\hat{\bar{y}}_k = \hat{\bar{Y}}_k^{HT} + (\bar{X}_k - \hat{\bar{X}}_k^{HT})^T \hat{\beta}_w$
Corrects the potential bias of the synthetic estimator BUT
Can be unstable
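A minimal Python sketch of the two estimators follows. The function names are hypothetical, the design weights are taken as given, and the Horvitz-Thompson means are assumed to divide weighted totals by a known domain size N_k.

```python
import numpy as np

def weighted_beta(X, y, w):
    """Probability-weighted least squares: solves (X'WX) beta = X'Wy."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

def synthetic_mean(Xbar_k, beta_w):
    """Synthetic estimator: population covariate means times beta_w."""
    return Xbar_k @ beta_w

def greg_mean(y_k, X_k, w_k, N_k, Xbar_k, beta_w):
    """Survey regression (GREG) estimator for domain k."""
    y_ht = np.sum(w_k * y_k) / N_k      # weighted direct estimate of the mean of y
    X_ht = (w_k @ X_k) / N_k            # weighted direct estimate of the mean of x
    return y_ht + (Xbar_k - X_ht) @ beta_w

# Illustration: domain 1 has a level shift that the covariate does not explain.
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
dom = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = 2.0 + 3.0 * x + 1.5 * dom
w = np.full(8, 5.0)                      # each sampled unit represents 5 units
X = np.column_stack([np.ones(8), x])
beta_w = weighted_beta(X, y, w)

Xbar_1 = np.array([1.0, 2.5])            # known population means (intercept, x)
in_1 = dom == 1
syn = synthetic_mean(Xbar_1, beta_w)
greg = greg_mean(y[in_1], X[in_1], w[in_1], 20, Xbar_1, beta_w)
```

Here the synthetic estimate ignores the domain-specific shift, while the GREG estimate corrects for it using the domain's own weighted residuals, which is also why it becomes unstable when the domain sample is small.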
Model-based Methods: Nested Error Regression Model
Key Concept (Battese, Harter & Fuller, 1988)
Include random area-specific effects to account for between-area variation beyond that explained by model covariates
Notation: (k = domain, i = individual)
$y_{ik} = x_{ik}^T \beta + u_k + \varepsilon_{ik}, \quad i = 1, \dots, n_k, \quad k = 1, \dots, d$
Estimator of the small area average
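$\hat{\bar{y}}_k = \gamma_k \{ \bar{y}_k + (\bar{X}_k - \bar{x}_k)^T \hat{\beta} \} + (1 - \gamma_k) \bar{X}_k^T \hat{\beta},$
$\gamma_k = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2 / n_k)$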
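The estimator is a weighted combination of a survey regression term and a synthetic term, with the weight driven by the area sample size. The sketch below assumes the regression coefficients and the variance components (between-area variance and unit-level error variance) have already been estimated elsewhere, for example by REML. The function names are hypothetical.

```python
import numpy as np

def shrinkage(sigma2_u, sigma2_e, n_k):
    """Weight given to the area's own data; approaches 1 as n_k grows."""
    return sigma2_u / (sigma2_u + sigma2_e / n_k)

def eblup_mean(ybar_k, xbar_k, Xbar_k, beta, sigma2_u, sigma2_e, n_k):
    """Nested error regression estimate of the mean of area k.

    ybar_k, xbar_k: sample means of y and x in area k
    Xbar_k:         population means of x in area k
    """
    gamma = shrinkage(sigma2_u, sigma2_e, n_k)
    survey_regression = ybar_k + (Xbar_k - xbar_k) @ beta
    synthetic = Xbar_k @ beta
    return gamma * survey_regression + (1.0 - gamma) * synthetic

# Illustration with made-up inputs: weights for increasing sample sizes.
weights = [shrinkage(1.0, 4.0, n) for n in (1, 4, 40)]
estimate = eblup_mean(10.0, np.array([1.0, 2.0]), np.array([1.0, 3.0]),
                      np.array([1.0, 2.0]), 1.0, 4.0, 4)
```

With very few sampled units the estimate leans on the synthetic term; with many it approaches the survey regression term.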
Advances in Model-based SAE - Nested Error Regression Model & Beyond
Empirical Best Prediction (Molina & Rao, 2010)
Dealing with outliers (Sinha & Rao, 2009; Chambers et al., 2013; Giusti et al., 2013)
Design consistent estimation (You & Rao, 2002)
Estimation with M-quantile models (Chambers & Tzavidis, 2006; Fabrizi et al., 2013; Marchetti et al., 2012)
Non-parametric models (Opsomer et al., 2008)
Incorporating spatial structures (Salvati & Pratesi, 2009)
Small Area Estimation with M-quantile Regression
Main idea of SAE with M-quantile regression (Chambers & Tzavidis, Biometrika, 2006)
Quantiles/M-quantiles used for describing group differences
Similar role to random effects BUT
Estimation is semiparametric
If a hierarchical structure does explain part of the variability in the data, units within the same domain will be clustered in the same part of f(y|x)
Beyond Averages: The Small Area Distribution Function (DF)
Averages offer a rather limited picture
$F_k(z) = N_k^{-1} \left[ \sum_{i \in s_k} I(y_i < z) + \sum_{i \in r_k} I(y_i < z) \right]$
Use an estimator of the distribution function
Derive estimates of medians and percentiles for small areas
Estimators of the Small Area DF
Estimators of the DF (Tzavidis et al., 2010)
Empirical distribution function
$\hat{F}_k(z) = N_k^{-1} \left[ \sum_{i \in s_k} I(y_i < z) + \sum_{i \in r_k} I(\hat{y}_i < z) \right]$
The Chambers-Dunstan estimator
$\hat{F}_k^{CD}(z) = N_k^{-1} \left[ \sum_{i \in s_k} I(y_i < z) + n_k^{-1} \sum_{i \in s_k} \sum_{l \in r_k} I(\hat{y}_l + (y_i - \hat{y}_i) < z) \right]$
Empirical Best Predictor - Monte-Carlo method for estimating the DF
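As an illustration, the Chambers-Dunstan estimator can be sketched as below. It assumes fitted values are already available for sampled and non-sampled units of the area from whatever model was fitted, and it averages over the sample residuals for each non-sampled unit. The grid-based percentile inversion is a simple hypothetical helper, not part of the cited method.

```python
import numpy as np

def cd_distribution(z, y_s, yhat_s, yhat_r):
    """Chambers-Dunstan estimate of the area distribution function at z.

    y_s:    observed outcomes for sampled units in the area
    yhat_s: fitted values for those sampled units
    yhat_r: predicted values for the non-sampled units in the area
    """
    N_k = len(y_s) + len(yhat_r)
    resid = y_s - yhat_s
    observed = np.sum(y_s < z)
    # For each non-sampled unit, the share of sample residuals that keep
    # prediction + residual below z (this is the 1/n_k double sum).
    smeared = np.mean(yhat_r[:, None] + resid[None, :] < z, axis=1).sum()
    return (observed + smeared) / N_k

def percentile_from_df(q, grid, df):
    """Smallest grid point whose estimated DF reaches q (grid must be sorted)."""
    values = np.array([df(z) for z in grid])
    idx = min(np.searchsorted(values, q), len(grid) - 1)
    return grid[idx]

# Illustration with made-up numbers for one area.
y_s = np.array([1.0, 2.0, 3.0])
yhat_s = np.array([2.0, 2.0, 2.0])
yhat_r = np.array([2.0])
F_at_z = cd_distribution(2.5, y_s, yhat_s, yhat_r)
median = percentile_from_df(0.5, np.linspace(0.0, 5.0, 51),
                            lambda z: cd_distribution(z, y_s, yhat_s, yhat_r))
```

Once the distribution function is estimated, medians, other percentiles and poverty indicators such as the head count ratio follow from the same estimate.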
Case Study I: Estimation of Income & Poverty
Estimation of income distributions and poverty
Two case studies: Italy and the UK
UK: Target areas - Local Authority Districts (∼400)
Italy: Target areas - Provinces in Regions
Data - Italy: EU-SILC, Census micro-data
Data - UK: Family Resources Survey, Census micro-data
Target parameters: Income distributions & Poverty indicators
Model-based estimates using EBP and M-quantile approaches
Income & Poverty in Lombardia
Income & Poverty in Tuscany
Income & Poverty in Calabria
Income Distributions in LADs in North West & South East England
Head Count Ratio in LADs in North West & South East England
Part III
SAE for Discrete Outcomes
A Binomial Generalised Linear Mixed Model
$y_{ik} \in \{0, 1\}$
$y_{ik} \mid u_k \sim \mathrm{Bin}(1, p_{ik})$
$u_k \sim N(0, \Sigma_u)$
with
$\mathrm{logit}(p_{ik}) = x_{ik}^T \beta + u_k$
Extensions to multinomial responses possible
Poisson Generalized Linear Mixed Model
yik is a count
$y_{ik} \mid u_k \sim \mathrm{Poisson}(\mu_{ik})$
$u_k \sim N(0, \Sigma_u)$
with
$\log(\mu_{ik}) = x_{ik}^T \beta + u_k$
Estimation
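Plug-in Empirical Best Predictor of the area mean $\bar{Y}_k$ is
$\hat{E}(y \mid x, k) = N_k^{-1} \sum_{i \in U_k} \hat{y}_{ik}$
Poisson (log link): $\hat{y}_{ik} = \exp\{x_{ik}^T \hat{\beta} + \hat{u}_k\}$
Binomial (logit link): $\hat{y}_{ik} = \exp(x_{ik}^T \hat{\beta} + \hat{u}_k) / [1 + \exp(x_{ik}^T \hat{\beta} + \hat{u}_k)]$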
Notes on the use of GLMMs in SAE
Standard methods for fitting GLMMs can be sensitive to outliers
Prediction of the random effects with GLMMs is computationally complicated
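Given estimates of the fixed effects and a predicted area effect, the plug-in predictor is just the average of the inverse-link predictions over all units in the area. The sketch below assumes those estimates come from a GLMM fit done elsewhere. The function names are hypothetical.

```python
import numpy as np

def plugin_mean_poisson(X_k, beta, u_k):
    """Average predicted count in area k under the log link."""
    return np.exp(X_k @ beta + u_k).mean()

def plugin_mean_binomial(X_k, beta, u_k):
    """Average predicted probability in area k under the logit link."""
    eta = X_k @ beta + u_k
    return (1.0 / (1.0 + np.exp(-eta))).mean()   # equals exp(eta) / (1 + exp(eta))

# Illustration: covariates for all N_k = 3 units of one area (intercept, x).
X_k = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
beta_hat = np.array([-1.0, 0.5])   # made-up fixed-effect estimates
u_hat = 0.2                        # made-up predicted area effect
rate = plugin_mean_binomial(X_k, beta_hat, u_hat)
count = plugin_mean_poisson(X_k, beta_hat, u_hat)
```

Note that this requires covariates for every unit in the area, which matches the data requirements for non-linear targets stated earlier.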
Robust Estimation for GLMs
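Cantoni & Ronchetti (JASA, 2001)
$y_i$ from the Exponential Family
$E(y_i) = \mu_i$; $V(y_i) = V(\mu_i)$; $g(\mu_i) = x_i^T \beta$
$\sum_{i=1}^{n} \frac{(y_i - \mu_i)}{V(\mu_i)} \frac{\partial \mu_i}{\partial \beta} = 0$
Large deviations of $y_i$ from $\mu_i$ or leverage points -> influence
$\sum_{i=1}^{n} \left[ \psi(r_i) w(x_i) \frac{1}{V^{1/2}(\mu_i)} \frac{\partial \mu_i}{\partial \beta} - \alpha(\beta) \right] = 0$ (Huber quasi-likelihood)
$\alpha(\beta) = n^{-1} \sum_{i=1}^{n} E[\psi(r_i)] w(x_i) \frac{1}{V^{1/2}(\mu_i)} \frac{\partial \mu_i}{\partial \beta}$
$r_i = (y_i - \mu_i) / V^{1/2}(\mu_i)$ are the Pearson residuals; $w(x_i)$ controls leverage points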
Two special cases: Poisson and Logistic regression
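The bounding step can be illustrated for the Poisson case. The slide does not define ψ, so this sketch assumes Huber's function with the commonly used tuning constant c = 1.345. It shows only how large Pearson residuals are capped, not the full estimating equation with its consistency correction α(β).

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber's psi: identity for |r| <= c, capped at +/- c otherwise."""
    return np.clip(r, -c, c)

def pearson_residuals_poisson(y, mu):
    """Pearson residuals for a Poisson model, where V(mu) = mu."""
    return (y - mu) / np.sqrt(mu)

# Illustration: the second observation is a gross outlier for mu = 4.
y = np.array([4.0, 100.0, 3.0])
mu = np.array([4.0, 4.0, 4.0])
r = pearson_residuals_poisson(y, mu)   # residuals 0, 48, -0.5
bounded = huber_psi(r)                 # the outlier's contribution is capped at c
```

Because the capped residual replaces the raw one in the estimating equation, a single extreme count can no longer dominate the fit.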
Robust SAE Estimation for Discrete Outcomes
Chambers et al., 2013; Tzavidis et al., 2013
Extension of M-quantile approach for binary and count outcomes
Let $Q_y(q \mid x_i) = Q_{iq\psi}$. Estimate $\beta_\psi(q)$ by solving
$\sum_{i=1}^{n} \left[ \psi_q(r_{iq\psi}) w(x_i) \frac{1}{V^{1/2}[Q_{iq\psi}]} Q'_{iq\psi} - a(\beta_\psi(q)) \right] = 0,$
where
$r_{iq\psi} = (y_i - Q_{iq\psi}) / V^{1/2}[Q_{iq\psi}]$ are the Pearson residuals
$Q'_{iq\psi} = \partial Q_{iq\psi} / \partial \beta_\psi(q)$
Estimation: Fisher scoring algorithm
Case Study II: Estimation of Unemployment
Estimation of Unemployment
UK: Target areas - Local Authority Districts (∼400)
Data - UK: Labour Force Survey
Auxiliary info: Age by gender and unemployment benefit counts
Target parameters: LAD proportions of unemployed
Model-based estimates using Binomial-GLMM and Binomial-MQ
Case Study II: Estimation of Unemployment (Cont’d)
Case Study III: Estimating the Number of Visits to Physicians in Italy
Ageing is a great concern for Italy (65+: 20.3%)
Estimate the number of visits to physicians for the elderly
Data from the Health Conditions survey (reliable estimates atNUTS 2)
Regions: Tuscany (23.3%), Liguria (26.7%), Umbria (23.1%)
60 Health Authorities - Small Areas
Model-based estimates using Poisson-GLMM and Poisson-MQ
SAE Estimates
Recent Applications
Mexico - Estimation of multidimensional poverty
Definition incorporates many dimensions: Income, lack of access to health & education
Estimation: Treat as a multinomial or count outcome
UK - Estimation of child poverty
Currently obtaining experimental estimates
Future Use of SAE in the UK - Beyond 2011
Alternative, more frequently updated, Census output
Efficient use of survey and administrative data
SAE methods evaluated for producing census outputs
Mean Squared Error (MSE) Estimation
Approaches to MSE estimation
Important part of small area estimation
Analytic and computer intensive approaches to MSE
Analytic MSE estimator for model-based averages (Prasad & Rao, 1990; Chambers, Chandra & Tzavidis, 2011)
Parametric bootstrap (Hall & Maiti, 2006)
Parametric & Non-parametric bootstrap for model-based estimates of distributions (Hall & Maiti, 2006; Tzavidis et al., 2011)
Parametric and non-parametric bootstrap for model-based estimates with GLMM and robust-GLMM (Manteiga et al., 2008; Chambers et al., 2013)
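The idea behind a parametric bootstrap for the MSE can be sketched under the nested error regression model. This is a simplified illustration, not any of the cited procedures: the model parameters are treated as already estimated, the same sample indicator is reused in every replicate, and the estimator is passed in as a function. A plain area sample mean serves here as a hypothetical stand-in for a small area estimator.

```python
import numpy as np

def direct_means(y_s, X_s, area_s, d):
    """Stand-in estimator: sample mean per area (every area needs n_k > 0)."""
    counts = np.bincount(area_s, minlength=d)
    return np.bincount(area_s, weights=y_s, minlength=d) / counts

def bootstrap_mse(X_pop, area, sampled, beta, s2u, s2e, estimator, B=200, seed=0):
    """Parametric bootstrap MSE of area-mean estimates.

    Each replicate generates a population from the fitted model, computes the
    true area means, re-applies the estimator to the sampled units, and
    accumulates the squared error.
    """
    rng = np.random.default_rng(seed)
    d = area.max() + 1
    N_k = np.bincount(area, minlength=d)
    sq_err = np.zeros(d)
    for _ in range(B):
        u = rng.normal(0.0, np.sqrt(s2u), size=d)
        e = rng.normal(0.0, np.sqrt(s2e), size=len(area))
        y_star = X_pop @ beta + u[area] + e
        truth = np.bincount(area, weights=y_star, minlength=d) / N_k
        est = estimator(y_star[sampled], X_pop[sampled], area[sampled], d)
        sq_err += (est - truth) ** 2
    return sq_err / B

# Illustration: 4 areas of 50 units, 5 sampled per area, intercept-only model.
area = np.repeat(np.arange(4), 50)
X_pop = np.ones((200, 1))
sampled = np.tile(np.arange(50) < 5, 4)
mse = bootstrap_mse(X_pop, area, sampled, beta=np.array([10.0]),
                    s2u=0.5, s2e=1.0, estimator=direct_means, B=500)
```

A full implementation would also re-estimate the model parameters inside each replicate, which is the main reason MSE estimation by bootstrap is computationally demanding.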
Part IV
Small Area Estimation & Measuring Progress Locally
Producing Small Area Statistics
Need for a transparent estimation framework
Model-based estimation presents organisational paradigm shift
Properties of estimators must be clearly understood
System set up for "industrial" production of target outputs
Computing power - MSE estimation is time consuming
Start using simpler estimation procedures & adapt gradually
From the Statistician’s desk to policy - Reflections
Significant advances in model-based SAE. However,
Gap between producing the estimates and using the estimates
How are estimates used for creating policies?
How to allocate resources?
How to measure progress - Measuring impact?