Synthetic estimators in Ireland
Anthony Staines, DCU
What are synthetic estimators?
  Estimates of something you haven't got
  Typically estimates for a small area of something
  Making maximum use of what you have
Example
  Lung cancer risk
  Smoking is a key explanation
  Suppose you want to study the geography of lung cancer
What you have
  Smoking data from a national survey by age and sex
  Small area level data on population and cancer incidence by age and sex
What you can do at once
  Estimate prevalence for small areas included in the study
  Using the sample in the study
What's wrong with this?
  The areas you need may not be included
  The estimates will be very imprecise
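To illustrate the imprecision point, here is a minimal sketch of a direct estimate for a single small area. The counts (6 smokers among 20 respondents) are hypothetical, and simple random sampling is assumed, which a real clustered survey would not satisfy.

```python
import math

def direct_estimate(smokers, sample_size):
    """Return (prevalence, standard error) using only one area's own respondents."""
    p = smokers / sample_size
    # Binomial standard error; assumes simple random sampling (an assumption).
    se = math.sqrt(p * (1 - p) / sample_size)
    return p, se

# Hypothetical small area with only 20 survey respondents, 6 of them smokers.
p, se = direct_estimate(smokers=6, sample_size=20)
print(f"prevalence = {p:.2f}, approximate 95% interval = +/- {1.96 * se:.2f}")
```

With so few respondents the interval is wide relative to the estimate itself, which is why borrowing information from outside the area is attractive.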
You can do better
  In some obvious ways
  And some not so obvious
What you assume
  National age and sex specific rates apply in each small area
And so
  From these you calculate small area specific prevalence estimates
  This is indirect standardisation
  Can be done smarter
  requiring aggregation properties to hold
  Adding in area level covariates (urban/rural etc.)
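The calculation described above can be sketched as follows: national age and sex specific smoking rates are applied to each small area's population in the same strata, and the expected number of smokers is divided by the area's total population. All rates, age bands, area names and population counts here are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical national smoking prevalence by (age band, sex) from the survey.
national_rate = {
    ("15-44", "M"): 0.30,
    ("15-44", "F"): 0.25,
    ("45+", "M"): 0.20,
    ("45+", "F"): 0.15,
}

# Hypothetical small-area populations in the same age and sex strata.
area_population = {
    "Area A": {("15-44", "M"): 400, ("15-44", "F"): 400,
               ("45+", "M"): 100, ("45+", "F"): 100},
    "Area B": {("15-44", "M"): 100, ("15-44", "F"): 100,
               ("45+", "M"): 400, ("45+", "F"): 400},
}

def synthetic_prevalence(population, rates):
    """Expected smokers under national stratum rates, divided by total population."""
    expected_smokers = sum(count * rates[stratum]
                           for stratum, count in population.items())
    return expected_smokers / sum(population.values())

estimates = {area: synthetic_prevalence(pop, national_rate)
             for area, pop in area_population.items()}

for area, prevalence in estimates.items():
    print(f"{area}: {prevalence:.3f}")
```

The two areas differ only in their age and sex mix, so any difference between their estimates comes entirely from population structure. That is the core assumption of this estimator.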
Can you do better?
  Yes
How?
  Model based estimators
  These have a long history
  Many diverse applications
  Combine survey data and some kind of 'census data'
  'Census data' is that available for every area of interest
Roughly
  Use the survey data to estimate relationships, at the relevant level, between survey covariates and the census data
Then
  Assume the same relationship applies in the other areas
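The two steps above can be sketched with the simplest possible model: a straight line fitted by least squares in the areas the survey covered, then applied to areas it did not. The covariate (proportion of each area that is urban, taken from the census), the area names and all values are hypothetical. A real application would more likely use a logistic or multi-level model and account for the survey design.

```python
# Hypothetical surveyed areas: census covariate (proportion urban) and
# smoking prevalence estimated from the survey.
surveyed = {
    "Area 1": (0.1, 0.17),
    "Area 2": (0.3, 0.23),
    "Area 3": (0.5, 0.25),
    "Area 4": (0.7, 0.31),
}

# Hypothetical areas with census data but no survey respondents.
unsurveyed_urban = {"Area 5": 0.9, "Area 6": 0.2}

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Step 1: estimate the relationship where survey data exist.
intercept, slope = fit_line(list(surveyed.values()))

# Step 2: assume the same relationship holds in the other areas.
predicted = {area: intercept + slope * urban
             for area, urban in unsurveyed_urban.items()}

for area, prevalence in predicted.items():
    print(f"{area}: {prevalence:.3f}")
```

The model is judged here by how well it predicts, not by whether the coefficient has a causal interpretation.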
Issues
  Modelling can be hard
  Remember these are predictive models, not explanatory models
  Data not easy to get at the right small area level
Models
  models using individual level covariates only
  models using area level covariates only
  models combining individual and area-level covariates
Limits
  Available data
  Confidentiality
  Complexity of methods, esp. multi-level methods
  Validation
Spatial data limits
  Have to be able to link survey and census to the same set of small areas
  Given the primitive systems in the UK and the nearly non-existent systems in the Republic this is a lot of work
  Errors here will lead to biased estimates
Confidentiality
  Need to respect confidentiality of survey respondents
  May limit the data available for these purposes
  May need to design survey and survey consent process carefully to get good estimates
Modelling
  Can become very complex
  Clustered survey designs
  Survey weights
  Variable selection
  Model diagnostics
What and where to model
  Data may exist at many different geographies
  Multi-level models with individual, household, local and regional effects can be considered
  GIS might be very useful here for data handling
  Not advisable to aggregate covariates at different spatial levels
  This is just making a bad embedded synthetic estimator
Validation
  Not easy to do, but essential
  How do you validate your synthetic estimates?
  Cross-validation?
  Another survey?
  ?
Options
  How about Health Atlas Ireland?
  This is a system built for the HSE (led by Howard Johnson) to plan health services
  It already has
    Maps
    Census
    HIPE
    Mortality data
Census output options
  Recently they have developed a very flexible census output system
  Uses census data at ED level
  Locations of houses
  Assumes that all the houses in a DED are exchangeable
Census output options
  Allocates census data to any given area
  Directly weighted by using the number of households and the ED composition of the desired area
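One way to read the allocation described above is sketched below. It is not Health Atlas Ireland's actual implementation, only an illustration of household-weighted allocation under the exchangeability assumption: each ED contributes its census counts in proportion to the share of its households that fall inside the desired area. The ED names, variable names and counts are all hypothetical.

```python
# Hypothetical ED-level census counts, including total households per ED.
ed_census = {
    "ED1": {"households": 200, "population": 600, "aged_65_plus": 90},
    "ED2": {"households": 100, "population": 250, "aged_65_plus": 50},
}

# Hypothetical number of each ED's households located inside the desired area
# (for example, found by overlaying house locations on the area boundary).
households_in_area = {"ED1": 50, "ED2": 100}

def allocate(ed_census, households_in_area, variables):
    """Allocate ED census counts to an area, weighting by household share.

    Assumes all households within an ED are exchangeable, so a quarter of
    an ED's households carry a quarter of each of its census counts.
    """
    totals = {v: 0.0 for v in variables}
    for ed, inside in households_in_area.items():
        weight = inside / ed_census[ed]["households"]
        for v in variables:
            totals[v] += weight * ed_census[ed][v]
    return totals

area_totals = allocate(ed_census, households_in_area,
                       ["population", "aged_65_plus"])
print(area_totals)
```

In this example ED1 contributes a quarter of its counts and ED2 contributes all of its counts, because the desired area contains 50 of ED1's 200 households and all 100 of ED2's.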
Futures?
  Modern design of surveys
  Could readily be extended to do small area (SA) estimation from almost any survey data where the necessary geographical data have been collected
  Greatly improves value for money of large scale surveys