Research Outputs for small areas: initial analysis and findings
This SlideShare highlights factors behind the differences between administrative data based population estimates (Research Outputs) and official population estimates at Lower Layer Super Output Area (LSOA) level.
Please note that these Research Outputs are NOT official statistics on the population
Research Outputs - Background
• In October 2015, we published the first set of Administrative Data Census Research Outputs, which provided population estimates by five-year age group and sex at local authority (LA) level
• Using a Statistical Population Dataset (SPD) we matched individual records across multiple data sources into a single, coherent dataset that forms the basis for estimating the population
• Now we are producing administrative data population estimates at small area level using the same method
• This analysis is mainly based on a comparison with 2011 Census estimates and uses SPD v2.0 estimates
• We have also published information on the methodology used to produce the Research Outputs
Useful links
• Administrative Data Research Outputs – current release
• Information on our methodology used to produce the Research Outputs
• Feedback survey - Although we can explain some of the differences in the estimates from the examples given in this presentation, we require other data sources and local knowledge to help improve the performance of the SPD estimates at the small area level. We would welcome your feedback
Constructing the SPD to produce the population estimates
NHS Patient Register (PR)
DWP/HMRC Customer Information System (CIS)
Higher Education Statistics Agency Data (HESA) - students
School Census Data
SPD population estimates
Statistical Population Dataset – SPD
Matched records from the various administrative data sources are included in the Statistical Population Dataset (SPD)
Aggregate totals of Armed Forces personnel are added in to the estimates
A future plan is to use a coverage survey to adjust for biases on the SPD
The SPD estimates have been produced by matching individual records across the administrative data sources. To protect the privacy of individuals, the process involves replacing identifying fields (names, dates of birth and addresses) with one or more artificial identifiers.
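The matching step described above might be sketched as follows. This is illustrative only: the keyed hash (HMAC-SHA256), the field names, the secret key and the toy records are all assumptions made for the example, and are not taken from the published methodology.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be held securely, not in code.
SECRET_KEY = b"example-key-not-real"


def pseudonymise(name: str, date_of_birth: str, postcode: str) -> str:
    """Replace identifying fields with an artificial identifier (a keyed hash)."""
    # Normalise so that trivial formatting differences do not block a match.
    normalised = "|".join(
        part.strip().upper().replace(" ", "")
        for part in (name, date_of_birth, postcode)
    )
    return hmac.new(SECRET_KEY, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def match_sources(*sources: list[dict]) -> dict[str, set[str]]:
    """Map each artificial identifier to the set of sources it appears on."""
    matched: dict[str, set[str]] = {}
    for source in sources:
        for record in source:
            key = pseudonymise(record["name"], record["dob"], record["postcode"])
            matched.setdefault(key, set()).add(record["source"])
    return matched


# Invented records standing in for two administrative sources.
patient_register = [
    {"source": "PR", "name": "Ann Example", "dob": "1990-01-01", "postcode": "AB1 2CD"},
    {"source": "PR", "name": "Bob Sample", "dob": "1985-05-05", "postcode": "EF3 4GH"},
]
customer_information = [
    {"source": "CIS", "name": "ann example", "dob": "1990-01-01", "postcode": "ab12cd"},
]

linked = match_sources(patient_register, customer_information)
on_both = [key for key, found_on in linked.items() if found_on == {"PR", "CIS"}]
print(len(linked), len(on_both))
```

Only the artificial identifiers are compared, so the linked dataset never needs to hold the names, dates of birth or addresses themselves.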
What is an LSOA?
Geography   Minimum population   Maximum population   Minimum number of households   Maximum number of households
LSOA        1,000                3,000                400                            1,200

Geography   England   Wales
LSOA        32,844    1,909
A Lower Layer Super Output Area (LSOA) is a geographic area forming part of a geographic hierarchy designed to improve the
reporting of small-area statistics in England and Wales.
LSOA analysis v Local Authority (LA) analysis
Advantages
- Small area analysis allows greater potential to understand the key issues
- Ability to develop strong evidence based explanations for the differences
- Can help to explain some differences seen at LA level
Disadvantages
- Scale - there are nearly 35k LSOAs in E&W
- Reduced understanding and analysis around LSOA level data, in particular how the quality of official estimates changes through the decade post census
Differences between the estimates – possible scenarios
SPD estimate higher than official estimate (SPD overestimated)
SPD estimate lower than official estimate (SPD underestimated)
SPD estimate = official estimate (both correct)*
SPD estimate higher than official estimate (official estimate overestimated)
SPD estimate lower than official estimate (official estimate underestimated)
SPD estimate = official estimate (both overestimated)
SPD estimate = official estimate (both underestimated)
* While the population estimates may be the same, either in total or for a particular age/sex group, it could still be possible that when characteristics are added they are not found to be representative of the same people.
Distribution of SPD estimates vs. census estimates 2011 - differences in 1% increments
[Figure: LSOA distribution of difference between SPD estimates and census estimates 2011. x-axis: difference in 1% increments from <-10% to >10%; y-axis: proportion of LSOAs in SPD, 0% to 14%]
The LSOAs at the extremes of the distribution were of most interest to us as they showed the largest differences.
The SPD estimate can be higher or lower than the official population estimate.
For approximately 80% of LSOAs, the difference between the estimates was relatively small (+/- 5%).
Negative differences: SPD estimate lower than census estimate. Positive differences: SPD estimate higher than census estimate.
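A distribution of this kind could be derived along the following lines. This is a minimal sketch: the LSOA codes and estimates are invented, and the banding rule (round to the nearest whole percent, with open-ended tails beyond +/- 10%) is an assumption about how the chart was built.

```python
import math
from collections import Counter


def percent_difference(spd: float, census: float) -> float:
    """Percentage difference of the SPD estimate relative to the census estimate."""
    return (spd - census) * 100.0 / census


def band_label(diff: float) -> str:
    """Assign a difference to a 1% band, with open-ended tails beyond +/- 10%."""
    if diff < -10.0:
        return "<-10%"
    if diff > 10.0:
        return ">10%"
    return f"{math.floor(diff + 0.5)}%"


# Invented (SPD estimate, census estimate) pairs for five hypothetical LSOAs.
estimates = {
    "LSOA_A": (1520, 1500),
    "LSOA_B": (1450, 1500),
    "LSOA_C": (2100, 2000),
    "LSOA_D": (1200, 1500),
    "LSOA_E": (1800, 1600),
}

differences = {
    code: percent_difference(spd, census)
    for code, (spd, census) in estimates.items()
}
bands = Counter(band_label(diff) for diff in differences.values())
share_within_5 = sum(abs(diff) <= 5.0 for diff in differences.values()) / len(differences)

print(dict(bands))
print(f"{share_within_5:.0%} of LSOAs within +/- 5%")
```

A negative value means the SPD estimate is lower than the census estimate, and a positive value means it is higher.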
Initial analysis
• Carried out at LSOA level by broad age-group and sex
• Identified samples of LSOAs with the largest differences between census and SPD estimates (2011)
• Selected 800 LSOAs (about 2% of total)
• Then conducted analysis to understand why these areas had differences
• A number of factors were identified to explain the differences which we will explore – but we don’t have all the answers
• This work has highlighted the need for additional data sources
  - to provide ‘activity’ (note 1) indicators to confirm usual residency in the population
  - local data could help us explain the differences in the small area estimates
  - also likely that a Population Coverage Survey (PCS) will be needed to collect information that can evaluate the quality of the SPD and adjust accordingly for coverage errors
• An LSOA analysis tool for the SPD V2.0 estimates for England and Wales, provides interactive summary statistics and information for reference years 2011 and 2015
1. ‘Activity’ can be defined as an individual interacting with an administrative system, for example for National Insurance or tax purposes, when claiming a benefit, attending hospital appointments or updating information on government systems in some other way. Only demographic information (such as name, date of birth and address) and dates of interaction are needed from such data sources to improve the coverage of our population estimates.
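The step of identifying the LSOAs with the largest differences could look something like the sketch below. The LSOA codes, the estimates and the ranking rule (the largest absolute percentage difference across any age-sex group) are assumptions for illustration, not the selection criteria actually used.

```python
def rank_lsoas(estimates: dict) -> list[str]:
    """Rank LSOA codes by their largest absolute percentage difference."""
    worst: dict[str, float] = {}
    for (code, _sex, _age_group), (spd, census) in estimates.items():
        diff = abs(spd - census) * 100.0 / census
        worst[code] = max(worst.get(code, 0.0), diff)
    return sorted(worst, key=worst.get, reverse=True)


# Invented (SPD estimate, census estimate) pairs by LSOA, sex and broad age group.
estimates = {
    ("LSOA_A", "M", "15 to 29"): (300, 150),
    ("LSOA_A", "F", "15 to 29"): (140, 150),
    ("LSOA_B", "M", "15 to 29"): (90, 200),
    ("LSOA_C", "M", "30 to 44"): (205, 200),
}

sample = rank_lsoas(estimates)[:2]
print(sample)
```

Ranking on the absolute difference picks up both over- and underestimates, which matches the idea that the SPD estimate can be either higher or lower than the census estimate.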
What factors can help explain the large differences at LSOA level
Armed Forces – personnel and dependents
Prisoners – special populations
Students/Graduates
Seasonal Workers
Deprivation level and interaction with health and benefit systems
Real time change – housing development
Other factors?
At the LA level we may not see a large difference in the estimates
BUT analysis at LSOA level can show underlying large differences
This could be due to one or more of these factors
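A small worked example shows how this can happen: over- and underestimates in individual LSOAs largely cancel when summed to the LA. The three LSOAs and their figures are invented purely for illustration.

```python
def percent_difference(spd: float, census: float) -> float:
    """Percentage difference of the SPD estimate relative to the census estimate."""
    return (spd - census) * 100.0 / census


# Invented (SPD estimate, census estimate) pairs for the LSOAs in one hypothetical LA.
lsoa_estimates = {
    "LSOA_1": (1800, 1500),  # SPD higher than census
    "LSOA_2": (1250, 1500),  # SPD lower than census
    "LSOA_3": (1480, 1500),  # SPD slightly lower than census
}

la_spd = sum(spd for spd, _ in lsoa_estimates.values())
la_census = sum(census for _, census in lsoa_estimates.values())
la_difference = percent_difference(la_spd, la_census)

lsoa_differences = {
    code: percent_difference(spd, census)
    for code, (spd, census) in lsoa_estimates.items()
}
largest = max(lsoa_differences.values(), key=abs)

print(f"LA difference: {la_difference:.1f}%")
print(f"Largest LSOA difference: {largest:.1f}%")
```

In this toy case the LA totals differ by less than 1%, while one LSOA differs by 20%.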
Let’s see the effect these can have on the SPD estimates
1. Armed forces (AF) personnel
• The official population estimates and SPD estimates both include armed forces personnel at their place of residence which may or may not be on the base
• AF personnel may not be represented on GP patient register data because they use medical facilities on site and therefore they are excluded from the SPD
• But medical coverage can vary by base and for AF dependents (families)
• To overcome this we add in aggregate armed forces statistics because without their inclusion noticeably low estimates could be observed in the LAs containing large military bases
North Kesteven – Armed forces example
LSOA: E01026184 contains RAF Waddington. There is a station medical centre on the base which provides care to AF personnel but NOT their dependents.
LSOA: E01026198 contains RAF Cranwell. It also has a medical centre, which provides medical care for personnel AND their dependents.
North Kesteven is a local authority in Lincolnshire and includes RAF Cranwell and RAF Waddington with a large number of military personnel living in the area
These two bases are reasonably close together
Leaflet | © OpenStreetMap contributors, CC-BY-SA, Nomis
North Kesteven – Armed forces example
LSOA: E01026184 contains RAF Waddington. AF personnel appear to register with the on-base medical centre and do not appear on the NHS Patient Register (PR)
For this LSOA the results look very close after adding the aggregated AF personnel data to the SPD
[Figure: E01026184: Males. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
North Kesteven – Armed forces example
LSOA: E01026198 contains RAF Cranwell. AF personnel appear to register with the on-base medical centre and a local GP and therefore are appearing on the PR.
Adding in the aggregate AF data to the SPD appears to result in double counting for younger males in this LSOA
Additional data on AF interaction with health service systems could help solve this
[Figure: E01026198: Males. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
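The double counting described for this LSOA can be illustrated with simple arithmetic. The figures below are invented and the two helper functions are hypothetical. They only show why adding an aggregate AF total works when personnel are missing from the matched records and overshoots when they are already present.

```python
def spd_with_armed_forces(matched_records: int, af_aggregate: int) -> int:
    """SPD estimate after adding the aggregate armed forces total."""
    return matched_records + af_aggregate


def apparent_double_count(spd_estimate: int, census_estimate: int) -> int:
    """Excess of the SPD estimate over the census estimate (zero if none)."""
    return max(spd_estimate - census_estimate, 0)


# Invented figures for younger males in an LSOA containing a base.
civilians = 120       # residents matched on the administrative sources
af_personnel = 280    # aggregate armed forces total added to the SPD
census = civilians + af_personnel

# Case 1: AF personnel use only the on-base medical centre, so they are
# missing from the matched records and the aggregate fills the gap.
case_1 = spd_with_armed_forces(civilians, af_personnel)

# Case 2: AF personnel are also registered with a local GP, so they are
# already among the matched records and the aggregate counts them again.
case_2 = spd_with_armed_forces(civilians + af_personnel, af_personnel)

print(apparent_double_count(case_1, census))
print(apparent_double_count(case_2, census))
```

In the second case the excess equals the aggregate AF total, which is why data on AF interaction with health service systems would help decide when the aggregate should be added.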
2. Prisoners
Inclusion of prisoners in the official estimates and the SPD-based estimates
Sources compared: Census, Official estimates, SPD estimates
Institutions housing special population groups, for example prisons, provide independent health services on site, meaning these populations will not be recorded on the NHS Patient Register (PR)
This means areas with large numbers of prisoners will be underestimated in the SPD because these are unlikely to be fully represented on the PR
[Figure: E01024618: Males. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
Example: the ‘Sheppey prison cluster’
The Sheppey cluster is an amalgamation of the three prisons: Elmley, Standford Hill and Swaleside.
The cluster falls within LSOA E01024618 which is in Swale in Kent
Because prisoners are not added to the SPD, our estimates are lower than the official estimates in areas housing a prison
Answer? Obtain additional data to allow correct addition of prisoners in the SPD
HM Prison Swaleside Wikipedia Creative Commons License
3. Students/Graduates
Explaining the differences between SPD estimates and official estimates in an area populated with university students.
These examples may help explain why there are differences:
Using postcodes, halls of residence can be incorrectly allocated to a small area – where this occurs neighbouring small areas may be significantly over- and underestimated in the two sets of estimates
Foreign students are more likely to be excluded from the SPD estimates having not registered with a GP or applied for a national insurance number if there is no intention to work during their period of study
The official mid-year estimates of internal migration rely on moving graduates out of their local authority of study by detecting moves between updates of the PR. Delays in re-registering with a new GP have the potential to affect the estimates
List cleaning of the GP patient register can cause SPD estimates for some LAs to decrease in size between years
These examples can be more apparent at the LSOA level of detail
Example of the student factor - University of Hertfordshire (Welwyn Hatfield)
[Figure: SPD V2.0 and 2011 Census population estimates by five-year age group, Welwyn Hatfield, 2011. Total population in thousands by age (0 to 4 through 90+); series: SPD estimate, Census estimate. Source: Office for National Statistics]
At the LA level the results look similar for student ages
Two LSOAs covering University of Hertfordshire (Welwyn Hatfield)
The 2011 Census estimate shows a higher proportion of students resident in this LSOA when compared with the SPD estimates
The SPD estimate shows a higher proportion of students resident in this LSOA when compared with the 2011 Census estimate
E01023938
E01023937
Neighbouring small areas can show up inconsistencies in student residency when looking at the official estimates and the SPD estimates (2011)
We can see this more clearly in the next slide
[Figure: E01023937: Total population. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
[Figure: E01023938: Total population. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
Students at LSOA level – University of Hertfordshire
When we look at the age distributions for the two LSOAs covering the University of Hertfordshire we can see the full picture
The majority of the students have been allocated to different LSOAs in the SPD and the 2011 Census
At this small area level the postcode used to determine term-time location in the census may differ from that used in the SPD, which utilises the HESA address data. Further quality checking against the census address could provide more information at this level of detail.
4. Seasonal workers
It’s reasonable to expect that administrative data sources accumulate records for people who are only temporarily resident due to seasonal working patterns
Higher SPD estimates are likely to reflect the general tendency for younger people to take longer to update their health or tax records when they leave their area of seasonal employment
Example of the seasonal effect on the SPD estimates
• Swale, North Kent, E01024556
Seasonal workers example, Swale
Farms with accommodation for seasonal workers
Caravans form temporary accommodation for seasonal farm workers
Young workers will appear on the administrative data sources used to create the SPD estimates
E01024556
Difference between the Research Output and the 2011 Census estimate – LSOA E01024556
Contains Ordnance Survey Data © Crown copyright and database right 2015
[Figure: E01024556: Males. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
[Figure: E01024556: Females. SPD estimates 2011 and Census estimates 2011; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
Activity data could help establish if people have moved and reduce the accumulation effect
Imagery © 2016 DigitalGlobe
5. Deprivation level and interaction with health and benefit systems
SPD estimates appear to be of better quality in areas of high need/use of public services and health systems
Theory: more interaction with systems = less error in the SPD estimate
Why?
In more deprived areas, interaction with health and benefit systems is likely to be higher; inward and outward migration is picked up in the administrative records as people update their records
In areas that are relatively affluent the reverse may be true with less interaction with health and benefit systems and associated delays in updates to administrative records when people move into these areas
Deprivation example Hammersmith and Fulham
[Figure: Deprivation quintiles: Hammersmith & Fulham LSOAs, 2011, total population. x-axis: from LSOAs with high deprivation to LSOAs with low deprivation; y-axis: average % difference between SPD V2.0 and Census 2011, -16% to 16%]
The more deprived the LSOA the smaller the difference between the SPD estimate and the 2011 Census estimate
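An average difference by deprivation quintile could be computed roughly as follows. The values are invented and do not reflect Hammersmith and Fulham. The quintile coding (1 = most deprived) and the use of a simple unweighted mean are also assumptions.

```python
from collections import defaultdict
from statistics import mean

# Invented (deprivation quintile, SPD estimate, census estimate) rows.
# Quintile 1 is assumed to be the most deprived, 5 the least deprived.
lsoas = [
    (1, 1510, 1500),
    (1, 1485, 1500),
    (3, 1560, 1500),
    (3, 1590, 1500),
    (5, 1650, 1500),
    (5, 1680, 1500),
]

differences_by_quintile = defaultdict(list)
for quintile, spd, census in lsoas:
    differences_by_quintile[quintile].append((spd - census) * 100.0 / census)

# Unweighted mean percentage difference within each quintile.
averages = {
    quintile: mean(diffs)
    for quintile, diffs in sorted(differences_by_quintile.items())
}

for quintile, average in averages.items():
    print(f"Quintile {quintile}: {average:+.1f}%")
```

With these toy values the average difference grows as deprivation falls, which is the pattern the chart describes.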
6. Real time change – housing development
Catching up with real time change
In areas of rapid growth the administrative data will take time to catch up with reality due to delays in registration with GPs (affecting both official estimates and administrative data outputs) and the CIS
Change in the years following the 2011 Census gives an indication of the pace at which administrative data catches up with reality
The same is true for areas where housing development results in a reduction in the housing stock at LSOA level, for example with the demolition of a tower block
[Figure: E01033424: Total population. SPD estimate 2011, Census 2011, SPD estimate 2015 and Official estimate 2015; population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+)]
Real time change – housing development
An example, E01033424, Wembley (Brent)
The post 2011 change for this LSOA is being picked up in each series, reflecting the population expansion in the area, but the SPD estimate still lags behind the census and official estimate
Wembley is one of the largest regeneration projects in the country.
According to the Mayor of London it can accommodate approximately 11,500 new homes and 10,000 new jobs through the development of sites along Wembley High Road and the land around Wembley Stadium.
Answer? Access to accurate local data could help improve the SPD estimates
Other factors affecting the estimates?
There are likely to be differences in areas that contain:
- Boarding schools
- Homeless shelters, or
- Other communal establishments
and there will be factors we have not yet found
Although we can explain some of the differences in the estimates from the examples given, we require other data sources and local knowledge to help improve the performance of the SPD estimates at the small area level – we would welcome your feedback (see slide 4)