Delivering early benefits and trial outputs using administrative data

Delivering early benefits and research outputs using administrative data BSPS 2015 Natalie Shorten - Census Transformation Programme [email protected]


Delivering early benefits and research outputs using administrative data

BSPS 2015

Natalie Shorten - Census Transformation Programme

[email protected]

Overview

• Outline of Census Transformation Programme

• Progress so far

• Plans for publishing admin data research outputs


National Statistician’s recommendation

• An online census of all households and communal establishments in 2021 - with support for those who are unable to complete the census online.

AND

• Increased use of administrative data and surveys in order to enhance the statistics from the 2021 census and improve annual statistics between censuses.

➢ Make the best use of all available data to provide the population statistics required.

AND

➢ Offer a springboard to the greater use of administrative data and annual surveys in the future.

Census Transformation Programme initiated

Three strands:

1. 2021 online Census operation

• Developing and implementing 2021 online Census

2. Integrated population statistics outputs

• Producing and enhancing census outputs with admin and survey data

• Research outputs from 2015 onwards

3. Beyond 2021

• Acquire new administrative data

• Develop new methods using admin data and surveys

Aspirations and Expectations

• Lots of potential with admin data

• Aiming to replicate as many census outputs as possible using admin data (and surveys) by 2021

• Research update and outputs published each year, subject to data access and quality

• New integrated census and admin data outputs, subject to data access and quality

• Building towards a recommendation in 2023 on the future of Census and population statistics

• No new surveys before 2021 – use existing surveys and censuses to simulate surveys

Potential admin data for census topics

Topic                                 Source
Household composition                 Various
Income                                HMRC / DWP
Economic status                       HMRC / DWP
Health status (index?)/conditions     HSCIC / NHS Wales
Qualifications                        Census / DfE / BIS
Industry of employer                  HMRC?
Carers                                HMRC / DWP (limited)
Ethnicity                             NHS (quality?)
                                      HMRC / DWP (limited)
                                      School Census, HESA, DfE/BIS Learner Record
Activity data *                       HMRC / DWP
                                      DVLA
                                      HSCIC / NHS Wales

* Helps identify ‘active’ or ‘inactive’ records, e.g. persons in the UK or abroad

Data already acquired

Dataset                                               Content includes
NHS Patient Register                                  Date of birth, sex, address/geography, out of contact indicator, registration type
Customer Information System                           Date of birth, sex, address/geography, date of death, last time address was updated
Higher Education Statistics Agency: Student record    Date of birth, sex, home postcode, term-time postcode, ethnicity
The English and the Welsh School Census               Date of birth, sex, address/geography, ethnicity

Producing population estimates from admin data and surveys

[Diagram. Sources: NHS Patient Register; DWP/HMRC Customer Information System; HESA data (students). Combined dataset: Statistical Population Dataset (SPD). Survey: 1% coverage survey. Output: population estimates]

Producing population counts from admin data and surveys

[Diagram. Sources: NHS Patient Register; DWP/HMRC Customer Information System; HESA data (students). Combined dataset: Statistical Population Dataset (SPD). Output: population counts]

Admin data based population counts (SPD) compared to the 2011 Census

91% of LA total population counts within 3.8% of Census estimate in 2011

[Map of local authorities. Legend: admin data method lower than 2011 Census; admin data method higher than 2011 Census]
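The headline figure on this slide is the share of local authorities whose admin-based count falls within a tolerance of the Census estimate. A minimal Python sketch of that kind of calculation is given here. The function name, the area names and every count are made up for illustration (they are not ONS data), and treating "within 3.8%" as an absolute percentage difference is an assumption.

```python
def share_within_tolerance(spd_counts, census_estimates, tolerance_pct=3.8):
    """Percentage of areas whose SPD count lies within tolerance_pct
    of the census estimate (absolute percentage difference; an assumption)."""
    within = 0
    for area, census in census_estimates.items():
        diff_pct = abs(spd_counts[area] - census) / census * 100
        if diff_pct <= tolerance_pct:
            within += 1
    return within / len(census_estimates) * 100


# Hypothetical figures for four invented local authorities.
census = {"LA A": 100_000, "LA B": 250_000, "LA C": 80_000, "LA D": 150_000}
spd = {"LA A": 101_500, "LA B": 238_000, "LA C": 79_000, "LA D": 152_000}

# LA B differs by 4.8% and falls outside the tolerance; the other three are inside.
print(share_within_tolerance(spd, census))  # 75.0
```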

Admin based population counts compared to 2011 Census by age/sex

[Chart: "SPD5 to Official Estimates Comparison". Males and females (where SPD is higher or lower than Census), %. Horizontal axis: five-year age groups from 0 to 4 up to 85+. Vertical axis: -8.00% to 5.00%. Series: Males, Females]
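The chart above plots, for each age group and sex, how far the admin-based count sits above or below the Census. A rough Python sketch of that percentage difference follows. The function name and all counts are hypothetical, and dividing by the Census count (rather than the SPD count) is an assumption about how the chart was built.

```python
def pct_difference(spd_count, census_count):
    """Percentage by which the SPD count is above (+) or below (-) the census count."""
    return (spd_count - census_count) / census_count * 100


# Hypothetical counts keyed by (age group, sex); not real data.
census = {
    ("20 to 24", "Males"): 2000,
    ("20 to 24", "Females"): 2000,
    ("85+", "Females"): 500,
}
spd = {
    ("20 to 24", "Males"): 1900,
    ("20 to 24", "Females"): 2040,
    ("85+", "Females"): 510,
}

diffs = {key: round(pct_difference(spd[key], census[key]), 2) for key in census}
for (age_group, sex), diff in diffs.items():
    print(f"{age_group:>8} {sex:<8} {diff:+.2f}%")
```

A negative value means the admin data method is lower than the Census for that group, matching the sign convention of the chart's vertical axis.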

Changes to SPD

• Updated SPD to create SPDX

• Basis of stable series for Research Outputs

• Combined rules from previous SPDs

• Use Patient Register only for 0-4 year olds

• Students counted if on HESA and either PR or CIS

• Using extra info from CIS about people indicating they are abroad
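The rules listed above can be pictured as a per-record inclusion test. The Python sketch below is illustrative only and is not the ONS implementation: the `Person` fields and the function name are hypothetical, the slide does not say how the CIS "abroad" information is applied (excluding such records is an assumption), and the fallback rule for everyone else (present on both PR and CIS) is also an assumption, since the slide does not state it.

```python
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    on_pr: bool               # appears on the NHS Patient Register
    on_cis: bool              # appears on the Customer Information System
    on_hesa: bool = False     # appears on the HESA student record
    cis_abroad: bool = False  # CIS indicates the person is abroad


def include_in_spdx(p: Person) -> bool:
    # Slide rule: use the Patient Register only for 0-4 year olds.
    if p.age <= 4:
        return p.on_pr
    # ASSUMPTION: records that CIS flags as abroad are excluded.
    if p.on_cis and p.cis_abroad:
        return False
    # Slide rule: students are counted if on HESA and either PR or CIS.
    if p.on_hesa:
        return p.on_pr or p.on_cis
    # ASSUMPTION: everyone else must appear on both PR and CIS.
    return p.on_pr and p.on_cis


print(include_in_spdx(Person(age=3, on_pr=True, on_cis=False)))                 # True
print(include_in_spdx(Person(age=20, on_pr=False, on_cis=True, on_hesa=True)))  # True
print(include_in_spdx(Person(age=40, on_pr=True, on_cis=True, cis_abroad=True)))  # False
```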

Improved admin based population counts (SPDX) compared to 2011 Census by age/sex

[Chart: "SPDx v0.1 to Official Estimates Comparison". Males and females (where SPDX is higher or lower than Census), %. Horizontal axis: five-year age groups from 0 to 4 up to 85+. Vertical axis: -8.00% to 5.00%. Series: Males, Females]

Improved admin based population counts (SPDX) compared to 2013 mid-year estimates (MYEs) by age/sex

[Chart: Males and females (where SPDX is higher or lower than official estimates), %]

Improved admin data based population counts (SPDX) compared to the 2011 Census

94% of LA total population counts within 3.8% of Census estimate in 2011

[Map of local authorities. Legend: admin data method lower than 2011 Census; admin data method higher than 2011 Census]

SPDX difference to 2013 mid-year population by LA

93% of LA total population counts within 3.8% of mid-year estimate in 2013

[Map of local authorities. Legend as shown on the slide: admin data method lower than 2011 Census; admin data method higher than 2011 Census]

Aims of publishing admin based research outputs

New annual admin based research outputs beginning Autumn 2015 to enable:

• feedback from users on quality

• methods to be improved through time

• procedures to be operationalised

• users to derive early benefits

Aim to demonstrate improvements each year in:

• breadth (topics) and/or

• (geographical) detail and/or

• accuracy/timeliness

What’s included in the 2015 release?

The 2015 release will include:

• Research outputs for each local authority in England and Wales as a series of population counts for 2011, 2013 and 2014 by 5 year age groups and sex

• In addition to the methodology section, there will be an analysis section, comparing the population counts to the ONS mid-year population estimates

• Case studies for a selection of LAs to highlight some of the quality issues associated with the use of admin data

• Interactive tools to facilitate user analysis
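The first bullet above describes counts by 5 year age groups and sex. As a small illustration of how individual records could be tallied into that shape, here is a Python sketch. The function name and the sample records are hypothetical, and the labels ("0 to 4" up to "85+") are assumed to follow the age bands used in the age/sex charts.

```python
from collections import Counter


def five_year_group(age):
    """Map an age in whole years to a five-year band label, with 85+ as the top band."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 85:
        return "85+"
    lower = age // 5 * 5
    return f"{lower} to {lower + 4}"


# Hypothetical records as (age, sex) pairs; not real data.
records = [(3, "F"), (4, "M"), (17, "F"), (19, "F"), (84, "M"), (92, "F")]

counts = Counter((five_year_group(age), sex) for age, sex in records)
for (group, sex), n in sorted(counts.items()):
    print(group, sex, n)
```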

Admin based research update from 2016 onwards

Two series for population counts will be produced annually:

• Population counts from the 2015 method (SPDX)

• A developmental series of population counts with updated methodology each year

Expanding the topics will depend on data access and quality. Our aim is to include:

• Counts at single year of age and output area level

• Number of households (by Local Authority)

• Income statistics

• Health and/or qualifications statistics

Annual timeline

[Timeline figure with one column per month. Milestones shown: publish research outputs; research conference; winter user ‘roadshows’; annual Beyond 2021 assessment; annual MYEs published]

Any Questions?
