Selective data editing. Development & implementation. Q 2010 Helsinki. Jörgen Svensson, Process Owner, Statistics Sweden (SCB)

Page 1:

Selective data editing

Development & implementation

Q 2010 Helsinki

Jörgen Svensson

Process Owner

Statistics Sweden (SCB)

Page 2:

Standardization at SCB

• Decentralized production

• Development of CBMs (Current Best Methods)

• Editing costly, 33% of budgets

• Data collection departments, 2006

• Standardization – the Lotta project, in 2006

Page 3:

Nine case studies

Purpose of the project:

• Try using selective data editing

• What is the potential gain using the method?

• Would it be possible to develop and use a common tool?

Page 4:

Some results from case studies

Survey                                                    Reduction (%)
Short term employment, private sector                     60
Business activity indicators                              50
Price indices in producer & import stages                 50
Short term statistics, wages & salaries, private sector   40
Wage & salary structures in the private sector            25
Foreign trade                                             (5)
Structural business statistics                            ---

Page 5:

SUSPICION

• SUSP(j, k) = Suspicion of variable j for unit k

• SUSP(j, k) = 0 if variable value falls within acceptance interval

• SUSP(j, k) → 1 as the value deviates further from the acceptance limits

• 0 ≤ SUSP(j,k) ≤ 1
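The properties listed above can be illustrated with a minimal Python sketch. The slides do not give the actual formula used in SELEKT, so the mapping d / (1 + d), the function name `suspicion` and the `scale` parameter are assumptions chosen only because they satisfy the three stated properties.

```python
def suspicion(value: float, lower: float, upper: float, scale: float) -> float:
    """Illustrative suspicion for one reported value (not SELEKT's formula).

    Returns 0 inside the acceptance interval [lower, upper] and a number
    that approaches 1 as the value moves away from the nearest limit.
    `scale` (> 0) is a hypothetical tuning constant: the distance at which
    suspicion reaches 0.5.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if lower <= value <= upper:
        return 0.0
    # Distance to the nearest acceptance limit, measured in units of `scale`.
    distance = (lower - value if value < lower else value - upper) / scale
    # d / (1 + d) maps [0, inf) onto [0, 1): one convenient choice among many.
    return distance / (1.0 + distance)
```

For example, with an acceptance interval of 80 to 120 and a scale of 20, a reported value of 100 gets suspicion 0, while 140 and 60 both get suspicion 0.5.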

Page 6:

POTENTIAL IMPACT

• POTIMP = Potential impact

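• POTIMP is the weighted absolute difference between the observed and the predicted value:

• POTIMP(j, k, d) = w_k · 1_k(d) · |y(j, k) − ŷ(j, k)|

for variable j and unit k in domain d, where y(j, k) is the observed value, ŷ(j, k) is the predicted value, w_k is the sampling weight and 1_k(d) is the domain indicator (1 if unit k belongs to domain d, otherwise 0)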

• SELEKT supports several ways to establish the predicted value: from time series data and from cross-sectional analysis within homogeneous groups of units
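A minimal Python sketch of the potential impact as described on this slide: the sampling weight times the absolute difference between observed and predicted value, counted only for units in the domain. The two predictor functions are hypothetical stand-ins for the two families of methods mentioned (time series and cross-sectional); the slides do not say which estimators SELEKT actually uses.

```python
from statistics import median
from typing import Sequence


def potential_impact(observed: float, predicted: float,
                     weight: float, in_domain: bool) -> float:
    """Weighted absolute difference; zero for units outside the domain."""
    if not in_domain:
        return 0.0
    return weight * abs(observed - predicted)


def predict_from_history(previous_values: Sequence[float]) -> float:
    """Hypothetical time-series predictor: the unit's most recent value."""
    return previous_values[-1]


def predict_from_group(group_values: Sequence[float]) -> float:
    """Hypothetical cross-sectional predictor: median of a homogeneous group."""
    return median(group_values)
```

A unit reporting 150 where 100 was predicted, with sampling weight 10, would get a potential impact of 500 in every domain it belongs to.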

Page 7:

Flagging suspected errors

[Figure: plot with axes log(Potential impact) and log(Suspicion); one region is labeled "Flagged"]

Page 8:

LOCAL SCORE

Local (item) score LScore(j, k, d):

LScore(j, k, d) = SUSP(j, k) * |POTIMP(j, k, d)| * Cello(j, d)

Cello(j, d) is inversely proportional to the standard error based on previous data
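The local score can be sketched in Python as a direct product of the three factors. The helper `cello_from_standard_error` and its `constant` parameter are assumptions: the slide only states that Cello is inversely proportional to the standard error, not what the proportionality constant is.

```python
def cello_from_standard_error(standard_error: float, constant: float = 1.0) -> float:
    """Hypothetical Cello(j, d): `constant` divided by the standard error
    estimated from previous data for variable j in domain d."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    return constant / standard_error


def local_score(susp: float, potimp: float, cello: float) -> float:
    """LScore(j, k, d) = SUSP(j, k) * |POTIMP(j, k, d)| * Cello(j, d)."""
    if not 0.0 <= susp <= 1.0:
        raise ValueError("suspicion must lie in [0, 1]")
    return susp * abs(potimp) * cello
```

Dividing by the standard error puts the impact on a relative scale: with suspicion 0.5, potential impact 500 and a standard error of 250, the local score is about 1, meaning the suspected error is worth roughly one standard error of the domain estimate.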

Page 9:

GLOBAL SCORE

• Global (unit) score GScore(k) is obtained by aggregation of local scores

• LScore(j, k, d) → LScore(j, k) → GScore(k)

• Each → is an aggregation step: summation, Euclidean summation or maximum

• Only those units with GScore larger than a pre-decided threshold are followed up
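The two aggregation steps and the threshold rule can be sketched in Python as follows. "Euclidean summation" is taken here to mean the square root of the sum of squares, which is an assumption; the nested-dictionary layout and the function names are likewise hypothetical.

```python
import math
from typing import Dict, Iterable, List


def aggregate(scores: Iterable[float], method: str = "sum") -> float:
    """Combine scores by summation, Euclidean summation or maximum."""
    values = list(scores)
    if not values:
        return 0.0
    if method == "sum":
        return sum(values)
    if method == "euclidean":
        # Assumed meaning: square root of the sum of squared scores.
        return math.sqrt(sum(v * v for v in values))
    if method == "max":
        return max(values)
    raise ValueError(f"unknown aggregation method: {method}")


def global_score(local_scores: Dict[str, Dict[str, float]],
                 domain_method: str = "sum",
                 variable_method: str = "sum") -> float:
    """GScore(k) for one unit; local_scores[variable][domain] = LScore(j, k, d)."""
    # Step 1: aggregate over domains, LScore(j, k, d) -> LScore(j, k).
    per_variable = [aggregate(by_domain.values(), domain_method)
                    for by_domain in local_scores.values()]
    # Step 2: aggregate over variables, LScore(j, k) -> GScore(k).
    return aggregate(per_variable, variable_method)


def units_to_follow_up(gscores: Dict[str, float], threshold: float) -> List[str]:
    """Units whose global score is strictly larger than the threshold."""
    return [unit for unit, score in gscores.items() if score > threshold]
```

The choice of aggregation matters: summation lets many small local scores add up to a follow-up, whereas maximum flags a unit only when at least one item is seriously suspect.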

Page 10:

SELEKT, EDIT and process data

Page 11:

Implementation of SELEKT

So far three surveys:

• Business activity indicators

• Wage & salary structures in the private sector

• Commodity flow survey

Page 12:

Documentation

A General Methodology for Selective Data Editing
