The Norwegian CPI Data Validation and Editing
8-9 May 2008
Tom Langer, Statistics Norway
Survey systems – data capture
The CPI survey systems
  The regular surveys (40 pct)
  The special surveys (60 pct)
Regular surveys
  Some 2 000 outlets every month – 39 000 observations
  No price collectors involved in data capture
  Qualitative information added by respondent
Internet system: 20 pct of respondents
Postal survey (questionnaire): 80 pct of respondents
Regular survey system – Some key figures
CPI Revision System. Some key figures 2006:08 - 2007:07. Monthly average

COICOP      A. No of       B. Non       Total      C. No of          C, pct     B, pct
            observations   response     = A + B    interventions     of Total   of Total
                                                   by SM
Total (1)   33 166         5 756        38 922     561               1,4        14,8
2           1 681          281          1 962      7                 0,4        14,3
3           6 072          1 357        7 429      194               2,6        18,3
4           675            137          812        13                1,6        16,9
5           6 615          1 201        7 816      110               1,4        15,4
6           4 014          520          4 534      45                1,0        11,5
7           2 445          220          2 665      21                0,8        8,3
8           267            55           322        30                9,3        17,1
9           3 534          702          4 236      52                1,2        16,6
11          2 750          390          3 140      32                1,0        12,4
12          5 113          893          6 006      57                0,9        14,9

(1) Excluding COICOP 1 Food and beverages (scanner data only) and group 10 Education
SM: subject matter specialist
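As a quick consistency check on the key figures, both percentage columns follow from the counts: Total = A + B, and each percentage is taken relative to that total. A minimal sketch using the Total row; the function name `pct_of_total` is ours, not from the source.

```python
def pct_of_total(part, a, b):
    """Return `part` as a percentage of Total = A + B, rounded to one decimal."""
    total = a + b
    return round(100 * part / total, 1)

# Total row: A = 33 166 observations, B = 5 756 non-responses,
# C = 561 interventions by subject matter specialists.
A, B, C = 33_166, 5_756, 561

print(A + B)                  # 38922
print(pct_of_total(C, A, B))  # 1.4
print(pct_of_total(B, A, B))  # 14.8
```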
Regular survey - validation
Step 1: Initial cleaning
  Likely decimal errors and key punch errors
  Check against the questionnaire (electronic)
Step 2: Automatic flagging of observations
  HB (Hidiroglou-Berthelot) method combined with a normalised test
  Decision criterion: an observation is sent for further inspection only if it is flagged by both methods
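The decision criterion amounts to a logical AND over the two methods. The sketch below assumes each method has already produced a set of flagged observation ids. The ids and the names `hb_flagged` and `normalised_flagged` are hypothetical, and the slides do not specify the normalised test itself.

```python
def select_for_inspection(hb_flagged, normalised_flagged):
    """Keep only observations flagged by BOTH methods (set intersection)."""
    return hb_flagged & normalised_flagged

# Hypothetical observation ids, for illustration only.
hb_flagged = {"obs-003", "obs-017", "obs-042"}
normalised_flagged = {"obs-017", "obs-042", "obs-099"}

print(sorted(select_for_inspection(hb_flagged, normalised_flagged)))
# ['obs-017', 'obs-042']
```

Requiring agreement between two tests trades some sensitivity for fewer false alarms, which keeps the manual inspection workload down.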
HB method set up
Basic test level: regional product group (8 regions)
  Fairly homogeneous
  Sufficient number of observations for robust estimation of median and quartiles
In some cases the number of observations is too low
  In that case the system expands the data set to cover all observations at national product level
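The fallback from regional to national level could be sketched as follows. The threshold `MIN_OBS`, the tuple layout, and the sample data are assumptions for illustration; the slides do not state the minimum number of observations used.

```python
MIN_OBS = 10  # assumed threshold, not given in the source

def reference_set(observations, product, region, min_obs=MIN_OBS):
    """Pick the price relatives used to estimate median and quartiles.

    `observations` is a list of (product, region, price_relative) tuples.
    Use the regional product group when it has enough observations,
    otherwise expand to all observations for the product nationally.
    """
    regional = [r for p, reg, r in observations if p == product and reg == region]
    if len(regional) >= min_obs:
        return "regional", regional
    national = [r for p, _, r in observations if p == product]
    return "national", national

# Made-up observations: only two fall in region 1, so the test expands.
obs = [("shoes", 1, 1.00), ("shoes", 1, 1.02), ("shoes", 2, 0.98), ("shoes", 3, 1.05)]
level, values = reference_set(obs, "shoes", region=1, min_obs=3)
print(level, len(values))  # national 4
```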
Test variables:
$T_1 = p_t / p_{t-1}$ (price relative to the previous month)
$T_2 = p_t / p_{\text{July}}$ (price relative to July)
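For a single price series the two test variables can be computed as below. This is a sketch: the function name, the dict-based series, and the sample prices are illustrative assumptions.

```python
def price_relatives(prices, month, prev_month, july):
    """Return (T1, T2) for `month`.

    T1 = p_t / p_{t-1}   price relative to the previous month
    T2 = p_t / p_July    price relative to July
    """
    p_t = prices[month]
    return p_t / prices[prev_month], p_t / prices[july]

# Made-up prices for one product in one outlet.
prices = {"2006-07": 100.0, "2006-08": 104.0, "2006-09": 130.0}
t1, t2 = price_relatives(prices, "2006-09", "2006-08", "2006-07")
print(t1, t2)  # 1.25 1.3
```

Testing against July as well as the previous month helps catch a price that drifts away in small monthly steps, each of which would pass the month-on-month test.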
Validation set up
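Transformation of the price relative distributions, in 2 steps:
1: Make the distributions symmetric around the median relative price
2: Allow for the influence of price levels, U = 0,5
This yields the effect distributions (E) for $T_1$ and $T_2$
Accept intervals according to the HB method, where $E_m$ is the median and $E_{q1}$, $E_{q3}$ are the first and third quartiles of the effect distribution:
$\text{Lower level} = E_m - C \cdot \max(E_m - E_{q1},\; A \cdot E_m)$
$\text{Upper level} = E_m + C \cdot \max(E_{q3} - E_m,\; A \cdot E_m)$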
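The two transformation steps and the accept interval can be sketched in Python as below. This follows the standard Hidiroglou-Berthelot formulation (symmetrise around the median ratio, then scale by the larger of the two prices raised to U). The slides do not show Statistics Norway's exact implementation, so the function names, the quartile method, and the sample prices are assumptions.

```python
from statistics import median, quantiles

def hb_effects(prev, curr, U=0.5):
    """Transform price relatives into HB effects E_i."""
    ratios = [c / p for p, c in zip(prev, curr)]
    r_med = median(ratios)
    effects = []
    for p, c, r in zip(prev, curr, ratios):
        # Step 1: make the distribution symmetric around the median ratio.
        s = 1 - r_med / r if r < r_med else r / r_med - 1
        # Step 2: let the price level influence the effect via exponent U.
        effects.append(s * max(p, c) ** U)
    return effects

def hb_interval(effects, A=0.05, C=12):
    """Accept interval (lower, upper), using the formulas as written on the slide."""
    q1, e_med, q3 = quantiles(effects, n=4, method="inclusive")  # assumed quartile rule
    lower = e_med - C * max(e_med - q1, A * e_med)
    upper = e_med + C * max(q3 - e_med, A * e_med)
    return lower, upper

def hb_flags(prev, curr, A=0.05, U=0.5, C=12):
    """Return indices of observations outside the accept interval."""
    effects = hb_effects(prev, curr, U)
    lower, upper = hb_interval(effects, A, C)
    return [i for i, e in enumerate(effects) if e < lower or e > upper]

# Made-up prices: eight small changes and one likely decimal error (10 -> 25).
prev = [10.0] * 9
curr = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.3, 9.7, 25.0]
print(hb_flags(prev, curr))  # [8]
```

Because the interval is built from the median and quartiles rather than the mean and standard deviation, the outlier itself barely moves the limits used to detect it.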
The impact of the A, U and C parameters
All tests are performed on distributions based on price changes compared to last month. Estimations are based on data from February - March 2004.

The effect of parameter C, with A = 0,05 and U = 0,5
C-values                  8      12     16
No of flagged extremes    322    249    204

The effect of parameter U, with A = 0,05 and C = 12
U-values                  0,0    0,5    1,0
No of flagged extremes    244    249    282

The effect of parameter A, with U = 0,5 and C = 12
A-values                  0,0    0,5    1,0
No of flagged extremes    249    243    239
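A sensitivity run of this kind can be sketched by counting flags over a grid of parameter values. The data below are synthetic (seeded random draws with a few simulated decimal errors), so the counts do not reproduce the figures from the slides. The helper `count_flags` is a hypothetical compact version of the HB test.

```python
import random
from statistics import median, quantiles

def count_flags(prev, curr, A=0.05, U=0.5, C=12):
    """Count observations outside the HB accept interval."""
    ratios = [c / p for p, c in zip(prev, curr)]
    r_med = median(ratios)
    effects = [
        (1 - r_med / r if r < r_med else r / r_med - 1) * max(p, c) ** U
        for p, c, r in zip(prev, curr, ratios)
    ]
    q1, e_med, q3 = quantiles(effects, n=4, method="inclusive")
    lower = e_med - C * max(e_med - q1, A * e_med)
    upper = e_med + C * max(q3 - e_med, A * e_med)
    return sum(e < lower or e > upper for e in effects)

# Synthetic prices: mostly small changes, with a few gross errors mixed in.
rng = random.Random(2004)
prev = [rng.uniform(10, 500) for _ in range(2000)]
curr = [p * rng.lognormvariate(0, 0.02) for p in prev]
for i in range(0, 2000, 100):
    curr[i] *= 10  # simulated decimal-point error

counts = {C: count_flags(prev, curr, C=C) for C in (8, 12, 16)}
for C, n in counts.items():
    print("C =", C, "flagged:", n)
```

Since C only scales the half-widths of the accept interval, raising it can never increase the number of flags; the effects of U and A depend on the data.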
Flagged extremes
[Figure: CPI method. Flagged extremes per month, 01:06 to 12:07 (January 2006 to December 2007); vertical axis 0 to 350.]
Editing
Data received are edited in several steps
  Initial cleaning of data
  A second round based on flagged extremes
  Treatment of non-response – automatic imputation
Macro controls
  Product level – region (8 regions)
  COICOP level
  A final impact control – top-down principle
Special surveys – data capture
Cover 60 pct of the total CPI weight
Respondent burden
  Respondents have well-developed computer-based systems and are willing to share data
Surveys based on scanner data: 30 pct of CPI weight
  Food and beverages (300 000 obs)
  Alcoholic beverages (14 000 obs)
  New cars (1 750 obs)
Other surveys – administrative data