Sarah Tipping and Jennifer Sinibaldi, NatCen
Examining the Trade Off between Sampling and Non-response Error in a Targeted
Non-response Follow-up
Background
• Groves and Heeringa (2006) drew a second-phase sample as part of the responsive design for NSFG Cycle 6
  • Estimated increase in sampling variance of approx. 20%
• Want to introduce Responsive Design into NatCen surveys
Overview of Methods
• 2 months of data from the Health Survey for England 2009 were used to simulate a responsive design protocol
  • Created a cut-off for Phase 1 and modelled response propensities
  • Three Phase 2 samples were drawn
• The designs were assessed by comparing response, bias, variance and mean square error
Objectives
• If we implement Responsive Design…
  • Will we reduce bias?
  • Will the inflation in variance outweigh gains in bias?
Phase 1
• Phase 1 of the simulation ended after all cases had been called four times
• Data from Phase 1 was used to model response
  • Discrete hazard model using call-level data
  • Included info about calls, interviewer characteristics, area-level information (census and other measures)
• Saved the predicted probabilities and used them at Phase 2 to draw sample
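As a rough illustration of how call-level hazards relate to a case-level propensity: in a discrete hazard model, each call k has a predicted probability h_k of yielding a response given no response so far, and the probability of responding at some point over a set of calls is 1 minus the product of (1 - h_k). The sketch below assumes this standard aggregation; the slides do not say how the saved probabilities were combined, and the hazard values and function name are hypothetical.

```python
from math import prod


def cumulative_response_propensity(hazards):
    """Probability of responding on at least one of the given calls.

    `hazards` holds per-call conditional response probabilities h_k
    (hypothetical model output). Result: 1 - prod(1 - h_k).
    """
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
    return 1.0 - prod(1.0 - h for h in hazards)


# Hypothetical predicted hazards for three further calls to one
# case that had not responded by the end of Phase 1.
example_hazards = [0.10, 0.08, 0.05]
propensity = cumulative_response_propensity(example_hazards)
print(round(propensity, 4))
```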
Phase 2
• Select PSUs for re-issue at Phase 2
• Three approaches:
  1. Cost effective sample
  2. Pure bias reduction
  3. Cost effective bias reduction
• All three designs selected PSUs with unequal selection probabilities => selection weights needed.
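A minimal sketch of what the selection weights involve, assuming each re-issued PSU's weight is the inverse of its Phase 2 selection probability. The probabilities, the optional cap and the function names are hypothetical, and the slides do not describe the weighting actually used. Kish's approximation n·Σw² / (Σw)² is included as one common way to gauge how much unequal weights inflate variance.

```python
def selection_weights(selection_probs, cap=None):
    """Inverse-probability selection weights, optionally trimmed at `cap`."""
    weights = []
    for p in selection_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("selection probabilities must lie in (0, 1]")
        w = 1.0 / p
        if cap is not None:
            w = min(w, cap)  # trimming is an illustrative choice only
        weights.append(w)
    return weights


def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical Phase 2 selection probabilities for three PSUs.
probs = [0.5, 0.25, 0.1]
raw = selection_weights(probs)             # weights 2, 4 and 10
trimmed = selection_weights(probs, cap=5)  # largest weight capped at 5
print(kish_design_effect(raw), kish_design_effect(trimmed))
```

In this toy case the single large weight drives most of the variance inflation, and capping it lowers the approximate design effect at the cost of some bias in the weights.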
Results
• Evaluated the three Phase 2 sample designs by comparing response, bias, variance and mean square error
• The cost effective design had the highest Phase 2 response rate at 61% (n = 250)
  • Pure bias reduction = 51% (n = 227)
  • Cost effective bias reduction = 53% (n = 236)
Interviewer observations by sample type
[Figure: Interviewer observations by sample type. Bar chart of percent for Smoking and Children, by sample: All, Phase 1, Cost, Bias, Cost+Bias.]
Household type
[Figure: Household type by sample. Bar chart of percent for Adults (<60) no kids, Adults (60+) no kids and Children, by sample: All, Phase 1, Cost, Bias, Cost+Bias.]
Other demographics
[Figure: Respondent age by sample type. Bar chart of percent for ages 0-15, 16-29, 30-64 and 65+, by sample: All, Phase 1, Cost, Bias, Cost+Bias.]
Other demographics
[Figure: Respondent characteristics by sample. Bar chart of percent for Female, Employed and Married/Cohab, by sample: All, Phase 1, Cost, Bias, Cost+Bias.]
Bias – survey estimates (women)
Estimate                 All   Phase 1   Cost   Bias   Cost&Bias
                           %         %      %      %           %
Good health             70.3      54.5   66.4   71.4        73.1
Poor health              8.9      15.9    6.8    9.1         6.0
Recent acute illness    17.8      21.6   19.9   17.3        17.8
GHQ score high          14.8      19.3   19.7   15.7        14.4
Currently smokes        16.9      18.2   19.8   18.0        18.0
Regularly drinks alc    21.2      22.7   25.1   16.5        20.1
Non-drinker             13.6      14.8   13.4   15.9        14.1
5+ portions veg         24.6      21.6   27.9   22.1        25.8
<5 portions veg         69.9      71.6   66.0   70.6        70.9
Bias – difference in survey estimates (women)
Estimate                Phase 1   Cost   Bias   Cost&Bias
                              %      %      %           %
Good health                15.8    4.0    1.0         2.7
Poor health                 7.0    2.1    0.2         2.9
Recent acute illness        3.8    2.1    0.5         0.0
GHQ score high              4.5    4.9    0.8         0.4
Currently smokes            1.2    2.8    1.0         1.1
Regularly drinks alc        1.5    3.9    4.7         1.0
Non-drinker                 1.2    0.1    2.4         0.6
5+ portions veg             3.0    3.4    2.5         1.2
<5 portions veg             1.7    3.9    0.7         1.0
Mean                        4.4    3.0    1.5         1.2
Variance of survey estimates (women)
Estimate                 All   Phase 1   Cost   Bias   Cost&Bias
Good health             21.0      25.1   22.4   20.5        19.8
Poor health              8.1      13.5    6.4    8.3         5.6
Recent acute illness    14.7      17.1   16.0   14.3        14.7
GHQ score high          12.7      15.8   15.9   13.3        12.4
Currently smokes        14.1      15.0   15.9   14.8        14.8
Regularly drinks alc    16.8      17.8   18.9   13.8        16.2
Non-drinker             11.8      12.7   11.7   13.4        12.2
5+ portions veg         18.6      17.1   20.2   17.3        19.2
<5 portions veg         21.1      20.6   22.5   20.8        20.7
Mean                    15.4      17.2   16.7   15.2        15.1
Mean Square Error
• MSE was generated for a selection of key health estimates
  • MSE = Var + Bias²
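As a worked check of the formula, the sketch below plugs in the rounded "Good health" figures for women from the bias and variance tables in these slides. The function name is hypothetical. The results (about 274.7 for Phase 1 and 38.4 for the cost effective design) differ slightly from the 274.5 and 38.1 reported in the MSE table, presumably because the slides were computed from unrounded inputs.

```python
def mean_square_error(variance, bias):
    """MSE = Var + Bias^2, with bias in the same units as the estimate."""
    return variance + bias ** 2


# Rounded values for "Good health" (women) taken from the slides.
mse_phase1 = mean_square_error(variance=25.1, bias=15.8)
mse_cost = mean_square_error(variance=22.4, bias=4.0)
print(round(mse_phase1, 2), round(mse_cost, 2))
```

The squared bias term dominates for Phase 1 alone, which is why the bias reduction achieved at Phase 2 moves MSE far more than the change in variance does.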
MSE for survey estimates (women)
Estimate                Phase 1   Cost   Bias   Cost&Bias
Good health               274.5   38.1   21.6        27.2
Poor health                62.7   10.7    8.3        14.2
Recent acute illness       31.5   20.6   14.6        14.7
GHQ score high             35.9   39.7   14.0        12.6
Currently smokes           16.6   23.8   15.9        16.0
Regularly drinks alc       20.1   34.5   35.9        17.2
Non-drinker                14.2   11.7   19.0        12.5
5+ portions veg            26.0   31.6   23.5        20.7
<5 portions veg            23.4   37.9   21.3        21.7
Mean                       56.1   27.6   19.3        17.4
Conclusions
• Results are positive!
• Focusing on cost effectiveness increases bias of estimates
• ‘Pure’ bias reduction does not perform much better than cost effective bias reduction in terms of bias, variance inflation and MSE
Discussion points
• Discrepancies between interviewer observations and actual data.
• Selection weights need careful consideration, want to avoid large weights
Actual data and interviewer obs
Rows: interviewer obs. Columns: survey data.

Interviewer obs   None   Smoke only   Kids only   Smoke & kids   Total
None               400           45          31             18     494
Smoke only          12           33           0              2      47
Kids only           26            3          47             21      97
Smoke & kids         2            3           0             13      18
Total              440           84          78             54     656
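One simple way to summarise the discrepancy is the share of households where the interviewer observation and the survey data fall in the same category, i.e. the diagonal of the table divided by the grand total. The sketch below uses the counts from the table above. Treating rows as interviewer observations and columns as survey data is an assumption about the flattened layout, although the overall agreement rate is the same under either orientation.

```python
categories = ["None", "Smoke only", "Kids only", "Smoke & kids"]

# Cell counts from the slide, excluding the Total row and column.
crosstab = [
    [400, 45, 31, 18],
    [12, 33, 0, 2],
    [26, 3, 47, 21],
    [2, 3, 0, 13],
]

total = sum(sum(row) for row in crosstab)
agreeing = sum(crosstab[i][i] for i in range(len(categories)))
agreement_rate = agreeing / total

print(f"{agreeing} of {total} households agree ({agreement_rate:.1%})")
```

On these counts roughly three quarters of households are classified consistently, so a targeting rule based on interviewer observations would misclassify about a quarter of cases.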