Using Survey Data to Improve Registry Data The Case of Syrian Refugees Paolo Verme World Bank.

Page 1: Using Survey Data to Improve Registry Data

The Case of Syrian Refugees

Paolo Verme, World Bank

Page 2: Motivation

• UNHCR keeps records of all registered refugees in the proGres (PG) database

• UNHCR collects a rich set of additional information via home-visit (HV) surveys

• This information is rarely exploited for analysis

• UNHCR needs to target refugees with its cash assistance program due to funding shortages

• How good is the UNHCR targeting capacity?

• Can UNHCR improve its targeting capacity by making better use of available information and reduce the cost of surveys?

Page 3: Background

• A WB-UNHCR partnership: WB analytical expertise on poverty and welfare and UNHCR expertise on refugees

• Pilot study in March 2014

• Two countries (Jordan and Lebanon)

• 6 data sets (3 Jordan and 3 Lebanon):
  • UNHCR registry for Jordan and Lebanon (PG: 650,000 records Jordan; 1.2 m records Lebanon)
  • UNHCR and WFP home visits (45,000 cases in Jordan, 2 rounds; 30,000 cases Lebanon, 1 round)
  • UNHCR and WFP survey (1,700 cases Lebanon, 1 round)

• Focus on refugees living outside camps

⇒ How Poor Are Refugees? A Welfare Assessment of Syrian Refugees Living in Jordan and Lebanon (16 December 2015)

Page 4: Basic Idea

1. Construct welfare aggregates

2. Construct a welfare and poverty model with PG+HV data

3. Test the best proxies for welfare and poverty

4. Test composite indexes for welfare

5. Derive the best welfare and poverty models (PG+HV data)

6. Find the best variables to add to the proGres data

7. Predict welfare and poverty using PG data

8. Use the new PG model for targeting assistance programs

Page 5: 1) Construct Welfare Aggregate

[Figure: stacked bars showing the share of values (%) that are missing, zero, or positive for each welfare measure]

Measure | Missing | Zeroes | Positive
Income per capita | 16.1 | 39.2 | 44.7
Expenditure per capita | 16.1 | 0.0 | 83.9
Expenditure per capita net of UNHCR assistance | 16.1 | 2.1 | 81.8
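As a minimal sketch of how shares like those in the chart could be tabulated, the snippet below classifies a per-capita welfare measure as missing, zero, or positive for each case. The records, the field names (`income`, `case_size`), and the helper functions are hypothetical and are not taken from the UNHCR data.

```python
from collections import Counter

# Hypothetical case records; None marks a value that was not reported.
cases = [
    {"income": 120.0, "case_size": 4},
    {"income": 0.0, "case_size": 2},
    {"income": None, "case_size": 5},
    {"income": 300.0, "case_size": 6},
]


def per_capita(total, case_size):
    """Return the per-capita value, or None when the total is missing."""
    if total is None or case_size <= 0:
        return None
    return total / case_size


def classify(value):
    """Label a per-capita value; assumes reported values are non-negative."""
    if value is None:
        return "missing"
    return "zero" if value == 0 else "positive"


def shares(records, field):
    """Percentage of records in each class for the given welfare field."""
    counts = Counter(
        classify(per_capita(r[field], r["case_size"])) for r in records
    )
    return {
        label: 100.0 * counts[label] / len(records)
        for label in ("missing", "zero", "positive")
    }


income_shares = shares(cases, "income")
print(income_shares)
```

The same `shares` call could be repeated for an expenditure field to compare how many usable observations each measure provides.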

Page 6: 1) Construct Welfare Aggregate

[Figure: three curves plotted against x (horizontal axis 0 to 8, vertical axis 0 to 0.8): Exp/cap with UNHCR cash, Inc/cap with UNHCR cash, Exp/cap without UNHCR cash]

Page 7: 2) Construct Welfare Models

PHASE I: [equation not recovered]

PHASE II: [equation not recovered]

PHASE III: [equation not recovered]

Where:
• W = welfare measure (income, expenditure or poverty)
• HP = vector of case characteristics present in both the PG and HV databases
• H = vector of case characteristics present in the HV data but not in the PG data
• P = vector of case characteristics present in the PG data but not in the HV data
• h = subset of H significant in Phase I
• ε = normally distributed error term with zero mean
• i = household (case number in UNHCR data)
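The equations themselves did not survive on this slide. As a rough sketch only, assuming Phase I is an ordinary least squares regression of W on the shared (HP) and HV-only (H) characteristics, the code below fits such a regression on synthetic data and keeps the H variables whose |t| exceeds 1.96 as the subset h. The data, the variable names, the 1.96 cutoff, and the helper functions are illustrative assumptions, not the author's specification.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X, y):
    """Return OLS coefficients and t-statistics for design matrix X (rows = cases)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    tstats = [b / math.sqrt(sigma2 * inv[a][a]) for a, b in enumerate(beta)]
    return beta, tstats


# Synthetic data (assumption): one HP variable and two H variables,
# of which only h_rent truly affects the welfare measure.
rng = random.Random(0)
names = ["const", "hp_case_size", "h_rent", "h_noise"]
X, y = [], []
for _ in range(500):
    case_size = rng.randint(1, 8)
    rent = rng.randint(0, 1)
    noise_var = rng.random()
    X.append([1.0, float(case_size), float(rent), noise_var])
    y.append(5.0 - 0.2 * case_size + 0.6 * rent + rng.gauss(0, 0.5))

beta, tstats = ols(X, y)
h_candidates = {"h_rent", "h_noise"}
h = [name for name, t in zip(names, tstats) if name in h_candidates and abs(t) > 1.96]
print(dict(zip(names, [round(b, 3) for b in beta])))
print("h =", h)
```

A later phase could then re-estimate the model with only HP and the retained h variables; how the original phases actually differ cannot be read off the surviving text.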

Page 8: 3) Test best proxies of Welfare

• Basic tabulations

• Univariate regressions on 185 variables

• Ranking by R2

• Multivariate tests: Backward and forward regressions

• Manual tests of key explanatory variables

• Tests repeated for composite welfare indicators

• Maximizing R squared
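The univariate step listed above can be sketched as follows: for a regression of welfare on a single variable with an intercept, R squared equals the squared Pearson correlation, so candidates can be ranked without fitting each model explicitly. The welfare values and the three candidate variables are made-up illustrations, not the 185 variables used in the study, and the backward and forward multivariate steps are not shown.

```python
def r_squared(x, y):
    """R2 of a univariate OLS regression of y on x with an intercept,
    computed as the squared Pearson correlation between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable explains nothing
    return sxy * sxy / (sxx * syy)


# Hypothetical log welfare values and candidate proxies for six cases.
welfare = [4.8, 4.1, 3.9, 3.2, 3.0, 2.7]
candidates = {
    "case_size": [1, 2, 3, 5, 6, 8],
    "prop_children": [0.0, 0.5, 0.3, 0.6, 0.7, 0.6],
    "female_pa": [0, 1, 0, 1, 0, 1],
}

# Rank candidates from highest to lowest R2.
ranking = sorted(
    ((r_squared(values, welfare), name) for name, values in candidates.items()),
    reverse=True,
)
for r2, name in ranking:
    print(f"{name:15s} {r2:.3f}")
```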

Page 9: 3) Test best proxies of Welfare

[Figure: horizontal bar chart (horizontal axis 0.0 to 80.0) comparing groups by gender (Male, Female), time (1/11/2013-24/2/2014, 25/2/2014-1/10/2014), religion (Non Muslim, Muslim), ethnicity (Non Arab, Arab), education (Edu 8 years or less, Edu less than 8 years), occupation (Blue Collars, White Collars), and skills (Low Skills, High Skills); bar values not recovered]

Page 10: 3) Test best proxies of Welfare

Variable | Description | Obs | W R2
lincsize_hv | Individuals in case (HV) | 15975 | 0.470121638
dem_p_child | Proportion of children | 15975 | 0.312118415
edu_p_attend | Proportion of children in school | 15975 | 0.152996621
pov_inc_unhcr | UNHCR Monthly Financial Assistance | 15787 | 0.150577538
cash_large | Large Family &/or Family with Babies, Toddlers or Children Attending School | 5000 | 0.135079263
prot_bail_date | | 708 | 0.100568917
pov_cop_aid | Humanitarian assistance | 15788 | 0.092286375
cash_smother | Single Women | 5000 | 0.058084132
prot_aggr | Score for work and residence permit, MOI and bail out doc | 3863 | 0.050228105
cash_elderly | Elderly 60 and Above | 5000 | 0.048847739
prot_moi_diff_apolice | Not comfortable approaching a police station | 187 | 0.041721867
pov_inc_aggr | Total number of kinds of income | 15787 | 0.03765949
prot_bail | BailedOutFromCamp | 13212 | 0.035152565
cash_decision | Decision_for_cash_Assistance | 15811 | 0.034241792
pov_cop_host | Living together with host family | 15788 | 0.029711534
ref_elderly | Elderly alone | 1111 | 0.028308115
prot_know_school | School | 14949 | 0.022848084
edu_p_notattend | Proportion of children not in school | 15975 | 0.02232405
house_sanitary | SanitaryFacilitiesStatus | 15602 | 0.021637243
dem_pafemale | Female Principal Applicant | 15835 | 0.021228419

Page 11: 4) Test Composite Indexes of Welfare

Variable Obs Mean Std. Dev. Min Max R2 %

i_rent 15975 0.91518 0.278622 0 1 0.017427 1.742672

i_latrine 15975 0.773083 0.418852 0 1 0.014387 1.438695

i_good_liv~d 15975 0.476557 0.499466 0 1 0.013578 1.357762

i_housecon~n 15975 0.86723 0.339337 0 1 0.007125 0.712506

i_pipewater 15975 0.878685 0.326503 0 1 0.007003 0.700316

i_good_san~y 15975 0.138717 0.345662 0 1 0.006762 0.676154

i_good_ven~n 15975 0.28626 0.452026 0 1 0.006047 0.604679

i_waste 15975 0.746792 0.434863 0 1 0.005852 0.585157

i_water 15975 0.797684 0.401739 0 1 0.005125 0.512468

i_good_ele~y 15975 0.281189 0.449594 0 1 0.005017 0.501678

Page 12: 4) Test Composite Indexes of Welfare

Variable Obs Mean Std. Dev. Min Max R2

ind_house_crowd 15975 1.781887 1.364509 0 16 0.267

ind_house_crowd1 15975 2.551506 1.697571 0 58 0.022

ind_wash_water 15975 3.196244 1.168634 0 4 0.014

ind_nfi 15975 0.16169 0.381374 0 7 0.011

ind_house_subjective 15975 1.736588 1.619757 0 6 0.009

ind_house_assets 15975 8.3682 3.006631 0 13 0.008

ind_cope_index 15975 2.448013 1.727429 0 5 0.007

ind_wash_hygiene 15975 4.192363 1.150333 0 5 0.007

ind_cope_wfp 15975 1.665477 1.481963 0 8 0.006

ind_food_wfp 15975 42.55236 16.63382 0 112 0.003

ind_house_quality 15975 1.685383 0.57311 0 2 0.003

ind_food_score 15975 22.13459 8.49884 0 56 0.002

ind_food_variety 15975 7.101659 1.576538 0 8 0.001

Page 13: 5) Derive the optimal Welfare Model (PG+HV)

Models: (1) Only PG variables; (2) PG + house + wash; (3) All variables

Variable | Coef. (1) | t (1) | Coef. (2) | t (2) | Coef. (3) | t (3) | Variable importance
Case size (Ref. Csize=1) | | | | | | | 18.09%
2 | -0.47 | -34.9 | -0.47 | -36.5 | -0.47 | -35.8 |
3 | -0.83 | -48.5 | -0.83 | -50.0 | -0.83 | -49.0 |
4 | -1.04 | -61.3 | -1.04 | -62.9 | -1.05 | -61.9 |
5 | -1.19 | -67.6 | -1.19 | -68.9 | -1.20 | -68.0 |
6 | -1.42 | -77.2 | -1.40 | -77.9 | -1.41 | -76.4 |
7 | -1.45 | -68.4 | -1.43 | -68.8 | -1.44 | -67.9 |
8 - 11 | -1.66 | -80.4 | -1.62 | -79.9 | -1.63 | -78.8 |
>=12 | -2.36 | -36.6 | -2.32 | -37.3 | -2.32 | -36.9 |
Proportion of children <18 years (Ref. =0) | | | | | | | 0.05%
0 - 50% | -0.05 | -2.9 | -0.06 | -4.0 | -0.06 | -3.7 |
50% - 75% | -0.06 | -4.2 | -0.07 | -4.9 | -0.06 | -4.4 |
>75% | -0.09 | -5.4 | -0.08 | -5.1 | -0.07 | -4.4 |
Employment of PA (Ref. None) | | | | | | | 0.07%
Low Skilled | 0.00 | 0.1 | 0.03 | 2.4 | 0.03 | 2.2 |
Skilled | 0.01 | 1.0 | 0.02 | 1.6 | 0.01 | 1.1 |
High Skilled | 0.02 | 1.6 | 0.01 | 0.9 | 0.01 | 0.7 |
Professional | 0.07 | 5.1 | 0.06 | 4.4 | 0.06 | 4.4 |
Age of PA (Ref. <=34 years) | | | | | | | 0.15%
35-54 years | 0.07 | 8.7 | 0.06 | 7.8 | 0.06 | 8.0 |
>=55 years | 0.07 | 5.8 | 0.04 | 3.7 | 0.05 | 4.3 |
Marital status of PA (Ref. Married or engaged) | | | | | | | 0.20%
Divorced or separated | -0.13 | -5.5 | -0.10 | -4.3 | -0.10 | -4.2 |
Single | -0.12 | -9.7 | -0.10 | -8.6 | -0.10 | -8.2 |
Widowed | -0.10 | -7.0 | -0.07 | -5.3 | -0.06 | -4.3 |
Highest education of PA (Ref. Below 6 years) | | | | | | | 0.10%
6-8 years | 0.06 | 7.1 | 0.02 | 2.2 | 0.02 | 2.2 |
9-11 years | 0.11 | 10.1 | 0.05 | 5.0 | 0.05 | 4.6 |
12-14 years | 0.13 | 10.6 | 0.07 | 5.8 | 0.06 | 5.2 |
At least university | 0.25 | 14.9 | 0.18 | 10.9 | 0.16 | 9.9 |
Origin (Ref. Damascus) | | | | | | | 0.17%
Al-hasakeh | -0.11 | -2.8 | -0.01 | -0.2 | -0.03 | -0.7 |
Aleppo | -0.03 | -2.2 | -0.05 | -3.2 | -0.06 | -4.0 |
Ar-raqqa | -0.04 | -1.6 | -0.02 | -0.9 | -0.04 | -1.6 |
Dar'a | 0.00 | 0.2 | -0.03 | -2.2 | -0.02 | -1.7 |
Hama | -0.33 | -19.3 | -0.02 | -1.2 | -0.02 | -1.4 |
Homs | -0.06 | -4.7 | -0.11 | -8.5 | -0.09 | -7.3 |
Idleb | -0.32 | -11.8 | -0.04 | -1.3 | -0.05 | -1.7 |
Rural Damascus | -0.03 | -2.2 | -0.03 | -1.9 | -0.02 | -1.7 |
Tartous | -0.10 | -0.9 | -0.13 | -1.1 | -0.13 | -1.1 |
As-sweida | -0.05 | -0.6 | 0.04 | 0.5 | 0.06 | 0.8 |
Deir-ez-zor | -0.17 | -4.3 | -0.14 | -3.6 | -0.14 | -3.6 |
Lattakia | 0.07 | 1.2 | 0.05 | 1.0 | 0.06 | 1.1 |
Quneitra | 0.05 | 1.1 | -0.01 | -0.2 | 0.00 | -0.1 |

Page 14: Welfare Model (Cont.)

Models: (1) Only PG variables; (2) PG + house + wash; (3) All variables

Variable | Coef. (1) | t (1) | Coef. (2) | t (2) | Coef. (3) | t (3) | Variable importance
Legal arrival | 0.12 | 13.3 | 0.09 | 9.9 | 0.08 | 9.0 | 0.16%
Destination (Ref. Amman) | | | | | | | 1.12%
Ajloun | -0.12 | -4.6 | -0.14 | -5.6 | -0.16 | -6.2 |
Aqabah | 0.06 | 1.0 | 0.04 | 0.7 | 0.02 | 0.4 |
Balqa | -0.08 | -4.9 | -0.07 | -4.2 | -0.07 | -4.3 |
Irbid | -0.08 | -8.7 | -0.08 | -9.0 | -0.09 | -9.7 |
Jarash | -0.13 | -5.6 | -0.15 | -6.5 | -0.19 | -8.0 |
Karak | -0.14 | -6.3 | -0.15 | -7.2 | -0.15 | -6.9 |
Maan | -0.14 | -4.5 | -0.14 | -4.7 | -0.13 | -4.2 |
Madaba | -0.16 | -7.2 | -0.12 | -5.4 | -0.11 | -5.1 |
Mafraq | -0.21 | -19.8 | -0.10 | -9.3 | -0.08 | -7.0 |
Tafilah | -0.40 | -9.5 | -0.45 | -11.1 | -0.48 | -11.8 |
Zarqa | -0.25 | -22.6 | -0.21 | -20.0 | -0.20 | -18.6 |
Border crossing point (Ref. Airport) | | | | | | | 0.19%
Ruwaished - Hadallat | -0.01 | -0.6 | 0.06 | 4.1 | 0.06 | 3.6 |
Tal Shihab | -0.06 | -3.7 | -0.05 | -3.2 | -0.04 | -2.8 |
Nasib-official or unofficial | -0.07 | -6.2 | -0.05 | -5.1 | -0.04 | -4.0 |
other or no data | -0.04 | -3.0 | -0.02 | -1.8 | -0.02 | -1.3 |
house_kitchen_d | | | 0.10 | 11.6 | 0.11 | 11.5 | 0.39%
house_electricity_d | | | 0.04 | 3.3 | 0.04 | 3.5 | 0.07%
house_ventilation_d | | | 0.05 | 5.5 | 0.05 | 5.3 | 0.07%
house for rent or owned | | | 0.59 | 39.8 | 0.60 | 39.9 | 3.55%
concrete_house | | | 0.10 | 6.8 | 0.12 | 8.1 | 0.16%
House area: sq m per capita (Ref. <10 sq meters) | | | | | | | 0.31%
10-15 sq meters | | | 0.02 | 3.0 | 0.02 | 2.7 |
>15 sq meters | | | 0.11 | 14.4 | 0.09 | 11.5 |
wash_piped (water through piped AND piped sewerage) | | | 0.04 | 5.3 | 0.04 | 4.5 | 0.05%
nfi_1_dummy (Receiving NFIs) | | | | | -0.05 | -5.9 | 0.08%
pov_cop_aid (Coping strategy=humanitarian assistance) | | | | | -0.15 | -16.0 | 0.58%
pov_cop_host (Coping strategy=host family) | | | | | -0.12 | -17.5 | 0.70%
pov_cop_comm (Coping strategy=host community) | | | | | -0.04 | -6.0 | 0.09%
prot_cert_valid (Is certificate valid) | | | | | 0.06 | 7.1 | 0.13%
pov_inc_unhcr (UNHCR financial assistance) | -0.35 | -34.4 | -0.40 | -40.2 | -0.34 | -31.7 | 2.28%
_cons | 4.88 | 212.5 | 4.05 | 152.4 | 4.05 | 142.5 |

Statistic | (1) | (2) | (3)
F statistic | 721.20 | 727.92 | 651.11
Adjusted R-squared | 0.48 | 0.53 | 0.53
N | 42217 | 40541 | 38694

Page 15: 6) Select the Best Variables to Add to PG Data

The poverty model predicts poverty correctly 90.1% of the time

• PG-HV predictors:
  • Case size
  • Rent
  • Place of destination (country and region)
  • Official entry and point of entry
  • Place of origin (Damascus vs other regions)

• HV predictors (h):
  • Principal applicant characteristics (age and marital status)
  • Selected assets (latrine, piped water, kitchen)

• Education and former occupation are less important than expected
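The slide does not say how the 90.1% figure was computed. One simple reading, offered here as an assumption, is the share of cases whose predicted poverty status matches the observed one when both are compared against the same poverty line. The welfare values, the poverty line, and the function name below are hypothetical.

```python
def share_correct(actual, predicted, poverty_line):
    """Share of cases whose predicted poverty status matches the actual one.

    A case counts as poor when its welfare value is below the poverty line.
    """
    matches = sum(
        (a < poverty_line) == (p < poverty_line)
        for a, p in zip(actual, predicted)
    )
    return matches / len(actual)


# Hypothetical observed and model-predicted welfare for five cases.
actual_welfare = [30.0, 45.0, 60.0, 80.0, 20.0]
predicted_welfare = [35.0, 55.0, 58.0, 90.0, 25.0]
poverty_line = 50.0

accuracy = share_correct(actual_welfare, predicted_welfare, poverty_line)
print(f"{100 * accuracy:.1f}% of cases classified correctly")
```

In this toy data only the second case is misclassified: it is observed below the line but predicted above it.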

Page 16: 7) Predict Welfare using new PG model

Wexp_unhcr_lncap

Coef t

Individuals in case (HV) -0.212*** -69.811

Proportion of children -0.611*** -25.245

Concrete House 0.195*** 8.017

Sanitation average or above 0.109*** 7.244

Ventilation average or above 0.100*** 6.194

Free Housing -0.705*** -24.419

Proportion school-aged children 0.113*** 8.919

Proportion of children in school -0.207*** -15.267

Sharing costs with host family -0.095*** -6.628

Living together with host family 0.114*** 8.987

IsCertificateValid 0.124*** 9.190

_cons 4.715*** 162.779

Number of observations 14,150

R2 0.555

Adjusted/Pseudo R2 0.554
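To illustrate how coefficients like these turn into a prediction, the sketch below applies them to one hypothetical case. It assumes the model is linear in the listed variables, that regressors enter in the units their labels suggest (a count, proportions, and 0/1 indicators), and that the outcome Wexp_unhcr_lncap is a natural log, as the name hints; none of this is confirmed by the slide. The dictionary keys and the example case are invented for illustration.

```python
import math

# Coefficients transcribed from the table on this slide.
COEFFICIENTS = {
    "individuals_in_case": -0.212,
    "prop_children": -0.611,
    "concrete_house": 0.195,
    "sanitation_avg_or_above": 0.109,
    "ventilation_avg_or_above": 0.100,
    "free_housing": -0.705,
    "prop_school_aged_children": 0.113,
    "prop_children_in_school": -0.207,
    "sharing_costs_with_host": -0.095,
    "living_with_host": 0.114,
    "certificate_valid": 0.124,
}
CONSTANT = 4.715


def predict_log_welfare(case):
    """Linear prediction: constant plus the sum of coefficient * value.

    Variables missing from the case are treated as zero.
    """
    return CONSTANT + sum(
        coef * case.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )


# Hypothetical case: five people, 60% children, concrete house with
# adequate sanitation, rented housing, valid certificate.
hypothetical_case = {
    "individuals_in_case": 5,
    "prop_children": 0.6,
    "concrete_house": 1,
    "sanitation_avg_or_above": 1,
    "prop_school_aged_children": 0.4,
    "prop_children_in_school": 0.4,
    "certificate_valid": 1,
}

predicted_log = predict_log_welfare(hypothetical_case)
print(round(predicted_log, 4))
# Level of the welfare measure, valid only if the outcome is a natural log.
print(round(math.exp(predicted_log), 2))
```

Comparing the predicted value with a poverty line on the same scale would then give a predicted poverty status for cases that have registry data but no home visit.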

Page 17: 8) Use New PG Model to Target Assistance

Poverty status by expenditure (rows) and income (columns), % of total

Expenditure \ Income | Non poor | Poor | Total
Non poor | 8.5 | 37.2 | 45.6
Poor | 4.6 | 49.8 | 54.4
Total | 13.1 | 86.9 | 100.0
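The cross-tabulation can be read as a targeting comparison. The sketch below assumes, for illustration only, that expenditure is the reference welfare measure and income is the criterion used to select beneficiaries; the slide does not state this, and the terms leakage and undercoverage are applied here under that assumption.

```python
# Cell shares (% of total) from the cross-tabulation on this slide.
# Keys are (expenditure status, income status).
table = {
    ("non_poor", "non_poor"): 8.5,
    ("non_poor", "poor"): 37.2,
    ("poor", "non_poor"): 4.6,
    ("poor", "poor"): 49.8,
}

# Cases classified the same way by both measures.
agreement = table[("non_poor", "non_poor")] + table[("poor", "poor")]

# Poor by income but not by expenditure: assistance would leak to them.
leakage = table[("non_poor", "poor")]

# Poor by expenditure but not by income: they would be missed.
undercoverage = table[("poor", "non_poor")]

# Column and row sums recomputed from the cells; they can differ from the
# printed totals by 0.1 because the cells are rounded.
income_poor = leakage + table[("poor", "poor")]
expenditure_poor = undercoverage + table[("poor", "poor")]

leakage_rate = leakage / income_poor
undercoverage_rate = undercoverage / expenditure_poor

print(f"Agreement: {agreement:.1f}% of cases")
print(f"Leakage: {100 * leakage_rate:.1f}% of income-poor cases")
print(f"Undercoverage: {100 * undercoverage_rate:.1f}% of expenditure-poor cases")
```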

Page 18: Conclusion

• World Bank and UNHCR have complementary skills and resources

• Existing UNHCR data can be used to improve targeting:
  • Shift from income to expenditure
  • Reduce leaking and costs
  • Improve coverage and targeting

• The analysis highlighted areas to improve:
  • Data collection (survey design, sampling, questionnaires)
  • Data management (PG management system, survey administration)
  • Data analysis (economists and social protection officers)