EVIDENCE AND PROSPECTS FOR A HEALTHY NIGERIA
March 31, 2016
IMPACT EVALUATION
David Evans, World Bank
WHAT IS IMPACT EVALUATION?
“An impact evaluation assesses changes in the well-being of individuals, households, communities or firms that can be attributed to a particular project, program or policy.”
-World Bank
OBJECTIVE
Evaluate the causal impact of a program or an intervention on some outcome
Examples
• How much does exposure to a television soap opera affect HIV/AIDS awareness and testing?
• How much do monetary incentives reduce turnover among midwives? What about non-monetary incentives?
• How much do a Quality Improvement Plan and coaching increase the quality of care at primary health care facilities?
• How much does providing improved housing for midwives reduce turnover in rural areas?
WHY EVALUATE?
1. Evaluation helps us learn whether programs are actually achieving their objectives.
2. Evaluation helps to improve program effectiveness.
3. Evaluation helps to garner resources for scale-up.
WHAT DO WE NEED?
A COUNTERFACTUAL
What would have happened in the absence of the program?
COUNTERFACTUAL CRITERIA
Treated & comparison groups
1. Have identical average characteristics (observed & unobserved)
2. The only difference is the treatment
3. Therefore the only reason for any difference in outcomes is the treatment
Key question: What would the participant look like if she hadn’t received the program?
PERFECT EXPERIMENT
Identify target beneficiaries
Clone them!
• Identical on the outside (observable)
• Identical on the inside (unobservable)
Kami and Tami: “We’re both five-year-old puppets.” “We both love to take up new health interventions!”
Images © Sesame Workshop
PERFECT EXPERIMENT
Give the intervention to one set of clones
Kami
Tami
PERFECT EXPERIMENT
Observe some time later
Because the groups are identical (inside & out), the difference is due to the bednets!
Kami
Tami
FINDING A GREAT CONTROL GROUP
What would the participant look like if she weren’t in the program?
Room For Improvement (RFI) Control Groups
• Before vs After
• Participants vs Non-Participants
RFI: BEFORE-AFTER
Before bednets: 6 malaria episodes in 6 months
After bednets: 2 malaria episodes in 6 months
What else might be going on besides the bednets?
• Seasonal differences
• Rising incomes: Households invest in other measures
RFI: BEFORE-AFTER
Important to monitor before-after
• Monitoring systems tell us if things are moving in the right direction
• Insufficient to show impact of program
• Too many factors changing over time
Example of cash transfers in Nicaragua!
Counterfactual: What would have happened in the absence of the project, with everything else the same
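A minimal arithmetic sketch of why the before-after comparison can mislead, using the bednet numbers from the earlier slide. The size of the seasonal change is a made-up assumption for illustration only; the slides do not give one.

```python
# Before-after comparison versus a counterfactual, for the bednet example.
episodes_before = 6  # from the slide: malaria episodes in 6 months before bednets
episodes_after = 2   # from the slide: malaria episodes in 6 months after bednets

# ASSUMPTION (illustrative, not from the slides): the change of season alone
# would have reduced episodes by 3, even without any bednets.
seasonal_change = -3

# What a before-after comparison reports
before_after_estimate = episodes_after - episodes_before

# Counterfactual: episodes we would have seen in the "after" period without bednets
counterfactual_after = episodes_before + seasonal_change

# Impact of bednets relative to that counterfactual
impact_vs_counterfactual = episodes_after - counterfactual_after

# Part of the before-after change that is wrongly credited to bednets
bias = before_after_estimate - impact_vs_counterfactual

print("Before-after estimate:", before_after_estimate)
print("Impact vs counterfactual:", impact_vs_counterfactual)
print("Bias:", bias)
```

Under this assumed seasonal change, most of the before-after drop would have happened anyway, which is why monitoring data alone cannot show impact.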
RFI: PARTICIPANTS VS NON-PARTICIPANTS
Compare recipients of a program to
• People who were not eligible for the program
• People who chose not to enroll in the program
Example: Complications in childbirth, home births vs clinic births
Impact of clinic births? What else might explain the difference?
• Observable differences: Income, Education
• Unobservable differences: Heard rumor about hospitals, Neighbor available to care for other children
RFI: PARTICIPANTS VS NON-PARTICIPANTS
[Image: Kami and Grover. Muppet photos copyright Sesame Workshop]
Example: Complications in childbirth, home births vs clinic births
Difference = Impact of clinic births + Other factors!
No way to know how much of the difference is because of the clinic
SELECTION BIAS
People who choose to join the program are different!
If we cannot account completely for those differences in our data…
• We usually cannot
• How do you capture attitudes toward health systems? Initiative?
…then our comparison will not show the true impact of the program
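A small simulation sketch of selection bias. Everything here is hypothetical: an unobserved trait called `initiative` raises both the chance of enrolling and the outcome itself, and the program's true effect is set to 2.0. None of these numbers come from the slides.

```python
import random
from statistics import mean


def naive_comparison(n=20000, true_effect=2.0, seed=0):
    """Mean outcome of self-selected participants minus that of non-participants."""
    rng = random.Random(seed)
    participants, non_participants = [], []
    for _ in range(n):
        # Unobserved trait (hypothetical): not recorded in any survey data
        initiative = rng.gauss(0, 1)
        # Outcome without the program depends on the unobserved trait
        base_outcome = 10 + 3 * initiative + rng.gauss(0, 1)
        # People with more initiative are more likely to choose to enroll
        if initiative + rng.gauss(0, 1) > 0:
            participants.append(base_outcome + true_effect)
        else:
            non_participants.append(base_outcome)
    return mean(participants) - mean(non_participants)


estimate = naive_comparison()
print(f"True effect: 2.00, participants vs non-participants: {estimate:.2f}")
```

In this setup the naive comparison mixes the program effect with the effect of initiative, so it overstates the impact.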
WHAT SHOULD WE DO?
Gold standard: Randomized experimental design
RANDOMIZED EXPERIMENTAL DESIGN
Randomly assign potential beneficiaries to be in the treatment or comparison group
• Treatment and comparison have the same characteristics (observed and unobserved), on average
• Any difference in outcomes is due to treatment
RANDOMIZED EXPERIMENTAL DESIGN
Randomization with two doesn’t work! (“We don’t even look similar!”)
But differences average out in a big sample
• On average, same number of Kamis and Grovers in treatment and comparison
• Observable AND unobservable
Result: Measure true impact of program
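A simulation sketch of why random assignment works in a big sample. The population is hypothetical: an unobserved trait called `initiative` drives outcomes, and the true program effect is set to 2.0. Because a lottery decides who is treated, the trait balances out across groups on average.

```python
import random
from statistics import mean


def randomized_trial(n=20000, true_effect=2.0, seed=0):
    """Return (estimated impact, imbalance in the unobserved trait). Needs n >= 2."""
    rng = random.Random(seed)
    initiative = [rng.gauss(0, 1) for _ in range(n)]  # unobserved trait (hypothetical)

    # Lottery: shuffle everyone, first half is treatment, the rest is comparison
    order = list(range(n))
    rng.shuffle(order)
    treated = set(order[: n // 2])

    treat_outcomes, comp_outcomes = [], []
    treat_trait, comp_trait = [], []
    for i in range(n):
        base_outcome = 10 + 3 * initiative[i] + rng.gauss(0, 1)
        if i in treated:
            treat_outcomes.append(base_outcome + true_effect)
            treat_trait.append(initiative[i])
        else:
            comp_outcomes.append(base_outcome)
            comp_trait.append(initiative[i])

    impact = mean(treat_outcomes) - mean(comp_outcomes)
    imbalance = mean(treat_trait) - mean(comp_trait)
    return impact, imbalance


impact, imbalance = randomized_trial()
print(f"Estimated impact: {impact:.2f} (true effect 2.00)")
print(f"Imbalance in unobserved trait: {imbalance:.3f}")
```

Calling `randomized_trial(n=2)` puts one person in each group, so the imbalance can be large; that is the "randomization with two doesn’t work" case.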
RANDOM ASSIGNMENT
Random sample
• Gather data from random sample of population
• No guarantee of unbiased impact measure
Random assignment
• Randomly assign program
• Unbiased impact measure!
CAN WE RANDOMIZE?
• Randomization does not mean denying people the benefits of the project
• Usually there are existing constraints within project implementation that allow randomization
• Randomization is the fairest way to allocate treatment
  Tanzania CCT: Randomized across needy villages
  Nigeria Quality Improvement: Lottery among eligible facilities
RANDOMIZATION OPPORTUNITIES: STAGGERED ROLL-OUT OF PROGRAM
Jan 2013: Roll-out to 200 clinics
July 2013: Roll-out to 200 more clinics
Jan 2014: Roll-out to 400 more clinics
• Randomize the order in which clinics receive program
• Compare Jan 2013 group to Jan 2014 group at end of first year
• Example: Mexico parent health training, staggered roll-out among vulnerable communities
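A minimal sketch of randomizing the roll-out order, using the phase sizes and dates from the slide. The clinic IDs and the function name `staggered_rollout` are hypothetical.

```python
import random


def staggered_rollout(clinic_ids, phase_sizes, seed=0):
    """Randomly order clinics, then cut the order into roll-out phases."""
    if sum(phase_sizes.values()) != len(clinic_ids):
        raise ValueError("phase sizes must add up to the number of clinics")
    order = list(clinic_ids)
    random.Random(seed).shuffle(order)  # the lottery that fixes the roll-out order
    phases, start = {}, 0
    for phase, size in phase_sizes.items():
        phases[phase] = order[start:start + size]
        start += size
    return phases


clinics = [f"clinic_{i:03d}" for i in range(800)]  # hypothetical clinic IDs
phases = staggered_rollout(
    clinics, {"Jan 2013": 200, "July 2013": 200, "Jan 2014": 400}
)

# At the end of the first year, the first phase has had the program for a year
# and the last phase has not started yet, so it serves as the comparison group.
treatment_group = phases["Jan 2013"]
comparison_group = phases["Jan 2014"]
print(len(treatment_group), len(comparison_group))
```

Every clinic still receives the program; only the timing is decided by lottery.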
RANDOMIZATION OPPORTUNITIES: SOME GROUPS MUST GET THE PROGRAM!
Highly vulnerable: ENROLL
Moderately vulnerable: RANDOMIZE
Not vulnerable: SORRY
Example: Program for children in Kenya
• Orphans: Must have program now!
• Randomized among less vulnerable children
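A sketch of this rule: enroll the most vulnerable outright and hold the lottery only among the moderately vulnerable. The tier labels, child IDs and the 50/50 split are assumptions for illustration.

```python
import random
from collections import Counter


def assign_by_vulnerability(children, seed=0):
    """children maps an ID to a tier: 'high', 'moderate' or 'none'."""
    rng = random.Random(seed)
    # Only the moderately vulnerable enter the lottery
    moderate = sorted(c for c, tier in children.items() if tier == "moderate")
    rng.shuffle(moderate)
    lottery_winners = set(moderate[: len(moderate) // 2])

    assignment = {}
    for child, tier in children.items():
        if tier == "high":
            assignment[child] = "enroll"          # must get the program now
        elif tier == "moderate":
            assignment[child] = (
                "treatment" if child in lottery_winners else "comparison"
            )
        else:
            assignment[child] = "not eligible"
    return assignment


# Hypothetical population: 10 highly, 40 moderately and 10 not vulnerable children
tiers = ["high"] * 10 + ["moderate"] * 40 + ["none"] * 10
children = {f"child_{i:02d}": tier for i, tier in enumerate(tiers)}
assignment = assign_by_vulnerability(children)
counts = Counter(assignment.values())
print(counts)
```

The impact estimate then applies to the moderately vulnerable group, since that is where the treatment and comparison groups come from.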
RANDOMIZATION OPPORTUNITIES: VARY TREATMENT
Vary intensity (randomize across communities)
• Malaria information campaign: 100 villages
• Malaria information campaign + SMS reminders: 100 villages
Additional impact of SMS reminders?
Vary nature (randomize across communities)
• Radio campaign: 100 villages
• Newspaper campaign: 100 villages
Which approach has greater impact?
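A sketch of assigning villages to treatment arms of equal size, here for the information campaign with and without SMS reminders. The village IDs and the helper `assign_arms` are hypothetical.

```python
import random
from collections import Counter


def assign_arms(units, arms, seed=0):
    """Shuffle units, then deal them out in turn so arm sizes differ by at most one."""
    order = list(units)
    random.Random(seed).shuffle(order)
    return {unit: arms[i % len(arms)] for i, unit in enumerate(order)}


villages = [f"village_{i:03d}" for i in range(200)]  # hypothetical village IDs
arms = ["information campaign", "information campaign + SMS reminders"]
assignment = assign_arms(villages, arms)
counts = Counter(assignment.values())
print(counts)
```

Because every village receives some version of the campaign, comparing the two arms answers the "additional impact of SMS reminders" question without withholding the program from anyone.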
UNIT OF RANDOMIZATION
At what level should I randomize?
• Individual
• Household
• Clinic
• Community
Considerations
• Political feasibility of randomization at individual level
• Spillovers within groups
• Implementation capacity: One clinic administering different treatments
UNIT OF RANDOMIZATION
Bigger unit = Bigger study (because of intra-community correlation)
• Individual randomization: 630 participants (315 treatment, 315 control)
• Clinic-level randomization: 150 clinics (75 treatment, 75 control), 3,000 participants!
Number of units you randomize matters more than total number of participants
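A back-of-the-envelope sketch of what these numbers imply, using the standard design-effect formula for cluster randomization, 1 + (m - 1) × ICC, where m is the number of participants per cluster and ICC is the intra-cluster correlation. It assumes that the 3,000 participants belong to the 150-clinic design and that clinics are equal in size; the slides do not state the correlation, so the value below is only what the figures would imply under those assumptions.

```python
def design_effect(cluster_size, icc):
    """Sample-size inflation from randomizing clusters instead of individuals."""
    return 1 + (cluster_size - 1) * icc


n_individual = 630   # from the slide: participants under individual randomization
n_clinics = 150      # from the slide: clinics under clinic-level randomization
n_clustered = 3000   # from the slide; ASSUMED to be the clinic-level total

# ASSUMPTION: equal-sized clinics
cluster_size = n_clustered / n_clinics

# Inflation implied by the two sample sizes
implied_deff = n_clustered / n_individual

# Intra-cluster correlation that would produce this inflation
implied_icc = (implied_deff - 1) / (cluster_size - 1)

print(f"Participants per clinic: {cluster_size:.0f}")
print(f"Implied design effect: {implied_deff:.2f}")
print(f"Implied intra-cluster correlation: {implied_icc:.3f}")
```

The more alike people within the same clinic are, the larger the correlation and the more participants a clinic-level study needs.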
WHAT IF RANDOMIZATION IS IMPOSSIBLE?
Think again: It often is possible on some level, and it’s the best way to get a clear measure of impact
In some situations, it is not possible
• Evaluate the effect of a national health policy
• Interventions in the past
• Life-saving vaccination (volunteers for control group? “I volunteer!”)
Alternative methods available, compelling in some circumstances
A COUPLE OF LAST BIG POINTS
WE SHOULD DO AN EVALUATION IF A PROGRAM IS…
1. Innovative: This approach hasn’t been used before
2. Replicable: The program may be scaled up
3. Strategically relevant: The program could involve significant resources or affect many people
4. Untested: We don’t know how well it works
5. Influential: The results will be used to make a policy decision
Adapted from Impact Evaluation in Practice
WHAT MAKES A GREAT IMPACT EVALUATION QUESTION?
1. Cause-effect
• YES: “What is the impact of ______ on ______?”
• NO: “Who is taking up our antenatal care program?”
2. Prospective (future-looking)
• YES: “What is the impact of this program we are about to roll out?”
• NO: “What was the impact of a program we rolled out 5 years ago?”
KEY CONCLUSIONS
• Impact evaluation tells us if our programs are working
• Randomization of treatment leads to unbiased estimate of impact
• Other methods rely on more assumptions
• Lots of opportunities for randomization
  No withholding of benefits
  Staggered roll-out
  Varied treatment
Thank you!