What is impact evaluation?

EVIDENCE AND PROSPECTS FOR A HEALTHY NIGERIA

March 31, 2016

IMPACT EVALUATION

David Evans, World Bank, March 31, 2016

WHAT IS IMPACT EVALUATION?
“An impact evaluation assesses changes in the well-being of individuals, households, communities or firms that can be attributed to a particular project, program or policy.”

-World Bank

OBJECTIVE
Evaluate the causal impact of a program or an intervention on some outcome

Examples
• How much does exposure to a television soap opera affect HIV/AIDS awareness and testing?
• How much do monetary incentives reduce turnover among midwives? What about non-monetary incentives?
• How much do a Quality Improvement Plan and coaching increase the quality of care at primary health care facilities?
• How much does providing improved housing for midwives reduce turnover in rural areas?

WHY EVALUATE?
1. Evaluation helps to learn whether programs are actually achieving their objectives.

2. Evaluation helps to improve program effectiveness.

3. Evaluation helps to garner resources for scale-up.

WHAT DO WE NEED?

A COUNTERFACTUAL
What would have happened in the absence of the program?

COUNTERFACTUAL CRITERIA
Treated & comparison groups
1. Have identical average characteristics (observed & unobserved)
2. The only difference is the treatment
3. Therefore the only reason for any difference in outcomes is the treatment

Key question: What would the participant look like if she hadn’t received the program?

PERFECT EXPERIMENT
Identify target beneficiaries
Clone them!
• Identical on the outside (observable)
• Identical on the inside (unobservable)

Kami and Tami:
“We’re both five-year-old puppets”
“We both love to take up new health interventions!”

Images © Sesame Workshop

PERFECT EXPERIMENT
Give the intervention to one set of clones

(Image: Kami and Tami)

PERFECT EXPERIMENT
Observe some time later

Because the groups are identical (inside & out), the difference is due to the bednets!

(Image: Kami and Tami)

FINDING A GREAT CONTROL GROUP
What would the participant look like if she weren’t in the program?

Room For Improvement (RFI) Control Groups
• Before-After
• Participants vs Non-Participants

RFI: BEFORE-AFTER
Before bednets: 6 malaria episodes in 6 months
After bednets: 2 malaria episodes in 6 months

What else might be going on besides the bednets?
• Seasonal differences
• Rising incomes: Households invest in other measures

RFI: BEFORE-AFTER
Important to monitor before-after
• Monitoring systems tell us if things are moving in the right direction

Insufficient to show impact of program
• Too many factors changing over time
• Example of cash transfers in Nicaragua!

Counterfactual: What would have happened in the absence of the project, with everything else the same?
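The gap between a before-after comparison and the counterfactual can be sketched with simple arithmetic. The 6 and 2 episodes come from the bednet slide; the counterfactual value of 4 is a hypothetical number chosen only for illustration (for example, what the dry season alone might have produced), not a figure from the presentation.

```python
# Malaria episodes per 6 months, from the before-after slide.
episodes_before = 6
episodes_after = 2

# HYPOTHETICAL: episodes the same households would have had after the
# same period WITHOUT bednets (e.g., because of seasonal differences).
# This value is invented for illustration only.
counterfactual_after = 4


def before_after_estimate(before, after):
    """Naive estimate: attributes the whole change over time to the program."""
    return after - before


def impact_vs_counterfactual(after, counterfactual):
    """Impact measured against what would have happened without the program."""
    return after - counterfactual


naive = before_after_estimate(episodes_before, episodes_after)
impact = impact_vs_counterfactual(episodes_after, counterfactual_after)
bias = naive - impact  # part of the change that is not due to bednets

print("Before-after estimate:", naive)
print("Impact vs counterfactual:", impact)
print("Bias from other factors:", bias)
```

Under this assumed counterfactual, half of the observed drop would have happened anyway. In practice the counterfactual is never observed directly, which is why a comparison group is needed.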

RFI: PARTICIPANTS VS NON-PARTICIPANTS
Compare recipients of a program to
• People who were not eligible for the program
• People who chose not to enroll in the program

Example: Complications in childbirth (home births vs clinic births)
Impact of clinic births? What else might explain the difference?

Observable differences
• Income
• Education

Unobservable differences
• Heard rumor about hospitals
• Neighbor available to care for other children

RFI: Participants vs Non-Participants
(Image: Kami and Grover)
© Muppet photos copyright Sesame Workshop

No way to know how much of the difference is because of the clinic

RFI: Participants vs Non-Participants
Example: Complications in childbirth (home births vs clinic births)
• Impact of clinic births
• Other factors!

http://www.monsterexchange.org/monsterpictures/31046macleangreenblob.jpg

SELECTION BIAS
People who choose to join the program are different!
If we cannot account completely for those differences in our data…
• We usually cannot
• How do you capture attitudes toward health systems? Initiative?
…then our comparison will not show the true impact of the program
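A small simulation can make the selection problem concrete. This is a sketch under invented assumptions: an unobserved trait called initiative makes mothers both more likely to choose a clinic birth and less likely to have complications. The risk levels, the effect size of -0.05, and all names are hypothetical, not taken from any study.

```python
import random


def naive_clinic_vs_home(n=40000, true_effect=-0.05, seed=1):
    """Difference in complication rates between self-selected groups."""
    rng = random.Random(seed)
    clinic, home = [], []
    for _ in range(n):
        # Unobserved trait (hypothetical), uniform on [0, 1).
        initiative = rng.random()
        # More initiative -> more likely to choose a clinic birth...
        chooses_clinic = rng.random() < initiative
        # ...and lower baseline risk wherever the birth takes place.
        risk = 0.30 - 0.20 * initiative
        if chooses_clinic:
            risk += true_effect  # the true causal effect of the clinic
        complication = 1 if rng.random() < risk else 0
        (clinic if chooses_clinic else home).append(complication)
    return sum(clinic) / len(clinic) - sum(home) / len(home)


TRUE_EFFECT = -0.05
naive_difference = naive_clinic_vs_home(true_effect=TRUE_EFFECT)

print("True effect of clinic births:", TRUE_EFFECT)
print("Participants vs non-participants:", round(naive_difference, 3))
```

In this setup the naive comparison overstates the benefit of clinic births, because the clinic group started out with lower risk. With real data the trait is unobserved, so the size of the bias cannot be computed this way.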


WHAT SHOULD WE DO?

Gold standard: Randomized experimental design

RANDOMIZED EXPERIMENTAL DESIGN
• Randomly assign potential beneficiaries to be in the treatment or comparison group
• Treatment and comparison have the same characteristics (observed and unobserved), on average
• Any difference in outcomes is due to treatment

Randomization with two doesn’t work!
But differences average out in a big sample
• On average, same number of Kamis and Grovers
• Observable AND unobservable

Result: Measure true impact of program
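The claim that differences average out in a big sample can be illustrated with a short simulation. It is only a sketch: the unobserved trait, the risk levels, and the effect size of -0.05 are all hypothetical values chosen for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomized_trial(n=40000, true_effect=-0.05, seed=2):
    """Assign treatment by coin flip and compare average outcomes."""
    rng = random.Random(seed)
    outcomes = {"treatment": [], "comparison": []}
    traits = {"treatment": [], "comparison": []}
    for _ in range(n):
        # Unobserved trait (hypothetical) that affects the outcome.
        initiative = rng.random()
        # Random assignment: the coin flip ignores the trait entirely.
        group = "treatment" if rng.random() < 0.5 else "comparison"
        risk = 0.30 - 0.20 * initiative
        if group == "treatment":
            risk += true_effect
        outcomes[group].append(1 if rng.random() < risk else 0)
        traits[group].append(initiative)
    estimate = mean(outcomes["treatment"]) - mean(outcomes["comparison"])
    imbalance = mean(traits["treatment"]) - mean(traits["comparison"])
    return estimate, imbalance


TRUE_EFFECT = -0.05
estimate, imbalance = randomized_trial(true_effect=TRUE_EFFECT)

print("True effect:", TRUE_EFFECT)
print("Estimated effect:", round(estimate, 3))
print("Difference in unobserved trait:", round(imbalance, 3))
```

With a large sample, the two groups end up with nearly the same average trait even though nobody measured it, and the estimate lands close to the true effect. Rerunning with a small n (a few dozen) typically gives a much noisier estimate, which is the point of "randomization with two doesn’t work."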

RANDOMIZED EXPERIMENTAL DESIGN
“We don’t even look similar!”

(Images: Treatment and Comparison groups)

RANDOM ASSIGNMENT
Random sample
• Gather data from random sample of population
• No guarantee of unbiased impact measure

Random assignment
• Randomly assign program
• Unbiased impact measure!

(Images: Treatment and Control groups)

CAN WE RANDOMIZE?
• Randomization does not mean denying people the benefits of the project
• Usually there are existing constraints within project implementation that allow randomization
• Randomization is the fairest way to allocate treatment
  • Tanzania CCT: Randomized across needy villages
  • Nigeria Quality Improvement: Lottery among eligible facilities

RANDOMIZATION OPPORTUNITIES: STAGGERED ROLL-OUT OF PROGRAM
Jan 2013: Roll-out to 200 clinics
July 2013: Roll-out to 200 more clinics
Jan 2014: Roll-out to 400 more clinics

• Randomize the order in which clinics receive program
• Compare Jan 2013 group to Jan 2014 group at end of first year
• Example: Mexico parent health training – staggered roll-out among vulnerable communities
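Randomizing the roll-out order can be sketched as a shuffle followed by a split into waves. The wave sizes and dates follow the slide; the clinic identifiers, the function name, and the seed are hypothetical.

```python
import random


def assign_rollout_waves(clinic_ids, wave_sizes, seed=2013):
    """Shuffle clinics and cut the shuffled list into consecutive waves."""
    if sum(wave_sizes) != len(clinic_ids):
        raise ValueError("wave sizes must add up to the number of clinics")
    rng = random.Random(seed)  # fixed seed so the lottery can be audited
    order = list(clinic_ids)
    rng.shuffle(order)
    waves, start = [], 0
    for size in wave_sizes:
        waves.append(order[start:start + size])
        start += size
    return waves


# Hypothetical clinic identifiers; 800 clinics in total, as on the slide.
clinics = [f"clinic_{i:03d}" for i in range(800)]
wave_dates = ["Jan 2013", "July 2013", "Jan 2014"]
waves = assign_rollout_waves(clinics, [200, 200, 400])

for date, wave in zip(wave_dates, waves):
    print(date, len(wave), "clinics")

# At the end of the first year, the first wave has had the program and
# the last wave has not started yet, so it serves as the comparison group.
treatment_group, comparison_group = waves[0], waves[2]
```

Every clinic still receives the program. Only the timing is decided by lottery.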

RANDOMIZATION OPPORTUNITIES: SOME GROUPS MUST GET THE PROGRAM!
• Highly vulnerable: ENROLL
• Moderately vulnerable: RANDOMIZE
• Not vulnerable: SORRY

Example: Program for children in Kenya
• Orphans: Must have program now!
• Randomized among less vulnerable children
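The tiered allocation can be sketched as follows: everyone in the most vulnerable tier is enrolled, the middle tier enters a lottery, and the least vulnerable tier is not enrolled. The tier labels, the child identifiers, and the 50/50 lottery split are assumptions made for illustration.

```python
import random


def allocate_by_vulnerability(children, seed=0):
    """Map each child id to an allocation based on vulnerability tier.

    children: list of (child_id, tier), tier in {"high", "moderate", "low"}.
    """
    rng = random.Random(seed)
    allocation = {}
    lottery_pool = []
    for child_id, tier in children:
        if tier == "high":
            allocation[child_id] = "enroll"        # must get the program
        elif tier == "moderate":
            lottery_pool.append(child_id)          # decided by lottery
        else:
            allocation[child_id] = "not enrolled"  # outside the program
    rng.shuffle(lottery_pool)
    half = len(lottery_pool) // 2  # ASSUMPTION: equal-sized lottery arms
    for position, child_id in enumerate(lottery_pool):
        allocation[child_id] = "treatment" if position < half else "comparison"
    return allocation


# Hypothetical example data.
children = [
    ("c1", "high"), ("c2", "moderate"), ("c3", "moderate"),
    ("c4", "low"), ("c5", "moderate"), ("c6", "moderate"),
]
allocation = allocate_by_vulnerability(children)
print(allocation)
```

The impact estimate from such a design applies to the moderately vulnerable group only, since that is the only tier where treatment and comparison groups exist.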

RANDOMIZATION OPPORTUNITIES: VARY TREATMENT
Vary intensity (randomize across communities)
• Malaria information campaign: 100 villages
• Malaria information campaign + SMS reminders: 100 villages
Additional impact of SMS reminders?

Vary nature (randomize across communities)
• Radio campaign: 100 villages
• Newspaper campaign: 100 villages
Which approach has greater impact?

UNIT OF RANDOMIZATION

At what level should I randomize?
• Individual
• Household
• Clinic
• Community

Considerations
• Political feasibility of randomization at individual level
• Spillovers within groups
• Implementation capacity: One clinic administering different treatments

UNIT OF RANDOMIZATION
Bigger unit = Bigger study (because of intra-community correlation)
• Individual randomization: 630 participants (315 treatment, 315 control)
• Clinic-level randomization: 150 clinics (75 treatment, 75 control), 3,000 participants!

The number of units you randomize matters more than the total number of participants
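The jump from 630 to roughly 3,000 participants is what the standard design-effect formula, 1 + (m - 1) × ICC, would produce. The slide does not state the cluster size or the intra-cluster correlation, so the values below (20 participants per clinic, ICC of 0.2) are assumptions picked because they roughly reproduce the slide's numbers.

```python
import math


def design_effect(cluster_size, icc):
    """Sample-size inflation factor when randomizing whole clusters."""
    return 1 + (cluster_size - 1) * icc


def clusters_needed(n_individual, cluster_size, icc):
    """Clusters needed to match the power of individual randomization."""
    total_participants = n_individual * design_effect(cluster_size, icc)
    return math.ceil(total_participants / cluster_size)


n_individual = 630  # from the slide: individual-level randomization
cluster_size = 20   # ASSUMPTION: participants surveyed per clinic
icc = 0.2           # ASSUMPTION: intra-cluster correlation

deff = design_effect(cluster_size, icc)
clinics = clusters_needed(n_individual, cluster_size, icc)

print("Design effect:", round(deff, 2))
print("Clinics needed:", clinics)
print("Total participants:", clinics * cluster_size)
```

Under these assumptions the result is close to the slide's 150 clinics and 3,000 participants. A lower ICC or smaller clusters would shrink the required study, so the real inputs should come from pilot data or comparable studies.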

WHAT IF RANDOMIZATION IS IMPOSSIBLE?
Think again: It often is possible on some level, and it’s the best way to get a clear measure of impact

In some situations, it is not possible
• Evaluating the effect of a national health policy
• Interventions in the past
• Life-saving vaccination (volunteers for control group?)

Alternative methods are available and compelling in some circumstances

I volunteer!

A COUPLE OF LAST BIG POINTS

WE SHOULD DO AN EVALUATION IF A PROGRAM IS…
1. Innovative: This approach hasn’t been used before

2. Replicable: The program may be scaled up

3. Strategically relevant: The program could involve significant resources or affect many people

4. Untested: We don’t know how well it works

5. Influential: The results will be used to make a policy decision

Adapted from Impact Evaluation in Practice

WHAT MAKES A GREAT IMPACT EVALUATION QUESTION?
1. Cause-effect
• YES: “What is the impact of ______ on ______?”
• NO: “Who is taking up our antenatal care program?”

2. Prospective (future-looking)
• YES: “What is the impact of this program we are about to roll out?”
• NO: “What was the impact of a program we rolled out 5 years ago?”

KEY CONCLUSIONS
• Impact evaluation tells us if our programs are working
• Randomization of treatment leads to unbiased estimate of impact
• Other methods rely on more assumptions
• Lots of opportunities for randomization
  • No withholding of benefits
  • Staggered roll-out
  • Varied treatment

Thank you!