POLS 7170X Master’s Seminar Program/Policy Evaluation

POLS 7170X Master’s Seminar Program/Policy Evaluation Class 5 Brooklyn College – CUNY Shang E. Ha


POLS 7170X Master’s Seminar Program/Policy Evaluation. Class 5. Brooklyn College – CUNY. Shang E. Ha. How to Measure Impact? Impact assessment: to determine what effects programs have on their intended outcomes and whether perhaps there are important unintended effects.

Transcript of POLS 7170X Master’s Seminar Program/Policy Evaluation

Page 1: POLS 7170X Master’s Seminar Program/Policy Evaluation

POLS 7170X Master’s Seminar Program/Policy Evaluation

Class 5
Brooklyn College – CUNY
Shang E. Ha

Page 2: POLS 7170X Master’s Seminar Program/Policy Evaluation

How to Measure Impact?

Impact Assessment: to determine what effects programs have on their intended outcomes and whether perhaps there are important unintended effects

What would have happened in the absence of the program?

Since the counterfactual is not observable, the key goal of all impact evaluation methods is to construct or “mimic” the counterfactual

Page 3: POLS 7170X Master’s Seminar Program/Policy Evaluation

Constructing the Counterfactual

The counterfactual is often constructed by selecting a group not affected by the program

Randomized: Use random assignment of the program to create a control group that mimics the counterfactual

Non-randomized: Argue that a certain excluded group mimics the counterfactual

Page 4: POLS 7170X Master’s Seminar Program/Policy Evaluation

Types of Impact Evaluation Methods

Randomized Evaluations, also known as:
  Random Assignment Studies
  Randomized Field Trials
  Social Experiments
  Randomized Controlled Trials (RCTs)
  Randomized Controlled Experiments

The “gold standard” research design for assessing causal effects

Page 5: POLS 7170X Master’s Seminar Program/Policy Evaluation

Types of Impact Evaluation Methods (Cont.)

Non-experimental or Quasi-experimental Methods
  Pre-Post
  Differences-in-Differences
  Statistical Matching
  Instrumental Variables
  Regression Discontinuity
  Interrupted Time Series

These methods lack the random assignment to conditions that is essential for true experiments

Page 6: POLS 7170X Master’s Seminar Program/Policy Evaluation

Randomize or not?

Designs using nonrandomized controls universally yield less convincing results than well-executed randomized field experiments

The randomized field experiment is always the optimal choice for impact assessment

Nevertheless, quasi-experiments are useful for impact assessment when it is impractical or impossible to conduct a true randomized experiment

Page 7: POLS 7170X Master’s Seminar Program/Policy Evaluation

Randomized trials

Simple case
  Take a sample of program applicants
  Randomly assign them to either
    Treatment group – is offered treatment
    Or Control group – not allowed to receive treatment (during the evaluation period)
    [Or Placebo group – receives an innocuous treatment]

Page 8: POLS 7170X Master’s Seminar Program/Policy Evaluation

Randomized trials

The critical element in estimating program effects by randomized field experiment is configuring a control group that does not participate in the program but is equivalent to the group that does

  Identical composition: Intervention and control groups contain the same mixes of persons or other units in terms of their program-related and outcome-related characteristics

  Identical predispositions: Intervention and control groups are equally disposed toward the project and equally likely, without intervention, to attain any given outcome status

  Identical experiences: Over the time of observation, intervention and control groups experience the same time-related processes – maturation, interfering events, etc.

Page 9: POLS 7170X Master’s Seminar Program/Policy Evaluation

Randomized trials

Even though target units are assigned randomly, the intervention and control groups will never be exactly the same

But if the random assignment were made over and over, those chance fluctuations would average out to zero

Statistical methods are used to guide a judgment about whether a specific difference in outcome is likely to have occurred simply by chance or is more likely to represent the effect of the intervention
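The "over and over" argument can be illustrated with a small simulation. This is only a sketch under assumed conditions: the 200 baseline scores are made-up data, and `baseline_gap` is a hypothetical helper name, not something from the course materials.

```python
import random
import statistics

def baseline_gap(values, rng):
    """Randomly split the units in half; return mean(treatment) - mean(control)."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return statistics.mean(shuffled[:half]) - statistics.mean(shuffled[half:])

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical pre-program scores for 200 units (illustrative data only)
scores = [rng.gauss(50, 10) for _ in range(200)]

# Repeat the random assignment many times and record the gap each time
gaps = [baseline_gap(scores, rng) for _ in range(2000)]

print("gap in one assignment:", round(gaps[0], 2))
print("average gap over 2000 assignments:", round(statistics.mean(gaps), 2))
```

Any single assignment shows some gap between the groups, but the average gap across repeated assignments sits close to zero, which is the sense in which randomization is unbiased.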

Page 10: POLS 7170X Master’s Seminar Program/Policy Evaluation

Units of analysis

The units on which outcome measures are taken in an impact assessment are called the units of analysis

The choice of the units of analysis should be based on the nature of the intervention and the target units to which it is delivered

Cf. units of randomization

Page 11: POLS 7170X Master’s Seminar Program/Policy Evaluation

The logic of randomized experiments

Outcome Measures

                   Before Program   After Program   Difference
Treatment Group    T1               T2              T = T2 - T1
Control Group      C1               C2              C = C2 - C1

* Program effect = T - C
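As a worked illustration of the program effect above, the arithmetic can be written out directly. The function name `program_effect` and the four group means are hypothetical, chosen only to make the calculation concrete.

```python
def program_effect(t1, t2, c1, c2):
    """Program effect = (T2 - T1) - (C2 - C1), using group mean outcomes."""
    t_change = t2 - t1  # T: change in the treatment group
    c_change = c2 - c1  # C: change in the control group
    return t_change - c_change

# Hypothetical group means before and after the program
effect = program_effect(t1=40.0, t2=55.0, c1=41.0, c2=46.0)
print(effect)  # 10.0
```

Here the treatment group improves by 15 points and the control group by 5, so 10 points are attributed to the program and 5 to whatever would have happened anyway.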

Page 12: POLS 7170X Master’s Seminar Program/Policy Evaluation

Key steps in conducting a randomized experiment

  Design the study carefully
  Randomly assign people to treatment or control
  Collect baseline data
  Verify that assignment looks random
  Monitor the process so that the integrity of the experiment is not compromised
  Collect follow-up data for both the treatment and control groups in identical ways
  Estimate program impacts by comparing mean outcomes of the treatment group vs. mean outcomes of the control group
  Assess whether program impacts are statistically significant and practically significant
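Several of these steps (random assignment, a balance check, the difference in mean outcomes, and a significance check) can be sketched in a few lines. Everything here is assumed for illustration: the applicants and outcomes are simulated, `TRUE_EFFECT` is an invented value, and the permutation test is one possible way to judge statistical significance, not necessarily the one used in the course.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical applicants with a baseline score (illustrative data only)
applicants = [{"id": i, "baseline": rng.gauss(50, 10)} for i in range(200)]

# Randomly assign half to treatment and half to control
rng.shuffle(applicants)
treatment, control = applicants[:100], applicants[100:]

# Verify that assignment looks random: baseline means should be close
balance_gap = (statistics.mean(a["baseline"] for a in treatment)
               - statistics.mean(a["baseline"] for a in control))

# Simulated follow-up outcomes with an assumed true effect of 5 points
TRUE_EFFECT = 5.0
for a in treatment:
    a["outcome"] = a["baseline"] + rng.gauss(0, 5) + TRUE_EFFECT
for a in control:
    a["outcome"] = a["baseline"] + rng.gauss(0, 5)

def mean_diff(treated, untreated):
    """Difference between the mean outcomes of two groups."""
    return statistics.mean(treated) - statistics.mean(untreated)

t_out = [a["outcome"] for a in treatment]
c_out = [a["outcome"] for a in control]
estimate = mean_diff(t_out, c_out)

# Permutation test: how often does a random relabeling of the same outcomes
# produce a difference at least as large as the one observed?
pooled = t_out + c_out
N_PERM = 2000
extreme = 0
for _ in range(N_PERM):
    rng.shuffle(pooled)
    if abs(mean_diff(pooled[:100], pooled[100:])) >= abs(estimate):
        extreme += 1
p_value = extreme / N_PERM

print("baseline gap:", round(balance_gap, 2))
print("estimated impact:", round(estimate, 2))
print("permutation p-value:", p_value)
```

A small p-value speaks only to statistical significance. Whether an impact of that size matters in practice is a separate judgment.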

Page 13: POLS 7170X Master’s Seminar Program/Policy Evaluation

Key advantages of experiments

Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the treatment rather than to other factors

Relative to results from non-experimental studies, results from experiments are:
  Less subject to methodological debates;
  More likely to be convincing to program funders and/or policymakers

Page 14: POLS 7170X Master’s Seminar Program/Policy Evaluation

Limitations of experiments

Ethical considerations

Time and cost

External validity?

Page 15: POLS 7170X Master’s Seminar Program/Policy Evaluation

Validity

A tool to assess the credibility of a study

Internal validity: related to the ability to draw causal inference, i.e., can we attribute our impact estimates to the program and not to something else?

External validity: related to the ability to generalize to other settings of interest, i.e., can we generalize our impact estimates from this program to other populations, time periods, countries, etc.?

Page 16: POLS 7170X Master’s Seminar Program/Policy Evaluation

Example 1 [Exhibit 8-B]

The Child and Adolescent Trial for Cardiovascular Health (CATCH)

96 elementary schools (CA, LA, MN, TX); 56 intervention sites and 40 control sites

Intervention sites: training sessions for the food service staffs informing them of the rationale for nutritionally balanced school menus and providing recipes and menus that would achieve that goal

Outcomes were measured by 24-hour dietary intake interviews with children at baseline and at follow-up. Children in the intervention schools were significantly lower than children in control schools in total food intake and in calories derived from fat and saturated fat, but no different with respect to intake of cholesterol or sodium

Page 17: POLS 7170X Master’s Seminar Program/Policy Evaluation

Example 2 [Exhibit 8-E]

The Minnesota Family Investment Program (MFIP)

Problem with Aid to Families with Dependent Children (AFDC): it does not encourage recipients to leave the welfare rolls and seek employment

Conduct an experiment that would encourage AFDC clients to seek employment and allow them to receive greater income than AFDC would allow if they become employed

Three conditions:
  An MFIP intervention group receiving more generous benefits and mandatory participation in employment and training activities
  An MFIP intervention group receiving only the more generous benefits and not the mandatory employment and training activities
  A control group that continued to receive the old AFDC benefits and services

MFIP intervention families were more likely to be employed and, when employed, had larger incomes than control families

Those in the intervention group receiving both MFIP benefits and mandatory employment and training activities were more often employed and earned more than the intervention group receiving only the MFIP benefits

Page 18: POLS 7170X Master’s Seminar Program/Policy Evaluation

Some variations on the basics

Assigning to multiple treatment groups [Example – Education Program]

Problems
  Large class size
  Children at different levels of learning
  Teachers often absent

Possible remedies
  More teachers to split classes
  Streaming of pupils into different achievement bands
  Make teachers more accountable, so that they may show up more

Page 19: POLS 7170X Master’s Seminar Program/Policy Evaluation

Solutions

Do smaller class sizes improve test scores?
  Add new teachers

Do accountable teachers get better results?
  New teachers more accountable

Does streaming improve test scores?
  Divide some classes by achievement
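Assigning units to more than one treatment group works the same way as a two-group design: shuffle, then deal units out to the arms. The sketch below assumes 120 schools and three arm labels, all of them hypothetical and not taken from the study behind this example.

```python
import random
from collections import Counter

def assign_arms(units, arms, seed=0):
    """Shuffle units, then deal them out to arms in turn,
    so that group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: arms[i % len(arms)] for i, unit in enumerate(shuffled)}

# Hypothetical schools and arm labels (illustrative only)
schools = [f"school_{i:03d}" for i in range(120)]
arms = ["control", "extra_teacher", "extra_teacher_plus_streaming"]

assignment = assign_arms(schools, arms)
print(Counter(assignment.values()))
```

Comparing each treatment arm with the control arm, and the treatment arms with each other, is what lets one study address several of the questions above.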

Page 20: POLS 7170X Master’s Seminar Program/Policy Evaluation

How to Randomize?

Lottery

Phase-In

Page 21: POLS 7170X Master’s Seminar Program/Policy Evaluation

Lottery

An example: a clinical trial
  Take 1000 people and give half of them the new drug

Can we simply apply this approach to social programs?
  Need to consider the constraints

Page 22: POLS 7170X Master’s Seminar Program/Policy Evaluation

Resource Constraints

Many programs have limited resources

Many more eligible recipients than resources will allow services for

Quite common in practice:
  Training for entrepreneurs or farmers
  School vouchers

How do you allocate these resources?
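One way to allocate scarce slots is a lottery among eligible applicants, which also produces a control group as a by-product. A minimal sketch, assuming 300 eligible applicants and 100 slots (both numbers and the `run_lottery` helper are hypothetical):

```python
import random

def run_lottery(eligible, n_slots, seed=None):
    """Draw n_slots winners at random; everyone else forms the control group."""
    rng = random.Random(seed)
    winners = set(rng.sample(eligible, n_slots))
    treatment = [p for p in eligible if p in winners]
    control = [p for p in eligible if p not in winners]
    return treatment, control

# Hypothetical oversubscribed program: 300 eligible applicants, 100 slots
applicants = [f"applicant_{i}" for i in range(300)]
treatment, control = run_lottery(applicants, n_slots=100, seed=1)

print(len(treatment), "offered the program;", len(control), "in the control group")
```

Recording the seed, or running the draw publicly, makes the allocation auditable.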

Page 23: POLS 7170X Master’s Seminar Program/Policy Evaluation

Advantages of Lottery

Simple, common, transparent, and flexible

Participants know who the “winners” and “losers” are

There is no a priori reason to discriminate

Perceived as fair

Page 24: POLS 7170X Master’s Seminar Program/Policy Evaluation

Phase-In

Everyone gets the program eventually
  “In five years, we will cover 500 schools, 100 per year”

Advantages
  Everyone gets treatment eventually

Concerns
  Can complicate estimating long-run effects
  Care required with phase-in windows
  Do expectations change actions/behavior today?
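A phase-in design can be randomized by drawing the order in which units enter the program. The sketch below uses the 500 schools and 100 per year from the example above. The school labels and the `phase_in_schedule` helper are hypothetical.

```python
import random

def phase_in_schedule(units, per_year, seed=0):
    """Randomly order units, then give each a start year (1, 2, ...) in blocks of per_year."""
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)
    return {unit: 1 + i // per_year for i, unit in enumerate(order)}

# 500 schools covered over five years, 100 per year
schools = [f"school_{i:03d}" for i in range(500)]
schedule = phase_in_schedule(schools, per_year=100)

# During year 2, schools that started in years 1 and 2 are treated;
# schools scheduled for years 3 to 5 serve as the comparison group
treated_by_year_2 = [s for s, year in schedule.items() if year <= 2]
not_yet_treated = [s for s, year in schedule.items() if year > 2]

print(len(treated_by_year_2), "treated;", len(not_yet_treated), "not yet treated")
```

The comparison group shrinks every year and is gone once the last cohort starts, which is one reason long-run effects are hard to estimate under phase-in.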