
Assessment of the Quality of USAID-Funded Evaluations
Education Sector 2013-16

Evidence Brief

This Brief provides highlights from an evaluation quality assessment commissioned by USAID’s Office of Education. The study examined 92 evaluation reports, published between 2013 and 2016, that are relevant to the Agency’s 2011 Education Strategy.

Study Objectives

1. Develop and apply an evaluation quality assessment framework to USAID-funded evaluations in the education sector.
2. Generate findings to inform the development of future guidance to improve the quality of USAID-funded evaluations.

Study Limitations

The study considered only the information provided in the evaluation reports, without further investigation. Many evaluators likely did not report on some aspects of the evaluation process, such as inter-rater reliability, if reporting on them was not specified as a requirement. The study therefore does not draw conclusions about how evaluations were implemented, only about the characteristics of evaluation quality found in the published reports. Finally, value for money was not considered, as evaluation cost information was not available.

What is Evaluation Quality?

The tool used in the study to assess evaluation quality was based on the framework introduced in the guidance note Assessing the Strength of Evidence in the Education Sector, developed by DFID and produced by Building Evidence in Education (BE2). BE2 is a donor partnership co-led by USAID, DFID, the World Bank, and the UN. The BE2 guidance note recommended seven principles of quality that are applicable to evaluations of all types.

Principles of Quality

• Conceptual framing
• Openness and transparency
• Cultural appropriateness
• Robustness of the methodology
• Validity
• Reliability
• Cogency

Full report available at: http://pdf.usaid.gov/pdf_docs/pa00srw1.pdf

This Evidence Brief is made possible by the support of the American People through the United States Agency for International Development (USAID). It was produced for review by USAID and was prepared by Management Systems International, A Tetra Tech Company, for the E3 Analytics and Evaluation Project and the Reading and Access Evaluation Project.


Review Process

The assessment built on the Office of Education’s strong partnerships by crowdsourcing the evaluation review to volunteer experts in the international education and evaluation community. After receiving training and guidance on applying the assessment tool, two experts reviewed each evaluation report and compared their responses before reaching a consensus on each item in the tool.

This process allowed the Office of Education to:

• Gather feedback on the utility of the quality assessment tool from experts and implementing partners.
• Disseminate the BE2 framework to the international education community.

36 experts from 21 organizations conducted the evaluation reviews.
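The pairing-and-consensus step can be illustrated with a short Python sketch. This is only an illustration under assumed conventions, not a tool used in the study; the item identifiers and responses below are hypothetical placeholders.

# Illustrative sketch (assumption, not the study's software): flag tool items on which
# two independent reviewers disagree, so the pair can discuss them and record a consensus.
def items_needing_consensus(reviewer_a: dict, reviewer_b: dict) -> list:
    """Return the item numbers where the two reviewers gave different responses."""
    return [item for item in reviewer_a if reviewer_a[item] != reviewer_b.get(item)]

# Hypothetical responses keyed by item number from the assessment tool.
reviewer_a = {1: "yes", 2: "no", 3: "partial"}
reviewer_b = {1: "yes", 2: "yes", 3: "partial"}
print(items_needing_consensus(reviewer_a, reviewer_b))  # [2]: item 2 needs discussion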

Assessment Findings

Thirteen percent of evaluations met adequacy standards on all seven principles of quality, while 8% of evaluations were found to be inadequate on all seven principles. Just over half of evaluations were deemed “adequate” on four of the seven principles of quality. Cogency was scored as “adequate” most frequently among the seven principles.

Figure 1. Percent of Evaluations Scored as “Adequate” on Each Evaluation Quality Principle

Conceptual Framing: 67%
Openness and Transparency: 49%
Robustness of Methodology: 65%
Cultural Appropriateness: 29%
Validity: 41%
Reliability: 37%
Cogency: 75%
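As a minimal sketch of the arithmetic behind the summary statistics above, the following Python snippet rolls principle-level conclusions up to per-evaluation adequacy counts; the evaluation records shown are hypothetical placeholders, not the study’s data.

# Illustrative sketch (hypothetical data): count, for each evaluation, how many of the
# seven principles were scored "adequate", then compute the share meeting all seven.
PRINCIPLES = [
    "Conceptual Framing", "Openness and Transparency", "Cultural Appropriateness",
    "Robustness of Methodology", "Validity", "Reliability", "Cogency",
]

def adequacy_counts(evaluations):
    """Map each evaluation id to the number of principles scored 'adequate'."""
    return {ev["id"]: sum(ev["scores"][p] == "adequate" for p in PRINCIPLES)
            for ev in evaluations}

def share_meeting_all(evaluations):
    counts = adequacy_counts(evaluations)
    return sum(c == len(PRINCIPLES) for c in counts.values()) / len(evaluations)

evals = [
    {"id": "eval-01", "scores": {p: "adequate" for p in PRINCIPLES}},
    {"id": "eval-02", "scores": {**{p: "adequate" for p in PRINCIPLES},
                                 "Cultural Appropriateness": "not adequate"}},
]
print(adequacy_counts(evals))    # {'eval-01': 7, 'eval-02': 6}
print(share_meeting_all(evals))  # 0.5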


While this study and a recent U.S. Government Accountability Office performance audit, Agencies Can Improve the Quality and Dissemination of Program Evaluations, used different methodologies, findings from both suggest that strengthening the validity and reliability of evaluations is hindered by the difficulties of evaluating programs abroad in often challenging environments.

Figure 2. Percent of Evaluations Scored as “Adequate” on the Robustness of Methodology Principle

Figure 3. Percent of Evaluations Scored as “Adequate” on the Principle of Validity

Figure 4. Percent of Evaluations Scored as “Adequate” on the Principle of Reliability

Country income level, crisis and conflict status, and Education Strategy goal were not strongly correlated with whether an evaluation met each quality principle. Evaluation type was found to affect validity, with impact evaluations more often rated as adequate than performance evaluations.

The majority of evaluations were scored as “adequate” on the Cogency and Conceptual Framing principles of quality, while only 29% of evaluations were scored as “adequate” on the Cultural Appropriateness principle. Just under half of the reviewed evaluations were scored as “adequate” on the principle of Openness and Transparency.


Next Steps

The study findings and the assessment tool are intended for USAID Missions, USAID Washington, and USAID partners involved in evaluating USAID-funded education programming. The findings and the assessment tool will be useful for USAID Missions when procuring and managing evaluations, for partners when planning and implementing evaluations, and for the education sector at large when learning about evaluation practice and curating the evidence generated through evaluations.

Recognizing the need for continuous improvement of USAID-funded evaluation practice, the Office of Education planned the following activities to support such improvements:

• Integrate the evaluation quality framework and tool in professional development and technical assistance activities for USAID staff and Missions.
• Disseminate the study findings along with the evaluation quality assessment framework and tool among USAID implementing partners.
• Produce guidance notes for USAID Missions and implementing partners addressing specific weaknesses identified through the study.
• Periodically repeat the review process as a way to engage with practitioners in a dialogue about evaluation practice and curate newly generated evidence.

The assessment tool could be useful for other sectors of international development in helping improve the strength of evidence generated by evaluations.

Conclusions

This study demonstrated the benefits of assessing the quality of USAID-funded evaluations in the education sector using a holistic framework that maps different aspects of an evaluation to seven principles of quality. The findings from this study suggest two key conclusions:

• Performance evaluations were more likely than impact evaluations to fail to address validity aspects such as measurement, internal validity, and external validity. This might reflect substantial differences in the cost of conducting these evaluations, and it might indicate that the emphasis the Agency and other donors have placed on improving the quality of impact evaluations has been successful.
• The quality of evidence generated in the education sector could be improved by strengthening qualitative evaluations and by leveraging their exploratory power to complement the explanatory power of quantitative evaluations, possibly through sequential data collection in mixed-methods evaluations.

Crowdsourcing the evaluation quality assessment used for this study provided an opportunity for the international education community to come together to discuss quality standards for USAID-funded evaluations. Participants appreciated the opportunity to engage in constructive discussions with USAID and other practitioners about evaluation practice in the international education sector.

For more information on the assessment, contact Elena Walls at [email protected]


EVALUATION QUALITY ASSESSMENT TOOL

The evaluation quality assessment tool developed as part of the Assessment of the Quality of USAID-Funded Evaluations in the Education Sector, 2013-2016, was revised upon completion of the review, based on the comments from reviewers. This version reflects these revisions.

Table columns: Principle of Quality; Impact Evaluations; Performance Evaluations (Quantitative / Qualitative); Review Result; Overall Conclusion; Notes/Justification.

CONCEPTUAL FRAMING (overall conclusion: adequate/not adequate)
[1] Are the research/evaluation questions included in the report? (yes/no)
[2] Does the report include research/evaluation hypotheses? (yes/no)
[3] Are the evaluation questions appropriate for the intervention’s conceptual framework (logframe/theory of change/results framework)? (yes/partial/no/not applicable)
[4] Does the report acknowledge/draw upon existing relevant research? (yes/partial/no)
[5] Does the report explain the local context in sufficient detail? (yes/partial/no)

OPENNESS AND TRANSPARENCY (overall conclusion: adequate/not adequate)
[6] Is the report open about study limitations with the implementation of the evaluation, such as issues faced during data collection that might affect the study’s design? (yes/partial/no)
[7] Is the report open about study limitations due to issues with the implementation of the intervention being evaluated? (yes/partial/no/not applicable)
[8] Does the discussion about the findings refer to relevant contextual factors or methodological considerations? (yes/no/not applicable)
[9] Is the report open about potential influence due to the study team composition? (yes/partial/no)

CULTURAL APPROPRIATENESS (overall conclusion: adequate/not adequate)
[10] Does the report list steps taken to ensure that study questions and methodology are informed by local stakeholders, are culturally relevant, and are contextually appropriate? (yes/no)
[11] Does the report list steps to address and document that data collection tools were developed/adapted with participation of relevant local stakeholders and are culturally appropriate? (yes/partial/no)
[12] Does the report list steps taken to validate findings/conclusions/recommendations with local stakeholders as part of the evaluation? (yes/no)
[13] Was the study designed to take into account locally relevant stratifiers, such as political, social, ethnic, religious, geographical, or sex/gender phenomena during data collection and data analysis? (yes/partial/no)

ROBUSTNESS OF METHODOLOGY (overall conclusion: adequate/not adequate)
[14] Is the methodology explained in sufficient detail? (yes/partial/no)
[15] Is the methodology appropriate for answering the posed study questions? (yes/partial/no/not applicable)
[16] Does the counterfactual meet standards of rigor? (yes/no/not applicable)
[17] Does the report include information from multiple data sources and how the data were triangulated? (yes/partial/no/not applicable)
[18] Does the report mention steps taken to mitigate common threats to the integrity of the evaluation (such as non-equivalence at baseline, non-compliance, spillover, systematic attrition) or common biases (confounding bias, selection bias, experimenter bias, etc.)? (yes/partial/no)
[19] For the quantitative research methods used, are the sampling approach and sample size calculations presented in sufficient detail (to include, at a minimum, type of analysis, MDES, alpha, and beta)? (yes/partial/no/not applicable)
[20] For the qualitative research methods used, is the sampling approach described in sufficient detail (at a minimum, a rationale for the sample size and method of sample selection), and is it appropriate for the study objectives? (yes/partial/no/not applicable)

VALIDITY (overall conclusion: adequate/not adequate)
[21] Do indicators used in the evaluation capture the construct or phenomenon being investigated? (yes/partial/no/not applicable)
[22] Was the sampling conducted in such a way that the results are generalizable to the population of beneficiaries reached through the activity? (yes/partial/no/not applicable)
[23] Does the report allude to whether the study findings may have been biased by the activity of doing the study itself? (yes/no)
[24] Does the report address the external validity of findings? (yes/partial/no/not applicable)
[25] Were all data collection tools piloted with representatives of target populations prior to the beginning of data collection? (yes/partial/no)
[26] Are confidence intervals reported around point estimates? (yes/no/not applicable)
[27] Is the treatment effect presented in terms of effect size? (yes/no/not applicable)

RELIABILITY (overall conclusion: adequate/not adequate)
[28] Does the report list steps taken to ensure that data were collected with a high degree of reliability? (yes/partial/no)
[29] Does the report adequately address missing data/non-response? (yes/partial/no)

COGENCY (overall conclusion: adequate/not adequate)
[30] Are all the study questions, including sub-questions, answered? (yes/no/not applicable)
[31] Does the Executive Summary include answers to all of the study questions? (yes/no)
[32] Is the report accessible to the audiences for whom the report indicates it is written (e.g., minimizing technical jargon if intended for the general public)? (yes/no)
[33] Are conclusions based on findings, and are the findings related to the evaluation questions? (yes/partial/no/not applicable)
[34] Is the narrative in the report supported by charts, maps, and infographics that help non-technical audiences easily understand the study findings? (yes/partial/no)
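The item structure above lends itself to a simple electronic representation. The following is a minimal Python sketch under assumed field names (number, principle, question, options); it is illustrative only and not part of the published tool.

# Illustrative sketch: hold tool items as data and check that a recorded review result
# is one of the item's allowed options. Field names are assumptions for this example;
# the question text and options are taken from the tool above.
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolItem:
    number: int
    principle: str
    question: str
    options: tuple

ITEMS = (
    ToolItem(1, "Conceptual Framing",
             "Are the research/evaluation questions included in the report?",
             ("yes", "no")),
    ToolItem(14, "Robustness of Methodology",
             "Is the methodology explained in sufficient detail?",
             ("yes", "partial", "no")),
)

def record_result(item: ToolItem, response: str) -> str:
    """Return the response if it is an allowed option for the item; otherwise raise."""
    if response not in item.options:
        raise ValueError(f"Item [{item.number}]: {response!r} is not one of {item.options}")
    return response

print(record_result(ITEMS[1], "partial"))  # valid review result for item [14]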