Assessment of the Quality of USAID-Funded Evaluations ...assessment commissioned by USAID’s Office...
What is Evaluation Quality?
The tool used in the study to assess evaluation
quality was based on the framework introduced
in the guidance note Assessing the Strength of
Evidence in the Education Sector, developed by
DFID and produced by Building Evidence in
Education (BE2). BE2 is a donor partnership co-led
by USAID, DFID, the World Bank, and the
UN. The BE2 guidance note recommended
seven principles of quality that are applicable to
evaluations of all types.
Principles of Quality
• Conceptual framing
• Openness and transparency
• Cultural appropriateness
• Robustness of the methodology
• Validity
• Reliability
• Cogency
Assessment of the Quality of
USAID-Funded Evaluations
Education Sector 2013-16
This Evidence Brief is made possible by the support of the American People through the United States Agency for International Development
(USAID). It was produced for review by USAID and was prepared by Management Systems International, A Tetra Tech Company, for the
E3 Analytics and Evaluation Project and the Reading and Access Evaluation Project.
This Brief provides highlights from an evaluation quality
assessment commissioned by USAID’s Office of Education. The
study examined 92 evaluation reports, published between
2013 and 2016, that are relevant to the Agency’s 2011
Education Strategy.
Study Objectives
1. Develop and apply an evaluation quality assessment framework
to USAID-funded evaluations in the education sector
2. Generate findings to inform the development of future
guidance to improve the quality of USAID-funded evaluations
Study Limitations
The study only considered information provided in the evaluation
reports without any further investigation. It is likely that many
evaluators did not report on some aspects of the evaluation process,
such as inter-rater reliability, when reporting on them was not
specified as a requirement. Therefore, the study does not draw
conclusions about how evaluations were implemented, only
about the characteristics of evaluation quality found in
published reports. Finally, value for money was not
considered, as evaluation cost information was not
available.
Evidence Brief
Full report available at:
http://pdf.usaid.gov/pdf_docs/pa00srw1.pdf
Review Process
The assessment built on the Office of Education’s strong
partnerships by crowdsourcing the evaluation review to volunteer
experts in the international education and evaluation community.
After receiving training and guidance on applying the assessment
tool, two experts reviewed each evaluation report and compared
their responses before reaching a consensus on each item in the
tool.
This process allowed the Office of Education to:
• Gather feedback on the utility of the quality assessment tool
from experts and implementing partners.
• Disseminate the BE2 framework to the international education
community.
36 experts from 21 organizations
conducted the evaluation reviews
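The paired review described above can be illustrated with a short sketch. It assumes each reviewer's responses are stored as a mapping from item number to response; the names `disagreements` and `cohens_kappa` and the sample responses are hypothetical. The brief does not say that any agreement statistic was computed. Cohen's kappa is shown only as one common way to summarize agreement between two raters before they reconcile their answers.

```python
from collections import Counter


def disagreements(reviewer_a, reviewer_b):
    """Return the item numbers on which the two reviewers answered differently."""
    return sorted(item for item in reviewer_a if reviewer_a[item] != reviewer_b[item])


def cohens_kappa(reviewer_a, reviewer_b):
    """Cohen's kappa for two reviewers who rated the same non-empty set of items."""
    items = sorted(reviewer_a)
    n = len(items)
    # Share of items on which both reviewers gave the same response.
    observed = sum(reviewer_a[i] == reviewer_b[i] for i in items) / n
    # Agreement expected by chance, from each reviewer's response frequencies.
    freq_a = Counter(reviewer_a[i] for i in items)
    freq_b = Counter(reviewer_b[i] for i in items)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical response throughout
    return (observed - expected) / (1 - expected)


# Hypothetical responses to five tool items, for illustration only.
first_reviewer = {1: "yes", 2: "no", 3: "partial", 4: "yes", 5: "yes"}
second_reviewer = {1: "yes", 2: "no", 3: "yes", 4: "yes", 5: "partial"}

print("Items to discuss:", disagreements(first_reviewer, second_reviewer))
print("Kappa:", round(cohens_kappa(first_reviewer, second_reviewer), 3))
```

In this sketch the flagged items are the ones the two reviewers would need to discuss before recording a consensus response.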
Assessment Findings
Thirteen percent of evaluations met adequacy standards on all seven principles of quality, while 8% of
evaluations were found to be inadequate on all seven principles. Just over half of evaluations were deemed
“adequate” on four out of seven principles of quality. Cogency was scored as “adequate” most frequently
among the seven principles of quality.
Figure 1. Percent of Evaluations Scored as “Adequate” on Each Evaluation Quality Principle
Conceptual Framing: 67%
Openness and Transparency: 49%
Robustness of Methodology: 65%
Cultural Appropriateness: 29%
Validity: 41%
Reliability: 37%
Cogency: 75%
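The Figure 1 values can be ranked with a few lines of code. This is a minimal sketch that assumes the percentages pair with the principle labels in the order both are listed in the figure; the variable names are illustrative only.

```python
# Percent of evaluations scored "adequate" on each principle (Figure 1).
# The value-to-label pairing is assumed from the order in which they appear.
adequate_pct = {
    "Conceptual Framing": 67,
    "Openness and Transparency": 49,
    "Robustness of Methodology": 65,
    "Cultural Appropriateness": 29,
    "Validity": 41,
    "Reliability": 37,
    "Cogency": 75,
}

# Rank principles from most to least frequently scored "adequate".
ranked = sorted(adequate_pct.items(), key=lambda pair: pair[1], reverse=True)

# Principles on which more than half of evaluations were scored "adequate".
majority = [name for name, pct in ranked if pct > 50]

for name, pct in ranked:
    # One "#" per five percentage points gives a rough text bar chart.
    print(f"{name:<28}{pct:>3}%  {'#' * (pct // 5)}")
```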
While the methodologies differ for
this study and a recent U.S.
Government Accountability
Office performance audit on
how Agencies Can Improve the
Quality and Dissemination of
Program Evaluations, findings from
both studies suggest that
strengthening the validity and
reliability of evaluations is
hindered by the difficulties of
evaluating programs abroad in
often challenging environments.
Figure 2. Percent of Evaluations Scored as “Adequate” on the Robustness of Methodology Principle
Figure 3. Percent of Evaluations Scored as “Adequate” on the Principle of Validity
Figure 4. Percent of Evaluations Scored as “Adequate” on the Principle of Reliability
Country income level, crisis
and conflict status, and
Education Strategy goal were
not strongly correlated with
whether the evaluation met each
quality principle. Evaluation type
was found to affect validity, with
impact evaluations more often
rated as adequate than
performance evaluations.
The majority of evaluations were
scored as “adequate” on the
Cogency and Conceptual
Framing principles of quality,
while only 29% of evaluations were
scored as “adequate” on the
Cultural Appropriateness
principle. Just under half of
reviewed evaluations were scored
as “adequate” on the principle of
Openness and Transparency.
Next Steps
The study findings and the assessment tool are intended for USAID Missions, USAID Washington
and USAID partners involved in evaluating USAID-funded education programming. The findings
and the assessment tool will be useful for USAID Missions when procuring and managing
evaluations, for Partners when planning and implementing evaluations, and for the education
sector at large when learning about evaluation practice and curating the evidence generated
through evaluations.
Recognizing the need for continuous improvement of USAID-funded evaluation practice, the
Office of Education planned the following activities to support such improvements:
• Integrate the evaluation quality framework and tool in professional development and
technical assistance activities for USAID staff and Missions.
• Disseminate the study findings along with the evaluation quality assessment framework
and tool among USAID implementing partners.
• Produce guidance notes for USAID Missions and implementing partners addressing
specific weaknesses identified through the study.
• Periodically repeat the review process as a way to engage with practitioners in a dialogue
about evaluation practice and curate newly generated evidence.
The assessment tool could be useful for other sectors of international development in helping
improve the strength of evidence generated by evaluations.
Conclusions
This study demonstrated the benefits of assessing the quality of USAID-funded evaluations in the
education sector using a holistic framework that maps different aspects of an evaluation to seven
principles of quality. The findings from this study suggest two key conclusions:
• Performance evaluations were more likely than impact evaluations to fail to address aspects of validity
such as measurement, internal, and external validity. This might reflect substantial differences in the
cost of conducting the two types of evaluation. It might also indicate that the emphasis the Agency and
other donors have placed on improving the quality of impact evaluations has been successful.
• The quality of evidence generated in the education sector could be improved through strengthening
qualitative evaluations and through leveraging their exploratory power to complement the explanatory
power of quantitative evaluations, possibly through sequential data collection in mixed-methods
evaluations.
Crowdsourcing the evaluation quality assessment used for this study provided an opportunity for the
international education community to come together to discuss quality standards for USAID-funded evaluations. Participants appreciated the opportunity to engage in constructive discussions with USAID
and other practitioners about evaluation practice in the international education sector.
For more information on the assessment, contact Elena Walls at [email protected]
EVALUATION QUALITY ASSESSMENT TOOL
The evaluation quality assessment tool developed as part of the Assessment of the Quality of USAID-Funded Evaluations in the
Education Sector, 2013-2016, was revised upon completion of the review, based on the comments from reviewers. This version
reflects these revisions.
Table columns: Principle of Quality; Impact Evaluations; Performance Evaluations; Review Result; Overall Conclusion; Notes/Justification (sub-headings: Quantitative, Qualitative)

CONCEPTUAL FRAMING (overall conclusion: adequate/not adequate)
[1] Are the research/evaluation questions included in the report? (yes/no)
[2] Does the report include research/evaluation hypotheses? (yes/no)
[3] Are the evaluation questions appropriate for the intervention's conceptual framework (logframe/theory of change/results framework)? (yes/partial/no/not applicable)
[4] Does the report acknowledge/draw upon existing relevant research? (yes/partial/no)
[5] Does the report explain the local context in sufficient detail? (yes/partial/no)

OPENNESS AND TRANSPARENCY (overall conclusion: adequate/not adequate)
[6] Is the report open about study limitations with the implementation of the evaluation, such as issues faced during data collection that might affect the study’s design? (yes/partial/no)
[7] Is the report open about study limitations due to issues with the implementation of the intervention being evaluated? (yes/partial/no/not applicable)
[8] Does the discussion about the findings refer to relevant contextual factors or methodological considerations? (yes/no/not applicable)
[9] Is the report open about potential influence due to the study team composition? (yes/partial/no)

CULTURAL APPROPRIATENESS (overall conclusion: adequate/not adequate)
[10] Does the report list steps taken to ensure that study questions and methodology are informed by local stakeholders, are culturally relevant and contextually appropriate? (yes/no)
[11] Does the report list steps to address and document that data collection tools were developed/adapted with participation of relevant local stakeholders and are culturally appropriate? (yes/partial/no)
[12] Does the report list steps taken to validate findings/conclusions/recommendations with local stakeholders as part of the evaluation? (yes/no)
[13] Was the study designed to take into account locally relevant stratifiers, such as political, social, ethnic, religious, geographical or sex/gender phenomena during data collection and data analysis? (yes/partial/no)

ROBUSTNESS OF METHODOLOGY (overall conclusion: adequate/not adequate)
[14] Is the methodology explained in sufficient detail? (yes/partial/no)
[15] Is the methodology appropriate for answering posed study questions? (yes/partial/no/not applicable)
[16] Does the counterfactual meet standards of rigor? (yes/no/not applicable)
[17] Does the report include information from multiple data sources and how the data were triangulated? (yes/partial/no/not applicable)
[18] Does the report mention steps taken to mitigate common threats to the integrity of the evaluation (such as non-equivalence at baseline, non-compliance, spillover, systematic attrition) or common biases (confounding bias, selection bias, experimenter bias, etc.)? (yes/partial/no)
[19] For the quantitative research methods used, are the sampling approach and sample size calculations presented in sufficient detail (to include, at a minimum, type of analysis, MDES, alpha and beta)? (yes/partial/no/not applicable)
[20] For the qualitative research methods used, is the sampling approach described in sufficient detail (at a minimum, a rationale for the sample size and method of sample selection) and is it appropriate for the study objectives? (yes/partial/no/not applicable)

VALIDITY (overall conclusion: adequate/not adequate)
[21] Do indicators used in the evaluation capture the construct or phenomenon being investigated? (yes/partial/no/not applicable)
[22] Was the sampling conducted in such a way that the results are generalizable to the population of beneficiaries reached through the activity? (yes/partial/no/not applicable)
[23] Does the report allude to whether the study findings may have been biased by the activity of doing the study itself? (yes/no)
[24] Does the report address the external validity of findings? (yes/partial/no/not applicable)
[25] Were all data collection tools piloted with representatives of target populations prior to beginning of the data collection? (yes/partial/no)
[26] Are confidence intervals reported around point estimates? (yes/no/not applicable)
[27] Is treatment effect presented in terms of effect size? (yes/no/not applicable)

RELIABILITY (overall conclusion: adequate/not adequate)
[28] Does the report list steps taken to ensure that data were collected with a high degree of reliability? (yes/partial/no)
[29] Does the report adequately address missing data/non-response? (yes/partial/no)

COGENCY (overall conclusion: adequate/not adequate)
[30] Are all the study questions, including sub-questions, answered? (yes/no/not applicable)
[31] Does the Executive Summary include answers to all of the study questions? (yes/no)
[32] Is the report accessible to the audiences for whom the report indicates it is written (e.g., minimizing technical jargon if intended for the general public)? (yes/no)
[33] Are conclusions based on findings and are the findings related to the evaluation questions? (yes/partial/no/not applicable)
[34] Is the narrative in the report supported by charts, maps and infographics that help non-technical audiences easily understand the study findings? (yes/partial/no)