Rigour of evaluation Dr Carole Torgerson Senior Research Fellow Institute for Effective Education...

Rigour of evaluation

Dr Carole Torgerson

Senior Research Fellow

Institute for Effective Education

University of York

“A careful look at randomized experiments will make clear that they are not the gold standard. But then, nothing is. And the alternatives are usually worse.”

Berk RA. (2005) Journal of Experimental Criminology 1, 417-433.

Characteristics of a rigorous trial

• Once randomised, all participants are included within their allocated groups.

• Random allocation is undertaken by an independent third party.

• Outcome data are collected blindly.

• Sample size is sufficient to exclude an important difference.

• A single analysis is pre-specified before data analysis.
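The sample-size point can be made concrete with the usual normal-approximation formula for comparing two means: n per group is roughly 2(z_alpha + z_beta)^2 / d^2, where d is the standardised difference to be detected. The sketch below is illustrative only. It assumes a two-sided 5% significance level and 80% power (conventional defaults, not values taken from these slides), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardised
    difference in means (normal approximation, individual randomisation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Assumed effect sizes, chosen only to show how quickly n grows as d shrinks
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

This applies to individually randomised trials; a cluster randomised trial would need a larger sample than the figure returned here.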

Education: comparison with health education

Characteristic             Health Ed    Education Ed
Cluster Randomised         36%          18%
Sample size justified      28%          0%
Concealed randomisation    8%           0%
Blinded Follow-up          30%          14%
Use of CIs                 41%          1%
Low Statistical Power      41%          85%

Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomised controlled trials in health and education. British Educational Research Journal, 31:761-785. (based on n = 168 trials)

Problems with RCTs

• Failure to keep to random allocation – can introduce selection bias

• Attrition - can introduce selection bias

• Unblinded ascertainment - can lead to ascertainment bias

• Small samples - can lead to Type II error

• Multiple statistical tests - can give Type I errors

• Poor reporting of uncertainty (e.g., lack of confidence intervals)
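The multiple-testing problem follows from simple arithmetic: if k independent tests are each run at significance level alpha and no true effect exists, the chance of at least one false positive is 1 - (1 - alpha)^k. A minimal sketch, assuming independent tests and alpha = 0.05 (an illustrative convention, not a figure from these slides):

```python
def familywise_error(k, alpha=0.05):
    """Probability of at least one Type I error across k independent tests."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: {familywise_error(k):.0%} chance of a false positive")
```

Under these assumptions, twenty unplanned tests give roughly a 64% chance of at least one spurious "significant" result, which is why a single pre-specified analysis matters.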

Which are RCTs?

• “We took two groups of schools – one group had high ICT use and the other low ICT use – we then took a random sample of pupils from each school and tested them”.

• “We put the students into two groups, we then randomly allocated one group to the intervention whilst the other formed the control”

• “We formed the two groups so that they were approximately balanced on gender and pre-test scores”

• “We identified 200 children with a low reading age and then randomly selected 50 to whom we gave the intervention. They were then compared to the remaining 150”.

• “Of the eight [schools] two randomly chosen schools served as a control group”

Mixed allocation

• “Students were randomly assigned to either Teen Outreach participation or the control condition either at the student level (i.e., sites had more students sign up than could be accommodated and participants and controls were selected by picking names out of a hat or choosing every other name on an alphabetized list) or less frequently at the classroom level”

Allen et al. Child Development 1997;68:729-42.

Is it randomised?

• “The groups were balanced for gender and, as far as possible, for school. Otherwise, allocation was randomised.”

Thomson et al. Br J Educ Psychology 1998;68:475-91.

Is it randomised?

• “The students were assigned to one of three groups, depending on how revisions were made: exclusively with computer word processing, exclusively with paper and pencil or a combination of the two techniques.”

Grejda and Hannafin, J Educ Res 1992;85:144.

Non-random assignment confused with random allocation

• “Before mailing, recipients were randomized by rearranging them in alphabetical order according to the first name of each person. The first 250 received one scratch ticket for a lottery conducted by the Norwegian Society for the Blind, the second 250 received two such scratch tickets, and the third 250 were promised two scratch tickets if they replied within one week.”

Finsen V, Storeheier AH. (2006) Scratch lottery tickets are a poor incentive to respond to mailed questionnaires. BMC Medical Research Methodology 6, 19. doi:10.1186/1471-2288-6-19.
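For contrast, a computer-generated allocation for a design of the same shape (750 recipients, three arms of 250) could look like the sketch below. The recipient labels, arm names, and seed are hypothetical placeholders. The point is only that a shuffle, unlike alphabetical ordering, makes group membership independent of any characteristic of the recipient.

```python
import random


def allocate(units, arms, seed=None):
    """Shuffle units and deal them into equally sized arms.

    Assumes the number of units divides evenly by the number of arms.
    """
    rng = random.Random(seed)      # a recorded seed makes the list auditable
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(arms)
    return {arm: shuffled[i * size:(i + 1) * size] for i, arm in enumerate(arms)}


recipients = [f"recipient_{i:03d}" for i in range(750)]   # placeholder IDs
arms = ["one_ticket", "two_tickets", "promised_tickets"]  # hypothetical labels
groups = allocate(recipients, arms, seed=2006)
print({arm: len(members) for arm, members in groups.items()})
```

In a rigorous trial this list would be generated and held by someone independent of recruitment, so that nobody enrolling participants can foresee the next allocation.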

What is the problem here?

• “Pairs of students in each classroom were matched on a salient pretest variable, Rapid Letter Naming, and randomly assigned to treatment and comparison groups.”

• “The original sample – those students were tested at the beginning of Grade 1 – included 64 assigned to the SMART program and 63 assigned to the comparison group.”

Baker S, Gersten R, Keating T. (2000) When less may be more: A 2-year longitudinal evaluation of a volunteer tutoring program requiring minimal training. Reading Research Quarterly 35, 494-519.

What is wrong here?

• “the remaining 4 classes of fifth-grade students (n = 96) were randomly assigned, each as an intact class, to the [4] prewriting treatment groups;”

Brodney et al. J Exp Educ 1999;68,5-20.

Misallocation issues

• “23 offenders from the treatment group could not attend the CBT course and they were then placed in the control group”.

Independent assignment

• “Randomisation by centre was conducted by personnel who were not otherwise involved in the research project” [1]

• Distant assignment was used to: “protect overrides of group assignment by the staff, who might have a concern that some cases receive home visits regardless of the outcome of the assignment process”[2]

[1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48, 715-729.

[2] Davis RG, Taylor BG. (1997) Criminology 35, 307-333.

Attrition

• Attrition can lead to bias; a high quality trial will have maximal follow-up after allocation.

• It can be difficult to ascertain the amount of attrition and whether or not attrition rates are comparable between groups.

• A good trial reports low attrition with no between group differences.

• Rule of thumb: attrition of 0-5% is not likely to be a problem; 6-20% is worrying; more than 20% suggests selection bias.
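The thresholds quoted above can be applied per arm, so that differential attrition shows up as well as the overall level. A minimal sketch: the function names and the example counts are hypothetical, and rates between 5% and 6% (which the rule of thumb does not cover) are treated here as "worrying".

```python
def attrition_rate(randomised, analysed):
    """Proportion of randomised participants missing from the analysis."""
    return (randomised - analysed) / randomised


def rule_of_thumb(rate):
    """Classify attrition using the thresholds quoted above."""
    if rate <= 0.05:
        return "not likely to be a problem"
    if rate <= 0.20:
        return "worrying"
    return "selection bias likely"


# Hypothetical trial: 100 participants randomised to each arm
for arm, analysed in {"intervention": 97, "control": 78}.items():
    rate = attrition_rate(100, analysed)
    print(f"{arm}: {rate:.0%} attrition, {rule_of_thumb(rate)}")
```

Both calculations need the number originally randomised to each arm, which is exactly the figure that poorly reported trials omit.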

Poorly reported attrition

• In an RCT of foster carers, extra training was given.

» “Some carers withdrew from the study once the dates and/or location were confirmed; others withdrew once they realized that they had been allocated to the control group” “117 participants comprised the final sample”

• No split between groups is given except in one table, which shows 67 in the intervention group and 50 in the control group. The control group is about 25% smaller than the intervention group; unequal attrition is a hallmark of potential selection bias, but we cannot be sure.

Macdonald & Turner, Brit J Social Work (2005) 35,1265

What is the problem here?

Random allocation

160 children in 20 schools (8 per school)

80 in each group

76 children allocated to control; 76 allocated to intervention group

1 school (8 children) withdrew

N = 17 children replaced following discussion with teachers

What about matched pairs?

• We can only match on observable variables and we trust to randomisation to ensure that unobserved covariates or confounders are equally distributed between groups.
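A matched-pairs design of this kind can be sketched as follows: pairs are formed on an observed variable, and a coin flip within each pair decides who receives the intervention. The pupil labels and seed are hypothetical; the sketch illustrates the general idea, not the procedure of any trial cited here.

```python
import random


def randomise_pairs(pairs, seed=None):
    """Within each matched pair, a coin flip decides who gets the intervention."""
    rng = random.Random(seed)
    intervention, control = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        intervention.append(first)
        control.append(second)
    return intervention, control


# Hypothetical pairs, already matched on an observed variable such as gender
pairs = [("pupil_A", "pupil_B"), ("pupil_C", "pupil_D"), ("pupil_E", "pupil_F")]
intervention, control = randomise_pairs(pairs, seed=1)
print("intervention:", intervention)
print("control:", control)
```

Matching guarantees balance only on the matching variable; balance on anything unobserved still depends on the coin flip, and on both members of every pair staying in the trial.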

Matched Pairs on Gender

Control (unknown covariate)    Intervention (unknown covariate)
Boy (high)                     Boy (low)
Girl (high)                    Girl (high)
Girl (low)                     Girl (high)
3 Girls and 3 highs            3 Girls and 3 highs

Drop-out of 1 girl

Control Intervention





Girl (high)


Removing matched pair does not balance the groups!


Blinding of Outcome Assessment

• Ascertainment bias can result when the assessor is not blind to group assignment, e.g., homeopathy study of histamine showed an effect when researchers were not blind to the assignment but no effect when they were.

• Example of outcome assessment blinding: Study “was implemented with blind assessment of outcome by qualified speech language pathologists who were not otherwise involved in the project”[1]

[1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48, 715-729.

ITT analysis: examples

• Seven participants allocated to the control condition (1.6%) received the intervention, whilst 65 allocated to the intervention failed to receive treatment (15%). The authors, however, analysed by randomised group - CORRECT approach.

• “It was found in each sample that approximately 86% of the students with access to reading supports used them. Therefore, one-way ANOVAs were computed for each school sample, comparing this subsample with subjects who did not have access to reading supports.” -INCORRECT

Davis RG, Taylor BG. (1997) Criminology 35, 307-333.

Feldman SC, Fish MC. (1991) Journal of Educational Computing Research 7, 25-36.
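The difference between the two approaches can be shown with a toy data set. The records below are invented for illustration, and are deliberately constructed so that participants who did not receive their allocated treatment differ from those who did. Intention-to-treat groups by allocation; the as-treated comparison groups by what was actually received.

```python
from statistics import mean

# Hypothetical records: (allocated arm, arm actually received, outcome score)
records = [
    ("intervention", "intervention", 14),
    ("intervention", "intervention", 12),
    ("intervention", "control", 7),    # allocated but never received it
    ("control", "control", 9),
    ("control", "control", 8),
    ("control", "intervention", 13),   # crossed over to the intervention
]


def mean_difference(records, group_index):
    """Intervention minus control mean, grouping on the chosen field."""
    by_arm = {"intervention": [], "control": []}
    for record in records:
        by_arm[record[group_index]].append(record[2])
    return mean(by_arm["intervention"]) - mean(by_arm["control"])


itt = mean_difference(records, 0)         # analyse as randomised
as_treated = mean_difference(records, 1)  # analyse by treatment received
print(f"ITT difference: {itt:.1f}, as-treated difference: {as_treated:.1f}")
```

In this invented example the as-treated comparison gives a much larger difference than the analysis by randomised group, because regrouping by treatment received discards the balance that randomisation created.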


The CONSORT guidelines, adapted for trials in educational research

• Was the target sample size adequately determined?

• Was intention to teach analysis used? (i.e. were all children who were randomised included in the follow-up and analysis?)

• Were the participants allocated using random number tables, coin flip, computer generation?

• Was the randomisation process concealed from the investigators? (i.e. were the researchers who were recruiting children to the trial blind to the child’s allocation until after that child had been included in the trial?)

• Were follow-up measures administered blind? (i.e. were the researchers who administered the outcome measures blind to treatment allocation?)

• Was precision of effect size estimated (confidence intervals)?

• Were summary data presented in sufficient detail to permit alternative analyses or replication?

• Was the discussion of the study findings consistent with the data?
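Reporting precision means giving an interval around the estimated effect rather than a p-value alone. A minimal sketch of a 95% confidence interval for a difference in means, using a normal approximation: the scores are hypothetical and the function name is an assumption. With samples this small a t-based interval would be somewhat wider.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_difference_ci(intervention, control, level=0.95):
    """Difference in means with a normal-approximation confidence interval."""
    diff = mean(intervention) - mean(control)
    se = math.sqrt(stdev(intervention) ** 2 / len(intervention)
                   + stdev(control) ** 2 / len(control))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical post-test scores
intervention = [52, 48, 55, 60, 47, 53, 58, 50]
control = [49, 45, 51, 50, 44, 48, 52, 46]
diff, lower, upper = mean_difference_ci(intervention, control)
print(f"difference = {diff:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

Reporting the group means, standard deviations, and sample sizes alongside the interval also satisfies the next item in the checklist, since readers can then recompute or re-analyse the result.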

Flow Diagram

• In health care trials reported in the main medical journals, authors are required to produce a CONSORT flow diagram.

• The trial by Hatcher et al. clearly shows the fate of the participants from randomisation through to analysis.

Flow Diagrams

635 children in 16 schools screened using group spelling test

118 children with poor spelling skills given individual tests of vocabulary, letter knowledge, word reading and phoneme manipulation

2 schools excluded due to insufficient numbers of poor spellers

84/118 in 14 remaining schools (6 per school) selected for randomisation to interventions

Excluded 9 children due to behaviour

1 school (6 children) withdrew from study after randomisation

39/42 children in 13 remaining schools allocated to 20-week intervention

39/42 children included

39/42 children in 13 remaining schools allocated to 10-week intervention

1 child left study (moved school)

38/42 children included

Hatcher et al. 2005 J Child Psych Psychiatry: online

Year 7 pupils, N = 155

Randomised

ICT group, N = 77

N = 3 left school

No ICT group, N = 78

N = 1 left school

75 valid pre-tests, 71 valid post-tests

67 valid pre- and post-tests

70 valid pre-tests, 67 valid post-tests

63 valid pre- and post-tests
