Evidence-based practice in Speech Pathology – the case for single cases and how to interpret them...

Post on 01-Apr-2015

Evidence-based practice in Speech Pathology – the case for single cases and how to interpret them

Lyndsey Nickels

Research Speech Pathologist,

Macquarie Centre for Cognitive Science.

Outline

• RCTs

• Group vs case series vs single case

• Single case methodology

RCTs – the gold standard???

Howard (1986): “Beyond randomised controlled trials: the case for effective case studies of the effects of treatment in aphasia”

Argues “the effectiveness of aphasia therapy is not an issue that can be addressed by an RCT; in this case (and many others) it is an inappropriate scientific technique.”

Asking the right question

• Is speech pathology treatment effective?

= is surgery effective?

= is medication effective?

Is treatment effective for aphasia/SLI/dyspraxia…

= is aspirin effective for headache?

… we need to have precise questions

Homogeneity assumptions

• RCTs assume homogeneity of population

- Because aphasics are not a homogeneous population, it could never be possible to generalise results to all individuals

• RCTs assume homogeneity of treatment

- Treatments are rarely homogeneous

- Diverse treatments with diverse participants

• RCTs are usually analysed using ANOVA

- This assumes that the effects of treatment are homogeneous across the population (but subject to measurement error)

- Preferable to use statistics to analyse whether there are statistical differences in the size of the treatment effects

International Conference on Effectiveness of Rehabilitation for Cognitive Deficits

Cardiff, Wales. 17th – 19th September 2002

View from a supporter of RCTs???

Keith D. Cicerone, Ph.D.

The Validity of Cognitive Rehabilitation:

Strategies for Evaluating Effectiveness and Translating Research to Clinical Practice

Cicerone et al (2000)

Cicerone KD, Dahlberg C, Kalmar K, Langenbahn DM, Malec JF, Bergquist TF, Felicetti T, Giacino JT, Harley JP, Harrington DE, Herzog J, Kneipp S, Laatsch L, Morse PA. Evidence-based cognitive rehabilitation: recommendations for clinical practice. Archives of Physical Medicine & Rehabilitation. 81(12):1596-615, 2000.

Standards of evidence

Class I

• Well designed, prospective, randomized controlled trials

• Well designed, prospective studies with ‘quasi-random’ assignment to treatment conditions (Ia)

Class II

• Prospective, non-randomized cohort studies

• Retrospective, non-randomized case control studies

• Clinical series with well-designed controls allowing between-subject comparisons

Standards of evidence

Class III

• Clinical series without concurrent controls

• Case studies with appropriate single-subject methodology and measurements

This paper isn’t even good enough to be wrong.

Wolfgang Pauli

The most important maxim for data analysis is this… Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question...

John W. Tukey, 1962

Efficacy Studies evaluate the effects of a highly constrained treatment given to a strictly defined sample under ideal conditions of use

Effectiveness Studies evaluate the effects of a typical treatment given to a clinical sample under usual conditions of use

“Cholesterol-lowering drugs less effective in real world”

• Several large Randomized Controlled Trials (of highly selected patients receiving medications from study nurses) have demonstrated the efficacy of statins in reducing LDL and incidence of heart attacks

• Among 375 patients receiving clinical treatment with statins:

• 66% benefited less than predicted by RCT results

• 18% showed no change or worsening LDL levels

• Follow-up indicated that clinic patients did not take the prescribed medication!

Begin with “If…” (W.A. Silverman, Where’s the Evidence?, 1998)

• Concluding statements in all clinical studies ought to begin with ‘If’…

• “If this were a strictly random sample from a population completely defined in all respects, then it might be concluded that this intervention….”

“Volunteer/Participant Problem” (Sackett, 1995)

• “Volunteers for [medical research] are generally a strange and healthy lot, and we can not generalize from them to other patients”

Volunteer/Participant Problem in Rehabilitation?

• Neuropsychological versus psychosocial treatment (Ruff et al, 1989)

• Volunteer participants screened for language, vision, motor impairments; no history of neurologic or psychiatric disorders; motivated and available.

• Both groups improved significantly…

Effective for Whom?

• Participants often selected on the basis of general diagnosis, without specification of target impairment / problem

• Stratification by demographics, severity of illness/injury, severity of deficit, chronicity, co-morbidity…

Memory Notebook Training after Severe Head Injury (Schmitter-Edgecombe et al, 1995)

• Small RCT evaluating training in use of a memory notebook for participants with mild memory impairments

• Benefits apparent on Observed Everyday Memory Failures but not on laboratory-based memory measures

A planned RCT is investigating treatment of severe memory impairments with memory notebooks

…Outcome is being assessed on laboratory measures of prose recall without the use of the notebook.

Clinical application of therapeutic trials

“We need to decide which approach in our large therapeutic armamentarium will be most appropriate in a particular patient, with a particular stage of disease, and particular co-existing conditions, at a particular age… even when RCTs have been performed… they will often not answer this question…” (G. Thibault, 1993)

Clinical application of therapeutic trials (Sackett et al, 2000)

• Is our patient similar enough to those in the study that its results can apply?

• Is the treatment feasible in our setting?

• If applicable and feasible, what are the unique risks and benefits to our patient?

• What are our patient’s values and expectations… both for the treatment we are offering, and for the alternatives?

Without therapeutic enthusiasm, there is no innovation. Without skepticism, there is no proof.

V. Hachinski, 1990

Group vs single case

Group Studies

• Complicated & expensive

• In theory easily generalised to a population

• In practice limited by heterogeneous therapy (& subjects)

Single case

• Quick and cheap

• Easy to specify both patient and therapy

• Results only apply to the subject studied but open to replication

Case series approach can combine the advantages of both

Case series

• A series of participants given the same treatment in the same way

• Analysed as single cases

• Statistically test whether the effects of the treatment are homogeneous (NB Null results = no difference, or not enough power)

• Investigate the sources of differences (if any) among participants in the degree to which they show improvement.
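One way the homogeneity test for a case series could be sketched is shown below. This is an illustrative assumption, not the method used in the talk: it takes, for each participant, the number of treated items that improved, computes a Pearson chi-square statistic for equal improvement rates, and estimates a p-value by shuffling item outcomes across participants. The function names, the example counts and the choice of a permutation test are all hypothetical, and items are treated as exchangeable. As the slide notes, a non-significant result may reflect low power rather than genuinely homogeneous effects.

```python
import random


def homogeneity_statistic(improved, totals):
    """Pearson chi-square statistic for equal improvement rates across participants."""
    overall = sum(improved) / sum(totals)
    if overall in (0.0, 1.0):
        return 0.0  # everyone at floor or ceiling: no variation to test
    stat = 0.0
    for k, n in zip(improved, totals):
        expected = n * overall
        stat += (k - expected) ** 2 / (expected * (1.0 - overall))
    return stat


def permutation_p_value(improved, totals, n_permutations=2000, seed=0):
    """Monte Carlo p-value: how often do shuffled data look at least as uneven?"""
    rng = random.Random(seed)
    observed = homogeneity_statistic(improved, totals)
    # Pool every item outcome (1 = improved, 0 = not), ignoring who produced it.
    pooled = [1] * sum(improved) + [0] * (sum(totals) - sum(improved))
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for n in totals:
            shuffled.append(sum(pooled[start:start + n]))
            start += n
        if homogeneity_statistic(shuffled, totals) >= observed - 1e-12:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical case series: items improved out of 20 treated, per participant.
improved = [15, 6, 14, 5]
totals = [20, 20, 20, 20]
print(f"chi-square = {homogeneity_statistic(improved, totals):.2f}")
print(f"permutation p = {permutation_p_value(improved, totals):.4f}")
```

A small p-value here would suggest that participants differ in how much they benefit, which is the cue to investigate the sources of those differences.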

Methodological issues

• Sensitivity of the assessment to change

• How to demonstrate this change is a result of the intervention

Evaluating the outcome of the (aphasia) rehabilitation process.

• Requires appropriate assessment before rehabilitation begins, during the rehabilitation process, and after rehabilitation (by the therapist) has ended.

• Requires testing that is…

“reliable enough to give consistent measures”

“sensitive enough to measure the improvement that the particular therapy involved is intended to produce”

“valid so that it measures changes that are of real consequence” (Howard & Hatfield, 1987, p113).

Are standardised aphasia batteries adequate for rehabilitation-focused assessment of aphasia?

Are standardised aphasia batteries adequate for documenting change over time?

• Lack of sensitivity to change

- Improvement in a specific area will not be evident in the overall score

• Problem of variability

- The smaller the number of items, the harder it is to distinguish ‘real’ change from the ‘noise’ caused by variability

“standardised tests that measure a non-specific overall level of deficit cannot be expected to measure specific improvement – particularly when the unreliability of performance is taken into account” (Howard & Hatfield, 1987, p114).
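The link between the number of items and the size of the ‘noise’ can be illustrated with the binomial standard error of a proportion-correct score. This is a rough sketch under a simplifying assumption that is not in the original: every item is treated as an independent trial with the same probability of success. The function name is hypothetical.

```python
import math


def proportion_standard_error(p_correct, n_items):
    """Binomial standard error of a proportion-correct score on n_items items."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between 0 and 1")
    return math.sqrt(p_correct * (1.0 - p_correct) / n_items)


# Same underlying ability (50% correct), different test lengths.
for n in (10, 40, 160):
    se = proportion_standard_error(0.5, n)
    print(f"{n:>3} items: SE = {100 * se:.1f} percentage points")
# prints:
#  10 items: SE = 15.8 percentage points
#  40 items: SE = 7.9 percentage points
# 160 items: SE = 4.0 percentage points
```

Under these assumptions, a 10-point gain on a 10-item subtest sits well inside the range expected from variability alone, whereas the same gain on 160 items is much harder to attribute to noise.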

How we should evaluate. (cont)

Levels of assessment

• Impairment

• Activity limitations (disability)

• Participation restrictions (handicap)

• Quality of life/psychosocial issues.

Summary

• inappropriate to use a general, broad ranging assessment

• assessment should be hypothesis driven

• Applicable to every aspect of an individual and their social context that is, or might be, impacted by the language impairment

Summary (cont.)

For documenting change over time …

• assessments need to be reliable (show consistent test-retest) and sensitive

• relatively large samples of behaviour

• relationship between impairment of language function and restrictions in language activities, participation and quality of life is not straightforward.

• attempts to correlate change at one level with change at the other are fundamentally flawed

- activity, participation and quality of life are impacted by many factors over and above the language impairment

Evaluating therapy

It is only possible to establish which treatments are effective, and for whom, if, in treatment studies of any kind:

1. The treatment is specified

2. The nature of each participant’s disorder is specified

3. It is established which of the participants benefitted.

Distinguish the effects of

- Spontaneous recovery

- General effects of therapy (therapist charm/stimulation/placebo)

- Specific effects of treatment

Evaluating therapy

1. The same assessments should be repeated before and after therapy.

2. These assessments should contain at least one measure of the skill that is to be treated and this measure should contain enough items to allow change to be demonstrated.

(sensitive & reliable measures as discussed)

Evaluating therapy

3. More than one pre-therapy baseline to establish degree of spontaneous recovery and/or variability.

- without this there is no guarantee that change is a result of intervention (perhaps it was happening already)
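A simple way to summarise whether repeated baselines are stable or rising is the least-squares slope across sessions. The sketch below is illustrative only: the function name and the example scores are hypothetical, sessions are assumed to be equally spaced, and with only a handful of baselines the slope is a description rather than a statistical test.

```python
def least_squares_slope(scores):
    """Slope of the least-squares line through scores from equally spaced sessions."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    return sxy / sxx


# Hypothetical % correct on three pre-therapy baselines.
stable_baseline = [30, 32, 29]   # fluctuates, no consistent improvement
rising_baseline = [20, 26, 32]   # improving before therapy has started

print(f"stable: {least_squares_slope(stable_baseline):+.1f} points per session")
print(f"rising: {least_squares_slope(rising_baseline):+.1f} points per session")
```

A clearly positive baseline slope is the pattern that would be read as spontaneous recovery, so any later treatment effect has to be judged against that rate rather than against zero.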

Evaluating therapy (cont)

4. Select a control task that you would not expect to be affected by the treatment and assess it before and after therapy. Any change in the control task will suggest spontaneous recovery or non-specific treatment effects.

Evaluating therapy (cont)

5. For treatments which focus on treatment of particular stimuli (rather than using a wide range of stimuli for teaching of new strategies)

Divide the assessed and ‘to-be-treated’ stimuli into two sets of equal difficulty based on pre-treatment performance, treat one set first, reassess and then treat the second set and reassess (a ‘cross-over’ design).

If treated items improve more than untreated items, then it can be argued that the treatment has had an effect, even if spontaneous recovery is a factor
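Splitting the stimuli into two sets of equal difficulty could be done along the following lines. This is one possible procedure, assumed for illustration rather than taken from the talk: items are ranked by the number of correct responses across the pre-treatment baselines and dealt out in an A, B, B, A pattern so that neither set systematically receives the easier item of each pair. The function name and the item scores are hypothetical.

```python
def split_matched_sets(baseline_scores):
    """Split items into two sets matched on pre-treatment performance.

    baseline_scores maps each item to its number of correct responses
    summed over the pre-therapy baselines.
    """
    ranked = sorted(baseline_scores, key=lambda item: (baseline_scores[item], item))
    set_a, set_b = [], []
    for position, item in enumerate(ranked):
        # Within each block of four ranked items: first and last go to A,
        # the middle two go to B.
        if position % 4 in (0, 3):
            set_a.append(item)
        else:
            set_b.append(item)
    return set_a, set_b


# Hypothetical naming scores out of three baselines.
scores = {"cat": 3, "dog": 3, "key": 2, "sun": 2,
          "bed": 1, "cup": 1, "hat": 0, "pen": 0}
set_a, set_b = split_matched_sets(scores)
print("set A:", set_a, "total correct:", sum(scores[i] for i in set_a))
print("set B:", set_b, "total correct:", sum(scores[i] for i in set_b))
```

In practice the sets would usually also be checked for balance on other variables thought to matter (for example word frequency or length) before deciding which set is treated first.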

Evaluating therapy (cont)

6. Objective statistical analysis.

e.g. McNemar’s test
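As an illustration of how McNemar’s test applies to single-case data, the sketch below compares the same items before and after therapy and asks whether more items changed from incorrect to correct than the reverse. It uses the exact binomial form of the test and a two-sided p-value; both choices, the function name and the example data are assumptions made for this example rather than details given in the talk.

```python
from math import comb


def mcnemar_exact(pre, post):
    """Exact two-sided McNemar test on paired item outcomes (True = correct).

    Returns (gained, lost, p_value), where gained counts items that went from
    incorrect to correct and lost counts items that went from correct to incorrect.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must score the same items")
    gained = sum(1 for a, b in zip(pre, post) if not a and b)
    lost = sum(1 for a, b in zip(pre, post) if a and not b)
    discordant = gained + lost
    if discordant == 0:
        return gained, lost, 1.0  # no item changed: no evidence either way
    smaller = min(gained, lost)
    # Under the null hypothesis each changed item is equally likely to go either way.
    tail = sum(comb(discordant, i) for i in range(smaller + 1)) / 2 ** discordant
    return gained, lost, min(1.0, 2 * tail)


# Hypothetical 40-item naming test: 10 correct before therapy, 20 after.
pre = [True] * 10 + [False] * 30
post = [False] * 2 + [True] * 8 + [True] * 12 + [False] * 18
gained, lost, p = mcnemar_exact(pre, post)
print(f"gained {gained}, lost {lost}, p = {p:.3f}")
# prints: gained 12, lost 2, p = 0.013
```

Only the items that changed contribute to the test, which is one reason the measure needs enough items for change to be demonstrable.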

Evaluating therapy (cont)

• Daily pretests & posttests

- Tracks effect of treatment

- Allows termination if no improvement or when ceiling reached

BUT

- Hard to use with home programmes

- May be too onerous

- Difficult to distinguish practice/repetition effects from treatment effects

Figure 1a: An example of a multiple baseline study with item specific improvement. Baselines are stable (no consistent improvement in naming performance over three pre-therapy baselines). Following treatment of one set of items, they improve in naming. However, there is no change in untreated items in picture naming, indicating item specific effects of treatment. Similarly there is no change in a control task chosen to be unaffected by the treatment process (e.g. auditory lexical decision). Hence we can conclude (assuming statistical significance) that the treatment is effective, there is no generalisation to untreated items or tasks, nor is there spontaneous recovery or non-specific effects of treatment.

[Figure 1a plot. x-axis: Week no. (0–6); y-axis: % Correct (0–70); series: naming treated items, naming untreated items, control task; phases: Baseline, Treatment, Post-test]

[Figure 1b plot. x-axis: 0–6; y-axis: 0–70; series: naming treated items, naming untreated items, control task; phases: Baseline, Treatment, Post-test]

Figure 1b: An example of a multiple baseline study, with item specific improvement and spontaneous recovery. Baselines are rising, indicating spontaneous recovery. However, following treatment of one set of items, they improve in naming at a faster rate than during the baseline period. However, there is no change in the rate of improvement for untreated items in picture naming. Similarly there is no change in rate of improvement for a control task. Hence we can conclude (assuming statistical significance) that the treatment is effective and there is no generalisation to untreated items or tasks, although there is also spontaneous recovery (or non-specific effects of treatment).

[Figure 2a plot. x-axis: 0–6; y-axis: 0–70; series: naming treated items, naming untreated items, control task; phases: Baseline, Treatment, Post-test]

Figure 2a: An example of a multiple baseline study with generalisation to untreated items in the same task. Baselines are stable (no consistent improvement in naming performance over three pre-therapy baselines). Following treatment of one set of items, they improve in naming. However, there is also improvement for untreated items in picture naming, indicating generalisation of treatment effects across items. There is no change in a control task. Hence we can conclude (assuming statistical significance) that the treatment is effective, there is generalisation to untreated items but not to an untreated (and unrelated) task, nor is there spontaneous recovery or non-specific effects of treatment.

[Figure 2b plot. x-axis: 0–6; y-axis: 0–70; series: naming treated items, naming untreated items, control task; phases: Baseline, Treatment, Post-test]

Figure 2b: An example of a multiple baseline study with generalisation to untreated items in the same task, and spontaneous recovery. Baselines are rising, indicating spontaneous recovery. However, following treatment of one set of items, they improve in naming at a faster rate than during the baseline period. There is also an increase in the rate of change of improvement for untreated items in picture naming, indicating generalisation of treatment effects across items. There is no difference in the rate of change for the control task. Hence we can conclude (assuming statistical significance) that the treatment is effective, there is generalisation to untreated items but not to an untreated (and unrelated) task, and there is spontaneous recovery.

[Figure 3a plot. x-axis: 0–10; y-axis: 0–70; series: naming set a, naming set b, control task; phases: Baseline, Treatment set a, Post-test, Treatment set b, Post-test]

Figure 3a: An example of a multiple baseline study with item specific improvement, and a cross-over design. Baselines are stable (no consistent improvement in naming performance over three pre-therapy baselines). Following treatment of set a, naming improves on that set. However, there is no change in untreated items (set b) in picture naming, indicating item specific effects of treatment. Similarly there is no change in a control task. When in the second phase of treatment, set b are treated, they now show improvement. Hence we can conclude (assuming statistical significance) that the treatment is effective, there is no generalisation to untreated items or tasks, nor is there spontaneous recovery or non-specific effects of treatment.

[Figure 3b plot. x-axis: 0–10; y-axis: 0–80; series: naming set a, naming set b, control task; phases: Baseline, Treatment set a, Post-test, Treatment set b, Post-test]

Figure 3b: An example of a multiple baseline study, with item specific improvement, spontaneous recovery and a cross-over design. Baselines are rising, indicating spontaneous recovery. However, following treatment of set a, naming improves at a faster rate than during the baseline period. However, there is no change in the rate of improvement for untreated items (set b) in picture naming. Similarly there is no change in rate of improvement for a control task. When the previously untreated items (set b) are treated, they now show an increase in the rate of change of improvement. Hence we can conclude (assuming statistical significance) that the treatment is effective and there is no generalisation to untreated items or tasks, although there is also spontaneous recovery (or non-specific effects of treatment).

 

“failure to apply scientific thinking and measurement during the clinical process is surely as misguided as leaving our empathy, clinical intuition, and caring attitudes behind as we enter the clinical arena” (Kearns, 1993, p71).