Addressing differences in rigour and relevance of evidence –
a review of existing methods
Rebecca Turner, David Spiegelhalter, Simon Thompson
MRC Biostatistics Unit, Cambridge
Outline
Why address rigour and relevance?
Review of methods for addressing rigour and relevance
Bias modelling and using external information on sources of bias
Ongoing work and issues for discussion
Differences in rigour (internal bias)
Examples of internal bias
Inadequacy of randomisation/allocation concealment in RCTs
Non-compliance and drop-out in RCTs
Selection bias and non-response bias in observational studies
Confounding in observational studies
Misclassification of exposure in case-control studies
Evidence synthesis: usual approach
Choose a minimum standard
Include studies which achieve the standard, making no allowance for further differences between them
Exclude studies failing to reach the standard
Problems with usual approach to rigour
Relevant evidence, in some cases the majority of available studies, is discarded
undesirable effects on precision (and bias?)
No allowance for differences in rigour between studies included in combined analysis
once minimum criteria achieved, less rigorous studies given equal influence in analysis
policy decisions may be based on misleading results?
Differences in relevance (external bias)
Examples of external bias
Study population different from target population
Outcome similar but not identical to target outcome
Interventions different from target interventions, e.g. dose
Evidence synthesis: usual approach
Similar to that for rigour: studies achieving a minimum standard are included, with no allowance for further differences
Sometimes separate analyses carried out for different types of population/intervention
Degrees of relevance are specific to target setting, so decisions on relevance are necessarily rather subjective
Example 1: donepezil for treatment of dementia due to Alzheimer's disease
Analysis reported by Birks and Harvey (2003) included
17 double-blind placebo-controlled RCTs only.
40 relevant comparative studies identified:
17 double-blind placebo-controlled RCTs
1 single-blind placebo-controlled RCT
14 non-randomised and/or open-label studies
8 donepezil vs. active comparisons, 1 randomised
Example 2: modified donepezil example
25 relevant comparative studies identified:
1 double-blind placebo-controlled RCT
2 single-blind placebo-controlled RCTs
14 non-randomised and/or open-label studies
8 donepezil vs. active comparisons, 1 randomised
– Include only randomised studies? Allow for degree of blinding?
– Include all studies? Allow for degree of blinding and randomisation?
– Allow for additional sources of bias, and deviations from target population, outcome, details of intervention?
Methods for addressing differences in rigour and relevance
Existing approaches:
Methods based on quality scores
Random effects modelling of bias
Full bias modelling using external information on specific sources of bias
Methods based on quality scores
Exclude studies below a quality score threshold
Weight the analysis by quality
Examine relationship between effect size and quality score
Cumulative meta-analysis according to quality score
Problems include:
Difficult to capture quality in a single score
Quality items represented may be irrelevant to bias
No allowance for direction of individual biases
Random effects modelling of bias
Assume that each study i estimates a biased parameter θ + δ_i rather than the target parameter θ
Choose a distribution to describe the plausible size (and direction) of the bias δ_i for each study
Standard random-effects analysis is equivalent to assuming E[δ_i] = 0, V[δ_i] = τ²
This assumption of common uncertainty about study biases seems rather strong
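This assumption can be made concrete in a few lines: under E[δ_i] = 0 and V[δ_i] = τ², each study's sampling variance is simply inflated by the common bias variance τ² before inverse-variance pooling. A minimal sketch (the effect estimates and standard errors are illustrative, not data from the talk):

```python
import numpy as np

def random_effects_pool(y, se, tau2):
    """Inverse-variance pooling where each study's sampling variance is
    inflated by tau2, the assumed variance of its bias delta_i.
    E[delta_i] = 0 and V[delta_i] = tau2, common to all studies."""
    y, se = np.asarray(y, float), np.asarray(se, float)
    w = 1.0 / (se**2 + tau2)
    est = np.sum(w * y) / np.sum(w)
    se_pool = np.sqrt(1.0 / np.sum(w))
    return est, se_pool

# Hypothetical log hazard ratios and standard errors for three studies
y = [0.40, 0.10, 0.80]
se = [0.10, 0.25, 0.30]
print(random_effects_pool(y, se, tau2=0.0))  # no allowance for bias
print(random_effects_pool(y, se, tau2=0.2))  # common bias variance assumed
```

Setting tau2 > 0 both widens the pooled interval and reduces the dominance of the most precise (but possibly biased) studies.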
Hip replacements example
Comparison of hip replacements (Charnley vs. Stanmore)
Endpoint: patient requires revision operation
Three studies available: Registry data
RCT
Case series
Assumptions: RCT evidence unbiased
bias in case series > bias in registry data
Spiegelhalter and Best, 2003
Hip replacements example: allowing for bias
[Figure: forest plot of hazard ratios for revision operation (scale 0–4) for the Registry, RCT and Case series evidence, under an unweighted analysis and two weighted analyses]
Values are assigned to the variance of each bias, which controls the extent to which that evidence is downweighted.
Problems:
How to choose the values which control the weighting?
No separation of internal and external bias.
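Giving each source its own bias variance, rather than a common τ², is what the weighted analyses above do. A sketch with made-up numbers that encode the slide's ordering of assumptions (RCT unbiased; registry data less biased than the case series); the estimates and the bias variances are illustrative choices, not the Spiegelhalter and Best values:

```python
# Hypothetical log hazard ratios and standard errors for the three sources
studies = {"RCT": (0.3, 0.25), "Registry": (0.5, 0.10), "Case series": (1.0, 0.20)}

# Bias variances encoding the assumed ordering: RCT unbiased,
# bias in case series > bias in registry data (values are assumptions)
bias_var = {"RCT": 0.0, "Registry": 0.1, "Case series": 0.4}

# Inverse-variance weights after inflating by each source's bias variance
w = {k: 1.0 / (se**2 + bias_var[k]) for k, (y, se) in studies.items()}
pooled = sum(w[k] * studies[k][0] for k in studies) / sum(w.values())
print(f"pooled log HR = {pooled:.3f}")
```

The choice of the bias-variance values is exactly the "how to choose the weighting" problem noted above, which motivates sensitivity analysis over a range of values.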
Full bias modelling
Identify sources of potential bias in available evidence
Obtain external information on the likely form of each bias
Construct a model to correct the data analysis for multiple biases, on the basis of external information
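The three steps above can be sketched as a Monte Carlo correction: sample each bias from its externally informed distribution, subtract the sampled biases from the observed log odds ratio, and read off the corrected interval. All numbers and distributions below are placeholders for illustration, not the priors used in Greenland's analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Observed log odds ratio and its standard error (illustrative values)
log_or_obs, se_obs = np.log(1.68), 0.14

# Step 2: external information expressed as prior draws for each bias
# (these normal distributions are hypothetical placeholders)
bias_confounding = rng.normal(0.0, 0.3, n)  # log-scale confounding bias
bias_nonresponse = rng.normal(0.1, 0.2, n)  # log-scale selection bias

# Step 3: subtract sampled biases and add the study's random error
log_or_corrected = (log_or_obs
                    - bias_confounding
                    - bias_nonresponse
                    + rng.normal(0.0, se_obs, n))
lo, med, hi = np.exp(np.percentile(log_or_corrected, [2.5, 50, 97.5]))
print(f"bias-corrected OR {med:.2f} (95% interval {lo:.2f}, {hi:.2f})")
```

Note how the corrected interval is far wider than the conventional one: the bias uncertainty dominates the random error, which is the central point of the approach.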
Example (Greenland, 2005):
14 case-control studies of association between residential magnetic fields and childhood leukaemia.
Potential biases identified:
Non-response
Confounding
Misclassification of exposure
Magnetic fields example: allowing for bias
Bias corrected for    OR    95% CI         P-value
Non-response          1.45  (0.94, 2.28)   0.05
Confounding           1.69  (1.32, 2.33)   0.002
Misclassification     2.92  (1.42, 35.1)   0.011
All three biases      2.70  (0.99, 32.5)   0.026

Conventional analysis:
Odds ratio for leukaemia, fields >3 mG vs. ≤3 mG:
1.68 (1.27, 2.22), P-value of 0.0001
Choosing values for the bias parameters
Bias due to an unknown confounder U:
Need to express prior beliefs for:
OR relating U to magnetic fields (within leukaemia strata)
OR relating U to leukaemia (within magnetic fields strata)
Greenland (2005) chooses wide distributions giving 5th and 95th percentiles of 1/6 and 6.
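The stated percentiles pin down the spread of such a prior: for a lognormal prior on an OR centred at 1, with 5th and 95th percentiles at 1/6 and 6, the log-scale standard deviation is σ = ln 6 / 1.645 ≈ 1.09. A quick check:

```python
import math

z95 = 1.6449  # 95th percentile of the standard normal distribution

# Lognormal prior on an OR, centred at 1 (log-mean 0), with
# 5th and 95th percentiles at 1/6 and 6 as stated above
sigma = math.log(6) / z95
print(f"sigma on log scale = {sigma:.3f}")

# Check: percentiles of the implied prior
p5, p95 = math.exp(-z95 * sigma), math.exp(z95 * sigma)
print(f"5th = {p5:.3f}, 95th = {p95:.3f}")
```

A σ this large says the analyst is genuinely uncertain even about the direction of the confounding, which is what "wide distributions" means here.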
Multiple studies
Greenland expects degree of confounding to vary according to study location and method of measuring magnetic fields
Uses location, measurement method as predictors for log ORs
Example: ETS and lung cancer
Wolpert and Mengersen, 2004:
29 case-control and cohort studies of association between ETS and lung cancer.
Potential biases identified:
Eligibility violations
Misclassification of exposure
Misclassification of cases
Penalty points represent each study’s control of each bias
Error rates assumed to increase with each penalty point
E.g. eligibility violation: rate of 5% for typical studies, doubling with each penalty point
Arguments against bias modelling
Impossible to identify all sources of bias
Little information on the likely effects of bias, even for known sources
Bias modelling requires external (subjective) input, rather than letting the data “speak for themselves”
Increases complexity of analysis, leading to problems with presentation and interpretation
Arguments for bias modelling
Assumption of zero bias is extremely implausible in most analyses (although zero expected bias may be reasonable)
Uncertainty due to potential biases may be much larger than uncertainty due to random error
Informal discussion of the possible effects of bias is not sufficient
Preferable to include all relevant data and model bias, rather than throwing much of the data away?
Aims of planned work
Allow for both rigour and relevance (internal & external bias)
Consider potential sources of bias, and available evidence on plausible sizes of biases
Construct simple models for adjustment
Develop elicitation strategy for obtaining judgements on reasonable size of unmodelled sources of bias
Develop strategy for sensitivity analysis
Simple models for bias
Require 4 bias parameters for each study:
μ_RIG, σ_RIG control rigour (internal bias)
μ_REL, σ_REL control relevance (external bias)
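One plausible reading of "4 bias parameters for each study" is a mean and a variance for each kind of bias: shift each study's estimate by its expected internal and external bias, inflate its variance by the two bias variances, then pool. A sketch under that assumption (all parameter names and numbers are hypothetical):

```python
import numpy as np

def pool_with_bias(y, se, mu_rig, var_rig, mu_rel, var_rel):
    """Pool study estimates after shifting each by its expected
    internal (rigour) and external (relevance) bias, and inflating
    its variance by the two per-study bias variances."""
    y = np.asarray(y, float) - np.asarray(mu_rig, float) - np.asarray(mu_rel, float)
    v = np.asarray(se, float)**2 + np.asarray(var_rig, float) + np.asarray(var_rel, float)
    w = 1.0 / v
    return np.sum(w * y) / np.sum(w), np.sqrt(1.0 / np.sum(w))

# Two hypothetical studies: an RCT in the target population (little bias)
# and an open-label study in a somewhat different population
est, se_pool = pool_with_bias(
    y=[0.5, 0.9], se=[0.2, 0.15],
    mu_rig=[0.0, 0.2], var_rig=[0.0, 0.05],   # rigour (internal bias)
    mu_rel=[0.0, 0.1], var_rel=[0.02, 0.05])  # relevance (external bias)
print(f"pooled estimate {est:.3f} (SE {se_pool:.3f})")
```

Keeping internal and external bias as separate parameter pairs is what distinguishes this from the single-variance random-effects approach criticised earlier.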
Challenges
Problem of multiple biases is complex, but approach for correction must be simple and accessible.
Otherwise evidence synthesis will, in general, continue to exclude some studies and make no allowance for differences between others.
When correcting for multiple biases, important to determine a strategy for sensitivity analysis.
Issues for discussion
Credibility of findings which incorporate external information in addition to data
More acceptable when available evidence is scarce and expected to be biased than when many RCTs available?
Greenland and others argue that analysis corrected for biases should be treated as definitive analysis (i.e. not only sensitivity analysis) – is this a realistic aim?
References
Eddy DM, Hasselblad V, Shachter R. Meta-analysis by the Confidence Profile Method. Academic Press: San Diego, 1992.
Greenland S. Multiple-bias modelling for analysis of observational data. Journal of the Royal Statistical Society Series A 2005; 168: 267-291.
Spiegelhalter DJ, Best NG. Bayesian approaches to multiple sources of evidence and uncertainty in complex cost-effectiveness modelling. Statistics in Medicine 2003; 22: 3687-3709.
Wolpert RL, Mengersen KL. Adjusted likelihoods for synthesizing empirical evidence from studies that differ in quality and design: effects of environmental tobacco smoke. Statistical Science 2004; 19: 450-471.