Selecting Evidence for Comparative Effectiveness Reviews: When to use Observational Studies

Dan Jonas, MD, MPH
Meera Viswanathan, PhD
Karen Crotty, PhD, MPH

RTI-UNC Evidence-based Practice Center

Sources

AHRQ Methods Guide, Chapters 4 and 8. http://www.effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf

Norris et al. Observational Studies in Systematic Reviews of Comparative Effectiveness. Draft manuscript.

Chou R, Aronson N, Atkins D, et al. Assessing harms when comparing medical interventions: AHRQ and the Effective Health Care Program. J Clin Epidemiol 2008 Sep 25. http://www.jclinepi.com/article/S0895-4356(08)00161-3/abstract



Overview

Why should reviewers consider including observational studies (OS) in comparative effectiveness reviews (CERs)?

When should OS be included in CERs?

What are the differences in considering inclusion of OS for benefits as opposed to OS of harms?

Current Perspective

CERs should consider including observational studies *this should be the default strategy*

Reviewers should explicitly state the rationale for including or excluding OS

Comparative Effectiveness Reviews (CERs)

Systematic reviews that compare the relative benefits and harms among a range of available treatments or interventions for a given condition

CER Process Overview

Prepare topic:

· Refine key questions

· Develop analytic frameworks

Search for and select studies:

· Identify eligibility criteria

· Search for relevant studies

· Select evidence for inclusion

Abstract data:

· Extract evidence from studies

· Construct evidence tables

Analyze and synthesize data:

· Assess quality of studies

· Assess applicability of studies

· Apply qualitative methods

· Apply quantitative methods (meta-analyses)

· Rate the strength of a body of evidence

Present findings

Hierarchy of Evidence

Systematic Reviews

RCTs

Controlled Clinical Trials and Observational Studies

Uncontrolled Observational Studies

Case reports and case series

Expert Opinions

Figure labels: "Lowest risk of bias" (top of the hierarchy); "Applicability?"

Danger of Over-reliance on RCTs

May be unnecessary, inappropriate, inadequate, or impractical

May be too short in duration

May report intermediate outcomes rather than main health outcomes of interest

Often not available for vulnerable populations

Generally report efficacy rather than effectiveness

AHRQ Evidence-based Practice Centers include wide variety of study designs (not only RCTs)

Observational Studies (OS)

Definition: Studies where the investigators did not assign the exposure/intervention

– i.e., non-experimental studies

– Controlled clinical trials are quasi-experimental studies, not OS

We present considerations for including OS to assess benefits and to assess harms separately

OS to Assess Benefits

Often insufficient evidence from trials to answer all KQs in CERs (think PICOTS)

– Population: may not be available for sub-populations and vulnerable populations

– Interventions: may not be able to assign high-risk interventions randomly

– Outcomes: may report intermediate outcomes rather than main health outcomes of interest

– Timing: may be too short in duration

– Setting: may not represent typical practice

Group Exercise

What should reviewers consider when deciding whether or not to include observational studies in CERs?

OS to Assess Benefits

Reviewers should consider 2 questions:

1. Are there gaps in trial evidence for the review questions under consideration?

2. Will observational studies provide valid and useful information to address key questions?

Figure (decision flowchart): Are there gaps in trial evidence? Will OS provide valid and useful information?

Systematic review question (including PICOTS)

Always consider: Controlled Trials

Are there gaps in trial evidence?

– No: Confine review to Controlled Trials

– Yes: Consider OS

Will OS provide valid and useful information?

– Refocus the review question on gaps

– Assess whether OS address the review question

– Assess the suitability of OS: natural history of the disease or exposure; potential biases
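
The two-question decision flow can be sketched in a few lines of Python. This is only an illustration of the logic on the slide, not a tool from the AHRQ Methods Guide; the class name, field names, and function name are hypothetical, and each yes/no judgment is assumed to have been made by the reviewers beforehand.

```python
from dataclasses import dataclass


@dataclass
class ReviewQuestion:
    # Hypothetical fields; each one records a reviewer judgment.
    gaps_in_trial_evidence: bool  # Question 1: are there gaps in trial evidence?
    os_address_gaps: bool         # Do available OS match the PICOTS of the gaps?
    os_suitable: bool             # Natural history and potential biases acceptable?


def evidence_to_include(question):
    """Return the study types to include for one review question."""
    included = ["controlled trials"]  # controlled trials are always considered
    if not question.gaps_in_trial_evidence:
        return included  # no gaps: confine the review to controlled trials
    # Question 2: will OS provide valid and useful information?
    if question.os_address_gaps and question.os_suitable:
        included.append("observational studies")
    return included


print(evidence_to_include(ReviewQuestion(True, True, True)))
```

Keeping the two questions as separate checks mirrors the flowchart: a gap in trial evidence is necessary but not sufficient for including OS.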

Group Exercise: Include OS?

1. CER of PCI vs. CABG for coronary disease identified 23 RCTs. Experts (TEP) raised concerns that the studies enrolled patients with a relatively narrow spectrum of disease relative to those having the procedures in current practice

2. Review of antioxidant supplementation to prevent heart disease found numerous large clinical trials, including over 20,000 elevated-risk subjects in the Heart Protection Study. No beneficial effects were seen in CV outcomes, including mortality. Findings were consistent across trials with varying populations, sizes, etc.

Group Exercise: Include OS?

1. CER of PCI vs. CABG: Need to look for OS

• OS from 10 large cardiovascular registries were identified

• These confirmed that the use of the procedures in the community included patients with wider variation in disease

• For patients similar to those enrolled in trials, mortality results in the registries were similar to trials (no difference between interventions)

• Relative benefits of the procedures varied markedly with extent of disease, raising caution about extending trial conclusions to patients with greater or lesser disease than those in trial populations

2. Review of antioxidant supplementation to prevent heart disease: Trial data are sufficient

Gaps in Trial Evidence: PICOTS

Trial data may be insufficient for a number of reasons

– PICOTS

– Populations included (missing certain groups)

– Interventions included

– Outcomes reported (only intermediate)

– Duration

– All trials may be efficacy studies

Are Trial Data Sufficient? PICOTS and Beyond

Risk of bias (internal validity)

– Degree to which the findings may be attributed to factors other than the intervention under review

Consistency

– Extent to which effect size and direction vary within and across studies

– Inconsistency may be due to heterogeneity across PICOTS

Directness

– Degree to which outcomes that are important to users of the CER (patients, clinicians, or policymakers) are encompassed by trial data

– Health outcomes generally most important

Are Trial Data Sufficient? PICOTS and Beyond

Precision

– Includes sample size, number of studies, and heterogeneity within or across studies

Reporting bias

– Extent to which trial authors appear to have reported all outcomes examined

Applicability

– Extent to which the trial data are likely to be applicable to populations, interventions, and settings of interest to the user

– The review questions should reflect the PICOTS characteristics of interest

When to Identify Gaps in Trial Evidence

Identification of gaps in trial evidence available to answer review questions can occur at a number of points in the review

A. When first scoping the review

B. Consultation with Technical Expert Panel

C. Initial review of titles and abstracts

D. After detailed review of trial data

Gaps in Trial Evidence

Operationally, reviewers may search broadly at the outset to identify both OS and trials, or may search sequentially, looking for OS only after reviewing trials in detail to identify gaps in evidence

2. Will observational studies provide valid and useful information to address key questions?

Reviewers should:

I. Refocus the study question on gaps in trial evidence

a) specify the PICOTS characteristics for gaps in trial evidence

II. Assess whether available OS may address the review questions (applicable to PICOTS?)

III. Assess suitability of OS to answer the review questions

Valid and Useful Information

III. Assess suitability of OS to answer the review questions

Once gaps have been identified in the trial literature and OS could potentially fill those gaps:

Consider the clinical context and natural history of the condition under study

Assess how potential biases may influence the results of OS

Clinical context

Fluctuating or intermittent conditions are more difficult to assess with OS

– Especially if there is no comparison group

OS may be more useful for conditions with steady progression or decline


Here are two very different conditions:

1. Acute low back pain

2. Amyotrophic lateral sclerosis (ALS)

How might the differences in these conditions impact whether OS would provide useful information?


Main considerations here are the natural history of the condition under study

People with acute low back pain often recover spontaneously

– A cohort study of treatments for acute low back pain can’t establish, with any degree of certainty, whether the treatments affected patient outcomes

ALS has a course of steady decline

– An uncontrolled cohort study of treatments for ALS may well be able to demonstrate meaningful effects

Potential biases

Selection bias (and confounding by indication)

Performance bias

Detection bias

Attrition bias


Suppose you’re conducting a CER of medications for rheumatoid arthritis (RA)

You find several retrospective analyses of administrative databases comparing outcomes of RA patients taking etanercept vs. methotrexate

Suppose that etanercept is restricted in many of the health systems to patients with more severe RA who have failed other treatments

Should you include these OS?

What considerations will influence your decision?


Confounding by indication

– A type of selection bias

– When different diagnoses, severity of illness, or comorbid conditions are important reasons for physicians to assign different treatments

– Common problem in pharmacoepidemiology studies comparing beneficial effects of interventions

Generally would not include this information due to a high risk of bias (poor internal validity), unless studies had a good way to adjust for severity of disease
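
A small worked example may help show why unadjusted comparisons are misleading under confounding by indication. The counts below are invented purely for illustration (they are not data from any RA study), and the names `drug_a`, `drug_b`, `crude_risk`, and `stratum_risk` are hypothetical. The sketch assumes drug A is given mostly to severe patients and that, within each severity stratum, the two drugs have identical risk of a poor outcome.

```python
# Hypothetical counts, invented for illustration only.
# Key: (treatment, severity) -> (number of patients, number with poor outcome)
COUNTS = {
    ("drug_a", "severe"): (800, 320),
    ("drug_a", "mild"): (200, 20),
    ("drug_b", "severe"): (200, 80),
    ("drug_b", "mild"): (800, 80),
}


def crude_risk(treatment):
    """Risk of a poor outcome ignoring disease severity."""
    patients = sum(n for (t, _), (n, _e) in COUNTS.items() if t == treatment)
    events = sum(e for (t, _), (_n, e) in COUNTS.items() if t == treatment)
    return events / patients


def stratum_risk(treatment, severity):
    """Risk of a poor outcome within one severity stratum."""
    patients, events = COUNTS[(treatment, severity)]
    return events / patients


for drug in ("drug_a", "drug_b"):
    print(drug, "crude:", crude_risk(drug),
          "severe:", stratum_risk(drug, "severe"),
          "mild:", stratum_risk(drug, "mild"))
```

With these made-up counts, the crude comparison makes drug A look worse even though the stratum-specific risks are identical, which is why a study without a good adjustment for severity would carry a high risk of bias.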

Harms

Assessing harms can be difficult

– Trials often focus on benefits, with little effort to balance assessment of benefits and harms

– OS are almost always necessary to assess harms adequately

There are tradeoffs between increasing comprehensiveness of reviewing all possible harms data and decreasing quality (increasing risk of bias) for harms data

Trials to Assess Harms

Randomized controlled trials = gold standard for evaluating efficacy

But, relying solely on RCTs to evaluate harms in CERs is problematic

– Most lack prespecified hypotheses for harms as they are designed to evaluate benefits

– Assessment of harms is often a secondary consideration

– Quality and quantity of reporting of harms is frequently inadequate

– Few have sufficient sample sizes or duration to adequately assess uncommon or long-term harms


Most RCTs are “efficacy” trials

– they assess benefits and harms in ideal, homogeneous populations and settings

– patients who are more susceptible to harms are often under-represented

Few RCTs directly compare alternative treatment strategies

Publication bias and selective outcome reporting bias

RCTs may not be available


Nevertheless, head-to-head RCTs provide the most direct evidence on comparative harms

In addition, placebo-controlled RCTs can provide important information

In general, CERs should routinely include both head-to-head and placebo-controlled trials for assessment of harms

– In lieu of placebo-controlled RCTs, CERs may incorporate findings of well-conducted systematic reviews if they evaluated the specific harms of interest

Unpublished Supplemental Trials Data

Consider including results of completed or terminated unpublished RCTs and unpublished results from published trials

– FDA website, http://www.ClinicalTrials.gov, etc.

– Must consider whether risk of bias can be fully assessed

When a significant number of published trials fail to report an important adverse event (AE), CER authors should report this gap in the evidence and consider efforts to obtain unpublished data

OS to Assess Harms

OS are almost always necessary to assess harms adequately

Exception is when there are sufficient data from RCTs to reliably estimate harms

May provide best or only data for assessing harms in minority or vulnerable populations who are under-represented in trials

Types of OS included in a CER will vary; different types of OS might be included or rendered irrelevant by availability of data from stronger study types

Hypothesis Testing vs. Hypothesis Generating

Important consideration in determining which OS to include

– Case reports are hypothesis generating

– Cohort and case-control studies are well suited for testing hypotheses of whether one intervention is associated with a greater risk for an adverse event than another and for quantifying the risk*

*Chou et al, JCE 2008

Hierarchy of Evidence

Systematic Reviews

RCTs

Controlled Clinical Trials and Observational Studies

Uncontrolled Observational Studies

Case reports and case series

Expert Opinions

Figure labels: "Lowest risk of bias" (top of the hierarchy); "Applicability?"; "Hypothesis Testing"; "Hypothesis Generating"


Cohort and case-control studies

– CERs should routinely search for and include, except when RCT data are sufficient and valid

OS based on patient registries

OS based on analyses of large databases

Case reports and post-marketing surveillance

– New medications

Other OS


Criteria to select OS for inclusion

– There are often many more OS than trials; evaluating a large number of OS can be impractical when conducting a CER

– Several criteria are commonly used in CERs to screen OS for inclusion (empirical data lacking)

· Minimum duration of follow-up

· Minimum sample size

· Defined threshold for risk of bias

· Study design (cohort and case-control)

· Specific population of interest
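
As a rough sketch, the screening criteria listed above could be applied as a simple filter. Everything specific here is an assumption made for illustration: the `ObservationalStudy` fields, the `passes_screen` function, and in particular the threshold values, which are placeholders rather than recommended cut-offs (the slide itself notes that empirical data to support such criteria are lacking).

```python
from dataclasses import dataclass


@dataclass
class ObservationalStudy:
    # Hypothetical record of one candidate study.
    design: str             # e.g. "cohort", "case-control", "case series"
    sample_size: int
    followup_months: float
    risk_of_bias: str       # "low", "moderate", or "high"
    population: str


# Placeholder thresholds, chosen for illustration only.
MIN_FOLLOWUP_MONTHS = 12
MIN_SAMPLE_SIZE = 1000
ALLOWED_DESIGNS = {"cohort", "case-control"}
ACCEPTABLE_RISK_OF_BIAS = {"low", "moderate"}


def passes_screen(study, population_of_interest=None):
    """Return True if the study meets every screening criterion."""
    checks = [
        study.followup_months >= MIN_FOLLOWUP_MONTHS,
        study.sample_size >= MIN_SAMPLE_SIZE,
        study.risk_of_bias in ACCEPTABLE_RISK_OF_BIAS,
        study.design in ALLOWED_DESIGNS,
        population_of_interest is None
        or study.population == population_of_interest,
    ]
    return all(checks)


candidates = [
    ObservationalStudy("cohort", 5000, 24, "low", "adults with RA"),
    ObservationalStudy("case series", 40, 6, "high", "adults with RA"),
]
included = [s for s in candidates if passes_screen(s, "adults with RA")]
print(len(included), "of", len(candidates), "candidate studies pass the screen")
```

In practice a review team would set the thresholds in its protocol before screening and report them alongside the rationale for including or excluding OS.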

Key Take-home Points

Often insufficient evidence from trials to answer all Key Questions in CERs

CERs should consider including OS *default strategy*

Should explicitly state the rationale for including or excluding OS

For OS to assess benefits, reviewers should consider 2 questions:

1. Are there gaps in trial evidence for the review questions under consideration?

2. Will observational studies provide valid and useful information to address key questions?

For harms, should routinely search for and include cohort and case-control studies
