Clinical Significance for Quality of Life Endpoints in Clinical Trials FDA/Industry Statistics...

Clinical Significance for Quality of Life Endpoints in Clinical Trials

FDA/Industry Statistics Workshop

Washington, September 16, 2005

Jeff A. Sloan, Ph.D., Mayo Clinic, Rochester, MN, USA

Primary goal: advance the state of the science to help cancer patient QOL soar

http://www.worth1000.com/view.asp?entry=102590&display=photoshop

Take home message: there is good news

• There are problems with using QOL assessments as indicators of efficacy in clinical trials.

• There are scientifically sound solutions to these problems. The problems have been disseminated widely and consistently. The solutions have not.

It takes a certain amount of bravery to work in QOL research

Science is a candle in the dark - Carl Sagan

We will use the candle of science to improve the QOL of cancer patients


How do you determine the clinical significance of QOL assessments?

What is a clinically meaningful QOL burden?

Why is it difficult to define “clinical significance” for QOL?


• Pain analogy

• 25 years ago physicians were the sole raters of patient pain

• JCAHO 2000 guideline: every patient’s pain to be assessed upon intake on a 0-10 scale

• Time and experience alleviate novelty and skepticism, and guidelines evolve


• Blood pressure analogy

• 100 years ago, clinical significance of BP scores was unknown (Lancet 1899)

• massage therapy was the gold standard

• guidelines for BP clinical significance continue to be redrawn today (McCrory DC, Lewis SZ. Chest. 126(1 Suppl):11S-13S, 2004)


The solution found for tumor response cutoffs may provide guidance

• We call a reduction of 50% a response.

• We see reductions of 49% all the time, but do not worry about misclassification.

• Moertel (1976) basis for 50% cutoff

• Find a cutoff and stick to it? (RECIST)

What Clinical significance is NOT

• Statistical significance

• Example drawn from JCO 2001 (anonymous)

• HSQ before / after scores on 1300 patients

• all p-values <0.0001

• conclusion: all domains of QOL were significantly different across treatment groups

• problem: 1300 patients provides 80% power to detect a change of 1 unit on a 0-100 point scale
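The arithmetic behind that last point can be sketched with a normal approximation. The paired before/after design, the two-sided alpha of 0.05, and the SD of change scores (12.9 points) are assumptions chosen for illustration; the slide does not report them, and `detectable_change` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def detectable_change(n, sd_change, alpha=0.05, power=0.80):
    """Smallest mean before/after change detectable in a paired design,
    using a normal approximation to the paired t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the target power
    return (z_alpha + z_power) * sd_change / sqrt(n)


# Assumed SD of change scores on a 0-100 scale (illustrative, not from the slide).
delta = detectable_change(n=1300, sd_change=12.9)
print(f"Detectable change with n=1300: {delta:.2f} points")
```

Under these assumptions the detectable change is about 1 point, so a tiny p-value from a sample this large says little about whether the change matters to patients.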

EORTC QLQ-LC13

Item        n=537   n=346   Effect Size
Coughing    46.2    44.3    small
Dyspnea     17.2    16.2    small
Pain        26.9    25.5    small

• all p-values were statistically significant

The Six Papers

• 1) Methods used to date

• 2) Group versus individual differences

• 3) Single item versus multi-item

• 4) Patient, clinician, population perspectives

• 5) Changes over time

• 6) Practical considerations for specific audiences

• MCP, April, May, June 2002

No single statistical decision rule or procedure can take the place of well-reasoned consideration of all aspects of the data by a group of concerned, competent, and experienced persons with a wide range of scientific backgrounds and points of view.

Canner (1981)

If it looks like a duck, sounds like a duck, and walks like a duck, the odds of it being a worm or an elephant in a clever disguise are small in the extreme.

Sloan (2001)

Bottom Line

• Assessing the clinical significance of QOL can be as simple as a 10-point change on a 100-point scale, if that is consistent with the goals of the scientific enquiry. The real issue underlying the controversy over QOL is the relative novelty and lack of experience that presently exists with QOL. With time and familiarity this too shall pass.

(Sloan, J Chronic Obs. Pul. Dis. 2: 57-62, 2005.)


Presenting global solutions is always interesting

me

you

Two general methods for clinical significance

• Anchor-based methods: requirements

• independent interpretable measure (the anchor) which has appreciable correlation between anchor and target

• Distribution-based methods

• rely on expression of magnitude of effect in terms of measure of variability of results (effect size)

The MID method in one slide

The Empirical Rule Effect Size (ERES) Approach (Sloan et al, Cancer Integrative Medicine 1(1):41-47, 2003)

• QOL tool range = 6 standard deviations

• SD estimate = 100 percent / 6 = 16.7% of theoretical range

• Two-sample t-test effect sizes (J Cohen, 1988): small, moderate, large effect (0.2, 0.5, 0.8 SD shift)

• S, M, L effects = 3%, 8%, 12% of range
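The ERES arithmetic above can be written out as a short sketch. The function names `eres_sd` and `eres_thresholds` are hypothetical, and the 0-100 scale in the usage loop is just an example; only the range / 6 rule and Cohen's 0.2, 0.5, 0.8 shifts come from the slide.

```python
# Cohen (1988) benchmarks, in SD units.
COHEN_SHIFTS = {"small": 0.2, "moderate": 0.5, "large": 0.8}


def eres_sd(scale_min, scale_max):
    """ERES estimate of the SD: theoretical range divided by 6."""
    return (scale_max - scale_min) / 6


def eres_thresholds(scale_min, scale_max):
    """Map each benchmark to (scale points, percent of theoretical range)."""
    rng = scale_max - scale_min
    sd = eres_sd(scale_min, scale_max)
    return {
        label: (shift * sd, 100 * shift * sd / rng)
        for label, shift in COHEN_SHIFTS.items()
    }


for label, (points, pct) in eres_thresholds(0, 100).items():
    print(f"{label:>8}: {points:5.1f} points ({pct:4.1f}% of range)")
```

Before rounding, this gives roughly 3.3%, 8.3%, and 13.3% of the range; the slide quotes these as 3%, 8%, and 12%. Because the thresholds are expressed as a share of the theoretical range, the same function applies to a tool scored 1-7 or 0-112 without any pilot data.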

The Empirical Rule

• Tchebyshev’s Theorem: at least 1 - 1/k² of any distribution will fall within k standard deviations (SDs) of the mean

• If the distribution is symmetric and bell-shaped, about 99% will fall within 3 standard deviations

• The pdf for the range is a function of the SD

• An estimate of the SD can be obtained via range = 6 SD
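A small sketch makes the gap between the two statements concrete. It compares Tchebyshev's distribution-free lower bound with the coverage of a normal distribution, which is used here as an assumed stand-in for the bell-shaped case; the function names are illustrative.

```python
from statistics import NormalDist


def chebyshev_lower_bound(k):
    """Minimum fraction of any distribution within k SDs of the mean."""
    return 1 - 1 / k**2


def normal_coverage(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


for k in (2, 3):
    print(f"k={k}: Chebyshev >= {chebyshev_lower_bound(k):.3f}, "
          f"normal = {normal_coverage(k):.4f}")
```

For k = 3 the guaranteed bound is 8/9, while a normal distribution puts more than 99% of its mass within the same interval. That is why treating the theoretical range as about 6 SD is reasonable for bell-shaped score distributions.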

Assumption Checking for ERES (Dueck, Sloan, 2006, J. Biopharm. Stats, in press)

• Tchebyshev’s Inequality is conservative

• Tested the effect of various distributional assumptions

• Only a uniform distribution results in deviation from the assumption of a 6 SD-based estimate (28% instead of 17%)
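The uniform case can be checked directly. This is a minimal sketch, not the analysis from the cited paper: it uses the textbook SD of a continuous uniform distribution (range divided by the square root of 12) and cross-checks it against a hypothetical tool where every integer score from 0 to 100 is equally likely.

```python
from math import sqrt
from statistics import pstdev


def uniform_sd_fraction():
    """SD of a continuous uniform distribution as a fraction of its range."""
    return 1 / sqrt(12)


# Cross-check: every integer score from 0 to 100 equally likely (illustrative).
scores = list(range(101))
empirical = pstdev(scores) / (max(scores) - min(scores))

print(f"Uniform, analytic: {100 * uniform_sd_fraction():.1f}% of range")
print(f"Uniform, 0..100:   {100 * empirical:.1f}% of range")
print(f"ERES (range / 6):  {100 / 6:.1f}% of range")
```

Both uniform figures land a little under 30% of the range, in line with the slide's 28%, against 16.7% under the 6 SD assumption.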

All Methods Give Similar Answers

• Cohen - 1/2 SD is moderate effect

• MCID - 1/2 point on 7-point Likert

• 7-1 = 6 point range ==> SD of 1 unit

• so 1/2 point ==> 1/2 SD

• Cella - 10 point on FACT-G

• 10/1.12 = 8.9% of range; 8.9% / 16.7% ≈ 1/2 SD

• Feinstein - correlation approach

• Cohen was arbitrary, should be 0.6 SD
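The conversions on this slide can be reproduced by expressing each published threshold as a multiple of the ERES SD (theoretical range / 6). This is a sketch under stated assumptions: `sd_units` is a hypothetical helper, and the 0-112 range for FACT-G is inferred from the slide's division by 1.12.

```python
def sd_units(change, scale_min, scale_max):
    """Express a change in scale points as a multiple of the ERES SD
    (theoretical range divided by 6)."""
    sd = (scale_max - scale_min) / 6
    return change / sd


examples = {
    "MCID, 1/2 point on a 1-7 Likert item": sd_units(0.5, 1, 7),
    "Cella, 10 points on FACT-G (0-112 assumed)": sd_units(10, 0, 112),
    "10 points on a 0-100 scale": sd_units(10, 0, 100),
}

for label, d in examples.items():
    print(f"{label}: {d:.2f} SD")
```

All three fall between 0.5 and 0.6 SD, which is the sense in which the different approaches give similar answers.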

The Good News

• Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark

• A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant

• Smaller differences may also be meaningful (data)

• Applies to groups or individuals (just different SD)

Norman GR, Sloan JA, Wyrwich KW. Expert Review of Pharmacoeconomics and Outcomes Research Sept 2004; 4(5): 515-519

Sloan JA, Cella D, Hays R. J Clin Epidemiol (in press).

Four Guidelines (Sloan, Cella, Hays, JCE 2005, in press)

• The method used to obtain an estimate of clinical significance should be scientifically supportable.

• The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect.

• Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases.

• If feasible, multiple approaches to estimating a tool’s clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date.

Summary

• Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago

• Define clinical significance a priori, and use the definition in the analytical process

• Consensus is building as the answers from different approaches are similar and relatively robust
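One place an a priori definition enters the analytical process is the sample size calculation. The sketch below is illustrative only: it assumes a two-arm comparison of means, a two-sided alpha of 0.05, 80% power, and a normal approximation, none of which are specified in the slides, and `n_per_group` is a hypothetical helper.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_sd, alpha=0.05, power=0.80):
    """Patients per arm for a two-sample comparison of means
    (normal approximation), with the target effect in SD units."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size_sd) ** 2)


# Prespecified clinically significant effect: 1/2 SD.
print(n_per_group(0.5))  # 63 per arm under this approximation
```

Fixing the 1/2 SD target before the trial ties the study's power to a clinically meaningful difference rather than to whatever difference a very large sample happens to detect.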

New ideas have enabled us to make advances in QOL science


A Mayo/FDA meeting regarding guidance on patient-reported outcomes (PRO): Discussion, Education, and Operationalization

• FDA to release guidances for assessing PROs in all clinical trials (3rd quarter 2005?)

• Meeting co-sponsored with FDA to:

• provide a focused process to facilitate discussion among all stakeholders

• educate stakeholders on background, content, and concerns

• provide an opportunity for input

• delineate ways to best operationalize the guidance into clinical trials

• February 23-25, 2006, DC (Westfields Marriott, Chantilly, VA, 7 miles from Dulles)

The NCCTG QOL Team

Pamela Atherton Michele Halyard Jarrett Richardson

Deb Barton Alan Hartung Teresa Rummans

Brent Bauer Jef Huntington Paul Schaefer

Teri Britt Mashele Huschka Jeff Sloan

Kelli Burger Mary Johnson Denise Smith

Kara Curry Celia Kamath Angelina Tan

AmyLou Dueck Sumithra Mandrekar Kristy Vierling

Marlene Frost Timothy Moynihan Gilbert Wong

Axel Grothey Paul Novotny Cathy Zhao

For further information, email: [email protected]


Thank you
