Statistical Principles for Clinical Research Sponsored by: NIH General Clinical Research Center Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center November 1, 2007 Peter D. Christenson Conducting Clinical Trials 2007


Page 1: Statistical Principles for Clinical Research Sponsored by: NIH General Clinical Research Center Los Angeles Biomedical Research Institute at Harbor-UCLA.

Statistical Principles for Clinical Research

Sponsored by:

NIH General Clinical Research Center

Los Angeles Biomedical Research Institute

at Harbor-UCLA Medical Center

November 1, 2007

Peter D. Christenson

Conducting Clinical Trials 2007

Page 2:

Speaker Disclosure Statement

The speaker has no financial relationships relevant to this presentation.

Page 3:

Recommended Textbook: Making Inference

Design issues

Biases

How to read papers

Meta-analyses

Dropouts

Non-mathematical

Many examples

Page 4:

Example: Harbor Study Protocol

18 Pages of Background and Significance, Preliminary Studies, and Research Design and Methods. Then:

“Pearson correlation, repeated measure of the general linear model, ANOVA analyses and student t tests will be used where appropriate. …

The [two] main parameters of interest will be … [A and B. For A, using a t-test] 40 subjects provide 80% assurance that a XX reduction … will be detected, with p<0.05.

Similar comparisons as for … [A and B] will be carried out …”

Page 5:

Example: Harbor Study Protocol

The good ….

“The [two] main parameters of interest will be … [A and B. For A, using a t-test,] 40 subjects provide 80% assurance that a XX reduction … will be detected, with p<0.05.”

Because:

• Explicit: Specifies primary outcome of interest.

• Explicit: Justification for # of subjects.

Page 6:

Example: Harbor Study Protocol

… the Bad …

“Pearson correlation, repeated measure of the general linear model, ANOVA analyses and student t tests will be used where appropriate. …”

Because:

• Boilerplate.

• These methods are almost always used.

• “Where appropriate”?

• Tries to satisfy reviewer, not science.

Page 7:

Example: Harbor Study Protocol

… and the Ugly.

“Similar comparisons as for … [A and B] will be carried out …”

Because:

• Primary (1°) OK: Difference between 2 visits for 2 measures, A & B.

• But, 15 measures taken at each of 19 visits.

• Torture the data long enough, and it will confess to something.
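To put rough numbers on the "ugly" case, the sketch below counts how many visit-to-visit comparisons 15 measures at 19 visits could support and how quickly the chance of at least one false positive grows. It assumes, purely for illustration, that every test is independent and run at the 0.05 level; real clinical measures are correlated, so the figures are only indicative. The function name is hypothetical.

```python
from math import comb

def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability of at least one p < alpha among n_tests independent
    tests when no real effect exists (independence is an assumption)."""
    return 1 - (1 - alpha) ** n_tests

measures, visits = 15, 19
visit_pairs = comb(visits, 2)            # possible visit-to-visit differences
possible_tests = measures * visit_pairs  # one test per measure per visit pair

for k in (2, 15, possible_tests):
    print(f"{k:5d} tests -> P(any false positive) = "
          f"{chance_of_any_false_positive(k):.3f}")
```

With only the two primary comparisons the chance stays near 10%; it passes 50% once each of the 15 measures is tested even once.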

Page 8:

Goals of this Presentation

More good.

Less bad.

Less ugly.

Page 9:

Biostatistical Involvement in Studies

Off-site statistical design and analysis

Multicenter studies; data coordinating center.

In-house drug company statisticians.

CRO through NIH or drug company.

Local study contracted elsewhere

e.g. UCLA, USC, CRO.

Local protocol, and statistical design and analysis

Occasionally multicenter.

Page 10:

Studies with Off-Site Biostatistics

Not responsible for statistical design and analysis.

Are responsible for study conduct that may:

• … impact analysis, believability of results.

• … reduce sensitivity (power) of the study to be able to detect effects.

Page 11:

Review of Basic Method of Inference from Clinical Studies

Page 12:

Typical Study Data Analysis

Large enough “signal-to-noise ratio” → Proves an effect beyond a reasonable doubt. Often:

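\[ \text{Ratio} = \frac{\text{Signal}}{\text{Noise}} = \frac{\text{Observed Effect}}{\text{Natural Variation}/\sqrt{N}} \]

For a t-test comparing two groups:

\[ t\ \text{Ratio} = \frac{\text{Difference in Means}}{\text{SD}/\sqrt{N}} \]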

Degree of allowable doubt → How large t needs to be.

5% (p<0.05) → |t| > ~2
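As a concrete illustration of the signal-to-noise idea, the sketch below computes a pooled-variance two-sample t ratio by hand. The data are invented for the example, and the noise term is written out as SD·√(1/n₁ + 1/n₂), a more explicit form of the slide's SD/√N. The exact cutoff depends on the degrees of freedom; the "~2" rule is the large-sample approximation.

```python
import math
from statistics import mean, stdev

def two_sample_t(a, b):
    """Pooled-variance two-sample t ratio: signal divided by noise."""
    na, nb = len(a), len(b)
    signal = mean(a) - mean(b)  # difference in means
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    noise = math.sqrt(pooled_var * (1 / na + 1 / nb))  # SD scaled by group sizes
    return signal / noise

# Hypothetical outcome values for two groups of six subjects.
treated = [12.1, 9.8, 11.4, 13.0, 10.7, 12.5]
control = [9.9, 8.7, 10.2, 9.1, 10.8, 9.4]
t = two_sample_t(treated, control)
print(f"t = {t:.2f}; exceeds ~2: {abs(t) > 2}")
```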

Page 13:

Meaning of p-value

p-value: Probability of a test statistic (ratio) that is at least as deviant as was observed, if there is really no effect.

Smaller p-values ↔ more evidence of effect.

Validity of p-value interpretation typically requires:

• Proper data generation, e.g., randomness.

• Subjects provide independent information.

• Data is not used in other statistical tests.

or: an accounting for not satisfying these criteria.

→ p-values are earned by satisfying these criteria appropriately.
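The definition above can be made tangible with a permutation test, which is a swapped-in technique for illustration rather than the method named on the slide. If there is really no effect, group labels are arbitrary, so the sketch reshuffles them many times and counts how often the difference in means is at least as deviant as the one observed. The data, seed, and function name are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(a, b, n_shuffles=5000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_deviant = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # relabel subjects as if group made no difference
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_deviant += 1
    return at_least_as_deviant / n_shuffles

# Hypothetical outcome values for two groups of six subjects.
treated = [12.1, 9.8, 11.4, 13.0, 10.7, 12.5]
control = [9.9, 8.7, 10.2, 9.1, 10.8, 9.4]
p = permutation_p_value(treated, control)
print(f"permutation p-value = {p:.3f}")
```

The shuffling step is only justified when subjects were assigned at random and provide independent information, which is the sense in which the p-value has to be earned.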

Page 14:

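Analogy with Diagnostic Testing

Analogy: True Effect ↔ Disease; Study Claim ↔ Diagnosis.

Truth: No Effect; Study Claims: No Effect → Correct (Specificity). Typical: Set p≤0.05, so Specificity=95%.

Truth: No Effect; Study Claims: Effect → Error.

Truth: Effect; Study Claims: No Effect → Error.

Truth: Effect; Study Claims: Effect → Correct (Sensitivity). Power: Maximize. Typical: Choose N for 80%.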

Page 15:

Study Conduct Impacting Analysis

↓ effect detectability (and ↓ ratio) results from:

Non-adherence of study personnel to the protocol in general. [Increases variation.]

Enrolling subjects who do not satisfy inclusion or exclusion criteria. [Can decrease observed effect. E.g., no effect in 10% wrongly included & real effect=50% → ~0.9(50%) = 45% observed effect.]

Subjects not completing entire study. [May decrease N, or give potentially conflicting results.]
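The dilution arithmetic in the wrongly-enrolled example can be sketched as below. The second function adds one step that is not on the slide: under the standard approximation that required N scales as 1/effect² with SD unchanged, a diluted effect also raises the number of subjects needed for the same power. Both function names are hypothetical.

```python
def diluted_effect(true_effect, fraction_wrongly_included, effect_in_wrong=0.0):
    """Average effect observed when some enrolled subjects cannot respond."""
    f = fraction_wrongly_included
    return (1 - f) * true_effect + f * effect_in_wrong

def sample_size_inflation(true_effect, observed_effect):
    """Approximate factor by which N must grow to keep the same power,
    assuming required N scales as 1 / effect**2 with SD unchanged."""
    return (true_effect / observed_effect) ** 2

observed = diluted_effect(true_effect=50.0, fraction_wrongly_included=0.10)
print(f"observed effect = {observed:.0f}%")
print(f"N inflation = {sample_size_inflation(50.0, observed):.2f}x")
```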

Page 16:

Potentially Conflicting Results

Example: Subjects not completing the entire study.

Page 17:

Tiagabine Study Results: How Believable?

[Figure: study results, with three items labeled 1, 2, and 3.]

Conclusions differ depending on how non-completing subjects (24%) are handled in the analysis.

Primary analysis here is specified, but we would prefer robustness to the method of analysis (agreement), which is more likely with more completing subjects.

Page 18:

Study Conduct Impacting Analysis

Intention-to-Treat (ITT)

Continued …

ITT typically specifies that all subjects are included in analysis, regardless of treatment compliance or whether lost to follow-up.

Purposes: Avoid bias from subjective exclusions or differential exclusion between treatment groups; sometimes argued to mimic non-compliance in real world setting.

More emphasis on policy implications of societal effectiveness than on scientific efficacy.

Not appropriate for many studies.

Page 19:

Study Conduct Impacting Analysis

Intention-to-Treat (ITT)

Lost to follow-up:

Always minimize; no “real world” analogy as for treatment compliance.

Need to define outcomes for non-completing subjects.

Current Harbor study:

Planned N≈1200 would become N≈3000 if ITT were used, with 20% lost and lost subjects counted as treatment failures.

Page 20:

ITT: Need to Impute Unknown Values

[Figure: two panels of change from baseline (reference line at 0) for individual subjects at baseline, intermediate visit, and final visit.]

LOCF (last observation carried forward; applied to observations): Ignore presumed progression.

LRCF (last rank carried forward; applied to ranks): Maintain expected relative progression.
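A minimal sketch of LOCF imputation follows, assuming each subject's visits are stored in time order with `None` marking a missed visit. The subject labels and values are invented. LRCF is not implemented here; it would carry forward each subject's rank among those observed at the last attended visit rather than the raw value.

```python
def locf(visits):
    """Last observation carried forward: fill each missing visit (None)
    with the most recent observed value for that subject."""
    filled, last_seen = [], None
    for value in visits:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical change-from-baseline values at baseline, intermediate, final.
subjects = {
    "completer": [0.0, -2.0, -5.0],
    "lost_after_intermediate": [0.0, -2.5, None],
}
for name, visits in subjects.items():
    print(name, locf(visits))
```

The non-completer's final value is frozen at the intermediate visit, which is exactly how LOCF ignores any presumed progression.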

Page 21:

Study Conduct Impacting Feasibility

Potential Effects of Slow Enrollment

• Needed N may be impossible → Study stopped.

• Competitive site enrollment → Local financial loss.

• Insufficient person-years (PY) of observation for some studies, even if N is attained:

[Figure: number of subjects (up to N) versus year (0 to 2) for three enrollment scenarios; area under each curve = PY. Planned: detects effect = Δ. Slower: detects effect = 1.1Δ. Yet slower: detects effect = 1.7Δ.]

Page 22:

Biostatistical Involvement in Studies

Off-site statistical design and analysis

Multicenter studies; data coordinating center.

In-house drug company statisticians.

By CRO through NIH or drug company.

Local study contracted elsewhere

e.g. UCLA, USC, CRO

Local protocol, and statistical design and analysis

Occasionally multicenter.

Page 23:

Local Protocols and Data Analysis

1. Develop protocol and data analysis plan.

2. Have randomization and blinding strategy, if study requires.

3. Data management.

4. Perform data analyses.

Page 24:

Local Data Analysis Resources

Biostatistician:

Peter Christenson, [email protected].

Develop study design, analysis plan.

Advise throughout for any study.

Perform all non-basic analyses.

Full responsibility for studies with funded %FTE.

Review some protocols for committees.

Data Management:

Database development for GCRC studies by database manager.

Page 25:

Statistical Components of Protocols

• Target population / source of subjects.

• Quantification of aims, hypotheses.

• Case definitions, endpoints quantified.

• Randomization plan, if any.

• Masking, if used.

• Study size: screen, enroll, complete.

• Use of data from non-completers.

• Justification of study size (power, precision, other).

• Methods of analysis.

• Mid-study analyses.

Page 26:

Selected Statistical Components and Issues

Page 27:

Case Definitions and Endpoints

• Primary case definitions and endpoints need careful thought.

• Will need to report results based on these.

Example: Study at Harbor

Definition of cure very strict.

Analyzed data with this definition.

Cure rates too low - would not be taken seriously.

Scientific method → need to report them; otherwise cherry-picking.

Publication: Use primary definition; explain; also report with secondary definition. Less credible.

Page 28:

Randomization

• Helps assure attributability of treatment effects.

• Blocked randomization assures approximate chronologic equality of numbers of subjects in each treatment group.

• Recruiters must not have access to randomization list.

• List can be created with a random number generator in software, printed tables in stat texts, or even shuffled slips of paper.
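One way a blocked randomization list could be generated in software is sketched below. The arm labels, block size of 4, seed, and function name are all assumptions made for the example; a real study would fix these in the protocol and keep the resulting list away from recruiters.

```python
import random

def blocked_randomization(n_subjects, arms=("A", "B"), block_size=4, seed=2007):
    """Build an allocation list in which every complete block holds
    equal numbers of each arm, in shuffled order."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the list can be regenerated
    allocations = []
    while len(allocations) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]

schedule = blocked_randomization(12)
print(schedule)
```

Because each block of four contains two of each arm, the group sizes never differ by more than two at any point during enrollment.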

Page 29:

Non-completing Subjects

• Enrolled subjects are never “dropouts”.

• Protocol should specify:

– Primary analysis set (e.g., ITT or per-protocol).

– How final values will be assigned to non-completers.

• Time-to-event (survival analysis) studies may not need final assignments; use time followed.

• Study size estimates should incorporate the number of expected non-completers.
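For analyses that rely on completers, the last bullet amounts to dividing the required number of completers by the expected completion fraction. The sketch below assumes a simple constant non-completion rate; the target of 80 completers is hypothetical, and the 24% rate is borrowed from the earlier non-completer example only for illustration.

```python
import math

def subjects_to_enroll(n_completers_needed, expected_noncompletion):
    """Enrollment target so that the expected number of completers
    still meets the size required by the power calculation."""
    if not 0 <= expected_noncompletion < 1:
        raise ValueError("expected_noncompletion must be in [0, 1)")
    return math.ceil(n_completers_needed / (1 - expected_noncompletion))

print(subjects_to_enroll(80, 0.24))
```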

Page 30:

Study Size: Power

Power = Probability of detecting real effects of a specified minimal (clinically relevant) magnitude

• Power will be different for each outcome.

• Power depends on the statistical method.

• Five factors including power are inter-related. Fixing four of these specifies the fifth:

– Study size

– Heterogeneity among subjects (SD)

– Magnitude of treatment effect to be detected

– Power to detect this magnitude of effect

– Acceptable chance of false positive conclusion, usually 0.05
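The "fix four, get the fifth" relationship can be sketched for one common case: a two-sided comparison of means between two equal-sized groups, using the normal approximation. This is an illustrative formula, not necessarily what any particular software package uses, and the SD and effect values are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(sd, effect, power=0.80, alpha=0.05):
    """Subjects per group for a two-sided, two-group comparison of means
    (normal approximation; a t-based calculation gives slightly more)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Hypothetical inputs: SD = 10 units, minimal relevant effect = 5 units.
for power in (0.80, 0.90):
    print(f"power {power:.0%}: n per group = {n_per_group(10, 5, power=power)}")
```

Here SD, effect, power, and the false positive rate are fixed and study size falls out; rearranging the same formula yields any of the other four.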

Page 31:

Free Study Size Software

www.stat.uiowa.edu/~rlenth/Power

Page 32:

Free Study Size Software: Example

Pilot data: SD=8.19 in 36 subjects.

We propose N=40 subjects/group in order to provide 80% power to detect (p<0.05) an effect Δ of 5.2:
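As a rough cross-check of this example, the sketch below solves for the detectable effect given SD=8.19 and 40 subjects per group, using a normal approximation for a two-sided, two-group comparison of means. It gives about 5.1, slightly below the slide's 5.2; a gap of that size would be expected if the software uses the t distribution, though that is an assumption about the tool rather than something stated here.

```python
import math
from statistics import NormalDist

def detectable_effect(sd, n_per_group, power=0.80, alpha=0.05):
    """Smallest difference in means detectable with the given power
    (two-sided, two-group comparison; normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * math.sqrt(2 / n_per_group)

delta = detectable_effect(sd=8.19, n_per_group=40)
print(f"detectable effect = {delta:.2f}")
```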

Page 33:

Study Size: May Not Be Based on Power

Precision refers to how well a measure is estimated.

Margin of error = the ± value (half-width) of the 95% confidence interval.

Smaller margin of error ←→ greater precision.

To achieve a specified margin of error, solve the CI formula for N.

Polls: N ≈ 1000 → margin of error on % ≈ 1/√N ≈ 3%.

Pilot Studies, Phase I, Some Phase II: Power not relevant; may have a goal of obtaining an SD for future studies.
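The polling rule of thumb can be checked with the usual normal-approximation confidence interval for a proportion, taking the worst case p = 0.5. The function names are hypothetical, and the approximation is only appropriate for reasonably large N.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def margin_of_error(n, p=0.5):
    """Half-width of the 95% CI for a proportion (normal approximation)."""
    return Z95 * math.sqrt(p * (1 - p) / n)

def n_for_margin(margin, p=0.5):
    """Solve the CI half-width formula for N."""
    return math.ceil(p * (1 - p) * (Z95 / margin) ** 2)

print(f"N=1000 -> margin = {margin_of_error(1000):.3f}")
print(f"margin 0.03 -> N = {n_for_margin(0.03)}")
```

At p = 0.5 the half-width is 1.96 × 0.5/√N, which is where the 1/√N shortcut comes from.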

Page 34:

Mid-Study Analyses

• Mid-study comparisons should not be made before study completion unless planned for (interim analyses). Early comparisons are unstable, and can invalidate final comparisons.

• Interim analyses are planned comparisons at specific times, usually by an unmasked advisory board. They allow stopping the study early due to very dramatic effects, and final comparisons, if study continues, are adjusted to validly account for “peeking”.

Continued …

Page 35:

Mid-Study Analyses

[Figure: estimated effect (reference line at 0) versus number of subjects enrolled over time. Annotations: Too many analyses; Wrong early conclusion.]

Need to monitor, but also account for many analyses
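A small simulation can show why unplanned peeking matters. The sketch below generates studies in which there is truly no effect and compares two habits: testing once at the planned end, versus testing after every 10 subjects and claiming an effect at the first p<0.05. The z-test with known SD = 1, the look schedule, and the seed are simplifying assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def false_positive_rates(n_final=100, look_every=10, n_sims=2000, seed=1):
    """Simulate studies with NO true effect and compare testing once at
    the end against testing at every look (z-test, known SD = 1)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)
    hits_final = hits_any_look = 0
    for _ in range(n_sims):
        total, significant_at_some_look, z = 0.0, False, 0.0
        for n in range(1, n_final + 1):
            total += rng.gauss(0.0, 1.0)  # one subject's paired difference
            if n % look_every == 0:
                z = total / n ** 0.5      # mean / (SD / sqrt(n)) with SD = 1
                if abs(z) > z_crit:
                    significant_at_some_look = True
        hits_final += abs(z) > z_crit     # z now holds the final-look value
        hits_any_look += significant_at_some_look
    return hits_final / n_sims, hits_any_look / n_sims

once, peeking = false_positive_rates()
print(f"test once at end: {once:.3f}; stop at first p<0.05: {peeking:.3f}")
```

The single final test stays near the nominal 5%, while the repeated-look rate is several times higher, which is the inflation that planned interim analyses adjust for.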

Page 36:

Mid-Study Analyses

• Mid-study reassessment of study size is advised for long studies. Only standard deviations to date, not effects themselves, are used to assess original design assumptions.

• Feasibility analysis:

– may use the assessment noted above to decide whether to continue the study.

– may measure effects, like interim analyses, by unmasked advisors, to project ahead on the likelihood of finding effects at the planned end of study.

Continued …

Page 37:

Mid-Study Analyses

Examples: Studies at Harbor

Randomized; not masked; data available to PI.

Compared treatment groups repeatedly, as more subjects were enrolled.

Study 1: Groups do not differ; plan to add more subjects.

Consequence → final p-value not valid; probability requires no prior knowledge of effect.

Study 2: Groups differ significantly; plan to stop study.

Consequence → use of this p-value not valid; the probability requires incorporating later comparison.

Page 38:

Multiple Analyses at Study End

Lagakos NEJM 354(16):1667-1669.

Replacing “Subgroup” with “Analysis” Gives a Similar Problem

Torturing Data

False Positive

Conclusions

Page 39:

Multiple Analyses at Study End

• There are formal methods to incorporate the number of multiple analyses.

• Bonferroni

• Tukey

• Dunnett

• Transparency of what was done is most important.

• Should be aware of number of analyses and report it with any conclusions.
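Of the formal methods listed, Bonferroni is the simplest to sketch: each p-value is compared with 0.05 divided by the number of analyses performed. The p-values below are invented for the example; Tukey and Dunnett adjustments need the underlying group data and are not shown.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (threshold, flags): each p-value is compared with
    alpha divided by the number of analyses performed."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# Hypothetical p-values from five analyses of one study.
p_values = [0.003, 0.020, 0.040, 0.300, 0.700]
threshold, flags = bonferroni(p_values)
print(f"{len(p_values)} analyses, per-test threshold = {threshold:.3f}")
print(flags)
```

Three of these p-values are below 0.05, but only one survives once the five analyses are counted, which is why the number of analyses should be reported with any conclusion.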

Page 40:

Summary: Bad Science That May Seem So Good

1. Re-examining data, or using many outcomes, seeming to be performing due diligence.

2. Adding subjects to a study that is showing marginal effects; or, stopping early due to strong results.

3. Examining effects in subgroups. See NEJM 2006 354(16):1667-1669.

Actually bad? Could be negligent NOT to do these, but need to account for doing them.

Page 41:

Statistical Software

Page 42:

Professional Statistics Software Package

Output

Enter code; syntax.

Stored data; accessible.

Page 43:

Microsoft Excel for Statistics

• Primarily for descriptive statistics.

• Limited output.

Page 44:

Almost Free On-Line Statistics Software

Run from browser; not local.

$5 per 6 months of usage.

Potential HIPAA concerns.

www.statcrunch.com

Supported by NSF

Page 45:

Typical Statistics Software Package

Select Methods from Menus

Output after menu selection

Data in spreadsheet

www.ncss.com

www.minitab.com

www.stata.com

$100 - $500

Page 46:

http://gcrc.labiomed.org/biostat

This and other biostat talks posted.

Page 47:

Conclusions

Don’t put off slow enrollment; find the cause; solve it.

I am available.

Do put off analyses of efficacy, not of design assumptions.

I am available.

P-values are earned, by following methods which are needed for them to be valid.

I am available.

You may have to pay for lack of attention to protocol decisions, to satisfy the scientific method.

I am available.

Software always takes more time than expected.

Page 48:

Thank You

Nils Simonson, in Furberg & Furberg, Evaluating Clinical Research