Introduction to Statistics: Frequentist & Bayesian Approaches (for Non-Statisticians) Ryung Suh, MD...


Page 1: Introduction to Statistics: Frequentist & Bayesian Approaches (for Non-Statisticians) Ryung Suh, MD Becker & Associates Consulting, Inc. Internal Staff.

Introduction to Statistics: Frequentist & Bayesian Approaches (for Non-Statisticians)

Ryung Suh, MD

Becker & Associates Consulting, Inc.

Internal Staff Training

June 8, 2004

[email protected]@becker-consult.com


Objectives

• To provide a basic understanding of the terms and concepts that underlie statistical analyses of clinical trials data
• To introduce Bayesian approaches and their application to FDA submissions


Table of Contents

• Sources of Statistical Data
• Frequentist Approaches
• Bayesian Approaches
• Insights from the Experts (from the Bayesian Approaches meeting, May 20-21, 2004)
• Take-Aways and Strategic Insights
• Corporate Resources


Sources of Data

• Retrospective Studies: Design, Bias, Matching, Relative Risk, Odds Ratio
• Prospective Studies: Design, Loss to Follow-up, Analysis, Relative Risk, Nonconcurrent Prospective Studies, Incidence, Prevalence
• Randomized Controlled Trials: Design, Elimination of Bias, Placebo Effect, Analysis
• Survival Analysis: Person-Time, Life-Tables, Proportional Hazard Models


FREQUENTIST APPROACHES


Classical = Frequentist

• Hypothesis Testing: In order to draw a valid statistical inference that an independent variable has a statistically significant effect (not the same as a clinically significant effect), it is important to rule out chance or random variability as an explanation for the effects seen in a sampling distribution.


Statistical Inference

• Two inferential techniques:
– Hypothesis Testing
– Confidence Intervals
• Inference is the process of making statements (hypotheses) with a degree of statistical certainty about population parameters based on a sampling distribution


Hypothesis Testing: Terms

• Null Hypothesis = Ho = initially held to be true unless proven otherwise
– e.g. there is NO difference between treatment and control
– e.g. µ = 11, or μ2 – μ1 = 0
– Akin to “the accused is innocent”
• Alternative Hypothesis = Ha = the claim we usually want to prove
– e.g. there is a difference between treatment and control
– e.g. µ ≠ 11, or μ2 – μ1 ≠ 0
– Akin to “the accused is guilty”
• We assume innocence until proven guilty “beyond a reasonable doubt”… the same applies with Ho


Hypothesis Testing: Decisions

• Decision Options:
– Reject Ho (and assert Ha to be true)
– Fail to Reject Ho (due to insufficient evidence)

• Errors in Decisions:

Decision             Reality: Ho is true    Reality: Ho is false
Reject Ho            TYPE I ERROR           No Error
Fail to Reject Ho    No Error               TYPE II ERROR


Level of Significance

• Alpha = α = P(Type I Error) = P(Reject Ho | Ho is true)
• Beta = β = P(Type II Error) = P(Fail to Reject Ho | Ho is false)
• Power = 1 - β
• We want both α and β to be small… but for a fixed sample size, decreasing one increases the other…

[Figure: overlapping sampling distributions under the Null Hypothesis and the Alternative Hypothesis]

This example is a simplification to aid understanding; the exact β is generally unknown, although a large β is frequently due to sample sizes that are too small.
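The tradeoff between α, β, and sample size can be sketched numerically. The following is a minimal illustration, assuming a one-sided z-test with a known σ and a single specific alternative mean; the function name and all numbers are made up for the example and are not from the slides.

```python
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided z-test of Ho: mu = mu0 vs Ha: mu = mu1 > mu0.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    se = sigma / n ** 0.5                               # standard error of the sample mean
    critical = NormalDist(mu0, se).inv_cdf(1 - alpha)   # reject Ho above this sample mean
    beta = NormalDist(mu1, se).cdf(critical)            # P(fail to reject Ho | Ha is true)
    return 1 - beta

# Hypothetical values: Ho mean 100, Ha mean 105, sigma 15
for n in (10, 40):
    for alpha in (0.05, 0.01):
        print(n, alpha, round(power_one_sided_z(100, 105, 15, n, alpha), 3))
```

Under these assumptions, lowering α at a fixed n lowers the power (raises β), while raising n improves the power at a fixed α.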


Sampling Distribution

• Population Distribution: usually a normal distribution with a mean of μ and a variance of σ² (but it is tough to measure the entire population)
• Sampling Distribution: a distribution of means from random samples drawn from the population; the sample mean is a random variable (X̄), normally distributed with a mean (μX̄) and a variance of (σ²/n)
– Take random samples from the population and calculate a statistic
– Describes the chance fluctuations of the statistic and the variability of sample averages around the population mean, for a given sample size (n)
– Sample mean (X̄) serves as a point estimate for the population mean (μ)
– Central Limit Theorem: as n → ∞, the sampling distribution approaches a normal distribution (and the estimate becomes more precise)
• http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/
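A small simulation can make the sampling distribution concrete. This is a sketch only, assuming a normal population with made-up values for μ, σ, and n; it draws many random samples and compares the spread of their means with the theoretical σ²/n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 50.0, 10.0, 25   # hypothetical population mean, s.d., and sample size

def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))

# The collection of sample means approximates the sampling distribution
means = [sample_mean(N) for _ in range(5000)]

print("mean of sample means:    ", round(statistics.fmean(means), 2))
print("variance of sample means:", round(statistics.variance(means), 2))
print("theoretical sigma^2 / n: ", SIGMA ** 2 / N)
```

The mean of the sample means should sit close to μ, and their variance close to σ²/n, which is why larger samples give more precise estimates.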


Determining the P(X̄ = μ)

• Key Question: Does the sample mean reflect the population mean, given the effects of variability/chance?
• If the population standard deviation (σ) is known, we can standardize (mean=0; s.d.=1) and compare:

Z = (X̄ - μX̄) / (σ / √n)

• If σ is unknown, we can estimate σ from the same set of sample data (s) and compare with a Student's t-distribution:

T = (X̄ - μX̄) / (s / √n)

– a continuous distribution symmetric about zero
– an infinite number of t-distributions indexed by degrees of freedom
– as degrees of freedom (n-1) increase, t-distributions approach the standard normal distribution
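The two standardized statistics above can be computed directly. The sketch below assumes a hypothesized mean `mu0` standing in for μX̄ under the null hypothesis; the function names and the sample values are made up for illustration.

```python
import math
import statistics

def z_statistic(sample, mu0, sigma):
    """Z = (sample mean - mu0) / (sigma / sqrt(n)), for a known population s.d."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))

def t_statistic(sample, mu0):
    """T = (sample mean - mu0) / (s / sqrt(n)), with s estimated from the sample."""
    n = len(sample)
    s = statistics.stdev(sample)   # sample s.d. (n - 1 in the denominator)
    return (statistics.fmean(sample) - mu0) / (s / math.sqrt(n))

sample = [12.1, 11.4, 10.9, 12.6, 11.8, 11.2, 12.3, 11.7]   # hypothetical measurements
print("Z:", z_statistic(sample, mu0=11, sigma=0.6))
print("T:", t_statistic(sample, mu0=11), "with", len(sample) - 1, "degrees of freedom")
```

The Z value is compared against the standard normal distribution, and the T value against the t-distribution with n-1 degrees of freedom.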


Normal versus t-distribution

[Figure: density curves for N(0,1), t(1), and t(5)]

T-distributions are “flatter” and have more area in the tails compared to Normal distributions

T-distributions approximate the Normal as degrees of freedom (n-1) increase


Hypothesis Testing: More Terms

• Test Statistic: the computed statistic used to make the decisions in hypothesis testing; relates to a probability distribution (e.g. Z, t, χ²)
• Critical Region: contains the values of the test statistic such that Ho is rejected
• Critical Value: the endpoint(s) of the critical region
• One-tailed versus two-tailed tests: depends on Ha
• P-Value: the smallest value of α such that Ho will be rejected (a probability associated with the calculated value of the test statistic)


Steps in Hypothesis Testing: The Classical/Frequentist Approach

• Define parameter and specify Ho and Ha
• Specify n (sample size), α (significance level), the test statistic, and the critical value(s) and critical regions
• Take a sample and compute the value of the test statistic; compare to the relevant probability distribution
• Reject or fail to reject Ho and draw statistical inferences

* Remember: the P-value is not the probability of the null hypothesis being true (the null hypothesis is either true or not). The P-value is the probability, assuming Ho is true, of a result at least as extreme as the one observed; it is the smallest level of significance at which Ho would be rejected.
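The steps above can be sketched end to end. This is a minimal illustration, assuming a two-tailed z-test with a known σ; the function name, the returned dictionary keys, and the data are hypothetical.

```python
from statistics import NormalDist, fmean

def two_tailed_z_test(sample, mu0, sigma, alpha=0.05):
    """Test Ho: mu = mu0 against Ha: mu != mu0, assuming a known population s.d."""
    std_normal = NormalDist()                      # N(0, 1)
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)   # test statistic
    critical = std_normal.inv_cdf(1 - alpha / 2)     # critical region is |z| > critical
    p_value = 2 * (1 - std_normal.cdf(abs(z)))       # P(result at least this extreme | Ho)
    return {"z": z, "critical": critical, "p_value": p_value,
            "reject_Ho": abs(z) > critical}

# Hypothetical data: Ho says the mean is 11, sigma assumed known to be 2
result = two_tailed_z_test([14, 13, 15, 14], mu0=11, sigma=2)
print(result)
```

Rejecting Ho when |z| exceeds the critical value is the same decision as rejecting Ho when the P-value is below α.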


Confidence Intervals

• CI for (1-α)100%: X̄ ± t(n-1, α/2)(s/√n)
– Provides CI for population mean (μ) at the chosen level of confidence (e.g. 90%, 95%, 99%)
– Provides interval estimate of the population mean (vs. the point estimate that the sample mean gives)
– Depends on the amount of variability in the data
– Depends on the level of certainty we require
– Increasing (1-α) will increase the CI width
– Increasing sample size (n) will decrease the CI width
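The interval formula can be tried on a small data set. In this sketch the critical value t(n-1, α/2) is passed in by hand, read from a standard t table, because the Python standard library has no t quantile function; the function name and the sample are made up.

```python
import math
import statistics

def t_confidence_interval(sample, t_critical):
    """Return (lower, upper) = sample mean -/+ t_critical * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = t_critical * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width

sample = [98, 102, 101, 97, 105, 99, 100, 103, 96, 104]   # hypothetical, n = 10

# Table values for 9 degrees of freedom: t(9, 0.025) = 2.262, t(9, 0.005) = 3.250
ci95 = t_confidence_interval(sample, 2.262)
ci99 = t_confidence_interval(sample, 3.250)
print("95% CI:", ci95)
print("99% CI:", ci99)
```

The 99% interval comes out wider than the 95% interval for the same data, matching the point that a higher confidence level widens the CI.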


Issues for Frequentists (and others)

• Multiplicity: the chance of a Type I error when multiple hypotheses are tested is larger than the chance of a Type I error in each hypothesis test
• Multiple Endpoints: Frequentists worry about the dimensions of the sample space (the Bayesian looks at the dimensions of the parameter space)… both tend to be skeptical of “believing what he thinks he sees in high-dimensional problems” (Permutt)
• Multiple Looks: Trials are expensive, so sequential methods are attractive; but stopping rules tend to be fixed in frequentist approaches
• Multiple Studies: Frequentist meta-analysis (to look at combined evidence from several studies) cannot rely simply on a fixed p-value (i.e. 0.05); it must look at the entirety of the evidence and the strength of each piece
• Garbage In, Garbage Out
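The multiplicity point can be quantified in the simplest case. The sketch below assumes the k tests are independent and each is run at the same level α, which is an idealization rather than something the slides state; the function name is made up.

```python
def familywise_error_rate(alpha, k):
    """P(at least one Type I error) across k independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 20):
    print(k, "tests:", round(familywise_error_rate(0.05, k), 3))
```

With a single test the rate equals α, and it grows quickly as more hypotheses are tested.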


BAYESIAN APPROACHES


Bayesian Statistics

• Thomas Bayes (1702-1761): English theologian and mathematician; “Essay towards solving a problem in the doctrine of chances” (1763)
• Bayesian methods: iterative processes that make better decisions based on learning from experiences
– combines a prior probability distribution for the states of nature with new sample information
– the combined data gives a revised probability distribution about the states of nature, which is then used as a prior probability distribution with new (future) sample information
– and so on and so on
• Key feature: using an empirically derived probability distribution for a population parameter
– May use objective data or subjective opinions in specifying a prior distribution
– Criticized for lack of objectivity in specifying prior probability distribution


A Bayesian example

• From http://www.abelard.org/briefings/bayes.htm
• 15 blue taxis; 85 black taxis; only 100 taxis in the entire town
• Witness claims seeing a blue taxi in hit-and-run
• Witness is given a randomly ordered test
– successfully identifies 4/5 taxis correctly (80%)
• “If witness claims blue, how likely is she to have the color correct?”
– Blue taxis (15): at 80% accuracy, 12 are called blue and 3 are called black
– Black taxis (85): at 80% accuracy, 68 are called black and 17 are called blue
• In the given sample space, 12/29 claims of “blue” are actually blue taxis (41%)
• A claim of “black” would be correct 68/71 of the time (in the given sample space) = 96%
• Bayesians take into account the rate of “false positives” for black taxis as well as for blue taxis (note that black taxis are in greater supply here)
– Bayesian stats useful for calculating relatively small risks (e.g. rare disorders)
– Bayesian stats useful in non-random distributions
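The taxi arithmetic is a direct application of Bayes' theorem and can be checked in a few lines. This is a sketch using exact fractions; the function name is made up, and it assumes the witness is equally accurate for both colors, as in the example.

```python
from fractions import Fraction

def posterior_given_claim(prior, accuracy):
    """P(taxi really is color C | witness says C), via Bayes' theorem.

    prior    = share of taxis that are color C
    accuracy = probability the witness names any taxi's color correctly
    """
    true_claim = prior * accuracy                 # color C, correctly called C
    false_claim = (1 - prior) * (1 - accuracy)    # other color, wrongly called C
    return true_claim / (true_claim + false_claim)

blue = posterior_given_claim(Fraction(15, 100), Fraction(4, 5))
black = posterior_given_claim(Fraction(85, 100), Fraction(4, 5))
print("P(blue  | says blue): ", blue, "=", round(float(blue), 2))
print("P(black | says black):", black, "=", round(float(black), 2))
```

Even with an 80% accurate witness, a claim of “blue” is more likely wrong than right, because blue taxis are rare in this town.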


Perspectives on Probability

• Frequentist: probability = the relative frequency of an event, given the experiment is repeated an infinite number of times
• Bayesian: probability = “degree of belief” or the likelihood of an event happening given what is known about the population


Bayesian Hypothesis Testing

• Non-Bayesians: navigate the optimal tradeoff between the probabilities of a “false alarm” (Type I error) and a “miss” (Type II error)
– One can compare the likelihood ratio of the data under the two hypotheses to a nonnegative threshold value (or the log likelihood ratio to an arbitrary real threshold value)
– Increasing the threshold makes the test less “sensitive” (higher chance of a “miss”); decreasing the threshold makes the test more sensitive (but with a higher chance of a “false alarm”)
– More data improves the achievable tradeoff (the limit relation is often given as Stein's lemma, in which the error exponent approaches the Kullback-Leibler distance)
• Bayesians: instead of optimizing a probability tradeoff, a “miss” event or “false alarm” event is assigned costs; additionally, we have prior distributions
– Decision function is based on the Bayes Risk, or expected costs
– Threshold value is a function of costs and priors
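How a threshold can depend on costs and priors is easiest to see in the simplest textbook setting. The sketch below assumes two simple hypotheses and zero cost for correct decisions; under those assumptions the expected cost is minimized by choosing Ha when the likelihood ratio exceeds the threshold computed here. The function names and numbers are illustrative only.

```python
def bayes_threshold(prior_ho, cost_false_alarm, cost_miss):
    """Likelihood-ratio threshold that minimizes expected cost.

    Assumes correct decisions cost nothing and P(Ha) = 1 - P(Ho).
    """
    prior_ha = 1 - prior_ho
    return (prior_ho * cost_false_alarm) / (prior_ha * cost_miss)

def decide(likelihood_ha, likelihood_ho, threshold):
    """Choose Ha when P(data | Ha) / P(data | Ho) exceeds the threshold."""
    return "Ha" if likelihood_ha / likelihood_ho > threshold else "Ho"

# Hypothetical case: Ho and Ha equally likely a priori, false alarms 10x as costly
threshold = bayes_threshold(prior_ho=0.5, cost_false_alarm=10, cost_miss=1)
print("threshold:", threshold)
print(decide(likelihood_ha=0.30, likelihood_ho=0.05, threshold=threshold))
```

Making false alarms more costly, or Ho more probable a priori, raises the threshold and so makes the test less sensitive.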


Bayesian Parameter Estimation

• Non-Bayesians: the probability of an event is estimated as the empirical frequency of the event in a data sample
• Bayesians: include empirical “prior information”; as the data sample goes to infinity, the effects of the past trial wash out
– If there is no empirical “prior information,” it is possible to create a prior distribution based on reasonable beliefs
– We calculate the posterior distribution from the sample data and the prior distribution using Bayes' Theorem:

P(A|B) = [ P(B|A) * P(A) ] / P(B)

– The posterior becomes the new prior distribution for the next batch of data; with a conjugate prior (one for which the posterior belongs to the same family as the prior), this process allows efficient sequential updating of the posterior distributions as the study proceeds

The “output” of the Bayesian analysis is the entire posterior distribution (not just a single point estimate); it summarizes ALL our information to date

As we get more data, the posterior distribution will become more sharply peaked about a single value
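Sequential updating is easy to see with a conjugate pair. The sketch below assumes a Beta prior on a success probability with binomial data, a standard conjugate combination that the slides do not name; the batch counts and function names are made up.

```python
def update_beta(a, b, successes, failures):
    """Posterior Beta(a', b') after observing binomial data under a Beta(a, b) prior."""
    return a + successes, b + failures

def beta_mean(a, b):
    return a / (a + b)

def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

a, b = 1, 1                                # flat prior: no empirical prior information
batches = [(7, 3), (12, 8), (30, 20)]      # hypothetical (successes, failures) per look

for successes, failures in batches:
    a, b = update_beta(a, b, successes, failures)   # posterior becomes the next prior
    print(f"Beta({a}, {b}): mean={beta_mean(a, b):.3f}, variance={beta_variance(a, b):.5f}")
```

The variance shrinks with each batch, which is the “more sharply peaked” behavior described above, and updating batch by batch gives the same posterior as updating once with all the data.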


Bayesian Sequential Analysis

• Given no fixed number of observations, and the observations come in sequence (until we decide to stop)…
• Non-Bayesians: the sequential probability ratio test is comparable to the log likelihood ratio and is used to decide on outcome 1, outcome 2, or to keep collecting observations (assigning threshold values to the log ratio functions)
• Bayesians: use the sequential Bayes risk by assigning a cost (of “false alarms” and “misses”) proportional to the number of observations prior to stopping; the goal is to minimize expected cost using a strategy of optimal stopping
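The non-Bayesian sequential probability ratio test can be sketched for yes/no outcomes. This is an illustration only, assuming Bernoulli observations, two simple hypotheses (success rates `p0` and `p1`), and Wald's approximate thresholds; none of these specifics come from the slides.

```python
import math

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.20):
    """Sequential probability ratio test of Ho: p = p0 against Ha: p = p1.

    Returns (decision, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)   # at or above: stop and accept Ha
    lower = math.log(beta / (1 - alpha))   # at or below: stop and accept Ho
    llr = 0.0                              # running log likelihood ratio
    for i, x in enumerate(observations, start=1):
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept Ha", i
        if llr <= lower:
            return "accept Ho", i
    return "keep collecting", len(observations)

# Hypothetical stream of outcomes (1 = response, 0 = no response)
print(sprt_bernoulli([1, 1, 0, 1, 1, 1, 1, 1, 1], p0=0.5, p1=0.8))
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of observations is not fixed in advance.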


INSIGHTS FROM THE EXPERTS (BAYESIANS AND FREQUENTISTS)


Steve Goodman (Hopkins)

• Medical Inference is inductive
– Deductive (disease → signs/symptoms)… traditional statistical methods
– Inductive (signs/symptoms → disease)… Bayesian approaches more appropriate
• Bayes Theorem:
– prior odds x Bayes factor = posterior odds
– Pretest odds x likelihood factor = posttest odds
• P-Value = P(X being at least as extreme as the observed result, assuming the null hypothesis to be true)
– Does not represent the probability of observed data being true
– Does not represent the probability of observed data being by chance
– Does not represent the probability of the truth of the null hypothesis
• If P(data|hypothesis) = p, then likelihood of (hypothesis|data) = c*p, where c is an arbitrary constant
– P(Ho|data) / P(Ha|data) = [g / (1-g)] * [P(data|Ho) / P(data|Ha)], where g is the prior probability of Ho
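The odds form of Bayes' theorem reduces to one multiplication. In the sketch below, `prior_prob` plays the role of g and `bayes_factor` is P(data|Ho) / P(data|Ha); the function names and the example numbers are made up.

```python
def posterior_odds(prior_prob, bayes_factor):
    """Posterior odds of Ho = prior odds [g / (1 - g)] x Bayes factor."""
    prior_odds = prior_prob / (1 - prior_prob)
    return prior_odds * bayes_factor

def odds_to_probability(odds):
    return odds / (1 + odds)

# Hypothetical case: Ho given a 50% prior, data 10 times more likely under Ha
odds = posterior_odds(prior_prob=0.5, bayes_factor=1 / 10)
print("posterior odds of Ho:", odds)
print("posterior probability of Ho:", round(odds_to_probability(odds), 3))
```

Unlike a P-value, the result is a statement about the hypothesis itself, and it depends explicitly on the prior g.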


Steve Goodman (Hopkins)

• P-Value:
– Noncomparative
– Observed + hypothetical data
– Implicit Ha
– Evidence can only be negative
– Sensitive to stopping rules
– No formal interpretation
• Bayes Factor:
– Comparative
– Only observed data
– Pre-defined explicit Ha
– Positive or negative evidence
– Insensitive to stopping rules
– Formal interpretation

P-Value asks you to look at the data only, then make inferences later

Bayesian methods ask you to pose the question first and then look at existing data that is evidence for the question


Tom Louis (Hopkins)

• Bayesian Inference:
– Specify the multi-level structure of prior probability distributions
– Compute the joint posterior distribution for all unknowns
– Compute the posterior distribution of quantities by integrating known conditions
– Use the joint distribution to make inferences
• Bayesian Advantages:
– Precision increases with more available information
– Repeated sampling gives information on the prior
– More flexible when looking at partially related Gaussian distributions
– Allows inclusion and structuring of historical data (allows a compromise between ignoring historical data (no weight) and data-pooling (full weight))
• Captures relevant uncertainties
• Structures complicated inferences
• Adds flexibility in designs
• Documents assumptions


Don Berry (M.D. Anderson)

• Approaches to drug/device development:
– Fully Bayes likelihood principle (for company decision-making)
– Bayesian tools for expanding the frequentist envelope (for designing and analyzing registration studies)

• Bayesian advantages:
– Sequential learning is useful in study design
– Predictive distributions (frequentists cannot emulate this)
– Borrowing strength from historical data, concomitant trials, or from across patient and disease groups
– Early data allows Adaptive Randomization
  • Ethical advantage: stop clearly harmful or ineffective drugs/devices early in the trial
  • Find “nuggets” quickly and with higher probability
  • Learn quickly, treat patients in trial more effectively, save resources
– May save resources (base development on early decision-analysis)
– May test multiple experimental drugs (e.g. cancer drug cocktails)
– Seamless transitions through clinical trial phases (e.g. do not stop accrual)
  • Increase statistical power with much smaller sample populations
  • Relates response and survival rates as well
– Early decisions on treatment…and on ending a trial…
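
Adaptive randomization can be sketched in a few lines. The version below uses Thompson-sampling-style allocation, which is one common form of response-adaptive randomization and not necessarily the scheme described in the talk. It assumes a binary response observed immediately, a flat Beta(1, 1) prior per arm, and hypothetical true response rates.

```python
import random


def adaptive_trial(true_rates, n_patients, seed=0):
    """Simulate response-adaptive randomization across several arms.

    Each arm starts with a flat Beta(1, 1) prior. For every new patient we
    draw one value from each arm's current posterior and assign the patient
    to the arm with the largest draw (Thompson-sampling-style allocation).
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_patients):
        draws = [rng.betavariate(1 + s, 1 + f)
                 for s, f in zip(successes, failures)]
        arm = draws.index(max(draws))
        # Observe the (simulated) response and update that arm's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical true response rates: control 30%, experimental 60%.
succ, fail = adaptive_trial([0.3, 0.6], n_patients=200, seed=1)
for arm, (s, f) in enumerate(zip(succ, fail)):
    print(f"arm {arm}: {s + f} patients, {s} responders")
```

Because allocation follows the accumulating evidence, arms that look better tend to receive more of the later patients, which is the ethical argument made on the slide.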

Bob Temple (CDER)

• FDA is “nervous” and “inexperienced” with regard to Bayesian analysis (perhaps with the exception of CDRH)

• Strategy: should show both frequentist and Bayesian results (and show the difference)

• Pitfalls: Bayesian approaches can sometimes be longer and more expensive for the company

• Bottom line: Bayesian approaches are still new and need to be better understood by investigators and regulators

Larry Kessler (CDRH)

• Bayesians at CDRH: Greg Campbell, Don Malec, Gene Pennello, Telba Irony
– White Paper (1997): http://ftp.isds.duke.edu/WorkingPapers/97-21.ps

• Applications to devices:
– Devices tend to have a great deal of prior information (mechanism of action is physical and local, as opposed to pharmacokinetic and systemic)
– Devices usually evolve in small steps
– Studies “gain strength” by using quantitative prior information
– Prediction models available for surrogate variables
– Sensitivity analysis available for missing data
– Adaptive trial designs often useful for decision theoretics, non-inferiority trials, and post-market surveillance
– Helps determine sample size and interim-look strategies

• Risks and Challenges:
– Often a trade-off between “clinical burden” and “computational burden”
– Can be more expensive (e.g. if the prior information is useless or NOT predictive)
– Beware of the “regression to the mean” effect
– Hierarchical structure is not good if too little (single prior study) or too much prior info

Larry Kessler (CDRH)

• Considerations:
– Restrict to quantitative prior information
  • Need legal permission because companies tend to “own” prior studies and data
  • Published literature and SSEs often lack patient-level data
– FDA/companies need to reach agreement on the validity of any prior info
– Need new decision rules for the clinical study process
  • Frequentist: statistically significant result for primary endpoint effectiveness
  • Bayesian: posterior probability exceeding some predetermined value (or some interval within which it behaves consistently)
– Bayesian trials must be prospectively designed (no switching mid-stream)
– Control group cannot be used as a source of prior info for the new device
– Need new formats for Labeling and for the Summary of Safety and Effectiveness
– Simulations are important (show that “Type I error” is well-controlled)
– FDA review team plays role in choice of decision rules for success and for the exchangeability of prior studies in a hierarchical model

• Recommendations:
– Prospectively planned, with legally available and valid prior information
– Good communications with the FDA, with a good statistician, and proper electronic data
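
The Bayesian decision rule and the supporting simulation can be sketched together. The example assumes a single-arm study with a binary endpoint, a flat Beta(1, 1) prior, a fixed performance goal `p0`, and success declared when the posterior probability of beating the goal exceeds a predetermined `threshold`. The design values and function names are hypothetical; a real submission would simulate the actual design, prior, and interim looks.

```python
import math
import random


def prob_rate_exceeds(successes, n, p0):
    """P(rate > p0 | data) under a flat Beta(1, 1) prior.

    The posterior is Beta(successes + 1, n - successes + 1). For integer
    parameters its tail area equals a binomial sum, so no special library
    is needed: P(rate > p0) = P(Binomial(n + 1, p0) <= successes).
    """
    return sum(math.comb(n + 1, j) * p0 ** j * (1 - p0) ** (n + 1 - j)
               for j in range(successes + 1))


def simulated_type_i_error(n, p0, threshold, n_trials=20000, seed=0):
    """Fraction of simulated null trials (true rate == p0) declared a success."""
    rng = random.Random(seed)
    # The decision depends only on the success count, so precompute it.
    wins = [prob_rate_exceeds(s, n, p0) > threshold for s in range(n + 1)]
    false_positives = 0
    for _ in range(n_trials):
        s = sum(rng.random() < p0 for _ in range(n))
        false_positives += wins[s]
    return false_positives / n_trials


# Hypothetical design: 50 patients, performance goal 50%, success if the
# posterior probability of beating the goal exceeds 0.975.
print(simulated_type_i_error(n=50, p0=0.5, threshold=0.975))
```

The printed fraction is the simulated frequentist false-positive rate of the Bayesian rule. Tightening or loosening `threshold` is the lever a sponsor and the review team would negotiate.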

Ralph D’Agostino (Boston Univ) (Advisory Committee Member)

• Randomized Controlled Trials: need to keep simple
– Challenge is that Bayesian methods can sometimes seem complex
– Promise is that Bayesian methods can be made more intuitive

• Should NOT use Bayesian methods to salvage studies that have failed frequentist approaches

• Sometimes Bayesians are too optimistic about their ability to see validity across studies with different populations, different endpoints, and different analytical methods

Bob O’Neill (CDER)

• Too many people misinterpret the p-value

• We rely on statistical significance with little regard for effect size or magnitude

• The FDA needs to develop more format and content guides about reporting Bayesian statistics

• Dealing with missing data is essentially a Bayesian exercise (i.e. model-building)

• Bayesian statistics cut both ways (may require more time, expenses, and data to reach required evidence)
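
The gap between statistical significance and effect size is easy to show numerically. The sketch below assumes a two-arm comparison of means with equal, known standard deviation (a z-test, chosen to keep the arithmetic transparent). The blood-pressure numbers are hypothetical.

```python
import math


def two_sample_z(mean_diff, sd, n_per_arm):
    """Two-sided p-value and standardized effect size for a difference in means.

    Assumes the same known standard deviation in both arms (z-test).
    """
    se = sd * math.sqrt(2.0 / n_per_arm)
    z = mean_diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    effect_size = mean_diff / sd  # standardized difference (Cohen's d)
    return p_value, effect_size


# Hypothetical: the same 1 mmHg difference (sd 15 mmHg) at two sample sizes.
for n in (100, 10000):
    p, d = two_sample_z(mean_diff=1.0, sd=15.0, n_per_arm=n)
    print(f"n per arm={n:>6}  p={p:.4g}  effect size d={d:.3f}")
```

The effect size is identical in both rows; only the sample size changes. A clinically trivial difference becomes "significant" once the trial is large enough, which is why the p-value alone says little about magnitude.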

Stacy Lindborg (Global Statistics) and Greg Campbell (CDRH)

• SL: Need validated computer software for Bayesian statistics and need a great deal of education to help regulators and clinicians understand the meaning of “predictive posterior probabilities” and to trust in Bayesian statistics

• SL: Great promise with regard to:
– Looking at data more comprehensively
– Conducting trials more ethically

• GC: Bayesian designs need to be done prospectively
– CANNOT switch to Bayesian analysis to rescue/salvage studies that are not going well

• GC: Bayesian methods have the potential to shorten study duration, cut costs (by reducing number of patients), and enhance product development

• GC: Between 1999 and 2003, there have been 14 original PMAs & Supplements in which Bayesian estimation was the primary analysis; many more are in the works
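
One way to make "predictive posterior probabilities" concrete is the predictive probability of trial success at an interim look. The sketch assumes a single-arm binary endpoint, a flat Beta(1, 1) prior, and a hypothetical success criterion stated as a minimum final responder count. The remaining patients then follow a beta-binomial distribution.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, m, a, b):
    """P(k responders among m future patients) when the rate is Beta(a, b)."""
    return math.comb(m, k) * math.exp(log_beta(a + k, b + m - k) - log_beta(a, b))


def predictive_prob_of_success(s_now, n_now, n_total, s_needed):
    """Chance that the final responder count reaches s_needed.

    Uses a flat Beta(1, 1) prior updated with the interim data, then
    averages over every possible outcome of the remaining patients.
    """
    a, b = 1 + s_now, 1 + (n_now - s_now)
    m = n_total - n_now
    return sum(beta_binomial_pmf(k, m, a, b)
               for k in range(m + 1) if s_now + k >= s_needed)


# Hypothetical interim look: 12 of 20 responders so far, 50 patients planned,
# and the trial "wins" with at least 32 responders overall.
print(round(predictive_prob_of_success(12, 20, 50, 32), 3))
```

A low value at an interim look is the kind of evidence that could support stopping early for futility, while a high value could support continuing or stopping for success. Any such rule would need to be fixed prospectively.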

Don Rubin (Harvard) and Jay Siegel (Centocor)

• DR: Bayesian thinking is our natural way to look at the world

• DR: Frequentist approaches need to work with Bayesian thinking (they are still just rules)

• DR: Validation is needed to ensure that both the model and the analysis are appropriate

• JS: Bayesian approaches (which rely on Predictive Value) and Frequentist approaches (which rely on Specificity) will converge to the extent that prior probabilities are similar
– e.g. in adult use drugs/devices now applied to pediatric use
– e.g. the same class of drug being applied to similar therapeutic uses

• JS: Concerns about movement toward Bayesian approaches
– Shifts incentives toward non-innovative (more valid priors for existing therapies)
– Priors constantly change during a trial (need predictable, prospective standards)
– Legal concerns about using competitors’ data
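
The predictive value versus specificity contrast can be read through the diagnostic-test analogy: treat a positive trial like a positive test, with power playing the role of sensitivity and 1 − α the role of specificity. This is one interpretation of the point, not a formula from the talk, and the power, α, and prior values below are hypothetical.

```python
def positive_predictive_value(prior, sensitivity, specificity):
    """Bayes' theorem: P(treatment truly works | trial is positive)."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical trial: 80% power (sensitivity) and one-sided alpha = 0.025
# (specificity = 0.975). Only the prior probability changes between rows.
for prior in (0.1, 0.5, 0.9):
    ppv = positive_predictive_value(prior, sensitivity=0.80, specificity=0.975)
    print(f"prior={prior:.1f}  predictive value={ppv:.3f}")
```

Specificity is fixed by design and ignores the prior, whereas the predictive value moves with it. Two analysts who hold similar priors will therefore draw similar conclusions from the same positive trial, which is the convergence described above.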

Susan Ellenberg (OBE, CBER) and Norris Alderson (FDA)

• SE: If Bayesian approaches are really a better mousetrap, they will spread and people will “beg” for them

• NA: “Bayesian is NOT a religion”

• NA: Incorporating a priori knowledge is useful, but we need frequentist checks at times (reality checks)

• NA: Clear guidelines on methods, formats, content, analysis, etc. are needed; FDA regulators will need to work with statisticians, clinicians, and industry to accomplish this

• NA: Bayesian approaches still must deal with the common sources of bias found in frequentist approaches

TAKE-AWAYS

Statistical Terms and Concepts

• Sources of Data
• Statistical Inference
• Frequentist Hypothesis Testing
– Null and Alternative Hypotheses
– Test Statistics and Sampling Distribution
– Type I and Type II Errors; Power
– P-Value and Significance Level (α)
• Confidence Intervals
• Bayesian Statistics
– Prior probability distribution
– Posterior (or Joint) probability distribution
– Bayes Factor (or Likelihood Ratio)
– Adaptive Randomization

Strategic FDA Insights

• FDA (especially CDRH) favorable to Bayesian approaches

• Not effective in rescuing/salvaging troubled studies; must do prospectively

• May lead to quicker, less expensive approvals (but may be longer, more expensive as well)

• Useful in predictive models, sensitivity analysis for missing data, adaptive trial designs, and for looking at data more comprehensively (and perhaps ethically)

• Need to use valid quantitative prior information (work with owners of data and with the FDA)

• New decision rules, content, format, method, analysis, and reporting guidelines are needed (as well as new labeling and SSE)

• A good statistician with both Bayesian and Frequentist credentials is perhaps our best advocate; many Bayesians already have good relationships with the FDA

Final Thoughts

• Clinical versus Statistical Significance

• Why p-values of 0.05?

• Importance of the research question

• Bayesian is not a religion, although some Bayesians seem to see it that way

• The promise of new statistical approaches

• Our need to understand (at least at a basic level) the statistical work we do for our clients

Corporate Resources

• Carlos Alzola, MS

• Aldo Crossa, MS

• Campbell Tuskey, MSPH

• Reine Lea Speed, MPH

• Ryung Suh, MD

• Expert Associates: Simon, D’Agostino, Rubin, HCRI, Hopkins

• Firm Library and Statistical Literature

References

• “Bayesian Approaches,” U.S. Food and Drug Administration. Meeting at Masur Auditorium, National Institutes of Health, May 20-21, 2004.

• Morton, Richard F., J. Richard Hebel, and Robert J. McCarter. A Study Guide to Epidemiology and Biostatistics. 3rd ed. 1990.

• Permutt, Thomas. “Three Nonproblems in the Frequentist Approach to Clinical Trials,” U.S. Food and Drug Administration.

• Stockburger, David W. Introductory Statistics: Concepts, Models, and Applications. http://www.psychstat.smsu.edu/introbook/sbk19m.htm

• Thornburg, Harvey. “Introduction to Bayesian Statistics,” CCRMA. Stanford University, Spring 2000-2001.

• Sampling Distribution Demonstration. http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/