Analysis of Variance (ANOVA)
Brian Healy, PhD
BIO203
Types of analysis - independent samples

Outcome         Explanatory    Analysis
Continuous      Dichotomous    t-test, Wilcoxon test
Continuous      Categorical    ANOVA, linear regression
Continuous      Continuous     Correlation, linear regression
Dichotomous     Dichotomous    Chi-square test, logistic regression
Dichotomous     Continuous     Logistic regression
Time to event   Dichotomous    Log-rank test
Example

A recent study compared the hypointensity of gray matter structures on MRI in normal controls, benign MS patients, and secondary progressive MS patients.

Increased hypointensity is a marker of disease.

Question: Is there any difference among these groups?
The null hypothesis is that all of the groups have the same hypointensity on average
– Categorical predictor
– Continuous outcome

You could compare each of the groups to each of the other groups, which would be 3 pairwise comparisons at the 0.05 level, but what happens to the overall alpha level?
What is \alpha?
– \alpha = P(reject H_0 | H_0 is true), so in this case \alpha = P(one difference | all are equal)
– Also, P(fail to reject H_0 | H_0 is true) = 1 - \alpha

Overall \alpha level

Now, if we completed each of the 3 pairwise tests at the 0.05 level and all of the tests were independent,

  P(fail to reject all 3 hypotheses | H_0 is true) = (1 - 0.05)^3 = 0.857

Therefore,

  P(reject at least 1 | H_0 is true) = 1 - 0.857 = 0.143 = type I error

The type I error is greater than 0.05, and this gets worse as the number of comparisons increases.

What can we do?
– ANOVA
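The inflation described above is easy to check numerically. This is a minimal sketch assuming, as on the slide, that the 3 pairwise tests are independent:

```python
# Family-wise type I error when running several independent tests,
# each at the 0.05 level.
alpha = 0.05
n_comparisons = 3

# P(fail to reject all 3 hypotheses | H0 is true)
p_no_rejection = (1 - alpha) ** n_comparisons

# P(reject at least 1 | H0 is true) = overall type I error
family_wise_error = 1 - p_no_rejection

print(round(p_no_rejection, 3))     # 0.857
print(round(family_wise_error, 3))  # 0.143
```

Rerunning with `n_comparisons = 10` gives a family-wise error of about 0.40, which illustrates how quickly the problem grows.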
Analysis of variance (ANOVA)

The null hypothesis is H_0: \mu_1 = \mu_2 = \cdots = \mu_n

We are testing if the mean is equal across groups.

The alternative hypothesis is that at least one of the means is different (but we will not be able to determine which one using this test).

The name tells us that we are going to be using the variance, but the goal is to use the variance to compare the means (this is a common source of confusion).
How does this work?

As with the t-test, we have a continuous outcome, but now we have multiple groups, which is a categorical variable.

Before we begin, we must consider the assumptions required to use ANOVA:
– The underlying distributions of the populations are normal
– The variance of each group is equal (this is critical for ANOVA), i.e. homoskedastic

These are similar to the assumptions of the two sample t-test.
Picture

If all of the groups had the same means, the distributions for all of the populations would look exactly the same (overlaid graphs).

Picture II

Now, if the means of the populations were different, the picture would look like this. Notice that the variability between the groups is much greater than within a group.
Sources of variance

When we take samples from each group, there will be two sources of variability:
– Within group variability - when we sample from a group there will be variability from person to person in the same group
– Between group variability - the difference from group to group

If the between group variability is large, the means of the groups are likely not the same.

We can use the two types of variability to determine if the means are likely different.

How can we do this? Look again at the picture (blue arrow: within group, red arrow: between group).

Notice that when the distributions are separate, the between group variability is much greater than the within group variability.
Notation

First we will define:

  x_{ij} = observation from student i in group j

  \bar{x}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ij} = mean of group j

  \bar{x} = \frac{1}{n}\sum_{j}\sum_{i=1}^{n_j} x_{ij} = grand mean over all of the groups

How could we express the different forms of variability?
Sources of variability

The distance of each observation from the grand mean can be broken into two pieces:

  x_{ij} - \bar{x} = (x_{ij} - \bar{x}_j) + (\bar{x}_j - \bar{x})

where (x_{ij} - \bar{x}_j) is the within group variability and (\bar{x}_j - \bar{x}) is the between group variability.

Like the calculation of the variance, we are interested in the square of the deviation.

What does the squared deviation look like?
The final squared deviation simplifies to

  \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x})^2 = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 + \sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2

where the three terms are, in order:
– Total sum of squares (SS_T)
– Within group sum of squares (SS_W)
– Between group sum of squares (SS_B)

As we discussed earlier, we are going to compare the two errors to determine if the group means are equal.
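The identity SS_T = SS_W + SS_B can be verified numerically. This is a small sketch using made-up data for three groups (illustrative values only, not the study data):

```python
# Verify the sum-of-squares decomposition SS_T = SS_W + SS_B
# on a toy three-group data set.
groups = [
    [4.1, 3.9, 4.3],   # group 1
    [5.0, 5.2, 4.8],   # group 2
    [6.1, 5.9, 6.0],   # group 3
]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Total: deviations of every observation from the grand mean
ss_t = sum((x - grand_mean) ** 2 for x in all_obs)
# Within: deviations of each observation from its own group mean
ss_w = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
# Between: group-mean deviations from the grand mean, weighted by group size
ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

assert abs(ss_t - (ss_w + ss_b)) < 1e-9  # the decomposition holds exactly
```

Here the group means are well separated, so SS_B dominates SS_W, which is exactly the pattern the F-test looks for.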
The within group variability can be written in terms of the individual group standard deviations, s_j:

  SS_W = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{3}(n_j - 1)s_j^2

The result is called the within group mean square error, which is the combined estimate of the within group variance:

  MS_W = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{(n_1 - 1) + (n_2 - 1) + (n_3 - 1)}

Note the denominator is the total sample size minus the number of groups.
The between group variability can be broken into pieces from the summary statistics as well:

  SS_B = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(\bar{x}_j - \bar{x})^2 = \sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2

The between group mean square error can be written as:

  MS_B = \frac{\sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2}{3 - 1}

The denominator of the MS_B is the number of groups minus 1 because we are considering the group means as the observations and the grand mean as the mean.
F-statistic

Now that we have estimates of the between group and within group variation, we can use an F-statistic:

  F_{k-1,\,n-k} = \frac{MS_B}{MS_W} = \frac{SS_B/(k-1)}{SS_W/(n-k)}

where k is the number of groups and n is the total sample size.

This test statistic is compared to an F-distribution with k-1 and n-k degrees of freedom.
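Once the F-statistic and the degrees of freedom are in hand, the p-value is the upper tail of the F-distribution. A minimal sketch (the function name `anova_p_value` is mine, not from the slides):

```python
from scipy.stats import f

def anova_p_value(F, k, n):
    """Upper-tail probability of an F_{k-1, n-k} distribution
    beyond the observed statistic."""
    return f.sf(F, k - 1, n - k)

# With 3 groups and 85 total subjects, as in the example that follows:
p = anova_p_value(5.42, 3, 85)
print(round(p, 4))  # ≈ 0.0062
```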
ANOVA table

To complete the analysis, we need to calculate the SS's, MS's, and the F-statistic.

A specific display of this data is often used, called the ANOVA table.

Standard software may provide results in this form.

Source of variation   SS     df    MS     F           p-value
Between               SS_B   k-1   MS_B   MS_B/MS_W
Within                SS_W   n-k   MS_W
Total                 SS_T
Example

Let's perform an ANOVA test for the hypointensity.

Here are the summary statistics:

                     Healthy   BMS     SPMS
Mean                 0.404     0.389   0.391
Standard deviation   0.022     0.017   0.014
Sample size          24        35      26
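The formulas above can be applied directly to these summary statistics. A sketch; note that because the tabulated means and SDs are rounded, F comes out near 5.56 rather than the 5.42 reported on the next slide, which was computed from the raw data:

```python
from scipy.stats import f

means = [0.404, 0.389, 0.391]   # Healthy, BMS, SPMS
sds   = [0.022, 0.017, 0.014]
ns    = [24, 35, 26]

n = sum(ns)        # total sample size (85)
k = len(means)     # number of groups (3)
grand_mean = sum(ni * m for ni, m in zip(ns, means)) / n

# Between and within group sums of squares from the summary statistics
ss_b = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(ns, means))
ss_w = sum((ni - 1) * s ** 2 for ni, s in zip(ns, sds))

ms_b = ss_b / (k - 1)
ms_w = ss_w / (n - k)
F = ms_b / ms_w
p = f.sf(F, k - 1, n - k)

print(round(ss_b, 4), round(ss_w, 3), round(F, 2))  # 0.0035 0.026 5.56
```

The sums of squares match the ANOVA table on the later slide (SS_B = 0.0035, SS_W = 0.026).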
Hypothesis test

1) H_0: \mu_1 = \mu_2 = \mu_3
2) Continuous outcome/categorical predictor
3) ANOVA
4) Test statistic: F = 5.42
5) p-value = 0.0062
6) Since the p-value is less than 0.05, we can reject the null hypothesis
7) We conclude that the mean is different in at least one group
ANOVA table

Here is the ANOVA table for this data:

Source of variation   SS       df   MS        F      p-value
Between               0.0035   2    0.0017    5.42   0.0062
Within                0.026    82   0.00032
Total                 0.0295   84
Notes

Remember that the assumption of equal variance across groups is required.

We were able to conclude that one of the means is different, but we do not know which of the means is different. ANOVA is often considered a first step.

We can do pairwise comparisons to determine which specific means are different, but we must still take into account the problem with multiple comparisons.
Bonferroni correction

The simplest way to handle the multiple comparisons is to correct the alpha level so that the overall alpha level stays closer to the desired 0.05 level.

The Bonferroni correction takes the observed p-values and multiplies them by the number of comparisons.
– If we have 3 groups and we would like to complete all pairwise comparisons, we multiply the p-values by 3

In addition, we assume that the variance is equal in the pairwise t-tests.
Pairwise t-test

Here are the pairwise t-test results:

Group 1   Group 2   p-value   Adjusted p-value
HC        BMS       0.0022    0.0065
HC        SPMS      0.014     0.042
BMS       SPMS      0.62      1.0

We conclude that there is a significant difference between the healthy controls and both groups of MS patients, but no difference between the two groups of MS patients.
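The adjustment in the table is a one-liner: multiply each raw p-value by the number of comparisons and cap at 1. A sketch applied to the rounded p-values above (the first adjusted value comes out as 0.0066 rather than the table's 0.0065, which was computed from the unrounded p-value):

```python
# Bonferroni-adjusted p-values: raw p times number of comparisons, capped at 1
raw_p = [0.0022, 0.014, 0.62]   # HC-BMS, HC-SPMS, BMS-SPMS
n_comparisons = len(raw_p)

adjusted = [round(min(p * n_comparisons, 1.0), 4) for p in raw_p]
print(adjusted)  # [0.0066, 0.042, 1.0]
```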
More on Bonferroni correction

For three groups, we have three pairwise comparisons.

What if we were only interested in comparing each MS group to the healthy controls? How many comparisons would we need to correct for?
– Two comparisons
– Multiply each p-value by 2
Other corrections

Sidak's test
– per-comparison alpha = 1 - (1 - 0.05)^{1/C}, where C is the number of comparisons

All groups to a control
– Dunnett's test - available in SAS

MANY others
– False discovery rate
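The Sidak formula above gives the per-comparison alpha that keeps the overall type I error at exactly 0.05 when the comparisons are independent. A quick sketch for the three-comparison case:

```python
# Sidak-corrected per-comparison alpha for C comparisons
C = 3
sidak_alpha = 1 - (1 - 0.05) ** (1 / C)
print(round(sidak_alpha, 3))  # 0.017
```

Note this is slightly less strict than the Bonferroni threshold of 0.05/3 ≈ 0.0167, which is why Sidak is marginally more powerful.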
Conclusion

ANOVA compares more than 2 groups on a continuous outcome.
– If the difference between the groups is more than the difference within a group, the groups are likely not the same

Pairwise comparisons can be completed if there is a significant difference, but correction for multiple comparisons is required.