Analysis of Variance (ANOVA)
Brian Healy, PhD
BIO203
Types of analysis - independent samples

Outcome         Explanatory    Analysis
Continuous      Dichotomous    t-test, Wilcoxon test
Continuous      Categorical    ANOVA, linear regression
Continuous      Continuous     Correlation, linear regression
Dichotomous     Dichotomous    Chi-square test, logistic regression
Dichotomous     Continuous     Logistic regression
Time to event   Dichotomous    Log-rank test
Example

A recent study compared the hypointensity of gray matter structures on MRI in normal controls, benign MS patients, and secondary progressive MS patients.

Increased hypointensity is a marker of disease.

Question: Is there any difference among these groups?
The null hypothesis is that all of the groups have the same hypointensity on average
– Categorical predictor
– Continuous outcome

You could compare each of the groups to each of the other groups, which would be 3 pairwise comparisons at the 0.05 level, but what happens to the overall alpha level?
What is \alpha?
– \alpha = P(reject H_0 | H_0 is true), so in this case \alpha = P(one difference | all are equal)
– Also, P(fail to reject H_0 | H_0 is true) = 1 - \alpha

Overall \alpha level

Now, if we completed each of the 3 pairwise tests at the 0.05 level and all of the tests were independent,

  P(fail to reject all 3 hypotheses | H_0 is true) = (1 - 0.05)^3 = 0.857

Therefore,

  P(reject at least 1 | H_0 is true) = 1 - 0.857 = 0.143 = type I error

The type I error is greater than 0.05, and this gets worse as the number of comparisons increases.

What can we do?
– ANOVA
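The inflation described above is easy to check numerically. This is a minimal sketch assuming, as on the slide, that the 3 pairwise tests are independent:

```python
# Family-wise type I error when running several independent tests,
# each at the 0.05 level.
alpha = 0.05
n_comparisons = 3

# P(fail to reject all 3 hypotheses | H0 is true)
p_no_rejection = (1 - alpha) ** n_comparisons

# P(reject at least 1 | H0 is true) = overall type I error
family_wise_error = 1 - p_no_rejection

print(round(p_no_rejection, 3))     # 0.857
print(round(family_wise_error, 3))  # 0.143
```

Rerunning with `n_comparisons = 10` gives a family-wise error of about 0.40, which illustrates how quickly the problem grows.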
Analysis of variance (ANOVA)

The null hypothesis is H_0: \mu_1 = \mu_2 = \cdots = \mu_n

We are testing if the mean is equal across groups.

The alternative hypothesis is that at least one of the means is different (but we will not be able to determine which one using this test).

The name tells us that we are going to be using the variance, but the goal is to use the variance to compare the means (this is a common source of confusion).
How does this work?

As with the t-test, we have a continuous outcome, but now we have multiple groups, which is a categorical variable.

Before we begin, we must consider the assumptions required to use ANOVA:
– The underlying distributions of the populations are normal
– The variance of each group is equal (this is critical for ANOVA), i.e. homoskedastic

These are similar to the assumptions of the two sample t-test.
Picture

If all of the groups had the same means, the distributions for all of the populations would look exactly the same (overlaid graphs).

Picture II

Now, if the means of the populations were different, the picture would look like this. Notice that the variability between the groups is much greater than within a group.
Sources of variance

When we take samples from each group, there will be two sources of variability:
– Within group variability - when we sample from a group there will be variability from person to person in the same group
– Between group variability - the difference from group to group

If the between group variability is large, the means of the groups are likely not the same.

We can use the two types of variability to determine if the means are likely different.

How can we do this? Look again at the picture (blue arrow: within group, red arrow: between group).

Notice that when the distributions are separate, the between group variability is much greater than the within group variability.
Notation

First we will define:

  x_{ij} = observation from student i in group j

  \bar{x}_j = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{ij} = mean of group j

  \bar{x} = \frac{1}{n}\sum_{j}\sum_{i=1}^{n_j} x_{ij} = grand mean over all of the groups

How could we express the different forms of variability?
Sources of variability

The distance of each observation from the grand mean can be broken into two pieces:

  x_{ij} - \bar{x} = (x_{ij} - \bar{x}_j) + (\bar{x}_j - \bar{x})

where (x_{ij} - \bar{x}_j) is the within group variability and (\bar{x}_j - \bar{x}) is the between group variability.

Like the calculation of the variance, we are interested in the square of the deviation.

What does the squared deviation look like?
The final squared deviation simplifies to

  \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x})^2 = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 + \sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2

where the three terms are, in order:
– Total sum of squares (SS_T)
– Within group sum of squares (SS_W)
– Between group sum of squares (SS_B)

As we discussed earlier, we are going to compare the two errors to determine if the group means are equal.
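The identity SS_T = SS_W + SS_B can be verified numerically. This is a small sketch using made-up data for three groups (illustrative values only, not the study data):

```python
# Verify the sum-of-squares decomposition SS_T = SS_W + SS_B
# on a toy three-group data set.
groups = [
    [4.1, 3.9, 4.3],   # group 1
    [5.0, 5.2, 4.8],   # group 2
    [6.1, 5.9, 6.0],   # group 3
]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / len(g) for g in groups]

# Total: deviations of every observation from the grand mean
ss_t = sum((x - grand_mean) ** 2 for x in all_obs)
# Within: deviations of each observation from its own group mean
ss_w = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
# Between: group-mean deviations from the grand mean, weighted by group size
ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

assert abs(ss_t - (ss_w + ss_b)) < 1e-9  # the decomposition holds exactly
```

Here the group means are well separated, so SS_B dominates SS_W, which is exactly the pattern the F-test looks for.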
The within group variability can be written in terms of the individual group standard deviations, s_j:

  SS_W = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{3}(n_j - 1)s_j^2

The result is called the within group mean square error, which is the combined estimate of the within group variance:

  MS_W = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{(n_1 - 1) + (n_2 - 1) + (n_3 - 1)}

Note the denominator is the total sample size minus the number of groups.
The between group variability can be broken into pieces from the summary statistics as well:

  SS_B = \sum_{j=1}^{3}\sum_{i=1}^{n_j}(\bar{x}_j - \bar{x})^2 = \sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2

The between group mean square error can be written as:

  MS_B = \frac{\sum_{j=1}^{3} n_j(\bar{x}_j - \bar{x})^2}{3 - 1}

The denominator of the MS_B is the number of groups minus 1 because we are considering the group means as the observations and the grand mean as the mean.
F-statistic

Now that we have estimates of the between group and within group variation, we can use an F-statistic:

  F_{k-1,\,n-k} = \frac{MS_B}{MS_W} = \frac{SS_B/(k-1)}{SS_W/(n-k)}

where k is the number of groups and n is the total sample size.

This test statistic is compared to an F-distribution with k-1 and n-k degrees of freedom.
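Once the F-statistic and the degrees of freedom are in hand, the p-value is the upper tail of the F-distribution. A minimal sketch (the function name `anova_p_value` is mine, not from the slides):

```python
from scipy.stats import f

def anova_p_value(F, k, n):
    """Upper-tail probability of an F_{k-1, n-k} distribution
    beyond the observed statistic."""
    return f.sf(F, k - 1, n - k)

# With 3 groups and 85 total subjects, as in the example that follows:
p = anova_p_value(5.42, 3, 85)
print(round(p, 4))  # ≈ 0.0062
```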
ANOVA table

To complete the analysis, we need to calculate the SS's, MS's, and the F-statistic.

A specific display of this data is often used, called the ANOVA table.

Standard software may provide results in this form.

Source of variation   SS     df    MS     F           p-value
Between               SS_B   k-1   MS_B   MS_B/MS_W
Within                SS_W   n-k   MS_W
Total                 SS_T
Example

Let's perform an ANOVA test for the hypointensity.

Here are the summary statistics:

                     Healthy   BMS     SPMS
Mean                 0.404     0.389   0.391
Standard deviation   0.022     0.017   0.014
Sample size          24        35      26
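The formulas above can be applied directly to these summary statistics. A sketch; note that because the tabulated means and SDs are rounded, F comes out near 5.56 rather than the 5.42 reported on the next slide, which was computed from the raw data:

```python
from scipy.stats import f

means = [0.404, 0.389, 0.391]   # Healthy, BMS, SPMS
sds   = [0.022, 0.017, 0.014]
ns    = [24, 35, 26]

n = sum(ns)        # total sample size (85)
k = len(means)     # number of groups (3)
grand_mean = sum(ni * m for ni, m in zip(ns, means)) / n

# Between and within group sums of squares from the summary statistics
ss_b = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(ns, means))
ss_w = sum((ni - 1) * s ** 2 for ni, s in zip(ns, sds))

ms_b = ss_b / (k - 1)
ms_w = ss_w / (n - k)
F = ms_b / ms_w
p = f.sf(F, k - 1, n - k)

print(round(ss_b, 4), round(ss_w, 3), round(F, 2))  # 0.0035 0.026 5.56
```

The sums of squares match the ANOVA table on the later slide (SS_B = 0.0035, SS_W = 0.026).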
Hypothesis test

1) H_0: \mu_1 = \mu_2 = \mu_3
2) Continuous outcome/categorical predictor
3) ANOVA
4) Test statistic: F = 5.42
5) p-value = 0.0062
6) Since the p-value is less than 0.05, we can reject the null hypothesis
7) We conclude that the mean is different in at least one group
ANOVA table

Here is the ANOVA table for this data:

Source of variation   SS       df   MS        F      p-value
Between               0.0035   2    0.0017    5.42   0.0062
Within                0.026    82   0.00032
Total                 0.0295   84
Notes

Remember that the assumption of equal variance across groups is required.

We were able to conclude that one of the means is different, but we do not know which of the means is different. ANOVA is often considered a first step.

We can do pairwise comparisons to determine which specific means are different, but we must still take into account the problem with multiple comparisons.
Bonferroni correction

The simplest way to handle the multiple comparisons is to correct the alpha level so that the overall alpha level stays closer to the desired 0.05 level.

The Bonferroni correction takes the observed p-values and multiplies them by the number of comparisons.
– If we have 3 groups and we would like to complete all pairwise comparisons, we multiply the p-values by 3

In addition, we assume that the variance is equal in the pairwise t-tests.
Pairwise t-test

Here are the pairwise t-test results:

Group 1   Group 2   p-value   Adjusted p-value
HC        BMS       0.0022    0.0065
HC        SPMS      0.014     0.042
BMS       SPMS      0.62      1.0

We conclude that there is a significant difference between the healthy controls and both groups of MS patients, but no difference between the two groups of MS patients.
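The adjustment in the table is a one-liner: multiply each raw p-value by the number of comparisons and cap at 1. A sketch applied to the rounded p-values above (the first adjusted value comes out as 0.0066 rather than the table's 0.0065, which was computed from the unrounded p-value):

```python
# Bonferroni-adjusted p-values: raw p times number of comparisons, capped at 1
raw_p = [0.0022, 0.014, 0.62]   # HC-BMS, HC-SPMS, BMS-SPMS
n_comparisons = len(raw_p)

adjusted = [round(min(p * n_comparisons, 1.0), 4) for p in raw_p]
print(adjusted)  # [0.0066, 0.042, 1.0]
```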
More on Bonferroni correction

For three groups, we have three pairwise comparisons.

What if we were only interested in comparing each MS group to the healthy controls? How many comparisons would we need to correct for?
– Two comparisons
– Multiply each p-value by 2
Other corrections

Sidak's test
– per-comparison alpha = 1 - (1 - 0.05)^{1/C}, where C is the number of comparisons

All groups to a control
– Dunnett's test - available in SAS

MANY others
– False discovery rate
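The Sidak formula above gives the per-comparison alpha that keeps the overall type I error at exactly 0.05 when the comparisons are independent. A quick sketch for the three-comparison case:

```python
# Sidak-corrected per-comparison alpha for C comparisons
C = 3
sidak_alpha = 1 - (1 - 0.05) ** (1 / C)
print(round(sidak_alpha, 3))  # 0.017
```

Note this is slightly less strict than the Bonferroni threshold of 0.05/3 ≈ 0.0167, which is why Sidak is marginally more powerful.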
Conclusion

ANOVA compares more than 2 groups on a continuous outcome.
– If the difference between the groups is more than the difference within a group, the groups are likely not the same

Pairwise comparisons can be completed if there is a significant difference, but correction for multiple comparisons is required.