Post on 28-Dec-2015
Outline
Purpose and logic
Hypothesis testing
Computations
F-ratios
Assumptions and noncentrality
MANOVA
When?
When a research design contains two or more dependent variables, we can perform either multiple univariate tests or one multivariate test.
Why?
MANOVA does not suffer from an inflated overall Type I error rate ($\alpha$).
Univariate tests ignore the correlations among the dependent variables.
Multivariate tests are often more powerful than multiple univariate tests.
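As a rough illustration of the inflation, assume (purely for the sake of the example) that $k$ univariate tests are independent and that each is run at level $\alpha$. The overall Type I error rate is then:

```latex
\alpha_{\text{overall}} = 1 - (1 - \alpha)^{k},
\qquad \text{e.g. } k = 2,\ \alpha = .05: \quad 1 - (0.95)^{2} = 0.0975
```

With correlated dependent variables the independence assumption does not hold exactly, so this figure is only indicative.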
Assumptions
Multivariate normality
Absence of outliers
Homogeneity of variance-covariance matrices
Linearity
Absence of multicollinearity
MANOVA
If the independent variables are discrete and the dependent variables are continuous, we perform a MANOVA.
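From MANOVA
$$y_{ijhr} = \mu_h + \alpha_{ih} + \beta_{jh} + (\alpha\beta)_{ijh} + e_{ijhr}$$
where $\mu$ = grand mean, $\alpha$ = treatment effect 1, $\beta$ = treatment effect 2, $\alpha\beta$ = interaction, $e$ = error
To GLM
$$Y = XB + E$$
where
$$B = \begin{bmatrix}
\mu_1, \alpha_{11}, \alpha_{21}, \ldots, \alpha_{(r-1)1}, \beta_{11}, \beta_{21}, \ldots, \beta_{(c-1)1}, (\alpha\beta)_{111}, (\alpha\beta)_{121}, \ldots, (\alpha\beta)_{(r-1)(c-1)1} \\
\mu_2, \alpha_{12}, \alpha_{22}, \ldots, \alpha_{(r-1)2}, \beta_{12}, \beta_{22}, \ldots, \beta_{(c-1)2}, (\alpha\beta)_{112}, (\alpha\beta)_{122}, \ldots, (\alpha\beta)_{(r-1)(c-1)2} \\
\vdots \\
\mu_q, \alpha_{1q}, \alpha_{2q}, \ldots, \alpha_{(r-1)q}, \beta_{1q}, \beta_{2q}, \ldots, \beta_{(c-1)q}, (\alpha\beta)_{11q}, (\alpha\beta)_{12q}, \ldots, (\alpha\beta)_{(r-1)(c-1)q}
\end{bmatrix}^{T}$$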
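MANOVA
Example (each pair is one subject's scores on the two dependent variables)

          Drug A                            Drug B                            Drug C
Male      (5, 6) (5, 4) (9, 9) (7, 6)       (7, 6) (7, 7) (9, 12) (6, 8)      (21, 15) (14, 11) (17, 12) (12, 10)
Female    (7, 10) (6, 6) (9, 7) (8, 10)     (10, 13) (8, 7) (7, 6) (6, 9)     (16, 12) (14, 9) (14, 8) (10, 5)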
The general idea behind MANOVA is the same as in ANOVA: we want the ratio of explained variability to unexplained variability (error).
$\alpha$ = treatment effect 1 (rows; r = 2)
$\beta$ = treatment effect 2 (columns; c = 3)
$n_i = 4$
$N = r \cdot c \cdot n_i = 24$
q = number of DVs = 2 (WeightLoss, Time)
Analysis of Variance (ANOVA)
Hypotheses
Are the drug mean vectors equal?
Are the sex mean vectors equal?
Do some drugs interact with sex to produce inordinately high or low weight decrements?
Analysis of Variance (ANOVA)
Using the GLM approach through a coding matrix (one row per sex × drug cell):

Sex  Drug     x1   x2   x3   x4   x5
1    1         1    1    0    1    0
1    2         1    0    1    0    1
1    3         1   -1   -1   -1   -1
2    1        -1    1    0   -1    0
2    2        -1    0    1    0   -1
2    3        -1   -1   -1    1    1
Analysis of Variance (ANOVA)
Then, for each subject, we associate the corresponding group coding.

M =
                      X                   Y
            x1   x2   x3   x4   x5     y1   y2
s1           1    1    0    1    0      5    6
s2           1    1    0    1    0      5    4
s3           1    1    0    1    0      9    9
s4           1    1    0    1    0      7    6
s5           1    0    1    0    1      7    6
s6           1    0    1    0    1      7    7
s7           1    0    1    0    1      9   12
s8           1    0    1    0    1      6    8
s9           1   -1   -1   -1   -1     21   15
s10          1   -1   -1   -1   -1     14   11
s11          1   -1   -1   -1   -1     17   12
s12          1   -1   -1   -1   -1     12   10
s13         -1    1    0   -1    0      7   10
s14         -1    1    0   -1    0      6    6
s15         -1    1    0   -1    0      9    7
s16         -1    1    0   -1    0      8   10
s17         -1    0    1    0   -1     10   13
s18         -1    0    1    0   -1      8    7
s19         -1    0    1    0   -1      7    6
s20         -1    0    1    0   -1      6    9
s21         -1   -1   -1    1    1     16   12
s22         -1   -1   -1    1    1     14    9
s23         -1   -1   -1    1    1     14    8
s24         -1   -1   -1    1    1     10    5

$$M = [x_1 : x_2 : \ldots : x_p : y_1 : y_2 : \ldots : y_q]$$
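The predictor part of M can be generated rather than typed by hand. The following is a minimal sketch, assuming effect (deviation) coding in which the last level of each factor is coded -1 and the interaction columns are products of the main-effect codes, which is what the codes listed for s1 to s24 suggest. The helper name `design_row` and the dictionaries are illustrative, not part of the slides.

```python
import numpy as np

# Effect (deviation) codes; the last level of each factor is coded -1.
sex_codes = {"Male": [1], "Female": [-1]}
drug_codes = {"A": [1, 0], "B": [0, 1], "C": [-1, -1]}

def design_row(sex, drug):
    """Return [x1, x2, x3, x4, x5] for one subject (hypothetical helper)."""
    a = sex_codes[sex]                          # x1
    b = drug_codes[drug]                        # x2, x3
    ab = [ai * bj for ai in a for bj in b]      # x4, x5 = products of the codes
    return a + b + ab

n_per_cell = 4
X = np.array([design_row(s, d)
              for s in ("Male", "Female")
              for d in ("A", "B", "C")
              for _ in range(n_per_cell)])

print(X.shape)   # (24, 5)
print(X[8])      # codes for s9 (Male, drug C)
```

Because the design is balanced, every column of `X` sums to zero, which is what later allows the hypothesis matrices to be computed from the coded columns directly.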
Analysis of Variance (ANOVA)
$$M^{T}M - \frac{(\mathbf{1}^{T}M)^{T}(\mathbf{1}^{T}M)}{n} = SSCP = \begin{bmatrix} S_{pp} & S_{pc} \\ S_{cp} & S_{cc} \end{bmatrix}$$
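The SSCP formula can be checked numerically. This is a sketch only, assuming NumPy and the subject order s1 to s24 used for M; the variable names are illustrative.

```python
import numpy as np

# Effect codes for the six cells (Male A, B, C, then Female A, B, C),
# each repeated for the 4 subjects in the cell.
cell_codes = np.array([
    [ 1,  1,  0,  1,  0],
    [ 1,  0,  1,  0,  1],
    [ 1, -1, -1, -1, -1],
    [-1,  1,  0, -1,  0],
    [-1,  0,  1,  0, -1],
    [-1, -1, -1,  1,  1],
], dtype=float)
X = np.repeat(cell_codes, 4, axis=0)            # 24 x 5

# The two dependent variables for subjects s1..s24, one pair per subject.
Y = np.array([
     5,  6,   5,  4,   9,  9,   7,  6,
     7,  6,   7,  7,   9, 12,   6,  8,
    21, 15,  14, 11,  17, 12,  12, 10,
     7, 10,   6,  6,   9,  7,   8, 10,
    10, 13,   8,  7,   7,  6,   6,  9,
    16, 12,  14,  9,  14,  8,  10,  5,
], dtype=float).reshape(-1, 2)

M = np.hstack([X, Y])                           # M = [X : Y], 24 x 7
n, p = X.shape

col_sums = np.ones((1, n)) @ M                  # 1^T M
SSCP = M.T @ M - col_sums.T @ col_sums / n      # mean-corrected cross-products

# Partition into predictor (p) and criterion (c) blocks.
S_pp, S_pc = SSCP[:p, :p], SSCP[:p, p:]
S_cp, S_cc = SSCP[p:, :p], SSCP[p:, p:]
print(np.round(S_cc, 3))
```

`S_cc` is the total (mean-corrected) SSCP of the two dependent variables; the other blocks feed the canonical correlation matrix.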
Canonical correlation matrix
R is obtained by
$$R = S_{cp} S_{pp}^{-1} S_{pc} S_{cc}^{-1}$$
$$R = \begin{bmatrix} 0.93673 & -0.349632 \\ 0.2258 & 0.136781 \end{bmatrix}$$
Error matrix (E)
In ANOVA, the error was defined as $e = (1 - R^2) S_{cc}$. This is a special case of the MANOVA error matrix E:
$$E = (I - R) S_{cc}$$
$$E = \begin{bmatrix} 94.5 & 76.5 \\ 76.5 & 114 \end{bmatrix}$$
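A short numerical check of R and E, again as a sketch under the same assumptions (NumPy, subjects in the order s1 to s24). The SSCP blocks are computed here from mean-centred columns, which is equivalent to partitioning the SSCP matrix.

```python
import numpy as np

cell_codes = np.array([
    [ 1,  1,  0,  1,  0],
    [ 1,  0,  1,  0,  1],
    [ 1, -1, -1, -1, -1],
    [-1,  1,  0, -1,  0],
    [-1,  0,  1,  0, -1],
    [-1, -1, -1,  1,  1],
], dtype=float)
X = np.repeat(cell_codes, 4, axis=0)
Y = np.array([
     5,  6,   5,  4,   9,  9,   7,  6,
     7,  6,   7,  7,   9, 12,   6,  8,
    21, 15,  14, 11,  17, 12,  12, 10,
     7, 10,   6,  6,   9,  7,   8, 10,
    10, 13,   8,  7,   7,  6,   6,  9,
    16, 12,  14,  9,  14,  8,  10,  5,
], dtype=float).reshape(-1, 2)

# Mean-centred columns give the SSCP blocks directly.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
S_pp, S_pc = Xc.T @ Xc, Xc.T @ Yc
S_cp, S_cc = S_pc.T, Yc.T @ Yc

# R = S_cp S_pp^-1 S_pc S_cc^-1 and E = (I - R) S_cc
R = S_cp @ np.linalg.inv(S_pp) @ S_pc @ np.linalg.inv(S_cc)
E = (np.eye(2) - R) @ S_cc

print(np.round(R, 6))
print(np.round(E, 2))
```

Expanding the product shows that $E = S_{cc} - S_{cp} S_{pp}^{-1} S_{pc}$, i.e. the part of the total SSCP not accounted for by the predictors.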
Hypothesis variation matrix
The total variation is the sum of the hypothesis variations and the error variation, i.e. $T = E + H_{\alpha} + H_{\beta} + H_{\alpha\beta}$. Each matrix $H_i$ is obtained by
$$H_i = Y^{T} M_i (M_i^{T} M_i)^{-1} M_i^{T} Y$$
where $i \in \{\alpha, \beta, \alpha\beta\}$.
The full model is omitted when performing hypothesis testing. (We start by testing the interaction, then the main effects, etc.)
Hypothesis variation matrix: interaction
For the interaction, $M_{\alpha\beta}$ is the part of the coding matrix X formed by the two interaction columns ($x_4$, $x_5$).
$$H_{\alpha\beta} = Y^{T} M_{\alpha\beta} (M_{\alpha\beta}^{T} M_{\alpha\beta})^{-1} M_{\alpha\beta}^{T} Y$$
$$H_{\alpha\beta} = \begin{bmatrix} 14.33 & 21.33 \\ 21.33 & 32.33 \end{bmatrix}$$
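The interaction matrix can be reproduced with a few lines of NumPy. This sketch assumes that the interaction columns are the last two columns of X ($x_4$, $x_5$) and relies on the design being balanced, so that the coded columns sum to zero and Y needs no explicit centring.

```python
import numpy as np

cell_codes = np.array([
    [ 1,  1,  0,  1,  0],
    [ 1,  0,  1,  0,  1],
    [ 1, -1, -1, -1, -1],
    [-1,  1,  0, -1,  0],
    [-1,  0,  1,  0, -1],
    [-1, -1, -1,  1,  1],
], dtype=float)
X = np.repeat(cell_codes, 4, axis=0)
Y = np.array([
     5,  6,   5,  4,   9,  9,   7,  6,
     7,  6,   7,  7,   9, 12,   6,  8,
    21, 15,  14, 11,  17, 12,  12, 10,
     7, 10,   6,  6,   9,  7,   8, 10,
    10, 13,   8,  7,   7,  6,   6,  9,
    16, 12,  14,  9,  14,  8,  10,  5,
], dtype=float).reshape(-1, 2)

M_ab = X[:, 3:5]                                   # interaction columns x4, x5

# H = Y^T M (M^T M)^-1 M^T Y
H_ab = Y.T @ M_ab @ np.linalg.inv(M_ab.T @ M_ab) @ M_ab.T @ Y
print(np.round(H_ab, 2))
```

Swapping `X[:, 3:5]` for `X[:, 0:1]` or `X[:, 1:3]` would give the matrices for the two main effects in the same way.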
Here is the catch!
In the univariate case, the statistic is based on the F-ratio distribution:
$$F = \frac{R^2 \, df_2}{(1 - R^2) \, df_1}$$
However, in MANOVA there is no unique statistic. Four statistics are commonly used: Hotelling-Lawley trace (HL), Pillai-Bartlett trace (PB), Wilks' likelihood ratio (W) and Roy's largest root (RLR).
Hotelling-Lawley trace (HL)
The HL statistic is defined as
$$HL_i = \mathrm{tr}(H_i E^{-1}) = \sum_{k=1}^{s} \lambda_k$$
where $s = \min(df_i, q)$, $i$ represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.
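Hotelling-Lawley trace (HL): interaction
$df_{\alpha\beta} = (r-1)(c-1) = (2-1)(3-1) = 2$
$s = \min(df_{\alpha\beta}, q) = \min(2, 2) = 2$
$$HL_{\alpha\beta} = \mathrm{tr}(H_{\alpha\beta} E^{-1}) = \sum_{k=1}^{s} \lambda_k$$
Taking the trace of $H_{\alpha\beta} E^{-1}$:
$$HL_{\alpha\beta} = \mathrm{tr}(H_{\alpha\beta} E^{-1}) = 0.0004 + 0.2892 = 0.2896$$
The two terms are the diagonal elements of $H_{\alpha\beta} E^{-1}$; its eigenvalues, 0.2837 and 0.0060, have the same sum up to rounding.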
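The trace and the eigenvalue sum can be compared directly. This is a sketch using the E and H values from the slides, with H entered as thirds (43/3, 64/3, 97/3) on the assumption that 14.33, 21.33 and 32.33 are rounded values.

```python
import numpy as np

E = np.array([[94.5, 76.5],
              [76.5, 114.0]])
H = np.array([[43.0, 64.0],
              [64.0, 97.0]]) / 3.0        # 14.33, 21.33, 32.33 before rounding

HEinv = H @ np.linalg.inv(E)

# Eigenvalues of H E^-1 (real for this problem; .real drops any zero imaginary part).
lam = np.sort(np.linalg.eigvals(HEinv).real)[::-1]

HL_trace = np.trace(HEinv)                # sum of the diagonal elements
HL_eigen = lam.sum()                      # sum of the eigenvalues

print(np.round(np.diag(HEinv), 4))        # diagonal elements
print(np.round(lam, 4))                   # eigenvalues
print(round(HL_trace, 4), round(HL_eigen, 4))
```

The diagonal elements and the eigenvalues differ individually, but both add up to the same HL value, which is all the statistic requires.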
Pillai-Bartlett trace (PB)
The PB statistic is defined as
$$PB_i = \sum_{k=1}^{s} \frac{\lambda_k}{1 + \lambda_k} = \mathrm{tr}\left(H_i (E + H_i)^{-1}\right)$$
where $s = \min(df_i, q)$, $i$ represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.
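Pillai-Bartlett trace (PB): interaction
$df_{\alpha\beta} = (r-1)(c-1) = (2-1)(3-1) = 2$
$s = \min(df_{\alpha\beta}, q) = \min(2, 2) = 2$
$$PB_{\alpha\beta} = \sum_{k=1}^{s} \frac{\lambda_k}{1 + \lambda_k} = \mathrm{tr}\left(H_{\alpha\beta} (E + H_{\alpha\beta})^{-1}\right)$$
$$PB_{\alpha\beta} = \mathrm{tr}\left(H_{\alpha\beta} (E + H_{\alpha\beta})^{-1}\right) = 0.0016 + 0.2253 = 0.2269$$
The two terms are the diagonal elements of $H_{\alpha\beta} (E + H_{\alpha\beta})^{-1}$; the eigenvalue terms $\lambda_k / (1 + \lambda_k)$, 0.2210 and 0.0059, have the same sum.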
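Both forms of the PB definition can be evaluated side by side. As before, this is only a sketch: E and H are the slide values, with H entered as thirds on the assumption that the printed entries are rounded.

```python
import numpy as np

E = np.array([[94.5, 76.5],
              [76.5, 114.0]])
H = np.array([[43.0, 64.0],
              [64.0, 97.0]]) / 3.0        # 14.33, 21.33, 32.33 before rounding

# Trace form: tr(H (E + H)^-1)
PB_trace = np.trace(H @ np.linalg.inv(E + H))

# Eigenvalue form: sum of lambda_k / (1 + lambda_k), lambda_k from H E^-1
lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
PB_eigen = np.sum(lam / (1.0 + lam))

print(round(PB_trace, 6), round(PB_eigen, 6))
```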
Wilks' likelihood ratio (W)
The W statistic is defined as
$$W_i = \prod_{k=1}^{s} \frac{1}{1 + \lambda_k} = \left|E (E + H_i)^{-1}\right| = \frac{|E|}{|E + H_i|}$$
where $s = \min(df_i, q)$, $i$ represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$), $df_i$ is the degrees of freedom associated with the hypothesis under investigation ($\alpha$, $\beta$ or $\alpha\beta$), $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$, and $|E|$ (as well as $|E + H_i|$) is the determinant.
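Wilks' likelihood ratio (W): interaction
$df_{\alpha\beta} = (r-1)(c-1) = (2-1)(3-1) = 2$
$s = \min(df_{\alpha\beta}, q) = \min(2, 2) = 2$
$$W_{\alpha\beta} = \prod_{k=1}^{s} \frac{1}{1 + \lambda_k} = \left|E (E + H_{\alpha\beta})^{-1}\right| = \frac{|E|}{|E + H_{\alpha\beta}|}$$
$$W_{\alpha\beta} = \left|E (E + H_{\alpha\beta})^{-1}\right| = 0.77436$$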
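The determinant ratio and the eigenvalue product can be checked against each other. This sketch uses the slide values for E and H, with H entered as thirds on the assumption that the printed entries are rounded.

```python
import numpy as np

E = np.array([[94.5, 76.5],
              [76.5, 114.0]])
H = np.array([[43.0, 64.0],
              [64.0, 97.0]]) / 3.0        # 14.33, 21.33, 32.33 before rounding

# Determinant form: |E| / |E + H|
W_det = np.linalg.det(E) / np.linalg.det(E + H)

# Eigenvalue form: product of 1 / (1 + lambda_k), lambda_k from H E^-1
lam = np.linalg.eigvals(H @ np.linalg.inv(E)).real
W_eigen = np.prod(1.0 / (1.0 + lam))

print(round(W_det, 5), round(W_eigen, 5))
```

Unlike HL and PB, smaller values of W indicate a stronger effect, since W is the share of generalized variance left to error.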
Roy's largest root (RLR)
The RLR statistic is defined as
$$RLR_i = \frac{\max(\lambda_k)}{1 + \max(\lambda_k)}$$
where $i$ represents the tested effect ($i \in \{\alpha, \beta, \alpha\beta\}$) and $\lambda_k$ is the kth eigenvalue extracted from $H_i E^{-1}$.
Roy's largest root (RLR): interaction
$$RLR_{\alpha\beta} = \frac{\max(\lambda_k)}{1 + \max(\lambda_k)} = \frac{0.283723}{1 + 0.283723} = 0.221$$
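A brief sketch of the RLR computation, using the slide values for E and H (H entered as thirds on the assumption that the printed entries are rounded):

```python
import numpy as np

E = np.array([[94.5, 76.5],
              [76.5, 114.0]])
H = np.array([[43.0, 64.0],
              [64.0, 97.0]]) / 3.0        # 14.33, 21.33, 32.33 before rounding

# Largest eigenvalue of H E^-1
lam_max = np.linalg.eigvals(H @ np.linalg.inv(E)).real.max()

RLR = lam_max / (1.0 + lam_max)
print(round(lam_max, 6), round(RLR, 3))
```

Only the largest eigenvalue enters the statistic, which is why RLR is sensitive to effects concentrated in a single dimension.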
Multivariate F-ratio
All the statistics are equivalent when s = 1.
In general, there is no exact formula for finding the associated p-value, except in rare situations.
Nevertheless, a convenient and sufficient approximation exists for all but RLR.
Since RLR is the least robust, attention will be focused on the first three statistics: HL, PB and W.
The distributions of these three statistics are approximated using an F distribution, which has the advantage of being simple to understand:
$$F_i(m) = \frac{\eta^2_{m_i} \, df_2(m)}{(1 - \eta^2_{m_i}) \, df_1}$$
Multivariate F-ratio
where
$df_1$ is the numerator degrees of freedom ($df_1 = q \cdot df_i$),
$df_2(m)$ is the denominator degrees of freedom for each statistic m ($m \in \{HL_i, PB_i, W_i\}$),
$\eta^2_m$ is the multivariate measure of association for each statistic m.
Multivariate F-ratio (HL)
The multivariate measure of association for HL is given by
$$\eta^2_{HL_i} = \frac{HL_i}{HL_i + s}$$
The numerator df
$$df_1 = q \cdot df_i$$
The denominator df
$$df_{err} = N - r \cdot c$$
$$df_2(HL_i) = s(df_{err} - q - 1) + 2$$
Multivariate F-ratio (HL): interaction
The multivariate measure of association for HL is given by
$$\eta^2_{HL} = \frac{HL}{HL + s} = \frac{0.2896}{0.2896 + 2} = 0.1265$$
The numerator df
$$df_1 = q \cdot df = 2 \cdot 2 = 4$$
The denominator df
$$df_{err} = N - r \cdot c = 24 - 2 \cdot 3 = 18$$
$$df_2(HL) = s(df_{err} - q - 1) + 2 = 2(18 - 2 - 1) + 2 = 32$$
Multivariate F-ratio (PB)
The multivariate measure of association for PB is given by
$$\eta^2_{PB_i} = \frac{PB_i}{s}$$
The numerator df
$$df_1 = q \cdot df_i$$
The denominator df
$$df_{err} = N - r \cdot c$$
$$df_2(PB_i) = s(df_{err} - q + s)$$
Multivariate F-ratio (PB): interaction
The multivariate measure of association for PB is given by
$$\eta^2_{PB} = \frac{PB}{s} = \frac{0.226949}{2} = 0.113475$$
The numerator df
$$df_1 = q \cdot df = 2 \cdot 2 = 4$$
The denominator df
$$df_{err} = N - r \cdot c = 24 - 2 \cdot 3 = 18$$
$$df_2(PB) = s(df_{err} - q + s) = 2(18 - 2 + 2) = 36$$
Multivariate F-ratio (W)
The multivariate measure of association for W is given by
$$\eta^2_{W_i} = 1 - W_i^{1/g}$$
The numerator df
$$df_1 = q \cdot df_i$$
The denominator df
$$o = df_{err} - \frac{q + 1 - df_i}{2}$$
$$df_2(W_i) = o \, g - \frac{df_1}{2} + 1$$
$$g = \left( \frac{q^2 df_i^2 - 4}{q^2 + df_i^2 - 5} \right)^{1/2}$$
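Multivariate F-ratio (W): interaction
The multivariate measure of association for W is given by
$$\eta^2_{W} = 1 - W^{1/g} = 1 - 0.774362^{1/2} = 0.120021$$
The numerator df
$$df_1 = q \cdot df = 2 \cdot 2 = 4$$
The denominator df
$$o = df_{err} - \frac{q + 1 - df}{2} = 18 - \frac{2 + 1 - 2}{2} = 17.5$$
$$df_2(W) = o \, g - \frac{df_1}{2} + 1 = 17.5 \cdot 2 - \frac{4}{2} + 1 = 34$$
$$g = \left( \frac{q^2 df^2 - 4}{q^2 + df^2 - 5} \right)^{1/2} = \left( \frac{4 \cdot 4 - 4}{4 + 4 - 5} \right)^{1/2} = 2$$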
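Multivariate F-ratio
HL (interaction, $\alpha\beta$)
$$F(HL) = \frac{\eta^2_{HL} \, df_2(HL)}{(1 - \eta^2_{HL}) \, df_1} = \frac{0.1265 \cdot 32}{(1 - 0.1265) \cdot 4} = 1.15877$$
(computed from unrounded values)
PB (interaction, $\alpha\beta$)
$$F(PB) = \frac{\eta^2_{PB} \, df_2(PB)}{(1 - \eta^2_{PB}) \, df_1} = \frac{0.113475 \cdot 36}{(1 - 0.113475) \cdot 4} = 1.15199$$
W (interaction, $\alpha\beta$)
$$F(W) = \frac{\eta^2_{W} \, df_2(W)}{(1 - \eta^2_{W}) \, df_1} = \frac{0.120021 \cdot 34}{(1 - 0.120021) \cdot 4} = 1.15933$$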
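The three approximations can be collected in one small function. This is a sketch that follows the formulas on the preceding slides; the function name `multivariate_f` and its argument names are illustrative, and the statistic values passed in are the interaction results obtained above (HL is given unrounded).

```python
import math

def multivariate_f(stat, value, q, df_i, df_err):
    """Approximate F for HL, PB or W (hypothetical helper).

    Returns (eta2, df1, df2, F). Does not handle the degenerate case
    q**2 + df_i**2 - 5 == 0 in the W approximation.
    """
    s = min(df_i, q)
    df1 = q * df_i
    if stat == "HL":
        eta2 = value / (value + s)
        df2 = s * (df_err - q - 1) + 2
    elif stat == "PB":
        eta2 = value / s
        df2 = s * (df_err - q + s)
    elif stat == "W":
        g = math.sqrt((q**2 * df_i**2 - 4) / (q**2 + df_i**2 - 5))
        o = df_err - (q + 1 - df_i) / 2
        eta2 = 1 - value ** (1 / g)
        df2 = o * g - df1 / 2 + 1
    else:
        raise ValueError("stat must be 'HL', 'PB' or 'W'")
    F = (eta2 * df2) / ((1 - eta2) * df1)
    return eta2, df1, df2, F

# Interaction effect: q = 2 DVs, df_i = 2, df_err = 24 - 2*3 = 18
results = {
    "HL": multivariate_f("HL", 0.289692, q=2, df_i=2, df_err=18),
    "PB": multivariate_f("PB", 0.226949, q=2, df_i=2, df_err=18),
    "W":  multivariate_f("W",  0.77436,  q=2, df_i=2, df_err=18),
}
for name, (eta2, df1, df2, F) in results.items():
    print(name, round(eta2, 4), df1, df2, round(F, 4))
```

The three F values are nearly identical here, and each would be referred to an F distribution with its own ($df_1$, $df_2$) pair to obtain a p-value.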
MANOVA
Unfortunately, no single test is the most powerful when the MANOVA assumptions are not met.
If homogeneity of the covariance matrices or multivariate normality is violated, then the PB statistic is the most robust, while RLR is the least robust.
If the noncentrality is concentrated (the population centroids are largely confined to a single dimension), RLR provides the most powerful test.
If, on the other hand, the noncentrality is diffuse (the population centroids differ almost equally in all dimensions), then PB, HL or W will all give good power.
However, in most cases the power differences among the four statistics are quite small (< 0.06), so it matters little which statistic is used.