GMS MS 700, Lecture 7-2
-
8/6/2019 GMS MS 700, Lecture 7-2
Hypothesis Testing
Analysis of Variance (ANOVA)
GMS MS 700/GMS AN 704
Elementary Biostatistics
March 23, 2011
-
Hypothesis Testing
continuous outcomes: z- or t-test
one sample
two samples
paired samples (matched samples)
discrete outcomes: χ²
one sample (χ² goodness-of-fit test)
two samples (χ² test of independence)
-
Hypothesis Testing
continuous outcomes: ANOVA
more than two samples/groups
several types of ANOVAs
one-way (one-factor)
extension of two-sample t-test
randomized block (no interaction effects)
multi-factor (possible interaction effects)
repeated measures
extension of paired-samples t-test
-
What is ANOVA?
One-way ANOVA compares the means of 2 or more groups or categories (the independent variable, IV) on one dependent variable (DV) to determine whether the groups differ significantly from one another on the DV.
To use ANOVA, you need a categorical (nominal) variable with at least two independent groups (e.g., treatment vs. control, fuel 1 vs. fuel 2) as the independent variable, and a continuous variable (interval or ratio) as the dependent variable.
ANOVA is very similar to a t-test, particularly when comparing only 2 groups. With 3 or more groups, a single ANOVA tests all group means at once instead of requiring many pairwise t-tests.
-
t-Tests vs. ANOVA
t-tests allow us to decide whether the observed difference between two group means is too large to be plausibly due to chance (i.e., statistically significant).
But the more t-tests we run, the greater the chance of rejecting at least one null hypothesis when it is true (Type I error).
ANOVA tests all groups in a single analysis, so the Type I error rate stays at the chosen α when comparing 3 or more groups.
Rather than finding a simple difference between 2 means as in a t-test, ANOVA summarizes the differences among the means of multiple independent groups using the squared deviations of the group means from the grand mean.
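To make the multiple-testing point concrete, here is a minimal sketch of how the chance of at least one Type I error grows with the number of pairwise t-tests. It assumes, purely for illustration, that the tests are independent (pairwise tests on shared groups are not exactly independent), and the function name is hypothetical.

```python
from math import comb

def familywise_error_rate(k_groups, alpha=0.05):
    """Chance of at least one false rejection over all pairwise tests.

    Illustrative assumption: the pairwise tests are independent.
    """
    n_tests = comb(k_groups, 2)          # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests

for k in (2, 3, 4, 5):
    print(f"{k} groups, {comb(k, 2)} tests: "
          f"familywise error rate = {familywise_error_rate(k):.3f}")
```

Under this assumption, three groups (three pairwise tests) already push the error rate from .05 to roughly .14.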
-
H0: There is no difference in MPG between fuels.
HA: There is a difference in MPG between fuels.
(What is the IV? What is the DV?)
Data Set 1
Fuel 1    Fuel 2    Fuel 3
40        50        56
44        54        56
42        52        54
44        52        58
40        52        56
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
Data Set 2
Fuel 1    Fuel 2    Fuel 3
36        54        34
48        40        74
34        58        58
44        62        42
48        46        72
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
-
One-Way (One-Factor) ANOVA (one IV):
An Intuitive Decomposition of Sum of Squares/Variance
Variance: the near average of the squared differences of a set of observations around its mean ("near" because the sum is divided by n − 1 rather than n)
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
One-Way ANOVA: Compare the between-group (between-factor) variance to the within-group (within-factor) variance
In ANOVA, a variance is referred to as a mean square
The F statistic is the ratio of these two variances (between-group mean square divided by within-group mean square)
-
Hypothesis Testing for More than 2 Means:
ANOVA
Continuous outcome
k Independent Samples, k > 2
H0: μ1 = μ2 = … = μk
H1: Means are not all equal
Test Statistic
$$F = \frac{\sum_j n_j (\bar{X}_j - \bar{X})^2 / (k - 1)}{\sum_j \sum_i (X_{ij} - \bar{X}_j)^2 / (N - k)}$$
Find critical value in Table 4 (F distribution)
df = (k − 1), (N − k)
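The test statistic can be sketched in a few lines of plain Python. This is an illustrative implementation, not the course's software; the function name `one_way_anova` is hypothetical, and the data are Data Set 1 from the fuel example.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Numerator: group size times squared distance of each group mean from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Denominator: squared distance of each observation from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Data Set 1 (MPG by fuel)
fuel_1 = [40, 44, 42, 44, 40]
fuel_2 = [50, 54, 52, 52, 52]
fuel_3 = [56, 56, 54, 58, 56]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova([fuel_1, fuel_2, fuel_3])
print(f"SSB = {ss_b:.0f}, SSW = {ss_w:.0f}, df = ({df_b}, {df_w}), F = {f_stat:.1f}")
```

The resulting F is then compared with the tabled critical value for (k − 1, N − k) degrees of freedom.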
-
An Intuitive Decomposition of Sum of Squares
Data Set 1: Decision Rule
SSTOTAL = SSBETWEEN + SSWITHIN
Data Set 1
Fuel 1    Fuel 2    Fuel 3
40        50        56
44        54        56
42        52        54
44        52        58
40        52        56
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
$$F = \frac{\sum_j n_j (\bar{X}_j - \bar{X})^2 / (k - 1)}{\sum_j \sum_i (X_{ij} - \bar{X}_j)^2 / (N - k)}$$
k − 1 = 3 − 1 = 2; N − k = 15 − 3 = 12
Critical value: F(2, 12) = 3.89 (α = .05; Table 4)
-
An Intuitive Decomposition of Sum of Squares
Data Set 1
SSTOTAL = SSBETWEEN + SSWITHIN
Fuel 1 Fuel 2 Fuel 3
40 50 56
44 54 56
42 52 54
44 52 58
40 52 56
M1 = 42 M2= 52 M3 = 56
Grand M= 50
SST = (40 − 50)² + (44 − 50)² + … + (58 − 50)² + (56 − 50)²
= 552 units of variation
Data Set 1
-
An Intuitive Decomposition of Sum of Squares:
Data Set 1
SSTOTAL = SSBETWEEN + SSWITHIN
Fuel 1 Fuel 2 Fuel 3
40 50 56
44 54 56
42 52 54
44 52 58
40 52 56
M1 = 42 M2= 52 M3 = 56
Grand M= 50
SSB = 5 [(42 − 50)² + (52 − 50)² + (56 − 50)²]
= 5 [ 64 + 4 + 36]
= 520 units of variation
Data Set 1
-
An Intuitive Decomposition of Sum of Squares
Data Set 1
SSTOTAL = SSBETWEEN + SSWITHIN
Fuel 1    Fuel 2    Fuel 3
40        50        56
44        54        56
42        52        54
44        52        58
40        52        56
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
SSW1 = (40 − 42)² + … + (40 − 42)² = 16 for Fuel 1
SSW2 = (50 − 52)² + … + (52 − 52)² = 8 for Fuel 2
SSW3 = (56 − 56)² + … + (56 − 56)² = 8 for Fuel 3
SSW = 16 + 8 + 8 = 32 units of variation
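Written in the notation of the F statistic, the partition used on these slides reads as follows. The indexing is an assumption made for illustration: X_ij denotes observation i in group j, and n_j is the size of group j.

```latex
\underbrace{\sum_j \sum_i (X_{ij} - \bar{X})^2}_{SS_{TOTAL}}
  = \underbrace{\sum_j n_j (\bar{X}_j - \bar{X})^2}_{SS_{BETWEEN}}
  + \underbrace{\sum_j \sum_i (X_{ij} - \bar{X}_j)^2}_{SS_{WITHIN}}
\qquad\text{Data Set 1: } 552 = 520 + 32
```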
-
An Intuitive Decomposition of Sum of Squares
Data Set 1: Conclusion
Sources of Variation   Sum of Squares   df   Mean Square   F      p
Between Groups         520              2    260           97.5   < .001
Within Groups/Error    32               12   2.67
Total                  552              14
Reject H0 because F = 97.5 > critical F = 3.89 (α = .05).
Conclude that there is a significant difference between fuels in MPG.
-
An Intuitive Decomposition of Sum of Squares
Data Set 2: Decision Rule
SSTOTAL = SSBETWEEN + SSWITHIN
Data Set 2
Fuel 1    Fuel 2    Fuel 3
36        54        34
48        40        74
34        58        58
44        62        42
48        46        72
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
k − 1 = 3 − 1 = 2; N − k = 15 − 3 = 12
Critical value: F(2, 12) = 3.89 (α = .05; Table 4)
-
An Intuitive Decomposition of Sum of Squares
Data Set 2
SSTOTAL = SSBETWEEN + SSWITHIN
Fuel 1    Fuel 2    Fuel 3
36        54        34
48        40        74
34        58        58
44        62        42
48        46        72
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
SST = (36 − 50)² + (48 − 50)² + … + (42 − 50)² + (72 − 50)²
= 2280 units of variation
-
An Intuitive Decomposition of Sum of Squares
Data Set 2
SSB = 5 [(42 − 50)² + (52 − 50)² + (56 − 50)²]
= 5 [ 64 + 4 + 36]
= 520 units of variation (NOTE: Same as for Data Set 1)
Data Set 2
Fuel 1 Fuel 2 Fuel 3
36 54 34
48 40 74
34 58 58
44 62 42
48 46 72
M1 = 42 M2= 52 M3 = 56
Grand M= 50
SSTOTAL = SSBETWEEN + SSWITHIN
-
An Intuitive Decomposition of Sum of Squares
Data Set 2
SSTOTAL = SSBETWEEN + SSWITHIN
SSW1 = (36 − 42)² + … + (48 − 42)² = 176 for Fuel 1
SSW2 = (54 − 52)² + … + (46 − 52)² = 320 for Fuel 2
SSW3 = (34 − 56)² + … + (72 − 56)² = 1264 for Fuel 3
SSW = 176 + 320 + 1264 = 1760 units of variation
Data Set 2
Fuel 1    Fuel 2    Fuel 3
36        54        34
48        40        74
34        58        58
44        62        42
48        46        72
M1 = 42   M2 = 52   M3 = 56
Grand M = 50
-
An Intuitive Decomposition of Sum of Squares
Data Set 2: Conclusion
Sources of Variation   Sum of Squares   df   Mean Square   F      p
Between Groups         520              2    260           1.77   .212
Within Groups/Error    1760             12   146.7
Total                  2280             14
Fail to reject H0 because F = 1.77 < critical F = 3.89 (α = .05).
Conclude that there is not a significant difference between fuels in MPG.
-
One-Way (One-Factor) ANOVA:
An Intuitive Decomposition of Sum of Squares/Variance
Between-Group Variance   Within-Group Variance   Likely Statistical Outcome
small                    small                   hard to say
small                    large                   factor has little or no effect; fail to reject H0
large                    small                   factor has a large effect; reject H0
large                    large                   hard to say
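The two fuel data sets illustrate the "large/small" and "large/large" rows: both have the same between-group variance, but very different within-group variance. The sketch below uses a shortcut that holds only when all groups have the same size (F = n × variance of the group means ÷ mean of the group variances); the function name is hypothetical.

```python
from statistics import mean, variance

def f_equal_n(groups):
    """Return (MS_between, MS_within, F) using the equal-group-size shortcut."""
    n = len(groups[0])
    assert all(len(g) == n for g in groups), "shortcut needs equal group sizes"
    ms_between = n * variance([mean(g) for g in groups])   # spread of the group means
    ms_within = mean([variance(g) for g in groups])        # average spread inside groups
    return ms_between, ms_within, ms_between / ms_within

data_set_1 = [[40, 44, 42, 44, 40], [50, 54, 52, 52, 52], [56, 56, 54, 58, 56]]
data_set_2 = [[36, 48, 34, 44, 48], [54, 40, 58, 62, 46], [34, 74, 58, 42, 72]]

ms_b1, ms_w1, f1 = f_equal_n(data_set_1)
ms_b2, ms_w2, f2 = f_equal_n(data_set_2)
print(f"Data Set 1: MSB = {ms_b1:.1f}, MSW = {ms_w1:.2f}, F = {f1:.2f}")
print(f"Data Set 2: MSB = {ms_b2:.1f}, MSW = {ms_w2:.2f}, F = {f2:.2f}")
```

Identical group means give the same numerator in both cases; only the denominator changes, and with it the conclusion.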
-
Post-Hoc Tukey HSD Test between Means
$$\text{Tukey HSD} = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}}}, \qquad s_{\bar{x}} = \sqrt{\frac{MS_e}{n_g}} = \sqrt{\frac{2.67}{5}} = .73$$
where MS_e = the within-groups (error) mean square and n_g = the number of cases in each group
Tukey1-2 = |42 − 52|/.73 = 13.7, p < .01
Tukey1-3 = |42 − 56|/.73 = 19.2, p < .01
Tukey2-3 = |52 − 56|/.73 = 5.48, p < .01
The critical value of the Tukey statistic (see Table D) is based on the number of groups (3 here) and the df of the error term (12 here): 3.77 for α = .05 and 5.05 for α = .01
All 3 means differ significantly from one another at the .01 level of significance: mileage for Fuel 3 > mileage for Fuel 2 > mileage for Fuel 1
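The three Tukey comparisons can be reproduced with a short sketch. The critical values are the table values quoted on the slide, entered by hand; the variable and function names are illustrative.

```python
from itertools import combinations
from math import sqrt

means = {"Fuel 1": 42, "Fuel 2": 52, "Fuel 3": 56}
ms_error = 32 / 12        # within-groups mean square from the Data Set 1 ANOVA table
n_per_group = 5
Q_CRIT_05, Q_CRIT_01 = 3.77, 5.05   # table values for 3 groups and 12 error df

s_xbar = sqrt(ms_error / n_per_group)   # standard error of a group mean

def tukey_q(mean_i, mean_j):
    """Studentized range statistic for one pair of group means."""
    return abs(mean_i - mean_j) / s_xbar

for (name_i, m_i), (name_j, m_j) in combinations(means.items(), 2):
    q = tukey_q(m_i, m_j)
    verdict = "p < .01" if q > Q_CRIT_01 else "p < .05" if q > Q_CRIT_05 else "n.s."
    print(f"{name_i} vs {name_j}: q = {q:.2f} ({verdict})")
```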
-
SPSS Input for Data Set 1
Fuel Mileage
1 40
1 44
1 42
1 44
1 40
2 50
2 54
2 52
2 52
2 52
3 56
3 56
3 54
3 58
3 56
-
SPSS Output for Data Set 1
Test of Homogeneity of Variances
Mileage
Levene Statistic   df1   df2   Sig.
1.000              2     12    .397
ANOVA
Mileage
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   520.000          2    260.000       97.500   .000
Within Groups    32.000           12   2.667
Total            552.000          14
The Levene statistic tests the H0 that the error variance of the dependent variable is equal across groups.
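The Levene statistic in this output can be reproduced by hand: in its mean-centered form it is simply a one-way ANOVA F computed on the absolute deviations of each observation from its group mean. The sketch below assumes that mean-centered form (it is the one that yields 1.000 for these data); the function names are hypothetical.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def levene_statistic(groups):
    """Mean-centered Levene statistic: ANOVA F on |x - group mean|."""
    abs_dev = [[abs(x - sum(g) / len(g)) for x in g] for g in groups]
    return one_way_f(abs_dev)

# Data Set 1 (mileage by fuel)
data_set_1 = [[40, 44, 42, 44, 40], [50, 54, 52, 52, 52], [56, 56, 54, 58, 56]]
levene = levene_statistic(data_set_1)
print(f"Levene statistic = {levene:.3f} with df = (2, 12)")
```

A non-significant result (Sig. = .397 here) gives no evidence against the equal-variance assumption that ANOVA relies on.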
-
An Intuitive Decomposition of SS: Practice
Decision Rule
Data Set 3
Fuel 1 Fuel 2 Fuel 3
20 25 28
22 27 28
21 26 27
22 26 29
20 26 28
M1 = 21 M2= 26 M3 = 28
Grand M= 25
-
An Intuitive Decomposition of SS: Practice
Between-Groups Variance
Data Set 3
Fuel 1 Fuel 2 Fuel 3
20 25 28
22 27 28
21 26 27
22 26 29
20 26 28
M1 = 21 M2= 26 M3 = 28
Grand M= 25
-
An Intuitive Decomposition of SS: Practice
Within-Groups Variance
Data Set 3
Fuel 1 Fuel 2 Fuel 3
20 25 28
22 27 28
21 26 27
22 26 29
20 26 28
M1 = 21 M2= 26 M3 = 28
Grand M= 25
-
An Intuitive Decomposition of SS: Practice
Data Set 3
Fuel 1 Fuel 2 Fuel 3
20 25 28
22 27 28
21 26 27
22 26 29
20 26 28
M1 = 21 M2= 26 M3 = 28
Grand M= 25
Sources of Variation   Sum of Squares   df   Mean Square   F   p
Between Groups
Within Groups/Error
Total
-
One-Way (One-Factor) ANOVA:
Fisher's Randomized Block Design
In some cases, an extraneous factor is a systematic source of variance that increases the error term
The goal of a randomized block design is to block the extraneous source of variance and to remove it from the error term, thus increasing the between-groups F value
In effect, the randomized block design removes unexplained variance from the error term by associating it with an extraneous factor that is affecting the results
Fisher (from whom we get our F value) developed the block design to account for extraneous variance in crop yield associated with farm location (e.g., northern vs. central vs. southern locales in England) in order to test whether there were real differences in his main experimental factor, fertilizer type
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSWITHIN
Data Set Unblocked
Fertilizer 1   Fertilizer 2
38             50
42             52
29             38
32             41
18             27
22             28
M1 = 30.17     M2 = 39.33
Grand M = 34.75
SST = (38 − 34.75)² + (42 − 34.75)² + … + (27 − 34.75)² + (28 − 34.75)²
= 1232.25 units of variation
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSWITHIN
Data Set Unblocked
Fertilizer 1   Fertilizer 2
38             50
42             52
29             38
32             41
18             27
22             28
M1 = 30.17     M2 = 39.33
Grand M = 34.75
SSB = 6 [(30.17 − 34.75)² + (39.33 − 34.75)²]
= 252.1 units of variation
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSWITHIN
Data Set Unblocked
Fertilizer 1   Fertilizer 2
38             50
42             52
29             38
32             41
18             27
22             28
M1 = 30.17     M2 = 39.33
Grand M = 34.75
SSW1 = (38 − 30.17)² + … + (22 − 30.17)² for Fertilizer 1
SSW2 = (50 − 39.33)² + … + (28 − 39.33)² for Fertilizer 2
SSW = SSW1 + SSW2 = 980.17 units of variation
-
One-Factor Randomized Block Design
Sources of Variation   Sum of Squares   df   Mean Square   F      p
Between Groups         252.1            1    252.1         2.57   .140
Within Groups/Error    980.2            10   98.02
Total                  1232.3           11   112.03
Data Set Unblocked
Fertilizer 1   Fertilizer 2
38             50
42             52
29             38
32             41
18             27
22             28
M1 = 30.17     M2 = 39.33
Grand M = 34.75
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSBLOCK + SSWITHIN
Data Set Blocked
Blocked Variable   Fertilizer 1   Fertilizer 2   Sector Mean
Northern Sector    38             50             MN = 45.5
                   42             52
Central Sector     29             38             MC = 35
                   32             41
Southern Sector    18             27             MS = 23.75
                   22             28
                   M1 = 30.17     M2 = 39.33     Grand M = 34.75
SST = (38 − 34.75)² + (42 − 34.75)² + … + (27 − 34.75)² + (28 − 34.75)²
= 1232.25 units of variation
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSBLOCK + SSWITHIN
Data Set Blocked
Blocked Variable   Fertilizer 1   Fertilizer 2   Sector Mean
Northern Sector    38             50             MN = 45.5
                   42             52
Central Sector     29             38             MC = 35
                   32             41
Southern Sector    18             27             MS = 23.75
                   22             28
                   M1 = 30.17     M2 = 39.33     Grand M = 34.75
SSB = 6 [(30.17 − 34.75)² + (39.33 − 34.75)²]
= 252.1 units of variation (NOTE: Same as for Unblocked Data Set)
-
One-Factor Randomized Block Design
SSTOTAL = SSBETWEEN + SSBLOCK + SSWITHIN
Data Set Blocked
Blocked Variable   Fertilizer 1   Fertilizer 2   Sector Mean
Northern Sector    38             50             MN = 45.5
                   42             52
Central Sector     29             38             MC = 35
                   32             41
Southern Sector    18             27             MS = 23.75
                   22             28
                   M1 = 30.17     M2 = 39.33     Grand M = 34.75
SSBL = 4 [(45.5 − 34.75)² + (35 − 34.75)² + (23.75 − 34.75)²]
= 946.5 units of variation
-
One-Factor Randomized Block Design
Sources of Variation        Sum of Squares   df   Mean Square   F       p
Blocked/Extraneous Factor   946.5            2    473.25        112.4   < .001
Between Groups              252.1            1    252.1         59.9    < .001
Within Groups/Error*        33.7             8    4.21
Total                       1232.3           11   112.03
*Was 980.2 unblocked: 980.2 − 946.5 = 33.7
Blocked Variable   Fertilizer 1   Fertilizer 2   Sector Mean
Northern Sector    38             50             MN = 45.5
                   42             52
Central Sector     29             38             MC = 35
                   32             41
Southern Sector    18             27             MS = 23.75
                   22             28
                   M1 = 30.17     M2 = 39.33     Grand M = 34.75
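The effect of blocking on the error term can be checked with a short sketch that follows the slides' additive decomposition (no interaction term). The helper `factor_ss` and the tuple layout are illustrative choices, not part of the lecture; sector labels N, C, S stand for the northern, central, and southern sectors.

```python
from collections import defaultdict

# (fertilizer, sector, bushels) for the 12 plots
data = [(1, "N", 38), (1, "N", 42), (1, "C", 29), (1, "C", 32), (1, "S", 18), (1, "S", 22),
        (2, "N", 50), (2, "N", 52), (2, "C", 38), (2, "C", 41), (2, "S", 27), (2, "S", 28)]

def factor_ss(rows, key_index, grand_mean):
    """Sum over levels of n_level * (level mean - grand mean)^2 for one factor."""
    levels = defaultdict(list)
    for row in rows:
        levels[row[key_index]].append(row[2])
    return sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in levels.values())

values = [row[2] for row in data]
n = len(values)
grand_mean = sum(values) / n

ss_total = sum((y - grand_mean) ** 2 for y in values)
ss_fert = factor_ss(data, 0, grand_mean)       # between groups (fertilizer)
ss_block = factor_ss(data, 1, grand_mean)      # blocked factor (sector)
ss_error = ss_total - ss_fert - ss_block       # what remains in the error term

df_fert, df_block = 1, 2
df_error = n - 1 - df_fert - df_block
ms_error = ss_error / df_error

f_fert_blocked = (ss_fert / df_fert) / ms_error
f_block = (ss_block / df_block) / ms_error
f_fert_unblocked = (ss_fert / df_fert) / ((ss_total - ss_fert) / (n - 2))

print(f"SS: total = {ss_total:.2f}, fertilizer = {ss_fert:.2f}, "
      f"block = {ss_block:.2f}, error = {ss_error:.2f}")
print(f"F for fertilizer: unblocked = {f_fert_unblocked:.2f}, blocked = {f_fert_blocked:.1f}")
```

Because the code carries unrounded mean squares, the last digit of an F value can differ slightly from a table built from rounded entries.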
-
SPSS Input for Blocked Data Set
Fertilizer Plot Bushels
1 1 38
1 1 42
1 2 29
1 2 32
1 3 18
1 3 22
2 1 50
2 1 52
2 2 38
2 2 41
2 3 27
2 3 28