Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data. Lesson 10: Analysis of Variance

  • date post

    22-Dec-2015


Outline

Characteristics of F-Distribution

Test for Equal Variances

Analysis of Variance (ANOVA)

Two-Factor ANOVA

Sampling distribution of the sample means

Probability histograms and empirical histograms

Central Limit Theorem

Normal approximation to Binomial


Two Sample Tests

[Figure: two side-by-side panels, "Test for Equal Variances" and "Test for Equal Means". Each panel contrasts H0 and H1 by sketching the distributions of Population 1 and Population 2.]


Characteristics of F-Distribution

There is a “family” of F Distributions.

Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.

F cannot be negative, and it is a continuous distribution.

The F distribution is positively skewed. Its values range from 0 to ∞. As F → ∞, the curve approaches the X-axis.


The F-Distribution, F(m,n)

[Figure: density curve of F(m,n) plotted against F, with 0 and 1.0 marked on the horizontal axis.]

Not symmetric (skewed to the right)

Nonnegative values only

Each member of the family is determined by two parameters: the numerator degrees of freedom (m) and the denominator degrees of freedom (n).


Test for Equal Variances

For the two-tail test, the test statistic is given by:

$$F = \frac{\text{larger of } (S_1^2,\ S_2^2)}{\text{smaller of } (S_1^2,\ S_2^2)}$$

where $S_1^2$ and $S_2^2$ are the sample variances for the two samples.

The null hypothesis is rejected at level of significance $\alpha$ if the computed value of the test statistic is greater than the critical value with upper-tail area $\alpha/2$ and the corresponding numerator and denominator dfs.


Test for Equal Variances

For the one-tail test, the test statistic is given by:

$$F = \frac{S_1^2}{S_2^2} \quad \text{if } H_1: \sigma_1^2 > \sigma_2^2$$

where $S_1^2$ and $S_2^2$ are the sample variances for the two samples.

The null hypothesis is rejected at level of significance $\alpha$ if the computed value of the test statistic is greater than the critical value with upper-tail area $\alpha$ and the corresponding numerator and denominator dfs.


EXAMPLE 1

Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent. The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?


EXAMPLE 1 continued

Step 1: The hypotheses are:

$$H_0: \sigma_I^2 \le \sigma_U^2 \qquad H_1: \sigma_I^2 > \sigma_U^2$$

Step 2: The significance level is .05.

Step 3: The test statistic follows the F distribution.

Step 4: H0 is rejected if F > 3.68. The degrees of freedom are 9 in the numerator and 7 in the denominator.

Step 5: The value of F is

$$F = \frac{(3.9)^2}{(3.5)^2} = 1.2416$$

H0 is not rejected. There is insufficient evidence to show more variation in the internet stocks.
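The arithmetic of Example 1 can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the critical value 3.68 is copied from the slide rather than computed from the F distribution.

```python
# One-tail F test for equal variances, using the figures from Example 1.
s_internet = 3.9          # sample standard deviation, internet stocks
s_utility = 3.5           # sample standard deviation, utility stocks
n_internet, n_utility = 10, 8

# Ratio of sample variances; the internet variance goes in the numerator
# because H1 claims it is the larger one.
f_stat = s_internet ** 2 / s_utility ** 2
df_num = n_internet - 1
df_den = n_utility - 1

# Critical value for alpha = .05 with (9, 7) df, as quoted on the slide.
f_crit = 3.68

reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.4f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

Because the computed ratio is well below the critical value, the sketch reaches the same decision as the slide.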


Analysis of Variance (ANOVA)


Underlying Assumptions for ANOVA

The F distribution is also used for testing whether two or more sample means came from the same or equal populations.

It tests whether any group mean differs from the mean of all groups combined.

It answers: “Are all groups equal or not?”

This technique is called analysis of variance or ANOVA.

ANOVA requires the following conditions:

The sampled populations follow the normal distribution.

The populations have equal standard deviations.

The samples are randomly selected and are independent.


The hypothesis

Suppose that we have independent samples of $n_1, n_2, \ldots, n_K$ observations from K populations. If the population means are denoted by $\mu_1, \mu_2, \ldots, \mu_K$, the one-way analysis of variance framework is designed to test the null hypothesis

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_K$$

$$H_1: \mu_i \ne \mu_j \ \text{for at least one pair } (i, j)$$


Sample Observations from Independent Random Samples of K Populations

Population            1       2       . . .   K

Mean                  μ1      μ2      . . .   μK

Variance              σ²      σ²      . . .   σ²      (same!)

Sample observations   x11     x21     . . .   xK1
from the population   x12     x22     . . .   xK2
                      .       .               .
                      .       .               .
                      x1n1    x2n2    . . .   xKnK

Sample size           n1      n2      . . .   nK      (unequal!)

In general, the K samples have unequal numbers of observations. The total number of observations is nT = n1 + … + nK.


Sum of Squares Decomposition for one-way analysis of variance

Suppose that we have independent samples of n1, n2, . . ., nK observations from K populations.

Denote by $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K$ the K group sample means and by $\bar{x}$ the overall sample mean. We define the following sums of squares:

Sum of Squares Error (Within-Groups):

$$SSE = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

Sum of Squares Treatment (Between-Groups):

$$SST = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2$$

Sum of Squares Total:

$$SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$

where $x_{ij}$ denotes the jth sample observation in the ith group.


A Numerical Example of Sum of Squares Decomposition

Population                  1          2             3

Mean                        μ1         μ2            μ3

Variance                    σ²         σ²            σ²

Sample obs from the
population (xij)            1, 2, 3    2, 3, 4, 5    1, 3, 5

Sample size (ni)            3          4             3

Sample mean                 2          3.5           3

Grand mean                  2.9

$$SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = [(1-2.9)^2 + \cdots + (3-2.9)^2] + \cdots + [(1-2.9)^2 + \cdots + (5-2.9)^2] = 18.9$$

$$SST = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 = 3(2-2.9)^2 + 4(3.5-2.9)^2 + 3(3-2.9)^2 = 3.9$$

$$SSE = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = [(1-2)^2 + \cdots + (3-2)^2] + \cdots + [(1-3)^2 + \cdots + (5-3)^2] = 15$$

SSTotal = SST + SSE
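The decomposition in this numerical example can be reproduced directly from the ten observations. The following is a minimal sketch in plain Python; the variable names are illustrative and not part of the lecture.

```python
# The three samples from the numerical example.
groups = [[1, 2, 3], [2, 3, 4, 5], [1, 3, 5]]

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)          # overall sample mean
group_means = [sum(g) / len(g) for g in groups]   # one mean per group

# Within-groups: deviations of each observation from its own group mean.
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# Between-groups: deviations of group means from the grand mean,
# weighted by group size.
sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Total: deviations of each observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

print(f"grand mean = {grand_mean:.1f}")
print(f"SSE = {sse:.1f}, SST = {sst:.1f}, SSTotal = {ss_total:.1f}")
```

Up to floating-point rounding, the three quantities agree with the values 15, 3.9, and 18.9 on the slide, and SSTotal equals SST + SSE.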


A proof of SSTotal = SST + SSE

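The samples are laid out as before: population i (i = 1, 2, . . ., K) contributes the observations $x_{i1}, x_{i2}, \ldots, x_{in_i}$, with sample sizes $n_1, n_2, \ldots, n_K$.

$$
\begin{aligned}
SSTotal &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} [(x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x})]^2 \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} [(x_{ij} - \bar{x}_i)^2 + (\bar{x}_i - \bar{x})^2 + 2 (x_{ij} - \bar{x}_i)(\bar{x}_i - \bar{x})] \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} \sum_{j=1}^{n_i} (\bar{x}_i - \bar{x})^2 + 2 \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)(\bar{x}_i - \bar{x}) \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 + 2 \sum_{i=1}^{K} (\bar{x}_i - \bar{x}) \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) \\
&= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 + \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 \\
&= SSE + SST
\end{aligned}
$$

The cross term vanishes because, within each group,

$$\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i) = \sum_{j=1}^{n_i} x_{ij} - \sum_{j=1}^{n_i} \bar{x}_i = n_i \bar{x}_i - n_i \bar{x}_i = 0$$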


Two Ways to estimate the population variance

Note that the variance is assumed to be identical across populations.

If the population means are identical, we have two ways to estimate the population variance:

Based on the K sample variances.

Based on the deviation of the K sample means from the grand mean.


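An estimate of the population variance based on sample variances

Any one of the K sample variances can be used to estimate the population variance:

$$\hat{\sigma}^2 = s_i^2 = \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 / (n_i - 1)$$

We can get a more precise estimate if we use all the information from the K samples:

$$
\begin{aligned}
\hat{\sigma}^2 &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \Big/ \sum_{i=1}^{K} (n_i - 1) \\
&= \sum_{i=1}^{K} (n_i - 1) s_i^2 \Big/ \Big( \sum_{i=1}^{K} n_i - K \Big) \\
&= SSE / (n_T - K)
\end{aligned}
$$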


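An estimate of the population variance based on the deviation of the K sample means from the grand sample mean

If the sample sizes are the same for all samples (say n), the Central Limit Theorem suggests that each sample mean will be distributed normally, with mean equal to the population mean and variance equal to the population variance divided by the sample size. The spread of the K sample means around the grand mean therefore estimates $\sigma^2 / n$, which gives

$$\hat{\sigma}^2 = n \sum_{i=1}^{K} (\bar{x}_i - \bar{x})^2 / (K - 1)$$

When sample sizes differ across samples, each squared deviation has to be weighted by its own sample size:

$$\hat{\sigma}^2 = \sum_{i=1}^{K} n_i (\bar{x}_i - \bar{x})^2 / (K - 1) = SST / (K - 1)$$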


Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of the ratio of the two variance estimates follows the F distribution with K - 1 and nT - K degrees of freedom.

$$F\text{-stat} = \frac{SST / (K - 1)}{SSE / (\sum_{i=1}^{K} n_i - K)} = \frac{SST / (K - 1)}{SSE / (n_T - K)}$$

If the means of the K populations are not equal, the value of F-stat will be inflated because SST/(K-1) will overestimate $\sigma^2$.

Hence, we will reject H0 if the resulting value of F-stat appears to be too large to have been selected at random from the appropriate F distribution.


Test for the Equality of k Population Means

Hypotheses

H0: µ1 = µ2 = µ3 = … = µK

H1: Not all population means are equal

Test Statistic

F = [SST/(K-1)] / [SSE/(nT-K)]

Rejection Rule

Reject H0 if F > Fα

where the value of Fα is based on an F distribution with K - 1 numerator degrees of freedom and nT - K denominator degrees of freedom.


Sampling Distribution of MST/MSE

[Figure: sampling distribution of MST/MSE. The region to the left of the critical value Fα is labelled "Do Not Reject H0"; the region to the right is labelled "Reject H0".]

The figure shows the rejection region associated with a level of significance equal to α, where Fα denotes the critical value.


The ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F
Treatment             SST              K-1                  MST            MST/MSE
Error                 SSE              nT-K                 MSE
Total                 SSTotal          nT-1


Does learning method affect students’ exam scores?

Consider 3 methods:

standard

osmosis

shock therapy

Convince 15 students to take part. Assign 5 students randomly to each method.

Wait eight weeks. Then, test students to get exam scores.

Are the three learning methods equally effective? That is, are the population means of exam scores the same?


“Analysis of Variance” (Study #1)

The variation between the group means and the grand mean is larger than the variation within each of the groups.


ANOVA Table for Study #1

One-way Analysis of Variance

Source   DF   SS       MS       F       P
Factor    2   2510.5   1255.3   93.44   0.000
Error    12    161.2     13.4
Total    14   2671.7

“Source” means “find the components of variation in this column”

“DF” means “degrees of freedom”

“SS” means “sums of squares”

“MS” means “mean square”

“F” means “F test statistic”

“P” means “P-value”


“Factor” means “Variability between groups” or “Variability due to the factor of interest”

“Error” means “Variability within groups” or “unexplained random variation”

“Total” means “Total variation from the grand mean”


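How the entries of the Study #1 table relate to each other:

14 = 2 + 12

2671.7 = 2510.5 + 161.2

1255.3 = 2510.5/2 (rounded)

13.4 = 161.2/12 (rounded)

93.44 = 1255.3/13.4 (computed from the unrounded mean squares)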
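The relationships among the entries of the Study #1 table can be verified numerically. The sketch below takes only the DF and SS columns from the table and recomputes the rest; the variable names are illustrative.

```python
# DF and SS columns of the Study #1 ANOVA table.
df_factor, df_error = 2, 12
ss_factor, ss_error = 2510.5, 161.2

# Totals are the sums of the Factor and Error rows.
df_total = df_factor + df_error
ss_total = ss_factor + ss_error

# A mean square is a sum of squares divided by its degrees of freedom.
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error

# The F statistic is the ratio of the two mean squares (unrounded).
f_stat = ms_factor / ms_error

print(f"DF total = {df_total}, SS total = {ss_total:.1f}")
print(f"MS factor = {ms_factor:.2f}, MS error = {ms_error:.2f}, F = {f_stat:.2f}")
```

Note that dividing the rounded mean squares (1255.3/13.4) gives roughly 93.7; the table's 93.44 comes from the unrounded values, which is why the sketch keeps full precision until printing.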


“Analysis of Variance” (Study #2)

The variation between the group means and the grand mean is smaller than the variation within each of the groups.


ANOVA Table for Study #2

One-way Analysis of Variance

Source   DF   SS       MS     F      P
Factor    2     80.1   40.1   0.46   0.643
Error    12   1050.8   87.6
Total    14   1130.9

The P-value is large, so we cannot reject the null hypothesis. There is insufficient evidence to conclude that the average exam scores differ for the three learning methods.


Do Holocaust survivors have more sleep problems than others?


ANOVA Table for Sleep Study

One-way Analysis of Variance

Source   DF    SS       MS      F       P
Factor     2   1723.8   861.9   61.69   0.000
Error    117   1634.8    14.0
Total    119   3358.6

The P-value is so small that we reject the null hypothesis of equal population means in favor of the alternative hypothesis that at least one pair of population means differs.


Potential problem with the analysis

What is driving the rejection of the null of equal population means?

From the plot, the Healthy and Depressed groups seem to have different mean sleep quality. It looks as though the rejection is due to the difference between these two groups.

If we pooled Healthy and Depressed, the pooled distribution would look more like that of Survivor. That is, failing to reject the null would become more likely.

This example illustrates that we have to be careful about our analysis and interpretation of the result when we conduct a test of equal population means.


EXAMPLE 2

Rosenbaum Restaurants specialize in meals for senior citizens. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu she decides to test it in several of her restaurants. She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.


Example 2 continued

Number of dinners sold per day

Obs   Aynor   Loris   Lander
1     13      10      18
2     12      12      16
3     14      13      17
4     12      11      17
5                     17


EXAMPLE 2 continued

Step 1: H0: µ1 = µ2 = µ3

H1: Treatment means are not the same

Step 2: H0 is rejected if F>4.10. There are 2 df in the numerator and 10 df in the denominator.


Example 2 continued

To find the value of F:

Source      SS      df   MS       F       p-value
Treatment   76.25    2   38.125   39.10   1.87E-05
Error        9.75   10    0.975
Total       86.00   12

The decision is to reject the null hypothesis. The treatment means are not the same. The mean number of meals sold at the three locations is not the same.
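The ANOVA table for Example 2 can be rebuilt from the raw sales figures. The function below is a minimal sketch of the one-way formulas from the earlier slides; `one_way_anova` is a hypothetical helper written for this illustration, and the critical value 4.10 is the one quoted in Step 2 rather than a computed quantile.

```python
def one_way_anova(groups):
    """Return SST, SSE, degrees of freedom, mean squares and F for K samples."""
    all_obs = [x for g in groups for x in g]
    n_total, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups (treatment) and within-groups (error) sums of squares.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mst = sst / (k - 1)
    mse = sse / (n_total - k)
    return {"SST": sst, "SSE": sse, "df": (k - 1, n_total - k),
            "MST": mst, "MSE": mse, "F": mst / mse}


# Dinners sold per day at Aynor, Loris and Lander (Example 2).
sales = [[13, 12, 14, 12], [10, 12, 13, 11], [18, 16, 17, 17, 17]]
result = one_way_anova(sales)

f_crit = 4.10  # alpha = .05 with (2, 10) df, copied from the slide
reject_h0 = result["F"] > f_crit

print(f"SST = {result['SST']:.2f}, SSE = {result['SSE']:.2f}, df = {result['df']}")
print(f"MST = {result['MST']:.3f}, MSE = {result['MSE']:.3f}, F = {result['F']:.2f}")
print(f"reject H0: {reject_h0}")
```

The sketch handles unequal sample sizes because each group mean is weighted by its own group size, as in the SST formula.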


Inferences About Treatment Means

When we reject the null hypothesis that the means are equal, we may want to know which treatment means differ.

One of the simplest procedures is through the use of confidence intervals.


Confidence Interval for the Difference Between Two Means

$$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

where t is obtained from the t table with (nT - K) degrees of freedom.

The degrees of freedom are (nT - K) because MSE = SSE/(nT - K).


EXAMPLE 3

From EXAMPLE 2 develop a 95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor. Can Katy conclude that there is a difference between the two restaurants?

$$(17 - 12.75) \pm 2.228 \sqrt{0.975 \left( \frac{1}{4} + \frac{1}{5} \right)} = 4.25 \pm 1.48 = (2.77,\ 5.73)$$

Because zero is not in the interval, we conclude that this pair of means differs.

The mean number of meals sold in Aynor is different from Lander.
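The interval in Example 3 can be recomputed as follows. This is a sketch under the example's own numbers: the t value 2.228 (10 degrees of freedom, 95% confidence) and MSE = 0.975 are taken from the slides, and the variable names are illustrative.

```python
import math

# Sample means and sizes from Example 2.
mean_lander, n_lander = 17.0, 5
mean_aynor, n_aynor = 12.75, 4

mse = 0.975      # mean square error from the ANOVA table
t_crit = 2.228   # t table value for 95% confidence with nT - K = 10 df

diff = mean_lander - mean_aynor
# Half-width of the interval: t * sqrt(MSE * (1/n1 + 1/n2)).
margin = t_crit * math.sqrt(mse * (1 / n_lander + 1 / n_aynor))
lower, upper = diff - margin, diff + margin

# The pair of means is judged different when the interval excludes zero.
means_differ = not (lower <= 0 <= upper)

print(f"{diff:.2f} +/- {margin:.2f} = ({lower:.2f}, {upper:.2f})")
print(f"means differ: {means_differ}")
```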


Two-Factor ANOVA

For the two-factor ANOVA we test whether there is a significant difference among the treatment means and whether there is a significant difference among the block means.


Sample Observations from Independent Random Samples of K Populations

                 TREATMENT
BLOCK    1      2      . . .   K
1        x11    x21    . . .   xK1
2        x12    x22    . . .   xK2
.        .      .              .
.        .      .              .
B        x1B    x2B    . . .   xKB


Sum of Squares Decomposition for Two-Way Analysis of Variance

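Suppose that we have a sample of observations with $x_{ij}$ denoting the observation in the ith group and jth block. Suppose that there are K groups and B blocks, for a total of n = KB observations. Denote the group sample means by $\bar{x}_{i\cdot}$ (i = 1, 2, . . ., K), the block sample means by $\bar{x}_{\cdot j}$ (j = 1, 2, . . ., B), and the overall sample mean by $\bar{x}$.

$$SSTotal = \sum_{i=1}^{K} \sum_{j=1}^{B} (x_{ij} - \bar{x})^2$$

$$SST = B \sum_{i=1}^{K} (\bar{x}_{i\cdot} - \bar{x})^2$$

$$SSB = K \sum_{j=1}^{B} (\bar{x}_{\cdot j} - \bar{x})^2$$

$$SSE = \sum_{i=1}^{K} \sum_{j=1}^{B} (x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2$$

SSTotal = SSE + SST + SSB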


General Format of Two-Way Analysis of Variance Table

Source of Variation   Sums of Squares   Degrees of Freedom   Mean Squares               F Ratios
Treatments            SST               K-1                  MST = SST/(K-1)            MST/MSE
Blocks                SSB               B-1                  MSB = SSB/(B-1)            MSB/MSE
Error                 SSE               (K-1)(B-1)           MSE = SSE/[(K-1)(B-1)]
Total                 SSTotal           nT-1


EXAMPLE 4

The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift.

At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?


EXAMPLE 4 continued

Employee    Day Output   Evening Output   Night Output
McCartney   31           25               35
Neary       33           26               33
Schoen      28           24               30
Thompson    30           29               28
Wagner      28           26               27


EXAMPLE 4 continued

TREATMENT EFFECT

Step 1: H0: µ1 = µ2 = µ3 versus H1: Not all means are equal.

Step 2: H0 is rejected if F > 4.46. The degrees of freedom are 2 and 8.


Example 4 continued

Step 3: Compute the various sums of squares:

Source       SS       df   MS       F      p-value
Treatments    62.53    2   31.267   5.75   .0283
Blocks        33.73    4    8.433   1.55   .2762
Error         43.47    8    5.433
Total        139.73   14

Step 4: H0 is rejected. There is a difference in the mean number of units produced for the different time periods.


EXAMPLE 4 continued

BLOCK EFFECT

Step 1: H0: µ1 = µ2 = µ3 = µ4 = µ5 versus H1: Not all means are equal. Here the means are the mean outputs of the five employees.

Step 2: H0 is rejected if F > 3.84. The degrees of freedom are 4 and 8.

Step 3: F = [33.73/4]/[43.47/8] = 1.55

Step 4: H0 is not rejected. There is no significant difference in the average number of units produced by the different employees.
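Both F ratios in Example 4 can be reproduced from the output table with the two-way sums of squares defined earlier. The sketch below treats shifts as treatments and employees as blocks; the variable names are illustrative, and the critical values 4.46 and 3.84 are copied from the slides rather than computed.

```python
# Units produced per employee on the Day, Evening and Night shifts (Example 4).
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}

rows = list(output.values())          # one row per block (employee)
B, K = len(rows), len(rows[0])        # B blocks, K treatments (shifts)

grand = sum(sum(r) for r in rows) / (K * B)
treat_means = [sum(r[i] for r in rows) / B for i in range(K)]
block_means = [sum(r) / K for r in rows]

sst = B * sum((m - grand) ** 2 for m in treat_means)
ssb = K * sum((m - grand) ** 2 for m in block_means)
# Error: what is left after removing both the treatment and block effects.
sse = sum((rows[j][i] - treat_means[i] - block_means[j] + grand) ** 2
          for j in range(B) for i in range(K))
ss_total = sum((x - grand) ** 2 for r in rows for x in r)

mse = sse / ((K - 1) * (B - 1))
f_treat = (sst / (K - 1)) / mse
f_block = (ssb / (B - 1)) / mse

reject_treat = f_treat > 4.46   # alpha = .05 with (2, 8) df, from the slide
reject_block = f_block > 3.84   # alpha = .05 with (4, 8) df, from the slide

print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SSE = {sse:.2f}, SSTotal = {ss_total:.2f}")
print(f"F(treatments) = {f_treat:.2f}, reject H0: {reject_treat}")
print(f"F(blocks) = {f_block:.2f}, reject H0: {reject_block}")
```

Computing SSE directly from the residuals, rather than by subtraction, also gives a check on the identity SSTotal = SSE + SST + SSB.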


- END -
