Statistics 571: Statistical Methods Ramón V....


7/23/2003 Unit 12 - Stat 571 - Ramón V. León

Unit 12: Analysis of Single Factor Experiments

Statistics 571: Statistical Methods
Ramón V. León


Introduction

• Chapter 8: How to compare two treatments.
• Chapter 12:
– How to compare more than two treatments
– Limited to a single treatment factor
• Example of single factor experiment:
– Compare the flight distances of three types of golf balls differing in the shape of dimples on them: circular, fat elliptical, and thin elliptical
– Treatment factor: type of ball
– Factor levels: circular, fat elliptical, and thin elliptical
– Treatments: circular, fat elliptical, and thin elliptical

• How would an experiment with more than one treatment factor look?



Experimental Designs

                      Two Treatments               More Than Two Treatments
Independent Samples   Independent Samples Design   Completely Randomized Design
Dependent Samples     Matched Pair Design          Randomized Block Design


Completely Randomized Design

Random sample drawn in each of six molding stations. Runs should be in random order to protect against time trend.



Completely Randomized Design Notation

$N = \sum_{i=1}^{a} n_i$

If the sample sizes are equal, the design is balanced; otherwise the design is unbalanced.


Completely Randomized Design: Comments

• In a CRD the experimental units are randomly assigned to each treatment

• Similar data also arise in observational studies, where the units are not assigned to the different groups by the investigator

• Stronger conclusions are possible with experimental data



Completely Randomized Design Data Inspection

Nominal Variable


CRD Side-by-Side Box Plots

[Figure: side-by-side box plots of Weights (axis from 51 to 52.5) by Station (1 to 6)]

Station 5 has two outliers.

Stations 4, 5, and 6, which are supplied by feeder 2, have a higher average as a group than stations 1, 2, and 3, which are supplied by feeder 1. Is this difference real or the result of sampling variation?



CRD Model and Estimation

Model assumption: the data on the i-th treatment are a random sample from an $N(\mu_i, \sigma^2)$ population:

$Y_{ij} = \mu_i + \varepsilon_{ij} \quad (i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n_i)$

where the $\varepsilon_{ij}$ are independent and identically distributed (i.i.d.) $N(0, \sigma^2)$ random errors.


CRD Model and Estimation

The treatment means $\mu_i$ and the error variance $\sigma^2$ are unknown parameters. The primary interest is in comparing the means $\mu_i$.

Frequently, we write $\mu_i = \mu + \tau_i$, where $\mu$ is the "grand mean" defined as the weighted average of the $\mu_i$:

$\mu = \frac{\sum_{i=1}^{a} n_i \mu_i}{\sum_{i=1}^{a} n_i} = \frac{\sum_{i=1}^{a} \mu_i}{a}$ if the $n_i$ are equal,

and $\tau_i = \mu_i - \mu$ is the deviation of the i-th treatment mean from this grand mean. We refer to $\tau_i$ as the i-th treatment effect.



CRD Model and Estimation

Alternative Formulation of the Model:

$Y_{ij} = \mu + \tau_i + \varepsilon_{ij} \quad (i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n_i)$

The $\tau_i$ are subject to the constraint:

$\sum_{i=1}^{a} n_i \tau_i = 0 \quad \left(\sum_{i=1}^{a} \tau_i = 0 \text{ if the } n_i \text{ are equal}\right)$

So there are only $a - 1$ linearly independent $\tau_i$'s.
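As a small numerical check of the alternative formulation, the sketch below computes the weighted grand mean and the treatment effects and confirms that the weighted sum of the effects is zero. The group sizes and treatment means are hypothetical values chosen only for illustration; they do not come from the slides.

```python
# Hypothetical unbalanced design: group sizes n_i and treatment means mu_i.
n = [4, 6, 5]
mu_i = [10.0, 12.0, 15.0]

N = sum(n)

# Grand mean: weighted average of the treatment means, with weights n_i.
mu = sum(ni * mi for ni, mi in zip(n, mu_i)) / N

# Treatment effects: tau_i = mu_i - mu.
tau = [mi - mu for mi in mu_i]

# Constraint from the model: sum of n_i * tau_i is zero (up to rounding).
weighted_sum = sum(ni * ti for ni, ti in zip(n, tau))

# The unweighted sum is generally NOT zero when the n_i differ.
unweighted_sum = sum(tau)

print(f"grand mean = {mu:.4f}")
print(f"sum n_i tau_i = {weighted_sum:.2e}, sum tau_i = {unweighted_sum:.4f}")
```

With equal group sizes the two sums coincide, which is why the balanced case reduces to the simpler constraint that the effects sum to zero.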


CRD Parameter Estimates

$\hat{\sigma}^2 = s^2$: measure of common experimental error



ANOVA in JMP’s Fit Model Platform

Note that the Station variable is nominal



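[JMP Fit Model output, annotated: the parameter estimates are $\hat{\mu}, \hat{\tau}_1, \hat{\tau}_2, \hat{\tau}_3, \hat{\tau}_4, \hat{\tau}_5$, and the error variance estimate is $s^2$]

How do we find the value of $\hat{\tau}_6$?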



Relationship to Dummy Variable Regression

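$z_i = \begin{cases} 1 & \text{if station } i \\ -1 & \text{if station 6} \\ 0 & \text{otherwise} \end{cases} \quad (i = 1, 2, \ldots, 5)$

$y = 51.57 + 0.09 z_1 - 0.23 z_2 - 0.33 z_3 + 0.05 z_4 + 0.13 z_5 + \varepsilon$

$y = \hat{\mu} + \hat{\tau}_1 z_1 + \hat{\tau}_2 z_2 + \hat{\tau}_3 z_3 + \hat{\tau}_4 z_4 + \hat{\tau}_5 z_5 + \varepsilon$

[JMP output annotation: $s^2$]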
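Under this coding, a station 6 observation has every $z_i = -1$, so its fitted mean is $\hat{\mu} - (\hat{\tau}_1 + \cdots + \hat{\tau}_5)$, which suggests $\hat{\tau}_6 = -\sum_{i=1}^{5} \hat{\tau}_i$. The sketch below applies that to the coefficients as read from this slide (51.57, 0.09, -0.23, -0.33, 0.05, 0.13); the variable names are illustrative, not JMP output names.

```python
# Coefficients as read from the fitted dummy-variable regression on the slide.
mu_hat = 51.57
tau_hat = [0.09, -0.23, -0.33, 0.05, 0.13]  # tau_1 ... tau_5

# Station 6 is coded z_1 = ... = z_5 = -1, so its effect is minus the sum.
tau6_hat = -sum(tau_hat)

# Fitted station means: mu_hat + tau_i for stations 1..6.
station_means = [mu_hat + t for t in tau_hat + [tau6_hat]]

print(f"tau_6 estimate: {tau6_hat:.2f}")
for station, mean in enumerate(station_means, start=1):
    print(f"station {station}: fitted mean {mean:.2f}")
```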



CRD (1-α)-level Confidence Interval

$\bar{y}_i - t_{N-a,\,\alpha/2} \frac{s}{\sqrt{n_i}} \le \mu_i \le \bar{y}_i + t_{N-a,\,\alpha/2} \frac{s}{\sqrt{n_i}}$

However, usually we are more interested in comparing the $\mu_i$ with each other than in estimating them separately.

Fit Y by X:


Mean Diamonds in JMP

Why do all the diamonds have the same height?
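One way to think about this question: if each diamond spans the interval $\bar{y}_i \pm t_{N-a,\alpha/2}\, s/\sqrt{n_i}$ with the pooled $s$ (an assumption about how the plot is drawn, not something stated on the slide), then the half-width depends on the group only through $n_i$. The sketch below uses hypothetical values for $s$ and for the critical value, which would normally be looked up in a t table.

```python
import math


def ci_half_width(s, n_i, t_crit):
    """Half-width of the CI for one treatment mean, using the pooled s."""
    return t_crit * s / math.sqrt(n_i)


s = 0.5        # hypothetical pooled standard deviation
t_crit = 2.0   # placeholder for t_{N-a, alpha/2}; look up the real value

# Balanced design: every group has the same n_i, so every half-width matches.
balanced = [ci_half_width(s, n_i, t_crit) for n_i in [8, 8, 8]]

# Unbalanced design: larger groups get narrower intervals.
unbalanced = [ci_half_width(s, n_i, t_crit) for n_i in [4, 8, 16]]

print("balanced:  ", [round(h, 4) for h in balanced])
print("unbalanced:", [round(h, 4) for h in unbalanced])
```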



Analysis of Variance

Homogeneity Hypothesis:

$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$ vs. $H_1$: Not all the $\mu_i$ are equal.
$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ vs. $H_1$: At least some $\tau_i \neq 0$.

Note: SSA = treatment sum of squares
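The F-test of the homogeneity hypothesis rests on splitting the total sum of squares into a treatment part (SSA) and an error part (SSE). The following is a minimal sketch of that decomposition in plain Python; the three small groups are made-up data, and the p-value step (which needs an F distribution) is left out.

```python
# Hypothetical data: three treatments, three observations each.
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]

a = len(groups)                      # number of treatments
N = sum(len(g) for g in groups)      # total number of observations

grand_mean = sum(sum(g) for g in groups) / N
group_means = [sum(g) / len(g) for g in groups]

# Treatment sum of squares: between-group variation.
ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Error sum of squares: within-group variation.
sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

# Total sum of squares, computed directly as a check on SSA + SSE.
sst = sum((y - grand_mean) ** 2 for g in groups for y in g)

msa = ssa / (a - 1)      # treatment mean square, a - 1 d.f.
mse = sse / (N - a)      # error mean square (s^2), N - a d.f.
f_stat = msa / mse

print(f"SSA={ssa:.2f} SSE={sse:.2f} SST={sst:.2f} F={f_stat:.2f}")
```

The F statistic would then be compared with the upper critical point of the F distribution with $a - 1$ and $N - a$ degrees of freedom.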


ANOVA in JMP

Wrong ANOVA table (Model: $Y = \beta_0 + \beta_1\,\text{Station} + \varepsilon$):

Note that the SS has the wrong number of degrees of freedom

Correct ANOVA table (Model: $Y = \mu + \tau_1 z_1 + \tau_2 z_2 + \tau_3 z_3 + \tau_4 z_4 + \tau_5 z_5 + \varepsilon$):



Model Diagnostics: Residuals versus Fitted Value

$e_{ij} = y_{ij} - \bar{y}_i$

Part of “Fit Model” output. This plot checks the assumption of constant error variance $\sigma^2$.

A cone shape in this plot would suggest a log transformation of the response.


Model Diagnostic: Assumption of Equal Variances (More Formal Tests)



Model Diagnostics: Residual Versus Row (Time?) Order

A time pattern here would be confounded with a station effect. The JMP table should be in the random order in which the data are supposed to have been collected.

Fit Model Platform:


Model Diagnostics: Normal Plot of Residuals

Strong indication that errors are normally distributed.



Multiple Comparison of Means

If $H_0: \mu_1 = \cdots = \mu_a$ is rejected, all that we can say is that the treatment means are not all equal. The F-test does not pinpoint which treatment means are significantly different from each other.

We could test all pairwise equality hypotheses $H_{0ij}: \mu_i = \mu_j$:

Reject $H_{0ij}$ if $|t_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} > t_{N-a,\,\alpha/2}$

$\Leftrightarrow\ |\bar{y}_i - \bar{y}_j| > t_{N-a,\,\alpha/2}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} =$ Least significant difference, LSD


Pairwise Equality Hypotheses

Since each of the 15 pairwise tests has level α, the type I error probability of declaring at least one pairwise difference falsely significant will exceed α.

Familywise error rate (FWE): FWE = P{Reject at least one true null hypothesis}

If all six means are actually equal in the plastic container example, FWE = 0.350 when each LSD test is done at the 0.05 level.

Fisher’s protected LSD method: Use the LSD method only after the F-test rejects. (This method is not recommended today.)
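To see where the count of 15 tests comes from and why the familywise rate climbs above α, the sketch below counts the pairs among six treatments and computes two reference values: the FWE that would hold if the 15 tests were independent, and the Bonferroni upper bound. Both are only reference points. The LSD tests share the same sample means and the same pooled s, so they are not independent, which is consistent with the slide's figure of 0.350 differing from the independence value.

```python
import math

a = 6          # number of treatments (six stations)
alpha = 0.05   # level of each individual LSD test

# Number of pairwise comparisons among a treatments.
m = math.comb(a, 2)

# FWE if the m tests were independent (they are not; reference value only).
fwe_if_independent = 1 - (1 - alpha) ** m

# Bonferroni upper bound on the FWE, valid under any dependence.
bonferroni_bound = min(1.0, m * alpha)

print(f"pairs: {m}")
print(f"FWE under independence: {fwe_if_independent:.3f}")
print(f"Bonferroni bound:       {bonferroni_bound:.3f}")
```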



LSD Method in JMP

Overlap Marks

If the overlap marks overlap, the two means are not significantly different according to the LSD criterion.


LSD Method in JMP

Fit Y by X JMP platform:



Tukey Method

Recommended method: FWE = α if the sample sizes are equal, and the method is slightly conservative (i.e., the actual FWE is < α) when the sample sizes are unequal.


This report shows the ranked differences, from highest to lowest, with a confidence interval band overlaid on the plot. Confidence intervals that do not fully contain their corresponding bar are significantly different from each other.



Tukey Method Confidence Intervals

This is a way of constructing 100(1-α)% Simultaneous Confidence Intervals (SCIs) for all pairwise differences of means.


Tukey Method Confidence Intervals

Compare to the Minitab output at the bottom of Figure 12.6 of your textbook. How would you get the top output in that figure?



Dunnett Method for Comparisons with a Control


Dunnett Method in JMP



Hsu Method for Comparison with the Best


[Figure: Box Plots for Teaching Method. Test Score (axis from 10 to 40) by Method: Case, Equation, Formula, Unitary Analysis]



Hsu Method in JMP

Explanation Next Page


Hsu Method in JMP

The Unitary Method is best

Can’t tell which is the worst method



Randomized Block Design

• Blocking helps to reduce experimental error variation caused by differences in the experimental units by grouping them into homogeneous sets (called blocks).
• Treatments are randomly assigned within each block


Randomized Block Design Model: Fixed Block Effects

$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \ldots, a;\ j = 1, \ldots, b)$

where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$

$\mu$ is called the grand mean
$\tau_i$ is called the $i$th treatment effect
$\beta_j$ is called the $j$th block effect

$\sum_{i=1}^{a} \tau_i = 0$ and $\sum_{j=1}^{b} \beta_j = 0$, so there are $a - 1$ independent treatment effects and $b - 1$ independent block effects



Mystery of Degrees of Freedom Explained

Counting the grand mean, there are $1 + (a - 1) + (b - 1) = a + b - 1$ unknown parameters. (This many degrees of freedom are needed to estimate these parameters.)

There are $N = ab$ observations (total degrees of freedom).

So there are $\nu = ab - (a + b - 1) = (a - 1)(b - 1)$ degrees of freedom for estimating the error variation (degrees of freedom for error).
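The bookkeeping above is easy to check numerically. The helper below is a hypothetical function written for this note, and the values a = 4 and b = 3 are illustrative only.

```python
def rbd_degrees_of_freedom(a, b):
    """Degrees-of-freedom accounting for a randomized block design
    with a treatments, b blocks, and one observation per cell."""
    total = a * b                          # N = ab observations
    parameters = 1 + (a - 1) + (b - 1)     # grand mean + treatments + blocks
    error = total - parameters             # what is left for error
    return {"total": total, "parameters": parameters, "error": error}


df = rbd_degrees_of_freedom(a=4, b=3)
print(df)

# The leftover always factors as (a - 1)(b - 1).
assert df["error"] == (4 - 1) * (3 - 1)
```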


No Interactions Between Treatments and Blocks

The difference in mean responses between any two treatments is the same across all blocks:

$\mu_{ij} - \mu_{i'j} = (\mu + \tau_i + \beta_j) - (\mu + \tau_{i'} + \beta_j) = \tau_i - \tau_{i'}$, which is independent of the particular block $j$.

Example: Consider the treatments to be fertilizers and the blocks to be different fields. Then no interaction implies that the difference in mean yields between any two fertilizers is the same for all fields.

We say that there are no interactions between treatments and blocks.
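A quick numerical illustration of the no-interaction property: build the cell means from an additive model and confirm that the difference between two treatments is the same in every block. The grand mean, treatment effects, and block effects below are hypothetical.

```python
# Hypothetical additive model: mu_ij = mu + tau_i + beta_j.
mu = 10.0
tau = [-1.0, 0.0, 1.0]     # treatment effects (sum to zero)
beta = [-2.0, 0.5, 1.5]    # block effects (sum to zero)

# Cell means: rows are treatments, columns are blocks.
cell_means = [[mu + t + b for b in beta] for t in tau]

# Difference between treatment 1 and treatment 3 within each block.
diffs = [cell_means[0][j] - cell_means[2][j] for j in range(len(beta))]

print(diffs)  # the same value in every block: tau_1 - tau_3
```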



RBD Example

Notice that interest is in the differences among the positions. We assume that these differences are the same for all three batches except for random error; that is, we assume no interaction between batch and position.


JMP Analysis of Drip Loss Experiment

Nominal



JMP Analysis of Drip Loss Experiment

Position and batch explain 86% of the variation in drip loss

SSModel = SSTreatment + SSBlocks. True because we assume no interaction between treatment and block. (See next slide.)
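The additive split of the model sum of squares can be verified on a small balanced table. The sketch below uses a made-up 3-by-3 layout (rows are treatments, columns are blocks), not the drip loss data, and computes the treatment, block, and error sums of squares together with the share of variation explained.

```python
# Hypothetical balanced RBD data: y[i][j] = treatment i in block j.
y = [
    [8.0, 10.0, 12.0],
    [9.0, 12.0, 12.0],
    [13.0, 14.0, 18.0],
]
a, b = len(y), len(y[0])

grand = sum(sum(row) for row in y) / (a * b)
trt_means = [sum(row) / b for row in y]
blk_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
ss_blk = a * sum((m - grand) ** 2 for m in blk_means)
ss_total = sum((v - grand) ** 2 for row in y for v in row)

# Residuals from the additive fit: y_ij - ybar_i. - ybar_.j + ybar_..
ss_err = sum(
    (y[i][j] - trt_means[i] - blk_means[j] + grand) ** 2
    for i in range(a)
    for j in range(b)
)

ss_model = ss_trt + ss_blk
r_squared = ss_model / ss_total

print(f"SS treatment={ss_trt:.1f} blocks={ss_blk:.1f} error={ss_err:.1f}")
print(f"SS total={ss_total:.1f}, R^2={r_squared:.3f}")
```

In this balanced layout the three pieces add up to the total exactly; with missing cells the simple row-mean and column-mean formulas would no longer apply.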


JMP 4 Analysis of Drip Loss Experiment. III

Model SS = 56.654971

These two tables were not the same in regression. They are equal here because the design is balanced.

Also, in regression the sum of the Type III sums of squares is not equal to the model sum of squares. It is only true here because the design is balanced.

(Type III)

Recall: The sum of the Type I sums of squares is always equal to the model sums of squares

The P-values show that there are significant position effects. We recommend ignoring the Block (Batch) test because it is not meaningful for the RBD.



Drip Loss in Meat Loaves: Residual Plots

The predicted versus residual plot is part of the standard output of the Fit Model platform. The normal plot was obtained by saving the residuals and then going to the Distribution platform.


Tukey Method for the RBD

Warning: Don’t use the Fit Y by X platform to do Tukey’s test as you will use the wrong number of degrees of freedom.

Use the Fit Model platform with batch and position in the model. It is important that both variables be included.



Mixed Effects Model for the RB Design

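$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \ldots, a;\ j = 1, \ldots, b)$

where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$

and the $\beta_j$ are i.i.d. $N(0, \sigma_B^2)$

The $\varepsilon_{ij}$ and the $\beta_j$ are independent.

$\mu$ is called the grand mean
$\tau_i$ is called the $i$th treatment effect
The $\beta_j$'s are called the block effects

$\sum_{i=1}^{a} \tau_i = 0$, so there are $a - 1$ independent treatment effects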




Compare with the results in Section 12.4.5, Example 12.16 of your textbook.

The variability due to batches accounts for about 58.4% of the total variability in drip loss.
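One common way to obtain a share of this kind in a balanced randomized block design is the expected-mean-squares (method of moments) estimate, $\hat{\sigma}_B^2 = (MS_{Blocks} - MSE)/a$, with the share taken as $\hat{\sigma}_B^2 / (\hat{\sigma}_B^2 + \hat{\sigma}^2)$. The sketch below implements that calculation; the mean squares passed in are hypothetical, and whether the 58.4% on the slide was computed exactly this way is an assumption.

```python
def block_variance_share(ms_blocks, ms_error, a):
    """Share of total variance attributed to random blocks in a balanced RBD.

    Uses E[MS_blocks] = sigma^2 + a * sigma_B^2 and E[MSE] = sigma^2,
    where a is the number of treatments per block.
    """
    # Method-of-moments estimate, truncated at zero if it comes out negative.
    sigma2_b = max(0.0, (ms_blocks - ms_error) / a)
    sigma2 = ms_error
    return sigma2_b / (sigma2_b + sigma2)


# Hypothetical mean squares: sigma_B^2 estimate = (13 - 1) / 3 = 4.
share = block_variance_share(ms_blocks=13.0, ms_error=1.0, a=3)
print(f"block share of total variance: {share:.1%}")
```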
