Chapter 3 Completely Randomized Designs


Chapter 3 Completely Randomized Designs

Yibi Huang

3.1–3.8 Completely Randomized Designs
3.9 Experiments with Quantitative Factors, Goodness of Fit (Dose-Response Modeling)
• Effects Model for CRD

Ch3 - 1

Definition of a Completely Randomized Design (CRD) (1)

An experiment has a completely randomized design if

• the number of treatments g (including the control, if there is one) is predetermined,

• the number of replicates ni in the ith treatment group is predetermined, i = 1, . . . , g, and

• each allocation of the N = n1 + · · · + ng experimental units into g groups of sizes (n1, . . . , ng) is equally likely.

Say we have 4 units, A, B, C, D, and 2 treatments with 2 units each. A CRD ensures the following allocations occur equally likely:

(AB, CD), (AC, BD), (AD, BC), (BC, AD), (BD, AC), (CD, AB).

Ch3 - 2
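The six equally likely allocations can be enumerated mechanically. The following Python sketch (an illustration, not part of the original slides) lists every split of the 4 units into two groups of 2 and draws one uniformly at random, which is exactly what a CRD requires:

```python
from itertools import combinations
import random

units = ["A", "B", "C", "D"]

# Every way to pick 2 of the 4 units for treatment 1;
# the remaining 2 units receive treatment 2.
allocations = []
for grp1 in combinations(units, 2):
    grp2 = tuple(u for u in units if u not in grp1)
    allocations.append((grp1, grp2))

print(len(allocations))  # 6, matching the list on the slide

# A CRD chooses one of these allocations uniformly at random:
chosen = random.choice(allocations)
```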

Definition of a Completely Randomized Design (CRD) (2)

• Tossing a coin for each of the 20 patients:

  if head −→ treatment, if tail −→ control

  • NOT a CRD, as the number of replications in the 2 groups is not fixed.

• If the patients draw lots, say, from 20 tickets in a hat, 10 of which are marked “treatment”, it is a CRD.

• Tossing a coin for each of the 20 patients to determine who goes to the treatment or the control group, until one of the 2 groups reaches size 10, and then sending all unallocated patients to the other group, is a CRD.

• Say 10 of the patients are men and 10 are women. Randomly assign 5 men and 5 women to the treatment, and the rest to the control.

  • This is a block design (Ch13), NOT a CRD.

  • Both groups will have 5 men and 5 women. Using a CRD, the numbers of men and women in the groups may not be even.

Ch3 - 3


Means Model for a CRD Experiment

Consider an experiment with g treatments, and ni replicates (i.e., ni experimental units) for treatment i, i = 1, 2, . . . , g:

Treatment 1: y11, y12, . . . , y1n1
Treatment 2: y21, y22, . . . , y2n2
...
Treatment g: yg1, yg2, . . . , ygng

The means model is

yij = µi + εij,   i = 1, 2, . . . , g,  j = 1, 2, . . . , ni,

where yij is the response of the jth unit receiving treatment i, µi is the treatment effect, and εij is the error (or noise).

• µi = mean response for the ith treatment

• The error terms εij are assumed to be independent with mean 0 and constant variance σ². Sometimes we further assume that the errors are normal.

Ch3 - 4
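To make the model concrete, here is a minimal Python sketch (my own illustration; the treatment means, group sizes, and σ below are made up) that generates data from a means model:

```python
import random

random.seed(1)  # reproducible example

mu = [1.9, 1.6, 1.4]   # hypothetical treatment means (g = 3)
n = [4, 3, 5]          # hypothetical replicates per treatment
sigma = 0.1            # common error standard deviation

# y[i][j] = mu_i + eps_ij, with eps_ij independent N(0, sigma^2)
y = [[mu_i + random.gauss(0.0, sigma) for _ in range(n_i)]
     for mu_i, n_i in zip(mu, n)]

N = sum(n)  # total number of experimental units
print(N, [len(grp) for grp in y])  # 12 [4, 3, 5]
```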

Dot and Bar Notation

A dot (•) in a subscript means summing over that index, for example

yi• = ∑_j yij,   y•j = ∑_i yij,   y•• = ∑_{i=1}^{g} ∑_{j=1}^{ni} yij.

A bar over a variable, along with a dot (•) in the subscript, means averaging over that index, for example

ȳi• = (1/ni) ∑_{j=1}^{ni} yij,   ȳ•• = (1/N) ∑_{i=1}^{g} ∑_{j=1}^{ni} yij.

Ch3 - 5

Parameter Estimation for the Means Model

The least squares estimates µ̂i are the values of the µi that minimize the sum of squared deviations of the observations yij from their hypothesized means µi under the model,

SS = ∑_{j=1}^{n1} (y1j − µ1)² + ∑_{j=1}^{n2} (y2j − µ2)² + · · · + ∑_{j=1}^{ng} (ygj − µg)².

To minimize SS, we differentiate it with respect to each µi and set the derivative equal to zero:

∂SS/∂µi = −2 ∑_{j=1}^{ni} (yij − µi) = −2 ni (ȳi• − µi) = 0.

The least squares estimate of µi is thus the sample mean of the observations in the corresponding treatment group,

µ̂i = ȳi• = (1/ni) ∑_{j=1}^{ni} yij.

Moreover, the LS estimate ȳi• of µi is unbiased.

Ch3 - 6
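A quick numerical check (my own sketch, using the 175°C resin group from the data slide later in the chapter) that the group mean minimizes the within-group sum of squares:

```python
# Responses in the 175-degree group (from the data slide)
y1 = [2.04, 1.91, 2.00, 1.92, 1.85, 1.96, 1.88, 1.90]

def ss(c, ys):
    """Sum of squared deviations of ys about a candidate mean c."""
    return sum((y - c) ** 2 for y in ys)

ybar = sum(y1) / len(y1)  # least squares estimate = sample mean

# SS at the mean is smaller than at any nearby candidate value
assert ss(ybar, y1) < ss(ybar + 0.01, y1)
assert ss(ybar, y1) < ss(ybar - 0.01, y1)
print(round(ybar, 4))  # 1.9325, matching the group mean 1.933 on the slides
```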

Sum of Squares (1)

The means model for a CRD also has a sum-of-squares identity very similar to SST = SSR + SSE for a regression model. Start from

yij − ȳ•• = (ȳi• − ȳ••) + (yij − ȳi•).

Squaring both sides, we get

(yij − ȳ••)² = (ȳi• − ȳ••)² + (yij − ȳi•)² + 2(ȳi• − ȳ••)(yij − ȳi•).

Summing over the indexes, we get

∑_{i=1}^{g} ∑_{j=1}^{ni} (yij − ȳ••)² = ∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)² + ∑_{i=1}^{g} ∑_{j=1}^{ni} (yij − ȳi•)² + 2 ∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)(yij − ȳi•)
        (SST)                          (SSR = SSTrt)                        (SSE)                      (= 0, see next slide)

In the context of experimental design, SSR is often written as SSTrt, called the treatment sum of squares.

Ch3 - 7

Sum of Squares (2)

Observe that

∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)(yij − ȳi•) = ∑_{i=1}^{g} (ȳi• − ȳ••) ∑_{j=1}^{ni} (yij − ȳi•)

and

∑_{j=1}^{ni} (yij − ȳi•) = yi• − ni ȳi• = yi• − ni (yi•/ni) = 0,

and hence

∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)(yij − ȳi•) = 0.

Ch3 - 8

∑_{i=1}^{g} ∑_{j=1}^{ni} (yij − ȳ••)² = ∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)² + ∑_{i=1}^{g} ∑_{j=1}^{ni} (yij − ȳi•)²
        (SST)                          (SSTrt = SSB)                       (SSE = SSW)

• SST = total sum of squares

  • reflects the total variability in the response over all the units

• SSTrt = treatment sum of squares

  • reflects variability between treatments

  • also called the between-group sum of squares, denoted SSB

• SSE = error sum of squares

  • Observe that SSE = ∑_{i=1}^{g} (ni − 1) si², in which si² = (1/(ni − 1)) ∑_{j=1}^{ni} (yij − ȳi•)² is the sample variance within treatment group i. So SSE reflects the variability within treatment groups.

  • also called the within-group sum of squares, denoted SSW

Ch3 - 9
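The identity SST = SSTrt + SSE can be verified numerically. Here is a small Python sketch on made-up data (three groups of unequal size; the numbers are hypothetical):

```python
# Hypothetical responses for g = 3 treatment groups
groups = [[2.0, 1.9, 2.1], [1.6, 1.7], [1.3, 1.4, 1.2, 1.1]]

N = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / N

sst = sum((y - grand_mean) ** 2 for g in groups for y in g)
ss_trt = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

# The identity SST = SSTrt + SSE holds up to floating-point error
assert abs(sst - (ss_trt + sse)) < 1e-12
```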

Degrees of Freedom

Under the means model yij = µi + εij, with the εij i.i.d. ~ N(0, σ²), it can be shown that

SSE/σ² ~ χ²(N−g).

Further assuming that µ1 = · · · = µg, then

SST/σ² ~ χ²(N−1),   SSTrt/σ² ~ χ²(g−1),

and SSTrt is independent of SSE. Note that the degrees of freedom of the 3 sums of squares,

dfT = N − 1,   dfTrt = g − 1,   dfE = N − g,

break down just like SST = SSTrt + SSE:

dfT = dfTrt + dfE.

Ch3 - 10

Fitted Values, Residuals, and Estimate for σ²

• The fitted value for yij is ŷij = µ̂i = ȳi•.

• The residual for yij is eij = yij − ŷij = yij − ȳi•.

• The estimate for σ² is again the MSE:

  σ̂² = MSE = SSE/dfE = (1/(N−g)) ∑_{i=1}^{g} ∑_{j=1}^{ni} eij² = (1/(N−g)) ∑_{i=1}^{g} ∑_{j=1}^{ni} (yij − ȳi•)².

• The MSE is an unbiased estimator of σ².

• A way to remember dfE = N − g: there are N = n1 + · · · + ng observations in total and g parameters (µ1, . . . , µg) in the means model.

Ch3 - 11

ANOVA F-test and ANOVA Table

To test whether the treatments have different effects,

H0: µ1 = · · · = µg (no difference between treatments)
Ha: the µi's are not all equal (some difference between treatments)

the test statistic is the F-statistic

F = MSTrt/MSE = [SSTrt/(g − 1)] / [SSE/(N − g)],

which, under H0, has an F distribution with g − 1 and N − g degrees of freedom.

Source       Sum of Squares   d.f.    Mean Squares            F0
Treatments   SSTrt            g − 1   MSTrt = SSTrt/(g − 1)   MSTrt/MSE
Errors       SSE              N − g   MSE = SSE/(N − g)
Total        SST              N − 1

Ch3 - 12

Interpretation of the ANOVA F-Statistic

H0: µ1 = · · · = µg (no difference between treatments)
Ha: the µi's are not all equal (some difference between treatments)

F = [SSTrt/(g − 1)] / [SSE/(N − g)] = [SSB/(g − 1)] / [SSW/(N − g)]
  = (Variation Between Groups) / (Variation Within Groups)

The larger the variation between groups relative to the variation within each group, the stronger the evidence toward Ha.

Ch3 - 13

Example — Resin Glue Failure Time — Background

• How do we measure the lifetime of things like computer disk drives, light bulbs, and glue bonds? E.g., a computer drive is claimed to have a lifetime of 800,000 hours (> 90 years). Clearly the manufacturer did not have disks on test for 90 years; how do they make such claims?

• Accelerated life test: Parts under stress (higher load, higher temperature, etc.) will usually fail sooner than parts that are unstressed. By modeling the lifetimes of parts under various stresses, we can estimate (extrapolate to) the lifetime of unstressed parts.

• Example: resin glue failure time

Ch3 - 14

Example — Resin Glue Failure Time¹

• Goal: estimate the lifetime (in hours) of an encapsulating resin for gold-aluminum bonds in integrated circuits operating at 120°C

• Method: accelerated life test

• Design: randomly assign 37 units to one of 5 temperature stresses (in Celsius): 175°, 194°, 213°, 231°, 250°

• Treatments: temperature in Celsius

• Response: Y = log10(time to failure in hours) of the tested material

¹Source: pp. 448–449, Accelerated Testing (Nelson, 2004). Original data provided by Dr. Muhib Khan of AMD.

Ch3 - 15

Example — Resin Glue Failure Time — Data

Temperature (°C)   175    194    213    231    250
Y                 2.04   1.66   1.53   1.15   1.26
                  1.91   1.71   1.54   1.22   0.83
                  2.00   1.42   1.38   1.17   1.08
                  1.92   1.76   1.31   1.16   1.02
                  1.85   1.66   1.35   1.21   1.09
                  1.96   1.61   1.27   1.28   1.06
                  1.88   1.55   1.26   1.17
                  1.90   1.66   1.38

Data file: resin2017.txt

Ch3 - 16

A First Look at the Data

> resin = read.table("resin.txt", header=T)
> str(resin)
'data.frame': 37 obs. of 2 variables:
 $ tempC: int 175 175 175 175 175 175 175 175 194 194 ...
 $ y    : num 2.04 1.91 2 1.92 1.85 1.96 1.88 1.9 1.66 1.71 ...
> plot(resin$tempC, resin$y, ylab="log10(Failure time in hours)",
       xlab="Temperature in Celsius")

[Scatter plot of log10(failure time in hours) vs. temperature in Celsius]

Ch3 - 17

plot(tempC, y, ylab="log10(Failure time in hours)",
     xlab="Temperature in Celsius")                    # Plot A

plot(tempC, y, ylab="log10(Failure time in hours)",
     xlab="Temperature in Celsius", xaxt="n")
axis(side=1, at=c(175, 194, 213, 231, 250))            # Plot B

[Plot A: scatter plot with default x-axis ticks at 180, 200, 220, 240.
 Plot B: the same scatter plot with x-axis ticks at the design points 175, 194, 213, 231, 250.]

Which plot conveys more information?

Ch3 - 18

Example — Resin Glue Failure Time — SSTrt

Temperature (°C)   175    194    213    231    250
yij               2.04   1.66   1.53   1.15   1.26
                  1.91   1.71   1.54   1.22   0.83
                  2.00   1.42   1.38   1.17   1.08
                  1.92   1.76   1.31   1.16   1.02
                  1.85   1.66   1.35   1.21   1.09
                  1.96   1.61   1.27   1.28   1.06
                  1.88   1.55   1.26   1.17
                  1.90   1.66   1.38
ni                   8      8      8      7      6
ȳi•              1.933  1.629  1.378  1.194  1.057

ȳ•• = (1/37)(2.04 + 1.91 + · · · + 1.06) = 1.465

The between-group sum of squares:

SSTrt = ∑_{i=1}^{g} ∑_{j=1}^{ni} (ȳi• − ȳ••)² = ∑_{i=1}^{5} ni (ȳi• − ȳ••)²
      = 8(1.933 − 1.465)² + 8(1.629 − 1.465)² + 8(1.378 − 1.465)²
        + 7(1.194 − 1.465)² + 6(1.057 − 1.465)² ≈ 3.54

Ch3 - 19

Example — Resin Glue Failure Time — SSE

The within-group sums of squares:

∑_{j=1}^{n1} (y1j − ȳ1•)² = (2.04 − 1.933)² + (1.91 − 1.933)² + · · · + (1.90 − 1.933)²
∑_{j=1}^{n2} (y2j − ȳ2•)² = (1.66 − 1.629)² + (1.71 − 1.629)² + · · · + (1.66 − 1.629)²
...
∑_{j=1}^{n5} (y5j − ȳ5•)² = (1.26 − 1.057)² + (0.83 − 1.057)² + · · · + (1.06 − 1.057)²

So

SSE = ∑_{j=1}^{n1} (y1j − ȳ1•)² + ∑_{j=1}^{n2} (y2j − ȳ2•)² + · · · + ∑_{j=1}^{n5} (y5j − ȳ5•)² ≈ 0.294

Ch3 - 20

Example — Resin Glue Failure Time — F-statistic

• The observed F-statistic is

  F0 = [SSTrt/(g − 1)] / [SSE/(N − g)] = [3.54/(5 − 1)] / [0.29/(37 − 5)] ≈ 97.66

• The resulting F-statistic is very large compared to 1. In fact, the p-value is

  P(F4,32 ≥ 97.66) = 1.842 × 10⁻¹⁷ ≪ 0.001

• The data exhibit strong evidence against the null hypothesis that all means are equal.

Ch3 - 21
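The whole ANOVA computation can be reproduced outside R. The Python sketch below uses the data transcribed from the data slide; its values match the R anova() output on the next slide (the 97.66 above differs only because the slide rounds the sums of squares before dividing):

```python
# Resin failure-time data, transcribed from the data slide
data = {
    175: [2.04, 1.91, 2.00, 1.92, 1.85, 1.96, 1.88, 1.90],
    194: [1.66, 1.71, 1.42, 1.76, 1.66, 1.61, 1.55, 1.66],
    213: [1.53, 1.54, 1.38, 1.31, 1.35, 1.27, 1.26, 1.38],
    231: [1.15, 1.22, 1.17, 1.16, 1.21, 1.28, 1.17],
    250: [1.26, 0.83, 1.08, 1.02, 1.09, 1.06],
}

g = len(data)
N = sum(len(ys) for ys in data.values())
grand = sum(sum(ys) for ys in data.values()) / N

ss_trt = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in data.values())
sse = sum((y - sum(ys) / len(ys)) ** 2 for ys in data.values() for y in ys)

f_stat = (ss_trt / (g - 1)) / (sse / (N - g))
print(round(ss_trt, 4), round(sse, 4), round(f_stat, 2))
# 3.5376 0.2937 96.36 -- matching the R anova() output
```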

ANOVA F-Test in R

> lm1 = lm(y ~ as.factor(tempC), data=resin)

> anova(lm1)

Analysis of Variance Table

Response: y

Df Sum Sq Mean Sq F value Pr(>F)

as.factor(tempC) 4 3.5376 0.88441 96.363 < 2.2e-16 ***

Residuals 32 0.2937 0.00918

Ch3 - 22

Always Check Degrees of Freedom

Without as.factor(), R will treat tempC as a numeric variable and fit the simple linear regression model

yij = β0 + β1 tempCi + εij

rather than the means model yij = µi + εij.

> lm2 = lm(y ~ tempC, data=resin)

> anova(lm2)

Analysis of Variance Table

Response: y

Df Sum Sq Mean Sq F value Pr(>F)

tempC 1 3.4593 3.4593 325.41 < 2.2e-16 ***

Residuals 35 0.3721 0.0106

The easiest way to tell whether R fit the intended model is to check the degrees of freedom in the output. In the table above, the d.f. for tempC should be 5 − 1 = 4 rather than 1.

Ch3 - 23

Limitation of ANOVA F-Tests

The ANOVA F-test merely tells us that the glue has different failure times at different temperatures. However, our goal is to predict the lifetime of the glue at a temperature of 120°C.

[Scatter plot of log10(failure time) vs. temperature (Celsius), 120–250°C; red crosses mark the group means]

Ch3 - 24

R Code for the Plot on the Previous Page

lm1 = lm(y ~ tempC, data=resin)
lm2 = lm(y ~ tempC + I(tempC^2), data=resin)
plot(resin$tempC, resin$y, ylab="log10 (Failure time)",
     xlab="Temperature (Celsius)", xlim=c(115,250), ylim=c(1,3.2))
# Add a vertical dashed line at tempC=120
abline(v=120, col=2, lty=2)
# Add the linear regression line
abline(lm1)
# Add the fitted curve for the quadratic model
curve(predict(lm2, newdata=data.frame(tempC=x)), col=4, add=T)
# Mark the group means with red crosses
points(c(175, 194, 213, 231, 250), tapply(resin$y, resin$tempC, mean),
       col=2, pch=4, cex=1.5)

Ch3 - 25

Dose-Response Modeling

In some experiments, the treatments are associated with numerical levels xi such as drug dose, baking time, or temperature. We will refer to such levels as doses.

• The means model yij = µi + εij specifies no relationship between the treatment levels xi and the response y, so it cannot be used to infer the response at a dose x other than those used in the experiment.

• With a quantitative treatment factor, experimenters are usually more interested in how the response is affected by the factor as a function of xi:

  yij = f(xi; θ) + εij,

  e.g.,

  f(xi; β0, β1) = β0 + β1 xi;
  f(xi; β0, β1, β2) = β0 + β1 xi + β2 xi²; or
  f(xi; β0, β1) = β0 + β1 log(xi).

Ch3 - 26

yij = f(xi; θ) + εij

Advantages of dose-response modeling:

• less complex (fewer parameters)

• easier to interpret (sometimes)

• generalizable to doses not included in the experiment

Issues to consider:

• How to choose the function f?

  • One commonly used family of functions f is the polynomials:

    f(xi; β) = β0 + β1 xi + β2 xi² + · · · + βk xi^k,

    but polynomials are NOT always the best choice.

  • For simplicity, we choose the lowest-order polynomial that adequately fits the data.

• How to assess how well f fits the data? . . . . . . Goodness of fit

Ch3 - 27

Polynomial Models

Let ti denote the temperature in Celsius in treatment group i. Consider the following polynomial models for the resin glue data:

Null Model:      yij = µ + εij
Linear Model:    yij = β0 + β1 ti + εij
Quadratic Model: yij = β0 + β1 ti + β2 ti² + εij
Cubic Model:     yij = β0 + β1 ti + β2 ti² + β3 ti³ + εij
Quartic Model:   yij = β0 + β1 ti + β2 ti² + β3 ti³ + β4 ti⁴ + εij

• Every model is nested in the model below it. (Why?)

• Don't skip terms. If a higher-order term is significant, e.g., ti³, then all lower-order terms (1, ti, ti²) have to be kept, even if they are not significant.

• Why no quintic or higher-order models?

Ch3 - 28

In general, for an experiment with g treatment groups, if the treatment factor is numeric, one can fit a polynomial model of degree up to g − 1:

yij = β0 + β1 xi + · · · + β(g−1) xi^(g−1) + εij.

Question: For the resin glue data, what will happen if a quintic model (a polynomial of degree 5) is fitted?

yij = β0 + β1 ti + β2 ti² + β3 ti³ + β4 ti⁴ + β5 ti⁵ + εij

Answer: There exists more than one polynomial of degree 5 passing through the 5 points (175, µ̂1), (194, µ̂2), (213, µ̂3), (231, µ̂4), and (250, µ̂5). Thus the 6 coefficients β0, β1, . . . , β5 CANNOT be uniquely determined.

As a rule of thumb, for an experiment with g treatments, we can fit a model with at most g parameters.

Ch3 - 29
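The non-identifiability can be illustrated numerically (my own sketch): adding any multiple of (t − 175)(t − 194)(t − 213)(t − 231)(t − 250) to a polynomial leaves its values at the five design temperatures unchanged, so the data cannot tell the two polynomials apart.

```python
design = [175, 194, 213, 231, 250]

def bump(t, c=1e-6):
    """c * (t-175)(t-194)(t-213)(t-231)(t-250): a degree-5 term
    that vanishes at every design temperature."""
    prod = 1.0
    for d in design:
        prod *= (t - d)
    return c * prod

# Two different degree-5 polynomials that agree at all design points
# (p1 is a hypothetical fit, not the actual resin model):
p1 = lambda t: 3.0 - 0.01 * t
p2 = lambda t: p1(t) + bump(t)  # differs from p1 off the design points

assert all(abs(p1(t) - p2(t)) < 1e-12 for t in design)
assert abs(p1(120) - p2(120)) > 0.1  # but they disagree at 120
```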


Linear Model (1)

Let's try to fit the linear model yij = β0 + β1 ti + εij.

> lm1 = lm(y ~ tempC, data = resin)

> summary(lm1)

Estimate Std. Error t value Pr(>|t|)

(Intercept) 3.9560075 0.1391174 28.44 <2e-16 ***

tempC -0.0118567 0.0006573 -18.04 <2e-16 ***

---

Residual standard error: 0.1031 on 35 degrees of freedom

Multiple R-squared: 0.9029, Adjusted R-squared: 0.9001

F-statistic: 325.4 on 1 and 35 DF, p-value: < 2.2e-16

• Fitted equation: log10(failure time) = 3.956 − 0.01186 T

• The predicted log10(failure time) at 120° is 3.956 − 0.01186 × 120 ≈ 2.5332, and hence the failure time at 120° is predicted to be 10^2.5332 ≈ 341 hours.

Ch3 - 30
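The fitted line can be reproduced without R via the closed-form least squares formulas. A sketch (data transcribed from the data slide):

```python
# Responses by temperature, transcribed from the data slide
data = {
    175: [2.04, 1.91, 2.00, 1.92, 1.85, 1.96, 1.88, 1.90],
    194: [1.66, 1.71, 1.42, 1.76, 1.66, 1.61, 1.55, 1.66],
    213: [1.53, 1.54, 1.38, 1.31, 1.35, 1.27, 1.26, 1.38],
    231: [1.15, 1.22, 1.17, 1.16, 1.21, 1.28, 1.17],
    250: [1.26, 0.83, 1.08, 1.02, 1.09, 1.06],
}
xs = [t for t, grp in data.items() for _ in grp]
ys = [y for t, grp in data.items() for y in grp]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
# Closed-form simple linear regression: b1 = Sxy / Sxx
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

pred120 = b0 + b1 * 120
print(round(b0, 4), round(b1, 5), round(pred120, 4))
# approx. 3.956, -0.01186, 2.5332 -- matching the R summary(lm1) output
```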

Linear Model (2)

R command for the predicted log10(failure time) along with a 95% prediction interval:

> predict(lm1, newdata=data.frame(tempC=120), interval="prediction")

fit lwr upr

1 2.533201 2.289392 2.777011

[Scatter plot, 120–250°C, with the fitted regression line overlaid]

Superimposing the regression line on the scatter plot, we can see that y is slightly curved in temperature. Using the linear model, the failure time at 120° will be underestimated.

Ch3 - 31


Quadratic Model

> lm2 = lm(y ~ tempC+I(tempC^2), data=resin)

> summary(lm2)

(... part of the output is omitted ...)

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 7.4179987 1.1564331 6.415 2.51e-07 ***

tempC -0.0450981 0.0110542 -4.080 0.000258 ***

I((tempC)^2) 0.0000786 0.0000261 3.011 0.004879 **

---

Residual standard error: 0.09295 on 34 degrees of freedom

Multiple R-squared: 0.9233, Adjusted R-squared: 0.9188

F-statistic: 204.8 on 2 and 34 DF, p-value: < 2.2e-16

• Fitted model: log10(time) = 7.418 − 0.0451 T + 0.0000786 T²

• The predicted log10(time) at 120° is 7.418 − 0.0451 × 120 + 0.0000786 × 120² ≈ 3.138, so the predicted failure time at 120° is 10^3.138 ≈ 1374 hours.

Ch3 - 32

Cubic & Quartic Model

> lm3 = lm(y ~ tempC+I(tempC^2)+I(tempC^3), data = resin)

> summary(lm3)

Estimate Std. Error t value Pr(>|t|)

(Intercept) 6.827e+00 1.299e+01 0.526 0.603

tempC -3.659e-02 1.865e-01 -0.196 0.846

I(tempC^2) 3.815e-05 8.860e-04 0.043 0.966

I(tempC^3) 6.357e-08 1.392e-06 0.046 0.964

Residual standard error: 0.09434 on 33 degrees of freedom

Multiple R-squared: 0.9233, Adjusted R-squared: 0.9164

F-statistic: 132.5 on 3 and 33 DF, p-value: < 2.2e-16

> lm4 = lm(y ~ tempC+I(tempC^2)+I(tempC^3)+I(tempC^4), data = resin)
> summary(lm4)

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 9.699e-01 1.957e+02 0.005 0.996

tempC 7.573e-02 3.750e+00 0.020 0.984

I(tempC^2) -7.649e-04 2.679e-02 -0.029 0.977

I(tempC^3) 2.600e-06 8.459e-05 0.031 0.976

I(tempC^4) -2.988e-09 9.962e-08 -0.030 0.976

Residual standard error: 0.0958 on 32 degrees of freedom

Multiple R-squared: 0.9233, Adjusted R-squared: 0.9138

F-statistic: 96.36 on 4 and 32 DF, p-value: < 2.2e-16

Ch3 - 33

Arrhenius Law

The Arrhenius rate law in thermodynamics says that the log of failure time is linear in the inverse of the absolute (Kelvin) temperature, which equals the Celsius temperature plus 273.15 degrees.

Arrhenius Model: yij = β0 + β1/(Ti + 273.15) + εij

[Scatter plot of log10(failure time) vs. 1/(Celsius temperature + 273.15); the relationship is nearly linear]

Ch3 - 34


> lmarr = lm(y ~ I(1/(tempC+273.15)), data=resin)

> summary(lmarr)

(... some output is omitted ...)

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -4.3120 0.3007 -14.34 3.2e-16 ***

I(1/(tempC + 273.15)) 2783.7764 144.6808 19.24 < 2e-16 ***

Residual standard error: 0.09724 on 35 degrees of freedom

Multiple R-squared: 0.9136, Adjusted R-squared: 0.9112

F-statistic: 370.2 on 1 and 35 DF, p-value: < 2.2e-16

[Scatter plot, 120–250°C, with the fitted linear, quadratic, cubic, quartic, and Arrhenius curves overlaid]

Ch3 - 35

Data Can Distinguish Models Only at Design Points

In addition to the polynomial models and the Arrhenius model, many other models can be considered:

yij = β0 + β1 log(ti) + εij,
yij = β0 + β1 exp(ti) + εij,
yij = β0 + β1 sin(ti) + εij,
yij = β0 + f(ti) + εij.

As we only have observations at five temperatures, 175, 194, 213, 231, 250, the data cannot distinguish between two models

yij = f(ti) + εij  and  yij = g(ti) + εij

if f(t) and g(t) coincide at t = 175, 194, 213, 231, 250, even if f and g behave differently elsewhere.

Ch3 - 36

The Model that Fits the Data Best

If no restriction is placed on f, how well can the model yij = f(ti) + εij possibly fit the data? The least squares method will choose the f that minimizes

∑_i ∑_j (yij − f(ti))².

Recall that, given a list of numbers x1, x2, . . . , xn, the c that minimizes ∑_{i=1}^{n} (xi − c)² is the mean x̄ = (1/n) ∑_{i=1}^{n} xi. Thus the least squares method will choose the f with

f(ti) = ȳi•.

So the smallest SSE a model yij = f(ti) + εij can possibly achieve is

∑_i ∑_j (yij − ȳi•)²,

which is the SSE of the means model yij = µi + εij.

Conclusion: no other model can beat the means model in minimizing the SSE.

Ch3 - 37

Goodness of Fit

As the means model fits the data best, we can assess the goodness of fit of a model yij = f(ti) + εij by comparing it with the means model:

Full Model:    yij = µi + εij
Reduced Model: yij = f(ti) + εij

This comparison is legitimate because any model yij = f(ti) + εij is nested in the means model yij = µi + εij (let µi = f(ti)).

We can use the F-statistic below to compare a reduced model with a full model:

F = [(SSEreduced − SSEfull)/(dfreduced − dffull)] / [SSEfull/dffull]

Ch3 - 38
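The comparison statistic can be computed directly from the residual sums of squares. As a sketch, here it is for the linear-vs-means comparison, using the RSS and residual df values reported by the R anova() output on the next slide:

```python
def gof_f(sse_reduced, df_reduced, sse_full, df_full):
    """Goodness-of-fit F-statistic for a reduced model nested in a full model."""
    num = (sse_reduced - sse_full) / (df_reduced - df_full)
    den = sse_full / df_full
    return num / den

# RSS and residual df reported by anova(lm1, lmmeans) in R:
f = gof_f(sse_reduced=0.37206, df_reduced=35, sse_full=0.29369, df_full=32)
print(round(f, 3))  # 2.846, matching the R output
```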

Goodness of Fit of the Linear Model

Since the linear model (reduced) is nested in the means model (full), using the F-statistic for model comparison we get

> lm1 = lm(y ~ tempC, data = resin) # linear model

> lmmeans = lm(y ~ as.factor(tempC), data = resin) # means model

> anova(lm1,lmmeans)

Analysis of Variance Table

Model 1: y ~ tempC

Model 2: y ~ as.factor(tempC)

Res.Df RSS Df Sum of Sq F Pr(>F)

1 35 0.37206

2 32 0.29369 3 0.07837 2.8463 0.05303 .

The p-value of 0.05303 is moderate evidence that the linear model doesn't fit the data well.

Ch3 - 39

Goodness of Fit of the Quadratic Model

Since the quadratic model (reduced) is also nested in the means model (full), again using the F-statistic for model comparison we get

> lm2 = lm(y ~ tempC+I((tempC)^2), data=resin) # quadratic model

> lmmeans = lm(y ~ as.factor(tempC), data = resin) # means model

> anova(lm2,lmmeans)

Analysis of Variance Table

Model 1: y ~ tempC + I((tempC)^2)

Model 2: y ~ as.factor(tempC)

Res.Df RSS Df Sum of Sq F Pr(>F)

1 34 0.29372

2 32 0.29369 2 2.6829e-05 0.0015 0.9985

The large p-value of 0.9985 shows the quadratic model fits the data nearly as well as the best possible model. Thus, the quadratic model seems appropriate for these data.

Ch3 - 40

Shall We Consider a Cubic or a Quartic Model?

No. Because

Quadratic ⊂ Cubic ⊂ Quartic ⊂ Means Model,

the cubic and quartic models can't fit the data better than the means model does. As the quadratic model already fits the data nearly as well as the means model, the 4 models fit about equally well. In this case we simply choose the model of lowest complexity.

Ch3 - 41

Be Cautious About Extrapolation

[Scatter plot, 120–250°C, with the fitted linear, quadratic, cubic, and quartic curves overlaid]

Though the quadratic, cubic, and quartic models fit the 5 design points nearly equally well, their predicted values at 120°C are quite different:

quadratic > cubic > quartic > linear

Ch3 - 42

95% Prediction Intervals

[Four panels: scatter plots over 120–250°C with 95% prediction bands for the Linear, Quadratic, Cubic, and Quartic models]

Ch3 - 43

Prediction Intervals at 120°C

Observe that the length of the 95% prediction interval increases with the degree of the polynomial.

> predict(lm1, newdata=data.frame(tempC=120), interval="p")
       fit      lwr      upr
1 2.533201 2.289392 2.777011
> predict(lm2, newdata=data.frame(tempC=120), interval="p")
       fit      lwr      upr
1 3.138128 2.674383 3.601874
> predict(lm3, newdata=data.frame(tempC=120), interval="p")
       fit      lwr      upr
1 3.095342 1.132382 5.058303
> predict(lm4, newdata=data.frame(tempC=120), interval="p")
       fit       lwr      upr
1 2.917399 -9.330658 15.16546

Though the quadratic, cubic, and quartic models fit as well as one another within the range of the data points (175◦C–250◦C), outside that range the reliability of their predictions changes drastically.

Ch3 - 44
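To see how fast the reliability degrades, the widths of the four 95% prediction intervals at 120◦C can be compared directly (a Python sketch; the endpoints are copied from the predict() output above):

```python
# Widths of the 95% prediction intervals at 120 C for each polynomial fit.
intervals = {
    "linear":    (2.289392, 2.777011),
    "quadratic": (2.674383, 3.601874),
    "cubic":     (1.132382, 5.058303),
    "quartic":   (-9.330658, 15.16546),
}
for model, (lwr, upr) in intervals.items():
    print(model, round(upr - lwr, 3))
# linear 0.488, quadratic 0.927, cubic 3.926, quartic 24.496:
# each extra polynomial degree inflates the interval severalfold.
```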

Page 52: Chapter 3 Completely Randomized Designs

Since the Arrhenius model is nested in the means model, we can check its goodness of fit.

> anova(lmarr, lmmeans)
Analysis of Variance Table

Model 1: y ~ I(1/(tempC + 273.15))
Model 2: y ~ as.factor(tempC)
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     35 0.33093
2     32 0.29369  3  0.037239 1.3525 0.2749

The moderately large p-value 0.2749 tells us the Arrhenius model is acceptable relative to the best model.

> predict(lmarr, newdata=data.frame(tempC=120), interval="p")
      fit      lwr      upr
1 2.76868 2.525909 3.011451

So the predicted failure time at 120◦C is 10^2.769 = 587.0571 hours, and the 95% prediction interval is (10^2.526, 10^3.011) ≈ (336, 1026) hours.
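The back-transformation from the log10 scale to hours can be sketched as follows (Python; the three values come from the predict() call above):

```python
# Undo the log10 transform to state the prediction in hours.
fit, lwr, upr = 2.76868, 2.525909, 3.011451   # log10(failure time)
print(10 ** fit)              # about 587 hours
print(10 ** lwr, 10 ** upr)   # roughly 336 and 1027 hours
```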

Ch3 - 45

Page 53: Chapter 3 Completely Randomized Designs

95% Prediction Interval Based on the Arrhenius Model

[Figure: the Arrhenius fit for log10(Failure time) vs. Temperature (Celsius) with its 95% prediction band, extrapolated down to 120◦C.]

Ch3 - 46

Page 54: Chapter 3 Completely Randomized Designs

Effects Model for CRD

Ch3 - 47

Page 55: Chapter 3 Completely Randomized Designs

Means Model Is a Multiple Linear Regression Model

For an experiment with g treatments, the Means model

yij = µi + εij

can be written as a multiple linear regression model by defining a dummy variable for each treatment group. The dummy variable for the ith treatment is defined as

Di = { 1 if the experimental unit receives the ith treatment
       0 otherwise

The means model can then be written as a regression model

Yk = µ1D1k + µ2D2k + · · ·+ µgDgk + εk

Note that this regression model has no intercept.
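A minimal numerical sketch of this dummy coding (Python/NumPy with made-up toy data, not the resin data): regressing y on the g dummy columns without an intercept returns exactly the group means.

```python
import numpy as np

# Toy CRD: g = 3 treatments, made-up responses.
y   = np.array([4.0, 5.0, 6.0, 7.0, 9.0, 11.0])
grp = np.array([0,   0,   1,   1,   2,   2])

# Design matrix: one dummy column per treatment, no intercept column.
D = (grp[:, None] == np.arange(3)).astype(float)

# Least squares coefficients are the group means (4.5, 6.5, 10.0).
mu_hat, *_ = np.linalg.lstsq(D, y, rcond=None)
print(mu_hat)
```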

Ch3 - 48

Page 56: Chapter 3 Completely Randomized Designs

In R, putting -1 in the model formula tells R to fit a regression model with no intercept.

> resin = read.table("resin2017.txt", header=T)

> lmmeans = lm(y ~ -1 + as.factor(tempC), data = resin)

> lmmeans

Call:

lm(formula = y ~ -1 + as.factor(tempC), data = resin)

Coefficients:

as.factor(tempC)175 as.factor(tempC)194 as.factor(tempC)213

1.932 1.629 1.377

as.factor(tempC)231 as.factor(tempC)250

1.194 1.057

Recall for the resin glue data, the group means y i• are

Temperature (◦C)  175    194    213    231    250
y i•              1.933  1.629  1.378  1.194  1.057

Observe that the coefficients 1.932, 1.629, etc., are simply the group means y i•.

Ch3 - 49

Page 57: Chapter 3 Completely Randomized Designs

The command as.factor(tempC) tells R to create a dummy variable for each level of the temperature. Without as.factor(), R will fit the model

yij = βti + εij

where ti is the temperature in Celsius for treatment group i .

> lmmeans1 = lm(y ~ -1 + tempC, data = resin)

> lmmeans1

Call:

lm(formula = y ~ -1 + tempC, data = resin)

Coefficients:

tempC

0.006695
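Without the factor coding, the fitted slope is the usual no-intercept least squares estimate, beta = sum(t*y) / sum(t^2). A quick Python sketch with three made-up (temperature, response) points — not the resin data, so the slope differs from R's 0.006695:

```python
import numpy as np

# No-intercept simple regression: beta_hat = sum(t*y) / sum(t^2).
t = np.array([175.0, 194.0, 213.0])   # made-up temperatures (Celsius)
y = np.array([1.93, 1.63, 1.38])      # made-up responses
beta_hat = (t @ y) / (t @ t)
print(beta_hat)   # a single slope, as in R's lm(y ~ -1 + tempC)
```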

Ch3 - 50

Page 58: Chapter 3 Completely Randomized Designs

Effects Model for a CRD Experiment

The textbook formulates the means model in another form:

yij = µi + εij (means model)

= µ+ αi + εij (effects model)

I Observe the effects model has g + 1 parameters µ, α1, . . . , αg, while the means model only has g parameters µ1, . . . , µg

I The effects model is overparameterized, meaning that it specifies more parameters than we actually need. The two sets of parameters below

(µ, α1, . . . , αg ) and (µ− c , α1 + c , . . . , αg + c)

specify identical means for the responses. Thus the parameters of the effects model cannot be uniquely determined.

I These two models are equivalent in the sense that the fitted values for the responses will be identical.
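The overparameterization can be seen numerically: shifting the parameters by any constant c leaves every treatment mean µ + αi unchanged. A toy Python check (made-up numbers):

```python
import numpy as np

# Two parameterizations that describe the same treatment means.
mu, alphas = 7.0, np.array([-2.5, -0.5, 3.0])
c = 1.7
means_a = mu + alphas                  # (mu, alpha_i)
means_b = (mu - c) + (alphas + c)      # (mu - c, alpha_i + c)
print(np.allclose(means_a, means_b))   # True: the data cannot tell them apart
```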

Ch3 - 51

Page 59: Chapter 3 Completely Randomized Designs

How to Deal With Overparametrization?

There are various ways to deal with overparametrization in the effects model

yij = µ+ αi + εij

Some common ways include

I letting µ = 0
I letting one of the αi ’s be 0
I letting α1 + α2 + . . .+ αg = 0
  I lack natural interpretation
  I give simpler formulas when the CRD is generalized to factorial designs (Chapter 8)
I letting n1α1 + n2α2 + . . .+ ngαg = 0, where ni is the number of replicates in the ith treatment group
  I Under this constraint, the least squares estimates for µ and the αi ’s have the simple form

    µ̂ = y••, α̂i = y i• − y••

Ch3 - 52
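Under the last (weighted) constraint, the closed-form estimates can be verified on toy data (a Python sketch with made-up numbers, not the resin data):

```python
import numpy as np

# Toy CRD: 3 treatment groups, 2 replicates each.
y   = np.array([4.0, 5.0, 6.0, 7.0, 9.0, 11.0])
grp = np.array([0,   0,   1,   1,   2,   2])

mu_hat    = y.mean()                                  # grand mean y..
alpha_hat = np.array([y[grp == i].mean() - mu_hat     # y_i. - y..
                      for i in range(3)])
n = np.bincount(grp)

print(mu_hat, alpha_hat)      # 7.0 and [-2.5, -0.5, 3.0]
print(float(n @ alpha_hat))   # 0.0, so sum(n_i * alpha_i) = 0 holds
```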


Page 66: Chapter 3 Completely Randomized Designs

When One of the αi ’s is Dropped ...

Say α1 is dropped; then the mean responses for the g treatments are

E[yij ] = µ        for treatment 1
          µ+ α2    for treatment 2
          ...
          µ+ αg    for treatment g

I The mean response under the first treatment (i = 1) is µ

I αi = the difference between the mean response of the ith treatment and that of the 1st treatment. One can compare the effect of the ith treatment with that of the 1st treatment by testing αi = 0

I Useful for comparing treatments

Ch3 - 53

Page 67: Chapter 3 Completely Randomized Designs

> lmeffects1 = lm(y ~ as.factor(tempC), data = resin)

> lmeffects1

Call:

lm(formula = y ~ as.factor(tempC), data = resin)

Coefficients:

(Intercept) as.factor(tempC)194 as.factor(tempC)213

1.9325 -0.3037 -0.5550

as.factor(tempC)231 as.factor(tempC)250

-0.7382 -0.8758

Note there is no coefficient as.factor(tempC)175 (that would be α1), since R sets α1 = 0.

Temperature (◦C)  175    194    213    231    250
y i•              1.933  1.629  1.378  1.194  1.057

Observe that µ̂ = y1• and α̂i = y i• − y1•.

Ch3 - 54