Chris Brien Phenomics & Bioinformatics Research Centre, University of South Australia.

A tale of randomization: randomization versus mixed model analysis for single and chain randomizations

Chris Brien

Phenomics & Bioinformatics Research Centre, University of South Australia.

The Australian Centre for Plant Functional Genomics, University of Adelaide.

This work was supported by the Australian Research Council.

2

A tale of randomization: outline

1. Once upon a time, when I was young…

2. Randomization analysis for a single randomization.

3. Examples for a single randomization.

4. A BIBD simulation study.

5. Randomization model for a chain of randomizations.

6. Randomization analysis for a chain of randomizations.

7. Some issues.

8. Conclusions.

3

1. Once upon a time

In the 70s, I believed randomization inference to be the correct method of analysing experiments.

4

Purism

These books demonstrate that the p-value from a randomization analysis is approximated by the p-value from analyses assuming normality for CRDs & RCBDs;

Welch (1937) & Atiqullah (1963) show that this is true, provided the observed data actually conform to the variance for the assumed normal model (e.g. homogeneity between blocks).

Kempthorne (1975):

5

Sex created difficulties … as did time

Preece (1982, section 6.2): Is Sex a block or a treatment factor?

Semantic problem: what is a block factor?

Often Sex is unrandomized, but is of interest – I believe this to be the root of the dilemma.

If it is unrandomized, it cannot be tested using a randomization test (at all?).

In longitudinal studies, Time is similar. Sites also.

What about incomplete block designs with recombination of information? Missing values?

Seems that not all inference is possible with randomization analysis.

6

Fisher (1935, Section 21) first proposed randomization tests:

It seems clear that Fisher intended randomization tests to be only a check on normal theory tests.

7

Fisher (1960, 7th edition) added Section 21.1 that includes:

Less intelligible test = nonparametric test.

He is emphasizing that one should model using subject-matter knowledge.

Others (e.g. Kempthorne) have interpreted this differently.

8

Conversion

I became a modeller, BUT I did not completely reject randomization inference.

I have advocated randomization-based mixed models: a mixed model that starts with the terms that would be in a randomization model (Brien & Bailey, 2006; Brien & Demétrio, 2009).

This allowed me to: test for block effects and block-treatment interactions; model longitudinal data.

I comforted myself that when testing a model that has an equivalent randomization test, the former is an approximation to the latter and so robust.

9

More recently …

Cox, Hinkelmann and Gilmour pointed out, in the discussion of Brien and Bailey (2006), that no one had so far indicated how a model for a multitiered experiment might be justified by the randomizations employed.

Rosemary Bailey and I have been working for some time on the analysis of experiments with multiple randomizations, using randomization-based (mixed) models; Bailey and Brien (2013) details estimation & testing.

I decided to investigate randomization inference for such experiments, but first single randomizations.

10

2. Randomization analysis: what is it?

A randomization model is formulated. It specifies the distribution of the response over all randomized layouts possible for the design.

Estimation and hypothesis testing are based on this distribution. Will focus on hypothesis testing.

A test statistic is identified.

The value of the test statistic is computed from the data for:

o all possible randomized layouts, or a random sample (with replacement) of them, giving the randomization distribution of the test statistic, or an estimate of it;

o the randomized layout used in the experiment, giving the observed test statistic.

The p-value is the proportion of all possible values that are as, or more, extreme than the observed test statistic.

Different to a permutation test in that it is based on the randomization employed in the experiment.
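The steps above can be sketched in code. The following is a minimal illustration only, assuming an RCBD, the ordinary ANOVA F as the test statistic, and a Monte Carlo sample of rerandomizations; the function names and the data are hypothetical and are not taken from the talk. Only within-block rerandomizations are sampled, on the assumption that permuting whole blocks leaves this statistic unchanged when every block contains every treatment once.

```python
import numpy as np

def rcbd_f(y, blocks, treatments):
    """ANOVA F-statistic for treatments in a randomized complete block design."""
    y = np.asarray(y, dtype=float)
    blocks = np.asarray(blocks)
    treatments = np.asarray(treatments)
    grand = y.mean()
    block_levels = np.unique(blocks)
    treat_levels = np.unique(treatments)
    ss_blocks = sum((blocks == b).sum() * (y[blocks == b].mean() - grand) ** 2
                    for b in block_levels)
    ss_treats = sum((treatments == t).sum() * (y[treatments == t].mean() - grand) ** 2
                    for t in treat_levels)
    ss_resid = ((y - grand) ** 2).sum() - ss_blocks - ss_treats
    df_treats = len(treat_levels) - 1
    df_resid = (len(block_levels) - 1) * df_treats
    return (ss_treats / df_treats) / (ss_resid / df_resid)

def randomization_p_value(y, blocks, treatments, n_samples=2000, seed=1):
    """Monte Carlo estimate of the randomization p-value for the RCBD F."""
    rng = np.random.default_rng(seed)
    blocks = np.asarray(blocks)
    treatments = np.asarray(treatments)
    observed = rcbd_f(y, blocks, treatments)
    as_extreme = 0
    for _ in range(n_samples):
        # Rerandomize the treatments within each block, keeping y and blocks fixed.
        rerandomized = treatments.copy()
        for b in np.unique(blocks):
            units = np.flatnonzero(blocks == b)
            rerandomized[units] = rng.permutation(treatments[units])
        # Count statistics as, or more, extreme than the observed one
        # (small relative tolerance guards against floating-point ties).
        if rcbd_f(y, blocks, rerandomized) >= observed * (1 - 1e-9):
            as_extreme += 1
    return observed, as_extreme / n_samples

# Hypothetical data: 4 blocks of 3 units, 3 treatments.
blocks = np.repeat([0, 1, 2, 3], 3)
treatments = np.tile([0, 1, 2], 4)
y = np.array([10.1, 14.8, 20.3, 12.0, 17.1, 21.9,
              9.2, 14.1, 19.0, 11.4, 16.3, 22.1])
f_obs, p_rand = randomization_p_value(y, blocks, treatments)
print(f"observed F = {f_obs:.2f}, randomization p = {p_rand:.4f}")
```

Sampling with replacement keeps the sketch simple; for a small design the full set of layouts could be enumerated instead.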

11

Randomization model for a single randomization

Additive model of constants: $\mathbf{y} = \mathbf{w} + \mathbf{X}_h\boldsymbol{\tau}$, where

o $\mathbf{y}$ is the $m$-vector of observed responses for each unit;

o $\mathbf{w}$ is the $m$-vector of constants representing the contributions of each unit to the response;

o $\boldsymbol{\tau}$ is a $t$-vector of treatment constants;

o $\mathbf{X}_h$ is the $m \times t$ design matrix giving the assignment of treatments to units.

Under randomization, i.e. over all allowable unit permutations applied to $\mathbf{w}$, each element of $\mathbf{w}$ becomes a random variable, as does each element of $\mathbf{y}$. Let $\mathbf{W}$ and $\mathbf{Y}$ be the $m$-vectors of random variables and so we have $\mathbf{Y} = \mathbf{W} + \mathbf{X}_h\boldsymbol{\tau}$.

The set of $\mathbf{Y}$ forms the multivariate randomization distribution, our randomization model.

Now, we assume $E_R[\mathbf{W}] = \mathbf{0}$ and so $E_R[\mathbf{Y}] = \mathbf{X}_h\boldsymbol{\tau}$.

12

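Randomization model (cont'd)

Further,

$$\operatorname{var}_R[\mathbf{Y}] = \mathbf{V}_R = \sum_{H \in \mathcal{H}} \zeta_H \mathbf{B}_H = \sum_{H \in \mathcal{H}} \psi_H \mathbf{S}_H = \sum_{H \in \mathcal{H}} \eta_H \mathbf{Q}_H .$$

$\mathcal{H}$ is the set of $s_1$ generalized factors (terms) derived from a poset of factors on the units;

$\zeta_H$ is the covariance between variables with the same levels of generalized factor $H$;

$\psi_H$ is the canonical component of excess covariance for $H$;

$\eta_H$ is the spectral component (eigenvalue) of $\mathbf{V}_R$ for $H$ and is its contribution to E[MSq];

$\mathbf{B}_H$, $\mathbf{S}_H$, and $\mathbf{Q}_H$ are known $m \times m$ matrices.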

This model has the same terms as a randomization-based mixed model (Brien & Bailey, 2006; Brien & Demétrio, 2009)

However, the distributions differ.

13

Randomization by permutation of units & unit factors

Unit   Blocks   Units   Treatments
1      1        1       1
2      1        2       2
3      2        1       1
4      2        2       2

Permutations for an RCBD with b = 2, k = v = 2. The allowable permutations are:

o those that permute the blocks as a whole, and

o those that permute the units within a block;

there are $b!(k!)^b = 2!(2!)^2 = 8$.

                                                  Permuted unit
Unit   Blocks   Units   Treatments   Permutation   Blocks   Units
1      1        1       1            4             2        2
2      1        2       2            3             2        1
3      2        1       1            1             1        1
4      2        2       2            2             1        2

Equivalent to Treatments randomization 1, 2, 2, 1.
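As a small check on the count of allowable permutations, the enumeration can be sketched as below. This is an illustration under assumptions: units are indexed from 0 in block-major order, and the helper name `allowable_permutations` is hypothetical.

```python
from itertools import permutations, product

def allowable_permutations(b, k):
    """Yield each allowable unit permutation of an RCBD with b blocks of k units.

    Each permutation is a tuple of 0-based unit indices: blocks are permuted
    as a whole and units are permuted within each block.
    """
    for block_order in permutations(range(b)):
        for within in product(permutations(range(k)), repeat=b):
            yield tuple(block * k + unit
                        for block, unit_order in zip(block_order, within)
                        for unit in unit_order)

perms = list(allowable_permutations(2, 2))
print(len(perms))  # 8
```

The permutation shown in the table (4, 3, 1, 2 in 1-based unit numbers) appears in this list as `(3, 2, 0, 1)`, whereas a permutation that mixes units across blocks, such as `(0, 2, 1, 3)`, does not.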

14

Null randomization distribution: RCBD

Under the assumption of no treatment effects, $\mathbf{Y}^* = \mathbf{W} + \mu^*\mathbf{1}$. In which case, the randomization distribution of $\mathbf{Y}^*$ is termed the null randomization distribution.

Actual distribution obtained by applying each unit permutation to $\mathbf{y}$ ($Y^*_{ij}$ is for Unit $j$ in Block $i$):

Permutation   Y*11   Y*12   Y*21   Y*22
1             y11    y12    y21    y22
2             y12    y11    y21    y22
3             y11    y12    y22    y21
4             y12    y11    y22    y21
5             y21    y22    y11    y12
6             y21    y22    y12    y11
7             y22    y21    y11    y12
8             y22    y21    y12    y11

Note that the null ANOVA is the same for all permutations:

Source          df
Blocks          b – 1
Units[Blocks]   b(k – 1)
Total           bk – 1

Can show that 1st & 2nd order parameters of the distribution, $\mu^*$, $\zeta^*_G$, $\zeta^*_B$ and $\zeta^*_{BU}$, are equal to sample statistics. For example, for all $Y^*_{ij}$: $\mu^* = \bar{y}_{..}$, $\zeta^*_{BU} = s_y^2$.

The distribution of $\mathbf{Y}^* - \bar{y}_{..}\mathbf{1}$ gives the distribution of $\mathbf{W}$.

15

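$\mathbf{V}_R$ for the RCBD example

The matrices in the expressions for $\mathbf{V}^*_R$ are known.

$$\mathbf{V}^*_R = \zeta^*_G\mathbf{B}_G + \zeta^*_B\mathbf{B}_B + \zeta^*_{BU}\mathbf{B}_{BU} = \zeta^*_G(\mathbf{J}_2 - \mathbf{I}_2)\otimes\mathbf{J}_2 + \zeta^*_B\,\mathbf{I}_2\otimes(\mathbf{J}_2 - \mathbf{I}_2) + \zeta^*_{BU}\,\mathbf{I}_2\otimes\mathbf{I}_2 = \begin{pmatrix} \zeta^*_{BU} & \zeta^*_B & \zeta^*_G & \zeta^*_G \\ \zeta^*_B & \zeta^*_{BU} & \zeta^*_G & \zeta^*_G \\ \zeta^*_G & \zeta^*_G & \zeta^*_{BU} & \zeta^*_B \\ \zeta^*_G & \zeta^*_G & \zeta^*_B & \zeta^*_{BU} \end{pmatrix}$$

$$\mathbf{V}^*_R = \psi^*_G\mathbf{S}_G + \psi^*_B\mathbf{S}_B + \psi^*_{BU}\mathbf{S}_{BU} = \psi^*_G\,\mathbf{J}_2\otimes\mathbf{J}_2 + \psi^*_B\,\mathbf{I}_2\otimes\mathbf{J}_2 + \psi^*_{BU}\,\mathbf{I}_2\otimes\mathbf{I}_2 = \begin{pmatrix} \psi^*_G+\psi^*_B+\psi^*_{BU} & \psi^*_G+\psi^*_B & \psi^*_G & \psi^*_G \\ \psi^*_G+\psi^*_B & \psi^*_G+\psi^*_B+\psi^*_{BU} & \psi^*_G & \psi^*_G \\ \psi^*_G & \psi^*_G & \psi^*_G+\psi^*_B+\psi^*_{BU} & \psi^*_G+\psi^*_B \\ \psi^*_G & \psi^*_G & \psi^*_G+\psi^*_B & \psi^*_G+\psi^*_B+\psi^*_{BU} \end{pmatrix}$$

$$\mathbf{V}^*_R = \eta^*_G\mathbf{Q}_G + \eta^*_B\mathbf{Q}_B + \eta^*_{BU}\mathbf{Q}_{BU} = \eta^*_G\,\tfrac{1}{2}\mathbf{J}_2\otimes\tfrac{1}{2}\mathbf{J}_2 + \eta^*_B\,(\mathbf{I}_2 - \tfrac{1}{2}\mathbf{J}_2)\otimes\tfrac{1}{2}\mathbf{J}_2 + \eta^*_{BU}\,\mathbf{I}_2\otimes(\mathbf{I}_2 - \tfrac{1}{2}\mathbf{J}_2)$$

where $\eta^*_G = \psi^*_{BU} + 2\psi^*_B + 4\psi^*_G$, $\eta^*_B = \psi^*_{BU} + 2\psi^*_B$, $\eta^*_{BU} = \psi^*_{BU}$.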
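The equivalence of the three forms of the variance matrix for this small RCBD (b = 2, k = 2) can be checked numerically. The sketch below assumes the usual Kronecker-product forms of the B, S and Q matrices for two blocks of two units, with the covariances written as ζ, the canonical components as ψ and the spectral components as η; the numerical values of the canonical components are hypothetical.

```python
import numpy as np

I2, J2 = np.eye(2), np.ones((2, 2))

# Hypothetical canonical components for Grand mean, Blocks and Units[Blocks].
psi = {"G": 0.5, "B": 2.0, "BU": 1.0}

# Covariances and spectral components implied by the canonical components.
zeta = {"G": psi["G"],
        "B": psi["G"] + psi["B"],
        "BU": psi["G"] + psi["B"] + psi["BU"]}
eta = {"G": psi["BU"] + 2 * psi["B"] + 4 * psi["G"],
       "B": psi["BU"] + 2 * psi["B"],
       "BU": psi["BU"]}

# Known matrices for each form (units ordered block by block).
B = {"G": np.kron(J2 - I2, J2), "B": np.kron(I2, J2 - I2), "BU": np.kron(I2, I2)}
S = {"G": np.kron(J2, J2), "B": np.kron(I2, J2), "BU": np.kron(I2, I2)}
Q = {"G": np.kron(J2 / 2, J2 / 2),
     "B": np.kron(I2 - J2 / 2, J2 / 2),
     "BU": np.kron(I2, I2 - J2 / 2)}

V_zeta = sum(zeta[h] * B[h] for h in psi)
V_psi = sum(psi[h] * S[h] for h in psi)
V_eta = sum(eta[h] * Q[h] for h in psi)

print(np.allclose(V_zeta, V_psi), np.allclose(V_psi, V_eta))
```

The Q matrices are mutually orthogonal idempotents summing to the identity, which is why the η values are the eigenvalues of the variance matrix.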

16

Randomization estimation & testing for a single randomization

Propose to use I-MINQUE to estimate the $\psi$s and use these estimates to estimate $\boldsymbol{\tau}$ via EGLS.

I-MINQUE yields the same estimates as REML, but without the need to assume a distributional form for the response.

17

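Test statistics

Have a set $\mathcal{R}$ of idempotents specifying a treatment decomposition. For a single treatments factor, only $\mathbf{R}_T = \mathbf{M}_T - \mathbf{M}_G$ for treatment effects.

For an $\mathbf{R} \in \mathcal{R}$, to test $H_0$: $\mathbf{R}\mathbf{X}_h\boldsymbol{\tau} = \mathbf{0}$, use a Wald F, a Wald test statistic divided by its numerator df:

$$F_{\mathrm{Wald}} = \frac{(\mathbf{R}\mathbf{X}_h\hat{\boldsymbol{\tau}})'\,\{\mathbf{R}\mathbf{X}_h(\mathbf{X}_h'\hat{\mathbf{V}}^{-1}\mathbf{X}_h)^{-1}(\mathbf{R}\mathbf{X}_h)'\}^{-1}\,(\mathbf{R}\mathbf{X}_h\hat{\boldsymbol{\tau}})}{\operatorname{trace}(\mathbf{R})}$$

Numerator is a quadratic form: (est)' (var(est))$^{-1}$ (est). For an orthogonal design, $F_{\mathrm{Wald}}$ is the same as the F from an ANOVA.

Otherwise, it is a combined F test statistic. For nonorthogonal designs, an alternative test statistic is an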

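intrablock F-statistic. For a single randomization, let $\mathbf{Q}_H$ be the matrix that projects on the eigenspace of $\mathbf{V}$ that corresponds to the intrablock source. Then the intrablock estimate of $\mathbf{R}\mathbf{X}_h\boldsymbol{\tau}$ and its variance are obtained from $\mathbf{R}\mathbf{Q}_H\mathbf{Y}$.

The intrablock $F$ is the ratio of the intrablock treatment mean square, a scaled quadratic form in $\mathbf{R}\mathbf{Q}_H\mathbf{Y}$, to the estimate $\hat{\eta}_H$ of the spectral component for that source.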
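To make the Wald F concrete, here is a minimal numerical sketch. It assumes a single treatments factor, a known (or previously estimated) variance matrix V supplied by the caller, and a Moore-Penrose inverse for the variance of the estimated effects, since that matrix is singular; these choices and the name `wald_f` are assumptions of the sketch, not details confirmed by the talk.

```python
import numpy as np

def wald_f(y, X, V):
    """Wald F for treatment effects: a Wald statistic divided by trace(R).

    y : responses (length n); X : n x t treatment design matrix;
    V : n x n variance matrix (in practice an estimate).
    """
    n = len(y)
    V_inv = np.linalg.inv(V)
    XtVinvX_inv = np.linalg.inv(X.T @ V_inv @ X)
    tau_hat = XtVinvX_inv @ X.T @ V_inv @ y      # (E)GLS estimate of tau
    M_T = X @ np.linalg.inv(X.T @ X) @ X.T       # projector onto treatment space
    M_G = np.full((n, n), 1.0 / n)               # projector onto the grand mean
    R = M_T - M_G                                # idempotent for treatment effects
    est = R @ X @ tau_hat
    var_est = R @ X @ XtVinvX_inv @ X.T @ R
    # var_est has rank trace(R), so a pseudo-inverse is used (assumption).
    quad = est @ np.linalg.pinv(var_est, rcond=1e-10) @ est
    return float(quad / np.trace(R))

# Hypothetical RCBD: 3 blocks of 3 units, units ordered block by block.
y = np.array([10.0, 12.0, 15.0, 11.0, 14.0, 15.0, 9.0, 13.0, 17.0])
X = np.kron(np.ones((3, 1)), np.eye(3))
V = 1.5 * np.eye(9) + 2.0 * np.kron(np.eye(3), np.ones((3, 3)))
print(round(wald_f(y, X, V), 3))
```

For an orthogonal design such as this RCBD, plugging in a V whose Units[Blocks] component equals the ANOVA residual mean square reproduces the ANOVA F, in line with the statement above.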

18

Randomization distribution of the test statistic

To obtain it:

o Apply, to the unit factors and y, but not the treatment factors, all allowable unit permutations for the design employed: this effects a rerandomization of the treatments;

o Compute the test statistic for each allowable permutation;

o This set of values is the required distribution.

Number of allowable permutations:

o For our RCBD, there are 8 permutations and so computing the 8 test statistics is easy.

o For b = 10 and k = 3, there are $1.4 \times 10^{35}$: not so easy.

o An alternative is random data permutation (Edgington, 1995): take a Monte Carlo sample of the permutations.

19

Null distribution of the test statistic under normality

Under normality of the response, the null distribution of $F_{\mathrm{Wald}}$ is:

o for orthogonal designs, an exact F-distribution;

o for nonorthogonal designs, an F-distribution asymptotically.

Under normality of the response, the null distribution of an intrablock F-statistic is an exact F-distribution.

20

3. Examples for a single randomization

Wheat experiment in a BIBD (Joshi, 1987)

Rabbit experiment using the same BIBD (Hinkelmann & Kempthorne, 2008).

Casuarina experiment in a latinized row-column design (Williams et al., 2002).

21

Wheat experiment in a BIBD (Joshi, 1987)

Six varieties of wheat are assigned to plots arranged in 10 blocks of 3 plots.

The intrablock efficiency factor is 0.80. The ANOVA with the intrablock F and p:

plots tier          treatments tier
source     d.f.     source       d.f.    MS        F       p-value
Blocks      9       Varieties     5       39.32    0.58    0.718
                    Residual      4       67.59    1.17
Plots[B]   20       Varieties     5      231.29    4.02    0.016
                    Residual     15       57.53

$F_{\mathrm{Wald}}$ = 3.05 with p = 0.035 ($\nu_1$ = 5, $\nu_2$ = 19.1).

Estimates: $\psi_B$ = 14.60 (p = 0.403); $\psi_{BP}$ = 58.28.

22

Test statistic distributions

50,000 randomly selected permutations of blocks and plots within blocks selected.

[Figure: two panels, Intrablock F-statistic and Combined F-statistic]

Peak on RHS is all values ≥ 10.

23

Combined F-statistic

Part of the discrepancy between the F- and the randomization distributions is that the combined F-statistic is only asymptotically distributed as an F.

Differs from Kenward & Roger (1997) & Schaalje et al. (2002) for nonorthogonal designs.

[Figure: two panels, Parametric bootstrap (50,000 samples from $N(\mathbf{0}, \hat{\mathbf{V}})$) and Randomization distribution]

24

Two other examples

Rabbit experiment using the same BIBD (Hinkelmann & Kempthorne, 2008).

o 6 Diets assigned to 10 Litters, each with 3 Rabbits.

o Estimates: $\psi_L$ = 21.70 (p = 0.002), $\psi_{LR}$ = 10.08.

Casuarina experiment in a latinized row-column design (Williams et al., 2002).

o 4 Blocks of 60 provenances arranged in 6 rows by 10 columns.

o Provenances grouped according to 18 Countries of origin.

o 2 Inoculation dates each applied to 2 of the blocks.

o Estimates: $\psi_C$ = 0.2710; $\psi_B$, $\psi_{BR}$, $\psi_{BC}$ < 0.06; $\psi_{BRC}$ = 0.2711.

25

ANOVA for Casuarina experiment

Provenance represents provenance differences within countries.

plots tier          treatments tier
source     d.f.     source        d.f.   Eff.     MS      F       p-value
Blocks       3      Inoculation     1             11.54   11.46   0.077
                    Residual        2              1.01    1.17
Columns      9      Country         9              7.25
Rows[B]     20      Country        17              0.90
                    Provenance      3              0.43
B#C         27      Country        17              0.69
                    Provenance     10              0.48
R#C[B]     176      Country        17    0.761     2.46   10.25   <0.001
                    Provenance     41    0.685     0.29    1.22   0.235
                    I#C            17    0.681     0.13    0.54   0.917
                    I#P            41    0.522     0.15    0.63   0.938
                    Residual       60              0.24

26

Comparison of p-values

For intrablock F, p-values from F and randomization distributions generally agree.

For $F_{\mathrm{Wald}}$, p-values from the F-distribution generally underestimate those from the randomization distribution (Rabbit Diets an exception: little interblock contribution).

                         Intrablock F                          F_Wald (Combined)
Example   Source         ν2    F-distribution   Randomization   ν2     F-distribution   Randomization
Wheat     Varieties      15    0.016            0.012           19.1   0.035            0.095
Rabbit    Diets          15    0.038            0.039           16.0   0.032            0.035
Tree      Country        60    <0.001           <0.001          79.3   <0.001           0.013
          Provenance     60    0.235            0.232           79.0   0.338            0.484
          Inoc#C         60    0.917            0.917           84.8   0.963            0.975
          Inoc#P         60    0.938            0.937           81.1   0.943            0.969

27

4. A BIBD simulation study

Evidently, the adequacy of the approximation depends on the size of the Block component. But on what else does it depend?

Size of the experiment? Residual d.f.? Efficiency?

Took 6 BIBDs that varied in these aspects.

                                                  Blocks          Plots[Blocks]
Size     Design    n     t    b    k    e_BP      Residual d.f.   Residual d.f.
Small    b5t5      20    5    5    4    0.938     0               11
Medium   b10t6     30    6    10   3    0.800     4               15
Medium   b15t6     30    6    15   2    0.600     9               10
Large    b15t10    60    10   15   4    0.833     5               36
Large    b30t10    120   10   30   4    0.833     20              81
Large    b21t21    105   21   21   5    0.840     0               64
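The design properties in this table follow from t, b and k. The sketch below recomputes them, assuming the standard BIBD intrablock efficiency factor t(k − 1)/(k(t − 1)), a textbook result that the slides do not state explicitly, and the residual d.f. b − t and bk − b − t + 1 of the skeleton ANOVA; the function and dictionary names are hypothetical.

```python
def bibd_properties(t, b, k):
    """Size, replication, intrablock efficiency and residual d.f. of a BIBD."""
    n = b * k
    return {
        "n": n,
        "r": n // t,                                   # replication of each treatment
        "efficiency": t * (k - 1) / (k * (t - 1)),     # intrablock efficiency factor
        "blocks_residual_df": b - t,
        "plots_residual_df": n - b - t + 1,
    }

# (t, b, k) for the six designs in the table.
designs = {
    "b5t5": (5, 5, 4), "b10t6": (6, 10, 3), "b15t6": (6, 15, 2),
    "b15t10": (10, 15, 4), "b30t10": (10, 30, 4), "b21t21": (21, 21, 5),
}
for name, (t, b, k) in designs.items():
    p = bibd_properties(t, b, k)
    print(name, p["n"], round(p["efficiency"], 3),
          p["blocks_residual_df"], p["plots_residual_df"])
```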

28

Skeleton anova for a BIBD

Taking Blocks and Plots to be random and Treatments to be fixed gives E[MSq] in the following table.

plots tier              treatments tier                    E[MSq]
source     d.f.         source        d.f.                 canonical covariance components
Blocks     b – 1        Treatments    t – 1                ψ_BP + kψ_B + e_B q(T)
                        Residual      b – t                ψ_BP + kψ_B
Plots[B]   b(k – 1)     Treatments    t – 1                ψ_BP + e_BP q(T)
                        Residual      bk – b – t + 1       ψ_BP

$$q(T) = r\sigma_T^2 = \frac{r\sum_i(\tau_i - \bar{\tau}_{.})^2}{t - 1}$$

29

Data generation

Generated two data sets for each BIBD that differ in their Blocks covariance components ($\psi_B$).

Wanted data sets with known properties:

o Generated a set of normally-distributed, random values, one value for each unit;

o The generated values were projected into the following 4 subspaces: Treatments from Blocks, Blocks Residual, Treatments from Plots[Blocks], and Plots[Blocks];

o The projected effects for each subspace are scaled so that the variance of each equals its E[MSq] and then they are summed. The scaling ensures that all ANOVA quantities are fixed.

o $\psi_{BP}$ = 1, $\psi_B$ = 0.25 or 2, and $\sigma_T^2$ chosen so that the p-value for the intrablock F is 0.05;

o However, the I-MINQUE estimates of $\psi_B$ for the same ANOVA values vary for different generated values. Generated data sets until had one with I-MINQUE estimates close to the target $\psi_B$ values.

30

Analysis of data for b10t6 and yB = 0.25

Source            d.f.    s.s.      m.s.     v.r.    F pr.
Blocks
  Treats           5      11.127    2.225    1.27    0.420
  Residual         4       7.000    1.750
Plots[Blocks]
  Treats           5      14.506    2.901    2.90    0.050
  Residual        15      15.000    1.000
Total             29      48.25

Here, the Treats from Plots[Blocks] E[MSq] is:

$$\psi_{BP} + e_{BP}\,r\,\sigma_T^2 = 1 + 0.8 \times 5 \times \sigma_T^2 = 1 + 4\sigma_T^2 = 2.90 \quad\text{and so}\quad \sigma_T^2 = 0.475 .$$

The I-MINQUE estimates for the chosen data set are:

$$\psi_B = 0.249,\quad \psi_{BP} = 0.9999,\quad \sigma_T^2 = 0.4760 .$$

For the Blocks Residual, E[MSq] is:

$$\psi_{BP} + 3\psi_B = 1 + 3 \times 0.25 = 1.75 .$$
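The arithmetic for the b10t6 design with a Blocks component of 0.25 can be retraced in a few lines. This is a sketch only: the function name is hypothetical, and it simply inverts the expected mean square for Treats from Plots[Blocks] (Plots[Blocks] component plus efficiency times replication times treatment variance) for the treatment variance.

```python
def treatment_variance(target_f, psi_bp, efficiency, r):
    """Treatment variance giving a chosen intrablock F when the residual m.s. is psi_bp."""
    target_ms = target_f * psi_bp            # required Treats mean square
    return (target_ms - psi_bp) / (efficiency * r)

# b10t6: psi_BP = 1, intrablock efficiency 0.8, replication r = 5, target F = 2.90.
sigma2_t = treatment_variance(2.90, 1.0, 0.8, 5)

# Blocks Residual expected mean square: psi_BP + k * psi_B with k = 3, psi_B = 0.25.
blocks_residual_ems = 1.0 + 3 * 0.25

print(round(sigma2_t, 3), blocks_residual_ems)  # 0.475 1.75
```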

31

The treatment effect variance ($\sigma_T^2$) changed so that p-values are approximately 0.05; this reflects power.

Combined d.f. larger than intrablock d.f. when $\psi_B$ = 0.25, except for b5t5. This indicates that both inter- and intra-block information is used; F values are also larger.

Not the case for $\psi_B$ = 2.

Properties of the simulated data sets

                            Intrablock               Combined, ψ_B = 0.25     Combined, ψ_B = 2
                            denominator              denominator              denominator
Size     Design    σ_T²     d.f.          F value    d.f.          F value    d.f.          F value
small    b5t5      0.628    11            3.36       11.7          3.45       11.2          3.36
medium   b10t6     0.475    15            2.90       19.3          3.18       16.1          3.00
medium   b15t6     1.938    10            3.33       19.8          4.45       12.8          3.61
large    b15t10    0.775    36            2.15       42.8          2.25       37.6          2.18
large    b30t10    0.220    81            2.00       95.8          2.11       84.5          2.02
large    b21t21    0.175    64            1.74       74.3          1.79       66.2          1.75

32

p-values from 50,000 randomizations

There is a general tendency for the p-values based on:

o the intrablock F to be similar,

o the Combined F to be smaller than those for intrablock F, especially for small $\psi_B$,

o the Combined F to differ between Rand and F distn, especially for small $\psi_B$ – F distn dangerous.

For large $\psi_B$, can use intrablock F and F-distn, except for a design of:

o medium size with low intrablock efficiency (b15t6): use Combined F and Rand.

For small $\psi_B$, can use intrablock F and F-distn, except for a design of:

o small size (b5t5): use Rand;

o medium size with low intrablock efficiency (b15t6): use Combined F and Rand;

o large size with large Blocks Residual d.f. (b30t10): use Combined F.

[Figure: p-values for the six designs (S, M, M, L, L, L), two panels]

33

p-values from 50,000 bootstraps

Comparing bootstrap- and randomization-based p-values for one type of F, they are similar, except for the small design.

Small design needs further investigation.

[Figure: p-values for the six designs (S, M, M, L, L, L), two panels]

34

Conclusion

Use Intrablock F and F distn, except for:

o a design with low intrablock efficiency (b15t6);

o when Block variation is small, a small design (b5t5) or a design with large Blocks Residual d.f. (b30t10).

For small design may need randomization analysis.

For other exceptions, can use bootstrap or randomization analysis for combined-F.

[Figure: p-values for the six designs (S, M, M, L, L, L), two panels]

35

5. Randomization model for a chain of randomizations

A chain of two randomizations consists of:

o the randomization of treatments to the first set of units;

o the randomization of the first set of units, along with treatments, to a second set of units.

For example, a two-phase sensory experiment (Brien & Payne, 1999; Brien & Bailey, 2006, Example 15) involves two randomizations:

o Field phase: 8 treatments to 48 halfplots using split-plot with 2 Youden squares for main plots.

o Sensory phase: 48 halfplots randomized to 576 evaluations, using Latin squares and an extended Youden square.

8 treatments: 4 Trellis, 2 Methods

48 halfplots: 2 Squares, 3 Rows, 4 Columns in Q, 2 Halfplots in Q, R, C (Q = Squares)

576 evaluations: 6 Judges, 2 Occasions, 3 Intervals in O, 4 Sittings in O, I, 4 Positions in O, I, S, J

Three sets of objects: treatments ($\Gamma$), halfplots ($\Upsilon$) & evaluations ($\Omega$).

36

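Randomization model

Additive model of constants: $\mathbf{y} = \mathbf{z} + \mathbf{X}_f(\mathbf{w} + \mathbf{X}_h\boldsymbol{\tau}) = \mathbf{z} + \mathbf{X}_f\mathbf{w} + \mathbf{X}_f\mathbf{X}_h\boldsymbol{\tau}$, where

o $\mathbf{y}$ is the $n$-vector of observed responses for each unit $\omega$ after the second phase;

o $\mathbf{z}$ is the $n$-vector of constants representing the contributions of each unit in the 2nd randomization ($\omega \in \Omega$) to the response;

o $\mathbf{w}$ is the $m$-vector of constants representing the contributions of each unit in the 1st randomization ($\upsilon \in \Upsilon$) to the response;

o $\boldsymbol{\tau}$ is a $t$-vector of treatment constants;

o $\mathbf{X}_f$ & $\mathbf{X}_h$ are $n \times m$ & $m \times t$ design matrices showing the randomization assignments.

Under the two randomizations, each element of $\mathbf{z}$ and of $\mathbf{w}$ becomes a random variable, as does each element of $\mathbf{y}$: $\mathbf{Y} = \mathbf{Z} + \mathbf{X}_f\mathbf{W} + \mathbf{X}_f\mathbf{X}_h\boldsymbol{\tau}$, where $\mathbf{Y}$, $\mathbf{Z}$ and $\mathbf{W}$ are the vectors of random variables.

Now, we assume $E_R[\mathbf{Z}] = E_R[\mathbf{W}] = \mathbf{0}$ and so $E_R[\mathbf{Y}] = \mathbf{X}_f\mathbf{X}_h\boldsymbol{\tau}$.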

37

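Randomization model (cont'd)

Further, $\mathbf{V}_R = \mathbf{C}_\Omega + \mathbf{C}_\Upsilon$, where each contribution can be written in terms of covariances (with matrices $\mathbf{A}_H$ for $\Omega$ and $\mathbf{B}_H$ for $\Upsilon$), canonical components or spectral components:

$$\mathbf{C}_\Omega = \sum_{H \in \mathcal{H}_\Omega} \phi_H \mathbf{T}_H = \sum_{H \in \mathcal{H}_\Omega} \xi_H \mathbf{P}_H, \qquad \mathbf{C}_\Upsilon = \sum_{H \in \mathcal{H}_\Upsilon} \psi_H \mathbf{S}_H = \sum_{H \in \mathcal{H}_\Upsilon} \eta_H \mathbf{Q}_H .$$

$\mathbf{C}_\Omega$ & $\mathbf{C}_\Upsilon$ are the contributions to the variance arising from $\Omega$ and $\Upsilon$, respectively.

$\mathcal{H}_\Omega$ & $\mathcal{H}_\Upsilon$ are the sets of $s_2$ & $s_1$ generalized factors (terms) derived from the posets of factors on $\Omega$ and $\Upsilon$;

$\phi_H$, $\psi_H$ are the canonical components of excess covariance;

$\xi_H$, $\eta_H$ are the spectral components (eigenvalues) of $\mathbf{C}_\Omega$ and $\mathbf{C}_\Upsilon$, respectively;

$\mathbf{A}_H$, $\mathbf{B}_H$, $\mathbf{T}_H$, $\mathbf{S}_H$, $\mathbf{P}_H$, $\mathbf{Q}_H$ are known $n \times n$ matrices.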

38

Forming the null randomization distribution of the response

Under the assumption of no treatment effects, $\mathbf{Y}^* = \mathbf{Z} + \mathbf{X}_f\mathbf{W} + \mu^*\mathbf{1}$.

There are two randomizations, $\Gamma$ to $\Upsilon$ and $\Upsilon$ to $\Omega$;

o to effect $\Gamma$ to $\Upsilon$, $\Upsilon$ and $\mathcal{H}_\Upsilon$ are permuted, and

o to effect $\Upsilon$ to $\Omega$, $\Omega$ and $\mathcal{H}_\Omega$ are permuted.

However, in this model $\mathbf{X}_f$ is fixed and reflects the result of the second randomization in the experiment.

Hence, we do not apply the second randomization and consider the null randomization distribution, conditional on the observed randomization of $\Upsilon$ to $\Omega$.

1) Apply the permutations of $\Upsilon$ to $\mathcal{H}_\Upsilon$, $\mathcal{H}_\Omega$ and $\mathbf{y}$, to effect a rerandomization of $\Gamma$ to $\Upsilon$.

o They must also be applied to $\mathcal{H}_\Omega$ so that it does not effect a rerandomization of $\Upsilon$ to $\Omega$.

Again the null ANOVA is the same if permutations for just the first randomization are used, but is not if both randomizations are considered.

39

6. Randomization analysis for a chain of randomizations

Again, based on the randomization distribution of the response.

Use the same test statistics as for a single randomization: FWald and intrablock F-statistics.

Obtain or estimate the randomization distributions of these test statistics.

o Based on randomization of $\Gamma$ to $\Upsilon$ and is conditional on the observed randomization of $\Upsilon$ to $\Omega$.

An additional consideration is that it is necessary to constrain spectral components to be nonnegative.

A Two-Phase Sensory Experiment (Brien & Bailey, 2006, Example 15)

Involves two randomizations:

40

(Brien & Payne, 1999)

8 treatments: 4 Trellis, 2 Methods

48 halfplots: 2 Squares, 3 Rows, 4 Columns in Q, 2 Halfplots in Q, R, C (Q = Squares)

576 evaluations: 6 Judges, 2 Occasions, 3 Intervals in O, 4 Sittings in O, I, 4 Positions in O, I, S, J

The randomization distribution will be based on the randomization of treatments to halfplots and is conditional on the actual randomization of halfplots to evaluations.

Permuting evaluations and y will almost certainly result in unobserved combinations of halfplots and evaluations, so that the randomization model is no longer valid.

ANOVA table for sensory exp't

41

evaluations tier          halfplots tier                  treatments tier
source            df      eff   source           df       eff    source     df
Occasions          1            Squares           1
Judges             5
O#J                5
Intervals[O]       4
I#J[O]            20            Rows              2
                                Q#R               2
                                Residual         16
Sittings[OI]      18      1/3   Columns[Q]        6       1/27   Trellis     3
                                                                 Residual    3
                                Residual         12
S#J[OI]           90      2/3   Columns[Q]        6       2/27   Trellis     3
                                                                 Residual    3
                                R#C[Q]           12       8/9    Trellis     3    (intrablock Trellis)
                                                                 Residual    9
                                Residual         72
Positions[OISJ]  432            Halfplots[RCQ]   24              Method      1    (orthogonal source)
                                                                 T#M         3    (orthogonal source)
                                                                 Residual   20
                                Residual        408

42

Fit a mixed model

Randomization model:

Trellis * Methods | (Judges * (Occasions / Intervals / Sittings)) / Positions + (Rows * (Squares / Columns)) / Halfplots

T + M + TM | J + O + OI + OIS + OJ + OIJ + OISJ + OISJP + R + Q + RQ + QC + RQC + RQCH

(Q = Square)

Model of convenience, to achieve a fit:

o Delete one of O and Q (see decomposition table on previous slide).

o Actually dropped both because a small 1 df random term is very difficult to fit.

43

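Checking spectral components

Recall that $\mathbf{V}_R = \mathbf{C}_\Omega + \mathbf{C}_\Upsilon$, with

$$\mathbf{C}_\Omega = \sum_{H \in \mathcal{H}_\Omega} \phi_H \mathbf{T}_H = \sum_{H \in \mathcal{H}_\Omega} \xi_H \mathbf{P}_H, \qquad \mathbf{C}_\Upsilon = \sum_{H \in \mathcal{H}_\Upsilon} \psi_H \mathbf{S}_H = \sum_{H \in \mathcal{H}_\Upsilon} \eta_H \mathbf{Q}_H .$$

It is necessary that each of $\mathbf{C}_\Omega$ and $\mathbf{C}_\Upsilon$ is positive semidefinite.

For this, all spectral components, $\xi$ and $\eta$, must be nonnegative.

However, we fit canonical components, $\phi$ and $\psi$.

Calculate spectral components from canonical components, the relationship between spectral and canonical components being expressions like those for expected mean squares.

If negative, constrain canonical components so that spectral components are zero.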

44

Spectral and canonical components relationships

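Coefficients of the canonical components in each spectral component

Halfplots:

Spectral      Canonical components
component     ψ_R    ψ_RQ   ψ_QC   ψ_RQC   ψ_RQCH
η_R           192    96            24      12
η_RQ                 96            24      12
η_QC                        72     24      12
η_RQC                              24      12
η_RQCH                                     12

Evaluations:

Spectral      Canonical components
component     φ_J    φ_OJ   φ_OI   φ_OIJ   φ_OIS   φ_OISJ   φ_OISJP
ξ_J           96     48            16              4        1
ξ_OJ                 48            16              4        1
ξ_OI                        96     16      24      4        1
ξ_OIJ                              16      0       4        1
ξ_OIS                                      24      4        1
ξ_OISJ                                             4        1
ξ_OISJP                                                     1

To constrain $\eta_{RQC}$ to zero, constrain $\psi_{RQC} = -(12/24)\,\psi_{RQCH}$.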
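The relationship between spectral and canonical components is linear, so it can be sketched as a matrix product. The coefficients below are those of the halfplots part of the relationships table; the canonical component values are hypothetical, chosen only so that one spectral component comes out negative, and the sketch shows just the algebra of the constraint (in a real analysis the other components would be re-estimated after constraining).

```python
import numpy as np

terms = ["R", "RQ", "QC", "RQC", "RQCH"]

# Rows: spectral components; columns: canonical components (same term order).
coeff = np.array([
    [192, 96,  0, 24, 12],   # R
    [  0, 96,  0, 24, 12],   # RQ
    [  0,  0, 72, 24, 12],   # QC
    [  0,  0,  0, 24, 12],   # RQC
    [  0,  0,  0,  0, 12],   # RQCH
], dtype=float)

# Hypothetical canonical components (not the estimates from the talk).
canonical = np.array([0.08, 0.01, 0.004, -0.004, 0.005])
spectral = coeff @ canonical
print(dict(zip(terms, spectral.round(3))))

# The RQC spectral component is negative, so constrain the RQC canonical
# component to -(12/24) times the RQCH canonical component.
constrained = canonical.copy()
constrained[3] = -(12 / 24) * constrained[4]
spectral_constrained = coeff @ constrained
print(dict(zip(terms, spectral_constrained.round(3))))
```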

45

Estimates of components

               Unconstrained            Constrained
Random term    Canonical   Spectral     Canonical   Spectral
R              0.083       14.880       0.078       15.012
R.Q            0.010        1.021       0.0001       0
Q.C            0.004        0.330       0.0001       0
R.Q.C          0.004        0.025       0.002        0.008
R.Q.C.H        0.005        0.063       0.005        0.063
J              0.048        4.592       0.050        4.592
O.J            0.153        9.192       0.159        9.348
O.I            0.015        3.565       0.017        3.598
O.I.J          0.093        1.839       0.087        1.708
O.I.S          0.010        0.594       0.012        0.612
O.I.S.J        0.012        0.345       0.018        0.322
O.I.S.J.P      0.394        0.394       0.394        0.394

46

Comparison of p-values

Note the difference in denominator df for Trellis, although these are the df for the unconstrained fit, because the algorithm failed for the constrained fit. Not a problem for randomization p-values as they are not needed.

                  Intrablock F p-values                   F_Wald (Combined) p-values
Source            ν2    F-distribution   Randomization    ν2     F-distribution   Randomization
Trellis            9    0.001            0.004            14.9   <0.001           0.004
Method            20    0.627            0.630
Trellis#Method    20    0.009            0.005

The constrained analysis provides the observed $F_{\mathrm{Wald}}$.

Now calculate p-values using the F or randomization distribution.

Need to check spectral components for each rerandomization.

47

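Comparison of distributions

[Figure: four panels comparing the F- and randomization distributions]

Trellis (intrablock): F_intra = 13.47, p_F = 0.001, p_R = 0.004

Trellis (combined): F_comb = 15.92 (unconstrained 25.59), p_F < 0.001, p_R = 0.004

Method: F = 0.24, p_F = 0.627, p_R = 0.630

Trellis#Method: F = 5.10, p_F = 0.009, p_R = 0.005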

48

7. Some issues

Size of permutations sample

A controversy: sometimes pooling

Unit-treatment additivity

49

Size of permutations sample

A study of subsamples of the 50,000 randomly selected permutations revealed that:

o the estimates of p-values from samples of 25,000 or more randomized layouts have a range < 0.005;

o samples of 5,000 randomized layouts will often be sufficiently accurate, the estimates of p-values
  - around 0.01 or less exhibiting a range < 0.005;
  - around 0.05 displaying a range of 0.01;
  - in excess of 0.20 showing a range of about 0.03.
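A rough guide to the sample sizes above comes from treating the estimated p-value as a binomial proportion. This is an assumption of the sketch, not something stated in the talk: it takes the layouts to be sampled independently with replacement, so that the standard error of the estimate is the square root of p(1 − p)/N. The function name is hypothetical.

```python
import math

def mc_standard_error(p, n_samples):
    """Binomial standard error of a Monte Carlo p-value estimate."""
    return math.sqrt(p * (1 - p) / n_samples)

for n_samples in (5_000, 25_000, 50_000):
    errors = {p: round(mc_standard_error(p, n_samples), 4) for p in (0.01, 0.05, 0.20)}
    print(n_samples, errors)
```

The standard errors grow with p up to 0.5, which is consistent with the wider ranges reported for the larger p-values.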

50

Unit-treatment additivity

Cox and Reid (2000) allow random unit-treatment interaction;

o Test hypothesis that treatment effects are greater than unit-treatment interaction.

Nelder (1977) suggests the random form is questionable.

The Iowa school allows arbitrary (fixed) unit-treatment interactions.

o Test difference between the average treatment effects over all units, which is biased in the presence of unit-treatment interaction.

o Such a test ignores marginality/hierarchy.

Questions:

o Which form applies?

o How to detect unit-treatment interaction? Often impossible, but, when it is possible, cannot be part of a randomization analysis.

Randomization analysis requires unit-treatment additivity.

If not appropriate, use a randomization-based mixed model.

51

A controversy

Should nonsignificant (??) unit sources of variation be removed and hence pooled with other unit sources?

The point is that effects hypothesized to occur at the planning stage have not eventuated.

o A modeller would remove them;

o Indeed, in mixed-model fitting using REML, one will have no option if the fitting process does not converge.

Some argue that, because they are in the randomization model, they must stay.

o They are automatically included if doing randomization inference based on randomization permutations.

Sometimes-pooling may disrupt power and coverage properties of the analysis (Janky, 2000).

52

8. Conclusions

Fisher was right:

o One should employ meaningful models;

o Randomization analyses provide a check on parametric analyses.

I am still a modeller, with the randomization-based mixed models as my starting point.

Nice that, for single-stratum tests, the normal theory test approximates an equivalent randomization test, if one exists.

However, the p-values for combined test-statistics from the F-distribution are not always applicable:

o Novel that this depends on 'interblock' components and residual df, size of design & efficiency;

o I have provided a randomization analysis for a combined test statistic;

o Using the randomization distribution has the advantages of avoiding the need to pool nonsignificant (??) unit sources of variation, although fitting can be challenging, and of not needing the denominator degrees of freedom.

Randomization analysis for multiple randomizations has surprises.

o Cannot use the second randomization;

o Need to check the non-negativity of the spectral components.

53

References

Atiqullah, M. (1963) On the randomization distribution and power of the variance ratio test. J. Roy. Statist. Soc., Ser. B (Methodological), 25: 334-347.

Bailey, R.A. & Brien, C.J. (2013) Randomization-based models for experiments: I. A chain of randomizations. arXiv preprint arXiv:1310.4132.

Brien, C.J. & Bailey, R.A. (2006) Multiple randomizations (with discussion). J. Roy. Statist. Soc., Ser. B (Statistical Methodology), 68: 571-609.

Brien, C.J. & Demétrio, C.G.B. (2009) Formulating Mixed Models for Experiments, Including Longitudinal Experiments. J. Agric. Biol. Environ. Statist., 14: 253-280.

Cox, D.R. & Reid, N. (2000) The theory of the design of experiments. Boca Raton, Chapman & Hall/CRC.

Edgington, E.S. (1995) Randomization tests. New York, Marcel Dekker.

Fisher, R.A. (1935, 1960) The Design of Experiments. Edinburgh, Oliver and Boyd.

Hinkelmann, K. & Kempthorne, O. (2008) Design and analysis of experiments. Vol I. Hoboken, N.J., Wiley-Interscience.

Janky, D.G. (2000) Sometimes pooling for analysis of variance hypothesis tests: A review and study of a split-plot model. The Amer. Statist., 54: 269-279.

Joshi, D.D. (1987) Linear estimation and design of experiments. Delhi, New Age Publishers.




54

References (cont'd)

Kempthorne, O. (1975) Inference from experiments and randomization. In: A Survey of Statistical Design and Linear Models, ed. J. N. Srivastava. Amsterdam, North Holland.

Mead, R., Gilmour, S.G. & Mead, A. (2012) Statistical principles for the design of experiments. Cambridge, Cambridge University Press.

Nelder, J.A. (1965) The analysis of randomized experiments with orthogonal block structure. I. Block structure and the null analysis of variance. Proc. Roy. Soc. Lon., Series A, 283: 147-162.

Nelder, J.A. (1977) A reformulation of linear models (with discussion). J. Roy. Statist. Soc., Ser. A (General), 140: 48-77.

Preece, D.A. (1982) The design and analysis of experiments: what has gone wrong? Util. Math., 21A: 201-244.

Schaalje, B.G., McBride, J.B., et al. (2002) Adequacy of approximations to distributions of test statistics in complex mixed linear models. J. Agric. Biol. Environ. Stat., 7: 512-524.

Welch, B.L. (1937) On the z-test in randomized blocks and Latin squares. Biometrika, 29: 21-52.

Williams, E.R., Matheson, A.C. & Harwood, C.E. (2002) Experimental design and analysis for tree improvement. Collingwood, Vic., CSIRO Publishing.
