APPLICATIONS OF ANALYSIS OF VARIANCE by - Nc … · 2008-11-14 · Algebra of analysis of variance....


APPLICATIONS OF ANALYSIS OF VARIANCE

by

M. S. Bartlett

November 1, 1946


CONTENTS

I. Basic Principles

1. Introductory remarks.
2. Algebra of analysis of variance.
3. Accuracy and significance tests.

II. Analysis of Homogeneous Data

4. Two war-time applications (Examples A and B).
5. A change-over feeding experiment (Example C).

III. Analysis of Heterogeneous Data

6. Residual heterogeneity (Example D).
7. "Vertical" heterogeneity (Example E).
8. Use of interaction terms in tests of significance.
9. "Horizontal" heterogeneity, and other complications.


I. BASIC PRINCIPLES

1. Introductory remarks.

Most working statisticians are by now so familiar with R. A. Fisher's technique of 'analysis of variance' that I doubt whether they will discover much new in my contribution to this conference. It was, however, suggested to me that some discussion and actual examples illustrating the technique, with particular reference to the two somewhat distinct problems of estimating the magnitude of

(i) fixed linear effects that may be present in statistical data, and

(ii) variable linear effects giving rise to additional variance components,

would be complementary to a theoretical paper by Wald on the same topic.

The two problems may sometimes have been confused because of the very name 'analysis of variance', an 'analysis of variance' table referring directly not to (ii), but to (i), and, as has been commented by other writers, might perhaps less ambiguously be called 'quadratic analysis' or 'analysis of the sum of squares'.* In problems of type (ii) the estimation of the different variance components is a further and separate procedure from the calculation of the primary analysis of variance table. Examples of the two problems (i) and (ii) are discussed in sections II and III of this paper respectively, but before coming to these I would like to review very briefly the basic principles of analysis of variance, both for reference and also to justify some of the names I shall use to christen different types of data. Some general acquaintance with analysis of variance technique is assumed.

2. Algebra of Analysis of Variance.

I personally find it convenient theoretically to employ the geometrical or vectorial representation of a sample of observations, and have previously discussed analysis of variance from this point of view (~). The sample of n observations is represented in this picture by a vector with n

* E. J. G. Pitman once suggested 'analysis of squariance', but this phrase frankly appals me.


components corresponding to the n observations, and the analysis of variance table merely corresponds to an appropriate analysis of the total sum of squares of the observations, i.e., of the square of the length of the vector, into relevant perpendicular or orthogonal components, whose squares then necessarily add to the total square. The orthogonal components lie in directions in the sample space so chosen to correspond to the isolation of the fixed linear effects referred to under (i), and it is a well-known feature of a good analysis of variance design that this can conveniently be done. To be more explicit, let the row vector or matrix $S$ denote the n observations $(x_1, \ldots, x_n)$, and $S'$ the corresponding column vector. Then the isolation of the general mean corresponds to segregation of the component of $S$ along the direction given by the vector $Z = (1, 1, \ldots, 1)$. The linear component of $S$ expressible in terms of any vector $Z$ is obtained by writing $S$ in the form

$$S = bZ + V, \qquad (1)$$

where $V$ is orthogonal to $Z$, or $VZ' = (S - bZ)Z' = 0$, or

$$b = (SZ')(ZZ')^{-1}. \qquad (2)$$

Thus in this trivial example where $Z = (1, 1, \ldots, 1)$, we have for the analysis of variance table, by squaring* both sides of (1),

$$SS' = (SZ')(ZZ')^{-1}(ZS') + VV', \qquad (3)$$

or

$$\sum_r x_r^2 = n\bar{x}^2 + \sum_r (x_r - \bar{x})^2, \qquad (4)$$

corresponding to the special case of (1),

$$(x_r) = (\bar{x}) + (x_r - \bar{x}). \qquad (5)$$

It is an obvious consequence of the linear form of (1) that any 'real' linear dependence on $Z$, which would remain after random and unbiased effects were removed, corresponding to the expectation equation

$$E\{S\} = \beta Z, \qquad (6)$$

is entirely segregated into the component parallel to $Z$, since no linear superposition of a component parallel to $Z$ can affect any components of $S$ orthogonal to $Z$. For example, if we add any constant quantity to (5) it affects only the general mean component, and not the residual component $(x_r - \bar{x})$. Conversely,

* i.e., row by column matrix multiplication.


if such superposition does not affect any components of $S$, these must be orthogonal to $Z$, a useful practical rule for verifying if such orthogonality holds. It is also readily shown that, if the observations $S$ are uncorrelated* and all have the same variance $\sigma^2$, the square of the component along $Z$ has expectation

$$E\{b(ZZ')b'\} = \beta(ZZ')\beta' + \sigma^2. \qquad (7)$$
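As a rough numerical illustration of equations (1) to (5), the sketch below splits a small sample into its component along Z = (1, ..., 1) and the orthogonal residual. The observations and the helper names `split_along` and `sum_sq` are invented for the illustration and are not taken from the paper.

```python
# Sketch of equations (1)-(5) for Z = (1, ..., 1); the sample values are made up.

def split_along(S, Z):
    """Return (b, V) with S = b*Z + V and V orthogonal to Z."""
    zz = sum(z * z for z in Z)                      # ZZ'
    b = sum(s * z for s, z in zip(S, Z)) / zz       # b = (SZ')(ZZ')^-1, eq. (2)
    V = [s - b * z for s, z in zip(S, Z)]           # residual vector V
    return b, V

def sum_sq(v):
    """Squared length of a vector."""
    return sum(x * x for x in v)

S = [4.1, 5.3, 3.8, 6.0, 4.8]      # hypothetical observations
Z = [1.0] * len(S)                 # direction of the general mean

b, V = split_along(S, Z)
mean_ss = b * b * sum_sq(Z)        # n * xbar^2, first term of (4)
resid_ss = sum_sq(V)               # sum of (x_r - xbar)^2, second term of (4)

# Adding a constant moves only the mean component, not the residual.
b_shift, V_shift = split_along([s + 10.0 for s in S], Z)
```

Here `b` is simply the sample mean, the two sums of squares add up to the total, and the residual vector is unchanged by the added constant, which is the practical orthogonality check mentioned in the text.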

More generally, we may not want to isolate each of individual components such as $bZ$ in (1), but to separate the group corresponding to any linearly independent set of p vectors $Z_1, \ldots, Z_p$, which we may denote by a column matrix $Z$ of such vectors. Equations (1), (2), (3), and (6) still hold provided that $b$ is now interpreted as a row matrix of coefficients $b_1, \ldots, b_p$; equation (7) generalizes to

$$E\{b(ZZ')b'\} = \beta(ZZ')\beta' + p\sigma^2. \qquad (8)$$

The observational field of multiple regression, where the p vectors $Z_1, \ldots, Z_p$ are not in general orthogonal, is one example where the component $bZ$ is removed en bloc in this way in the analysis of variance table, but the same principle applies when, say, treatments or blocks are separated in a randomized block layout. The further reduction from the simultaneous set $Z$ to an equivalent set of mutually orthogonal vectors is of course always possible, and in fact becomes relevant to the numerical method of reduction of the equations of estimation for $\beta$, and the evaluation of standard errors. It is made particular use of in the fitting of trends by the method of orthogonal polynomials.

One further remark on the interpretation of $Z$. It often happens that $Z$ is not orthogonal to another vector or group of vectors $W$ corresponding to the other real effects in the data. When this occurs, the effects associated with $Z$ and $W$ are said to be confounded, and if we ignore the effects associated with $W$ they will interfere with the estimation of the effects associated with $Z$. The only way to free the $Z$ effects from $W$ is to consider the components of $Z$ orthogonal to $W$, and in the above equations $Z$ should be so interpreted. It is customary to eliminate the effect of the general mean in this way, and it is an interesting property of randomized blocks, Latin squares, and other familiar orthogonal layouts that the orthogonality property holds after, and not before, such elimination. For example, in a randomized block design, both treatment means and block means are affected by the general mean, and to this extent are confounded; but the elimination of the general mean implies that we can only consider differences of treatment means, or of block means, and these two

* It is usual to assume them independent, but absence of correlation is sufficient here.


sets of differences are orthogonal, so that no further adjustment of treatment means for block means is necessary. The design of experiments, arranged to give orthogonal components corresponding to all, or nearly all, of the effects we are investigating, has by now of course reached an advanced stage (see, for example, Yates (1»). In connection with the present viewpoint, it is instructive for the student to consider what vectors $Z$ will isolate the effects to be estimated, beginning with simple randomized blocks and proceeding to the more complicated factorial experiments.
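The remark that treatment and block vectors become orthogonal only after the general mean has been eliminated can be checked on a toy layout. The sketch below assumes a hypothetical randomized block design with 3 treatments and 2 blocks, one plot per cell; the sizes and function names are illustrative only.

```python
# Sketch: a 3-treatment x 2-block randomized block layout (sizes are illustrative).
from itertools import product

treatments, blocks = 3, 2
cells = list(product(range(treatments), range(blocks)))   # one plot per cell

def indicator(kind, level):
    """1.0 where the plot has the given treatment (kind=0) or block (kind=1)."""
    return [1.0 if cell[kind] == level else 0.0 for cell in cells]

def centre(v):
    """Eliminate the general mean: remove the component along (1, ..., 1)."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

raw_t = [indicator(0, t) for t in range(treatments)]
raw_b = [indicator(1, b) for b in range(blocks)]

# Before elimination every treatment vector overlaps every block vector.
raw_products = [dot(t, b) for t in raw_t for b in raw_b]

# After elimination all treatment-by-block inner products vanish.
centred_products = [dot(centre(t), centre(b)) for t in raw_t for b in raw_b]
```

Each raw inner product equals the single plot shared by a treatment and a block, while every centred product is zero, so differences of treatment means need no adjustment for blocks in this layout.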

The above discussion comprises in shorthand form the structural basis of analysis of variance in quite general terms and includes all forms of multiple regression, least squares and Fourier analysis. It so far covers only problem (i), which I regard as the primary analysis of variance problem. Before any illustrations of these principles, there is, however, a point still to be treated under (i), the question of accuracy and significance tests.

3. Accuracy and significance tests.

We had in § 2 formula (6) for unbiased deviations,

$$E\{b\} = \beta, \qquad (9)$$

and under the assumptions used to obtain (7) or (8), we obtain by taking the expectation of $b'b$, the result

$$E\{(b - \beta)'(b - \beta)\} = \sigma^2 (ZZ')^{-1}, \qquad (10)$$

which gives the variances and covariances of the coefficients $b_1, \ldots, b_p$ in terms of the variance $\sigma^2$. We shall for convenience refer to a sample of data where the observations are unbiased and uncorrelated, and have, apart from the fixed linear effects to be estimated, a constant variance $\sigma^2$, as a homogeneous sample, and problem (i), which is to be illustrated in section II, represents analysis of the above type on such homogeneous data. It is known that the estimates $b$, which are the usual least-squares estimates, have on this assumption minimum variance of any linear combinations of the observations.
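One way to see where (10) comes from, sketched here under the stated assumptions only (uncorrelated observations with common variance $\sigma^2$, so that the covariance matrix of $S$ is $\sigma^2 I$, and $b$ treated as a row vector as in (2)), is the following short derivation. The identity matrix $I$ is notation introduced for this sketch.

```latex
\begin{aligned}
b - \beta &= (S - \beta Z)\,Z'(ZZ')^{-1}, \\
E\{(b-\beta)'(b-\beta)\}
  &= (ZZ')^{-1} Z\, E\{(S-\beta Z)'(S-\beta Z)\}\, Z'(ZZ')^{-1} \\
  &= (ZZ')^{-1} Z\,(\sigma^2 I)\,Z'(ZZ')^{-1}
   = \sigma^2 (ZZ')^{-1}.
\end{aligned}
```

The first line uses $b = (SZ')(ZZ')^{-1}$ together with $\beta ZZ'(ZZ')^{-1} = \beta$; the last step uses the symmetry of $(ZZ')^{-1}$.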

To obtain exact tests of significance, further assumptions are needed. The usual one is that the observations are independent and normally distributed; this ensures (cf. (~)) that the chance position of the vector $S$ has a direction in the sample space randomly orientated, and the random angle it makes with the fixed 'plane' of vectors $Z_1, \ldots, Z_p$ has a known distribution equivalent to Fisher's z distribution. This assumption is a less fundamental one than the one corresponding to the homogeneous sample, but is often a reasonable and


certainly a convenient one to make. It will allow exact confidence or fiducial limits to be assigned to any of the $\beta_i$, and also allows a more complete justification of the method of estimation implied above in terms of modern theories of estimation and tests of significance. If necessary, from the known distribution of the multiple correlation coefficient $R$ between $S$ and the set of vectors $Z$, when the latter are fixed, a distribution depending only on the 'true' coefficient $\rho$, where

$$R^2 = \frac{b(ZZ')b'}{SS'}, \qquad (11)$$

confidence limits can be set for $\rho$, but this requirement in my opinion rarely occurs.

As a variant on the above normal assumption, advantage has been taken of the device of randomization in many statistical experiments (a procedure carried out primarily with the important object of eliminating bias and lack of independence in the observations), to investigate tests of significance independent of the assumption of normality (see, for example, Welch (~)). However, this can hardly be recommended as a routine procedure, and the assumption of normality is the one usually made. It may be noted that transformations of scale to make the assumption of constant variance* more reasonable sometimes have the effect of improving the normality of the distribution.

* Heterogeneous variance is referred to more generally later in this paper.


II. ANALYSIS OF HOMOGENEOUS DATA

4. Two war-time applications (Examples A and B).

In choosing one or two examples to illustrate straightforward analyses of variance of type (i), I have attempted to avoid the more well-worn illustrations, and the related examples A and B, which arose during the war, may be of some interest.

Example A. A certain type of rocket projectile was fired from a cylindrical barrel which was suspected, possibly owing to imperfect construction, of causing a fixed 'barrel deviation' relative to the projector, but could be turned round on its axis and fired in any fixed orientation to investigate this. In a firing trial of this kind, such a re-orientation of the cylindrical projector was made after every four rounds had been fired, and the theoretical basis of the analysis was an analysis of variance between and within such 'orientation' groups of rounds, together with a regression analysis of the mean of each group on two orthogonal unit vectors fixed in the projector perpendicular to its axis and serving as a coordinate system with respect to which the barrel deviation was measured. If these two vectors are denoted by $\mathbf{i}$ and $\mathbf{j}$, and the projector has been rotated a positive angle $\theta$ ($\mathbf{j}$ corresponding to a positive anti-clockwise rotation of 90° from $\mathbf{i}$, and $\mathbf{i}$ initially pointing in the positive left-to-right lateral direction), the components of the unknown barrel deviation $b_1\mathbf{i} + b_2\mathbf{j}$ in the lateral and angle of sight directions will be

$$b_1\cos\theta - b_2\sin\theta, \qquad b_1\sin\theta + b_2\cos\theta,$$

respectively. The formal regression analysis can thus be carried out on two quantities $z_1$ and $z_2$, whose respective values for lateral and angle of sight projectile deviations (which tended to have the same variance) are

                      z1         z2
Lateral              cos θ     -sin θ
Angle of sight       sin θ      cos θ

It is obviously convenient to choose a periodical set of values of θ, such as steps of 45° or 90°, so that $z_1$ and $z_2$ are not only mutually orthogonal but orthogonal to the general mean component. In the simple case of 90° steps, we thus have the values for $z_1$ and $z_2$:

Orientation        Lateral         Angle of sight
                   z1     z2        z1     z2
θ =   0°            1      0         0      1
     90°            0     -1         1      0
    180°           -1      0         0     -1
    270°            0      1        -1      0
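The table of regression values can be regenerated from the two component expressions for any chosen set of orientations. The following is a minimal sketch; the function name `z_values` and the rounding to ten decimals (used only to suppress floating-point noise) are choices made for this illustration.

```python
import math

def z_values(theta_deg):
    """Regression values (z1, z2) for the lateral and angle-of-sight deviations."""
    t = math.radians(theta_deg)
    c, s = round(math.cos(t), 10), round(math.sin(t), 10)
    lateral = (c, -s)          # coefficients of b1, b2 in b1 cos θ - b2 sin θ
    sight = (s, c)             # coefficients of b1, b2 in b1 sin θ + b2 cos θ
    return lateral, sight

angles = [0, 90, 180, 270]
rows = {a: z_values(a) for a in angles}

# Over a full cycle z1 and z2 each sum to zero (orthogonal to the general mean)
# and have zero inner product with each other.
lat_z1 = [rows[a][0][0] for a in angles]
lat_z2 = [rows[a][0][1] for a in angles]
```

Replacing `angles` by steps of 45° gives the corresponding values for the finer set of orientations mentioned in the text.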


In the analysis given below, the order was actually θ = 270°, 90°, 0°, and 180°, and then repeated, the angles separated by 180° being kept as close together as practicable to safeguard against any slow change in the general mean.

The statistical analysis was as follows:--

TABLE I

Lateral deviations (x)

Totals of four rounds (T)*      z1     z2
         -3.24                   0      1
          3.25                   0     -1
          2.71                   1      0
         -3.42                  -1      0
         -3.83                   0      1
          3.18                   0     -1
          1.97                   1      0
         -0.67                  -1      0
Total    -0.05

Σxz1 = 8.77          Σxz2 = -13.50

Σx²   = 23.569**     (Σx)²/32 = 0.000     difference = 23.569
ΣT²/4 = 17.303       (Σx)²/32 = 0.000     difference = 17.303

                    D.F.     S.S.      M.S.
Between groups        7     17.303    2.472
Within groups        24      6.266    0.261
Total                31     23.569    0.760

Angle of sight deviations (y)

Totals of four rounds (T)*      z1     z2
         -2.38                  -1      0
          2.31                   1      0
         -1.63                   0      1
          1.95                   0     -1
         -2.29                  -1      0
          1.55                   1      0
         -2.55                   0      1
          3.19                   0     -1
Total     0.15

Σyz1 = 8.53          Σyz2 = -9.32

Σy²   = 15.793**     (Σy)²/32 = 0.001     difference = 15.792
ΣT²/4 = 10.446       (Σy)²/32 = 0.001     difference = 10.445

                    D.F.     S.S.      M.S.
Between groups        7     10.445    1.493
Within groups        24      5.347    0.223
Total                31     15.792    0.509

* These totals are small since the deviations are already measured from their approximate mean values.

** Obtained from the original set of deviations.

Combined analysis

Σxz1 + Σyz1 =   8.77 + 8.53 =  17.30;    17.30/32 =  0.54
Σxz2 + Σyz2 = -13.50 - 9.32 = -22.82;   -22.82/32 = -0.71

Barrel deviation: 0.54 i - 0.71 j (± 0.09)

                             D.F.     S.S.      M.S.
Barrel deviation               2     25.626    12.813
Remainder between groups      12      2.122     0.177
Within groups                 48     11.613     0.242
  (last two items pooled      60     13.735     0.229)
Total                         62     39.361

Note: The abbreviations D.F., S.S., and M.S. denote as usual degrees of freedom, sum of squares and mean square.
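The combined analysis can be re-derived from the group totals alone. The sketch below is an illustrative recomputation under two assumptions: the totals and regression values are as read from Table I, in the firing order 270°, 90°, 0°, 180° (repeated), and the within-groups sums of squares (6.266 and 5.347) are taken from the table, since the individual rounds are not listed. Variable names are invented for the sketch.

```python
import math

# Group totals (four rounds each) in firing order θ = 270°, 90°, 0°, 180°, repeated.
lateral = [-3.24, 3.25, 2.71, -3.42, -3.83, 3.18, 1.97, -0.67]
sight = [-2.38, 2.31, -1.63, 1.95, -2.29, 1.55, -2.55, 3.19]
# (z1, z2) for each group, from the 90°-step table of regression values.
z_lat = [(0, 1), (0, -1), (1, 0), (-1, 0)] * 2
z_sight = [(-1, 0), (1, 0), (0, 1), (0, -1)] * 2

ROUNDS_PER_GROUP, ROUNDS = 4, 32

def products(totals, z):
    """Sums of products of the observations with z1 and with z2."""
    return (sum(t * a for t, (a, _) in zip(totals, z)),
            sum(t * b for t, (_, b) in zip(totals, z)))

def between_groups_ss(totals):
    """Sum of T^2/4 less the correction for the general mean."""
    return sum(t * t for t in totals) / ROUNDS_PER_GROUP - sum(totals) ** 2 / ROUNDS

xz1, xz2 = products(lateral, z_lat)
yz1, yz2 = products(sight, z_sight)

# Each z is ±1 on 16 lateral rounds and 16 angle-of-sight rounds, so sum z^2 = 32.
b1 = (xz1 + yz1) / 32
b2 = (xz2 + yz2) / 32
barrel_ss = ((xz1 + yz1) ** 2 + (xz2 + yz2) ** 2) / 32

between_ss = between_groups_ss(lateral) + between_groups_ss(sight)
remainder_ss = between_ss - barrel_ss
within_ss = 6.266 + 5.347            # within-groups S.S., taken from Table I
pooled_ms = (remainder_ss + within_ss) / (12 + 48)
se_b = math.sqrt(pooled_ms / 32)     # standard error of each of b1, b2
```

On these figures the barrel deviation comes out close to 0.54 i - 0.71 j with a sum of squares near 25.63, and the pooled residual mean square is about 0.229.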


It will be seen that the barrel deviation accounts for the significant variation, and the remainder term between groups was pooled with the 'within groups' term to obtain the residual 'dispersion', with estimated variance 0.229.

Example B.. -In cOlY1.ection wi th the investige.tion illus­trated by Ex~nple A, some direct measurements were made of

. il,11jerfectio:~lS in the construction of the cylindrical projec­tor; with reference to an axis of rotation which could bedefined in relation to the cylinder, external gauge measure­ments. were made at 450 angular intervals at each of 8 eaui­di·stant sections along the cylinder. A Fourier analysis ofthe readings at each section was then taken up to the 26 terms,according to the scheme of coefficients (0(., sa.y):--

Coefficients 0::.

Angle 0 III &{. = 1 Sine Oos e Sin8e Oos 26

I 00 !l 1 0', ,

1 \ 0,

1II i

II 450 1 0.707\ 0.707 1 0I 900 II 1 1 . 0 0 -1I 1350 I.

1 0.707 -0.707 -1 I 0I

'II,

1800

"1 I 0 -1' 0 1

I 2850 'I 1 I -0.707 -0.707 1 I 0I 2700 I, 1 , -1 0 I 0 I -1I 3150 II 1 -0.707 , 0.707 I -1 0I

,I IZd- 2 P 8 4 4 ! 4 I 4I

I j! I Ii I

These may be shown to determine approximately the mean radius (from an artificial zero), the two components of the shift of the centre relative to the axis of rotation taken, and the two components of 'ovality' of the section. Variation of these constants along the cylinder (no single section being of any special interest) was then summarized with the help of the orthogonal sets of coefficients (cf. Fisher and Yates, (:I)) up to the second order:--

Coefficients β

Section          1    2    3    4    5    6    7    8     Σβ²
Mean (β = 1)     1    1    1    1    1    1    1    1       8
Linear           7    5    3    1   -1   -3   -5   -7     168
Parabolic        7    1   -3   -5   -5   -3    1    7     168

* The laboratory measurements were of an exploratory character, and internal measurements on the cylinder would have been preferable. They are used here primarily as an illustration of the statistical method of analysis.


The vectors Z were then the complete sets of combinations of α × β, making 5 × 3 = 15 orthogonal sets of coefficients in all. It is not certain that the remainder term, which was taken for the error variance, did not contain further systematic items, but it seemed reasonable to assume that the most important ones had been isolated. For such an investigation the resulting analysis of variance, with the omission of the general mean term, is given below (it does not refer to the same projector as the data quoted in Example A).
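The construction of the 15 vectors can be sketched directly: each is the product of one angular coefficient set (α) with one sectional set (β), laid out over the 8 × 8 = 64 readings. In the sketch below the dictionary keys and the ordering of the 64 readings (angle varying fastest within each section) are arbitrary choices for the illustration.

```python
import math
from itertools import combinations

angles = [45 * k for k in range(8)]                      # 0°, 45°, ..., 315°
alpha = {
    "mean": [1.0] * 8,
    "sin": [math.sin(math.radians(a)) for a in angles],
    "cos": [math.cos(math.radians(a)) for a in angles],
    "sin2": [math.sin(math.radians(2 * a)) for a in angles],
    "cos2": [math.cos(math.radians(2 * a)) for a in angles],
}
beta = {
    "mean": [1, 1, 1, 1, 1, 1, 1, 1],
    "linear": [7, 5, 3, 1, -1, -3, -5, -7],
    "parabolic": [7, 1, -3, -5, -5, -3, 1, 7],
}

# One vector of 64 coefficients per (alpha, beta) pair:
# entry = alpha(angle) * beta(section), angle varying fastest.
vectors = {
    (an, bn): [a * b for b in bv for a in av]
    for an, av in alpha.items()
    for bn, bv in beta.items()
}

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Largest inner product between any two distinct vectors (should be ~0).
largest_cross_product = max(abs(dot(vectors[j], vectors[k]))
                            for j, k in combinations(vectors, 2))
sum_sq_alpha = {name: round(dot(v, v), 6) for name, v in alpha.items()}
sum_sq_beta = {name: dot(v, v) for name, v in beta.items()}
```

The pair ("mean", "mean") is the general mean vector; the remaining 14 correspond to the single-degree-of-freedom items isolated in the analysis.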

TABLE II

                                                  D.F.      M.S.
Change in radius      Linear                        1       224.3
along cylinder        Parabolic                     1         9.2

Shift of centre       Mean         Sine             1      3042.0
along cylinder                     Cosine           1        21.1
                      Linear       Sine             1        14.1
                                   Cosine           1       400.5
                      Parabolic    Sine             1       135.0
                                   Cosine           1        14.0

Ovality along         Mean         Sine             1        15.1
cylinder                           Cosine           1        47.5
                      Linear       Sine             1         3.7
                                   Cosine           1         1.3
                      Parabolic    Sine             1         7.7
                                   Cosine           1        45.6

Remainder term                                     49        16.4

The mean shift of the centre, and its linear trend along the cylinder, define the axis of rotation taken; apart from these terms, it will be seen that there is a significant linear change in radius, and a parabolic shift of the centre in a plane at right angles to the axis and the initial angular direction θ = 0.

5. Example C.

To stress the fundamental assumption of a homogeneous sample it may be worth quoting from a paper by Cochran, Autrey and Cannon (~) one further example from a more familiar field of experimentation. For full details reference should be made to the original paper, but briefly the experiment was a dairy cattle feeding trial of the change-over type involving three


types of rations (denoted here by X, Y, and Z), and three consecutive periods I, II, and III. Groups of three cows received rations according to the following scheme:--

               Cow
Period      3     2     1
  I         X     Y     Z
  II        Y     Z     X
  III       Z     X     Y

The analysis of variance table of total ration consumption per cow per period for five such groups of these cows, the groups being chosen on the basis of expected similar yielding ability within a group, was (Table 3 of the paper referred to):--

TABLE III

Total consumption of digestible nutrients per cow per period

                                   D.F.      S.S.        M.S.
Between groups                       4     157,936     39,484
Between cows within groups          10     105,336     10,534
Between periods within groups       10      40,534      4,053
Between rations                      2     533,869    266,934
Ration x group interactions          8      12,021      1,503
Remainder                           10      14,100      1,410
  (last two items pooled            18      26,121      1,451)
Total                               44     863,796

The relevant point here (noted by the authors) is that there is no evidence of a ration x group interaction, so that it is reasonable to pool this term with the remainder to form a homogeneous error term against which the other items may be tested. If, however, the interactions term had been significant, my later comments (Part III) on the analysis of heterogeneous data would become relevant, and the interpretation less simple.
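The pooling step is simple arithmetic on the table entries, and can be sketched as below. The degrees of freedom and sums of squares are as read from Table III; the dictionary layout and names are illustrative, and no significance levels are computed.

```python
# (D.F., S.S.) for each line of Table III, as read from the table.
table_iii = {
    "between groups": (4, 157_936),
    "between cows within groups": (10, 105_336),
    "between periods within groups": (10, 40_534),
    "between rations": (2, 533_869),
    "ration x group interactions": (8, 12_021),
    "remainder": (10, 14_100),
}

def mean_square(df, ss):
    return ss / df

# Ratio of the interaction mean square to the remainder (close to 1: no evidence
# of interaction on these figures).
interaction_ratio = (mean_square(*table_iii["ration x group interactions"])
                     / mean_square(*table_iii["remainder"]))

# Pool the interaction with the remainder to form the error term.
pooled_df = table_iii["ration x group interactions"][0] + table_iii["remainder"][0]
pooled_ss = table_iii["ration x group interactions"][1] + table_iii["remainder"][1]
error_ms = mean_square(pooled_df, pooled_ss)

# Variance ratio of each remaining item against the pooled error.
tested = ["between groups", "between cows within groups",
          "between periods within groups", "between rations"]
variance_ratio = {name: mean_square(*table_iii[name]) / error_ms for name in tested}

total_df = sum(df for df, _ in table_iii.values())
total_ss = sum(ss for _, ss in table_iii.values())
```

The pooled error has 18 degrees of freedom and a mean square of about 1,451, and the six lines add up to the total of 44 degrees of freedom and 863,796.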

In the case of the milk yields, an interesting feature of the analysis was the detection of carry-over effects of the rations between the periods. Such carry-over effects, when allowed for, are not orthogonal to the direct ration effects in the above design, and in view of this confounding,


the problems arise of the significance of the carry-over effects freed from direct effects and the estimation of the direct effects freed from the carry-over effects. For details the reader is referred to the original paper; the general principles involved are covered by my remarks on confounding towards the end of § 2.


III. ANALYSIS OF HETEROGENEOUS DATA

6. Residual heterogeneity. Example D.

Once we depart from the assumption of 'homogeneity' as defined in I, a miscellany of situations may confront us, and the practical statistician has often to treat any specific problem on its own merits. Some definitions and results may, however, be useful. First of all, it is possible that although we know the minimum error variance realisable in our material, the latter manifests a residual heterogeneity which inflates this minimum variance. Such cases can still be treated by the methods of I and II, by treating this basic heterogeneity as the level of variability appropriate to our assumption of 'homogeneity'.

As an example (Example D), consider the analysis of variance recorded in Table IV of (Q). This represents an analysis of estimated variances, first transformed to the log scale to render their variability independent of the mean. The data consisted of estimates for each of three groups on 15 days, and the analysis of variance (for the $\log_{10}$ s.d.'s) was:--

TABLE IV

                          D.F.     S.S.       M.S.
Groups                      2     0.2667    0.1333
Days                       14     0.1047    0.00748
Residual                   28     0.1005    0.00359
Theoretical variance                        0.0020
Total                      44     0.4719

The theoretical variance given corresponds to that expected from the logarithms (to base 10) of standard deviations each based on 48 d.f., but while the residual variance is significantly higher, it may be used as the basic variance with which to compare the other two items, both of which are significant.
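The paper does not say how the theoretical value was obtained. One plausible route, offered here purely as an assumption, is the large-sample approximation var(ln s) ≈ 1/(2f) for a standard deviation s on f degrees of freedom from normal data, converted to base-10 logarithms. The function name below is invented for the sketch.

```python
import math

def var_log10_sd(df):
    """Approximate variance of log10(s) for a standard deviation s on df d.f.

    Assumes normal data and the large-sample result var(ln s) ≈ 1 / (2 df);
    this formula is an assumption of the sketch, not quoted from the paper.
    """
    return (math.log10(math.e) ** 2) / (2 * df)

theoretical = var_log10_sd(48)
residual_ms = 0.00359                # residual mean square from Table IV
ratio = residual_ms / theoretical    # how far the residual exceeds the theoretical value
```

Under this approximation the value for 48 d.f. comes out close to the tabulated 0.0020, and the residual mean square is a little under twice as large.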


7. 'Vertical' heterogeneity. Example E.

The further point which arises, however, goes beyond the methods of I and II, viz., if we consider the variation from day to day not for these particular 15 days, but as information on the day-to-day variance, what estimate is to be made? This assumption of random superposed effects giving rise to an extra variance component is what I have called problem (ii). I shall characterize it by what I shall call "vertical heterogeneity", owing to this feature of variance components piling up on each other in the different items of the analysis of variance table. If we look back at formula (8), we shall now make the rather special assumption that the $Z$ form a group for which the corresponding coefficients $\beta$ are not fixed, but vary in an uncorrelated and unbiased manner about zero with the same constant variance $\sigma_\beta^2$. Averaging equation (8) for variation in the $\beta$, we then obtain the general formula

$$\sigma_\beta^2\,(\mathrm{trace}\ ZZ') + p\sigma^2, \qquad (12)$$

where trace $ZZ'$ denotes the sum of the diagonal elements of $ZZ'$. Thus if we had n observations in p + 1 sets, and form an orthogonal analysis of variance table between and within sets, we have for the first vector of $Z$ (after elimination of the general mean)

$$Z_1 = \Bigl(\underbrace{1 - \tfrac{n_1}{n}, \ldots, 1 - \tfrac{n_1}{n}}_{n_1\ \text{times}},\ -\tfrac{n_1}{n}, \ldots, -\tfrac{n_1}{n}\Bigr)$$

and

$$\mathrm{trace}\ ZZ' = \sum_i \Bigl\{ n_i\bigl(1 - \tfrac{n_i}{n}\bigr)^2 + (n - n_i)\tfrac{n_i^2}{n^2} \Bigr\} = n - \sum_i \frac{n_i^2}{n}, \qquad (13)$$

in agreement with a result given by Winsor and Clarke (11). When all the $n_i$ are equal to q, this becomes pq, and (12) becomes

$$p\,[\,q\sigma_\beta^2 + \sigma^2\,],$$

a familiar result which may be used to estimate the 'days' variance in Table IV above. The occurrence of the groups term is irrelevant, owing to the orthogonality of the design when the number of observations in each cell is constant (in this case, 1). We obtain the estimates:--

$$3\sigma_\beta^2 + \sigma^2 \sim 0.00748, \qquad \sigma^2 \sim 0.00359, \qquad \sigma_\beta^2 \sim 0.00130.$$
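Both the trace formula and the 'days' estimate can be checked numerically. The sketch below builds the mean-eliminated indicator vector of each set explicitly, compares the sum of their squared lengths with the closed form $n - \sum n_i^2/n$, and then applies the equal-numbers result to the Table IV mean squares. The function names and the unequal set sizes used in the check are illustrative assumptions.

```python
def centred_indicator(start, size, n):
    """Indicator of one set of `size` observations with the general mean removed."""
    return [(1.0 if start <= r < start + size else 0.0) - size / n for r in range(n)]

def trace_zz(sizes):
    """Sum of squared lengths of the centred set indicators (trace ZZ')."""
    n, start, total = sum(sizes), 0, 0.0
    for size in sizes:
        z = centred_indicator(start, size, n)
        total += sum(v * v for v in z)
        start += size
    return total

def closed_form(sizes):
    """n minus the sum of n_i^2 / n."""
    n = sum(sizes)
    return n - sum(s * s for s in sizes) / n

# Table IV: 15 days (p = 14) with q = 3 observations per day.
q, ms_days, ms_residual = 3, 0.00748, 0.00359
var_days = (ms_days - ms_residual) / q       # estimate of sigma_beta^2
ratio = var_days / ms_residual               # estimate of sigma_beta^2 / sigma^2
```

With 15 equal sets of 3 the trace is pq = 42, and the estimates agree with the values 0.00130 and 0.36 quoted in the text to the accuracy shown.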


As far as I am aware, no solution for exact confidence limits (in Neyman's sense) for $\sigma_\beta^2$ is known, but in many cases we can obtain limits for $\sigma_\beta^2/\sigma^2$ if this is of interest. For our basic analysis was represented by

$$S = (b_e + \beta)Z + V, \qquad (14)$$

where for convenience the essentially random or error part $b_e Z$ of $bZ$ is separated from the part $\beta Z$. When $\beta$ is variable, the independence of $\beta Z$ from $b_e Z$ implies that the 'analysis of variance' term $b(ZZ')b'$ cannot be proportional to a $\chi^2$ (with p d.f.) unless $\beta(ZZ')\beta'$ is, but this condition is more stringent than the one leading to (12), and is in general not satisfied even in the simple case of p + 1 sets unless the numbers in the sets are equal. When the condition is satisfied, we can immediately obtain confidence limits for the ratio of the expectation of $b(ZZ')b'$ to that of $VV'$, and hence for $\sigma_\beta^2/\sigma^2$. (Wald has noted that even in the case of unequal class frequencies it is still theoretically possible to set up expressions leading to confidence limits; the solution is, however, much more involved.) We find for the above example, using the Fisher and Yates tables (~) of the variance ratio, $\sigma_\beta^2/\sigma^2 \sim 0.36$, with upper and lower 5% confidence limits 1.28 and 0.00.
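The step from the variance-ratio tables to limits for $\sigma_\beta^2/\sigma^2$ is not written out in the text. A sketch of one standard argument, assuming equal numbers q in each set, normality, and writing $\nu$ for the residual degrees of freedom and $\theta$ for the ratio (both symbols introduced here), is:

```latex
\begin{aligned}
\theta &= \sigma_\beta^2 / \sigma^2, \qquad
F = \frac{b(ZZ')b'/p}{VV'/\nu}, \qquad
\frac{F}{1 + q\theta} \sim F_{p,\nu}, \\
\theta_{\text{lower}} &= \max\Bigl\{0,\ \tfrac{1}{q}\Bigl(\frac{F}{F_{p,\nu}^{\,\text{upper}}} - 1\Bigr)\Bigr\}, \qquad
\theta_{\text{upper}} = \tfrac{1}{q}\Bigl(\frac{F}{F_{p,\nu}^{\,\text{lower}}} - 1\Bigr),
\end{aligned}
```

where $F_{p,\nu}^{\,\text{upper}}$ and $F_{p,\nu}^{\,\text{lower}}$ denote the tabulated upper and lower percentage points of the variance-ratio distribution. For Table IV this would be applied with p = 14, $\nu$ = 28, q = 3 and an observed ratio F = 0.00748/0.00359, about 2.08.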

In particular, where means of equal numbers of observations are built up from means of smaller equal groups, and we suppose the extra variations of each further contribution to heterogeneity to be normal, this allows confidence limits to be readily obtained for the ratios of each extra variance to the total variance of the immediately preceding item. This is, however, not quite what is required, since the preceding item may itself have a composite variance; in view of the frequent occurrence of this situation in sampling investigations, some further consideration of this problem would be useful. An exact solution which appears to me theoretically feasible is in terms of the confidence region for the simultaneous true ratios of successive items, obtainable from the known simultaneous distribution of the observed ratios, and leading to a corresponding confidence region for the ratios of the component variances.

As a very simple illustration (Example E) of this type of sampling investigation, the following analysis of variance (?) represents the analysis of % dry matter in subsamples of fresh herbage, two subsamples being taken for chemical analysis from each sample, and in turn two samples taken from each plot.


TABLE V

                         D.F.      M.S.
Among plots                5      6.4944
Between samples           12      1.1277
Between subsamples        24      0.2852

If we denote the component variances between subsamples, samples and plots by $\sigma^2$, $\sigma_s^2$, $\sigma_p^2$ respectively, the three lines in the above analysis correspond to $4\sigma_p^2 + 2\sigma_s^2 + \sigma^2$, $2\sigma_s^2 + \sigma^2$, $\sigma^2$, whence we obtain the estimates:--

$$\sigma_p^2 \sim 1.342, \qquad \sigma_s^2 \sim 0.422, \qquad \sigma^2 \sim 0.285.$$
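The successive subtraction used to obtain these estimates can be sketched as a small function for any balanced nested layout. The function name and argument layout are illustrative; the mean squares are those of Table V, and the multipliers 2 and 4 are the coefficients of $\sigma_s^2$ and $\sigma_p^2$ in the expectations stated above.

```python
def nested_components(mean_squares, multipliers):
    """Estimate variance components from a balanced nested analysis.

    mean_squares: M.S. from the lowest level upwards.
    multipliers:  coefficient of each higher level's own component in its M.S.
    """
    estimates = [mean_squares[0]]
    for lower, upper, k in zip(mean_squares, mean_squares[1:], multipliers):
        # Each level's M.S. exceeds the one below by k times its own component.
        estimates.append((upper - lower) / k)
    return estimates

# Table V: subsamples, samples, plots.
var_sub, var_sample, var_plot = nested_components([0.2852, 1.1277, 6.4944], [2, 4])
```

The results agree with the printed estimates to within about 0.001.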

8. Use of interaction terms in tests of signifiance.

When 'vertical' heterogeneity is present, the furtherquestion arises as to how ge:1uine linear effects of a, tynestill classifiable under (i) are to "lie tested. Formally, inthe common si tuation where the design is orthogonal wi th asingle obsexvation (or equal numbers) in eachinj. tial category,there is nQ theoretical difficulty, for while the pooling ofthe individual observations into new composite unitslilay .introduce extra variability, this does not necessarily invali­date any further analysis carried out on such units. For suchanalysis the ex-:cra variability may beoome quite like residualheterogenei ty (~6). .' '

.f

Thus in Example E, where the five degrees of freedom for plots actually represented the error term corresponding to samples from the first two blocks of a randomized block layout with six treatments and five blocks, the composite character of the plot variance would not affect a straightforward analysis of variance of the plot means into treatments, blocks and interaction or error term if samples were available from all the plots. A similar situation arises with experimental layouts containing main and subplots, when it is well-known that the main-plot treatments should be tested against their interaction with blocks, and not against the sub-plot error variance.
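As a rough illustration of such an analysis of plot means, the following Python sketch splits a treatments-by-blocks table into treatments, blocks and interaction, and forms the ratio of the treatment mean square to the interaction mean square. The function name and the small data table are hypothetical and serve only to show the arithmetic; they are not taken from Example E.

```python
def randomized_block_anova(plot_means):
    """Analyse a table of plot means, plot_means[t][b], one value per
    treatment t and block b, into treatments, blocks and interaction.

    Each entry is assumed to be the mean of an equal number of samples,
    so the sampling and subsampling variation is already absorbed into
    the plot-to-plot (interaction) term.
    """
    t = len(plot_means)
    b = len(plot_means[0])
    grand = sum(sum(row) for row in plot_means) / (t * b)
    treat_means = [sum(row) / b for row in plot_means]
    block_means = [sum(plot_means[i][j] for i in range(t)) / t for j in range(b)]

    ss_total = sum((x - grand) ** 2 for row in plot_means for x in row)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_inter = ss_total - ss_treat - ss_block  # treatments x blocks

    df_treat, df_block = t - 1, b - 1
    df_inter = df_treat * df_block
    ms_treat = ss_treat / df_treat
    ms_inter = ss_inter / df_inter
    return {
        "df": (df_treat, df_block, df_inter),
        "ms_treat": ms_treat,
        "ms_block": ss_block / df_block,
        "ms_inter": ms_inter,
        # Treatments are judged against their interaction with blocks,
        # not against the within-plot sampling variance.
        "variance_ratio": ms_treat / ms_inter,
    }


# Hypothetical plot means: 3 treatments (rows) by 4 blocks (columns).
example = [
    [20.1, 21.4, 19.8, 22.0],
    [23.5, 24.9, 22.7, 25.1],
    [21.0, 22.8, 20.5, 22.9],
]
result = randomized_block_anova(example)
print(result["df"], round(result["variance_ratio"], 2))
```

The variance ratio would then be referred to the usual tables with the treatment and interaction degrees of freedom, subject to the homogeneity assumptions discussed in this section.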

But the onus must always be on the practical statistician to consider to what extent this switching of the assumptions of a homogeneous set of data required in I and II to this higher level of variability is likely to be justifiable. When the data consists of a number of similar replicated field experiments carried out with the same treatments or varieties over a number of different sites, or over more than one year, or both, and the interaction of treatments with years (or sites) is significant, the question invariably arises: 'As the test of treatments against the experimental error variance can only test the significance of their average effect over the particular years (or sites) taken, and there is evidence of real differences in the behavior of the treatments in the different years (or on the different sites), is it legitimate to test the treatments against the relevant interaction term to judge its significance in comparison with these additional variations?'

In answer, it has been stressed by various writers, in particular by Yates and Cochran (8), that considerable caution is advisable in such analysis of multiple experiments. Thus if the treatments are themselves highly individual in their effect, their interactions with site or year may also be, and the assumption of a homogeneous interaction term for all treatments may not be justifiable. Yates and Cochran give a detailed discussion of a multiple varietal experiment of this type.

It is perhaps worth noting that such breakdown of the more routine type of analysis of variance is in no sense peculiar to these larger multiple experiments. Their greater range in place or time renders the basic homogeneity assumptions we require for a valid analysis more problematic, but even in a single experiment these assumptions may occasionally be untenable. When in a randomized block layout we test treatments against the treatments x block interaction, we assume that the latter term is homogeneous; if one treatment difference accounts for the greater part of the treatment variations, its interaction with blocks might be a real effect which is not homogeneous with the block interaction of the other treatments, and the situation is exactly similar to that envisaged in the multiple experiment.

9. 'Horizontal' heterogeneity, and other complications.

Yates and Cochran in their paper also discuss the analysis of a second multiple experiment in which the basic error variances of the individual experiments are not homogeneous. This is another breakdown of the homogeneity assumption which is more likely to occur in an experiment extending over several sites, but again may occasionally occur between blocks of a single experiment sufficiently to invalidate the routine analysis (see (3) p. 141). By analogy with my definition of 'vertical' heterogeneity, I shall for convenience refer to such heterogeneity in variance between parallel groups or blocks as 'horizontal' heterogeneity.

If small, this type of heterogeneity may be neglected to a first approximation, especially for orthogonal designs where in any average estimates equal numbers of observations with such different variances occur. But if appreciable, we encounter in any exact analysis problems of statistical inference of considerable complexity. They constitute a different class from what I have labelled problem (ii), and I do not propose to do more than note their existence. If, for instance, we require the mean treatment effect $y$ over a number of particular sites, for each of which the treatment effect is estimated to be $x_i$, say, with error variance $\sigma_i^2$ estimated by $s_i^2$, the problem of giving exact confidence limits for $y$ from $\bar{x}$, whose unknown error variance involves $\sum \sigma_i^2$, is equivalent to the one recently discussed by Welch (10). If, however, we assume that there is no interaction with sites, the most efficient estimate of the mean treatment effect is not the straightforward arithmetic mean, and the correct equation of estimation is still not even entirely agreed on. If there is an interaction, and we require the arithmetic mean estimate of the average effect over all possible sites, assuming the particular sites chosen to constitute a random sample of these, we have yet a third problem in this class.
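The contrast between the arithmetic mean and a more efficient estimate can be sketched in Python for the idealized case in which the site variances are treated as known. This is an assumption made purely for illustration: the weighting by reciprocal variances shown here is the textbook known-variance result, and it sidesteps exactly the difficulty raised above, namely that in practice only the estimates $s_i^2$ are available. The function names and site figures are hypothetical.

```python
def arithmetic_mean_estimate(x, var):
    """Unweighted mean of the site estimates and its error variance,
    sum(var_i) / n^2."""
    n = len(x)
    return sum(x) / n, sum(var) / n ** 2


def weighted_mean_estimate(x, var):
    """Mean weighted by reciprocal variances, treating the variances as
    known, with error variance 1 / sum(1 / var_i)."""
    weights = [1.0 / v for v in var]
    total = sum(weights)
    estimate = sum(w * xi for w, xi in zip(weights, x)) / total
    return estimate, 1.0 / total


# Hypothetical treatment effects at four sites, with unequal error variances.
site_effects = [2.1, 3.4, 1.8, 2.9]
site_variances = [0.10, 0.90, 0.15, 0.60]

plain, plain_var = arithmetic_mean_estimate(site_effects, site_variances)
weighted, weighted_var = weighted_mean_estimate(site_effects, site_variances)
print(f"arithmetic: {plain:.3f} (variance {plain_var:.4f})")
print(f"weighted:   {weighted:.3f} (variance {weighted_var:.4f})")
```

With known variances the weighted estimate never has the larger error variance; when the weights are themselves estimated from few degrees of freedom this advantage is no longer guaranteed.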

Other complications like non-orthogonality are becoming better known in the analysis of variance of type (i), but become even more complex when heterogeneity is present. It was in fact noted in §7 that unequal class numbers invalidate any simple extension of the hypothesis of homogeneity to allow for 'vertical' heterogeneity, even in the orthogonal case of a single classification into sets, so that the practical difficulties in analyzing more complicated cases of unequal class numbers are obvious.

While drawing attention to these various complications and difficulties, I should not want to close without a reminder of the great simplicity and value of the basic technique discussed in I and II. The "exceptions" do not eliminate the "rule".

See also Cochran, W. G., J. Roy. Stat. Soc. (Suppl.), 4 (1937), 102.

References

(1) Bartlett, M. S., "The vector representation of a sample", Proc. Camb. Phil. Soc., 30 (1936), 327.

(2) Bartlett, M. S., and Greenhill, A. W., "The relative importance of plot variation and of field and laboratory sampling errors in small plot pasture productivity experiments", J. Agric. Sci., 26 (1936), 258.

(3) Bartlett, M. S., "Some examples of statistical methods of research in agriculture and applied biology", J. Roy. Stat. Soc. (Suppl.), 4 (1937), 137.

(4) Bartlett, M. S., and Kendall, D. G., "The statistical analysis of variance heterogeneity and the logarithmic transformation", J. Roy. Stat. Soc. (Suppl.), 8 (1946).

(5) Cochran, W. G., Autrey, K. M., and Cannon, C. Y., "A double change-over design for dairy cattle feeding experiments", J. Dairy Sci., 24 (1941), 937.

(6) Fisher, R. A., and Yates, F., "Statistical Tables for Biological, Agricultural and Medical Research" (Oliver and Boyd, Edinburgh, 1938).

(7) Yates, F., "The design and analysis of factorial experiments", Imperial Bureau of Soil Science, Techn. Bull. No. 35 (Harpenden, England, 1937).

(8) Yates, F., and Cochran, W. G., "The analysis of groups of experiments", J. Agric. Sci., 28 (1938), 556.

(9) Welch, B. L., "On the z-test in randomized blocks and Latin squares", Biometrika, 29 (1937), 21.

(10) Welch, B. L., "The generalization of 'Student's' problem when several different population variances are involved", Biometrika, 34 (1946), ...

(11) Winsor, C. P., and Clarke, G. L., "A statistical study of variation in the catch of plankton nets", Sears Foundation, J. Marine Research, 3 (1940), 1.