Mardia (1970)

12
Biometrika (1970), 57, 3, p. 519 519 Printed in Great Britain Measures of multivariate skewness and kurtosis with applications BY K. V. MARDIA University of Hull STJMMABY Measures of multivariate skewness and kurtosis are developed by extending certain studies on robustness of the t statistic. These measures are shown to possess desirable properties. The asymptotic distributions of the measures for samples from a multivariate normal population are derived and a test of multivariate normality is proposed. The effect of nonnormality on the size of the one-sample Hotelling's T* test is studied empirically with the help of these measures, and it is found that Hotelling's T* test is more sensitive to the measure of skewness than to the measure of kurtosis. 1. INTRODUCTION A popular measure of univariate skewness is p x and of univariate kurtosis is /? 8 . These measures have proved useful (i) in selecting a member of a family such as from the Karl Pearson family, (ii) in developing a test of normality, and (iii) in investigating the robustness of the standard normal theory procedures. The role of the tests of normality in modern statistics has recently been summarized by Shapiro & Wilk (1965). With these applications in mind for the multivariate situations, we propose measures of multivariate skewness and kurtosis. These measures of skewness and kurtosis are developed naturally by extending certain aspects of some robustness studies for the t statistic which involve fi x and y? 2 . It should be noted that measures of multivariate dispersion have been available for quite some time (Wilks, 1932, 1960; Hotelling, 1951). We deal with the measure of skewness in § 2 and with the measure of kurtosis in § 3. In §4 we give two important applications of these measures, namely, a test of multivariate normality and a study of the effect of nonnormality on the size of the one-sample Hotelling's T* test. Both of these problems have attracted attention recently. The first problem has been treated by Wagle (1968) and Day (1969) and the second by Arnold (1964), but our approach differs from theirs. 2. A MEASUEB OF SKEWNESS 2-1. Notation and intuitive development Let the mean and variance of a random sample of size n from a nonnormal population be % and S 2 , respectively. Let K r denote the rth cumulant for the population and write 72 P% 3- Kendall & Stuart (1967, p. 466) have investigated the effect of nonnormality on the one-sample t test with the help of the asymptotic result 2 i. (2-1) 33-2 at University of Auckland on September 28, 2010 biomet.oxfordjournals.org Downloaded from

Transcript of Mardia (1970)

Page 1: Mardia (1970)

Biometrika (1970), 57, 3, p. 519 5 1 9

Printed in Great Britain

Measures of multivariate skewness and kurtosiswith applications

BY K. V. MARDIA

University of Hull

STJMMABY

Measures of multivariate skewness and kurtosis are developed by extending certainstudies on robustness of the t statistic. These measures are shown to possess desirableproperties. The asymptotic distributions of the measures for samples from a multivariatenormal population are derived and a test of multivariate normality is proposed. The effectof nonnormality on the size of the one-sample Hotelling's T* test is studied empirically withthe help of these measures, and it is found that Hotelling's T* test is more sensitive to themeasure of skewness than to the measure of kurtosis.

1. INTRODUCTION

A popular measure of univariate skewness is px and of univariate kurtosis is /?8. Thesemeasures have proved useful (i) in selecting a member of a family such as from the KarlPearson family, (ii) in developing a test of normality, and (iii) in investigating the robustnessof the standard normal theory procedures. The role of the tests of normality in modernstatistics has recently been summarized by Shapiro & Wilk (1965).

With these applications in mind for the multivariate situations, we propose measures ofmultivariate skewness and kurtosis. These measures of skewness and kurtosis are developednaturally by extending certain aspects of some robustness studies for the t statistic whichinvolve fix and y?2. I t should be noted that measures of multivariate dispersion have beenavailable for quite some time (Wilks, 1932, 1960; Hotelling, 1951).

We deal with the measure of skewness in § 2 and with the measure of kurtosis in § 3. In§4 we give two important applications of these measures, namely, a test of multivariatenormality and a study of the effect of nonnormality on the size of the one-sample Hotelling'sT* test. Both of these problems have attracted attention recently. The first problemhas been treated by Wagle (1968) and Day (1969) and the second by Arnold (1964), butour approach differs from theirs.

2. A MEASUEB OF SKEWNESS

2-1. Notation and intuitive development

Let the mean and variance of a random sample of size n from a nonnormal populationbe % and S2, respectively. Let Kr denote the rth cumulant for the population and write72 — P% — 3- Kendall & Stuart (1967, p. 466) have investigated the effect of nonnormality onthe one-sample t test with the help of the asymptotic result

2 i. (2-1)33-2

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

bmag011
Line
bmag011
Line
Page 2: Mardia (1970)

520 K. V. MABDIA

The result (2-1) is obtained by taking var(JT), var(<S2) and cov(X,£2) to order nr1. If wefurther assume that AC4 is negligible, then (2-1) reduces to

corr(X<S8)-(*&)*• (2-2)

Consequently, under the above assumptions, corr (X, S1) can be regarded as a measure ofunivariate skewness.

This discussion suggests the following extension of fit to the multivariate case. LetXJ = (Xu,..., Xjx) (i = 1,2,..., n) be a random sample of size n from a -variate populationwith random vector X' = (X l f..., Xp). Let X' = (Xlt ...,Xp) and S = {8^} denote the samplemean vector and the covariance matrix respectively. Let jx and £ be the mean vector andthe covariance matrix of X. We assume in this section for simplification that jx = 0 and£ is a nonsingular matrix. A suitable measure of multivariate skewness, denoted by y? p,may be obtained by considering the canonical correlations between X and S under thefollowing assumptions which are analogous to those used in the derivation of (2-2).

(i) The second order moments of X and S are taken to order nr1.(ii) The cumulants of order higher than 3 of X are negligible. Let us write the elements of

S = {Sy} as the vector

U = ( ' - ' i l l •••i&ppt " I S ) " • ) O j j , i •••>"2p> • • • > "p-X,pl >

which has p + q elements, q = \p(p — 1). The first p components of u are the diagonalelements of S while the remaining q components are the elements above the main diagonalof S. Let us denote the covariance matrix of the random vector (X, u) by

[Au A12]LA21 A j '

which means that A n is the covariance matrix of X and so on. The canonical correlationsA1;..., Ap, of X and u are the roots of the determinants! equation (Kendall & Stuart, 1968,p. 305)

|Au A12Aa2 AM — A2I| = 0, (2-3)where I is the identity matrix of order p x p. We can take any function of Af,..., Xp as ameasure of skewness /3hp provided it satisfies a certain invariance property. The most naturalchoice of the function is p

A., = 22^5- (2-4)

The multiplier 2 is introduced so that the value of fihp forp = 1 reduces to (2-1). It will beshown subsequently that /?^p defined by (2-4) possesses the invariance property.

We shall first simplify Px „ for £ = I and then extend fiu p for general £.

2-2. The measure PhJ>for £ = I

From (2-3) and (2-4), we have

Pi,p = 2tr(Au1A12A^1Aai). (2#5)

Since A u = ljn, (2-5) reduces to

Au). (2-6)We have, to order n -1 ,

A,8 = (l/n)diag(2,...,2, !,...,

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 3: Mardia (1970)

Measures of mvltivariate skewness and kurtosis 521

with p 2's and q l's and on substituting for AJJ in (2-6), we have

/?!,„ = »2 £ {cov{Zt,8jk)Y. (2-7)i.i.k-l

Under the assumptions (i) and (ii)of §2-1 and \L = 0, it can be shown that we have to order n - 1

eov(Xt,8ik) = (lln)E(XtX,Xk)

for all i, j and k, so that (2-7) becomes

fiup = S {tf(X<X,XJfc)}». (2-8)

2-3. A general lemma for the extension

The extension of (2-8) for Z will be obtained with the help of a general lemma. The lemmawill also be useful for the measure of multivariate kurtosis treated here.

Let <D(X) be a functional defined on a random vector X with mean vector zero andcovariance matrix I. Let Z be a positive definite matrix. Then there exists a nonsingularmatrix U such that u z u , = L ( 2 .9 )

This implies that the random vector Y denned by

X = UY (2-10)

will obviously have the functional corresponding to O as

<D*(Y) = d>(UY). (2-11)

In view of (2-9) the covariance matrix of Y is 2. However, (2-9) does not define U uniquelyfor a given Z and so 0 * denned by (2-11) may not extend <D uniquely. The following lemmagives a sufficient condition under which <D* will be unique.

LEMMA 1. The functional <D *(Y) is uniquely defined for a given Z if ^(X) is invariant underorthogonal transformations. Furthermore O*(Y) wiU then be invariant under nonsingulartransformations.

Proof. Let V be another matrix satisfying (2-9), i.e.

VZV' = I. (2-12)

For the first part of the lemma, it is sufficient to show that

<D(UY) = O(VY). (2-13)From (2-9) and (2-12) we find that

V = CU, (2-14)

where C is an orthogonal matrix. Under the assumption of the lemma, we have

<D(UY) = <D(X) = <D(CX).From the preceding equation and (2-10) we obtain <D(UY) = O(CUY), which on nuing (2-14)leads to (2-13).

We now prove the invarianoe property. Let

Y1 = AY (2-15)

be a nonsingular transformation and let Zx denote the covariance matrix of Yv We have

(UA-i)Zx(UA-i)' = I.

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 4: Mardia (1970)

522 K. V. MABDIA

Therefore, from (2-11),

which on using (2-11) and (2-16) reduces to

Hence the lemma follows.

2-4. The measure fi^pfor general £ and its properties

We now verify that (2-8) satisfies the condition of Lemma 1, i.e. invariance under theorthogonal transformation Y = CX. Let C = {C^}. Then

Xt= XCHYr.r - l

On substituting for Xi in (2-8) we have, after some simplification,

Kv= S 2 (icHC,\(xCaiC^(icrkC,k)E(YrYaYt)E(Y,Y,Yt.). (2-16)

Since C is an orthogonal matrix, (2-16) becomes

A1>p = S [E(YTYJt)f.r.s.t

Hence filiP ia invariant under orthogonal transformations.We now obtain the extension of filp from §2-3. For a positive definite matrix E, there

exists an orthogonal matrix C suoh thatC'LC = D, (2-17)

where D = d i a g ^ , ...,dp) (dt > 0; t = 1, ...,p). Consequently, we may take U = D~iC'in(2-9) and then, from (2-10), we find

i

r-lLet S - 1 = {cr"'}. On substituting for X{ in (2-8) and using the result

obtained from (2-17), we find after some simplification that /? lfv for the random vector Y

is given by A - = S S <T"'&"'a^E{YrY,Yt)E{Y^Yv). (2-18)r.e.tr.t'.f

Hence, for a random vector X with mean vector \L = (/^, ...,/ip) and covariance matrix Z,

= S 2 0*^0*^?^, (2-19)r.s.t f.f.f

where fi^) = E{(Xr—/iT) (Xa—fta) (Xf—fif)}. From Lemma 1, the measure given by (2-19)will be invariant under the nonsingular transformation

X = AY + b. (2-20)

Hence, the measure at least satisfies the most desirable property.Let us consider some particular cases of (2-19). For p = 1, we, of course, have fl^ x = fiv

For p = 2, let us denote

of = var (Z^, o% = var (Xa), p = corr (Xlt X2), yn =

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 5: Mardia (1970)

Measures of multivariate skevmess and hirtosis 523

where fiT 8 is the central moment of order (r, s) of (X±, X%). From (2-19) we have

A. 2 = ( i -p 2 ) - 3 [ r

For p = 0 and o^ = <r2 = 1 we have

Hence, yfflf 2 is zero for uncorrelated random variables if and only if / j ^ , ^,,3, ^ and /i21 allvanish simultaneously. This is obviously a desirable property for any sensible measure ofskewness. For any symmetric distribution about (x., we have fi^p = 0. In particular, fora normal population fi1>p = 0.

For a random sample X.l,...,Xn, the measure of skewness corresponding to fihp isg i v e n b y b = S S (2-23)

where

and

It can be seen that (2-23) is invariant under the nonsingular transformation (2-20).

2-5. The asymptotic distribution ofb^v and a test of multivariate skevmess

Let Xj, ....X,, be a random sample from N(y., 2). Since b^p is invariant under lineartransformations, we assume (x = 0 and L = I. Now, S converges to 2 in probability.Therefore, from (2-23) we get

^ {Jtffet)}2; (2.24)r, s, t

in probability. On writing M^ = M^ and M^ = 3fgs) (r #= s) in (2-24), we have

bx p -> {M£Y +... + 3{if Si2'}2 + ... + 6{Jlf ffi3)}2 +. . . (2-25)

in probability. It can be shown to order n - 1 that

E{Wff?} = 0, var{Jfi"}=6/», var{i»f2f>} = 2/», v»r{Jf5g">} = l/n,

cov (M<&\ Jkfgr*-)) = 0 {(r, a, *) + (r', s',«')}.

Using the normality of the vector {-3f \ ..., M$\ ...,M^\ ...} together with its mean vectorand covariance matrix obtained from the preceding results, we find with the help of thewell-known results on the limiting distributions of quadratic forms that

has a x2 distribution with p(p +1) (p + 2)/6 degrees of freedom. Hence, from (2-24),

A = nb^j6 (2-26)

has a x2 distribution with p(p +1) {jp + 2)/6 degrees of freedom. For p > 7, we can use theapproximation that (2A)i is N[{p{p +1) (p + 2) - 3}/3,1].

To test y?^p = 0 for large samples, we calculate A and reject the hypothesis for largevalues of A. Using the terminology of the univariate case, the rejection of fiXv = 0 can bedescribed as an indication of skewness in the distribution of X.

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 6: Mardia (1970)

524 K. V. MABDIA

3 . A MBASTJBE OF KUBTOSIS

3-1. A measure with motivation

For the one-sample Pitman's permutation test in the tmivariate case, Box & Andersen(1955) have shown that the square of the t statistic for this case has approximately anF distribution with S and 8(n— 1) degrees of freedom, where

(3-1)n \n

Thus the coefficient of 1/n in 8 provides a measure of univariate kurtosis, namely, /?2. Wenow show that the multivariate extension of this problem by Arnold (1964) gives a sensiblemeasure of kurtosis pip.

We denote as in § 2 that Xx,..., H^ is a random sample from a p-variate population withrandom vector X having mean vector y. and covariance matrix Z. Let T2 denote Hotelling'sT2 statistic for the one-sample problem. Using the permutation moments of T2 given inArnold (1964) and following the method of Box & Andersen (1955), we find that thedistribution of , .m,,f . ..•> / o o\

(n-p)T2l{p{n-l)} (3-2)is approximately an F distribution with 8p and 8(n—p) degrees of freedom, where

with

Further, S is the sample covariance matrix about [L. We can write (3*3) as

8 = l + (lln)[{pitP-p(p + 2)}lp]+o(lln), (3-4)where

,*)'S-i(X-,*)}». (3-5)

On comparing (3-4) with (3-1), we may take p^v as a measure of multivariate kurtosis.A similar measure has appeared in a robustness study by Box & Watson (1962).

3-2. Properties of p^

We now show that /32 p has some desirable properties. For \L = 0 and 2 = 1, (3-5)

' (3-6)

which is invariant under orthogonal transformations. Hence, from Lemma 1 of §2-3,/Jj p defined by (3-5) is invariant under nonsingular transformations. This property can alsobe established directly from (3-5).

Let (JL = 0 and Z = I. On considering

for real values of OQ, OJ and a2, we obtain

(3-7)

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

bmag011
Line
Page 7: Mardia (1970)

Measures of multivariaie skeumess and kurtosis 525where

A = .

This implies that fi%p > #2 .For# = 1, (3-7) reduces to the well-known inequality^ ^ l+filbut fih v and fi% p are not related in such a way.

Pearson (1919) has given a Tchebycheff-type inequality for the bivariate case. His argu-ment is applicable to the p-variate case and we have for any e > 0

pr{(X—(iJ'S^fX —(i) < e} 1— fi^ple2-

In view of these properties, fi^ p may be regarded as a satisfactory measure of multivariatekurtosis.

The expression (3-5) can be simplified. In fact,

+ ^ o - ' M * + 2or"cr"t}/4$L:) + 'Zo-iioJa/A\iu), (3-8)where

and S stands for the sum over all possible unequal values of the subscripts. For p = 2 we

t h a t fi { + y + 2 r + 4 p ( p r r r ) } / ( i P 2 ) s , (3-9)

where the notation of (2-21) is used. Furthermore, for o^ = <ri = 1 and p = 0,

^2,2 = ^04+^40+2/^22- ( 3 ' 1 0 )For a normal population, we have

A (3-11)For a random sample X l t . . . , X ,, the measure of kurtosis corresponding to fi^ p is obviously

&2,P = - S {(X i-S)'S-i(X i-X)}«. (3-12)"•t-i

This is also invariant under the nonsingular transformation given by (2-20).

3-3. The asymptotic distribution ofb^p and a test of multivariaie kurtosis

Let Xx,..., Xn be a random sample from N(\L, Z). In view of the invarianoe property ofb^p, we again assume JJL = 0 and E = I. We first obtain the exact value

3-3-1. The expectation of b%p

From (3-12) we have _ _E(b%p) = tf{(Xn-X)'S-i(Xn-X)}«. (3-13)

Let X* = (Z r t, ...,Xn)' (r = 1, ...,p). We transform X? to §• = (^^ ...,§„,)' by an ortho-gonal transformation 5? = CXJf (r = 1, ...,p), where C is an orthogonal matrix with thefirst row as (n~i,...,n~i) and the second row as ( — a,..., —a, I/no), a being {n(n — l)}-i.It is found that (3-13) reduces to

E(b%p) = (n-1)*E(Y*), (3-14)where

^iyi%2 with §=(£„,...,£*) (* = 2,...,»).

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 8: Mardia (1970)

526 K. V. MAJJDIA

Furthermore, % (t = 2,..., n) are distributed independently as N(0,1). Let

1 sWe have

so that, from Rao (1965, p. 459), the probability density function of Y is

Hence,- l ) } , (3-15)

and on substituting (3-15) in (3-14) we finally have

E(b2,p)=p(p + 2)(n-l)l(n+l). (3-16)

3*3-2. The asymptotic variance of b^P

We have succeeded in obtaining the exact value of EQ)^^. By the same method, E{b\^p)can be simplified if E( YZ)Z can be obtained explicitly, where

However, it seems E( YZ)% cannot be simplified. The joint distribution of Y and Z is aTjivariate beta but is not a particular case of various known multivariate beta distributions;see Olkin & Rubin (1964).

We now proceed to obtain the value of var (b^ p) to order n~1 on following a method usedfcy Lawley (1959) for a different problem. Let S = I + S* so that to order n"1, E(S*) = 0.On using S"1 = (I + S*)-1 = I - S * + S* 2 - . . .

in (3-12), we obtain after some simplification that

(3-17)

I t can be shown from the results of Cram6r (1946, pp. 346-9) that the remaining terms in(3-17) will contribute quantities of orders lower than rtr1 in var^^p)- We can rewrite<3-17)as

i - l i+i <-l i+i i-lj+k(3-18)

-where

I ( fl (Xifi-Xjr

with S* = {S*j}. Applying the asymptotic formula for calculating sampling variances ofKendall & Stuart (1969, pp. 231-2) to b^p given by (3-18) and using the asymptotic results,

E(S*) = 0, v&r{Mii>}=2/n, var {Mp} = 96/», var {Mgp} = 8/n,

cov {Mtf>, M^} = cov {M$\ M&} = cov {M<?\ M&»} = 0 (i =# j + k),

cov {M%\ M$>) = 12/n, cov {jtf$», JIfgo} = 2/»,

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 9: Mardia (1970)

Measures of mutiivariate skewness and kurtosis 527

we find after some algebra that to order n - 1

(3-19)

3-3-2. The asymptotic distribution of b^We have , n

in probability. On naing the results given by (3-16) and (3-19) and the central limit, theorem

we find that [ b { ( + 2)(n-l)l(n+l)}]l{8p{p + 2)ln}i (3-20)

is asymptotically distributed as N(0,1). Hence, we can test j3^p = p(p + 2) for large samples.The rejection of the null hypothesis can be described as an indication of kurtosis in theprobability density function of X.

It should be noted that the asymptotic distribution of nb^ P/6 derived in §2-6 is x2 ratherthan normal. This seems intuitively inevitable for any sensible measures of skewness andkurtosis. For example, in the bivariate case with S = I, a sensible measure of skewness mustaccumulate the effects of fitl, /^g, /i^ and fi^, which in f}^ a, given by (2-22), is achieved bytaking a weighted sum of squares of these moments. However, /i^, /iw and jiM are positiveso that a weighted sum of these moments as appearing in fi^ a given by (3-10) is sufficient tomeasure the kurtosis. These basic differences will be reflected in the distributions of thecorresponding sample characteristics.

4. APPLICATIONS

4-1. A test of muUivariate normality

It can be Bhown for samples from a normal population that the covariance matrix of{M$\ ...,M$*\ ..., M{^\ ...} and {M$\ .... M£\ ..., M$\ ...} to order n'1 is zero where theM'B are defined in §§2-5 and 3-3-2. Therefore, for large samples, we can test the normalityby testing fi^p = 0 and P%p = p(p + 2) separately. To test fihP = 0, we can use the result,from§2-5'that A-frb^ (4-1)

has a x2 distribution with p(p +1) (p + 2)/6 degrees of freedom and to test fi^P = p(p + 2),we can use the result, from §3-3-2, that

is distributed as N{0,1). Since n is large, B is asymptotically equivalent to (3-20).To see the effectiveness of the test of normality, we applied it to the two sets of data

described below.Data 1. K. Pearson and A. Lee's data of the heights of 1376 fathers and daughters which

is easily accessible (Fisher, 1963, pp. 178-9).Data 2. Norton's data of the rate of discount and the ratio of reserves to deposits in 780

American Banks quoted by Yule & Kendall (1949, p. 201).Tables 1 and 2 give the moments with Sheppard's correction for data 1 and data 2

respectively. The sample means are denoted by X and Y, the sample variances by S\ and 5f,the sample correlation coefficient by r, and gn = Mnl8

r18\, where M^ is the (r, s) central

moment for the sample.

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 10: Mardia (1970)

528 K. V. MARDIA

On using the moments of Table 1, in the results corresponding to (2-21) and (3-9) forsamples, we find that for data 1 bu2 = 0-0335 and b^2 = 8-2881. From (4-1) we haveA = 7-68. The 5 % value of ^f is 9-49. Therefore & 8 is not significant. Further, using (4-2),we find B = 1-34 which is less than 1-96, so that the difference b^ 2 — 8 is not significant.Hence, we conclude that data 1 may be regarded as a sample from a normal population.Lancaster (1958) arrived at the same conclusion by another test.

Table 1. Moments for data 1 of the heights of fathers (X) and daughters (7), n = 1376

Momenta of X Momenta of Y Moments of (X, Y)

% = 67-6893 7 = 63-8492 r = 0-5157<SJ = 7-5527 <SJ = 6-6926

glt = -0-0009gM = -0-1645 gn = -0-0078 gn = -0-0328gtB = 2-9731 gM = 3-1647 ga = 1-5294

gsl = 1-4853gu = 1-4735

Table 2. Moments for data 2 of the rate of discount (X) and the ratio ofreserves to desposits (Y), n = 780

Moments of X Moments of Y Moments of (X, Y)X = 3-3039 7 = 29-8115 r = -0-5375SI = 7-0006 JSJ = 17-8811

glt = - 0-2940gK = 3-8055 gn = 1-2887 gtl = -0-4714gu = 27-7573 gM = 4-2135 9u = -1-5891

g31 =-3-6416ga = 1-1857

Similarly, on using the moments given in Table 2, we find that for data 2, & 2 = 28-2306and bKl = 48-3873, which give A = 3669-98 and B = 141-03. Both of these values arehighly significant and therefore data 2 cannot be regarded as a sample from a normalpopulation. In fact, the bivariate histogram drawn for this data by Yule & Kendall (1949,p. 204 A) indicates that the data are very skew.

4-2. The effect of nonnormality on Hotelling's Tz

Various theoretical and empirical studies have established that the size of the one-samplet test is more sensitive to fit than to ft2; see, for example, the theoretical study by Bartlett(1935) and the recent empirical study by Ratcliffe (1968). In fact, Kendall & Stuart (1967,p. 466) have demonstrated this result ingeniously with the help of (2-1). By analogy, forsensible measures flliP and /?^p it may be expected that the size of Hotelling's T2 in theone-sample case will be influenced more by fiUp than by fi^p- We investigate this problemby Monte Carlo methods.

Arnold (1964) has shown by Monte Carlo methods that the size of the one-sampleHotelling's T2 test is not much influenced if samples are drawn from bivariate distributionsof independent random variables when both marginal distributions are either of rectang-ular or of double exponential form. In fact, using (2-22) and (3-10), we can see that /?i, a = 0for both bivariate distributions while ft%% = 5-6 for the bivariate rectangular distribution

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

Page 11: Mardia (1970)

Measures of multivariale skewness and kurtosis 529

and fi^ 2 = 14 for the bivariate double exponential distribution. I t should be noted that forany bivariate normal distribution, /ff ss = 0 and fi^ 2 = 8. Hence, the study suggests that /?2 2

does not have much effect on T2. On the other hand, to see whether / ^ 2 can have an appreci-able effect on T2, we consider samples from a bivariate distribution of independent randomvariables when both marginal distributions are of negative exponential form. For thisdistribution, we have fi^ 2 = 8 and /?^ 2 = 20. A pseudo-random number generator (Behrenz,

Table 3. Actual 100<z% significance level of HoteUing's Ta test for samples fromnormal, rectangular and negative exponential populations

Exponential populationPi. t = 8, fiK , = 20

(actual %)

Normal populationA, • = 0, A , = 8

(actual %)

Rectangular population*fii, » = °> fit, i = 5 '6

(actual %)

Nominaln4

95 % confidenceEstimate limits

95% confidence 95% confidenceEstimate limits Estimate limite

5-02-51-00-6502-5100-55-02-5100-5

13148-543-63216

14-8110-316-923-66

151210-516-514-73

12-48-13-807-99-9-093-26-^-001-87-2-43

14-11-15-519-71-10-915-46-6-383-29-403

14-42-15-829-91-11-11603-6-994-31-5-15

4-912-400-980-545-322-430-870-466-202-591030-47

4-49-5-332-10-1-700-79-1-170-40-0-684-88-5-762-13-2-730-69-1-050-33-0-594-76-5-642-28-2-900-83-1-230-34-0-60

7-023-631-610-886-443-701-831-026023-241-680-96

6-45-7-593-18-4-081-31-1-910-65-1-116-04-6-843-35-^-051-63-2130-79-1-255-68-6-462-94-3-541-42-1-940-78-1-14

* Values taken from Table 1 of Arnold (1964).

1962) was usedfor drawing samples of sizesn = 4,6 and 8, from this population. The numberof trials in each case was 10,000. We obtained the number of values of T2 whioh exoeeded100a % values of the normal theory T2 for a = 0-05, 0-025, 0-01 and 0-005. The results areshown in Table 3 together with the 95% confidence limits obtained by using$) ± 1-96{#(1 - $)/10,000}i, where ft is the actual level from the empirical trials. The resultsof Arnold (1964) for samples from the bivariate rectangular distribution are also shown inTable 3 for comparison.

We know that the pseudo-random numbers can introduce serious bias into the realizationsof the problem under study; see Page (1965). A check on the suitability of the generator forour problem, as well as a check on the whole experimental process, was first obtained bytaking the parent population normal. The results for this case are also shown in Table 3.The method of Box & Muller (1958) was used to generate the normal pairs.

I t can be seen from Table 3 that the nominal levels for samples from the normal populationagree well with the actual levels. However, the deviations from the nominal levels are verylarge for samples from the negative exponential population. Hence, this study indicates thatHotelling's T2 test is more sensitive to fi^p than /?^p. Furthermore, the actual level ofsignificance of Hotelling's T2 test when used on small samples from extremely nonnormalpopulations can be as much as three times the nominal level of 5 %.

I wish to express my thanks to Dr E. A. Evans and Dr J. W. Thompson for their helpfulcomments and to Mr M. J. Norman for his assistance in programming.

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from

bmag011
Line
Page 12: Mardia (1970)

530 K. V. MAEDIA

REFERENCES

ARNOLD, H. J. (1964). Permutation support for multivariate techniques. Biometrika 51, 66-70.BABTLETT, M. S. (1935). The effect of non-normality on the t distribution. Proc. Oamb. Phil. Soc. 31,

223-31.BEHBENZ, P. G. (1962). Random number generator. Oomm. Ass. Computing Machinery 5, 563.Box, G. E. P. & ANDERSEN, S. L. (1966). Permutation theory in the derivation of robust criteria and

the study of departures from assumptions. J. R. Statist. Soc. B 17, 1-34.Box, G. E. P. <fc MULLEB, M. E. (1958). A note on the generation of normal deviates. Ann. Math.

Statist. 29, 610-11.Box, G. E. P. & WATSON, G. S. (1962). Robustness to non-normality of regression testa. Biometrika

49, 93-106. [Correction 52, 669.]CBAMEB, H. (1946). Mathematical Methods of Statistics. Princeton University Press.DAY, N. E. (1969). Divisive cluster analysis and a test for multivariate normality. Bull. Inst. Int.

Statist. 43, n, 110-12.FISHBB, R. A. (1963). Statistical Methods for Research Workers, 13th ed. Edinburgh: Oliver and Boyd.HOTELLING, H. (1951). A generalized T test and measure of multivariate dispersion. Proc. Second

Berkeley Symp. pp. 23-42.KENDALL, M. G. & STUABT, A. (1967). The Advanced Theory of Statistics, vol. 2, 2nd edition. London:

Griffin.KENDALL, M. G. & STUABT, A. (1968). The Advanced Theory of Statistics, vol. 3, 2nd edition. London:

Griffin.KENDALL, M. G. & STUABT, A. (1969). The Advanced Theory of Statistics, vol. 1, 3rd edition. London:

Griffin.LANOASTEB, H. O. (1958). The struoture of bivariate distributions. Ann. Math. Statist. 29, 719-36.LAWLEY, D. N. (1969). Tests of significance in canonical analysis. Biometrika 46, 59-66.OLKIN, I. & RTTBIN, H. (1964). Multivariate beta distributions and independence properties of the

Wishart distributions. Ann. Math. Statist. 35, 261-9.PAGE, E. S. (1965). The generation of pseudo-random numbers. In Digital Simulation in Operational

Research (ed. S. H. HoUingdale), pp. 55-63. London: English Universities Press.PEARSON, K. (1919). On generalized Tohebycheff theorems in mathematical theory of statistics.

Biometrika 12, 284-96.RAO, C. R. (1965). Linear Statistical Inference and its Applications. New York: Wiley.RATOLLTBE, J. F. (1968). The effect on the t distribution of non-normality in the sampled population.

Appl. Stat. 17, 42-8.SHAPIRO, S. S. & WELK, M. B. (1965). An analysis of variance test for normality (complete samples).

Biometrika 52, 591-611.WAGLE, B. (1968). Multivariate beta distribution and a test for multivariate normality. J. R. Statist.

Soc. B 30, 511-16.WILKS, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika 24, 471-94.WILKB, S. S. (I960). Multidimensional statistical scatter. In Contributions to Probability and Statistics

in Honor of Harold HoteUing, ed. I. Olkin et al. Stanford University Press, pp. 486-503.YULE, G. U. & KENDALL, M. G. (1949). An Introduction to the Theory of Statistics, 13th ed. London:

Griffin.

[Received December 1969. Revised March 1970]

at University of A

uckland on Septem

ber 28, 2010biom

et.oxfordjournals.orgD

ownloaded from