Comparison of ANOVA- F and ANOM tests with regard to type I error rate...

This article was downloaded by: [Moskow State Univ Bibliote] On: 02 December 2013, At: 07:22 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Statistical Computation and Simulation Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gscs20 Comparison of ANOVA-F and ANOM tests with regard to type I error rate and test power Mehmet Mendeş a & Soner Yiğit a a Faculty of Agriculture, Department of Animal Science, Biometry and Genetics Unit 17020, Çanakkale Onsekiz Mart University, Çanakkale, Turkey Published online: 08 May 2012. To cite this article: Mehmet Mendeş & Soner Yiğit (2013) Comparison of ANOVA-F and ANOM tests with regard to type I error rate and test power, Journal of Statistical Computation and Simulation, 83:11, 2093-2104, DOI: 10.1080/00949655.2012.679942 To link to this article: http://dx.doi.org/10.1080/00949655.2012.679942 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. 
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Journal of Statistical Computation and Simulation, 2013, Vol. 83, No. 11, 2093–2104, http://dx.doi.org/10.1080/00949655.2012.679942

Comparison of ANOVA-F and ANOM tests with regard to type I error rate and test power

Mehmet Mendeş* and Soner Yiğit

Faculty of Agriculture, Department of Animal Science, Biometry and Genetics Unit 17020, Çanakkale Onsekiz Mart University, Çanakkale, Turkey

(Received 31 December 2010; final version received 23 March 2012)

A Monte Carlo simulation was conducted to compare the type I error rate and test power of the analysis of means (ANOM) test with those of the one-way analysis of variance F-test (ANOVA-F). Simulation results showed that as long as the homogeneity of variance assumption was satisfied, the ANOVA-F and ANOM tests displayed similar type I error rates, regardless of the shape of the distribution, the number of groups and the combination of observations. However, both tests were negatively affected by heterogeneity of the variances, and this became more obvious as the variance ratios increased. The power of both tests changed with the effect size (Δ), the variance ratio and the sample size combination. As long as the variances are homogeneous, the ANOVA-F and ANOM tests have similar power except in unbalanced cases. Under unbalanced conditions, ANOVA-F was observed to be more powerful than the ANOM test. On the other hand, an increase in the total number of observations caused the power values of the two tests to approach each other. The relation between effect size (Δ) and the variance ratios affected the test power, especially when the sample sizes were not equal. ANOVA-F was superior under some of the experimental conditions considered, and ANOM under the others. In general, however, when the populations with large means also had larger variances, the ANOM test was superior; when the populations with large means had small variances, ANOVA-F was generally superior. This pattern became clearer when the number of groups was 4 or 5.

Keywords: analysis of variance; ANOM; type I error; test power; simulation

1. Introduction

Most studies in practice concern the comparison of group means. The one-way fixed effects analysis of variance F-test (ANOVA-F) is a commonly used technique for comparing k independent group means [1,2]. A number of other procedures have been developed for comparing independent group means, including the Welch test [3], the Brown–Forsythe test [4], the James second-order test [5] and the Alexander–Govern test [6]. Another method used for the same purpose is analysis of means (ANOM) [7–9]. Like ANOVA, the ANOM test was developed under the assumptions of normality and homogeneity of variance. The ANOM test is used for comparing group means, proportions or rates, and for testing the homogeneity of variances [10]. Since it is a graphical method, its results are quite easy to understand and interpret. At the same time, ANOM charts provide information about the practical significance of the differences in question, as well as their statistical significance [10,11]. Despite having important

*Corresponding author. Email: [email protected]

© 2013 Taylor & Francis


advantages over analysis of variance, the ANOM test has some shortcomings, such as its limited use as a method and the uncertainty about its performance when the normality and homogeneity of variance assumptions are not met. No study in the literature has compared the ANOM test with ANOVA-F or other tests in terms of type I error rate and test power. Many studies, however, have compared ANOVA-F with alternative tests in terms of type I error rate and test power [12–16]. The main purpose of this study was to compare the performance of the ANOM test with that of ANOVA-F under various conditions.

2. Material and method

The material of this study consists of random numbers generated by the Monte Carlo simulation technique. The RNNOA, RNSTT, RNBET and RNCHI functions of the IMSL library of Microsoft FORTRAN Developer Studio were used to generate the random numbers. The ANOVA-F and ANOM tests were compared with respect to type I error rate (α) and test power (1 − β) under different experimental conditions, defined by the number of groups (k), distribution shape, sample size (n), variance ratio and effect size (Δ). These experimental conditions are given in detail in Table 1. Each experimental condition was repeated 50,000 times. Type I error rates for ANOVA-F were calculated by dividing the number of falsely rejected H0 hypotheses by the total number of trials (50,000). Empirical type I error rates for the ANOM test were calculated by counting, over the 50,000 simulation trials, the number of trials in which a mean incorrectly fell outside the decision interval (greater than the upper decision line (UDL) or less than the lower decision line (LDL)) although H0 was true, and dividing this number by 50,000.
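The data-generation step can be sketched in Python using only the standard library, in place of the IMSL routines named above. This is a minimal sketch and not the authors' code: the function names `draw` and `draw_group` are hypothetical, and standardizing each population to mean 0 and variance 1 before rescaling is an assumption that the text does not state.

```python
import math
import random


def draw(shape, rng):
    """Draw one value from one of the four population shapes listed in Table 1."""
    if shape == "normal":        # N(0,1)
        return rng.gauss(0.0, 1.0)
    if shape == "chi2_3":        # chi-square with 3 d.f. = Gamma(shape 1.5, scale 2)
        return rng.gammavariate(1.5, 2.0)
    if shape == "t_10":          # t(10) = Z / sqrt(chi-square(10) / 10)
        return rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(5.0, 2.0) / 10.0)
    if shape == "beta_10_10":    # beta(10,10)
        return rng.betavariate(10.0, 10.0)
    raise ValueError(f"unknown shape: {shape}")


# Theoretical (mean, variance) of each shape, used only for standardizing.
MOMENTS = {
    "normal": (0.0, 1.0),
    "chi2_3": (3.0, 6.0),
    "t_10": (0.0, 10.0 / 8.0),
    "beta_10_10": (0.5, 100.0 / (400.0 * 21.0)),
}


def draw_group(shape, n, rng, sd=1.0, shift=0.0):
    """Return n draws standardized to mean 0 and variance 1 (an assumption),
    then rescaled to standard deviation `sd` and shifted by `shift`."""
    mean, var = MOMENTS[shape]
    return [shift + sd * (draw(shape, rng) - mean) / math.sqrt(var) for _ in range(n)]


rng = random.Random(2013)
sample = draw_group("chi2_3", 10, rng, sd=2.0, shift=1.0)
print(len(sample))
```

With this layout, a variance ratio such as 1:1:4 corresponds to `sd` values of 1, 1 and 2, and an effect size pattern such as 0:0:1 corresponds to the `shift` values of the three groups.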

For ANOVA-F, test power was obtained by first creating differences, expressed in standard deviation units, among the group means. To this end, constants (Δ = 0.25, 0.50, 0.75 and 1.00) in standard deviation units were added to all observations in the last group. For the ANOM test, the steps for obtaining test power are:

(a) The parameter space is determined. The parameter spaces for this study are given in Table 1.

(b) Specific constants in standard deviation units (Δ = 0.25, 0.50, 0.75, 1.00) were added to the random numbers of one or more groups so that at least one of the means would fall outside the decision lines.

(c) The ANOM test was applied to these new data sets, and the decision lines were computed. This process was repeated 50,000 times, and the number of trials (r) in which at least one of the treatment means fell outside the decision lines was determined. The test power was then obtained by dividing this number by 50,000, the total number of trials.

That is, test power was determined as 1 − β = r/50,000. In this study, the nominal type I error rate was set at 5.00%.
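Steps (a) to (c) can be sketched as follows for a balanced design with normal populations. One substitution should be noted plainly: the study uses tabulated critical values h, whereas this sketch estimates h by simulation under H0 as the (1 − α) quantile of the largest standardized deviation of a group mean from the overall mean. The function names are hypothetical, and far fewer than 50,000 trials are used to keep the run short.

```python
import math
import random


def anom_statistic(groups):
    """Largest |group mean - overall mean| in units of
    sqrt(MSE) * sqrt((k - 1) / N), for equal sample sizes."""
    k = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    overall = sum(means) / k
    variances = [sum((y - m) ** 2 for y in g) / (n - 1) for g, m in zip(groups, means)]
    mse = sum(variances) / k
    unit = math.sqrt(mse) * math.sqrt((k - 1) / (k * n))
    return max(abs(m - overall) for m in means) / unit


def simulate(shifts, n, reps, rng):
    """ANOM statistics for `reps` data sets of N(0,1) groups shifted by `shifts`."""
    return [
        anom_statistic([[rng.gauss(0.0, 1.0) + d for _ in range(n)] for d in shifts])
        for _ in range(reps)
    ]


def anom_power(shifts, n, reps=3000, alpha=0.05, seed=1):
    """Return (estimated h, empirical power r / reps)."""
    rng = random.Random(seed)
    # Estimate h under H0 (assumption: simulated rather than tabulated).
    null = sorted(simulate([0.0] * len(shifts), n, reps, rng))
    h_hat = null[int((1.0 - alpha) * reps)]
    # A mean falls outside the decision lines exactly when the statistic exceeds h.
    r = sum(stat > h_hat for stat in simulate(shifts, n, reps, rng))
    return h_hat, r / reps


h_hat, power = anom_power([0.0, 0.0, 1.0], n=15)
print(f"estimated h = {h_hat:.2f}, empirical power = {power:.3f}")
```

For k = 3, n = 15 and Δ = 0:0:1 with normal populations, the study reports an ANOM power of 78.62%, so an estimate in that neighbourhood is a reasonable sanity check for this sketch.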

The numbers of false positives (+) and false negatives (−) were also counted (results not shown).

2.1. Statistical methods

2.1.1. One-way ANOVA-F

ANOVA-F is commonly used to test the equality of several population means. In other words, ANOVA-F is used to test the hypothesis H0: μ1 = μ2 = ... = μk versus the alternative H1: at least one of the μi is different [1,17]. The ANOVA F-ratio is computed as the ratio of the mean


square between treatment group means (MST) to the mean square error (MSE), that is, the pooled within-group variance:

$$F = \frac{\mathrm{MST}}{\mathrm{MSE}}.$$

The critical value is obtained from the F-distribution with k − 1 and N − k degrees of freedom. If the F-ratio is equal to or greater than the critical F-value, H0 is rejected; otherwise, H0 is not rejected.
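A small worked example may make the F-ratio concrete. The sketch below computes MST, MSE and F for three hypothetical groups of three observations each; the data and the function name `anova_f` are illustrative assumptions, and the critical value 5.14 is the tabulated upper 5% point of the F-distribution with 2 and 6 degrees of freedom.

```python
def anova_f(groups):
    """Return (MST, MSE, F) for a one-way layout given as a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment and within-group sums of squares.
    ss_treat = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    mst = ss_treat / (k - 1)
    mse = ss_error / (n_total - k)
    return mst, mse, mst / mse


# Hypothetical data: k = 3 groups, n = 3 observations each.
groups = [[4.0, 5.0, 6.0], [4.5, 5.5, 6.5], [6.0, 7.0, 8.0]]
mst, mse, f_ratio = anova_f(groups)

F_CRIT = 5.14  # tabulated F(0.05; 2, 6)
reject_h0 = f_ratio >= F_CRIT
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_ratio:.2f}, reject H0: {reject_h0}")
```

Here MST = 3.25 and MSE = 1.00, so F = 3.25 is below the critical value and H0 is not rejected for these illustrative data.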

2.1.2. ANOM test

The ANOM test, as reported by many authors, can be regarded as an alternative to ANOVA [7,8,10,17–24]. Likewise, Balamurali and Kalyanasundaram [25] reported that ANOM is sometimes referred to as an alternative to ANOVA. Therefore, like ANOVA-F, the ANOM test can be used to test the hypothesis H0: μ1 = μ2 = ... = μk versus the alternative H1: at least one of the μi is different. ANOM is a graphical analogue of ANOVA and tests the equality of population means. One important difference is that ANOVA tests whether the treatment means differ from each other, while ANOM tests whether the treatment means differ from the grand mean. ANOM can also be used as a multiple comparison test. However, ANOM compares each treatment mean with the overall mean, while the Tukey, Duncan and SNK tests consider pairwise differences between the means. ANOM is performed by computing the UDL and LDL and checking whether any of the means fall outside the decision lines.

Table 1. Experimental conditions.

| Parameter space | k = 3 | k = 4 | k = 5 |
|---|---|---|---|
| Distribution shape | N(0,1), χ2(3), t(10), β(10,10) | N(0,1), χ2(3), t(10), β(10,10) | N(0,1), χ2(3), t(10), β(10,10) |
| Sample size | | | |
| n1 | 5:5:5 | 5:5:5:5 | 5:5:5:5:5 |
| n2 | 10:10:10 | 10:10:10:10 | 10:10:10:10:10 |
| n3 | 15:15:15 | 15:15:15:15 | 15:15:15:15:15 |
| n4 | 20:20:20 | 20:20:20:20 | 20:20:20:20:20 |
| n5 | 30:30:30 | 30:30:30:30 | 30:30:30:30:30 |
| n6 | 50:50:50 | 50:50:50:50 | 50:50:50:50:50 |
| n7 | 3:5:8 | 3:3:5:5 | 3:5:7:10:15 |
| n8 | 5:10:15 | 5:8:10:15 | 5:10:15:20:25 |
| n9 | 5:15:25 | 10:20:30:40 | 10:20:30:40:50 |
| Effect size (Δ) | | | |
| Δ1 | 0:0:1 | 0:0:0:1 | 0:0:0:0:1 |
| Δ2 | 0:0.25:1 | 0:0.50:0.50:1 | 0:0.25:0.50:0.75:1 |
| Δ3 | 0:0.50:1 | 0:0.25:0.75:1 | 0:0:0.25:0.75:1 |
| Δ4 | 0:0.75:1 | 0:0:1:1 | 0:0:0.25:0.25:1 |
| Δ5 | 0:1:1 | 0:0.25:0.50:1 | 0:0:0:1:1 |
| Δ6 | 0.25:0:1 | 0.25:0:0:1 | 0.25:0:0:0:1 |
| Δ7 | 1:0:0.25 | 1:0:0:0.25 | 1:0:0:0:0.25 |
| Δ8 | 0.50:0:1 | 0.50:0:0:1 | 0.50:0:0:0:1 |
| Δ9 | 1:0:0.50 | 1:0:0:0.50 | 1:0:0:0:0.50 |
| Variance ratios | | | |
| v1 | 1:1:1 | 1:1:1:1 | 1:1:1:1:1 |
| v2+ | 1:1:4 | 1:1:1:4 | 1:1:1:1:4 |
| v2− | 4:1:1 | 4:1:1:1 | 4:1:1:1:1 |
| v3+ | 1:1:10 | 1:1:1:10 | 1:1:1:1:10 |
| v3− | 10:1:1 | 10:1:1:1 | 10:1:1:1:1 |
| v4+ | 1:1:20 | 1:1:1:20 | 1:1:1:1:20 |
| v4− | 20:1:1 | 20:1:1:1 | 20:1:1:1:1 |


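The steps of the ANOM test are:

Treatment means are calculated as

$$\bar{Y}_i = \frac{\sum_{j=1}^{n_i} Y_{ij}}{n_i}.$$

The overall mean is calculated as

$$\bar{Y}_{..} = \frac{\bar{Y}_1 + \bar{Y}_2 + \cdots + \bar{Y}_k}{k}.$$

Sample variances are calculated as

$$S_i^2 = \frac{\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2}{n_i - 1}.$$

The MSE, used as an estimate of the true population variance, is computed as

$$\mathrm{MSE} = \frac{S_1^2 + S_2^2 + \cdots + S_k^2}{k}$$

for equal sample sizes and

$$\mathrm{MSE} = \frac{\sum_{i=1}^{k} (n_i - 1) S_i^2}{N - k}$$

for unequal sample sizes.

The UDL and LDL are computed as

$$\mathrm{UDL} = \bar{Y}_{..} + h(\alpha, k, N-k)\,\sqrt{\mathrm{MSE}}\,\sqrt{\frac{k-1}{N}}$$

and

$$\mathrm{LDL} = \bar{Y}_{..} - h(\alpha, k, N-k)\,\sqrt{\mathrm{MSE}}\,\sqrt{\frac{k-1}{N}}$$

for equal sample sizes and

$$\mathrm{UDL} = \bar{Y}_{..} + h(\alpha, k, N-k)\,\sqrt{\mathrm{MSE}}\,\sqrt{\frac{N-n_i}{N n_i}}$$

and

$$\mathrm{LDL} = \bar{Y}_{..} - h(\alpha, k, N-k)\,\sqrt{\mathrm{MSE}}\,\sqrt{\frac{N-n_i}{N n_i}}$$

for unequal sample sizes [10],

where k is the number of treatment groups, N is the total number of observations, n_i is the sample size of the ith group (n_i = n for equal sample sizes) and h(α, k, N − k) is the critical value based on the significance level (α), the number of means being compared (k) and the degrees of freedom for the mean square error (N − k). With equal sample sizes, (N − n_i)/(N n_i) reduces to (k − 1)/N, so the two pairs of decision lines coincide.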

Plot the sample means against the decision lines. If all means fall between the decision lines (UDL and LDL), then accept the hypothesis of k equal means. Otherwise, conclude that at least one of the μi is different.
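The ANOM steps above can be sketched as a single function. This is an illustrative sketch under stated assumptions: the function name `anom_decision_lines` and the data are hypothetical, the overall mean is taken as the mean of the treatment means as in the text, and the critical value h must be supplied by the caller from published ANOM tables. The value h = 2.0 used below is a placeholder chosen for the example, not a tabulated value.

```python
import math


def anom_decision_lines(groups, h):
    """Return (means, overall mean, [(LDL_i, UDL_i)], indices of means outside)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    means = [sum(g) / n for g, n in zip(groups, sizes)]
    overall = sum(means) / k  # mean of the treatment means, as in the text
    variances = [
        sum((y - m) ** 2 for y in g) / (n - 1)
        for g, m, n in zip(groups, means, sizes)
    ]
    # Pooled MSE; equals the plain average of the variances when sizes are equal.
    mse = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (n_total - k)
    lines = []
    for n in sizes:
        half_width = h * math.sqrt(mse) * math.sqrt((n_total - n) / (n_total * n))
        lines.append((overall - half_width, overall + half_width))
    outside = [
        i for i, (m, (ldl, udl)) in enumerate(zip(means, lines)) if m < ldl or m > udl
    ]
    return means, overall, lines, outside


# Hypothetical data and a placeholder critical value (not from a table).
groups = [[4.0, 5.0, 6.0], [4.5, 5.5, 6.5], [6.0, 7.0, 8.0]]
means, overall, lines, outside = anom_decision_lines(groups, h=2.0)
for i, (m, (ldl, udl)) in enumerate(zip(means, lines)):
    print(f"group {i + 1}: mean = {m:.2f}, LDL = {ldl:.3f}, UDL = {udl:.3f}")
print("groups outside the decision lines:", [i + 1 for i in outside])
```

Because the function returns the indices of the means that fall outside the lines, it also shows which groups differ from the overall mean, which is the property of ANOM emphasized in the text.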

Using ANOM to test H0 versus H1 not only answers the question of whether there are any differences among the treatment means but, if there are differences, also shows how the treatment means differ [10]. The main idea of ANOM is that if H0 is true, all groups share the same population mean. Consequently, all sample means should be close to the overall mean (that is, none of the means should fall outside the decision lines). On the other hand, if one of the treatment means is too far from the overall mean (that is, it falls outside the decision lines), H0 is rejected.

ANOM has two advantages over ANOVA:

(a) It can be presented in graphical form, which makes it easy to assess the practical significance of the differences as well as their statistical significance, and

(b) If any of the means fall outside the decision lines, ANOM points out exactly which ones are different.


3. Results

3.1. Results of type I error rates

The relative performances of the ANOVA-F and ANOM tests with respect to empirical type I error rates are given in Tables 2–5.

3.1.1. Empirical type I error rate estimates when k = 3

Table 2 contains the results for the ANOVA-F and ANOM tests when the homogeneity of variances assumption is satisfied. Both balanced and unbalanced designs were considered. The results in Table 2 reveal that the ANOVA-F and ANOM tests displayed similar type I error rates under homogeneity of variances and balanced designs, regardless of the population distribution. Type I error rates for both tests were close to 5.00%. Under these experimental conditions, the ANOVA-F test generally yielded type I error rates between 4.65% and 5.10%, regardless of the combination of observations studied. Type I error rates for the ANOM test ranged between 4.11% and 4.44%.

It was observed that both tests were influenced by heterogeneity of variances; however, this negative effect appeared to be more obvious for the ANOM test (Table 3). On the other hand, when the variance ratio was 1:1:4, although type I error rates for both tests were generally around 6.00%, they were within acceptable limits in general. Most of the resulting type I error rates fall within both the 0.040 ≤ α ≤ 0.060 interval suggested by Cochran [26] and the 0.045 ≤ α ≤ 0.055 interval suggested by Bradley [27]. When the group sample sizes differed, type I error rates for both tests were evidently less than 5.00%.
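The two robustness intervals cited above can be applied mechanically to any empirical rate. The short sketch below does so for three ANOVA-F rates taken from Tables 2 and 3 (normal populations, k = 3); the helper names `within` and `classify` are hypothetical.

```python
# Robustness intervals for a nominal alpha of 0.05, as cited in the text.
COCHRAN = (0.040, 0.060)
BRADLEY = (0.045, 0.055)


def within(rate, interval):
    """True if the empirical rate lies inside the closed interval."""
    low, high = interval
    return low <= rate <= high


def classify(rate):
    """Report whether a rate satisfies each criterion."""
    return {"cochran": within(rate, COCHRAN), "bradley": within(rate, BRADLEY)}


# ANOVA-F rates, N(0,1), k = 3:
#   4.97%: n = 5 per group, homogeneous variances (Table 2)
#   5.96%: 15:15:15, variance ratio 1:1:4 (Table 3)
#   8.76%: 5:5:5, variance ratio 1:1:10 (Table 3)
for rate in (0.0497, 0.0596, 0.0876):
    print(f"{rate:.4f} -> {classify(rate)}")
```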

With unequal sample sizes and unequal variances, both tests can be either conservative or liberal depending on the relationship between sample sizes and variance ratios. When larger sample sizes were associated with larger variances (direct pairing), actual type I error rates of the ANOVA-F and ANOM tests were below 5.00% (conservative), while actual type I error rates of

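Table 2. Type I error rates (%) for k = 3, 4 and 5 when variances are homogeneous. For unbalanced designs, the three sample size combinations in the first column refer to k = 3, k = 4 and k = 5, respectively.

| n | Test | k = 3: N(0,1) | k = 3: β(10,10) | k = 3: t(5) | k = 3: χ2(3) | k = 4: N(0,1) | k = 4: β(10,10) | k = 4: t(5) | k = 4: χ2(3) | k = 5: N(0,1) | k = 5: β(10,10) | k = 5: t(5) | k = 5: χ2(3) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | ANOVA-F | 4.97 | 4.90 | 5.07 | 5.07 | 5.00 | 4.54 | 5.09 | 4.41 | 5.08 | 4.54 | 5.31 | 4.33 |
| | ANOM | 4.94 | 5.05 | 5.10 | 5.33 | 5.07 | 4.79 | 5.08 | 4.87 | 5.08 | 5.10 | 5.26 | 5.37 |
| 10 | ANOVA-F | 5.01 | 5.13 | 5.04 | 4.97 | 4.89 | 4.65 | 5.03 | 4.57 | 5.20 | 4.59 | 4.94 | 4.56 |
| | ANOM | 4.99 | 5.04 | 5.00 | 5.10 | 4.99 | 4.74 | 4.95 | 4.89 | 5.07 | 4.98 | 4.95 | 5.26 |
| 15 | ANOVA-F | 5.34 | 5.33 | 5.05 | 5.06 | 5.01 | 4.65 | 5.05 | 4.68 | 5.05 | 4.65 | 4.89 | 4.62 |
| | ANOM | 5.06 | 5.15 | 4.94 | 5.00 | 5.09 | 4.78 | 4.91 | 4.78 | 5.06 | 5.08 | 4.88 | 5.02 |
| 20 | ANOVA-F | 5.05 | 5.18 | 5.05 | 5.07 | 4.82 | 4.85 | 4.94 | 4.71 | 5.03 | 4.83 | 5.06 | 4.67 |
| | ANOM | 4.95 | 4.95 | 5.05 | 5.17 | 4.81 | 4.85 | 4.91 | 4.81 | 5.07 | 5.07 | 4.92 | 4.94 |
| 30 | ANOVA-F | 4.47 | 4.53 | 4.18 | 4.22 | 4.88 | 4.79 | 4.85 | 4.62 | 4.96 | 4.93 | 5.15 | 4.82 |
| | ANOM | 4.61 | 4.73 | 4.60 | 4.68 | 4.97 | 4.98 | 4.95 | 4.79 | 5.28 | 5.68 | 5.96 | 5.70 |
| 50 | ANOVA-F | 4.69 | 4.88 | 4.65 | 4.60 | 4.90 | 5.03 | 5.23 | 4.92 | 4.90 | 4.83 | 4.91 | 4.99 |
| | ANOM | 4.78 | 4.90 | 4.88 | 5.06 | 5.14 | 5.39 | 5.55 | 5.24 | 5.54 | 5.59 | 5.40 | 5.57 |
| 3:5:8 / 3:3:5:5 / 3:5:7:10:15 | ANOVA-F | 4.92 | 4.82 | 4.96 | 4.65 | 5.09 | 4.69 | 5.10 | 4.52 | 5.16 | 5.11 | 5.01 | 5.02 |
| | ANOM | 4.17 | 4.11 | 4.28 | 4.13 | 4.63 | 4.53 | 4.69 | 4.55 | 4.88 | 5.34 | 4.62 | 5.63 |
| 5:10:15 / 5:8:10:15 / 5:10:15:20:25 | ANOVA-F | 5.07 | 4.99 | 5.06 | 4.71 | 5.07 | 4.81 | 5.02 | 4.66 | 4.89 | 5.04 | 5.02 | 4.85 |
| | ANOM | 4.35 | 4.32 | 4.39 | 4.14 | 4.78 | 4.51 | 4.66 | 4.67 | 4.56 | 5.05 | 4.67 | 5.20 |
| 5:15:25 / 10:20:30:40 / 10:20:30:40:50 | ANOVA-F | 4.94 | 5.02 | 4.80 | 5.00 | 4.93 | 4.91 | 4.99 | 4.84 | 5.05 | 4.94 | 5.27 | 5.09 |
| | ANOM | 4.21 | 4.19 | 4.10 | 4.43 | 4.45 | 4.64 | 4.57 | 4.52 | 5.26 | 5.49 | 5.45 | 5.44 |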


Table 3. Type I error rates (%) for k = 3 when variances are not homogeneous. Column headings give the distribution followed by the variance ratio.

| n | Test | N(0,1) 1:1:4 | N(0,1) 1:1:10 | N(0,1) 1:1:20 | β(10,10) 1:1:4 | β(10,10) 1:1:10 | β(10,10) 1:1:20 | t(5) 1:1:4 | t(5) 1:1:10 | t(5) 1:1:20 | χ2(3) 1:1:4 | χ2(3) 1:1:10 | χ2(3) 1:1:20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5:5:5 | ANOVA-F | 6.52 | 8.76 | 9.94 | 6.41 | 8.33 | 9.62 | 6.78 | 8.95 | 10.14 | 7.30 | 11.66 | 13.89 |
| | ANOM | 6.64 | 9.16 | 10.69 | 6.47 | 8.74 | 10.22 | 6.90 | 9.40 | 10.88 | 7.44 | 11.92 | 14.37 |
| 10:10:10 | ANOVA-F | 6.55 | 7.97 | 8.67 | 6.21 | 7.49 | 8.62 | 6.26 | 7.82 | 8.88 | 7.07 | 9.97 | 11.84 |
| | ANOM | 6.70 | 8.41 | 9.50 | 6.27 | 7.91 | 9.41 | 6.42 | 8.33 | 9.70 | 7.16 | 10.33 | 12.53 |
| 15:15:15 | ANOVA-F | 5.96 | 7.54 | 8.19 | 6.15 | 7.60 | 8.35 | 6.03 | 7.63 | 8.49 | 6.78 | 9.55 | 10.40 |
| | ANOM | 6.07 | 8.07 | 8.99 | 6.32 | 8.32 | 9.38 | 6.23 | 8.29 | 9.45 | 7.02 | 10.02 | 11.27 |
| 20:20:20 | ANOVA-F | 6.19 | 7.16 | 8.01 | 6.07 | 7.45 | 7.96 | 6.24 | 7.58 | 8.18 | 6.78 | 8.90 | 9.95 |
| | ANOM | 6.29 | 7.81 | 8.92 | 6.18 | 7.91 | 8.77 | 6.24 | 8.13 | 9.07 | 6.87 | 9.37 | 10.66 |
| 30:30:30 | ANOVA-F | 6.07 | 6.96 | 7.72 | 6.11 | 7.06 | 8.05 | 5.96 | 7.33 | 8.16 | 6.49 | 8.23 | 9.27 |
| | ANOM | 6.36 | 7.72 | 8.66 | 6.35 | 7.71 | 9.09 | 6.10 | 7.93 | 9.09 | 6.64 | 8.81 | 10.16 |
| 50:50:50 | ANOVA-F | 6.04 | 7.13 | 7.67 | 6.01 | 7.24 | 7.87 | 5.97 | 7.55 | 7.92 | 6.14 | 7.82 | 8.69 |
| | ANOM | 6.38 | 7.92 | 8.89 | 6.34 | 7.90 | 8.88 | 6.19 | 8.18 | 8.88 | 6.35 | 8.37 | 9.65 |
| 3:5:8 | ANOVA-F | 2.47 | 2.21 | 2.37 | 2.43 | 2.13 | 2.17 | 2.50 | 2.39 | 2.43 | 3.99 | 4.75 | 5.26 |
| | ANOM | 2.22 | 2.11 | 2.35 | 2.14 | 2.06 | 2.18 | 2.27 | 2.29 | 2.43 | 3.67 | 4.58 | 5.12 |
| 5:10:15 | ANOVA-F | 2.25 | 1.99 | 1.97 | 2.13 | 1.86 | 1.90 | 2.34 | 2.05 | 2.06 | 3.40 | 3.70 | 3.97 |
| | ANOM | 2.10 | 1.95 | 2.06 | 2.01 | 1.87 | 2.00 | 2.21 | 2.02 | 2.14 | 3.14 | 3.57 | 3.99 |
| 5:15:25 | ANOVA-F | 1.34 | 1.00 | 0.81 | 1.43 | 0.95 | 0.78 | 1.40 | 1.08 | 0.93 | 2.37 | 2.28 | 2.09 |
| | ANOM | 1.27 | 1.03 | 0.88 | 1.28 | 0.97 | 0.84 | 1.28 | 1.13 | 1.00 | 2.20 | 2.21 | 2.17 |
| 8:5:3 | ANOVA-F | 13.59 | 21.59 | 26.83 | 13.79 | 21.65 | 27.22 | 12.93 | 20.80 | 25.95 | 12.64 | 23.04 | 29.18 |
| | ANOM | 12.28 | 20.57 | 26.30 | 12.48 | 20.66 | 26.67 | 11.62 | 19.76 | 25.30 | 11.33 | 21.89 | 28.39 |
| 15:10:5 | ANOVA-F | 14.44 | 22.46 | 27.27 | 14.65 | 22.42 | 27.27 | 14.27 | 22.09 | 26.77 | 14.50 | 24.02 | 28.86 |
| | ANOM | 13.17 | 21.60 | 26.95 | 13.38 | 21.54 | 26.99 | 13.03 | 21.20 | 26.45 | 13.06 | 22.95 | 28.46 |
| 25:15:5 | ANOVA-F | 18.81 | 29.70 | 35.97 | 18.28 | 29.56 | 35.88 | 18.12 | 29.34 | 35.78 | 17.85 | 30.69 | 37.03 |
| | ANOM | 18.52 | 29.85 | 36.73 | 17.97 | 29.74 | 36.52 | 17.90 | 29.62 | 36.55 | 17.41 | 30.74 | 37.73 |

Table 4. Type I error rates (%) for k = 4 when variances are not homogeneous. Column headings give the distribution followed by the variance ratio.

| n | Test | N(0,1) 1:1:4 | N(0,1) 1:1:10 | N(0,1) 1:1:20 | β(10,10) 1:1:4 | β(10,10) 1:1:10 | β(10,10) 1:1:20 | t(5) 1:1:4 | t(5) 1:1:10 | t(5) 1:1:20 | χ2(3) 1:1:4 | χ2(3) 1:1:10 | χ2(3) 1:1:20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5:5:5:5 | ANOVA-F | 7.18 | 9.87 | 11.69 | 6.22 | 8.68 | 10.56 | 7.14 | 10.10 | 11.97 | 7.82 | 12.19 | 15.08 |
| | ANOM | 8.09 | 11.94 | 14.68 | 7.17 | 10.82 | 13.34 | 8.01 | 12.16 | 14.99 | 8.78 | 14.19 | 17.79 |
| 10:10:10:10 | ANOVA-F | 6.76 | 9.11 | 10.51 | 6.09 | 8.63 | 9.79 | 6.93 | 9.32 | 10.60 | 7.55 | 11.13 | 13.02 |
| | ANOM | 7.73 | 11.49 | 13.74 | 7.18 | 11.00 | 12.96 | 7.79 | 11.61 | 13.89 | 8.61 | 13.21 | 16.03 |
| 15:15:15:15 | ANOVA-F | 6.58 | 8.75 | 10.43 | 6.35 | 8.52 | 9.96 | 6.90 | 8.93 | 10.23 | 7.27 | 10.17 | 11.92 |
| | ANOM | 7.62 | 11.18 | 13.72 | 7.36 | 10.97 | 13.37 | 7.81 | 11.30 | 13.46 | 8.29 | 12.42 | 14.92 |
| 20:20:20:20 | ANOVA-F | 6.72 | 8.71 | 9.99 | 6.28 | 8.30 | 9.79 | 6.87 | 8.62 | 9.92 | 7.11 | 10.03 | 11.45 |
| | ANOM | 7.61 | 11.26 | 13.35 | 7.35 | 10.69 | 13.15 | 7.87 | 11.03 | 13.20 | 8.10 | 12.36 | 14.63 |
| 30:30:30:30 | ANOVA-F | 6.77 | 8.60 | 9.69 | 6.27 | 8.42 | 9.65 | 6.64 | 8.70 | 9.54 | 6.94 | 9.32 | 10.96 |
| | ANOM | 7.79 | 11.14 | 13.07 | 7.38 | 10.93 | 13.19 | 7.82 | 11.16 | 13.02 | 8.13 | 11.93 | 14.17 |
| 50:50:50:50 | ANOVA-F | 6.83 | 8.44 | 9.59 | 6.43 | 8.45 | 9.44 | 6.40 | 8.53 | 9.44 | 6.85 | 9.20 | 10.59 |
| | ANOM | 8.09 | 11.27 | 13.59 | 7.87 | 11.36 | 13.44 | 7.86 | 11.33 | 13.26 | 8.21 | 12.00 | 14.27 |
| 3:3:5:5 | ANOVA-F | 4.98 | 6.13 | 7.34 | 4.14 | 5.21 | 6.00 | 4.89 | 6.49 | 7.48 | 6.01 | 8.92 | 10.76 |
| | ANOM | 5.36 | 7.40 | 9.18 | 4.50 | 6.49 | 7.84 | 5.23 | 7.79 | 9.33 | 6.43 | 9.94 | 12.44 |
| 5:8:10:15 | ANOVA-F | 2.75 | 2.64 | 2.83 | 2.34 | 2.24 | 2.37 | 2.58 | 2.57 | 2.86 | 3.79 | 4.34 | 5.05 |
| | ANOM | 3.21 | 3.75 | 4.20 | 2.95 | 3.24 | 3.77 | 3.17 | 3.66 | 4.29 | 4.31 | 5.36 | 6.44 |
| 10:20:30:40 | ANOVA-F | 2.47 | 2.32 | 2.29 | 2.21 | 2.02 | 2.16 | 2.37 | 2.39 | 2.38 | 3.02 | 3.09 | 3.24 |
| | ANOM | 3.03 | 3.47 | 3.76 | 2.82 | 3.28 | 3.62 | 2.95 | 3.52 | 3.99 | 3.62 | 4.23 | 4.59 |
| 5:5:3:3 | ANOVA-F | 9.86 | 16.33 | 20.39 | 10.31 | 16.57 | 20.65 | 9.64 | 15.30 | 19.96 | 9.46 | 17.67 | 23.13 |
| | ANOM | 10.17 | 17.75 | 22.72 | 10.48 | 18.13 | 22.95 | 10.03 | 16.70 | 22.40 | 9.82 | 19.10 | 25.28 |
| 15:10:8:5 | ANOVA-F | 13.57 | 21.62 | 26.74 | 13.63 | 21.73 | 26.98 | 13.21 | 21.17 | 26.33 | 13.07 | 22.62 | 28.34 |
| | ANOM | 13.86 | 23.31 | 29.53 | 13.77 | 23.34 | 29.85 | 13.36 | 22.97 | 29.19 | 13.42 | 24.43 | 31.13 |
| 40:30:20:10 | ANOVA-F | 15.39 | 25.04 | 30.23 | 15.61 | 25.07 | 30.20 | 15.59 | 24.80 | 30.94 | 15.52 | 25.69 | 31.67 |
| | ANOM | 15.27 | 26.57 | 33.02 | 15.54 | 26.43 | 32.92 | 15.54 | 26.38 | 33.70 | 15.46 | 27.26 | 34.33 |


Table 5. Type I error rates (%) for k = 5 when variances are not homogeneous. Column headings give the distribution followed by the variance ratio.

| n | Test | N(0,1) 1:1:4 | N(0,1) 1:1:10 | N(0,1) 1:1:20 | β(10,10) 1:1:4 | β(10,10) 1:1:10 | β(10,10) 1:1:20 | t(5) 1:1:4 | t(5) 1:1:10 | t(5) 1:1:20 | χ2(3) 1:1:4 | χ2(3) 1:1:10 | χ2(3) 1:1:20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5:5:5:5:5 | ANOVA-F | 7.18 | 10.66 | 12.96 | 6.37 | 9.48 | 11.75 | 7.48 | 10.93 | 12.97 | 7.50 | 12.70 | 15.97 |
| | ANOM | 8.79 | 14.39 | 17.99 | 8.00 | 13.20 | 16.81 | 8.93 | 14.61 | 17.89 | 9.42 | 16.20 | 20.44 |
| 10:10:10:10:10 | ANOVA-F | 7.13 | 10.17 | 17.22 | 6.50 | 9.32 | 11.21 | 7.06 | 9.96 | 11.89 | 7.57 | 11.61 | 14.09 |
| | ANOM | 8.76 | 14.22 | 11.79 | 8.22 | 13.27 | 16.70 | 8.63 | 13.88 | 17.24 | 9.47 | 15.45 | 19.13 |
| 15:15:15:15:15 | ANOVA-F | 7.10 | 9.62 | 11.41 | 6.47 | 9.32 | 11.05 | 7.05 | 9.73 | 11.32 | 7.41 | 10.78 | 13.14 |
| | ANOM | 8.62 | 13.74 | 17.11 | 8.28 | 13.53 | 16.70 | 8.87 | 13.78 | 16.79 | 9.35 | 14.65 | 18.29 |
| 20:20:20:20:20 | ANOVA-F | 7.12 | 9.60 | 11.20 | 6.61 | 9.12 | 10.76 | 7.01 | 9.66 | 11.17 | 7.21 | 10.40 | 12.39 |
| | ANOM | 8.68 | 13.75 | 16.95 | 8.39 | 13.23 | 16.33 | 8.61 | 13.63 | 16.76 | 9.02 | 14.59 | 17.94 |
| 30:30:30:30:30 | ANOVA-F | 7.06 | 9.56 | 10.90 | 6.80 | 9.30 | 11.08 | 6.83 | 9.46 | 11.08 | 7.27 | 10.19 | 11.89 |
| | ANOM | 10.69 | 15.58 | 18.88 | 10.52 | 15.41 | 18.85 | 10.43 | 15.52 | 18.62 | 10.96 | 16.09 | 19.21 |
| 50:50:50:50:50 | ANOVA-F | 6.78 | 9.39 | 10.57 | 6.76 | 9.40 | 10.70 | 6.86 | 9.38 | 10.64 | 6.96 | 9.93 | 11.51 |
| | ANOM | 10.20 | 15.38 | 18.03 | 10.19 | 15.32 | 18.29 | 10.17 | 15.26 | 18.17 | 10.34 | 15.77 | 19.13 |
| 3:5:7:10:15 | ANOVA-F | 1.96 | 1.83 | 1.93 | 1.88 | 1.66 | 1.62 | 1.93 | 1.85 | 1.97 | 3.22 | 3.61 | 4.02 |
| | ANOM | 2.94 | 3.65 | 4.22 | 2.79 | 3.22 | 3.61 | 2.83 | 3.63 | 4.08 | 4.29 | 5.28 | 6.22 |
| 5:10:15:20:25 | ANOVA-F | 2.64 | 2.64 | 3.03 | 2.57 | 2.60 | 2.73 | 2.75 | 2.71 | 3.03 | 3.47 | 4.05 | 4.60 |
| | ANOM | 3.79 | 4.98 | 5.94 | 3.78 | 4.93 | 5.73 | 4.02 | 5.14 | 5.88 | 4.64 | 6.37 | 7.27 |
| 10:20:30:40:50 | ANOVA-F | 4.91 | 2.70 | 2.75 | 2.51 | 2.51 | 2.71 | 2.60 | 2.77 | 3.03 | 2.98 | 3.45 | 3.78 |
| | ANOM | 5.02 | 5.65 | 6.23 | 4.10 | 5.34 | 6.26 | 4.03 | 5.64 | 6.42 | 4.62 | 6.07 | 7.09 |
| 15:10:7:5:3 | ANOVA-F | 15.60 | 27.95 | 35.49 | 15.73 | 27.72 | 35.98 | 15.08 | 26.76 | 35.03 | 14.17 | 27.99 | 36.73 |
| | ANOM | 16.29 | 30.52 | 39.30 | 16.52 | 30.40 | 39.70 | 15.97 | 29.43 | 39.09 | 15.28 | 31.02 | 40.82 |
| 25:20:15:10:5 | ANOVA-F | 16.66 | 27.99 | 35.53 | 16.34 | 28.17 | 35.96 | 15.81 | 27.62 | 34.95 | 15.30 | 28.85 | 37.03 |
| | ANOM | 17.17 | 30.47 | 39.70 | 16.98 | 30.72 | 40.00 | 16.60 | 30.15 | 39.04 | 16.05 | 31.67 | 41.06 |
| 50:40:30:20:10 | ANOVA-F | 16.24 | 27.17 | 33.80 | 16.09 | 27.23 | 33.71 | 15.83 | 26.87 | 33.76 | 15.80 | 27.54 | 34.93 |
| | ANOM | 17.63 | 30.62 | 39.06 | 17.47 | 30.73 | 38.69 | 17.32 | 30.36 | 38.83 | 17.18 | 31.11 | 40.13 |

both tests markedly exceeded the nominal level (5.00%) when sample sizes and variances were inversely paired (liberal). However, the ANOM test is slightly more robust than the ANOVA-F test under these experimental conditions.
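The effect of direct and inverse pairing on ANOVA-F can be reproduced in a small simulation. The sketch below is illustrative only: it uses normal populations, the sample size combinations 5:10:15 and 15:10:5 with variance ratio 1:1:4, and 4,000 trials per condition instead of 50,000. The function names are hypothetical, and 3.354 is the tabulated upper 5% point of the F-distribution with 2 and 27 degrees of freedom.

```python
import math
import random


def f_ratio(groups):
    """One-way ANOVA F-ratio (MST / MSE)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_treat = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_treat / (k - 1)) / (ss_error / (n_total - k))


def rejection_rate(sizes, variances, f_crit, reps, seed):
    """Share of trials with F >= f_crit when all population means are equal."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [
            [rng.gauss(0.0, math.sqrt(v)) for _ in range(n)]
            for n, v in zip(sizes, variances)
        ]
        if f_ratio(groups) >= f_crit:
            rejections += 1
    return rejections / reps


F_CRIT = 3.354  # tabulated F(0.05; 2, 27), since N = 30 and k = 3

# Direct pairing: the largest group has the largest variance.
direct = rejection_rate((5, 10, 15), (1, 1, 4), F_CRIT, reps=4000, seed=11)
# Inverse pairing: the smallest group has the largest variance.
inverse = rejection_rate((15, 10, 5), (1, 1, 4), F_CRIT, reps=4000, seed=12)
print(f"direct pairing: {direct:.3f}, inverse pairing: {inverse:.3f}")
```

For these two conditions Table 3 reports ANOVA-F rates of 2.25% and 14.44%, so the simulated rates are expected to fall below and above the nominal 5.00% level, respectively.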

3.1.2. Empirical type I error rate estimates when k = 4

As long as the variances were homogeneous, increasing the number of groups compared from 3 to 4 had little influence on the type I error rates of either test (Table 2). Under these experimental conditions, regardless of the distribution, the ANOVA-F and ANOM tests displayed quite similar type I error rates, which were quite close to 5.00%.

ANOVA-F was not affected by unequal group sample sizes, whereas ANOM was affected to a small extent. Under these experimental conditions, regardless of the observation combinations studied, ANOVA-F displayed type I error rates generally between 4.52% and 5.09%, whereas ANOM displayed rates between 4.45% and 4.78%. Both tests were negatively affected when the variances became heterogeneous. This effect was slightly more obvious for the ANOM test. Increasing the variance ratio (1:1:1:10 and 1:1:1:20) seriously increased the type I error rate (Table 4). In general, type I error rates for the ANOM test were greater than those for ANOVA-F. When the influence of unequal sample sizes on the type I error rate was examined under these conditions, the ANOM test generally displayed type I error rates closer to 5.00% than ANOVA-F did, except for the 3:3:5:5 combination of observations.

3.1.3. Empirical type I error rate estimates when k = 5

When the number of groups compared was increased from 4 to 5 and the variances were homogeneous, regardless of the distribution shape, both tests generally displayed type I error rates around


5.00%. Under the same conditions, neither test was much influenced by making the group sample sizes unequal, and both generally produced type I error rates around 5.00%.

Under the experimental conditions considered, both the ANOVA-F and ANOM tests were negatively influenced when the variances became heterogeneous (Table 5). This negative effect became more obvious especially when the variance ratios were 1:1:1:1:10 and 1:1:1:1:20. Moreover, the ANOM test was influenced by heterogeneous variances more than ANOVA-F was.

As noted, the type I error rates of both the ANOM test and ANOVA-F were not influenced by increasing the number of groups compared from 4 to 5 as long as the variances were homogeneous. However, the same finding did not hold when the variances were not homogeneous. The influence of the number of groups on the type I error rate therefore appeared only when the variances were heterogeneous, and it was more obvious for the ANOM test.

In all cases, under homogeneity of variances, the empirical type I error rates at α = 0.05 were 0.04893 ± 0.00022 for the ANOVA-F test and 0.04924 ± 0.00037 for the ANOM test. In all cases, under unequal variances, the empirical type I error rates were 0.00686 ± 0.00019 for the ANOVA-F test and 0.008759 ± 0.00026 for the ANOM test. Under the same conditions, however, the type I error rates of the ANOVA-F and ANOM tests were 0.03026 ± 0.00016 and 0.04052 ± 0.00020, respectively. Therefore, the differences between the ANOVA-F and ANOM tests emerged when the group sizes were different. The ANOM test is more robust than the ANOVA-F test under these conditions.

3.2. Empirical test power estimates

The relative performances of the ANOVA-F and ANOM tests with respect to empirical test power are given in Figures 1–36 for k = 3, Figures 37–72 for k = 4 and Figures 73–108 for k = 5 (supplementary Appendix, available online).

The empirical test powers of the ANOVA-F and ANOM tests under the experimental conditions considered are given in Figures 1–36. Both tests displayed similar power when the variances were homogeneous and the sample sizes were equal, regardless of distribution shape, effect size (Δ) and number of groups (Figures 1–36). For instance, when k = 3, the variances are homogeneous, n = 15, Δ = 0:0:1 and the distributions are normal (0,1), the power values of the ANOVA-F and ANOM tests are 78.37% and 78.62%, whereas they are 79.18% and 79.19% when the distributions are chi-square with 3 d.f. The powers are 78.60% and 78.67% for ANOVA-F and ANOM when the distributions are β(10,10), and 69.01% and 69.03% when they are t(10). The power values are clearly very similar under these experimental conditions.

Under the same conditions, when the samples are taken from normally distributed populations, the power values of ANOVA-F and ANOM are 69.40% and 69.04% when Δ = 0:0.25:1, that is, when the difference among the means decreases slightly. When the distributions are chi-square, the power values are 70.82% and 70.53%. When the distributions are β(10,10), the power values are 69.15% and 68.69%, whereas they are 59.76% and 59.45% for t(10). Similar patterns hold for the other Δ values as well.

Heterogeneity of the variances (except when the variances differ by a factor of four) had a marked negative effect on the power of both tests. This negative effect was most pronounced when the variance ratio was 1:1:20.

When the effect of unequal sample sizes on test power was examined, the powers were seen to be influenced by the differences in sample size. Nevertheless, as the total number of observations increased, the powers of the two tests approached each other. For instance, when the distributions are normal, for the sample size combination 3:5:8 (total 16 observations), the powers of the ANOVA-F and ANOM tests were 33.86% and 31.86%, whereas they were 63.07% and 61.39% for the 5:10:15 combination (total 30 observations) and 82.87% and 83.77% for the 5:15:25 combination (total 45 observations). Similar findings hold for the other distributions.

Test power under experimental conditions with the same overall mean varies with the differences, or distances, among the means being compared. Despite the same overall mean, an increase in the differences between the means leads to an increase in test power. For example, the differences among the means for the 0:0.50:0.50:1 (0:d/2:d/2:d) and 0:0.25:0.75:1 (0:d/4:3d/4:d) conditions are smaller than those for the 0:0:1:1 (0:0:d:d) condition. Thus the actual test power for the 0:0:1:1 condition is higher than for the 0:0.50:0.50:1 and 0:0.25:0.75:1 conditions. This increase became more pronounced as the sample size increased.
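The ordering of these three conditions follows from the standard noncentrality parameter of the one-way ANOVA-F test under equal group sizes n and a common variance σ², namely λ = n Σ(μ_i − μ̄)²/σ², which depends on the spread of the means about the overall mean and not on the overall mean itself. The short sketch below evaluates λ for the three patterns. The function name `noncentrality` and the settings n = 10, d = 1 and σ = 1 are illustrative assumptions, not a description of the simulation program.

```python
def noncentrality(means, n, sigma=1.0):
    """lambda = n * sum((mu_i - mu_bar)^2) / sigma^2 (equal n, common sigma)."""
    mu_bar = sum(means) / len(means)
    return n * sum((m - mu_bar) ** 2 for m in means) / sigma ** 2


d = 1.0  # assumed largest standardized mean difference
patterns = {
    "0:d/2:d/2:d": (0.0, d / 2, d / 2, d),
    "0:d/4:3d/4:d": (0.0, d / 4, 3 * d / 4, d),
    "0:0:d:d": (0.0, 0.0, d, d),
}

# All three patterns share the overall mean d/2 but differ in spread.
for label, means in patterns.items():
    print(label, noncentrality(means, n=10))
# 0:d/2:d/2:d 5.0
# 0:d/4:3d/4:d 6.25
# 0:0:d:d 10.0
```

In units of n·d²/σ², the three patterns give 1/2, 5/8 and 1, so the 0:0:d:d condition has the largest noncentrality and hence the highest power for a given n.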

Under such experimental conditions with the same overall mean, when the variances are homogeneous, the ANOVA-F test was generally observed to be superior when the differences among the means are large or the means are spread farther apart (e.g. when k = 4 and n ≥ 10, for the effect sizes 0:0.25:0.75:1 (0:d/4:3d/4:d) and 0:0:1:1 (0:0:d:d)).

The relationship between effect size and the variance ratios affected test power, especially when the sample sizes were not equal. ANOVA-F was superior in some of the experimental conditions considered and ANOM in others. In general, however, when the populations with larger means also had larger variances, the ANOM test was superior, whereas when the populations with larger means had smaller variances, ANOVA-F was generally superior. This pattern became clearer when the number of groups was 4 or 5.

For the (0:0.50:0.50:1) condition, when the numbers of observations are equal and the variance ratios are 1:1:1:4, 4:1:1:1, 1:1:1:10 or 10:1:1:1, the ANOVA-F and ANOM tests have similar power. When the sample sizes are not equal and the variances are heterogeneous, the ANOM test is superior when the populations with large means also have large variances. When the populations with large means have small variances, ANOVA-F is generally superior.

Numbers of False (+) and False (−) were also counted. The effect of the False (+) and False (−) counts on test power varied with the relationships among effect size, sample size and variance ratios. Generally, when the variances are homogeneous and the sample sizes are equal, the False (+) and False (−) counts approach zero as the sample size increases. This also holds in general for 4 and 5 groups. For instance, the False (+) and False (−) counts for the effect sizes 0:0:1 and 0:1:1 are quite close to zero when k = 3, the variances are homogeneous, the distributions are N(0,1) and n = 30. Under the same conditions, the False (−) count is 5348 when the effect size is 0:0.25:1, whereas it decreases to 499 for 0:0.50:1 and to 18 for 0:0.75:1. As expected, the decrease in the difference between the means of the two groups with non-zero means reduced the False (−) count. This was not affected by the shape of the distribution or the number of groups.

4. Discussion

An alternative to ANOVA is the test known as ANOM [8,10,22,28]. Despite some advantages, this method is not yet used extensively in practice. This may be because the performance of the ANOM test has not been compared with that of ANOVA or other methods, so that it is not yet sufficiently well known. However, commonly used statistical software packages (e.g. MINITAB) have recently added the ANOM test to their modules. From that point of view, in this study the ANOM test was compared with ANOVA with regard to type I error rate and test power under different experimental conditions. The simulation showed that the performance of both tests is affected by whether the assumption of homogeneity of variance is fulfilled. As long as the variances are homogeneous, both tests generally maintain a type I error rate of around 5.00%, regardless of distribution shape and sample size combination. Under these experimental conditions, therefore, both tests can be used to compare the means. This does not hold when the variances are not homogeneous (except when the variances differ by a factor of four). In that case, the type I error rates of both tests depart progressively from 5.00%, and the departure becomes more pronounced when the variance ratios are 10 and 20 times of each other. Under these conditions, therefore, neither the ANOVA-F nor the ANOM test is recommended for comparing group means. On the other hand, when the variances are heterogeneous and the group sample sizes are different (provided that k > 3), the ANOM test yielded type I error rates closer to 5.00% than ANOVA-F. Under these experimental conditions, the ANOM test may therefore be preferable to ANOVA-F.
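For readers less familiar with ANOM, the following is a minimal sketch of its decision rule for a balanced design, under stated assumptions. In the usual formulation, each group mean is compared with decision lines placed at the grand mean ± h·s·√((k − 1)/(kn)), where s² is the pooled within-group variance and h is a tabulated critical value (see, e.g., [20,23]). The sketch does not reproduce those tables; instead it approximates h by simulating the null distribution under normality, which is an assumption of this example and not the procedure used in this study. The function names `anom_statistic` and `simulated_h`, the seed and the 4000 replications are hypothetical choices.

```python
import math
import random


def anom_statistic(groups):
    """Largest |group mean - grand mean| in units of its estimated standard error.

    Assumes a balanced design (equal n) and a pooled variance estimate.
    """
    k, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    mse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    std_error = math.sqrt(mse * (k - 1) / (k * n))
    return max(abs(m - grand) for m in means) / std_error


def simulated_h(k, n, alpha=0.05, reps=4000, seed=1):
    """Monte Carlo stand-in for the tabulated critical value h (assumption)."""
    rng = random.Random(seed)
    stats = sorted(
        anom_statistic([[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)])
        for _ in range(reps)
    )
    # Empirical (1 - alpha) quantile of the null distribution.
    return stats[math.ceil((1 - alpha) * reps) - 1]


h = simulated_h(k=3, n=15)
sample = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
stat = anom_statistic(sample)
print(f"simulated h ~ {h:.2f}; sample statistic = {stat:.4f}")
```

A group mean is flagged as different from the grand mean when its studentized deviation exceeds h, and the test as a whole rejects when `anom_statistic` exceeds h. In practice the critical value for the design at hand would be taken from published tables or from statistical software; the simulated value is only a convenience for this sketch.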

With unequal sample sizes and unequal variances, the relationship between sample sizes and variance ratios (direct and inverse pairing) affected the type I error rates. Both tests can be either conservative or liberal depending on this relationship. Similar results were reported in previous studies [12,14,29–31]. The actual type I error rates of the ANOVA-F and ANOM tests were below 5.00% under direct pairing, whereas they markedly exceeded 5.00% under inverse pairing.
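The effect of direct and inverse pairing on the ANOVA-F test can be illustrated with a small Monte Carlo sketch. This is an assumption-laden illustration rather than the simulation program used in this study: it uses only the Python standard library, normal data, k = 3 (so that the F tail probability has a closed form), 4000 replications and an arbitrary seed, and the helper names `f_test_pvalue_k3` and `type1_rate` are hypothetical. Direct pairing is taken here to mean that the largest group has the largest variance, and inverse pairing that the smallest group has the largest variance.

```python
import math
import random


def f_test_pvalue_k3(groups):
    """One-way ANOVA-F p-value for exactly three groups.

    With 2 numerator degrees of freedom the F tail probability is
    P(F > x) = (1 + 2x/d2) ** (-d2/2).
    """
    assert len(groups) == 3
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    d2 = n_total - 3
    f_stat = (ss_between / 2) / (ss_within / d2)
    return (1 + 2 * f_stat / d2) ** (-d2 / 2)


def type1_rate(sds, sizes, reps=4000, alpha=0.05, seed=1):
    """Share of null data sets (all population means equal to 0) rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)]
                  for sd, n in zip(sds, sizes)]
        if f_test_pvalue_k3(groups) < alpha:
            rejections += 1
    return rejections / reps


hetero_sds = (1.0, 1.0, math.sqrt(20.0))  # variance ratio 1:1:20
equal = type1_rate((1.0, 1.0, 1.0), (10, 10, 10))   # homogeneous baseline
direct = type1_rate(hetero_sds, (5, 10, 15))        # large n with large variance
inverse = type1_rate(hetero_sds, (15, 10, 5))       # small n with large variance
print(f"equal={equal:.3f} direct={direct:.3f} inverse={inverse:.3f}")
```

Under these assumed settings the baseline should stay near the nominal 5%, the direct-pairing rate should fall below it (a conservative test) and the inverse-pairing rate should lie well above it (a liberal test), in line with the pattern described above.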

As expected, the power of both tests changes with the effect size, variance ratio and sample size combination. As long as the variances are homogeneous, both tests have similar power, except for unequal sample size combinations. Both tests were negatively affected by heterogeneity of the variances, and the effect became more pronounced as the variance ratios increased.

The differences between the power values of the two tests became obvious when the variances were heterogeneous and the sample sizes were different. The relationship between effect size and the variance ratios strongly affected the power of the ANOVA-F and ANOM tests, especially when the sample sizes were not equal. ANOVA-F was superior in some of the experimental conditions considered and ANOM in others. In general, however, when the populations with larger means also had larger variances, the ANOM test was superior, whereas when the populations with larger means had smaller variances, ANOVA-F was generally superior. This pattern became clearer when the number of groups was 4 or 5.

Test power under experimental conditions with the same overall mean varies with the differences, or distances, among the means being compared. It is therefore to be expected that the power of the tests differs among the experimental conditions (0:d/2:d/2:d), (0:d/4:3d/4:d) and (0:0:d:d), even though the overall mean is the same, because an increase in the differences among the means increases the power of the test. This becomes more pronounced as the sample size increases.

Under such experimental conditions with the same overall mean, when the variances are homogeneous, the ANOVA-F test was generally observed to be superior when the differences among the means are large or the means are spread farther apart (e.g. when k = 4 and n ≥ 10, for the effect sizes 0:0.25:0.75:1 (0:d/4:3d/4:d) and 0:0:1:1 (0:0:d:d)).

When the type I error rates and power values of the tests are evaluated together, it can generally be concluded that the assumption of homogeneity of variances is critically important for both tests. If the variances are homogeneous and the sample sizes are equal, both tests can be used to compare group means, regardless of the distribution shape and the number of groups being compared. The same is not true when the variances are not homogeneous. When the sample sizes are different, violation of the homogeneity of variances assumption seriously affected the performance of both tests.

No study in the literature has compared the ANOM test with ANOVA-F or other tests in terms of type I error rate and test power. Many studies, however, have compared ANOVA-F with alternative tests in these terms. For example, Tomarken and Serlin [12] reported that ANOVA-F tends to keep the type I error rate around 5.00% as long as the variances are homogeneous. However, they stated that ANOVA-F was strongly affected by both negative and positive pairing of sample sizes and variances, and pointed out that similar results hold for test power. Beuckelaer [13] compared ANOVA-F and some parametric alternative tests with regard to type I error rate and test power. From the simulation experiments, he reported that, as long as the variances are homogeneous, ANOVA-F is superior to the other tests in both respects, but that ANOVA-F gives more biased results as the variances become heterogeneous. Myers [14] compared the ANOVA-F, Alexander–Govern and James second-order tests with regard to type I error rates under conditions in which the normality assumption was not fulfilled. He reported that ANOVA-F displayed type I error rates well below 5.00% when sample sizes and variance ratios were paired positively, and well above 5.00% when they were paired negatively. Pei-Chen Wu [15] reported that, for samples taken from populations with high kurtosis and skewness, ANOVA-F produced type I error rates even above 40.00%, depending on the heterogeneity of the variances. Under the same experimental conditions, he stated that ANOVA-F was strongly affected by positive and negative pairing of sample sizes and variances. Moder [16] compared ANOVA-F with some parametric and non-parametric alternatives with regard to type I error rate when the variances are not homogeneous, and reported that ANOVA-F was the test most affected by heterogeneity of the variances. The results of our study confirm these previous results [12–16,30,32].

Nevertheless, when the established advantages of the ANOM test over ANOVA-F are taken into consideration, namely 'providing a p-value for the test of the difference among group means when only the group means and variances are available', 'reporting the results graphically', 'serving also as a multiple comparison method', 'serving also as a test of homogeneity of variances', 'indicating practical importance in addition to statistical significance' and 'being applicable also to comparing differences between ratios', it can be concluded that the ANOM test can be used effectively in practice.

References

[1] J.H. Zar, Biostatistical Analysis, Prentice–Hall Inc. Simon and Schuster/A Viacom Company, New Jersey, NJ, 1999.
[2] M. Mendes, E. Basınar, and F. Gürbüz, Confidence interval for test power in Welch, James-second order and Alexander–Govern tests: A simulation study, Y.Y.Ü. Fen Bilimleri Enstitüsü Dergisi 10(1) (2005), pp. 16–22.
[3] B.L. Welch, On the comparison of several mean values: An alternative approach, Biometrika 38 (1951), pp. 933–943.
[4] M.B. Brown and A.B. Forsythe, The small sample behaviour of some statistics which test the equality of several means, Technometrics 16 (1974), pp. 129–132.
[5] G.S. James, The comparison of several groups of observations when ratios of the population variances are unknown, Biometrika 38 (1951), pp. 324–329.
[6] R.A. Alexander and D.M. Govern, A new and simpler approximation for ANOVA under variance heterogeneity, J. Educ. Stat. 19 (1994), pp. 91–101.
[7] E.R. Ott, Analysis of means: A graphical procedure, Indus. Qual. Cont. 24 (1967), pp. 101–109.
[8] P.F. Raming, Application of the analysis of means, J. Qual. Technol. 15 (1983), pp. 19–25.
[9] P.P. Nelson, Multiple comparisons of means using simultaneous confidence intervals, J. Qual. Technol. 21 (1989), pp. 232–241.
[10] P.P. Nelson, P.S. Wludyka, and K.A.F. Copeland, The analysis of means: A graphical method for comparing means, rates and proportions, ASA-SIAM Series on Statistics and Applied Probability, SIAM, Philadelphia, ASA, Alexandria, VA, 2005.
[11] P.P. Nelson, The analysis of means for balanced experimental designs, J. Qual. Technol. 15 (1983), pp. 45–54.
[12] A.J. Tomarken and R.C. Serlin, Comparison of ANOVA alternatives under variance heterogeneity and specific noncentrality structures, Psychol. Bullet. 99(1) (1986), pp. 90–99.
[13] A.D. Beuckelaer, A closer examination on some parametric alternatives to the ANOVA F-test, Stat. Pap. 37(4) (1996), pp. 291–305.
[14] L. Myers, Comparability of the James' second-order approximation test and the Alexander and Govern a statistic for non-normal heteroscedastic data, J. Stat. Comput. Simul. 60(3) (1998), pp. 207–222.
[15] W. Pei-Chen, Modern one-way ANOVA F methods: Trimmed means, one step M-estimators and bootstrap methods, J. Quant. Res. 1 (2007), pp. 155–173.
[16] K. Moder, Alternatives to F test in one way ANOVA in case of heterogeneity of variances (a simulation study), Psychol. Test Assess. Model. 52(4) (2010), pp. 343–353.
[17] B.J. Winer, D.R. Brown, and K.M. Michels, Statistical Principles in Experimental Design, McGraw–Hill Companies, New York, NY, 1991.
[18] E.G. Schilling, A systematic approach to the analysis of means, J. Qual. Technol. 5(4) (1973), pp. 92–108, 147–159.
[19] P.R. Nelson, Power curves for the analysis of means, Technometrics 27(1) (1985), pp. 65–73.
[20] P.R. Nelson, Additional uses for the analysis of means and extended tables of critical values, Technometrics 35(1) (1993), pp. 61–71.
[21] E.R. Ott and E.G. Schilling, Process Quality Control: Troubleshooting and Interpretation of Data, 2nd ed., McGraw–Hill Companies, New York, NY, 1990.
[22] P.P. Nelson and E.J. Dudewicz, Exact analysis of means with unequal variances, Technometrics 44 (2002), pp. 152–160.
[23] L.S. Nelson, Factors for the analysis of means, J. Qual. Technol. 6 (1974), pp. 175–181.
[24] P.G. Mathews, Design of Experiments with MINITAB, ASQ (American Society for Quality) Press, Milwaukee, WI, 2005.
[25] S. Balamurali and M. Kalyanasundaram, An investigation of the effects of misclassification errors on the analysis of means, Tamsui Oxford J. Inform. Math. Sci. 27(2) (2011), pp. 117–136.
[26] W.G. Cochran, Some methods for strengthening the common χ2-tests, Biometrics 10 (1954), pp. 417–451.
[27] J.C. Bradley, Robustness? Br. J. Math. Stat. Psychol. 31 (1978), pp. 144–152.
[28] T.P. Ryan, Statistical Methods for Quality Improvement, 2nd ed., John Wiley & Sons, New York, NY, 2000.
[29] T.H. Hsuing and S. Olejnik, Type I error rates and statistical power of the James second-order test and the univariate F test in two-way fixed-effects ANOVA models under heteroscedasticity and/or nonnormality, J. Exp. Educ. 65(1) (1996), pp. 57–71.
[30] P.J. Schneider and D.A. Penfield, Alexander and Govern's approximation: Providing an alternative to ANOVA under variance heterogeneity, J. Exp. Educ. 65(3) (1997), pp. 271–286.
[31] M. Mendes, The comparison of some parametric alternative tests to one-way analysis of variance in terms of Type I error rates and test power under non-normality and homogeneity of variance. Ph.D. thesis, Ankara University Graduate School of Natural and Applied Sciences Department of Animal Science (unpublished), 2002.
[32] D.A. Penfield, Choosing a two-sample location test, J. Exper. Educ. 62 (1994), pp. 343–360.
