Identification of Variables Associated With Group Separation in Descriptive Discriminant Analysis: Comparison of Methods for Interpreting Structure Coefficients

Holmes Finch, Ball State University
Published online: 07 Aug 2010. Publisher: Routledge.
This article was downloaded by [University of Western Ontario] on 08 October 2014, at 10:52.

To cite this article: Holmes Finch (2009) Identification of Variables Associated With Group Separation in Descriptive Discriminant Analysis: Comparison of Methods for Interpreting Structure Coefficients, The Journal of Experimental Education, 78:1, 26-52
To link to this article: http://dx.doi.org/10.1080/00220970903224602



The Journal of Experimental Education, 2010, 78, 26–52
Copyright © Heldref Publications
ISSN: 0022-0973 print
DOI: 10.1080/00220970903224602

Identification of Variables Associated With Group Separation in Descriptive Discriminant Analysis: Comparison of Methods for Interpreting Structure Coefficients

Holmes Finch
Ball State University

Discriminant Analysis (DA) is a tool commonly used for differentiating among 2 or more groups based on 2 or more predictor variables. DA works by finding 1 or more linear combinations of the predictors that yield maximal difference among the groups. One common goal of researchers using DA is to characterize the nature of group difference by interpreting the contributions of the individual predictors to this linear combination, often using structure coefficients (SC). The authors of this simulation study examine the utility of several methods for interpreting SCs. Results indicate that with samples greater than 100, a bootstrap confidence interval may be optimal, whereas with smaller samples, common rules of thumb may work best. Furthermore, nonnormal data and unequal covariance matrixes diminish the effectiveness of SCs as an interpretive tool.

Keywords: discriminant analysis, structure coefficients, bootstrap

DISCRIMINANT ANALYSIS (DA) is a statistical technique that researchers can use to differentiate members in two or more groups from one another by using a set of predictor variables. Researchers have used DA in a variety of studies in education, psychology, and other social science disciplines. At its base, DA involves using a set of predictor variables to differentiate among samples from two or more identified populations. Members of the sample upon which DA might

Address correspondence to Holmes Finch, Department of Educational Psychology, Ball State University, Muncie, IN 47306, USA. E-mail: [email protected]


be used would have scores on all of the predictors, and their group classification would be known. It should be noted that a commonly used alternative to DA is Logistic Regression (LR). LR uses group membership as the dependent variable and then models the log-likelihood of an individual being in one of the groups as a function of the predictor variables. It has been shown to be an effective tool in group membership prediction (Finch & Schneider, 2005; Meshbane & Morris, 1996), though it has not been compared with DA in terms of accurately identifying variables that differentiate groups (for a complete description of the logistic regression model, see Agresti, 2002).

There are two broad applications of DA that are commonly referred to as Predictive Discriminant Analysis (PDA) and Descriptive Discriminant Analysis (DDA). In the former, the goal is to find the weighted combination of the predictor variables that will maximize classification accuracy for members of the groups in question. In the latter, information about the linear combination is used to characterize differences among the groups in terms of the predictors. It is this latter application of DA that serves as the focus of this study. Specifically, I have studied the use of structure coefficients (SC), one of the primary tools used to characterize differences among the groups in terms of the predictors. First, I offer a general description of DA, followed by discussion of what SCs are and what they actually represent. Next, I discuss suggested guidelines for their interpretation, and describe the goals and methods of the present simulation study.

Discriminant Analysis

DA works by identifying weighted linear combinations of the observed predictors for which group separation is maximized. The number of these discriminant variables, or functions, that can possibly be generated for a particular problem is equal to the smaller of (a) the number of predictor variables or (b) the number of groups minus 1, as long as the covariance matrix of the predictor variables is of full rank.

Although there is a maximum number of possible discriminant functions that will be derived for a given problem, not all of these will necessarily represent statistically significant separation among the groups. Functions that are not significant are not generally interpreted by the data analyst because they do not contain useful information about group differences. Analysts use a statistical hypothesis test to determine the number of significant discriminant functions. In fact, there will be a hypothesis test associated with each of the discriminant functions, with the first testing the null hypothesis of no significant difference on any of the linear combinations versus the alternative that at least one combination resulted in group differences (Huberty & Olejnik, 2006). These significance tests are conducted


using Wilks' Lambda:

Λ = |S_wg| / |S_wg + S_bg|   (1)

where S_wg is the cross-products matrix within groups and S_bg is the cross-products matrix between groups. In the case where Lambda is significant and more than one discriminant function is possible (i.e., there are more than two groups and more than two predictors), subsequent tests are constructed by partitioning the original test statistic. The alternative hypothesis for each of these subsequent tests is that there are at least k significant linear combinations of the predictor variables, where k = 2, 3, etc.
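The quantity in Equation 1 is straightforward to compute directly from grouped data. The following is an illustrative Python sketch (not the article's SAS code): it builds the within- and between-groups cross-products matrixes and returns Lambda.

```python
import numpy as np

def wilks_lambda(X, groups):
    """Wilks' Lambda = |S_wg| / |S_wg + S_bg| (Equation 1).

    X is an (n, p) array of predictor scores; groups holds the n labels.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    S_wg = np.zeros((p, p))  # within-groups cross-products
    S_bg = np.zeros((p, p))  # between-groups cross-products
    for g in np.unique(groups):
        Xg = X[groups == g]
        dev = Xg - Xg.mean(axis=0)               # deviations from group mean
        S_wg += dev.T @ dev
        diff = (Xg.mean(axis=0) - grand_mean).reshape(-1, 1)
        S_bg += len(Xg) * (diff @ diff.T)        # weighted by group size
    return np.linalg.det(S_wg) / np.linalg.det(S_wg + S_bg)
```

Values of Lambda near 1 indicate little group separation, whereas values near 0 indicate strong separation.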

Each of these discriminant functions has its own set of weights and can be written as the following:

D_i = d_i1 x_1 + d_i2 x_2 + ... + d_ij x_j   (2)

where
x_j = value of variable x_j
d_ij = discriminant coefficient linking variable j to discriminant function i
D_i = value of discriminant function i

The discriminant coefficients, d_ij, are determined so that the groups are maximally separated on D_i (Tabachnick & Fidell, 2001). The first discriminant function represents the greatest difference among the groups, whereas the second function represents the second greatest difference, and so on. A value of D_i is obtained for each individual in the sample, and the mean of these values for each of the groups in question (referred to as the centroid) can also be obtained.
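Equation 2 and the group centroids can be illustrated with toy numbers; the weights below are arbitrary stand-ins, not coefficients estimated from data.

```python
import numpy as np

# Arbitrary weights d_11, d_12 for a single discriminant function
d = np.array([0.8, -0.5])
X = np.array([[1.0, 2.0],   # group 0
              [2.0, 1.0],   # group 0
              [4.0, 1.0],   # group 1
              [3.0, 2.0]])  # group 1
groups = np.array([0, 0, 1, 1])

scores = X @ d  # D_i for each subject (Equation 2)
# Group means of the scores: the centroids
centroids = {g: float(scores[groups == g].mean()) for g in np.unique(groups)}
```

The distance between the centroids on each retained function summarizes how far apart the groups are on that linear combination.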

DA rests upon three primary assumptions: (a) the predictor variables are normally distributed, (b) the covariance matrixes for the groups are homogeneous, and (c) the subjects are independent of one another (Tabachnick & Fidell, 2001). Research on the impact of violating these assumptions on the performance of DA has been primarily limited to the problem of prediction (PDA). Some researchers have found that nonnormally distributed predictor variables can have a deleterious effect on the ability of PDA to correctly classify individuals (e.g., Hess, Olejnik, & Huberty, 2001), though others report no such difficulties (Meshbane & Morris, 1996). With respect to heterogeneous covariance matrixes, prior studies seem to indicate that linear DA performs poorly in terms of classification accuracy, whereas quadratic DA is largely able to overcome difficulties due to unequal covariance matrixes (Finch & Schneider, 2005; Hess et al., 2001; McLachlan, 1992). Again, it should be noted that these results are applicable only to DA's ability to correctly predict which category an individual subject belongs to, and they do not relate directly to the question of describing how the groups differ from one another.


As Huberty and Olejnik (2006) point out, a natural question arises from finding significant differences among groups on one or more discriminant functions: What do these functions represent? That is, what is being described by the linear combinations produced by DA? Answering this question in the context of DDA can bring insight into differences among the groups. Two primary tools have been mentioned in the literature regarding the interpretation of these discriminant functions: structure coefficients (SC) and standardized discriminant function coefficients. In using SCs to interpret the meaning of the significant discriminant functions, it is important to remember that they represent maximized group difference as a linear combination of the predictors (Huberty & Olejnik). The SCs, which can be interpreted as the correlations between the discriminant variable and the predictors, are calculated as the product of the matrix of correlations among the predictor variables by the matrix of the discriminant coefficients described in Equation 2 (Stevens, 2000). There are two methods recommended for calculating SCs. Total group SCs (Cooley & Lohnes, 1985) are calculated as the following:

SC_T = R_T D   (3)

where
R_T = matrix of correlations among the predictor variables based on the total sample
D = matrix of standardized discriminant function coefficients

The correlation matrix has dimensions J × J, where J is the number of observed variables, whereas D has dimensions I × J, where I is the number of discriminant functions. In this equation, the discriminant function coefficients are standardized using the pooled (across groups) standard deviations.
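With toy numbers, Equation 3 is a single matrix product. In this sketch (illustrative values only), D is stored with one column per discriminant function, so that the product with the J × J correlation matrix conforms.

```python
import numpy as np

# Toy values: J = 3 predictors, one discriminant function.
R_T = np.array([[1.0, 0.5, 0.3],
                [0.5, 1.0, 0.4],
                [0.3, 0.4, 1.0]])     # total-sample correlations among predictors
D = np.array([[0.9],
              [0.2],
              [-0.1]])                # standardized coefficients, one column per function
SC_T = R_T @ D                        # Equation 3: total group structure coefficients
```

Substituting the within-group correlation matrix R_W for R_T in the same product yields the SC_W matrix discussed below.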

These SC values in DDA are similar in concept to the loadings commonly used to interpret factor analysis results. For both analyses, this value is essentially the correlation between the linear combination of observed scores (discriminant function or latent factor, depending upon the analysis) and those observed scores. In addition, in both analyses this value serves to identify which of these observed measurements are most associated with the unobserved construct. A more complete description of factor loadings can be found in Thompson (2004).

A possible problem with using total group correlations that has been cited in the literature is that the correlations ignore differences among the group means (Huberty & Olejnik, 2006). A suggested alternative approach to calculating the SC is to use the within-group correlations among the predictors, R_W, in place of the total sample correlation matrix. The resulting SC_W matrix measures the relationship, or correlation, between each of the observed variables and the discriminant functions, accounting for group differences among the means. One goal of the present study was to compare the ability of these two types of SC values in identifying the relative


importance of the predictor variables in group separation by using DDA. Earlier researchers comparing the performance of the two indexes found that for three groups and 10 normally distributed predictors, SC_T and SC_W yielded comparable results (Huberty, 1975).

It has been suggested that researchers use the SCs to take note of which of the observed predictors are most closely associated with statistically significant discriminant functions, in order to gain insight into how the groups differ (e.g., Stevens, 2000). In general, larger values of the SCs suggest greater association with the linear combination for which the groups are differentiated and can thus be thought of as indicators of the relative importance of each predictor in overall group separation. A natural question to arise when using SCs in this way is how large should the values be in order for the researcher to attach some practical importance to the predictor variables in question? That is, when the researcher is interested in characterizing the nature of the discriminant function(s) for which groups are significantly different, how large should the SC value of a specific predictor variable be for it to be considered as playing an important role in this characterization?

There does not appear to be a single answer in the literature to the question of what magnitude of SC values suggests meaningful contribution to group separation. Some authors have used specific cut values, whereas others have focused more on the relative magnitude of the SC values and attributed greater weight in interpretation to those predictors with relatively larger SC values, regardless of their actual magnitude. One proposal for a cutoff value for SCs is 0.3 (Tabachnick & Fidell, 2001; Pedhazur, 1997). Using this approach, a researcher might conclude that any SC value over 0.3 is in some sense important and thus can be seen as (at least partially) characterizing the discriminant function. Other researchers (e.g., Huberty & Olejnik, 2006; Stevens, 2000) have not made specific recommendations regarding how large SCs should be in order to be a part of characterizing the discriminant functions; rather, they have recommended that variables with larger such values are more closely related to the discriminant function than are those with smaller values. As is described below in more detail, these and other approaches have been used in practice.

Dalgleish (1994) introduced bootstrap and jackknife hypothesis tests that would obviate the need for cut values by providing test statistics for each SC. He found that the jackknife method did not work particularly well, and thus it is not discussed in more detail here. This application of the bootstrap involved random resampling with replacement from the original data set to create B samples of size N, where N is the original sample size. For each of these B resamples, DDA was conducted and the resulting SCs were retained. This resampling and analysis was replicated a large number of times (Dalgleish used both 100 and 1,000 such resamples; I discuss this further in the next section) in order to create a distribution of SC values. The mean of the SCs for each predictor variable served as the estimate of


the SC values, with the standard deviation of the bootstrap SCs being the standard error. He then constructed hypothesis tests for each SC:

Z = (θ̂_B − θ) / σ̂_B   (4)

where
θ̂_B = bootstrap estimate of the SC
θ = value of the SC under the null hypothesis (e.g., 0)
σ̂_B = estimate of the standard error for SCs (the standard deviation from the bootstrap samples)

This Z is distributed as a standard normal variate under certain regularity conditions, as N → ∞. Dalgleish referred to this as the standard bootstrap test. He also used this standardized bootstrap approach to construct a 95% confidence interval for each SC value.

In addition to the standard bootstrap, Dalgleish (1994) also presented an alternative method for inference with SCs, which he called the percentile bootstrap confidence interval. This approach simply took the 2.5% and 97.5% values from the bootstrap distribution of SCs as the endpoints for a 95% confidence interval. Finally, he proposed a third bootstrap approach for calculating confidence intervals, the bias corrected bootstrap confidence interval, which did not show great promise in the simulation study he used to assess the performance of these methods and, thus, is not discussed here.

One issue that Dalgleish (1994) addressed is the proper alignment of SC solutions. Because discriminant functions are extracted in order of magnitude and are of arbitrary sign, there is no guarantee that in multiple bootstrap samples the SCs will align similarly with respect to sign or function. Consequently, Dalgleish recommended a realignment procedure based on Clarkson (1979), which involved changing the signs of the SCs and the function order so as to minimize the sum of squared differences between the full sample structure matrixes and those of the bootstrap samples. To maintain consistency with his approach, this alignment approach was used here as well.
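Dalgleish's resampling scheme, including a one-function version of the sign alignment, can be sketched as follows. This is an illustrative Python sketch, not the article's SAS macro; `sc_fn` is a hypothetical placeholder for whatever routine returns the SC vector for the first discriminant function.

```python
import numpy as np

def standard_bootstrap_sc(X, groups, sc_fn, B=100, seed=1):
    """Standard bootstrap estimate, SE, and 95% CI for SCs (after Dalgleish, 1994).

    sc_fn(X, groups) must return the SC vector for the first discriminant
    function; the sign flip below is the single-function case of the
    Clarkson (1979) alignment step.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    full = sc_fn(X, groups)                  # SCs from the full sample
    n = len(groups)
    boot = np.empty((B, len(full)))
    for b in range(B):
        idx = rng.integers(0, n, size=n)     # resample cases with replacement
        sc = sc_fn(X[idx], groups[idx])
        # Flip the sign if that brings the resampled SCs closer to the
        # full-sample solution (minimizes the sum of squared differences).
        if np.sum((sc - full) ** 2) > np.sum((-sc - full) ** 2):
            sc = -sc
        boot[b] = sc
    est = boot.mean(axis=0)
    se = boot.std(axis=0, ddof=1)
    return est, se, (est - 1.96 * se, est + 1.96 * se)
```

An SC whose interval excludes zero would be flagged as statistically significant under the standard bootstrap test.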

On one hand, the results from Dalgleish's (1994) small simulation study showed that the jackknife method and the bias corrected bootstrap had inflated Type I error rates; that is, these methods identified significant SCs more frequently than the nominal alpha level of 0.05 would suggest. On the other hand, he found that both the standard and percentile bootstrap approaches had somewhat conservative Type I error rates. Furthermore, he found that using cut values of 0.3, 0.4, and 0.5 yielded Type I error rates of 0.024, 0.007, and 0.001, respectively. This study simulated 60 subjects divided equally across three groups and measured on 11 predictor variables. Dalgleish found that the bootstrap approach worked equally well with 100 B samples as with 1,000. He recommended that, considering its good performance and ease of


computation, practitioners use the standard bootstrap approach, which I have done in this study.

An alternative to SCs that some researchers have recommended for interpreting discriminant functions is the standardized discriminant coefficient (Rencher, 1992). This coefficient is calculated as the following:

d*_ij = d_ij √(s_j²)   (5)

where
d_ij = raw discriminant weight for variable j and function i
s_j² = variance of variable j
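Numerically, Equation 5 simply rescales each raw weight by the standard deviation of its variable; the values below are toy numbers chosen for illustration.

```python
import numpy as np

d_raw = np.array([0.6, -0.2])  # raw weights d_ij for one function (toy values)
s2 = np.array([4.0, 0.25])     # predictor variances s_j^2
d_std = d_raw * np.sqrt(s2)    # Equation 5: d*_ij = d_ij * sqrt(s_j^2)
```

Without this rescaling, raw weights for variables on very different scales cannot be compared directly.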

In the literature, there is some disagreement regarding which approach for interpreting DDA results, SCs or standardized coefficients, is preferable. Rencher (1992) showed that the SCs represent the unique relationship between each of the observed predictors and the discriminant function. He went on to point out that the SC for a given predictor does not partial out the relationships of the other predictors with the discriminant function. In other words, the SC does not represent the relationship between the observed variables and the discriminant function(s), given the presence of the other variables. In contrast, Huberty and Wisenbaker (1992) argued that the use of the standardized coefficients is not appropriate and should be avoided when attempting to order observed predictors for the purpose of characterizing discriminant functions. Their objection centered on the fact that simply ordering observed variables in importance on the basis of their relative standardized weights ignores the magnitude of the latter. Furthermore, when more than one discriminant function is present, there are issues concerning combining the standardized weights for each function so as to come up with a reasonable ordering of variable contribution to group differentiation (Huberty & Wisenbaker). Neither of two Monte Carlo studies that compared these two statistics as tools for ordering the predictor variables in terms of importance was able to come to a definitive conclusion regarding the superiority of one method over the other (Barcikowski & Stevens, 1975; Huberty, 1975). This lack of incontrovertible evidence supporting one approach over the other, coupled with the predominant recommendation in popular textbooks to use SCs (Huberty & Olejnik, 2006; Johnson & Wichern, 2002; Pedhazur, 1997; Stevens, 2000; Tabachnick & Fidell, 2001), led to the focus of this study on the use of SCs in identifying variables that were most associated with the discriminant functions that best differentiate between groups.

A brief review of the psychology literature over the last several years, using PsycInfo with the keywords discriminant analysis and descriptive discriminant analysis, revealed that when DDA was applied, no common approach for


interpreting SCs was uniformly applied. Some of the manuscripts (Dembo, Wareham, & Schmeidler, 2005; Sirois, Sears, & Marhefka, 2005; Sherry, Henson, & Lewis, 2003) used no cutoff values at all, but rather interpreted the discriminant functions in terms of the relative magnitude of the observed variables' SCs, whereas others did use formal cutoff values, including 0.3 (Glaser, Calhoun, & Petrocelli, 2002; Russell & Cox, 2000) and 0.4 (Matters & Burnett, 2003). Though not a comprehensive review of the literature, these articles from the past 6 years, which I randomly selected from the PsycInfo database, point out the general lack of agreement over interpreting SC values.

Given this apparent lack of consensus on how best to use SCs in practice, the present simulation study seeks to compare the usefulness of several approaches under a variety of data conditions. My goal here was to extend upon earlier research (e.g., Barcikowski & Stevens, 1975; Huberty, 1975; Rencher, 1992) by examining SCs under a wider variety of conditions and by comparing the performance of several methods for identifying SCs that represent real, observed effects, including cutoff values of 0.3, 0.4, and 0.5, as well as the relative magnitude of SCs and the results of bootstrap tests.

METHOD

To gain a greater understanding regarding the usefulness of SCs for interpreting DA results under a variety of situations, including the application of a variety of approaches for identifying observed variables that are most associated with a significant discriminant function, a Monte Carlo simulation study was used. Several conditions were varied in this study, each of which is described below. All conditions were completely crossed with one another, and 1,000 replications of each combination were run, using the SAS software system, Version 9.1 (SAS, 2005). PROC DISCRIM was used to conduct the actual DDA and generate the SCs. The simulations were conducted using a SAS macro that I wrote. I examined two overall data conditions: (a) two groups and two predictor variables and (b) two groups and six predictor variables. The other conditions that were varied in the study include the following.

Interpretation Criteria

As previously described, there are a number of potential methods for using SCs to characterize significant discriminant functions, including cutoff values (e.g., Tabachnick & Fidell, 2001; Pedhazur, 1997), comparison of relative magnitudes (Huberty & Olejnik, 2006; Stevens, 2000), and an inferential approach based upon the bootstrap (Dalgleish, 1994). Given this variety of recommendations


for using SCs to gain an understanding of group separation, and the observed variation in actual practice previously cited, I have attempted to compare the relative performance of these approaches in terms of correctly identifying variables that contribute to the group separation. Specifically, the methods for interpreting SCs used here include cutoff values of 0.3, 0.4, and 0.5; the bootstrap confidence interval; and a more general criterion in which the relative magnitudes of the SCs for the two predictors are compared, and the one with the larger such value is deemed to contribute more to the significant discriminant function. In the latter case, the outcome of interest was the proportion of times the variable exhibiting the larger effect size also had the larger SC value. This criterion is similar to that used by Huberty (1975) and Barcikowski and Stevens (1975) in their studies of SC stability. In the case of the cutoff values, the outcome of interest was the proportion of times that the variables with nonzero effect sizes had SC values greater than the cutoff. Finally, the standard bootstrap approach for creating 95% confidence intervals for the SCs as outlined by Dalgleish (1994) was also included in this study. If zero was included in the interval, the SC value was concluded to be not statistically significant. A total of 100 bootstrap samples were drawn for each replication. This value is in keeping with Dalgleish's recommendations from his simulation study.
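With toy SC values and a toy bootstrap interval, the competing decision rules reduce to a few lines (an illustrative Python sketch; the study itself implemented these rules in SAS):

```python
import numpy as np

sc = np.array([0.55, 0.35, 0.10])      # toy SC values for three predictors
ci_lo = np.array([0.30, -0.05, -0.12]) # toy bootstrap 95% CI lower bounds
ci_hi = np.array([0.80, 0.75, 0.32])   # toy bootstrap 95% CI upper bounds

# Cutoff rules: flag predictors whose |SC| meets each threshold
flag_cut = {c: np.abs(sc) >= c for c in (0.3, 0.4, 0.5)}

# Bootstrap rule: flag predictors whose CI excludes zero
flag_boot = ~((ci_lo <= 0) & (0 <= ci_hi))

# Relative-magnitude rule: the predictor with the largest |SC|
largest = int(np.argmax(np.abs(sc)))
```

Note that the rules can disagree: here the second predictor passes the 0.3 cutoff but fails both the 0.4 cutoff and the bootstrap test.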

Distribution of the Predictor Variables

There were two conditions for the distribution of the predictor variables: normal and nonnormal. In the former, each of the predictors was drawn from a normal distribution with a given mean and standard deviation (for details on the values of the means and standard deviations, see the following discussions of effect size and covariance matrixes). For the nonnormal data, the predictor variables were distributed with a skewness of 1.75 and kurtosis of 3.75, using the methods outlined by Fleishman (1978) for maintaining the desired correlation value. I selected this particular nonnormal distribution because it has been shown by Hess, Olejnik, and Huberty (2001) to have a deleterious effect on the performance of PDA in terms of classification accuracy. Thus, I was interested in ascertaining whether these problems would also appear in the context of DDA.
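Fleishman's (1978) power method generates such data by transforming a standard normal variate Z as Y = a + bZ + cZ² + dZ³, where the coefficients solve his moment equations. A sketch, assuming SciPy is available and treating the reported kurtosis of 3.75 as excess kurtosis:

```python
import numpy as np
from scipy.optimize import fsolve

def fleishman_coefficients(skew, excess_kurt):
    """Solve Fleishman's moment equations for (a, b, c, d), with a = -c."""
    def moment_eqs(p):
        b, c, d = p
        return [
            b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1,         # unit variance
            2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew,   # target skewness
            24*(b*d + c**2*(1 + b**2 + 28*b*d)
                + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2))
            - excess_kurt,                               # target kurtosis
        ]
    b, c, d = fsolve(moment_eqs, [1.0, 0.1, 0.0])
    return -c, b, c, d

a, b, c, d = fleishman_coefficients(1.75, 3.75)  # the article's condition
z = np.random.default_rng(7).standard_normal(100_000)
y = a + b*z + c*z**2 + d*z**3                    # skewed, kurtotic variate
```

The resulting Y has mean 0 and variance 1 in expectation, with the requested higher moments.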

Effect Size

Because the focus of this study was on the ability of SCs to correctly identify variables that were most associated with discriminant functions for which groups were significantly different, group separation was simulated using Cohen's d, the univariate effect size (Cohen, 1988). Specifically, three values of d were used (0, 0.5, and 0.8), corresponding to no difference, moderate difference, and large


difference, respectively, as characterized by Cohen. These three effect sizes were simulated in the following manner: When the underlying distribution was normal and the covariance matrixes were equal, one of the groups had data generated with a mean of 0 and a standard deviation of 1, whereas data for the other group were generated with a mean equal to the value of Cohen's d (0, 0.5, or 0.8) and a standard deviation of 1. All combinations of these effect sizes, except both being 0, were crossed with one another to create the following set of conditions: 5/0, 8/0, 5/5, 8/5, 8/8. I decided that in all cases, at least one of the variables would be simulated with a group difference, thus precluding the appearance of the 0/0 condition. In the six-variable case, a similar set of effect size patterns was used. Specifically, the effect sizes of variables two through six were identical, so that in the 8/5 condition, for example, the effect size for the first variable was 0.8, whereas for each of the second through sixth variables, the effect size was 0.5. Although many different patterns were possible, I decided to use this straightforward approach to setting the effect sizes.
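The generation recipe for a single predictor under the normal, equal-covariance condition can be checked numerically; the large sample below is used only to verify that the recipe recovers d (the study itself used total samples of 30 to 150).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000                          # large n only to check the recipe

g1 = rng.normal(0.0, 1.0, size=n)   # group 1: mean 0, SD 1
g2 = rng.normal(0.8, 1.0, size=n)   # group 2: mean = d = 0.8, SD 1

# Empirical Cohen's d: mean difference over the pooled standard deviation
pooled_sd = np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)
d_hat = (g2.mean() - g1.mean()) / pooled_sd
```

The same shift-of-means construction, applied variable by variable, produces each of the effect size patterns listed above.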

Sample Size

I simulated four different total sample size conditions in the present study: 30, 60, 100, and 150. These values were designed to represent a range of realistic study conditions, from very small (30) to moderately large (150). They correspond to values seen in the applied DA literature (e.g., Glaser, Calhoun, & Petrocelli, 2002; Matters & Burnett, 2003; Russell & Cox, 2000).

Sample Size Ratio

I used two different group size ratios: 1:1 and 1:2. In the first condition, the two groups were created so as to have equal numbers of simulees. Because it is recognized that in much (perhaps most) applied research, groups are not of equivalent size, I also included the second condition, in which one group was twice as large as the other. In the case of unequal covariance matrices, the group with the largest variance also had the largest sample size.

Equality of Groups’ Covariance Matrixes

There were two conditions simulated with respect to the groups' covariance matrixes: equal and unequal. Given that equality of the covariance matrixes is a foundational assumption underlying the DA studied here, it was important to assess the utility of SCs when the assumption was violated. Specifically, inequality of covariance matrixes was translated into unequal standard deviations generated


during the simulation of the data, with one group having a value that was 5 times larger than that of the other. As stated above, when the covariance matrixes were unequal, the larger group had the larger standard deviation.

Correlation Between the Predictor Variables

Three correlation values were used in the simulations: 0.3, 0.5, and 0.8. These values were selected to reflect low, moderate, and large relationships among the predictors. When the data were not normally distributed, the Fleishman method was used for determining the correlation values among the predictors.
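Putting the effect size and correlation conditions together, one way to generate a single replication under the equal-covariance, normal-distribution cell of the design is to draw both groups from a multivariate normal with an equicorrelation matrix and shift the second group's mean vector by the per-variable values of d. This is a sketch under those assumptions, not the article's code; the names are illustrative:

```python
import numpy as np

def simulate_groups(d, r, n1, n2, rng):
    # Draw two groups on p correlated predictors (unit variances, common
    # correlation r); the second group's mean on each variable is shifted
    # by the corresponding Cohen's d value.
    p = len(d)
    cov = np.full((p, p), r) + (1.0 - r) * np.eye(p)
    g1 = rng.multivariate_normal(np.zeros(p), cov, size=n1)
    g2 = rng.multivariate_normal(np.asarray(d, dtype=float), cov, size=n2)
    return g1, g2
```

For example, the 8/0 condition with r = 0.3 would be `simulate_groups([0.8, 0.0], 0.3, n1, n2, rng)`.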

Structure Coefficient Type

I examined both the total groups SC (SCT) and the within-groups SC (SCW). As discussed previously, it has been recommended that SCW may be preferable because it accounts for the fact that the values on the predictor variables have been found to differ in the vector of means (Huberty & Olejnik, 2006). Although prior research indicates that there may be little qualitative difference in terms of their relative stability when the predictors are normally distributed and group covariance matrices are homogeneous (Huberty, 1975), it is not clear whether that consistency holds with nonnormal data and heterogeneous covariance matrixes.

Number of Predictors

I conducted simulations for cases with two and six predictor variables, in the hope that by including these two conditions, the findings of the study would be generalizable to a wider array of real life conditions in which DDA is applied.

To ascertain which of the manipulated factors (or combinations of factors) had an impact on the performance of the SCs, I used a full factorial analysis of variance (ANOVA), including all main effects and interactions. Significant results from this analysis were used to identify which of the manipulated factors should be further investigated using descriptive statistics. The outcome of primary interest in this study is the rate at which the SCs detect variable(s) that are most associated with group differentiation.
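The effect sizes later reported for this ANOVA (Table 1) are omega-squared values. As a reminder of the computation (a sketch; the function name is illustrative), omega-squared adjusts the proportion of variance explained by an effect for the error mean square:

```python
def omega_squared(ss_effect, df_effect, ss_error, df_error, ss_total):
    # omega^2 = (SS_effect - df_effect * MS_error) / (SS_total + MS_error),
    # a less biased analogue of eta^2 for an ANOVA effect.
    ms_error = ss_error / df_error
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)
```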

RESULTS

Here, I present the results of the simulation study in four parts. First are cases in which the observed variables are associated with group separation in the two


groups, and the second are the null cases in which the variables are not associated with group differences. The third section includes results for the six variables case, and finally, results are presented for a real data example. Results for the SCT and SCW were nearly identical across all study conditions included here. For this reason, only the results for SCW are included in the Results section.

ANOVA was used to identify main effects and interactions that were associated with the methods' abilities to correctly identify variables associated with group separation (power). I believe that power is reasonable terminology in this case, given that correct identification of variables associated with group separation is conceptually related to the idea of rejecting a hypothesis of no variable impact. At the same time, I recognize that for the rules of thumb, there is no hypothesis testing being done, and thus true power cannot be calculated. In the same way, I will refer to Type I error as the case in which a variable with means that are known not to differ between groups is nonetheless identified by these methods as being salient in understanding group difference. Again, I recognize that, with the exception of the bootstrap test, this is not a Type I error in the true sense.
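Under these definitions, the empirical "power" and "Type I error" of a cutoff rule can be tabulated directly from the replications; a minimal sketch (names are illustrative, not from the article):

```python
import numpy as np

def flag_rates(abs_sc, is_nonnull, cutoff):
    # abs_sc: (replications x variables) array of |structure coefficient|.
    # is_nonnull: boolean mask marking variables simulated with a true
    # group difference. Returns the rate at which the cutoff rule flags
    # nonnull variables ("power") and null variables ("Type I error").
    flagged = abs_sc >= cutoff
    power = flagged[:, is_nonnull].mean()
    type1 = flagged[:, ~is_nonnull].mean()
    return power, type1
```

The same tabulation applies to the bootstrap by replacing the cutoff indicator with "confidence interval excludes zero."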

Power

The ANOVA results indicated that several two-way interactions were significantly associated with power for these methods, as were all the main effects. The significant interactions and their effect size values appear in Table 1. Of these interactions, it is important to note that three stand out with much larger effect sizes than the others, including the effect size by distribution, covariance matrix equality status by distribution, and sample size by effect size. Because these three interactions have much larger effect size values, they are the focus of the following discussion. The power rates for the interactions involving distribution of the predictor variables appear in Table 2.

Results presented in Table 2 suggest that regardless of the distribution of the observed variables, the use of the 0.3 rule resulted in the highest power for all methods examined in the present study. Furthermore, greater group separation (as expressed through a larger effect size) was associated with greater power for all of the methods, again regardless of the distribution. The interaction between effect size and distribution appears to be primarily associated with differential performance for the bootstrap, and to a lesser extent the direct comparison of the two predictors. Specifically, when the data come from a normal distribution, the bootstrap has power that is comparable to that of the 0.5 cutoff value rule. However, in the nonnormal case, the bootstrap suffers a decrease in power of at least 0.08, except for the cases in which the two variables have the same effect size. In contrast, none of the rules of thumb experienced similar declines, except for the 0.5 rule of thumb with the 0.8 effect coupled with a 0.5 effect (the 85 1


TABLE 1
Statistically Significant Interaction Terms and ω2 Effect Size

Interaction                    ω2 effect size

Distribution × Effect size         0.606
Distribution × Correlation         0.044
Distribution × Covariance          0.494
Distribution × Sample size         0.018
Covariance × Correlation           0.051
Effect size × Correlation          0.043
Effect size × Sample size          0.290

setting). The direct comparison of SCs resulted in the lowest power across all effect size values. It should be noted that when the effect sizes were the same for both predictors (88, 55 settings), we would not expect this method to result in particularly high power, because the impact of both variables is the same, and this was precisely what the results demonstrated: In either the all-large or all-medium effect size conditions, simply comparing the SCs resulted in roughly half of the cases with larger values for the first variable and half for the second variable.

Last, for all of the methods, it appears that when the medium effect size was combined with the large effect size, power was lower for the medium effect than it was when the medium effect size was associated with either a null effect or another medium effect. In other words, when a variable exhibiting a medium effect difference between groups occurred in the presence of a variable with a large effect difference, the SCs were less likely to correctly identify the contribution of the medium effect size variable to group separation than they were when the medium effect variable was accompanied by another of comparable effect or null effect.

In terms of power given the distribution and covariance matrix equality or inequality, results in Table 2 suggest that when the covariance matrixes of the two groups were equal, all of the methods had greater power for identifying group separation when the predictor variables were normally distributed. However, when the covariance matrixes were unequal, the rules of thumb all had higher power in the nonnormal case, whereas the bootstrap had greater power for the normally distributed predictors. For the direct comparison of values, power was highest when the data were normally distributed and the group covariance matrixes were equivalent, whereas when the covariance matrixes were not equal, power was not affected by the underlying distribution. It is interesting to note that the greatest


TABLE 2
Power by Distribution (Normal and Nonnormal) and Effect Size (ES), Covariance Matrix Equality/Inequality, Correlation Between Predictors and Sample Size

                  Direct            0.3 rule          0.4 rule          0.5 rule          Boot
Variable      Normal Nonnormal  Normal Nonnormal  Normal Nonnormal  Normal Nonnormal  Normal Nonnormal

ESa
 50            0.58    0.52      0.90    0.90      0.85    0.87      0.78    0.83      0.76    0.63
 80            0.67    0.62      0.95    0.94      0.92    0.91      0.86    0.88      0.89    0.80
 85 1          0.80    0.55      0.98    0.93      0.96    0.89      0.95    0.85      0.93    0.85
 85 2b         0.20    0.45      0.85    0.82      0.77    0.76      0.66    0.71      0.67    0.57
 88            0.49    0.47      0.96    0.95      0.94    0.93      0.92    0.91      0.91    0.92
 55            0.47    0.44      0.92    0.91      0.88    0.88      0.83    0.84      0.80    0.84
Covariance matrix
 Equal         0.74    0.50      0.97    0.85      0.95    0.79      0.91    0.73      0.92    0.89
 Unequal       0.63    0.63      0.92    0.99      0.88    0.99      0.83    0.99      0.80    0.73
Correlation between predictors
 0.3           0.79    0.60      0.95    0.94      0.93    0.91      0.89    0.88      0.87    0.87
 0.5           0.75    0.56      0.95    0.93      0.92    0.90      0.89    0.87      0.87    0.80
 0.8           0.52    0.53      0.93    0.91      0.89    0.88      0.83    0.84      0.83    0.75
Sample size
 30            0.59    0.51      0.89    0.91      0.85    0.88      0.79    0.84      0.75    0.67
 60            0.67    0.55      0.94    0.92      0.90    0.89      0.86    0.86      0.84    0.78
 100           0.72    0.58      0.96    0.93      0.94    0.90      0.86    0.87      0.90    0.86
 150           0.75    0.61      0.98    0.94      0.96    0.92      0.93    0.88      0.94    0.93

a50 = 0.5/0.0; 80 = 0.8/0.0; 85 1 = power for large effect in 0.8/0.5 condition; 85 2 = power for medium effect in 0.8/0.5 condition; 88 = 0.8/0.8; 55 = 0.5/0.5. bThis represents an error in the sense that the variable with the smaller effect in the population had a higher SC value than the variable with the larger effect.


power was associated with the rules of thumb (all at 0.99) when the assumptions of normality and homogeneous covariance matrixes were not met. In contrast, the bootstrap performed best when both assumptions were met. One caveat to keep in mind is that these power results cannot be interpreted without also considering the Type I error rates, the results of which appear in the next section.

In terms of the interaction between predictor distribution and interpredictor correlation, it appears that for the rules of thumb power declined very slightly as the correlation increased, in both the normal and nonnormal cases. For the bootstrap, as the correlation increased, the power decreased much more dramatically when the distribution was not normal. For the direct comparison of the SCs between the two variables (limited to only cases in which the effect sizes were not equivalent), the decrease in power associated with increased values of the correlation was more marked than for the other methods when the data were normally distributed.

Last, with respect to sample size, all of the methods examined here had greater power for larger sample sizes, for both normal and nonnormal predictor variables. Sample size appeared to have a greater impact for the bootstrap test, the direct comparison of SCs, and the 0.5 rule of thumb (in the normal condition) than for the other two conditions. Indeed, at the largest sample size, the power of the bootstrap was comparable to that of the 0.4 rule of thumb, which cannot be said for the smaller sample conditions. Also of note with respect to the bootstrap method is that the diminution in power associated with nonnormality of the predictor variables was mitigated at the largest sample size.

Table 3 contains the power results for varying effect sizes and sample sizes. Several interesting patterns emerge from this table. For the rules of thumb, larger samples were associated with greater statistical power, except for a moderate effect size coupled with a large effect (85 2 setting), in which case sample size was unrelated to power. The power of the direct comparison of SC magnitudes and the bootstrap were, generally, more heavily influenced by sample size than was power for the rules of thumb. The results presented in Table 3 also make manifest the fact that for smaller sample sizes, the bootstrap and direct comparison methods exhibited lower power than any of the rules of thumb, whereas for samples of 100 or 150, the bootstrap had comparable power to the 0.3 rule of thumb. The direct comparison of values approach had consistently lower power for identifying the more important variable than did the rules of thumb across nearly all conditions. Last, it appears that for both moderate and large effects, across sample sizes, the bootstrap had greater power when the two variables were of equal effect than when one was at a lower effect size. For example, the bootstrap power for a moderate effect with n = 30 was 0.57 when the other variable in the analysis had a null effect (50 condition) and was 0.70 when both variables had a medium effect (55 condition). A similar pattern was evident for the large effect cases as well. As shown in Tables 1 and 2, the power for the medium effect variable coupled with


TABLE 3
Power by Effect Size (ES) and Sample Size (N)

ESa      N    Direct   0.3    0.4    0.5   Boot

50      30     0.50   0.87   0.82   0.77   0.57
        60     0.54   0.89   0.85   0.80   0.67
       100     0.57   0.91   0.87   0.82   0.69
       150     0.59   0.92   0.89   0.84   0.83
80      30     0.58   0.91   0.87   0.81   0.66
        60     0.63   0.94   0.90   0.86   0.84
       100     0.67   0.96   0.93   0.89   0.92
       150     0.70   0.97   0.95   0.91   0.96
85 1    30     0.58   0.92   0.89   0.85   0.78
        60     0.66   0.94   0.92   0.89   0.87
       100     0.71   0.96   0.94   0.92   0.95
       150     0.75   0.97   0.96   0.93   0.97
85 2b   30     0.58   0.83   0.76   0.69   0.50
        60     0.66   0.83   0.76   0.68   0.57
       100     0.71   0.83   0.76   0.68   0.65
       150     0.75   0.84   0.77   0.68   0.78
88      30     0.46   0.92   0.89   0.85   0.84
        60     0.48   0.95   0.93   0.90   0.84
       100     0.49   0.97   0.96   0.94   0.96
       150     0.49   0.99   0.98   0.96   0.99
55      30     0.44   0.89   0.84   0.79   0.70
        60     0.45   0.91   0.87   0.83   0.81
       100     0.46   0.93   0.90   0.86   0.88
       150     0.47   0.95   0.92   0.89   0.90

a50 = 0.5/0.0; 80 = 0.8/0.0; 85 1 = power for large effect in 0.8/0.5 condition; 85 2 = power for medium effect in 0.8/0.5 condition; 88 = 0.8/0.8; 55 = 0.5/0.5. bThis represents an error in the sense that the variable with the smaller effect in the population had a higher SC value than the variable with the larger effect.

the large effect was lower than it was for the medium effect coupled with either the null or another medium effect, across sample sizes.

Type I Error Rate

In this study, the Type I error rate was affected by inclusion in the DA of a variable that was simulated not to be associated with group separation. Obviously, because researchers in practice would not know beforehand which variables were definitely associated with group separation and which were not, this situation almost certainly occurs in actual research. If a variable is incorrectly identified as being associated with the discriminant function when it is not, we can think of it


TABLE 4
Type I Error Rate by Method and Effect Size (ES) of Accompanying Variable

ESa    Direct   0.3    0.4    0.5   Boot

Two variables
50      0.45   0.69   0.61   0.54   0.23
80      0.35   0.57   0.49   0.42   0.16

Six variables
50      0.51   0.51   0.42   0.34   0.12
80      0.44   0.43   0.35   0.29   0.11

a50 = 0.5/0.0; 80 = 0.8/0.0

as being a Type I error, and indeed in the case of the bootstrap hypothesis test, it is a true Type I error. The ANOVA for this part of the study indicated that the most salient of the manipulated factors in predicting the Type I error rate was the effect size of the accompanying variable. The overall error rate, by magnitude of group separation for the accompanying variable, is shown in Table 4.

These results indicate that the rate of incorrect identification of a variable as being associated with the discriminant function was fairly high for all of the methods examined here, with the lowest by far belonging to the bootstrap hypothesis test. In addition, the error rate was greater when the null variable was paired with one of medium effect as opposed to large effect. As would be expected with the rules of thumb, the more stringent the criteria for identifying an important effect (i.e., 0.5 vs. 0.4 or 0.3), the lower the Type I error.

Table 5 contains the Type I error rates for the methods examined in the present study, by the effect size of the accompanying variable (0.5 or 0.8) and the other manipulated variables. The rules of thumb and the direct comparison of the SCs for the two variables had greater Type I error rates when the distribution of the predictor variables was not normal. This result colors the earlier finding that the rules of thumb had higher power in the nonnormal case. In actuality, they appear to be biased toward indicating that a variable is associated with the discriminant function regardless of whether it actually is or not. In contrast, the error rate for the bootstrap test was lower in the nonnormal case. When the covariance matrixes were unequal for the two groups, the Type I error rates for all five methods were larger than they were when the covariance matrixes were equal, with the greatest inflation occurring for the rules of thumb. For the rules of thumb and the bootstrap hypothesis test, the Type I error rate declined as the correlation between the predictor variables increased. However, when the magnitudes of the two SCs were simply compared with one another, higher correlation values were associated with higher Type I error rates. Last, for all of the methods included in this study, the Type I error rate declined as the sample size increased.


TABLE 5
Type I Error Rate by Effect Size of Accompanying Variable and Distribution, Covariance Matrix Equality/Inequality, Correlation Between Predictors and Sample Size

                Direct        0.3           0.4           0.5           Boot
Variable       0.5   0.8     0.5   0.8     0.5   0.8     0.5   0.8     0.5   0.8

Distribution
 Normal       0.42  0.33    0.51  0.34    0.39  0.23    0.30  0.15    0.30  0.18
 Nonnormal    0.48  0.38    0.87  0.80    0.83  0.74    0.78  0.69    0.16  0.13
Covariance matrix
 Equal        0.47  0.35    0.59  0.43    0.48  0.32    0.40  0.23    0.18  0.11
 Unequal      0.43  0.36    0.80  0.72    0.74  0.65    0.69  0.60    0.29  0.21
Correlation between predictors
 0.3          0.37  0.24    0.74  0.62    0.66  0.53    0.58  0.46    0.34  0.21
 0.5          0.42  0.30    0.71  0.59    0.63  0.50    0.56  0.43    0.22  0.14
 0.8          0.56  0.48    0.64  0.51    0.55  0.43    0.48  0.37    0.13  0.12
Sample size
 30           0.50  0.42    0.78  0.69    0.70  0.60    0.63  0.52    0.33  0.27
 60           0.46  0.37    0.72  0.60    0.64  0.51    0.57  0.44    0.24  0.17
 100          0.43  0.33    0.67  0.53    0.58  0.44    0.51  0.38    0.22  0.10
 150          0.41  0.30    0.62  0.47    0.53  0.38    0.46  0.33    0.13  0.08


Six Variables Case

Results of the ANOVA for the six variables condition were very similar to those for two variables. Indeed, the same interactions identified as statistically significant in Table 1 apply to the six variables case as well. In general, the power and Type I error rates in the six variables condition were both lower than were the rates for two variables, though the patterns for the various levels of the manipulated factors were similar in the two cases. Table 6 contains the power of the various methods studied here under all of the manipulated conditions for the six variables case. In general, the power rates for individual variables were lower when more predictors were included in the analysis. In addition, it appears that differences in power for the normal and nonnormal distributions were larger with more variables for the rules of thumb. However, although power for the bootstrap approach was lower in the six variables case, the gap between the normal and nonnormal results was not wider with more variables. Last, the impact of the intervariable correlation was stronger in the six predictors condition. With two predictors, power declines somewhat for the rules of thumb as the correlation increases, whereas the decline in power with a concomitant increase in correlation is much more dramatic for six predictors.

The values in Table 4 show that the Type I error rates were lower in the six variables case, and as was true for two variables, the Type I error rate was lower when the effect size for the accompanying variable(s) was 0.8, as opposed to 0.5. Table 7 contains these Type I error rates for the six variables condition by the manipulated variables in this study. One interesting pattern in the six variables case that differed from that for two variables was the difference in error rates between the normal and nonnormal distributions. Specifically, the error rates in the normal case with six predictors were generally below 0.20 for these rules of thumb. Indeed, they were as low or lower than the error rates for the bootstrap approach, except for the 0.3 rule with 0.5 accompanying variable effect size. In addition, the decrease in Type I error rates with increasing sample size was not as great with six variables as it was with two. In terms of the bootstrap technique, the differences in Type I error rates between the 0.5 and 0.8 conditions were not as notable with six variables as with two.

Real Data Example

To demonstrate directly the use of these various methods of interpreting DDA structure coefficients, analysis of a real set of data was conducted. These data represent scores on a performance motivation scale taken from 281 college freshmen. The scale produced three scores, each of which represents a different motivation for performance, including performance avoidance (avoidance), performance approach (approach), and performance mastery (mastery). DDA was used to


TABLE 6
Power by Distribution (Normal and Nonnormal) and Effect Size (ES), Covariance Matrix Equality/Inequality, Correlation Between Predictors and Sample Size for the Six-Variable Case

                  Direct            0.3 rule          0.4 rule          0.5 rule          Boot
Variable      Normal Nonnormal  Normal Nonnormal  Normal Nonnormal  Normal Nonnormal  Normal Nonnormal

ESa
 50            0.48    0.51      0.70    0.82      0.55    0.75      0.37    0.67      0.54    0.53
 80            0.55    0.57      0.82    0.89      0.66    0.84      0.43    0.76      0.80    0.79
 85 1          0.44    0.50      0.86    0.87      0.74    0.81      0.57    0.73      0.79    0.79
 85 2b         0.32    0.25      0.57    0.75      0.38    0.67      0.21    0.59      0.58    0.51
 88            0.47    0.47      0.76    0.90      0.57    0.86      0.35    0.80      0.79    0.77
 55            0.46    0.45      0.66    0.83      0.48    0.77      0.31    0.69      0.55    0.50
Covariance matrix
 Equal         0.37    0.34      0.82    0.73      0.66    0.63      0.45    0.52      0.81    0.81
 Unequal       0.39    0.50      0.70    0.99      0.54    0.98      0.37    0.94      0.59    0.54
Correlation between predictors
 0.3           0.48    0.47      0.81    0.87      0.69    0.80      0.52    0.70      0.69    0.70
 0.5           0.42    0.40      0.79    0.87      0.64    0.82      0.46    0.76      0.73    0.68
 0.8           0.24    0.38      0.68    0.85      0.46    0.79      0.24    0.73      0.68    0.66
Sample size
 30            0.40    0.43      0.61    0.82      0.45    0.74      0.30    0.64      0.48    0.48
 60            0.38    0.42      0.74    0.86      0.57    0.80      0.39    0.72      0.67    0.63
 100           0.37    0.41      0.82    0.88      0.66    0.83      0.45    0.76      0.77    0.76
 150           0.37    0.40      0.87    0.90      0.71    0.85      0.49    0.79      0.87    0.84

a50 = 0.5/0.0; 80 = 0.8/0.0; 85 1 = power for large effect in 0.8/0.5 condition; 85 2 = power for medium effect in 0.8/0.5 condition; 88 = 0.8/0.8; 55 = 0.5/0.5. bThis represents an error in the sense that the variable with the smaller effect in the population had a higher SC value than the variable with the larger effect.


TABLE 7
Type I Error Rate by Effect Size of Accompanying Variable and Distribution, Covariance Matrix Equality/Inequality, Correlation Between Predictors and Sample Size for the Six-Variable Case

                Direct        0.3           0.4           0.5           Boot
Variable       0.5   0.8     0.5   0.8     0.5   0.8     0.5   0.8     0.5   0.8

Distribution
 Normal       0.52  0.45    0.24  0.12    0.13  0.05    0.06  0.02    0.13  0.12
 Nonnormal    0.49  0.43    0.78  0.73    0.71  0.65    0.62  0.56    0.11  0.09
Covariance matrix
 Equal        0.54  0.46    0.38  0.27    0.26  0.17    0.18  0.11    0.12  0.11
 Unequal      0.47  0.42    0.65  0.58    0.57  0.53    0.50  0.47    0.11  0.10
Correlation between predictors
 0.3          0.37  0.26    0.53  0.45    0.43  0.36    0.32  0.27    0.13  0.10
 0.5          0.45  0.36    0.51  0.42    0.42  0.35    0.36  0.30    0.11  0.12
 0.8          0.70  0.70    0.49  0.40    0.40  0.34    0.35  0.30    0.12  0.11
Sample size
 30           0.53  0.49    0.56  0.49    0.44  0.39    0.34  0.30    0.16  0.13
 60           0.51  0.45    0.53  0.45    0.43  0.36    0.35  0.30    0.11  0.09
 100          0.49  0.42    0.50  0.40    0.41  0.34    0.34  0.29    0.10  0.09
 150          0.48  0.40    0.46  0.37    0.38  0.32    0.33  0.28    0.09  0.09


TABLE 8
Structure Coefficients and Bootstrap Confidence Interval for Comparing Male and Female Students on Performance Motivation Subscale Scores

Variable     Structure coefficient   Bootstrap mean structure coefficient   95% CI

Approach           0.46                      0.43                          0.04–0.82
Avoidance          0.55                      0.57                          0.22–0.92
Mastery            0.76                      0.70                          0.41–0.99

Note. CI = confidence interval.

determine whether and how male and female students differ in terms of performance motivation. A total of 130 male and 151 female students were included in the sample. Wilks' lambda indicated that a significant difference did exist between groups, Λ = 0.937, F(3, 277) = 6.18, p = 0.0004. The structure coefficients for the three observed variables appear in Table 8. On the basis of the rules of thumb criteria, all three variables could be deemed as contributing to group separation, except for approach with the 0.5 cutoff value. The bootstrap confidence intervals also support the significance of all three variables in differentiating the genders, given that none contained zero. The mean values for these subscales by gender are presented in Table 9. It seems that for all three variables, female respondents had higher means than did males.
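The bootstrap interval used for Table 8 can be sketched as follows: refit the two-group discriminant function in each resample, recompute the structure coefficients, align the sign of each resampled solution with the full-sample solution (the discriminant function is defined only up to reflection), and take percentile limits. This is a plausible reconstruction under those assumptions, not the article's code; all names are illustrative:

```python
import numpy as np

def structure_coefficients(X, y):
    # Two-group DDA: discriminant weights w = S_pooled^{-1} (xbar0 - xbar1);
    # structure coefficients are the correlations between each predictor
    # and the discriminant scores X @ w.
    X0, X1 = X[y == 0], X[y == 1]
    S = ((len(X0) - 1) * np.cov(X0, rowvar=False)
         + (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X) - 2)
    w = np.linalg.solve(S, X0.mean(axis=0) - X1.mean(axis=0))
    scores = X @ w
    return np.array([np.corrcoef(X[:, j], scores)[0, 1]
                     for j in range(X.shape[1])])

def bootstrap_sc_ci(X, y, n_boot=1000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    sc_full = structure_coefficients(X, y)
    draws = []
    while len(draws) < n_boot:
        idx = rng.integers(0, len(X), len(X))
        if len(np.unique(y[idx])) < 2:
            continue  # a resample must contain both groups
        sc = structure_coefficients(X[idx], y[idx])
        if sc @ sc_full < 0:  # resolve the reflection indeterminacy
            sc = -sc
        draws.append(sc)
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return sc_full, lo, hi
```

A variable would then be flagged as contributing to group separation when its percentile interval excludes zero, mirroring the decision rule applied to Table 8.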

CONCLUSIONS

The results presented in the present study highlight differences in the relative performance of several approaches for interpreting SC values in DDA. First of all, it appears that the rates of correct and incorrect identification of salient variables in group separation were somewhat lower for 6 variables than they were for 2. At the same time, the impacts of the manipulated factors and their interactions were

TABLE 9
Mean and Standard Deviation of Motivation Subscale Scores by Gender

            Approach        Avoidance       Mastery
Gender      M      SD       M      SD       M      SD

Male      26.58   8.13    26.86   6.67    31.30   6.15
Female    28.36   7.05    28.74   6.51    33.49   5.06


similar in the two conditions. For this reason, much of the discussion below can be taken to apply equally to both the 6 and 2 variable cases.

One of the more notable findings reported here is the high rate of false identification of variables contributing to group differences. For every method, the probability of a null variable being flagged as important in relation to significant discriminant functions was more than 0.1 and most often well above 0.2. Thus, practitioners who rely solely on any of these methods will frequently identify a variable as contributing to a discriminant function in the population when, in fact, it does not. Across study conditions, the highest rate of false identification belonged to the 0.3 rule of thumb criterion, whereas the lowest belonged to the bootstrap test statistic. This finding, which held true across the methods studied here, suggests that relying solely on SCs for interpreting DDA might be fraught with potential problems. Using these decision heuristics to characterize the complex decision regarding multivariate separation for two groups seems to be overly simplistic, frequently leading to incorrect decisions regarding the nature of the group differences.

Generally speaking, use of the 0.3 rule of thumb resulted in the greatest power, or probability of correctly identifying variables that were associated with group separation. The lowest power of the methods examined in this study belonged to the direct comparison of the two SC values, whereas the power of the bootstrap was generally similar to or slightly lower than that for the 0.5 rule of thumb. However, it is important to keep in mind that the Type I error rate was also highest for the 0.3 cutoff value.

In terms of the manipulated factors, the Type I error rate was most strongly influenced by the effect size of the accompanying variable, such that when the accompanying variable was characterized by a moderate effect size, the error rate for the null variable was higher than when the accompanying variable was characterized by a large effect. In addition, the Type I error rate was higher when the data were not normally distributed, except for the bootstrap test. Indeed, for the rules of thumb, the error rate was inflated by 0.3 or more when the data were not normally distributed. For unequal covariance matrixes and nonnormal data (violating two primary assumptions underlying DDA), the rules of thumb had their highest rates of power, whereas at the same time, the unequal covariances were also associated with their highest Type I error rates. The Type I error rates for all of the methods, except for the direct comparison of SCs, declined as the correlation between the two predictors increased, whereas the relationship between power and the correlation was somewhat more complex, as described in the Results section.

Another interesting result that emerged from this study concerned the relationship between power and the effect sizes of the two variables. Power for the moderate effect size condition when the accompanying variable was characterized by a large effect (the large/moderate setting) was lower than power for the moderate effect when accompanied by another moderate effect, for all methods. The power for detecting a moderate effect in the presence of a null variable was generally in between the other two conditions for all of the methods examined here, though it was closer to the power rate in the both-moderate condition. This last outcome was not found for the direct comparison method, which is not surprising because the SCs for the two variables should be similar, if not equal, in value when the effect sizes of the two variables are the same. In terms of detecting a large effect, the effect size of the accompanying variable was much less important than in the moderate case, except for the direct comparison method. In this instance, when the data were normally distributed, power for detecting the large effect was greater when the accompanying variable was a moderate effect, as opposed to a null effect. In contrast, when the data were not normally distributed, power was higher for detecting the large effect in the presence of a null variable than it was in the presence of a moderate effect. The Type I error rates for all methods were lower when the accompanying variable was a large, rather than medium, effect.

Implications for Practice

The results of the present study have several implications for practitioners making use of SCs to interpret discriminant functions. First, it seems that using any of the methods described here will result in a relatively high rate of incorrect identification of important variables in describing group differences. The probability of a researcher incorrectly identifying a variable as contributing to group separation would be higher than 0.1 in virtually every case examined here. This outcome suggests that the SCs should be used with great care in determining salient variables for group separation. These results may suggest that interpretation of DDA using only one index, such as the SCs, oversimplifies the process and may not be advisable.

Another important implication for practice is that the ability to use SCs for identifying individual variables as being related to group separation is affected by the influence of the other predictors that are being used. For example, if one of the variables is less strongly associated with the discriminant function (i.e., has a lower effect size) than another, it may not be as easily identified as it would be were the more strongly related variable not included in the analysis. Thus, practitioners need to have some knowledge about the individual effects of their variables as they interpret the results of the DDA. It is possible that the impact of an important, but not most important, variable could be masked by the presence of an even more important variable in the analysis. Although this tendency may not be problematic when the researcher is interested in assessing the relative importance of the predictors, it would seem less than optimal when the focus is on identifying all variables that contribute substantially to a discriminant function.


A third implication is that the assumptions underlying DA in general do have a bearing on the interpretation of the SCs. Specifically, the interaction of covariance equality or inequality and the distribution of the predictor variables seems to influence the ability of the SCs to correctly identify salient variables in group separation. Indeed, when the data are not normally distributed, the likelihood of incorrectly identifying a variable as being associated with group differences is greater than when the data are normal. For this reason, practitioners should investigate these assumptions to ensure that the results they obtain from the DDA are dependable. If the covariance matrixes of the groups are not equivalent or the predictors are not normally distributed, the SC values may not accurately reflect which of these variables are associated with the discriminant function.
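One standard screen for the covariance-equality assumption is Box's M test, which compares the pooled within-group covariance matrix with the separate group matrices. The sketch below is a generic numpy implementation (not code from the article) returning the chi-square approximation of the statistic and its degrees of freedom; a large statistic relative to a chi-square with that df suggests unequal covariances:

```python
import numpy as np

def box_m(X, y):
    """Box's M test of equal group covariance matrices.
    Returns the chi-square approximation (1 - c) * M and its df."""
    labels = np.unique(y)
    groups = [X[y == k] for k in labels]
    p, g = X.shape[1], len(labels)
    dfs = np.array([len(Xk) - 1 for Xk in groups], dtype=float)
    covs = [np.cov(Xk, rowvar=False) for Xk in groups]
    S_pooled = sum(d * S for d, S in zip(dfs, covs)) / dfs.sum()
    # M contrasts the pooled log-determinant with the group log-determinants
    M = dfs.sum() * np.log(np.linalg.det(S_pooled)) - sum(
        d * np.log(np.linalg.det(S)) for d, S in zip(dfs, covs))
    # Box's small-sample correction factor
    c = ((2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))) * (
        np.sum(1.0 / dfs) - 1.0 / dfs.sum())
    df = p * (p + 1) * (g - 1) // 2
    return (1 - c) * M, df

# Example: two groups drawn from the same covariance structure,
# so the statistic should be unremarkable relative to chi-square(df).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.repeat([0, 1], 100)
stat, df = box_m(X, y)
```

Box's M is itself sensitive to nonnormality, so in practice it is best read alongside a check of the predictors' distributions.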

Last, these results seem to suggest that no single approach to using SCs for identifying variables related to group separation is uniformly optimal. In terms of detecting differences that are there, the 0.3 rule of thumb has the highest power across conditions. However, this method also has the highest Type I error rate. In contrast, although the bootstrap hypothesis test appears to have the lowest Type I error rate of the methods studied here, its power is somewhat lower than the 0.3 and 0.4 rules of thumb under most conditions. The direct comparison of SCs for identifying variables associated with group separation was generally the least powerful method studied here, and it may not be as effective as the rules of thumb or the bootstrap. In fact, under some conditions, simply comparing SC values for two variables could result in the incorrect conclusion that a variable that is truly not associated with group separation is more strongly associated with a discriminant function than is a variable that is truly associated.

Given these results, it appears that if the sample size is 100 or greater, the bootstrap test may be the best approach for identifying variables that are salient in understanding significant discriminant functions in DDA. With this number of subjects, the power of the bootstrap is greater than 0.85, regardless of the distribution of the predictors, and the Type I error rate is much lower than that of the competing methods. With sample sizes less than 100, it is not as clear what the optimal approach would be, given the problems with inflated Type I error for the 0.3 and 0.4 rules of thumb.
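One plausible form of such a bootstrap test (a sketch under my own assumptions; the helper and parameter names are not the article's) is a percentile interval for each SC, resampling cases with replacement within each group and flagging a variable as salient when its interval excludes zero:

```python
import numpy as np

def sc_two_group(X, y):
    # Correlation of each predictor with the Fisher discriminant scores.
    X0, X1 = X[y == 0], X[y == 1]
    S = ((len(X0) - 1) * np.cov(X0, rowvar=False)
         + (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X) - 2)
    scores = X @ np.linalg.solve(S, X1.mean(axis=0) - X0.mean(axis=0))
    return np.array([np.corrcoef(X[:, j], scores)[0, 1]
                     for j in range(X.shape[1])])

def bootstrap_sc_ci(X, y, n_boot=999, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each structure coefficient,
    resampling cases with replacement within each group."""
    rng = np.random.default_rng(seed)
    i0, i1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    boots = [sc_two_group(X[idx], y[idx])
             for idx in (np.concatenate([rng.choice(i0, len(i0)),
                                         rng.choice(i1, len(i1))])
                         for _ in range(n_boot))]
    return np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)

# Illustrative data: predictor 0 carries a 0.8 SD shift; predictor 1 is null.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
X[100:, 0] += 0.8
y = np.repeat([0, 1], 100)
lo, hi = bootstrap_sc_ci(X, y, n_boot=200)
flagged = (lo > 0) | (hi < 0)   # salient if the interval excludes zero
```

Resampling within groups keeps the group sizes fixed across replications, matching the design of the original sample.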

Study Limitations

The following limitations should be kept in mind when interpreting the results of the present study. First, interpretation of discriminant functions was limited to the use of SCs. The results presented herein seem to suggest that this approach to interpreting DDA results is too limiting and that additional information should be taken into account when determining the nature of multivariate group separation. Therefore, researchers may wish to combine information from standardized discriminant coefficients with the SCs if they wish to gain greater insight into the nature of group separation on the variables. At the same time, future research should focus on how these two pieces of information could be integrated in a meaningful way.

The data structure I used includes only two groups. Although this limited the generalizability of the results to some degree, it did allow for a clearer and more thorough investigation of the methods and how they perform under different data conditions. Future research should build on the present study by including more predictors and more groups. Another limitation of the study is the inclusion of only three univariate effect size conditions. As previously mentioned, these were selected to reflect what Cohen (1988) referred to as moderate and large differences in group means, as well as a case in which the group means do not differ at all. Clearly, other values for these effects, particularly a small effect size difference, could be chosen, and future research should indeed do so. The values used here hopefully reflect some realistic conditions that researchers might encounter in practice. Univariate, rather than multivariate, effect sizes were used so that the impact of individual predictors on group separation could be isolated and the impact of each level of separation on the power and Type I error could be identified. Last, the values of the other manipulated factors were, out of necessity, limited to a finite range of cases, as is true in any simulation study. Nonetheless, it would be informative for future researchers to examine other nonnormal distributions, as well as other correlation values, levels of covariance inequality, and sample sizes.
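As a template for the kind of follow-up simulation suggested here, the sketch below (illustrative only; it uses independent, normal predictors and so does not reproduce the correlated, nonnormal, or unequal-covariance conditions under which the article reports much higher rates) estimates the empirical Type I error of the |SC| >= 0.3 rule for a null predictor paired with a large effect:

```python
import numpy as np

def sc_two_group(X, y):
    # Correlation of each predictor with the Fisher discriminant scores.
    X0, X1 = X[y == 0], X[y == 1]
    S = ((len(X0) - 1) * np.cov(X0, rowvar=False)
         + (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X) - 2)
    scores = X @ np.linalg.solve(S, X1.mean(axis=0) - X0.mean(axis=0))
    return np.array([np.corrcoef(X[:, j], scores)[0, 1]
                     for j in range(X.shape[1])])

def type1_rate(d=0.8, n=50, reps=300, cutoff=0.3, seed=0):
    """Empirical Type I error: the fraction of replications in which a
    truly null predictor crosses the |SC| >= cutoff threshold."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        X = rng.normal(size=(2 * n, 2))
        X[n:, 0] += d               # predictor 0 carries the effect
        y = np.repeat([0, 1], n)    # predictor 1 remains null
        hits += abs(sc_two_group(X, y)[1]) >= cutoff
    return hits / reps

rate = type1_rate()
```

Swapping in correlated predictors, nonnormal marginals (e.g., via Fleishman's power method), or unequal group covariances would extend this template toward the conditions manipulated in the study.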

AUTHOR NOTE

Holmes Finch is an associate professor of statistics and measurement in the Department of Educational Psychology at Ball State University. His research interests include nonparametric approaches to multivariate analysis, structural equation modeling, and item response theory.

REFERENCES

Agresti, A. (2002). Categorical data analysis. Hoboken, NJ: Wiley.

Barcikowski, R., & Stevens, J. P. (1975). A Monte Carlo study of the stability of canonical correlations, canonical weights and canonical variate-variable correlations. Multivariate Behavioral Research, 10, 353–364.

Clarkson, D. B. (1979). Estimating the standard errors of rotated factor loadings. Psychometrika, 44, 297–314.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Cooley, W. W., & Lohnes, P. R. (1985). Multivariate data analysis. Melbourne, FL: R. E. Krieger.

Dalgleish, L. I. (1994). Discriminant analysis: Statistical inference using the jackknife and bootstrap procedures. Psychological Bulletin, 116, 498–508.

Dembo, R., Wareham, J., & Schmeidler, J. (2005). Evaluation of the impact of a policy change on a diversion program. Journal of Offender Rehabilitation, 41, 1–27.

Finch, W. H., & Schneider, M. K. (2005). Misclassification rates for four methods of group classification: Impact of predictor distribution, covariance inequality, effect size, sample size and group size ratio. Educational and Psychological Measurement, 66, 240–257.

Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43, 521–532.

Glaser, B. A., Calhoun, G. B., & Petrocelli, J. V. (2002). Personality characteristics of male juvenile offenders by adjudicated offenses as indicated by the MMPI-A. Criminal Justice and Behavior, 29, 183–201.

Hess, B., Olejnik, S., & Huberty, C. J. (2001). The efficacy of two improvement-over-chance effect sizes for two-group univariate comparisons under variance heterogeneity and nonnormality. Educational and Psychological Measurement, 61, 909–936.

Huberty, C. J. (1975). The stability of three indices of relative variable contribution in discriminant analysis. Journal of Experimental Education, 2, 59–64.

Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA and discriminant analysis. New York: Wiley.

Huberty, C. J., & Wisenbaker, J. M. (1992). Variable importance in multivariate group comparisons. Journal of Educational Statistics, 17, 75–91.

Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.

Matters, G., & Burnett, P. C. (2003). Psychological predictors of the propensity to omit short-response items on a high-stakes achievement test. Educational and Psychological Measurement, 63, 239–256.

McLachlan, G. J. (1992). Discriminant analysis and statistical pattern recognition. New York: Wiley.

Meshbane, A., & Morris, J. D. (1996). Predictive discriminant analysis versus logistic regression in two-group classification problems. Paper presented at the annual meeting of the American Educational Research Association, New York, NY.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed.). Fort Worth, TX: Harcourt Brace College Publishers.

Rencher, A. C. (1992). Interpretation of canonical discriminant functions, canonical variates, and principal components. The American Statistician, 46, 217–225.

Russell, W. D., & Cox, R. H. (2000). Construct validity of the Anxiety Rating Scale-2 with individual sport athletes. Journal of Sport Behavior, 23, 379–388.

Sherry, A., Henson, R. K., & Lewis, J. G. (2003). Evaluating the appropriateness of college-age norms for use with adolescents on the NEO Personality Inventory—Revised. Assessment, 10, 71–78.

Sirois, B. C., Sears, S. F., & Marhefka, S. (2005). Do new drivers equal new donors? An examination of factors influencing organ donation attitudes and behaviors in adolescents. Journal of Behavioral Medicine, 28, 201–212.

Stevens, J. (2000). Applied multivariate statistics for the social sciences. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and Bacon.

Thompson, B. (2004). Exploratory and confirmatory factor analysis: Understanding concepts and applications. Washington, DC: American Psychological Association.
