Ordination Methods

8/14/2019 Ordination Methods

6/1/10

Correspondence Analysis (CA)

Detrended Correspondence Analysis (DCA)

Factor Analysis (FA)

Principles of Canonical Analysis

Redundancy Analysis (RDA)

Canonical Correspondence Analysis (CCA)

Canonical Correlation Analysis (CCorA)

Discriminant Analysis (DA)

Dichotomous Key for Ordination Methods

Not 100% accurate, but a good place to start (see Palmer 1998).

1. Direct gradient analysis .......... 2
    2. Few species .......... 4
        4. Monotonic responses to gradients (low beta) .......... Linear regression
        4. Nonmonotonic responses to gradients (high beta) .......... Generalized linear models
    2. Many species .......... 5
        5. Monotonic responses .......... RDA
        5. Nonmonotonic responses .......... 6
            6. Concerned about arch effect .......... DCCA
            6. Not concerned about arch effect .......... CCA
1. Indirect gradient analysis .......... 3
    3. Only distance values are available .......... 7
        7. Monotonic responses .......... PCoA
        7. Nonmonotonic responses .......... NMDS
    3. Raw data available .......... 8
        8. Monotonic responses .......... 9
            9. Variables noncommensurate .......... PCA - corr. matrix
            9. Variables commensurate .......... PCA - cov. matrix
        8. Nonmonotonic responses .......... 10
            10. Feel OK about prespecifying number of dimensions, not worried about local optima, not interested in species scores .......... NMDS
            10. Not as above, but willing to accept either arch effect or detrending/rescaling .......... 11
                11. Don't like arch, detrending OK .......... DCA
                11. Arch OK, or only interested in axis 1 .......... CA
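The key can also be read as a small decision function. The sketch below is only an illustration of the couplets in the key; the function name and argument names are hypothetical, and the key itself is described as a starting point rather than a rule.

```python
def choose_ordination(direct, many_species=True, monotonic=True,
                      raw_data=True, commensurate=True,
                      avoid_arch=True, nmds_ok=False):
    """Walk the dichotomous key and return the suggested method name.

    All argument names are illustrative; they mirror the couplets of the key.
    """
    if direct:                                    # couplet 1 -> 2
        if not many_species:                      # couplet 2 -> 4
            return "Linear regression" if monotonic else "Generalized linear models"
        if monotonic:                             # couplet 5
            return "RDA"
        return "DCCA" if avoid_arch else "CCA"    # couplet 6
    # Indirect gradient analysis: couplet 1 -> 3
    if not raw_data:                              # only distances available: couplet 7
        return "PCoA" if monotonic else "NMDS"
    if monotonic:                                 # couplet 8 -> 9
        return "PCA - cov. matrix" if commensurate else "PCA - corr. matrix"
    if nmds_ok:                                   # couplet 10, first lead
        return "NMDS"
    return "DCA" if avoid_arch else "CA"          # couplet 11


# Direct gradient analysis, many species, monotonic responses
print(choose_ordination(direct=True, many_species=True, monotonic=True))
```

Here `nmds_ok` stands for the first lead of couplet 10 (happy to prespecify the number of dimensions, not worried about local optima, not interested in species scores).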


Correspondence analysis was developed independently by several authors over a period of ca. 30 years and has been given many different names in the literature:

Contingency table analysis
RQ-technique
Reciprocal averaging
Correspondence analysis
Reciprocal ordering
Dual scaling
Homogeneity analysis




Correspondence analysis was first proposed for analyzing two-way contingency tables. In such tables, the states of the first descriptor (rows) are compared to the states of the second descriptor (columns). Data in each cell of the table are frequencies. These frequencies are positive integers or zeroes.

In EEB, the most common application of CA is for the analysis of species data (0/1, or abundance) at different sampling sites. A species-site table essentially contains frequencies.

In general, CA may be applied to any data table that is dimensionally homogeneous (i.e., the physical dimensions of all variables are the same) and only contains positive integers or zeroes.

The χ² distance (D16), which is a coefficient that excludes double-zeroes, is used to quantify the relationship among rows and columns. NB: Some authors have questioned the efficacy of the χ² distance for certain types of data.


CA is primarily a method of ordination. As such, it is similar to PCA; it preserves, in the space of the principal axes (i.e., after rotation), the Euclidean distance between profiles of weighted conditional probabilities. In other words, CA preserves the χ² distance between the rows and the columns of the contingency table.
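As a rough illustration of the distance CA preserves, the sketch below computes a χ² distance between two rows of a table of counts. It assumes the usual form of the coefficient (differences between row profiles, weighted by the inverse of the column relative frequencies); the function name is hypothetical, and the counts are those of the temperature-by-abundance example used in these notes.

```python
from math import sqrt


def chi2_distance(table, i, k):
    """Chi-square distance between rows i and k of a table of counts.

    Assumes every row and column has a non-zero total.
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Relative frequency of each column (p_+j)
    col_p = [sum(row[j] for row in table) / total for j in range(n_cols)]
    # Row profiles: each count divided by its row total
    prof_i = [v / sum(table[i]) for v in table[i]]
    prof_k = [v / sum(table[k]) for v in table[k]]
    return sqrt(sum((a - b) ** 2 / p for a, b, p in zip(prof_i, prof_k, col_p)))


counts = [[10, 10, 20],   # Cold
          [10, 15, 10],   # Med.
          [15, 5, 5]]     # Warm
print(chi2_distance(counts, 0, 1))   # Cold vs. Med.
```

Because the profiles are compared column by column, a column in which both rows are zero contributes nothing to the sum, which is the sense in which the coefficient excludes double-zeroes.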

Correspondence analysis proceeds in three steps:

(1) The contingency table is transformed into a table of contributions to the Pearson chi-square statistic after fitting a null model to the table.

(2) Singular value decomposition (SVD) is applied to that table, and the eigenvalues and eigenvectors are computed (as in PCA).

(3) Further matrix manipulations lead to the tables required for plotting in ordination space.



CA - Example

Let's use an example other than the stand × species situation we have been looking at (although we could do this here too) and consider the relative abundance (0, +, ++) of a particular species observed at 100 sites. The temperature at each site was recorded and coded (1, 2, 3):

                      Species abundance (Descr. 2)
Temp. (Descr. 1)   Rare (0)   Abund. (+)   Very abund. (++)   Row sums
Cold (1)              10          10              20              40
Med. (2)              10          15              10              35
Warm (3)              15           5               5              25
Col. sums             35          30              35             100

Matrix Q contains the proportions p_ij and the marginal totals p_i+ and p_+j of the rows and columns, respectively. Identifiers of the rows and columns are given outside the matrix brackets in parentheses:

The eigenvalues of Q'Q are: λ1 = 0.09613 (70.1%), λ2 = 0.04094 (29.9%), and λ3 = 0 (because of centering).
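The eigenvalues can be checked by hand. The sketch below assumes that the matrix being decomposed is the table of contributions to chi-square, q_ij = (p_ij - p_i+ p_+j) / sqrt(p_i+ p_+j), built from the temperature-by-abundance counts. Because one eigenvalue of Q'Q is zero, the other two are the roots of a quadratic whose coefficients are the trace of Q'Q and the sum of its 2 × 2 principal minors. Variable names are illustrative.

```python
from math import sqrt

counts = [[10, 10, 20],   # Cold
          [10, 15, 10],   # Med.
          [15, 5, 5]]     # Warm
n = sum(sum(row) for row in counts)
row_p = [sum(row) / n for row in counts]                        # p_i+
col_p = [sum(row[j] for row in counts) / n for j in range(3)]   # p_+j

# Contributions to chi-square (assumed definition of Q, see lead-in)
Q = [[(counts[i][j] / n - row_p[i] * col_p[j]) / sqrt(row_p[i] * col_p[j])
      for j in range(3)] for i in range(3)]

# A = Q'Q, a symmetric 3 x 3 matrix
A = [[sum(Q[i][a] * Q[i][b] for i in range(3)) for b in range(3)]
     for a in range(3)]

# With one zero eigenvalue, the characteristic polynomial reduces to
# lambda * (lambda^2 - trace * lambda + minors) = 0
trace = A[0][0] + A[1][1] + A[2][2]
minors = (A[0][0] * A[1][1] - A[0][1] ** 2
          + A[0][0] * A[2][2] - A[0][2] ** 2
          + A[1][1] * A[2][2] - A[1][2] ** 2)
disc = sqrt(trace ** 2 - 4 * minors)
lam1 = (trace + disc) / 2
lam2 = (trace - disc) / 2

print(lam1, lam2)    # compare with the eigenvalues quoted above
print(trace)         # total inertia = Pearson chi-square / n
```

The trace equals the Pearson chi-square statistic of the table divided by the number of observations, so the percentages attached to the eigenvalues are shares of that total.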


The normalized eigenvectors of Q'Q are then:

And the normalized eigenvectors of QQ' are then:

In Scaling Type-1, F and V are determined to produce a CA joint plot:

Now, to put the rows (matrix F) at the centroids of the columns, the position of each row along an ordination axis is computed as the mean of the column positions, weighted by the relative frequencies of the observations in the various columns of that row...

Consider the first row of the original data. The relative frequencies of that row are 0.25, 0.25, 0.50. Multiplying matrix V by that vector provides the coordinates of the first row of the ordination diagram:

Continuing:


Now, using F and V, we can construct the ordination plot:

CA using R

CA.csv


Data Tables

Correspondence analysis has been applied to many types of data tables other than contingency tables.

However, as a caveat, recognize again that in order for CA to work correctly, the data table must be dimensionally homogeneous (i.e., in the same physical units) and non-negative (≥ 0).

If the data do not meet these assumptions, they may be transformed or recoded. This is a critical step in CA.



Arch Effect

Let's return to the notion of coenocline distortion that we first considered in PCA. Recall that most of these procedures require linear (or at least monotonic) responses. Species data, in particular, are usually unimodally distributed across a gradient.

Recall that this problem usually manifests itself in the form of an arch or horseshoe in the data projection.

Some ecologists are willing to tolerate this distortion, while others feel that an attempt should be made to recover the original gradient via detrending.

The most extreme form of the arch effect usually occurs when a Euclidean distance (ED) measure is applied to species abundance data. A horseshoe is formed because the ends of Axis-1 contract and fold inwards and bend along Axis-3. This is because ED considers the extreme sites to be very near each other.

In most instances, CA does not exhibit such dramatic folding towards the terminal portions, but rather just bends along Axis-1 to form an arch.

Detrended Correspondence Analysis (DCA)

When a single axis is enough to order the sites and species correctly, a second axis, which is independent of the first, can be obtained by folding the first axis in the middle and bringing the ends together.

Subsequent independent axes can be obtained by folding the first axis in three parts, four parts, etc.

This folding is what produces the arch; the process of removing it is referred to as Detrended Correspondence Analysis (DCA).



Recall this data set used to evaluate PCA & PCO, where 3 species were unimodally distributed across a coenocline:

[Figure: Axis 1 vs. Axis 2. PCA w/ Euclidean Distance vs. DCA via quadratic polynomial]

Two main approaches have been proposed to remove the arch effect: detrending by polynomials (previous example) and detrending by segments.

Both methods lead to detrended correspondence analysis.



Detrending by Segments

When detrending by segments (Hill and Gauch 1980), axis-I is divided into a number of "segments" and, within each one, the mean of the scores along axis-II is made equal to zero; in other words, data points in each segment are moved along axis-II to make their mean coincide with the abscissa.

Proximities among points should in no case be interpreted as meaningful! Segments can generate large differences in scores for points that are near each other in the original ordination but happen to be on either side of a segment division.

The number of segments is arbitrary. Different numbers of segments lead to different ordinations.
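The basic idea can be sketched in a few lines. This is a simplified illustration only: it cuts axis-I into equal-width segments and centres the axis-II scores within each one, whereas published implementations are more elaborate. The function name and the arch-shaped scores are hypothetical.

```python
def detrend_by_segments(axis1, axis2, n_segments):
    """Return axis-II scores with their mean removed within each axis-I segment."""
    lo, hi = min(axis1), max(axis1)
    width = (hi - lo) / n_segments or 1.0   # guard against a zero-length axis
    # Segment index of each point; the largest score falls in the last segment
    seg = [min(int((x - lo) / width), n_segments - 1) for x in axis1]
    detrended = list(axis2)
    for s in set(seg):
        members = [i for i, g in enumerate(seg) if g == s]
        mean = sum(axis2[i] for i in members) / len(members)
        for i in members:
            detrended[i] = axis2[i] - mean
    return detrended


# Hypothetical arch: axis-II is a quadratic function of axis-I
axis1 = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
axis2 = [-(x * x) for x in axis1]
flat = detrend_by_segments(axis1, axis2, n_segments=3)
print(flat)
```

Running the function with a different `n_segments` moves the segment boundaries and therefore changes the result, which is the arbitrariness noted above.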

Various software packages use 10 as a minimum number of segments and 46 as a maximum, with 26 being a recommended starting or default value. This of course necessitates data sets with considerably more observations than 26. There are no empirical rules for the "correct" number of detrending segments.

After detrending by segments, the DCA ordination has the interesting property that the axes are scaled in units of the average standard deviation (SD) of species turnover (Gauch 1982). Along a regular gradient, a species typically appears, rises to a modal value, and disappears in 4 SD; similarly, a complete turnover in species composition often occurs over 4 SD. Thus, the length of axis-I is often used as a measure of the length of the ecological gradient.

Detrending by Polynomials

Detrending by polynomials (Hill and Gauch 1980) directly follows from the fact that an arch is produced when a gradient of sufficient length is present in the data. When a sufficient number of species are present and replace each other along the gradient, the second axis of the CA approaches a quadratic function of the first one (i.e., a second-degree polynomial).

When detrending is sought, detrending by polynomials is an attractive method because it results in a continuous function of the previous axes, without the discontinuities generated by detrending by segments. However, the downside of this method is that it imposes a very specific polynomial model that the data must correspond to. It also does not solve terminal gradient compressions at the ends of the ordination axes.
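A minimal sketch of the quadratic case follows, assuming the detrending is applied after the fact to a pair of existing axes (software typically builds it into the ordination iterations instead). The second axis is replaced by its least-squares residual on the first axis and its square. Function names are hypothetical.

```python
def _center(v):
    """Subtract the mean from a list of scores."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def _residual(y, x):
    """Residual of y after least-squares projection on x (both centred)."""
    sxx = sum(a * a for a in x)
    if sxx == 0:
        return list(y)
    b = sum(a * c for a, c in zip(x, y)) / sxx
    return [c - b * a for a, c in zip(x, y)]


def detrend_quadratic(axis1, axis2):
    """Remove the linear and quadratic dependence of axis2 on axis1."""
    x1 = _center(axis1)
    # Quadratic term, centred and made orthogonal to the linear term
    x2 = _residual(_center([a * a for a in axis1]), x1)
    y = _center(axis2)
    # x1 and x2 are orthogonal, so removing them in turn gives the
    # residuals of the regression of axis2 on axis1 and axis1 squared
    return _residual(_residual(y, x1), x2)


# Hypothetical arch: axis-II is exactly quadratic in axis-I
axis1 = [-2, -1, 0, 1, 2]
axis2 = [-(x * x) for x in axis1]
print(detrend_quadratic(axis1, axis2))   # residuals are numerically zero
```

When the second axis is not close to a quadratic function of the first, the residuals keep whatever structure the polynomial cannot describe, which is the sense in which the method imposes a specific model.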


[Figure: DCA of WI Forest Data, 10 segments; DCA Axis 1 vs. DCA Axis 2; R² Axis-I = 0.866, R² Axis-II = 0.076. NB: Wedge Effect]

To Detrend or Not To Detrend

That is the Question...

The controversy over detrending has raged in the literature for well over 20 years now.

Wartenberg et al. (1987) argue that the arch is an important and inherent attribute of the distances among sites, not simply a mathematical artifact. The only effect of DCA is to flatten the distribution of points onto axis-I. They also argue that detrending by segments is completely arbitrary and has no theoretical justification.

Peet et al. (1988) still support DCA on the grounds that detrending and rescaling may facilitate ecological interpretation, and they called for improved algorithms.

Minchin (1987) produced a nice comparison of several ordination techniques and found DCA to perform poorly on most accounts.

He found that DCA actually removed real pattern from the data and produced significant distortion, which he referred to as a "tongue" or subsequently a "wedge" in the data; this was a simple artifact of the algorithm.

However, Palmer (2010) argues for the viability of DCA, especially in certain applications. He highlights the characteristics of both DCA and NMDS at his Ordination website.



Factor Analysis

In the social sciences, analysis of the relationships among the descriptors of a multidimensional data matrix is frequently carried out via Factor Analysis (FA).

Recall that the goal of PCA is to account for a maximum amount of the variance in the data, whereas the goal of factor analysis is to account for the covariance among descriptors.

Put another way, PCA is directed towards reducing the diagonal elements of R. Factor analysis is directed more towards reducing the off-diagonal elements of R. Since reducing the diagonal elements reduces the off-diagonal elements and vice versa, both methods achieve much the same thing.

To do this, FA assumes that the observed descriptors are linear combinations of hypothetical underlying variables (i.e., the factors).

Originally FA was used to evaluate such things as intelligence. Many variables could be measured, such as age, parental education, family income, etc. Multiple variables might play out to show that Factor-1 was determined by all variables related to education and Factor-2 by socio-economic conditions (for example).

There are few applications of FA in EEB, so I will not cover it in depth. An excellent treatment can be found in Tabachnick and Fidell (1996).



Canonical Analysis

Canonical analysis is the simultaneous analysis of two, or eventually several, data tables. It permits biologists to do a direct comparison of two data matrices. Hence, canonical analysis and its derivatives are known as direct ordination methods.

Often, in ecology, one is interested in the relationship between a first table describing species composition and a second table of environmental descriptors, observed at the same locations (i.e., objects or samples).

Previous to this, we have considered indirect ordination methods (PCA, PCO, NMDS, CA, DCA) in that we would ordinate a species × stand matrix and then conduct some form of correlation or regression analysis on the ordination vectors to relate objects or descriptors to externally obtained environmental information. This procedure is performed a posteriori.

In canonical analysis, with two matrices (X and Y), one is constrained by the other, and both are examined simultaneously. This permits one to directly test a priori hypotheses by bringing out all of the variance of Y that is directly related to X and allowing formal tests of the hypotheses.

Canonical Form

In mathematics, a canonical form is the simplest and most comprehensive form to which certain functions, relations, or expressions can be reduced without loss of generality.

For example, the canonical form of a covariance matrix is its matrix of eigenvalues.

In general, most methods of canonical analysis employ eigenanalysis (some extensions have been described using NMDS).

Canonical analysis combines the concepts of ordination and regression. It involves a response matrix Y and an explanatory matrix X. (See next slide.)

Like previous ordination methods, canonical analysis produces orthogonal axes from which scatter diagrams may be plotted.


[Figure: three data-table layouts, with objects 1 to n as rows]

(a) Simple ordination of matrix Y (variables y1 ... yp): PCA, CA, etc.

(b) Ordination of y (single axis; var. y) under the constraint of X (variables x1 ... xm): aka multiple regression.

(c) Ordination of Y (variables y1 ... yp) under the constraint of X (variables x1 ... xm): Redundancy Analysis (RDA) or Canonical Correspondence Analysis (CCA).

Problems of canonical analysis can be represented via a partitioned covariance matrix resulting from the fusion of the Y and X data sets and producing a joint dispersion matrix S_Y+X...

Submatrices S_YY (order p × p) and S_XX (m × m) concern each of the two sets of descriptors, respectively, whereas S_YX (p × m) and its transpose S'_YX = S_XY (m × p) account for the covariances among the descriptors of the two groups.
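Written out, the partition described above would take the following form. This is a reconstruction from the dimensions given in the text, not a copy of the original slide.

```latex
\mathbf{S}_{Y+X} =
\begin{bmatrix}
\underset{(p \times p)}{\mathbf{S}_{YY}} & \underset{(p \times m)}{\mathbf{S}_{YX}} \\[6pt]
\underset{(m \times p)}{\mathbf{S}_{XY}} & \underset{(m \times m)}{\mathbf{S}_{XX}}
\end{bmatrix},
\qquad
\mathbf{S}_{XY} = \mathbf{S}_{YX}'
```

The diagonal blocks describe each set of descriptors on its own, and the off-diagonal blocks carry the covariances between the two sets.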

Redundancy Analysis

In redundancy analysis (RDA), each canonical ordination axis corresponds to a direction, in the multivariate scatter of objects (Y), which is maximally related to a linear combination of the explanatory variables X. A canonical axis is thus similar to a principal component.

Two ordinations of the objects are obtained: (1) linear combinations of the Y variables (matrix F in PCA), (2) linear combinations of the fitted Y-hat variables (matrix Z), which are thus also linear combinations of the X variables.

RDA preserves the Euclidean distance among objects in matrix Y-hat containing values of Y fitted by regression to the explanatory variables X.
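The regression step can be sketched for the simplest case of a single explanatory variable. The example below fits each response column on x by ordinary least squares and reports the share of the total variance of Y that is carried by Y-hat; with only one explanatory variable, Y-hat has a single dimension, so there is only one canonical axis. The data and function names are hypothetical, and a full RDA would follow this step with a PCA of Y-hat.

```python
def fitted_by_regression(x, Y):
    """Fit each column of Y on x by ordinary least squares; return Y-hat."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    Y_hat = [[0.0] * len(Y[0]) for _ in range(n)]
    for j in range(len(Y[0])):
        col = [row[j] for row in Y]
        y_mean = sum(col) / n
        slope = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, col)) / sxx
        for i in range(n):
            Y_hat[i][j] = y_mean + slope * (x[i] - x_mean)
    return Y_hat


def total_variance(M):
    """Sum of the column variances of a data table."""
    n = len(M)
    total = 0.0
    for j in range(len(M[0])):
        col = [row[j] for row in M]
        m = sum(col) / n
        total += sum((v - m) ** 2 for v in col) / (n - 1)
    return total


# Hypothetical data: one explanatory variable, two response variables
depth = [1, 2, 3, 4, 5]
Y = [[2, 9], [4, 8], [6, 6], [8, 5], [10, 2]]
Y_hat = fitted_by_regression(depth, Y)
explained = total_variance(Y_hat) / total_variance(Y)
print(explained)   # proportion of the variance of Y related to x
```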



Canonical Correspondence Analysis

Canonical correspondence analysis (CCA) is similar to RDA. The difference is that it preserves the χ² distance (as in CA), instead of the Euclidean distance, among objects.

Calculations are a bit more complex since Y-hat contains fitted values obtained by weighted linear regression of matrix Q of correspondence analysis on the explanatory variables X. As in RDA, two ordinations of the objects are obtained.

Canonical Correlation Analysis

In canonical correlation analysis (CCorA), the canonical axes maximize the correlation between linear combinations of the two sets of variables Y and X.

This is obtained by maximizing the among-variable-group covariance (or correlation) with respect to the within-variable-group covariance.

Two ordinations of the objects are again obtained.

Canonical Discriminant Analysis

In canonical discriminant analysis, the objects are divided into k groups, described by a qualitative descriptor.

The method maximizes the dispersion of the centroids of the k groups. This is obtained by maximizing the ratio of the among-object-group dispersion over the pooled within-object-group dispersion.



Canonical Analysis

Unfortunately, we do not have the time to develop the details of the algebra of each of the 4 methods of canonical analysis previously described. But you have now gained all of the skills necessary to interpret the details on your own should you need to pursue one of these analyses.

Excellent sources of information on these methods are Legendre and Legendre (1998), ter Braak and Šmilauer (1998), and Lepš and Šmilauer (2003).

As an alternative to a detailed treatment of the mathematics behind each method, I would like to develop a worked example.

Let's develop a data set using the number of fish observed at 10 sites along a transect running from the beach of a Caribbean island, with water depths going from 1 to 10 m. The first three sites are on sand and the others alternate between coral and "other substrate" (coded as 0/1).

Tropical Fish Data Set

Site  Sp-1  Sp-2  Sp-3  Sp-4  Sp-5  Sp-6  Sp-7  Sp-8  Sp-9  Depth (m)  Coral  Sand  Other
1     1     0     0     0     0     0     2     4     4     1          0      1     0
2     0     0     0     0     0     0     5     6     1     2          0      1     0
3     0     1     0     0     0     0     0     2     3     3          0      1     0
4     1     4     0     0     8     1     6     2     0     4          0      0     1
5     1     5     17    7     0     0     6     6     2     5          1      0     0
6     9     6     0     0     6     2     10    1     4     6          0      0     1
7     9     7     13    10    0     0     4     5     4     7          1      0     0
8     7     8     0     0     4     3     6     6     4     8          0      0     1
9     7     9     10    13    0     0     6     2     0     9          1      0     0
10    5     10    0     0     2     4     0     1     3     10         0      0     1
%     60    50    40    30    20    10    45    35    25



Because we wish to conduct a direct gradient analysis (i.e., we have both species data and environmental data from the same samples), and we have numerous species (9) with roughly monotonic responses (although one may be unimodal; e.g., Sp-7), we select RDA as the method of choice.

RDA is particularly appropriate when the gradients are short and species distributions are linear (or generally monotonic).

The software of choice for this type of analysis has for the last decade been CANOCO. Mathematically, this software is excellent, but its ease of use is not the best and its graphics are poor. R now has applications to handle most of these ordination procedures (i.e., RDA and CCA).

RDA using R - Tropical Fish Data Set




Discriminant Analysis

A common situation arises in EEB applications where one starts with an already known grouping of objects, and one wishes to assess how well a group of quantitative descriptors can explain the object groups. Thus, the problem is no longer how to define or delineate groups, but rather how to interpret them. This is the realm of discriminant analysis.

Discriminant analysis is a method of linear modeling, like analysis of variance, multiple regression, and canonical correlation analysis. DA is frequently used in systematics.

DA proceeds in two steps:

(1) It first tests for the differences in the explanatory variables (X) among the predefined groups. This part of the analysis is identical to the overall test performed in the MANOVA.

(2) If the test supports the alternative hypothesis of significant differences among groups in the X variables, the analysis proceeds to find the linear combinations (called discriminant functions) of the X variables that best discriminate the groups.




Like one-way ANOVA, discriminant analysis considers a single classification criterion (i.e., division of the objects into groups) and allows one to test whether the explanatory variables can discriminate among the groups. Testing for differences among group means in DA is identical to ANOVA for a single explanatory variable and to MANOVA for multiple explanatory variables.

When it comes to modeling, i.e., finding the linear combinations of the variables (X) that best discriminate among the groups, DA is a form of "inverse analysis" where the classification criterion is considered to be the response variable (y) whereas the quantitative variables are explanatory (matrix X).
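For a single explanatory variable, the test reduces to the one-way ANOVA F statistic, which is the ratio of the among-group mean square to the pooled within-group mean square. The sketch below shows that ratio on hypothetical measurements for three groups; the function name and the numbers are illustrative and are not taken from any data set discussed here.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of measurements."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Among-group dispersion: group means around the grand mean
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Pooled within-group dispersion: observations around their group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_among / (k - 1)) / (ss_within / (n - k))


# Hypothetical measurements of one variable in three predefined groups
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
f_stat = anova_f(groups)
print(f_stat)
```

A discriminant function plays the same game with several variables at once: it looks for the linear combination of the X variables for which this among-to-within ratio is as large as possible.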


Note that discriminant analysis (DA) is also called canonical variates analysis (CVA). This method was first proposed by Fisher (1936), who published the now famous data set describing the morphology of 150 specimens of irises (Iridaceae) belonging to three species, using 4 measured flower characters (lengths and widths of sepals and petals).

Again, in the interest of time, we will bypass the mathematical treatment of DA and work through the iris data set using a software application (NCSS).


Shown here are Fisher's data for the 150 plants (first 31 shown), four variables, and three species (coded 1, 2, 3: 1 = Iris setosa, 2 = Iris versicolor, and 3 = Iris virginica). Note that I. versicolor is actually a polyploid hybrid of the other two species. Data were originally collected by the botanist Edgar Anderson of the Missouri Botanical Garden and used with permission.



This chart plots the values of the first and second discriminant function scores. By looking at this plot you can see what the classification rule would be. The first function appears to be the most important in separating the three species.
