Ordination Methods

  • 8/14/2019 Ordination Methods

    6/1/10

    Ordination Methods

    Correspondence Analysis (CA)

    Detrended Correspondence Analysis (DCA)

    Factor Analysis (FA)

    Principles of Canonical Analysis

    Redundancy Analysis (RDA)

    Canonical Correspondence Analysis (CCA)

    Canonical Correlation Analysis (CCorA)

    Discriminant Analysis (DA)

    Dichotomous Key for Ordination Methods

    Not 100% accurate, but a good place to start (see Palmer 1998).

    1. Direct gradient analysis ................................. 2
      2. Few species ............................................ 4
        4. Monotonic responses to gradients (low beta) .......... Linear regression
        4. Nonmonotonic responses to gradients (high beta) ...... Generalized linear models
      2. Many species ........................................... 5
        5. Monotonic responses .................................. RDA
        5. Nonmonotonic responses ............................... 6
          6. Concerned about arch effect ........................ DCCA
          6. Not concerned about arch effect .................... CCA
    1. Indirect gradient analysis ............................... 3
      3. Only distance values are available ..................... 7
        7. Monotonic responses .................................. PCoA
        7. Nonmonotonic responses ............................... NMDS
      3. Raw data available ..................................... 8
        8. Monotonic responses .................................. 9
          9. Variables noncommensurate .......................... PCA - corr. matrix
          9. Variables commensurate ............................. PCA - cov. matrix
        8. Nonmonotonic responses ............................... 10
          10. Feel OK about prespecifying number of dimensions, not worried about local optima, not interested in species scores ... NMDS
          10. Not as above, but willing to accept either arch effect or detrending/rescaling ... 11
            11. Don't like arch, detrending OK .................. DCA
            11. Arch OK, or only interested in axis 1 ........... CA
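
    The key can also be read as a small decision function. The sketch below is only an illustration of the couplets as they appear on this slide; the function name and argument names are my own, and the returned strings are just labels, not calls into any ordination package.

```python
def choose_ordination(direct, monotonic, many_species=False,
                      distances_only=False, avoid_arch=False,
                      commensurate=True, nmds_ok=False):
    """Walk the dichotomous key and return the suggested method (a label only).

    All argument names are hypothetical; each one answers a couplet of the key.
    nmds_ok means: happy to prespecify the number of dimensions, not worried
    about local optima, and not interested in species scores.
    """
    if direct:                                        # couplet 1 -> 2
        if not many_species:                          # couplet 2 -> 4
            return "Linear regression" if monotonic else "Generalized linear models"
        if monotonic:                                 # couplet 5
            return "RDA"
        return "DCCA" if avoid_arch else "CCA"        # couplet 6
    if distances_only:                                # couplet 3 -> 7
        return "PCoA" if monotonic else "NMDS"
    if monotonic:                                     # couplet 8 -> 9
        return "PCA - cov. matrix" if commensurate else "PCA - corr. matrix"
    if nmds_ok:                                       # couplet 10
        return "NMDS"
    return "DCA" if avoid_arch else "CA"              # couplet 11


# Species plus environmental data, many species, roughly linear responses.
print(choose_ordination(direct=True, monotonic=True, many_species=True))
```

    As the slide says, the result is a starting point, not a verdict.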

    Correspondence Analysis (CA)

    Correspondence analysis was developed independently by several authors over a period of ca. 30 years and given many different names in the literature:

    Contingency table analysis
    RQ-technique
    Reciprocal averaging
    Correspondence analysis
    Reciprocal ordering
    Dual scaling
    Homogeneity analysis

    Correspondence Analysis (CA)

    Correspondence analysis was first proposed for analyzing two-way contingency tables. In such tables, the states of the first descriptor (rows) are compared to the states of the second descriptor (columns). Data in each cell of the table are frequencies. These frequencies are positive integers or zeroes.

    In EEB, the most common application of CA is for the analysis of species data (0/1, or abundance) at different sampling sites. A species-site table essentially contains frequencies.

    Correspondence Analysis (CA)

    In general, CA may be applied to any data table that is dimensionally homogeneous (i.e., the physical dimensions of all variables are the same) and only contains positive integers or zeroes.

    The χ² distance (D16), which is a coefficient that excludes double-zeroes, is used to quantify the relationship among rows and columns. NB: Some authors have questioned the efficacy of the χ² distance for certain types of data.
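
    To make the χ² distance concrete, here is a minimal sketch in plain Python. It assumes the usual textbook form of the coefficient (differences between row profiles, each weighted by the inverse of the relative column total); the function name is mine, and the table is the 3 x 3 temperature-by-abundance table from the CA example in this deck. Note that a column where both rows hold zero contributes a difference of zero, which is how double-zeroes drop out.

```python
from math import sqrt


def chi_square_distance(table, i, k):
    """Chi-square distance between rows i and k of a frequency table."""
    total = sum(sum(row) for row in table)            # grand total y++
    col_sums = [sum(col) for col in zip(*table)]      # column totals y+j
    row_i, row_k = table[i], table[k]
    sum_i, sum_k = sum(row_i), sum(row_k)             # row totals yi+, yk+
    acc = 0.0
    for y_ij, y_kj, y_j in zip(row_i, row_k, col_sums):
        if y_j == 0:
            continue                                  # an empty column carries no information
        diff = y_ij / sum_i - y_kj / sum_k            # difference between row profiles
        acc += diff * diff / (y_j / total)            # weight by 1 / relative column total
    return sqrt(acc)


table = [[10, 10, 20],   # Cold
         [10, 15, 10],   # Med.
         [15, 5, 5]]     # Warm

d_cold_med = chi_square_distance(table, 0, 1)
print(round(d_cold_med, 3))   # about 0.491
```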

    Correspondence Analysis (CA)

    CA is primarily a method of ordination. As such, it is similar to PCA; it preserves in the space of the principal axes (i.e., after rotation), the Euclidean distance between profiles of weighted conditional probabilities. In other words, CA preserves the χ² distance among the rows, or among the columns, of the contingency table.

    Correspondence analysis proceeds in three steps:

    (1) The contingency table is transformed into a table of contributions to the Pearson chi-square statistic after fitting a null model to the table.

    (2) Singular value decomposition (SVD) is applied to that table, and the eigenvalues and eigenvectors are computed (as in PCA).

    (3) Further matrix manipulations lead to the tables required for plotting in ordination space.

    CA - Example -

    Let's use an example other than the stand × species situation we have been looking at (although we could do this here too) and consider the relative abundance (0, +, ++) of a particular species observed at 100 sites. The temperature at each site was recorded and coded (1, 2, 3):

    Temp.          Sp. is (Descr.-2):
    (Descr.-1)     Rare (0)   Abund. (+)   Very Abund. (++)   Row Sums
    Cold (1)       10         10           20                 40
    Med. (2)       10         15           10                 35
    Warm (3)       15          5            5                 25
    Col. Sums      35         30           35                 100

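    Matrix Q contains the proportions pij and the marginal totals pi+ and p+j of the rows and columns, respectively. Identifiers of the rows and columns are given outside the matrix brackets in parentheses:

                   (0)     (+)     (++)    pi+
    (1)            0.10    0.10    0.20    0.40
    (2)            0.10    0.15    0.10    0.35
    (3)            0.15    0.05    0.05    0.25
    p+j            0.35    0.30    0.35    1.00

    The eigenvalues of Q'Q (with Q here taken as the centered table of contributions to chi-square, (pij - pi+ p+j) / sqrt(pi+ p+j)) are: λ1 = 0.09613 (70.1%), λ2 = 0.04094 (29.9%), and λ3 = 0 (because of centering).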
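
    The eigenvalues quoted above can be checked by hand from the contingency table. The sketch below is one way to do it in plain Python, under the assumption that the matrix being decomposed is the centered table of contributions to chi-square (step 1 of the three steps listed earlier). The variable names are mine. Because one eigenvalue is zero, the other two are simply the roots of a quadratic, so no linear algebra library is needed.

```python
from math import sqrt

observed = [[10, 10, 20],   # Cold
            [10, 15, 10],   # Med.
            [15, 5, 5]]     # Warm

n = sum(map(sum, observed))
p = [[x / n for x in row] for row in observed]        # proportions p_ij
p_row = [sum(row) for row in p]                       # row totals p_i+
p_col = [sum(col) for col in zip(*p)]                 # column totals p_+j

# Step 1: contributions to chi-square under the independence (null) model.
q_bar = [[(p[i][j] - p_row[i] * p_col[j]) / sqrt(p_row[i] * p_col[j])
          for j in range(3)] for i in range(3)]

# Cross-product matrix a = q_bar' q_bar (3 x 3, symmetric).
a = [[sum(q_bar[i][r] * q_bar[i][c] for i in range(3)) for c in range(3)]
     for r in range(3)]

# One eigenvalue is 0 because of the centering, so the remaining two are the
# roots of  lambda^2 - trace * lambda + m2 = 0, where m2 is the sum of the
# principal 2 x 2 minors of a.
trace = a[0][0] + a[1][1] + a[2][2]
m2 = (a[0][0] * a[1][1] - a[0][1] ** 2
      + a[0][0] * a[2][2] - a[0][2] ** 2
      + a[1][1] * a[2][2] - a[1][2] ** 2)
disc = sqrt(trace ** 2 - 4 * m2)
lambda1 = (trace + disc) / 2
lambda2 = (trace - disc) / 2

print(round(lambda1, 4), round(lambda2, 4))   # about 0.0961 and 0.0409
print(round(trace, 4))                        # total inertia = chi-square / n
```

    The trace (about 0.1371) is the Pearson chi-square statistic divided by n, so the percentages on the slide are each eigenvalue's share of that total.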

    The normalized eigenvectors of Q'Q are then:

    [matrix not recovered]

    And the normalized eigenvectors of QQ' are then:

    [matrix not recovered]

    In Scaling Type-1, F and V are determined to produce a CA joint plot:

    [matrices not recovered]

    Now, to put the rows (matrix F) at the centroids of the columns, the position of each row along an ordination axis is computed as the mean of the column positions, weighted by the relative frequencies of the observations in the various columns of that row...

    Consider the first row of the original data. The relative frequencies of that row are 0.25, 0.25, 0.50. Multiplying matrix V by that vector provides the coordinates of the first row of the ordination diagram:
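
    The weighted-averaging step can be sketched in a few lines of Python. The row profile below is the real one from the example (0.25, 0.25, 0.50). The column coordinates in v_demo are made-up numbers used only to show the arithmetic; they are NOT the slide's matrix V, which did not survive in this transcript. Function names are mine.

```python
def row_profile(row):
    """Relative frequencies of one row of the contingency table."""
    total = sum(row)
    return [x / total for x in row]


def centroid(profile, column_scores):
    """Weighted mean of the column positions, one value per ordination axis."""
    n_axes = len(column_scores[0])
    return [sum(w * col[axis] for w, col in zip(profile, column_scores))
            for axis in range(n_axes)]


cold = [10, 10, 20]                  # first row of the example table
profile = row_profile(cold)          # relative frequencies of that row

# Hypothetical column coordinates on two axes (illustration only).
v_demo = [[1.0, 0.5],                # column (0)
          [0.0, -1.0],               # column (+)
          [-1.0, 0.25]]              # column (++)

cold_position = centroid(profile, v_demo)
print(profile, cold_position)
```

    Repeating the call for the Med. and Warm rows would give the remaining rows of F, each sitting at the centroid of the columns it is most associated with.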

    continuing:

    Now, using F and V, we can construct the ordination plot:

    [Figure: CA joint plot of the rows (F) and columns (V); image not recovered]

    CA using R

    [R session for the example, using the file CA.csv; screenshots not recovered]

    Data Tables

    Correspondence analysis has been applied to many types of data tables other than contingency tables.

    However, as a caveat, recognize again that in order for CA to work correctly, the data table must be dimensionally homogeneous (i.e., in the same physical units) and non-negative (≥ 0).

    If the data do not meet these assumptions, they may be transformed or recoded. This is a critical step in CA.

    Arch Effect

    Let's return to the notion of coenocline distortion that we first considered in PCA. Recall that most of these procedures require linear (or at least monotonic) responses. Species data, in particular, are usually unimodally distributed across a gradient.

    Recall that this problem usually manifests itself in the form of an arch or horseshoe in the data projection.

    Some ecologists are willing to tolerate this distortion, while others feel that an attempt should be made to recover the original gradient via detrending.

    Arch Effect

    The most extreme form of the arch effect usually occurs while attempting to apply a Euclidean distance measure to species abundance data. A horseshoe is formed because the ends of the gradient actually contract and fold inwards along Axis-1 and bend along Axis-3. This is because ED considers the extreme sites to be very near each other.

    In most instances, CA does not exhibit such a dramatic folding towards the terminal portions, but rather just bends along Axis-1 to form an arch.

    Detrended Correspondence Analysis (DCA)

    When a single axis is enough to order the sites and species correctly, a second axis, which is independent of the first, can be obtained by folding the first axis in the middle and bringing the ends together.

    Subsequent independent axes can be obtained by folding the first axis in three parts, four parts, etc.

    These folded axes are the source of the arch; the procedure that removes them is referred to as Detrended Correspondence Analysis (DCA).

    Detrended Correspondence Analysis (DCA)

    Recall this data set used to evaluate PCA & PCO, where 3 species were unimodally distributed across a coenocline:

    [Figure: species abundances across the coenocline (sites 1 to 19), followed by an Axis 1 vs. Axis 2 plot comparing PCA w/ Euclidean Distance vs. DCA via quadratic polynomial; images not recovered]

    Detrended Correspondence Analysis (DCA)

    Two main approaches have been proposed to remove the arch effect:

    detrending by polynomials (previous example), and
    detrending by segments.

    Both methods lead to detrended correspondence analysis.

    Detrending by Segments

    When detrending by segments (Hill and Gauch 1980), axis-I is divided into a number of "segments" and, within each one, the mean of the scores along axis-II is made equal to zero; in other words, data points in each segment are moved along axis-II to make their mean coincide with the abscissa.

    Proximities among points should in no case be interpreted as meaningful! Segments can generate large differences in scores for points that are near each other in the original ordination but happen to be on either side of a segment division.

    The number of segments is arbitrary. Different numbers of segments lead to different ordinations.
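
    The idea of detrending by segments fits in a few lines of Python. This is a deliberately simplified sketch (equal-width, non-overlapping segments, one pass), not the algorithm as implemented in any particular package; the function name and the toy arch are my own assumptions.

```python
def detrend_by_segments(axis1, axis2, n_segments):
    """Centre axis-2 scores on zero within equal-width segments of axis 1."""
    lo, hi = min(axis1), max(axis1)
    width = (hi - lo) / n_segments or 1.0        # guard against a zero-length axis
    # Segment index of each point; the maximum is folded into the last segment.
    seg = [min(int((x - lo) / width), n_segments - 1) for x in axis1]
    detrended = list(axis2)
    for s in set(seg):
        members = [i for i, k in enumerate(seg) if k == s]
        seg_mean = sum(axis2[i] for i in members) / len(members)
        for i in members:
            detrended[i] = axis2[i] - seg_mean   # move the segment onto the abscissa
    return detrended, seg


# Toy arch: axis 2 is an exact quadratic function of axis 1.
axis1 = [x / 2 for x in range(-4, 5)]            # -2.0, -1.5, ..., 2.0
axis2 = [x * x for x in axis1]
flat, seg = detrend_by_segments(axis1, axis2, 3)
print(seg)
print([round(v, 2) for v in flat])
```

    Rerunning with a different n_segments changes which points share a segment mean, which is exactly why different numbers of segments lead to different ordinations.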

    Detrending by Segments

    Various software packages use 10 as a minimum number of segments and 46 as a maximum; 26 being a recommended starting or default value. This of course necessitates data sets with considerably more observations than 26. There are no empirical rules for the "correct" number of detrending segments.

    After detrending by segments, the DCA ordination has the interesting property that the axes are scaled in units of the average standard deviation (SD) of species turnover (Gauch 1982). Along a regular gradient, a species typically appears, rises to a modal value, and disappears in 4 SD; similarly, a complete turnover in species composition often occurs over 4 SD. Thus, the length of axis-I is often used as a measure of the length of the ecological gradient.

    Detrending by Polynomials

    Detrending by polynomials (Hill and Gauch 1980) directly follows from the fact that an arch is produced when a gradient of sufficient length is present in the data. When a sufficient number of species are present and replace each other along the gradient, the second axis of the CA approaches a quadratic function of the first one (i.e., a second degree polynomial).

    When detrending is sought, detrending by polynomials is an attractive method because it results in a continuous function of the previous axes, without the discontinuities generated by detrending-by-segments. However, the downside of this method is that it imposes a very specific polynomial model that the data must correspond to. It also does not solve terminal gradient compressions at the ends of the ordination axes.
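
    A rough sketch of the polynomial idea: treat axis 2 as a polynomial function of axis 1 and keep only what is left over. The Python below does this with a Gram-Schmidt projection so that no libraries are needed. It is my own minimal illustration (names and toy data are assumptions), not the procedure as coded in any ordination program, and it needs at least degree + 1 distinct axis-1 values.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def detrend_by_polynomial(axis1, axis2, degree=2):
    """Residuals of axis 2 after removing a polynomial function of axis 1."""
    basis = []                                   # mutually orthogonal columns
    for power in range(degree + 1):              # 1, x, x^2, ...
        col = [x ** power for x in axis1]
        for b in basis:                          # Gram-Schmidt step
            coef = dot(col, b) / dot(b, b)
            col = [c - coef * bi for c, bi in zip(col, b)]
        basis.append(col)
    resid = list(axis2)
    for b in basis:                              # project the trend out of axis 2
        coef = dot(resid, b) / dot(b, b)
        resid = [r - coef * bi for r, bi in zip(resid, b)]
    return resid


axis1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
arch = [3 * x * x - x + 1 for x in axis1]        # a perfect quadratic arch
noisy = [a + e for a, e in zip(arch, [0.2, -0.1, 0.0, 0.3, -0.2])]

print([round(r, 6) for r in detrend_by_polynomial(axis1, arch)])
print([round(r, 3) for r in detrend_by_polynomial(axis1, noisy)])
```

    The first line of output is all zeros (to rounding), because a perfect quadratic arch is removed entirely; the second keeps only the part of the noise that the quadratic cannot absorb. The result is a smooth correction with no segment boundaries, at the cost of assuming the arch really is polynomial.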

    DCA of WI Forest Data

    [Figure: DCA ordination of stands 1 to 10, DCA Axis 1 vs. DCA Axis 2; 10 segments; R² Axis-I = 0.866, R² Axis-II = 0.076; image not recovered]

    NB: Wedge Effect

    To Detrend or Not To Detrend

    That is the Question...

    The controversy over detrending has raged in the literature for well over 20 years now.

    Wartenberg et al. (1987) argue that the arch is an important and inherent attribute of the distances among sites, not simply a mathematical artifact. The only effect of DCA is to flatten the distribution of points onto axis-I. They also argue that detrending by segments is completely arbitrary and has no theoretical justification.

    Peet et al. (1988) still support DCA on the grounds that detrending and rescaling may facilitate ecological interpretation, and called for improved algorithms.

    To Detrend or Not To Detrend

    That is the Question...

    Minchin (1987) produced a nice comparison of several ordination techniques and found DCA to perform poorly on most counts.

    He found that DCA actually removed real pattern from the data and produced significant distortion, which he referred to as a "tongue" or subsequently a "wedge" in the data; this was a simple artifact of the algorithm.

    However, Palmer (2010) argues for the viability of DCA, especially in certain applications. He highlights the characteristics of both DCA and NMDS at his Ordination website.

    Factor Analysis

    In the social sciences, analysis of the relationships among the descriptors of a multidimensional data matrix is frequently carried out via Factor Analysis (FA).

    Recall that the goal of PCA is to account for a maximum amount of the variance in the data, whereas the goal of factor analysis is to account for the covariance among descriptors.

    Put another way, PCA is directed towards reducing the diagonal elements of R. Factor analysis is directed more towards reducing the off-diagonal elements of R. Since reducing the diagonal elements reduces the off-diagonal elements and vice versa, both methods achieve much the same thing.

    Factor Analysis

    To do this, FA assumes that the observed descriptors are linear combinations of hypothetical underlying variables (i.e., the factors).

    Originally FA was used to evaluate such things as intelligence. Many variables could be measured such as age, parental education, family income, etc. Multiple variables might play out to show that Factor-1 was determined by all variables related to education and Factor-2 to socio-economic conditions (for example).

    There are few applications of FA in EEB, so I will not cover it in depth. An excellent treatment can be found in Tabachnick and Fidell (1996).

    Canonical Analysis

    Canonical analysis is the simultaneous analysis of two, or eventually several, data tables. It permits biologists to do a direct comparison of two data matrices. Hence, canonical analysis and its derivatives are known as direct ordination methods.

    Often, in ecology, one is interested in the relationship between a first table describing species composition and a second table of environmental descriptors, observed at the same locations (i.e., objects or samples).

    Canonical Analysis

    Previous to this, we have considered indirect ordination methods (PCA, PCO, NMDS, CA, DCA) in that we would ordinate a species × stand matrix and then conduct some form of correlation or regression analysis on the ordination vectors to relate objects or descriptors to externally obtained environmental information. This procedure is performed a posteriori.

    In canonical analysis, with two matrices (X and Y), one is constrained by the other, and both are examined simultaneously. This permits one to directly test a priori hypotheses by bringing out all of the variance of Y that is directly related to X and allowing formal tests of the hypotheses.

    Canonical Form

    In mathematics, a canonical form is the simplest and most comprehensive form to which certain functions, relations, or expressions can be reduced without loss of generality.

    For example, the canonical form of a covariance matrix is its matrix of eigenvalues.

    In general, most methods of canonical analysis employ eigenanalysis (some extensions have been described using NMDS).

    Canonical analysis combines the concepts of ordination and regression. It involves a response matrix Y and an explanatory matrix X. (See next slide.)

    Like previous ordination methods, canonical analysis produces orthogonal axes from which scatter diagrams may be plotted.

    [Figure: three schematic panels, each with objects 1 to n as rows; images not recovered]

    Y (variables y1...yp) alone: simple ordination of matrix Y: PCA, CA, etc.

    y (a single variable) and X (variables x1...xm): ordination of y (single axis) under the constraint of X: aka multiple regression.

    Y (variables y1...yp) and X (variables x1...xm): ordination of Y under the constraint of X: Redundancy Analysis (RDA) or Canonical Correspondence Analysis (CCA).

    Problems of canonical analysis can be represented via a partitioned covariance matrix resulting from the fusion of the Y and X data sets and producing a joint dispersion matrix S(Y+X):

    S(Y+X) = | S_YY  S_YX |
             | S_XY  S_XX |

    Submatrices S_YY (order p × p) and S_XX (m × m) concern each of the two sets of descriptors, respectively, whereas S_YX (p × m) and its transpose S'_YX = S_XY (m × p) account for the covariances among the descriptors of the two groups.

    Redundancy Analysis

    In redundancy analysis (RDA), each canonical ordination axis corresponds to a direction, in the multivariate scatter of objects (Y), which is maximally related to a linear combination of the explanatory variables X. A canonical axis is thus similar to a principal component.

    Two ordinations of the objects are obtained: (1) linear combinations of the Y variables (matrix F in PCA), (2) linear combinations of the fitted Y-hat variables (matrix Z), which are thus also linear combinations of the X variables.

    RDA preserves the Euclidean distance among objects in matrix Y-hat containing values of Y fitted by regression on the explanatory variables X.
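
    To see what "fitted by regression" means here, the sketch below works through the simplest possible case in plain Python: a single explanatory variable. In that case Y-hat has only one dimension, so there is one canonical axis and its share of the total variance is just the fitted sum of squares over the total sum of squares. Everything in the block (function name, depth values, abundances) is a made-up illustration, not the tropical fish data and not a substitute for a full RDA, which would follow the regression with a PCA of Y-hat.

```python
def mean(values):
    return sum(values) / len(values)


def rda_single_predictor(Y, x):
    """Y: rows are objects, columns are species; x: one value per object.

    Returns the centred fitted values (Y-hat) and the fraction of the total
    variance in Y that is explained by x.
    """
    x_mean = mean(x)
    xc = [xi - x_mean for xi in x]                        # centred predictor
    sxx = sum(v * v for v in xc)
    fitted_cols, ss_total, ss_fitted = [], 0.0, 0.0
    for col in zip(*Y):                                   # one species at a time
        col_mean = mean(col)
        yc = [v - col_mean for v in col]                  # centred response
        slope = sum(a * b for a, b in zip(xc, yc)) / sxx  # least-squares slope
        fit = [slope * a for a in xc]                     # centred fitted values
        fitted_cols.append(fit)
        ss_total += sum(v * v for v in yc)
        ss_fitted += sum(v * v for v in fit)
    y_hat = [list(row) for row in zip(*fitted_cols)]      # back to objects x species
    return y_hat, ss_fitted / ss_total


# Hypothetical data: 5 sites, 2 species, one environmental variable.
depth = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [[1, 9], [2, 7], [4, 6], [5, 3], [7, 1]]

y_hat, explained = rda_single_predictor(Y, depth)
print(round(explained, 3))
```

    With several explanatory variables the same logic holds, but Y-hat spans several dimensions and the canonical axes come from its eigenanalysis.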

    Canonical Correspondence Analysis

    Canonical correspondence analysis (CCA) is similar to RDA. The difference is that it preserves the χ² distance (as in CA), instead of the Euclidean distance among objects.

    Calculations are a bit more complex since Y-hat contains fitted values obtained by weighted linear regression of matrix Q of correspondence analysis on the explanatory variables X. As in RDA, two ordinations of the objects are obtained.

    Canonical Correlation Analysis

    In canonical correlation analysis (CCorA), the canonical axes maximize the correlation between linear combinations of the two sets of variables Y and X.

    This is obtained by maximizing the among-variable-group covariance (or correlation) with respect to the within-variable-group covariance.

    Two ordinations of the objects are again obtained.

    Canonical Discriminant Analysis

    In canonical discriminant analysis, the objects are divided into k groups, described by a qualitative descriptor.

    The method maximizes the dispersion of the centroids of the k groups. This is obtained by maximizing the ratio of the among-object-group dispersion over the pooled within-object-group dispersion.
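
    The ratio being maximized is easy to compute for a single variable, which gives a feel for what the method is after. The Python sketch below does only that: it scores one candidate variable, whereas the full method searches for the linear combination of all variables with the largest ratio. The function name and both toy variables are my own assumptions.

```python
def dispersion_ratio(values, groups):
    """Among-group over pooled within-group sum of squares for one variable."""
    grand_mean = sum(values) / len(values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    among = within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        among += len(members) * (group_mean - grand_mean) ** 2   # centroid spread
        within += sum((v - group_mean) ** 2 for v in members)    # scatter inside group
    return among / within


groups = ["a", "a", "a", "b", "b", "b"]
separating = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]    # groups far apart, tight inside
overlapping = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]   # groups interleaved

print(dispersion_ratio(separating, groups) > dispersion_ratio(overlapping, groups))
```

    A variable with a large ratio pulls the group centroids apart relative to the scatter inside each group, which is what a good discriminant axis does.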

    Canonical Analysis

    Unfortunately, we do not have the time to develop the details of the algebra of each of the 4 methods of canonical analysis previously described. But you have now gained all of the skills necessary to interpret the details on your own should you need to pursue one of these analyses.

    Three excellent sources of information on these methods are Legendre and Legendre (1998), ter Braak and Šmilauer (1998), and Lepš and Šmilauer (2003).

    Canonical Analysis

    As an alternative to a detailed treatment of the mathematics behind each method, I would like to develop a worked example.

    Let's develop a data set using the number of fish observed at 10 sites along a transect running from the beach of a Caribbean island, with water depths going from 1 to 10 m. The first three sites are on sand and the others alternate between coral and "other substrate" (coded as 0/1).

    Tropical Fish Data Set

    Site No.  Sp-1  Sp-2  Sp-3  Sp-4  Sp-5  Sp-6  Sp-7  Sp-8  Sp-9  Depth (m)  Coral  Sand  Other
    1         1     0     0     0     0     0     2     4     4     1          0      1     0
    2         0     0     0     0     0     0     5     6     1     2          0      1     0
    3         0     1     0     0     0     0     0     2     3     3          0      1     0
    4         1     4     0     0     8     1     6     2     0     4          0      0     1
    5         1     5     17    7     0     0     6     6     2     5          1      0     0
    6         9     6     0     0     6     2     10    1     4     6          0      0     1
    7         9     7     13    10    0     0     4     5     4     7          1      0     0
    8         7     8     0     0     4     3     6     6     4     8          0      0     1
    9         7     9     10    13    0     0     6     2     0     9          1      0     0
    10        5     10    0     0     2     4     0     1     3     10         0      0     1
    %         60    50    40    30    20    10    45    35    25

    Tropical Fish Data Set

    Because we wish to conduct a direct gradient analysis (i.e., we have both species data and environmental data from the same samples), and we have numerous species (9), with roughly monotonic responses (although one may be unimodal; e.g., Sp-7), we select RDA as the method of choice.

    RDA is particularly appropriate when the gradients are short and species distributions are linear (or generally monotonic).

    The software of choice for this type of analysis has for the last decade been CANOCO. Mathematically, this software is excellent, but its ease of use is not the best and graphics are poor. R now has applications to handle most of these ordination procedures (i.e., RDA and CCA).

    RDA using R - Tropical Fish Data Set -

    [R session and output plots for the RDA of the tropical fish data; screenshots not recovered]

    Discriminant Analysis

    A common situation arises in EEB applications where one starts with an already known grouping of objects, and one wishes to assess how well a group of quantitative descriptors can explain the object groups. Thus, the problem is no longer how to define or delineate groups, but rather how to interpret them. This is the realm of discriminant analysis.

    Discriminant analysis is a method of linear modeling, like analysis of variance, multiple regression, and canonical correlation analysis. DA is frequently used in systematics.

    Discriminant Analysis

    DA proceeds in two steps:

    (1) It first tests for the differences in the explanatory variables (X) among the predefined groups. This part of the analysis is identical to the overall test performed in the MANOVA.

    (2) If the test supports the alternative hypothesis of significant differences among groups in the X variables, the analysis proceeds to find the linear combinations (called discriminant functions) of the X variables that best discriminate the groups.

    Discriminant Analysis

    Like one-way ANOVA, discriminant analysis considers a single classification criterion (i.e., division of the objects into groups) and allows one to test whether the explanatory variables can discriminate among the groups. Testing for differences among group means in DA is identical to ANOVA for a single explanatory variable and to MANOVA for multiple explanatory variables.

    When it comes to modeling, i.e., finding the linear combinations of the variables (X) that best discriminate among the groups, DA is a form of "inverse analysis" where the classification criterion is considered to be the response variable (y) whereas the quantitative variables are explanatory (matrix X).

    Discriminant Analysis

    Note that discriminant analysis (DA) is also called canonical variates analysis (CVA). This method was first proposed by Fisher (1936), who published the now famous data set describing the morphology of 150 specimens of irises (Iridaceae) belonging to three species, using 4 measured flower characters (lengths and widths of sepals and petals).

    Again, in the interest of time, we will bypass the mathematical treatment of DA and work through the iris data set using a software application (NCSS).

    Discriminant Analysis

    Shown here are Fisher's data for the 150 plants (first 31 shown), four variables, and three species (coded 1, 2, 3: 1 = Iris setosa, 2 = Iris versicolor, and 3 = Iris virginica). Note that I. versicolor is actually a polyploid hybrid of the other two species. Data were originally collected by the botanist Edgar Anderson of the Missouri Botanical Garden and used with permission.

    This chart plots the values of the first and second discriminant function scores. By looking at this plot you can see what the classification rule would be. The first function appears to be the most important in separating the three species.