Tree Diversity Analysis

  • 8/6/2019 Tree Diversity Analysis

    1/28

    TREE DIVERSITY ANALYSIS

    A manual and software for common statistical methods

    for ecological and biodiversity studies

    Using the BiodiversityR software within the R 2.10.0 environment

    R Kindt, World Agroforestry Centre, Nairobi (Kenya)

    1. Introduction

    The software accompanying the Tree Diversity Analysis manual was developed for the R 2.1.1 environment. Since the publication of the manual at the end of 2005, new versions of the base R environment and its accompanying packages have become available. These changes made it necessary to modify the BiodiversityR software. At the same time that these changes were implemented, the opportunity was taken to develop BiodiversityR into a package that can be installed and is documented in the same way as other R packages. Some new functions were integrated in the new version of the package. Functions that were not directly associated with the graphical user interface (GUI) provided by BiodiversityR were documented separately.

    The GUI of BiodiversityR is supported by the R Commander package (Figure 1). Options selected from a specific menu generate commands that are placed in the "Script Window" of the R Commander. These commands return results that are shown in the "Output Window" and potentially as graphs (Figures 2 and 3).

    Highlighting some commands and clicking on the "Submit" button re-submits the commands and generates new results. We encourage users to read the help files to understand the various functions and their arguments, and also to explore modifying some of the options.

    This document shows where the Tree Diversity Analysis manual has become outdated. Please read the manual for more information on community ecology and biodiversity analysis.

    Figure 1. The graphical user interface (GUI) of the BiodiversityR package is integrated in the R Commander and is available from the right-hand side of the top menu.

    Figure 2. The GUI of BiodiversityR allows different options to be selected for calculations, for example to use the exact calculation method for species accumulation curves (these options are available from the BiodiversityR menu option of: BiodiversityR > Analysis of diversity > Species accumulation curves). Clicking on the "OK" button generates commands that are submitted to the Script Window of the R Commander (see Figure 3). Clicking on the "Plot" button generates commands that produce specific graphics based on the results that were obtained earlier (in the example shown below, graphical results use data from the Dune.accum result).

    Figure 3. Selection of specific options from a BiodiversityR window results in commands that are shown in the Script Window of the R Commander (the commands shown here were generated by the options shown in Figure 2 after clicking the "OK" button). It is possible to modify these commands in the Script Window and to obtain new results by clicking on the "Submit" button.

    2. Main changes in the software

    The main changes in the software include the following:

    - Installation
    - Using the package and the graphical user interface after the package was installed
    - Importing data via Excel workbooks or Access databases

    2.1. Installation

    In the new version of the software, the package is installed and loaded like any other package developed within the R statistical environment. An accompanying document (Installation of BiodiversityR in Windows) provides instructions on how BiodiversityR can be installed and used under MS Windows. This accompanying document replaces most of the information that is available in Chapter 3: Doing biodiversity analysis with Biodiversity.R of the manual. An important change from the previous installation instructions is that the step of copying the Biodiversity.R and Rcmdr-menus.txt files is no longer needed (page 34 in the manual).

    You need the following packages for all options of BiodiversityR. Between brackets I have indicated the version of the packages that I am currently using. The first four packages are essential, although most of the other packages are also frequently used.

    - BiodiversityR (1.4.2)
    - car (1.2-16)
    - Rcmdr (1.5-3) [Note that this may be the only version of the R Commander compatible with BiodiversityR]
    - vegan (1.15-4)
    - abind (1.1-0)
    - akima (0.5-3)
    - aplpack (1.2-2)
    - colorspace (1.0-1)
    - effects (2.0-11)
    - ellipse (0.3-5)
    - Hmisc (3.7-0)
    - lmtest (0.9-24)
    - maptree (1.4-5)
    - mgcv (1.5-6)
    - multcomp (1.1-2)
    - mvtnorm (0.9-8)
    - relimp (1.0-1)
    - rgl (0.87)
    - RODBC (1.3-1)
    - sp (0.9-44) [only for one function for training purposes]
    - splancs (2.01-25) [only for one function for training purposes]
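Before loading BiodiversityR it can help to confirm that the essential packages are present. The following is a minimal sketch using only base R; the variable names are illustrative, and the check looks at package names only, not at the version numbers listed above.

```r
# The four packages described above as essential.
needed <- c("BiodiversityR", "car", "Rcmdr", "vegan")

# Names of all packages installed in the current R libraries.
installed <- rownames(installed.packages())

# TRUE for each essential package that is already installed.
status <- needed %in% installed
names(status) <- needed
print(status)

# Packages that still have to be installed (may be empty).
missing.packages <- needed[!status]
print(missing.packages)
```

Any names left in missing.packages can then be installed in the usual way for R packages, as described in the accompanying installation document.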

    2.2. Using the package and the graphical user interface after the package was installed

    As the software is now a standard package, use the following command to load the package (obviously after the package was installed; alternatively you could use the following menu options: Packages > Load package):

    library(BiodiversityR)

    Note that R is case-sensitive, so never use capitals where these are not shown.

    To access the graphical user interface of the package (still based on the Rcmdr package), use:

    BiodiversityRGUI()

    To learn more about the features of the BiodiversityR package, use menu options of: BiodiversityR > Help about BiodiversityR > Help about BiodiversityR, or type:

    help("BiodiversityRGUI", help_type="html")

    2.3. Importing data via Excel workbooks or Access databases

    A new feature of the updated package is that data can be imported from Excel workbooks or Access databases. To import the community and environmental data sets (read Chapter 2: Data preparation for more information about these data sets), data for the environmental data set need to be available from an Excel worksheet (alternatively an Access table) named environmental (Figure 4). Data for the community data set should either be imported as a matrix (formatted as sites × species, with species abundances as cell entries) from an Excel worksheet (alternatively an Access table) named community (Figure 5), or be available in a stacked format (with separate columns for sites, species and abundances) from an Excel worksheet (alternatively an Access table) named stacked (Figure 6). Both data sets should be available from the same Excel workbook (or Access database).
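The matrix and stacked layouts described above carry the same information. As a minimal sketch in base R, a stacked table can be cross-tabulated into a sites × species matrix. The data and the column names site, species and abundance are hypothetical, and this is only an illustration of the two layouts, not a description of how the import functions work internally.

```r
# Hypothetical stacked data: one row per site and species combination.
stacked <- data.frame(
  site = c("S001", "S001", "S002", "S003"),
  species = c("Olea_capensis", "Prunus_africana",
              "Olea_capensis", "Croton_megalocarpus"),
  abundance = c(3, 1, 5, 2),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows become sites, columns become species and
# cells hold the summed abundances (0 where a species was not recorded).
community <- xtabs(abundance ~ site + species, data = stacked)

# Convert the table to a data frame with sites as row names.
community <- as.data.frame.matrix(community)
print(community)
```

The number of columns of the resulting data frame is the number of species, which is a quick way to confirm that the stacked data were interpreted as expected.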

    More information on importing data from Excel or Access is available from the help provided for the import.from.Excel and import.from.Access functions:

    help("import.from.Excel", help_type="html")

    help("import.from.Access", help_type="html")

    The accompanying document (Installation of BiodiversityR in Windows) also provides instructions on preparing Excel files, as well as some suggestions to avoid problems in importing data. For users who do not have access to MS Excel, we suggest that they use the OpenOffice Calc program to prepare the data and save the data as an MS Excel file. OpenOffice can be obtained from www.openoffice.org.

    Some users have experienced problems importing data. Some suggestions to avoid such problems are the following:

    - Avoid spaces in names of variables as much as possible. Use variable names such as soil_texture or soil.texture rather than soil texture.
    - Try not to use special characters in data sets such as .
    - Avoid capital letters for the names of worksheets or tables.
    - Prior to importing data from the stacked data, also replace spaces in names of species (since these will become variable names).
    - Use a strict scheme of using capital letters or not. Especially check whether the number of species after importing data from the stacked format is what you expected. Since R is case-sensitive, species names such as Olea_capensis and olea_capensis will be interpreted as different species in R. You can determine the number of species from the number of columns in the community data set or via the menu option of: BiodiversityR > Analysis of diversity > Diversity indices and then opting to calculate the species richness with the calculation method for all sites.
    - Rather than using a scheme of naming sample units as S1, S10 or S100, use a numbering system with leading zeroes such as S001, S010 and S100.

    - If names of sites are not in the same sequence or do not contain the same subset of sample units, use the menu option of: BiodiversityR > Community matrix > Same sites for community/environmental.
    - In some situations, MS Excel imports data from a larger number of columns or rows than the current data range (this seems to be a result of the previous presence of data in those columns or rows, even if the data were deleted later). You may therefore wish to open a new workbook and copy the desired data ranges into the community, environmental and stacked worksheets.
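Several of the suggestions above can be checked with a few lines of base R before or after importing. The sketch below uses hypothetical species and site names; it only illustrates the checks and is not part of BiodiversityR.

```r
# Hypothetical names as they might arrive from a worksheet.
species <- c("Olea capensis", "olea_capensis", "Prunus africana")
sites <- c("S1", "S10", "S100")

# Replace spaces so that species names can serve as variable names.
species.clean <- gsub(" ", "_", species)

# Flag names that differ only in capitalisation; R would treat
# these as different species.
lower <- tolower(species.clean)
case.clash <- species.clean[lower %in% lower[duplicated(lower)]]
print(case.clash)

# Pad site numbers with leading zeroes (S1 becomes S001).
site.number <- as.integer(sub("^S", "", sites))
sites.padded <- sprintf("S%03d", site.number)
print(sites.padded)

# Check whether two data sets list the same sites in the same order,
# and work out how to reorder the second one if they do not.
community.sites <- sites.padded
environmental.sites <- rev(sites.padded)
same.order <- identical(community.sites, environmental.sites)
reorder <- match(community.sites, environmental.sites)
print(environmental.sites[reorder])
```

Names reported in case.clash need to be harmonised by hand, since only the analyst can tell which spelling is the intended one.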

    One method to check what could have gone wrong when trying to import data is to import data via the Rcmdr option of: Data > Import Data > From Excel, Access or dBase data set.

    Figure 4. Required format of the Excel workbook with the environmental data set: the name of the worksheet is "environmental" (without capitals), row 1 gives the names of the various variables (preferably without spaces or special characters), column A contains names of sample units, other columns (B-F) document characteristics of sample units. The name of the variable with the labels for sample units (cell A1) should be the same as the name of the variable with labels in sheet "community" (Figure 5) or "stacked" (Figure 6).

    Figure 5. Required format of the Excel workbook with the community data set: the name of the worksheet is "community" (without capitals), names for variables (preferably without spaces or special characters) are given in row 1, column A contains labels of sample units, whereas other columns document abundances of species. The name of the variable with the labels for sample units is given in cell A1. Except for the names of sample units, this data set only contains continuous (numeric) variables.

    Figure 6. Required format of the Excel workbook with the stacked data set: the name of the worksheet is "stacked" (without capitals), names for variables (preferably without spaces or special characters) are given in row 1, one variable contains labels of sample units (column A), a second variable contains names of species (preferably without spaces or special characters, column B) and a third variable documents abundances of species (column C).

    3. Main changes in the examples of the manual

    The main change in the examples is that the menu options should now be accessed via BiodiversityR and not Biodiversity. Make sure that you select a community data set and an environmental data set (see above and Chapter 2 of the manual) before embarking on analysis.

    Most of the menu options and commands remain the same or changed only slightly (for example, the option for calculating the first-order jackknife gamma diversity estimator is now "jack1" whereas it was documented as "Jack.1" in the manual; this reflects a change that was made in the vegan community ecology package that is used to calculate the result). If it is not clear which option to choose, please check the changes in the commands.
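For orientation, the first-order jackknife estimator is commonly defined as the observed number of species plus the number of species found in exactly one site, weighted by (N - 1)/N, where N is the number of sites. The following base R sketch applies that textbook formula to a hypothetical community matrix; it is an illustration under that assumption, not the code that vegan or BiodiversityR runs.

```r
# Hypothetical community matrix: rows are sites, columns are species.
community <- matrix(
  c(1, 0, 3, 0,
    0, 0, 1, 0,
    2, 4, 0, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S001", "S002", "S003"),
                  c("sp_a", "sp_b", "sp_c", "sp_d"))
)

# Work with presence or absence rather than abundance.
presence <- community > 0
n.sites <- nrow(presence)

# Number of sites in which each species occurs.
frequency <- colSums(presence)

# Observed richness and the number of species found in one site only.
s.obs <- sum(frequency > 0)
uniques <- sum(frequency == 1)

# First-order jackknife estimate of total richness.
jack1 <- s.obs + uniques * (n.sites - 1) / n.sites
print(jack1)
```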

    Remember that menu options result in calculations by clicking on the "OK" button. Menu options related to graphical output are invoked by the "Plot" button.

    I suggest checking the commands that are listed below rather than the commands listed in the guide. Commands should be pasted into the "Script Window" of the R Commander and highlighted, and the "Submit" button should be clicked to obtain results.

    The commands are also available as scripts that are listed in a separate directory (Manual \Scripts). These scripts can be accessed via the menu option of: File > Open script from the R Console or the menu option of: File > Open script file from the R Commander.

    We encourage users to explore importing data into R. However, all data sets can also be imported by loading the workspace of TreeDiversity.RData (available from the Data directory) via the menu option of: File > Load Workspace from the R Console.

    Commands for Chapter 1: Sampling

    #To load polygons with the research areas:

    area

    #To select sample plots on a grid (alternative):

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(area, method="grid", xwidth=1, ywidth=1, plotit=T, xleft=12, ylower=7, xdist=4, ydist=4)

    #To randomly select sample plots from a grid:

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xleft=10.5, ylower=5.5, xdist=1, ydist=1)

    #To randomly select sample plots from a grid (alternative):

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xleft=12, ylower=7, xdist=4, ydist=4)

    #To select sample plots from a grid with random start:

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xdist=4, ydist=4)

    #To randomly select maximum 10 sample plots from each type of landuse:

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(landuse1, n=10, method="random", plotit=T)

    spatialsample(landuse2, n=10, method="random", plotit=T)

    spatialsample(landuse3, n=10, method="random", plotit=T)

    #To randomly select sample plots from a grid within each type of landuse. Within each landuse, the grid has a random starting position:

    plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")

    polygon(landuse1)

    polygon(landuse2)

    polygon(landuse3)

    spatialsample(landuse1, n=10, method="random grid", xdist=2, ydist=2, plotit=T)

    spatialsample(landuse2, n=10, method="random grid", xdist=4, ydist=4, plotit=T)

    spatialsample(landuse3, n=10, method="random grid", xdist=4, ydist=4, plotit=T)
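The idea of a grid with a random starting position, used in several of the commands above, can be sketched in base R without any polygon data. The bounding box, the spacing and all variable names below are hypothetical, and the sketch does not clip points to a polygon; it is not the implementation of spatialsample.

```r
set.seed(1)  # only to make this illustration repeatable

# Hypothetical bounding box of the survey area and grid spacing.
x.range <- c(10, 30)
y.range <- c(5, 25)
xdist <- 4
ydist <- 4

# The random start shifts the whole grid by less than one grid cell.
x.start <- x.range[1] + runif(1, min = 0, max = xdist)
y.start <- y.range[1] + runif(1, min = 0, max = ydist)

# Regularly spaced positions from the random start to the box edge.
x.positions <- seq(x.start, x.range[2], by = xdist)
y.positions <- seq(y.start, y.range[2], by = ydist)

# All combinations of x and y positions form the candidate plot centres.
grid <- expand.grid(x = x.positions, y = y.positions)
print(nrow(grid))
```

A random subset of these candidate positions could then be drawn with sample(nrow(grid), n), which mirrors the idea of randomly selecting plots from a grid.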

    #To calculate sample size requirements:

    power.t.test(n=NULL, delta=1, sd=1, sig.level=0.05, power=0.8, type="two.sample")

    power.t.test(n=NULL, delta=0.5, sd=1, sig.level=0.05, power=0.8, type="two.sample")

    power.anova.test(n=NULL, groups=4, between.var=1, within.var=1, power=0.8)

    power.anova.test(n=NULL, groups=4, between.var=2, within.var=1, power=0.8)
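The sample size commands above print a summary in which n is the required number of sample units per group, usually as a fraction. As a small sketch (the object names are illustrative), the value can be extracted and rounded up to a whole number of sample units, which also shows how strongly the requirement grows when the difference to be detected is halved.

```r
# Same settings as the two power.t.test commands above.
large.difference <- power.t.test(n = NULL, delta = 1, sd = 1,
                                 sig.level = 0.05, power = 0.8,
                                 type = "two.sample")
small.difference <- power.t.test(n = NULL, delta = 0.5, sd = 1,
                                 sig.level = 0.05, power = 0.8,
                                 type = "two.sample")

# n is the number of sample units per group; round up to whole units.
n.large <- ceiling(large.difference$n)
n.small <- ceiling(small.difference$n)

print(c(delta.1 = n.large, delta.0.5 = n.small))
```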

    #To calculate the area of a polygon:

    areapl(landuse1)
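For readers who want to see what a polygon area calculation involves, the following base R sketch applies the standard shoelace formula to a two-column matrix of vertex coordinates. The function name polygon.area and the rectangle are hypothetical, and this is an independent illustration rather than the code behind areapl.

```r
# Area of a simple polygon from its vertices (shoelace formula).
# xy is a matrix with x coordinates in column 1 and y in column 2;
# the polygon is closed automatically from the last to the first vertex.
polygon.area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  x.next <- c(x[-1], x[1])
  y.next <- c(y[-1], y[1])
  abs(sum(x * y.next - x.next * y)) / 2
}

# Hypothetical rectangle of 4 by 3 units.
rectangle <- cbind(x = c(0, 4, 4, 0), y = c(0, 0, 3, 3))
print(polygon.area(rectangle))
```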

    Commands for Chapter 2: Data preparation

    #To load data from an external file:

    new.data

    Commands for Chapter 4: Analysis of species richness

    #To calculate the total number of species:

    Diversity.1

    #To compare species richness between various subsets in the data using species accumulation curves

    Accum.6

    Commands for Chapter 5: Analysis of diversity

    #To calculate and plot a rank-abundance curve:

    RankAbun.1

    Commands for Chapter 6: Analysis of counts of trees

    #Load the dataset Faramea.txt and give it the name faramea.

    faramea

    #To calculate a generalized linear regression model (GLM):

    Count.model3

    Commands for Chapter 7: Analysis of presence or absence of species

    faramea

    Commands for Chapter 8: Analysis of differences in species composition

    #Calculating distance matrices

    euclidean.distance

    Commands for Chapter 9: Analysis of ecological distance by clustering

    #Calculate and plot agglomerative clustering:

    library(cluster)

    distmatrix

    #Calculating non-hierarchical clusters:

    distmatrix

    Commands for Chapter 10: Analysis of ecological distance by ordination

    #Calculating a principal component analysis (PCA)

    Ordination.model1

    #Calculating a principal coordinates analysis (PCoA)

    distmatrix

    #Calculating the correlation between distance in an ordination graph and total distance

    distmatrix

    #Plotting categorical environmental variables onto an ordination graph

    distmatrix