Tutorial GCDkit ver 3.00

88
Menu GCDkit

description

using geochemical data & tutorial GCDKit ver 3.00

Transcript of Tutorial GCDkit ver 3.00

  • GCDkit I.Loads ASCII file or imports clipboard (e.g. data copied from Excel)Data are separated by tabulators, commas or semicolonsThe 1st line contains unique labels for the data columns (e.g. SiO2, Fe2O3, Rb, Nd), the 1st column unique sample IDsDecimal commas are converted to decimal points if appropriateMissing values are allowed anywhere in the data file; as such are interpreted also values 0, or any of NA,N.A.,-, b.d., bd

  • GCDkit I.Total iron as ferrous oxide: FeOt or FeO*Structurally bound water: H2O.PLUS, H2O+, H2OPLUS or H2O_PLUSColumn Symbol (if any): plotting symbols (as codes or single characters)A column whose name starts with Col (if any): code for colour of the symbolsAvoid special symbols in the column names, and accented characters throughout the file!

  • GCDkit I.Appends new samples (= new rows) to the data in memory. The structures of both data files are, as much as possible, matched. If necessary, empty columns are introduced to either of the data sets.

    File 1File 2

  • GCDkit I.Adds new data (columns) to the samples stored in the memory. No new samples are introduced that would occur solely in one of the files.File 1File 2

  • GCDkit I.Saves the modified data set stored in memory under a specified filename.The data can be retrieved again into GCDkit using the Load data file command.

  • GCDkit I.Information about the current dataset:levels and frequencies for each of the labels,list + no. of numeric columns, for each of the numeric variables no. of available values,total no. of samples,list of samples in the selected subset (or all samples if none is defined), current grouping information.

  • GCDkit I.Prints a cross table (contingency table) for 1-3 labels and plots corresponding barplots.

  • Contingency tablesAn example of a contingency table involving two labels

  • GCDkit I.Restricts the textual output to an absolute minimum (which is useful for large data files)

  • GCDkit II.

  • GCDkit II.

  • Intermezzo 1: Specifying a variable in GCDkitEnter complete name of a variable (e.g., SiO2)Type only part of the variable name. If the result is ambiguous, the desired variable has to be selected from the list of the multiple matches by mouse (applies also for empty patterns)Specify the variable sequence number (2 for the second one).Often if a formula is entered, the results are interpreted and computed by the calculation core.

    S

  • Intermezzo 2: Formulae & calculation coreFormula can involve any combination of names of existing numerical columns, with the constants, brackets, arithmetic operators +-*/^ and R functions.

    (Na2O+K2O)/CaORb^2log10(Sr)mean(SiO2)/10

    Examples of valid formulae:

  • Data handling I.Displays a single numeric variable or a result of a calculation# Works as a simple R shell too!summary(Rb,na.rm=T)cbind(SiO2/2,TiO2,Na2O+K2O)cbind(major) hist(SiO2,col="red")boxplot(Rb~factor(groups))

  • Intermezzo 3: Specifying multiple variablesList of column name(s), in full, separated by commasSequence numbers of variables or their ranges (1,10:15)Name of a built-in list, such as LILE, REE, major and HFSE or their combinations with the column namesUser-defined list = simple character vector. Currently only a single, stand-alone user-defined list can be employed as a search criterionFor empty patterns, the correct name(s) has to be selected by mouse click(s) ( Shift Ctrl) from the list of the available variables

  • Intermezzo 3: Specifying multiple variables - examplesSearch pattern = major SiO2, TiO2, Al2O3, Fe2O3, FeO, MnO, MgO, CaO, Na2O, K2O, P2O5 Search pattern = LILE Rb, Sr, Ba, K, Cs, LiSearch pattern = HFSE Nb, Zr, Hf, Ti, Ta, La, Ce, Y, Ga, Sc, Th, USearch pattern = REE La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, LuSearch pattern = 1:5,7 Numeric data columns number 1, 2, ...5, 7# User-defined list my.elems
  • Intermezzo 4: Searching and subsettingThe search pattern is first tested whether it could be interpreted as a query of the sample name(s). The list of exact sample names separated by commas is allowed.The pattern is assumed to correspond to a selection of sample sequence numbers.Lastly the search pattern is interpreted as a Boolean condition.Entering empty pattern usually returns all the samples in the data set.

  • Intermezzo 4: Searching and subsetting - examples1. By sample nameSearch pattern = oz Samples with names Koz, KozD-5, Roz-5 Search pattern = Bl-1,Bl-2,Koz-3 Samples with names Bl-1,Bl-2,Koz-3 Regular expressions (advanced technique, see later)

  • Intermezzo 4: Searching and subsetting - examples2. By sample rangeSearch pattern = 1:5 # First to fifth samples in the data setSearch pattern = 1,10 # First and tenth samplesSearch pattern = 1:5, 10:11, 25 # Samples number 1, 2, ...5, 10, 11, 25In this case the search pattern is treated as a selection of sample sequence numbers (effectively a list separated by commas that may also contain ranges expressed by colons).

  • Intermezzo 4: Searching and subsetting - examples3. By Boolean conditionsPatterns may employ variable names and in R common comparison operators (see Table). The character strings should be quoted.The conditions can be combined together by logical and, or and brackets.Logical and can be expressed as .and. .AND. &Logical or can be expressed as .or. .OR. | Regular expressions can be employed to search in the textual labels. (advanced technique, see later )

  • Intermezzo 4: Searching and subsetting - examples3. By Boolean conditionsSearch pattern: Intrusion="Rhum # Finds all analyses from RhumSearch pattern: Intrusion="Rhum".and.SiO2>65 Search pattern: Intrusion="Rhum".AND.SiO2>65 Search pattern: Intrusion="Rhum"&SiO2>65 # All analyses from Rhum with silica greater than 65 # (all three expressions are equivalent) Search pattern: MgO>10&(Locality="Skye"|Locality="Islay") # All analyses from Skye or Islay with MgO greater than 10

  • Displays specified combination of numeric variable(s) and/or labels for selected range of samples.So far only names of existing numeric data columns and not formulae can be handled.Data handling I.

  • Data handling I.Deletes a single numeric variable or a label. Some fields are mandatory and cannot be removed.

  • Data handling I.Appends an empty numeric data column or new label to the current data set.

  • Data handling I.Simultaneous editing of all labels for individual samples using a spreadsheet-like interface.When the desired changes have been performed, close button is to be clicked.

  • Data handling I.

  • Data handling I.Global replacement of selected discrete values (levels) for a given label.

  • Data handling I.Simultaneous editing of all numeric data using a spreadsheet-like interface.

  • Intermezzo 5: Regular expressionsMany enquiries in the GCDkit employ regular expressions. This is a quite powerful searching mechanism more familiar to people working in Unix. Most characters, including letters and digits, are regular expressions that match themselves.Dot . matches any character.Metacharacters with a special meaning ? + { } | ( ) ) must be preceded by a backslash.Brackets can be used to group subexpressions.

  • Intermezzo 5: Regular expressions

  • Intermezzo 5: Regular expressions

  • Intermezzo 5: Regular expressions - examples# Searched is list of localities: Mull, Rhum, Skye, Coll, Colonsay, Hoy, Westray, Sanday, Stronsay, Tiree, IslaySearch pattern = ol Coll, ColonsaySearch pattern = n.a Colonsay, Sanday, StronsaySearch pattern = ^S Skye, Sanday, StronsaySearch pattern = e$ Skye, TireeSearch pattern = [ds]ay Colonsay, Sanday, StronsaySearch pattern = [p-s]ay Colonsay, Westray, Stronsay

  • Intermezzo 5: Regular expressions - examples# Searched is list of localities: Mull, Rhum, Skye, Coll, Colonsay, Hoy, Westray, Sanday, Stronsay, Tiree, IslaySearch pattern = ol|oy Coll, Colonsay, HoySearch pattern = l{2} Mull, Coll# Sample names are: Bl-1, Bl-3, Koz-1, Koz-2, Koz-5, Koz-11, KozD-1, Ri-1Search pattern = oz-|Bl- Bl-1, Bl-2, Bl-3, Koz-1, Koz-2, Koz-5, Koz-11 Search pattern = oz-[1-3] Koz-1, Koz-2, Koz-11Search pattern = oz-1{1,} Koz-1, Koz-11

  • Data handling I.Selecting subsets of the data stored in memory by searching sample names or levels of a single label.regular expressions implementedLokalita

  • Data handling I.Selecting subsets of the data stored in memory by their range.1:5

  • Data handling I.Selecting subsets of the current dataset using Boolean conditions. queried can be both numeric fields and labels (or combinations thereof)regular expressions can be employed to search the labelsSuita=Ricany

  • Data handling I.Restores data for all samples in the same form as they were loaded from a data file.

  • Data handling II.Grouping the data according to the levels of a single label.Suita

  • Data handling II.Grouping the data according to the interval a single numerical variable falls into.Enter a comma-delimited list of one or more breakpoints defining the intervalsThe default includes the mean, that would be supplemented by 0 and maximum (i.e. two intervals)The names of individual groups can be specifiedThe vector containing the information on the groups can be appended to the labels.

  • Data handling II.SiO252,63Basic,Intermediate,Acid

  • Data handling II.Grouping the data using selected classification diagram.The vector containing the information on the current groups can be appended to the labels.

  • Data handling II.Grouping the data using the cluster analysis.After the dendrogram is drawn, the user is asked how many clusters is the dataset to be broken into.The vector containing the information on the current groups can be appended to the labels. The groups are initially numbered but the names can be changed readily using the function Edit labels as factor.

    5

  • Data handling II.Enables merging several groups into a single one.The vector containing the information on the current groups can be appended to the labels.OldOldYoungYoung

  • Intermezzo 4: Plotting symbolsUse codes from the table or single character vectors as *,B,s

  • Intermezzo 5: Plotting coloursNB that only numeric codes can be used to specify plotting colours so far.

  • Data handling III.Assigns plotting symbols and colours simultaneously according to the levels of the defined groups.34

  • Data handling III.Assign plotting symbols or colours according to the levels of a single label.

  • Data handling III.Assign uniform plotting symbols or colours to all the analyses in the current data set.

  • Data handling III.

  • Data handling III.Displays a graphical legend(s) with current assignment of plotting symbols and colours used by most of the diagrams.If necessary, two legends are created, for symbols and colours separately.

  • Calculations I.Computes a single numeric variable and appends it, under specified name, to the numeric data in memory.SiO2/5My.param

  • Calculations I.Adds a formula for a single numeric variable to the specified R script (*.r ).The user is prompted for the variable name and any comments that should appear in the file.The script can be executed later using the R command File|Source. Alternatively, it can be placed among the plugins into the subdirectory \Plugin.

  • Calculations I.Recasts the selected data to a fixed sum.

  • Calculations I.Calculates millications as used for many plots of the French school, e.g. by De la Roche et al. (1980) or Debon & Le Fort (1983, 1988). The calculated values are Si, Ti, Al, Fe3, Fe2, Fe, Mn, Mg, Ca, Na, K, P.Where: MW = molecular weight of the Oxidei, n = number of atoms of Elementi in the formula

  • Calculations I.

  • Calculations II. (stats)Prints statistical summary for a single variable and the current dataset (or its part). Formulae are ok.The statistical summary involves no. of observations, missing values, mean, std. deviation, minimum, 25% quartile, median (= 50% quartile), 75% quartile and maximum. The function also plots a summary boxplot and a histogram.

  • Calculations II. (stats)

  • Calculations II. (stats)Like previous, but respecting grouping.

  • Calculations II. (stats)

  • Calculations II. (stats)Prints statistical summary for selected list of elements (majors or traces, respectively) and the current dataset (or its part).The statistical summary involves no. of observations, missing values, mean, std. deviation, minimum, 25% quartile, median (= 50% quartile), 75% quartile and maximum. Optionally the function also plots summary boxplots and/or histograms.

  • Calculations II. (stats)

  • Calculations II. (stats)Like previous, but respecting grouping.

  • Calculations II. (stats)

  • Calculations II. (stats)Displays a binary diagram of two elements/oxides, in which are plotted averages for the individual groups with whiskers corresponding to the standard deviations.

  • Calculations II. (stats)

  • Calculations II. (stats)Plots a matrix of scatterplots in the lower panel and one of other pre-defined panel functions in the upper.

  • Calculations II. (stats)

  • Calculations II. (stats)

  • Calculations II. (stats)Produces, for each group a separate, set of plots of correlation coefficient patterns (Rollinson 1993 and references therein).Similarity in patterns between two or more elements indicates their analogous geochemical behaviour, potentially controlled by the same geochemical process (fractional crystallization, partial melting, weathering, hydrothermal alteration...)

  • Calculations II. (stats)

  • Calculations II. (stats)Performs principal components analysis (scaled variables, covariance or correlation matrix) and plots a biplot (Gabriel, 1971).The length of the individual arrows is proportional to the relative variation of each of the variables. Comparable direction of two arrows implies that both variables are positively correlated; the opposite one indicates a strong negative correlation. (Buccianti & Peccerillo, 1999)

  • Calculations II. (stats)

  • Calculations II. (stats)Hierarchical cluster analysis on a set of dissimilarities.The user is asked to specify a label for the individual samples, default are their names.After the dendrogram is drawn, the individual clusters can be identified.For each sample falling into the given group, specified information (e.g. Locality, Rock type and/or Author) can be printed.

  • Calculations II. (stats)

  • Calculations III.

  • Calculations III.Least-squares approximations of the mode given major-element compositions of the rock and its main mineral constituents.Both unconstrained and constrained solutions are produced (see Albarede 1995 and the help file). Mineral compositions are to be provided in a separate tab-delimited text file. The output includes computed modal proportions of the individual minerals, the calculated composition of the rock and residuals.

  • Calculations III.

  • Calculations III.

  • Calculations III.Various petrochemical indexes, such as:Total iron as Fe2O3Fe2O3/FeO, Na2O/K2O and K2O/Na2O ratiosDifferentiation index (Larsen 1938)Solidification index (Kuno 1959)

  • Calculations III.Saves the variable results , returned by most calculation algorithms, to a tab-delimited text (ASCII) file.

  • Calculations III.Appends the most recently calculated values (variable results) to the numeric data stored in memory, e.g. for plotting or statistical evaluation.

  • Calculations III.Copies the most recently calculated results to a clipboard.

  • Plugin Saturation.rZircon saturation: yields temperatures for the observed major-element and Zr contents. Returns also Zr saturation levels for the given major-element compositions and assumed magma temperature (Watson & Harrison, 1983).Monazite saturation computes saturation temperatures for given major-element compositions and LREE contents of the magma (Montel, 1993).Apatite saturation calculates saturation temperatures for observed whole-rock major-element compositions (including P2O5 contents). Returns also a phosphorus saturation levels for the given major-element compositions and assumed magma temperature (Harrison & Watson, 1984; Bea et al., 1992; Pichavant et al., 1992).(Menu Calculations)

  • Plugin html_tables.rBoth functions output the specified data with (optional) labels into HTML. This format is useful for importing into spreadsheets, word processors or publishing on the WWW.The plugin attempts to format sub- and superscripts in the variable names.The created file htmltable.html is placed in the subdirectory \R2HTM of the main GCDkit directory; when finished, it is previewed in a browser. The style for the table is determined by the cascade style file R2HTML.css in the subdirectory \Plugin. The plugin exploits R2HTML library by Eric Lecoutre, which must be downloaded from CRAN and properly installed. (Menu Calculations)(Menu Data handling)

  • Plugin html_tables.r