Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)
![Page 1: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/1.jpg)
![Page 2: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/2.jpg)
• Domestication bottleneck
• Utilization of genetic diversity
• Core collection subset selection
• Trait mining selection
• Computer modeling
• Example 1:
  • Nordic Barley Landraces (2005)
  • N-PLS regression (in MATLAB)
• Example 2:
  • Net blotch in barley (ICARDA, USDA)
  • Discriminant analysis (DA)
![Page 3: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/3.jpg)
corn, maize
wild tomato
tomato
teosinte
![Page 4: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/4.jpg)
Figure stages: Crop wild relatives, Traditional landraces, Modern cultivars (circles labeled A, B, C).
Genetic bottlenecks during crop domestication and modern plant breeding. The circles represent allelic variation. The funnels represent allelic variation of genes found in the crop wild relatives, but gradually lost during domestication, traditional cultivation and modern plant breeding.
![Page 5: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/5.jpg)
![Page 6: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/6.jpg)
• Scientists and plant breeders want a few hundred germplasm accessions to evaluate for a particular trait.
• How does the scientist select a small subset likely to have the useful trait?
• Example: More than 560 000 wheat accessions in genebanks worldwide.
Slide adapted from a slide by Ken Street, ICARDA (FIGS team)
![Page 7: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/7.jpg)
• The scientist or the breeder needs a smaller subset to cope with the field screening experiments.
• A common approach is to create a so-called core collection.
Sir Otto H. Frankel (1900-1998) proposed a limited set or "core collection" established from an existing collection with minimum similarity between its entries.
The core collection is of limited size and chosen to represent the genetic diversity of a large collection, a crop, a wild species or group of species (1984).
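The idea of a limited set with minimum similarity between its entries can be illustrated with a greedy max-min selection. This is only a sketch under stated assumptions, not the procedure used for any real core collection: each accession is represented by a hypothetical numeric descriptor vector, similarity is measured by Euclidean distance, and the function name `select_core` is invented for this example.

```python
import math


def select_core(points, k):
    """Greedy max-min selection of k entries that are mutually dissimilar.

    points: list of equal-length numeric tuples (hypothetical descriptors).
    Returns the indices of the selected entries.
    """
    if k <= 0 or not points:
        return []
    dims = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    # Start with the entry farthest from the centroid (a deterministic choice).
    chosen = [max(range(len(points)), key=lambda i: math.dist(points[i], centroid))]
    while len(chosen) < min(k, len(points)):
        remaining = [i for i in range(len(points)) if i not in chosen]
        # Add the entry whose nearest already-chosen neighbour is farthest away.
        best = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen
```

Near-duplicate entries are unlikely to be picked together, which is the point of a core collection. It also shows why a rare trait can be missed: the selection looks only at overall diversity, not at the trait of interest.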
![Page 8: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/8.jpg)
• Given that the trait property you are looking for is relatively rare:
• Perhaps as rare as a unique allele for one single landrace cultivar...
• Getting what you want is largely a question of LUCK!
Slide adapted from a slide by Ken Street, ICARDA (FIGS team)
![Page 9: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/9.jpg)
9
![Page 10: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/10.jpg)
Wild relatives are shaped by the environment
Primitive cultivated crops are shaped by local climate and humans
Traditional cultivated crops (landraces) are shaped by climate and humans
Modern cultivated crops are mostly shaped by humans (plant breeders)
Perhaps future crops are shaped in the molecular laboratory…?
![Page 11: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/11.jpg)
Objective of this study:
– Explore climate data as a prediction model for “pre-screening” of crop traits BEFORE full scale field trials.
– Identification of landraces with a higher probability of holding an interesting trait property.
![Page 12: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/12.jpg)
• Primitive crops and traditional landraces are an important source of novel traits for improvement of modern crops.
• Landraces are often not well described for the economically valuable traits.
• Identification of novel crop traits will often be the result of a larger field trial screening project (thousands of individual plants).
• Large scale field trials are very costly in both area and human working hours.
![Page 13: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/13.jpg)
The underlying assumption of FIGS selection is that the climate at the original source location, where the landrace was developed during long-term traditional cultivation, is correlated to the trait.
The aim is to build a computer model explaining the crop trait score (dependent variables) from the climate data (independent variables).
![Page 14: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/14.jpg)
1) Landrace samples (genebank seed accessions)
2) Trait observations (experimental design)
3) Climate data (for the landrace location of origin)
• The accession identifier (accession number) provides the bridge to the crop trait observations.
• The longitude, latitude coordinates for the original collecting site of the accessions (landraces) provide the bridge to the environmental data.
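The two bridges described above amount to two lookups: accession number to trait observation, and collecting-site coordinates to climate values. A minimal sketch, assuming plain dictionaries as inputs; the function name `link_records`, the record layout and all values in the usage example are hypothetical, and exact coordinate matching stands in for a real spatial lookup.

```python
def link_records(accessions, traits, climate):
    """Join the three data sources into one list of records.

    accessions: {accession_number: (latitude, longitude)}
    traits:     {accession_number: trait_score}
    climate:    {(latitude, longitude): [climate values]}
    Accessions missing a trait score or climate values are skipped.
    """
    rows = []
    for acc_num, coords in accessions.items():
        if acc_num in traits and coords in climate:
            rows.append(
                {
                    "accession": acc_num,
                    "trait": traits[acc_num],      # bridge 1: accession number
                    "climate": climate[coords],    # bridge 2: coordinates
                }
            )
    return rows


# Usage with invented trait and climate values (coordinates are illustrative).
example = link_records(
    {"NGB27": (61.0333, 27.3333)},
    {"NGB27": 55.0},
    {(61.0333, 27.3333): [-9.0, 16.0, 600.0]},
)
```

Accessions without coordinates drop out of the join, which is why georeferencing the collecting sites matters for this approach.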
![Page 15: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/15.jpg)
15
Lima, Peru
Benin
Alnarp, Sweden
Svalbard
![Page 16: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/16.jpg)
http://barley.ipk-gatersleben.de
Powdery mildew, Blumeria graminis
Leaf spots, Ascochyta sp.
Yellow rust, Puccinia striiformis
Black stem rust Puccinia graminis
Faba bean, Finland Field trials, Gatersleben, Germany
Forage crops, Dotnuva, Lithuania Radish (S. Jeppson)
Potato Priekuli Latvia
Linnés äpple
![Page 17: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/17.jpg)
The climate data is extracted from the WorldClim dataset. http://www.worldclim.org/
Data from weather stations worldwide are combined to a continuous surface layer.
Climate data for each landrace is extracted from this surface layer.
Precipitation: 20 590 stations
Temperature: 7 280 stations
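Extracting a value from a gridded surface at a collecting site comes down to converting latitude and longitude into a row and a column. A minimal sketch, assuming a regular latitude/longitude grid stored as a list of rows with the first row at the northern edge; the function names, grid origin and cell size are hypothetical and do not describe the actual WorldClim file layout.

```python
import math


def cell_index(lat, lon, top, left, cell_deg):
    """Return (row, col) of the grid cell containing the point (lat, lon)."""
    row = math.floor((top - lat) / cell_deg)   # rows count southwards from the top edge
    col = math.floor((lon - left) / cell_deg)  # columns count eastwards from the left edge
    return row, col


def extract_value(grid, lat, lon, top, left, cell_deg):
    """Look up the surface value for one collecting site."""
    row, col = cell_index(lat, lon, top, left, cell_deg)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("coordinates fall outside the grid")
    return grid[row][col]
```

Repeating the lookup for each monthly layer of each climate variable gives the climate profile for one landrace.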
![Page 18: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/18.jpg)
FIGS selection is a new method to predict crop traits of primitive cultivated material from climate variables by using multivariate statistical methods.
![Page 19: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/19.jpg)
Origin of Concept (1980s): Wheat and barley landraces from marine soils in the Mediterranean region provided genetic variation for boron toxicity.
What is
Slide made by Michael Mackay 1995
http://www.figstraitmine.org/
South Australia
Mediterranean region
![Page 20: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/20.jpg)
FIGS
The FIGS technology takes much of the guesswork out of choosing which accessions are most likely to contain the specific characteristics being sought by plant breeders to improve plant productivity across numerous challenging environments. http://www.figstraitmine.org/
![Page 21: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/21.jpg)
Slide made by Michael Mackay 1995
21
![Page 22: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/22.jpg)
• No sources of Sunn pest resistance previously found in hexaploid wheat.
• 2 000 accessions screened at ICARDA without result (during last 7 years).
• A FIGS set of 534 accessions was developed and screened (2007, 2008).
• 10 resistant accessions were found!
• The FIGS selection started from 16 000 landraces from VIR, ICARDA and AWCC
• Exclude origin CHN, PAK, IND where Sunn pest was only recently reported (6 328 acc).
• Only one accession per collecting site (2 830 acc).
• Excluding dry environments below 280 mm/year
• Excluding sites of low winter temperature below 10 degrees Celsius (1 502 acc)
Slide adapted from Ken Street, ICARDA (FIGS team)
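The filtering steps listed on this slide can be sketched as a chain of exclusion rules. This is an illustration only: the record fields (`country`, `site_id`, `annual_precip_mm`, `winter_temp_c`) and the function name `figs_filter` are hypothetical, and the thresholds are simply the values as they appear on the slide.

```python
def figs_filter(accessions,
                excluded_countries=("CHN", "PAK", "IND"),
                min_precip_mm=280,
                min_winter_temp_c=10):
    """Apply the slide's exclusion rules in order and return the kept records."""
    seen_sites = set()
    kept = []
    for acc in accessions:
        if acc["country"] in excluded_countries:
            continue  # origins where the pest was only recently reported
        if acc["site_id"] in seen_sites:
            continue  # keep only one accession per collecting site
        seen_sites.add(acc["site_id"])
        if acc["annual_precip_mm"] < min_precip_mm:
            continue  # exclude dry environments
        if acc["winter_temp_c"] < min_winter_temp_c:
            continue  # exclude sites with low winter temperature
        kept.append(acc)
    return kept
```

Each rule narrows the candidate set using only passport and climate data, so no field screening is needed until the final subset is chosen.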
![Page 23: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/23.jpg)
23
![Page 24: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/24.jpg)
– The initial model is developed from the training set
– Fine tuning of model parameters and settings
– No model can ever be absolutely correct
– A simulation model can only be an approximation
– A model is always created for a specific purpose
– The simulation model is applied to make predictions based on new fresh data
– Take care to avoid extrapolation problems
![Page 25: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/25.jpg)
– For the initial calibration or training step.
– Further calibration, tuning step
– Often cross-validation on the training set is used to reduce the consumption of raw data.
– For the model validation or goodness of fit testing.
– New external data, not used in the model calibration.
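The split into a training part, a tuning step by cross-validation, and an external test part can be sketched as follows. This is a generic illustration, not the partitioning used in the study; the function names, the 25% test fraction and the fixed random seed are assumptions.

```python
import random


def split_train_test(ids, test_fraction=0.25, seed=0):
    """Hold out a test set that is never used during calibration."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training, test)


def k_folds(train_ids, k):
    """Yield (fit, held_out) pairs for k-fold cross-validation on the training set."""
    for i in range(k):
        held_out = train_ids[i::k]
        fit = [x for j, x in enumerate(train_ids) if j % k != i]
        yield fit, held_out
```

Cross-validation reuses the training samples for tuning, so no separate calibration set has to be carved out of a small dataset.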
![Page 26: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/26.jpg)
26
![Page 27: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/27.jpg)
27
![Page 28: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/28.jpg)
28
![Page 29: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/29.jpg)
| Station | Altitude | Latitude | Longitude |
|---|---|---|---|
| Priekuli, Latvia | 83 m | 57.3167 | 25.3667 |
| Bjørke forsøksgård, Norway | 149 m | 60.7667 | 11.2167 |
| Landskrona, Sweden | 3 m | 55.8667 | 12.8333 |

Two years: 2002 and 2003
![Page 30: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/30.jpg)
| accide | AccNum | Country | Locality | Elevation | Latitude | Longitude | Coordinate source |
|---|---|---|---|---|---|---|---|
| 7436 | NGB27 | Finland | Sarkalahti, Luumäki | 95 m | 61.0333 | 27.3333 | SESTO |
| 9717 | NGB456 | Norway | Dønna, Nordland | 71 m | 66.1167 | 12.5 | Georeferenced |
| 9601 | NGB468 | Norway | Trysil | 400 m | 61.2833 | 12.2833 | Georeferenced |
| 9600 | NGB469 | Norway | BJØRNEBY | 400 m | 61.2833 | 12.2833 | Georeferenced |
| 7966 | NGB775 | Sweden | Överkalix, Allsån | 45 m | 66.4 | 22.9333 | SESTO |
| 8510 | NGB776 | Sweden | Överkalix | 100 m | 66.4 | 22.7667 | SESTO |
| 7810 | NGB792 | Finland | Luusua, Kemijärvi | 145 m | 66.4833 | 27.35 | SESTO |
| 9538 | NGB2072 | Norway | Finset | 1220 m | 60.6 | 7.5 | Georeferenced |
| 8482 | NGB2565 | Sweden | Öland | 11 m | 56.7333 | 16.6667 | Georeferenced |
| 9102 | NGB4641 | Denmark | Støvring, Jylland | 55 m | 56.8833 | 9.8333 | Georeferenced |
| 9015 | NGB4701 | Faroe Islands | Faroe Islands | 81 m | 62.0167 | -6.7667 | Georeferenced |
| 9039 | NGB6300 | Faroe Islands | Faroe Islands | 81 m | 62.0167 | -6.7667 | Georeferenced |
| 8531 | NGB9529 | Denmark | Lyderupgaard | 9 m | 56.5667 | 9.35 | Georeferenced |
| 7344 | NGB13458 | Finland | Koskenkylä, Rovaniemi | 91 m | 66.5167 | 25.8667 | Georeferenced |
![Page 31: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/31.jpg)
From a total of 19 landrace accessions included in the dataset, only 4 of the landrace accessions included geo-referenced coordinates in the NordGen SESTO database.
10 accessions were geo-referenced from the reported place name and descriptions of the original gathering site included in SESTO and other sources.
For 5 accessions there was not enough information available to locate the original gathering location.
Right side illustration: Example of georeferencing for NGB9529, landrace reported as originating from Lyderupgaard, using KRAK.dk and maps.google.com
![Page 32: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/32.jpg)
32
![Page 33: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/33.jpg)
Figure: 3-way climate data array of 14 samples (landraces, location of origin) × 12 monthly means (Jan, Feb, Mar, …) × 3 climate variables.
Climate data (mode 3):
• Minimum temperature
• Maximum temperature
• Precipitation
• … (many more layers can be added)
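The 3-way layout (samples × months × climate variables) and its unfolding into an ordinary 2-way table can be sketched with nested lists. The helper names `build_cube` and `unfold` are invented for this illustration, and this is not the MATLAB code used in the study.

```python
def build_cube(tables):
    """Stack per-variable tables (samples x months) into cube[sample][month][variable]."""
    n_samples = len(tables[0])
    n_months = len(tables[0][0])
    return [
        [[table[i][j] for table in tables] for j in range(n_months)]
        for i in range(n_samples)
    ]


def unfold(cube):
    """Unfold to samples x (variables * months), with one block of months per variable."""
    n_vars = len(cube[0][0])
    return [
        [row[j][v] for v in range(n_vars) for j in range(len(row))]
        for row in cube
    ]
```

With 14 landraces, 12 months and 3 variables, the cube is 14 × 12 × 3 and the unfolded table is 14 × 36.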
![Page 34: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/34.jpg)
Figure: 3-way trait data array of 28 records (14 landraces x2) × 6 traits × 6 trials.
Mode 2 (Traits)
* Heading days
* Ripening days
* Length of plant
* Harvest index
* Volumetric weight
* Grain weight (tgw)
Mode 3
* LVA 2002: Priekuli (Lv) 2002
* LVA 2003: Priekuli (Lv) 2003
* NOR 2002: Bjørke (N) 2002
* NOR 2003: Bjørke (N) 2003
* SWE 2002: Landskrona (S) 2002
* SWE 2003: Landskrona (S) 2003
![Page 35: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/35.jpg)
35
![Page 36: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/36.jpg)
36
![Page 37: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/37.jpg)
The mode 3 variables (tmin, tmax, and prec) have very different ranges of numerical values. Scaling across mode 3 is thus applied to the multi-way models.
The box plot on the left shows the 3-way data unfolded so as to keep the dimensions of mode 3.
The 3-way climate data was reasonably well described by a PARAFAC model of two components.
Figure panels: box plot of raw data (tmin, tmax, prec); box plot after scaling across mode 3 (tmin, tmax, prec).
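Scaling of the mode 3 variables can be sketched as dividing each climate variable by one common factor computed over all samples and months. The sketch below uses the root mean square as that factor, which is one common convention and an assumption here; the slides do not state which scaling factor was used, and the function name is invented.

```python
import math


def scale_within_mode3(cube):
    """Scale each mode 3 variable of cube[sample][month][variable] to unit RMS.

    Returns the scaled cube and the scale factor used for each variable.
    """
    n_vars = len(cube[0][0])
    scales = []
    for v in range(n_vars):
        values = [cell[v] for row in cube for cell in row]
        rms = math.sqrt(sum(x * x for x in values) / len(values))
        scales.append(rms if rms > 0 else 1.0)  # avoid dividing by zero
    scaled = [
        [[cell[v] / scales[v] for v in range(n_vars)] for cell in row]
        for row in cube
    ]
    return scaled, scales
```

After scaling, precipitation in millimetres no longer dominates temperatures in degrees simply because its numbers are larger.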
![Page 38: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/38.jpg)
PARAFAC split-half (mode 1) analysis:
The two PARAFAC models, each calibrated from one of two independent split-half subsets, both converge to a solution very similar to that of the model calibrated from the complete dataset.
The PARAFAC model is thus a general and stable model for the scope of Scandinavia.
![Page 39: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/39.jpg)
39
![Page 40: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/40.jpg)
• Often the critical level (α) for p-value significance is set at 0.05, 0.01 or 0.001.
• Modeling 14 samples (landraces) gives:
– 12 degrees of freedom for the correlation tests (mean x, y)
– One-tailed test (looking only at positive correlation of predictions versus the reference values).
– A coefficient of determination (r²) larger than 0.56 is significant at the 0.001 (0.1%) level for 14 values/samples.
Many introductory textbooks on statistics include a table of critical values for Pearson’s r.
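The 0.56 threshold can be checked from the relation between Pearson's r and the t statistic, t = r·sqrt(df) / sqrt(1 − r²), which rearranges to r = t / sqrt(t² + df). The sketch below assumes the one-tailed critical t value of 3.930 for α = 0.001 and 12 degrees of freedom, taken from a standard t-table; the function name is invented for this example.

```python
import math


def critical_r(t_crit, df):
    """Critical Pearson r corresponding to a critical t value with df degrees of freedom."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


# Assumed table value: one-tailed t for alpha = 0.001 with df = 14 - 2 = 12.
T_CRIT = 3.930
r_crit = critical_r(T_CRIT, 12)
r2_crit = r_crit ** 2  # threshold for the coefficient of determination
print(round(r_crit, 2), round(r2_crit, 2))
```

Under these assumptions the critical r is about 0.75, so r² is about 0.56, in line with the value quoted on the slide.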
![Page 41: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/41.jpg)
Figure legend: traits (Heading, Ripening, Length, H-Index, Vol wgt, TGW); locations (Priekuli (L), Bjorke (N), Landskrona (S)).
![Page 42: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/42.jpg)
42
LVA (2002)
NOR (2002)
LVA (2003)
NOR (2003)
SWE (2002)
SWE (2003)
![Page 43: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/43.jpg)
• Latvia 2002 (LY11)
– May 2002 was extremely dry in Priekuli.
– June 2002 was extremely wet in Priekuli.
– The wet June caused germination on the spikes for many of the early varieties.
• Landskrona 2003 (LY32)
– June 2003 was extremely dry in Landskrona.
– June was the time for grain filling here.
• Too extreme for the genotype to be “normally” expressed?
• Too large an effect from “G by E” interaction?
![Page 44: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/44.jpg)
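| Station | Year | Sowing week | Rainfall May (mm) | Rainfall June (mm) | Rainfall July (mm) | Rainfall August (mm) |
|---|---|---|---|---|---|---|
| Bjørke forsøksgård, Norway | 2002 | 17 | 82.9 | 67.4 | 128.5 | 136.5 |
| Bjørke forsøksgård, Norway | 2003 | 21 | 75.1 | 85.7 | 67.1 | 53.2 |
| Landskrona, Sweden | 2002 | 13 | 53.5 | 75.3 | 76.4 | 68.9 |
| Landskrona, Sweden | 2003 | 15 | 70.7 | 40.4 | 76.0 | 45.7 |
| Priekuli, Latvia | 2002 | 17 | 38.2 | 111.1 | 67.0 | 11.3 |
| Priekuli, Latvia | 2003 | 19 | 88.0 | 59.2 | 87.8 | 175.8 |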
![Page 45: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/45.jpg)
45
![Page 46: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/46.jpg)
46
![Page 47: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/47.jpg)
47
![Page 48: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/48.jpg)
• The first dataset I started to work with is a “FIGS” dataset with genebank accessions of barley (Hordeum vulgare ssp. vulgare) collected from different countries worldwide and tested for susceptibility to net blotch infection. Net blotch is a common disease of barley caused by the fungus Pyrenophora teres.
• The barley plants were inoculated with the fungus and the percentage of the leaves infected with the disease was normalized to an interval scale (1 to 9).
• 1-3 are basically resistant (group 1)
• 4-6 are intermediate (group 2)
• 7-9 are susceptible (group 3)
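The grouping of the 1 to 9 scores into three classes can be written as a small function. A minimal sketch, assuming integer scores; the function name `score_to_group` is invented for this example.

```python
def score_to_group(score):
    """Map a net blotch score (1-9) to group 1 (resistant), 2 (intermediate) or 3 (susceptible)."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    return (score - 1) // 3 + 1


groups = [score_to_group(s) for s in range(1, 10)]
# groups is [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

The three groups are the response classes that the discriminant analysis tries to predict.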
![Page 49: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/49.jpg)
• Agro-climatic Zone (UNESCO classification)
• Soil classification (FAO Soil map)
• Aridity (dryness)
• Precipitation
• Potential evapotranspiration (water loss)
• Temperature
• Maximum temperatures
• Minimum temperatures
(mean values for month and year)
![Page 50: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/50.jpg)
• The proportion of correctly classified samples for the training dataset was 45.9%, and we would expect a similar success rate for the prediction of the “blinded” values.
• Remember that random classification into three groups gives: 33.3%
• A test set of 9 samples showed a proportion of correct classifications of 44.4%
Discriminant Analysis: obs_nb versus acz_moisture; ...
Quadratic Method for Response: obs_nb
Predictors: acz_moisture; acz_winter_temp;
acz_summer_temp; arid_annual; pet_annual;
prec_annual; temp_annual; tmax_annual;
tmin_annual
Group 1 2 3
Count 1049 1190 234
Summary of classification
Put into Group 1 2 3
1 523 427 48
2 287 451 25
3 238 314 163
Total N 1048 1192 236
N correct 523 451 163
Proportion 0,499 0,378 0,691
N = 2476 N Correct = 1137
Proportion Correct = 0,459
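The summary figures in the classification table follow directly from the counts. The sketch below recomputes them from the 3 × 3 table above, with rows as the predicted group and columns as the observed group; the function name `summarize` is invented for this illustration.

```python
def summarize(confusion):
    """Return column totals, per-group proportion correct and overall proportion correct.

    confusion[i][j] = number of samples put into group i+1 whose true group is j+1.
    """
    n = len(confusion)
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    correct = [confusion[j][j] for j in range(n)]
    per_group = [c / t for c, t in zip(correct, col_totals)]
    overall = sum(correct) / sum(col_totals)
    return col_totals, per_group, overall


table = [
    [523, 427, 48],   # put into group 1
    [287, 451, 25],   # put into group 2
    [238, 314, 163],  # put into group 3
]
totals, per_group, overall = summarize(table)
print(totals, [round(p, 3) for p in per_group], round(overall, 3))
```

The overall proportion (1137 of 2476) is what gets compared with the 33.3% expected from random assignment to three groups.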
![Page 51: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/51.jpg)
51
Eddy De Pauw Climate data
Harold Bockelman Net blotch data
Ken Street FIGS project leader
Michael Mackay FIGS coordinator
Dag Endresen Data analysis
![Page 52: Trait data mining seminar at the Carlsberg research institute (CRI) (4 Nov 2009)](https://reader033.fdocuments.net/reader033/viewer/2022052522/554dde68b4c905cc0e8b528b/html5/thumbnails/52.jpg)
52