Potential of two-dimensional electrophoresis in routine identification of closely related durum...



174 P. Picard, M. Bourgoin-Grenèche and M. Zivy Electrophoresis 1997, 18, 174-181

Potential of two-dimensional electrophoresis in routine identification of closely related durum wheat lines

Philippe Picard¹, Mireille Bourgoin-Grenèche¹, Michel Zivy²

¹Groupe d'Etude et de Contrôle des Variétés et des Semences, Guyancourt, France
²Station de Génétique Végétale, CNRS-URA 2154/INRA/UPS/INA-PG, Gif-sur-Yvette, France

Four closely related durum wheat varieties were compared by computer-assisted analysis of two-dimensional electrophoretic maps of leaf proteins. A low inter-varietal polymorphism was revealed and seven reliable qualitatively varying proteins allowed rapid visual identification of genotypes. For numerous spots, presence/absence or quantitative variations were greatly affected by a batch effect. Several criteria that should be used to discard unreliable spots or gels a priori were reviewed. Nevertheless, it was shown that, provided that the experimental design allows the integration of the batch effect, screening for discriminant markers as well as computing distances based on protein quantity variations are possible and allow variety identification. Euclidean and Mahalanobis distances allowed variety discrimination and single gel classification with a minimum risk of error, not only by taking into account the quantitative variations in discriminant proteins selected by analysis of variance, but also by taking into account all reproducible spots. The possible applications of two-dimensional electrophoresis in variety identification are discussed.

1 Introduction

Because of common purposes of selection, genetic diversity of cultivated durum wheat (Triticum turgidum L.) has been strongly reduced and some cultivated varieties are phenotypically similar. Varieties are usually described according to 41 botanical, physiological and morphological criteria in France and the same criteria can be used for identification. As for most economically important crops, new biochemical and molecular tools were used in order to improve the identification process. Gliadin and glutenin storage proteins have been widely studied and the one-dimensional high molecular weight (HMW) glutenin and gliadin electrophoretic patterns are used as identification criteria for durum and bread wheat [1, 2]. However, because of the low genetic diversity of wheat varieties [3], these methods proved to be inefficient for durum wheat identification. For instance, in 1994, the most frequently occurring HMW glutenin allelic forms grouped together 80% of French cultivated varieties. The discrimination could be improved by combining gliadin and glutenin descriptions, or by using more resolving systems such as IPG [4] or 2-DE [5, 6]. RFLP and PCR studies increased the number of polymorphic markers but only with probes from storage protein genes [7, 8].

Considering the number of markers revealed, 2-DE of total proteins is the most powerful identification tool based on protein polymorphism. Several studies have shown that it allows the detection of genetic variations in structural genes encoding the expressed proteins [9], and of loci controlling their quantity [10]. Thus, compared to tools commonly used to assess relationships between genotypes such as RFLP and random amplified polymorphic DNA (RAPD), 2-DE has the advantage of revealing only variations related to expressed genes. In the present study, we tested the potential of this technique for variety identification in unfavorable conditions, using Triticum turgidum, a tetraploid species, and studying closely related varieties. With the long-term view of using results for genetic distance calculation, total protein was preferred to storage protein analysis. Only a few loci are involved in gliadin and glutenin synthesis [11, 12] and even if they often give rise to sufficient polymorphism, biochemical genetic distances computed accordingly are not correlated with pedigrees [13]. 2-DE of leaf total proteins reveals numerous proteins with different functions [14], and the genes encoding them are spread along the genome [10].

Correspondence: Dr. Mireille Bourgoin-Grenèche, G.E.V.E.S. laboratoire de biochimie, B.P. 52, 17700 Surgères, France (Tel: +33-5-4668-3031; Fax: +33-5-4668-3024)

Nonstandard abbreviations: ANOVA, analysis of variance; CV, coefficient of variation; 1-D, one-dimensional; Deucli, Euclidean distance; Dmaha, Mahalanobis distance; PCA, principal component analysis; UPGMA, unweighted pair-group method using arithmetic averages

Keywords: Durum wheat / Two-dimensional polyacrylamide gel electrophoresis / Varietal identification

2 Materials and methods

2.1 Plant material

Four durum wheat varieties (Cando, Regal, Capdur, Arcour) were provided by G.E.V.E.S (Groupe d'Etude et de Contrôle des Variétés et des Semences, Surgères). The varieties were closely related, in that Arcour and Capdur are half-sister lines and Cando, Regal, and Capdur have a common line in their parental pedigree. These four varieties were chosen according to their one-dimensional (1-D) electrophoretic glutenin components; Cando and Arcour are therefore similar to Regal and Capdur, respectively.

2.2 2-DE protocol

Seedlings were allowed to germinate in Petri dishes on water-imbibed filter paper in the dark at 25°C for 7-8 days. The first etiolated leaf was collected without coleoptiles once they were 10-13 cm high. Total proteins were extracted as in Damerval et al. [15] with a slightly simplified procedure. All extraction steps, from collecting to protein resolubilization, were performed in a single 4.5 mL NUNC tube. All centrifugations were carried out at 4500 g for 5 min at room temperature in horizontal centrifuge baskets. After precipitation, proteins were rinsed twice (1 h and 30 min) and resuspended in 50 µL of UKS buffer (urea-potassium-mercaptoethanol) per mg of dry pellet. We used the ISO-DALT system for analysis [16]. Each run represented a set of up to 20 2-DE gels (240 × 200 × 1.5 mm) from the same IEF and SDS-PAGE runs. IEF was done according to Leonardi et al. [17], except that piperazine diacrylamide (PDA) replaced Bis, in the same proportions [18]. IEF was performed for 35 000 Vh. SDS-PAGE was run as in Damerval et al. [19], except that the IEF gels were not equilibrated, and 2-D gels were not supported by Gel-Bond polyester films. Colloidal Coomassie Brilliant Blue staining was according to Neuhoff et al. [20], with only one rinse in phosphoric acid for 1 h, and methanol was replaced by ethanol. Gels were allowed to stain in hermetic plastic boxes, with gentle shaking for 72 h.

© VCH Verlagsgesellschaft mbH, 69451 Weinheim, 1997 0173-0835/97/0101-0174 $10.00+.25/0

2.3 Automatic analysis of 2-D gels

The Kepler 2-D analysis package (LSB, Rockville, MD, USA) was used. Gels were digitized in 2200 × 2200 pixel images, using an Eikonix 1412 scanner with a spatial resolution of 100 µm per pixel. Optical density was translated to 256 grey levels. After background subtraction, spots were detected according to the method of Zivy (in preparation) and quantified; the quantitative measurement of spot intensity was computed as the volume of a 2-D Gaussian model fitted onto the image resulting from background subtraction. The synthetic "master gel" was built from coelectrophoresis with a mixture of the four genotypes. Before quantitative analysis, spot volumes were scaled by multiplying them by a coefficient computed for each gel: the sum of the volumes of 181 spots in the master gel divided by the sum of the volumes of the same spots in the analytical gel. Statistical analyses were carried out using the SAS package software [21] from data computed with the Kepler software.
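The scaling step can be illustrated with a minimal Python sketch. Only the rule itself (coefficient = summed master-gel volumes divided by summed analytical-gel volumes over the reference spots) comes from the text; the function names, spot identifiers and volumes below are hypothetical and are not data from this study.

```python
def scaling_coefficient(master, gel, reference_spots):
    """Sum of master-gel volumes divided by sum of analytical-gel volumes."""
    master_total = sum(master[s] for s in reference_spots)
    gel_total = sum(gel[s] for s in reference_spots)
    return master_total / gel_total


def scale_gel(master, gel, reference_spots):
    """Multiply every spot volume of one gel by its scaling coefficient."""
    k = scaling_coefficient(master, gel, reference_spots)
    return {spot: volume * k for spot, volume in gel.items()}


# Hypothetical volumes (arbitrary units), keyed by spot identifier.
master_gel = {"s1": 100.0, "s2": 300.0, "s3": 50.0}
analytical_gel = {"s1": 40.0, "s2": 160.0, "s3": 30.0}
reference = ["s1", "s2"]  # stands in for the 181 reference spots

scaled = scale_gel(master_gel, analytical_gel, reference)
print(scaled)
```

After scaling, the reference spots of the analytical gel sum to the same total as in the master gel, which makes volumes comparable between gels that carry different protein loads.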

2.4 Experimental design

Batches A8, B34, and B35 contained gels from the four varieties Cando, Regal, Arcour and Capdur; batches A7 and A10 contained gels from Cando and Regal; batches A3 and A9 contained gels from Capdur only. Except for the A9 gels, which were run from the same extract, every gel was obtained from a different seedling (Table 1). The seven batches were obtained over a period of three years, each one consisting of a set of 16-20 gels run and stained simultaneously.

2.5 Computation of distances

Euclidean (Deucli) and Mahalanobis (Dmaha) distances were computed, once the variation due to the batch effect (estimated by analysis of variance for each spot) was subtracted from initial data. In addition, before Deucli computation, volumes were first standardized in each batch and spots were weighted by the inverse of their respective coefficients of variation (CVs). For the computation of Dmaha, when the number of spots taken into account was higher than the number of gels, a principal component analysis (PCA) was first carried out, and distances were computed on the n-1 first principal components of the PCA (n standing for the number of groups studied) [22]. Dendrograms were built from the distance matrices according to the group average unweighted pair-group method using arithmetic averages (UPGMA).

Table 1. Number of gels produced for each variety in seven 2-DE batches

Date    Batch   Cando   Regal   Capdur  Arcour
12/93   B35     4       5       5       4
12/93   B34     5       4       4       5
04/91   A8      3       5       5       3
02/92   A10     8       9       -       -
04/91   A7      5       7       -       -
02/91   A3      -       -       8       -
09/91   A9 a)   -       -       9       -

a) Nine gels produced from the same protein extract
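As a rough illustration of the distance computation described in Section 2.5, the following Python sketch removes a batch effect and computes Euclidean distances between variety/batch group means. It assumes a balanced design, so that the batch effect of a spot is simply its batch mean minus its grand mean; the ANOVA estimate, the standardization and CV weighting, and the Mahalanobis variant used in the study are not reproduced. All names and volumes are hypothetical.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def remove_batch_effect(gels):
    """Subtract, for each spot, the batch mean minus the grand mean.

    gels: list of (batch, variety, volumes) tuples.
    """
    n_spots = len(gels[0][2])
    grand = [mean(g[2][j] for g in gels) for j in range(n_spots)]
    by_batch = defaultdict(list)
    for batch, _, volumes in gels:
        by_batch[batch].append(volumes)
    effect = {
        batch: [mean(v[j] for v in rows) - grand[j] for j in range(n_spots)]
        for batch, rows in by_batch.items()
    }
    return [
        (batch, variety, [x - e for x, e in zip(volumes, effect[batch])])
        for batch, variety, volumes in gels
    ]


def group_means(gels):
    """Mean spot volumes of every (variety, batch) group."""
    groups = defaultdict(list)
    for batch, variety, volumes in gels:
        groups[(variety, batch)].append(volumes)
    return {key: [mean(col) for col in zip(*rows)] for key, rows in groups.items()}


# Hypothetical data: two spots, batch "B34" is shifted by +10 on both spots.
gels = [
    ("A8", "Cando", [10.0, 20.0]),
    ("A8", "Regal", [14.0, 20.0]),
    ("B34", "Cando", [20.0, 30.0]),
    ("B34", "Regal", [24.0, 30.0]),
]
means = group_means(remove_batch_effect(gels))
same_variety = dist(means[("Cando", "A8")], means[("Cando", "B34")])
other_variety = dist(means[("Cando", "A8")], means[("Regal", "A8")])
print(same_variety, other_variety)
```

In this toy case the batch shift disappears after correction, so the two Cando groups coincide while Cando and Regal remain separated; a distance matrix built this way could then be passed to any UPGMA implementation.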

3 Results

3.1 Reproducibility between independent 2-D runs

As the A8, B34, and B35 batches included gels of the four varieties and because the master gel originated from another batch, we could compare the number of matched spots without bias (Table 2). The number of spots identified per gel in batches A8, B34, and B35 did not differ significantly (P < 0.05). On average, the proportion of the most reliable spots according to the presence-absence criterion, i.e. the spots present in every gel of each batch, was 21.6%, whereas the proportion of unreliable spots, present in only one gel, was 3.5%. More than two years separated A8 from B34 and B35; however, this had no noticeable effect on the number of matched spots. Variations in matched spot numbers depended on an extract effect to some extent: in batch A9, where every gel was issued from one Capdur extract, the proportion of spots present in every gel was 73% higher than in batch A3, where extracts corresponded to different individuals from the same variety. Although the same genotypes were used in batches A8, B34, and B35, few spots were found to be batch-specific: four spots present in A8 gels were absent from batch B35 while 27 and 28 spots, present in B34 and B35, were absent from batch A8.

Spot volume comparisons were done on 181 spots which did not exhibit more than one missing data point for each genotype in the three batches (A8, B34, and B35), i.e., spots present in at least n-1 gels in each of the twelve groups defined by a variety/batch combination (n being the number of gels per group). Volume stabilities estimated by their CV appeared similar for the A3, A8, B34, and B35 batches (from 27% to 30.2%, Table 2) and shifted slightly to lower values for the A9 batch (22.6%). Compared to A3, the lower median CV of A9, whose gels originated from the same extract, reflected the extraction influence upon volume stability.
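The selection rule and the stability measure above can be sketched in a few lines of Python. This is only an illustration under assumptions: missing spots are encoded as `None`, the CV is taken as the sample standard deviation divided by the mean, and the function names and volumes are hypothetical. How the study pooled varieties when computing a CV per batch is not stated, so that detail is left out.

```python
from statistics import mean, median, stdev


def is_reproducible(groups):
    """True when no variety/batch group has more than one missing value."""
    return all(sum(v is None for v in group) <= 1 for group in groups)


def coefficient_of_variation(volumes):
    """Sample standard deviation divided by the mean, ignoring missing values."""
    present = [v for v in volumes if v is not None]
    return stdev(present) / mean(present)


def median_cv(spots):
    """Median CV over spots; each value is the list of volumes of one spot."""
    return median(coefficient_of_variation(v) for v in spots.values())


# One spot, two groups of three gels each (None = spot not detected).
kept = is_reproducible([[1.0, None, 2.0], [3.0, 4.0, 5.0]])
dropped = is_reproducible([[None, None, 2.0], [3.0, 4.0, 5.0]])

# Hypothetical volumes of three spots within one group of gels.
spots = {
    "s1": [8.0, 10.0, 12.0],
    "s2": [9.0, 10.0, 11.0],
    "s3": [5.0, 10.0, 15.0],
}
print(kept, dropped, median_cv(spots))
```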


Table 2. Volume stability and number of matched spots in three batches

Batch   Total number   Mean number of       Spots present in   Spots present in    Median
        of spots       spots per gel a)     every gel b)       one gel only b)     CV c)
A8      732            494                  22.8%              3.4%                27.0%
B34     820            521                  22.1%              3.1%                30.2%
B35     790            528                  20.2%              2.8%                29.8%

a) See Table 1 for number of gels per batch
b) Expressed as percentage of the total number of matched spots in each batch
c) Computed on the volumes of 181 reproducible spots

3.2 Screening for discriminant markers: qualitative and quantitative polymorphism between the varieties

Screening for qualitative variations between the four varieties was performed in three steps. First, an intra-batch screening using A8, B34, and B35 was carried out with the criterion of 100% presence-absence. In total, 36 spots from the whole 2-DE patterns were selected. Second, the three batches were merged and the 36 putative qualitative markers were re-examined on every gel. Afterwards, batches A3, A7, and A10 were integrated into the experiment. Only two spots confirmed the 100% presence-absence criterion in the seven batches. When one unexpected absence or presence was tolerated per genotype, seven spots out of the 36 first selected were retained (Fig. 1). As the number of possible technical problems (in 2-D gel or in image processing) usually increases with the number of studied gels (81 gels in this study), we considered this level of inconsistency as tolerable.
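The presence-absence screening with a tolerance of one unexpected record per genotype can be sketched as follows. This is an assumed, simplified reading of the criterion: the function names are hypothetical, the detection records are invented, and the actual screening was done across several batches rather than on one pooled list per variety.

```python
def call_genotype(records, tolerance):
    """Return 'present', 'absent' or None for one variety.

    records: list of booleans, one per gel (True = spot detected).
    tolerance: number of unexpected records allowed for this variety.
    """
    n_present = sum(records)
    n_absent = len(records) - n_present
    if n_absent <= tolerance and n_present > n_absent:
        return "present"
    if n_present <= tolerance and n_absent > n_present:
        return "absent"
    return None


def qualitative_marker(spot, tolerance=0):
    """Return per-variety calls if the spot discriminates varieties, else None."""
    calls = {variety: call_genotype(r, tolerance) for variety, r in spot.items()}
    if None in calls.values() or len(set(calls.values())) < 2:
        return None
    return calls


# Hypothetical detection records for one spot.
spot = {
    "Cando": [True, True, True, True, True],
    "Regal": [False, False, False, True],  # one unexpected presence
    "Capdur": [True, True, True, True],
    "Arcour": [False, False, False, False],
}
strict = qualitative_marker(spot, tolerance=0)   # 100% presence-absence
relaxed = qualitative_marker(spot, tolerance=1)  # one inconsistency allowed
print(strict, relaxed)
```

With the strict criterion the spot is rejected because of the single unexpected presence in Regal; with a tolerance of one it is retained as a marker separating Cando and Capdur from Regal and Arcour.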

Possible causes for the inconsistent records on 29 (36-7) spots were examined. The three criteria used were (i) spot location: the absence of a spot could be explained by its location in streaking zones or on gel edges; (ii) local protein charge: a faint spot could disappear in gels containing less protein than others. To take into account possible zone-to-zone variations, local coefficients of protein charge were computed. According to this criterion, the abnormal absence of a spot was explained when the local charge was smaller than the smallest charge of gels possessing the spot, and conversely the abnormal presence of a spot was considered as explained when the local charge was higher than the highest charge of gels missing this spot; (iii) spot volume stability: abnormal presence/absence data were considered as possible when the 90% confidence interval of the spot volume included zero. These three criteria allowed us to explain abnormal records for 14 of the 29 spots studied. Thus, 21 qualitative markers (14 + 7) could be scored to identify the four lines (Table 3). Some of the 15 unreliable spots discarded exhibited batch-specific behavior and could lead to identification mistakes when screening is based on a single batch analysis. For instance, spot 1072 was present in Regal and absent from Arcour in batch B35, but showed the opposite pattern in A8.

Quantitative polymorphism between varieties was revealed by two-way analysis of variance carried out on the 181 spots in the three batches, A8, B34, and B35. Eighty-three, 38, and 10 spots showed a significant effect (P < 0.01) for the factors "batch", "variety" and the interaction "batch × variety", respectively. We kept the 35 spots showing a variety effect but no significant interaction effect. Ten additional spots, with more missing data than in the initial 181 spot file, showed a highly significant variety effect and were consequently added to the discriminant marker list. The Student-Newman-Keuls multiple range test was then performed on those 45 spots (Table 3). According to the numbers of spots significantly different in the pairwise comparison, Capdur and Cando appeared quite close and Arcour seemed to be the variety farthest from the three others.

Table 3. Qualitatively and quantitatively varying spot numbers discriminating the four genotypes (listed above and under the diagonal, respectively)

          Cando   Regal   Capdur  Arcour
Cando     -       5       9       16
Regal     23      -       8       14
Capdur    14      15      -       13
Arcour    19      23      30      -

Single factor analyses of variance (ANOVAs) were also carried out separately on batches A8, B34, and B35 for the 181 reproducible spots. Only nine of the spots found to be significant in the 2-factor ANOVA were also significant (P < 0.05) in the three batches (Fig. 1) and only four of them could be detected by visual inspection. Nevertheless, it should be noted that batch A8 was responsible for 75% of the significant "variety × batch" interactions in the two-way ANOVAs. The number of replicates n necessary to detect a given difference between genotypes by analysis of variance can be related to spot CVs and fixed type I and type II risks as follows [23]:

$$ n = \frac{2\,(u_{1-\alpha/2} + u_{1-\beta})^{2}\,CV^{2}}{\delta^{2}} \qquad (1) $$

δ being the minimum detectable volume difference between the four genotypes, expressed as a percentage of the general average, and u being the values associated with the α and β probabilities, found in the Student t-test table. Accordingly, the probability of finding a 50% δ difference at P < 0.01 was 75.5% on average in this experiment; the mean δ between the genotype showing the highest value and the genotype with the lowest value was 49% for spots showing a significant genotype effect at P < 0.01.
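To show how Eq. (1) relates replicates, CV and detection probability, here is a small Python sketch. It rests on assumptions that should be stated plainly: standard normal quantiles replace the Student t-table values mentioned in the text, the two-sided form of the type I risk is assumed, and the CV of 30% is an illustrative value close to the median CVs of Table 2. The output is therefore not expected to reproduce the 75.5% reported above, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal, used here instead of Student t quantiles


def replicates_needed(cv, delta, alpha, power):
    """Replicates per genotype from Eq. (1); cv and delta are fractions."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_beta = _Z.inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * cv ** 2 / delta ** 2


def detection_power(n, cv, delta, alpha):
    """Probability of detecting delta with n replicates (Eq. (1) inverted)."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    return _Z.cdf(sqrt(n * delta ** 2 / (2 * cv ** 2)) - z_alpha)


# Illustrative values: CV = 30%, delta = 50% of the mean, alpha = 0.01.
n = replicates_needed(cv=0.30, delta=0.50, alpha=0.01, power=0.755)
print(round(n, 2), round(detection_power(n, 0.30, 0.50, 0.01), 3))
```

Because n grows with the square of CV/δ, halving the detectable difference or doubling the spot CV multiplies the number of gels needed by four, which is why small quantitative differences are hard to confirm within a single batch.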

3.3 Quantitative distances

Dmaha and Deucli mean distances were computed between each variety in each batch (variety/batch). The spots were selected for these computations by using the same criterion as for ANOVAs: 154 spots met this criterion when gels from batches A7, A8, A10, B34, and B35 were simultaneously considered. When the 154 spots were taken into account, the dendrograms showed clusters grouping variety/batches according to the variety. In the dendrogram built according to Dmaha, only one abnormal clustering was found: Capdur/A8 was outside of the Capdur cluster (Fig. 2); in the dendrogram from Deucli, Capdur/A8 and Cando/A8 were not in their respective variety clusters (Fig. 3). When distances were based on spot volumes showing significant genotype effect (P < 0.01) in the A7/A10/A8/B34/B35 experiment (32 spots), the dendrogram built from Dmaha correctly clustered all variety/batch in their respective variety clusters, and the dendrogram from Deucli showed only one exception, for Arcour/A8 gels (not shown). Whichever distance index was used, the average within-variety distance was always smaller than the average between-varieties distance (Table 4). However, depending on the number of spots selected for computation, Euclidean and Mahalanobis distances did not show the same potential to discriminate varieties.

Figure 1. Coelectrophoresis of the four varieties. Black and white arrows show most reliable qualitatively and quantitatively varying markers, respectively. Molecular weights are indicated on the left of the pattern.

3.4 Validation of quantitative distances: single gel classification

Gels were classified by using the "candisc" procedure of SAS based on Dmaha, and cross validations were then done: a discriminant analysis based on Dmaha was carried out on every gel minus one. The one excluded from the computation was then considered as an extra-individual and classified without indication concerning its original variety. Following this procedure, every gel was tested one at a time and the number of cases where the unknown gel was classified with the right variety was counted. In order to assess the influence of the spot set and of the batch on the classification efficiency, three analyses were performed. When the 154 spots were taken into account for Dmaha computation, 95% of the gels were correctly identified. When the 32 significant spots were taken into account, the percentage reached 99%, i.e. one classification mistake out of 81 gels. When the Dmaha was based on significant spots not selected in the whole experiment but in batches B34/B35 only, the percentage of correct identification among the 81 gels was 92.5%.

[Figure 2. Dendrogram of the variety/batch groups according to Dmaha; image and caption not recovered.]
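The leave-one-out procedure can be sketched in plain Python. This is not the SAS "candisc" procedure: as a stated simplification, each held-out gel is assigned to the variety whose centroid is nearest in Mahalanobis distance, using a pooled within-variety covariance matrix estimated from the remaining gels. The helper names and the two-spot data set are hypothetical, and the covariance matrix is assumed to be non-singular.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups):
    """Pooled within-group covariance matrix of the spot volumes."""
    p = len(next(iter(groups.values()))[0])
    cov = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups.values():
        c = centroid(rows)
        for row in rows:
            d = [x - mu for x, mu in zip(row, c)]
            for i in range(p):
                for j in range(p):
                    cov[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in cov]


def mahalanobis_sq(x, c, cov):
    """Squared Mahalanobis distance d' S^-1 d between x and centroid c."""
    d = [a - b for a, b in zip(x, c)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))


def classify(x, groups):
    cov = pooled_covariance(groups)
    return min(groups, key=lambda g: mahalanobis_sq(x, centroid(groups[g]), cov))


def leave_one_out(labelled):
    """Fraction of gels assigned to their own variety when held out."""
    hits = 0
    for i, (label, x) in enumerate(labelled):
        groups = {}
        for j, (lab, volumes) in enumerate(labelled):
            if j != i:
                groups.setdefault(lab, []).append(volumes)
        hits += classify(x, groups) == label
    return hits / len(labelled)


# Hypothetical batch-corrected volumes of two spots for eight gels.
gels = [
    ("Cando", [10.0, 5.0]), ("Cando", [11.0, 6.0]),
    ("Cando", [9.0, 6.0]), ("Cando", [10.0, 7.0]),
    ("Regal", [20.0, 5.0]), ("Regal", [21.0, 7.0]),
    ("Regal", [19.0, 6.0]), ("Regal", [20.0, 6.0]),
]
accuracy = leave_one_out(gels)
print(accuracy)
```

With more spots than gels the pooled covariance matrix becomes singular, which is the situation the text handles by first projecting the data onto principal components.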

4 Discussion

The first aim of this study was to test the possibility of finding differences between inbred varieties of durum wheat cultivated in France, knowing that the genetic diversity used for breeding in this species is limited. The four studied varieties consisted of two pairs of genotypes showing no difference in their official 1-D glutenin patterns, with even those showing different patterns sharing common ancestor varieties, as is the case for most commercialized varieties. The first approach was to look for spots showing a qualitative variation (presence/absence) between lines. Four varieties were compared in three different batches, and qualitative variations were traced separately in each of them. The consistency of these variations was then tested by comparing the three lists of selected spots, and by looking at their behavior in two additional batches. Seven spots showed a consistent presence/absence variation, and they were sufficient to discriminate the four genotypes. Nevertheless, the number of selected spots in the three original lists was between 13 and 22. This discrepancy was directly related to the fact that most spots initially selected were faint spots. Indeed, the search for qualitative variations on small spots is complicated by the following two problems. (i) Detection threshold: it is possible that in some gels from a genotype actually having the spot, the latter is not detected because its intensity is below the detection threshold. (ii) It is likely that some of the variations scored as qualitative are actually quantitative, the small quantity being scored as an absence. In this case, the spot can show a qualitative variation in some batches and be sporadically present in other batches. This behavior makes it complicated to study their variation in terms of presence/absence, but also in terms of quantitative variation, because the spot is too often absent to be studied by analysis of variance. Criteria related to protein loading, the position of the spot in the gel, its intensity and its coefficient of variation, were used to try to eliminate these spots a priori from the initial sets of putative qualitatively varying spots. These criteria successfully eliminated half of them. Note that some of the remaining inconsistencies could be due to residual genetic heterogeneity within the varieties (e.g. for spot 1072). Whatever the source of inconsistencies, it is clear from these data that more than one batch is needed to ascertain the qualitative variation of spots in durum wheat. This statement is also relevant to quantitative data: the mean intensity difference between the genotype having the highest and the lowest value was 49% of the general mean for the 45 spots showing a significant (P < 0.01) variety effect. Only four could be clearly observed by visual inspection of the gels. Given this small amplitude of variation, it is not surprising that most of them could not be identified by analyzing each batch separately.

Figure 3. Clustering of the 16 variety/batch groups according to Deucli (square roots) and UPGMA method. Distances were computed from 154 spot volumes (see Fig. 2 for abbreviations).

By analysis of variance, a significant batch effect was found for most spots, and a batch × genotype interaction for a few of them. A posteriori, the results from the A8 batch were less reliable than those from other batches: in addition to the fact that it was the batch responsible for most of the genotype × batch interaction, its patterns were less well classified than those of other batches (see Figs. 2, 3). It is the only batch in which one or two genotypes were not in the right cluster, and the other genotypes from this batch were the most distant from their respective clusters. Because discarding this batch would have improved the results of our experiments, quality control criteria that would have allowed us to discard it a priori were examined. Patterns from this batch were not different from those of other batches according to the number of detected spots, total spot intensities or coefficients of variation, and no striking difference was found by visual inspection of the gels. No particular effect of this A8 batch was detected by principal component analysis (data not shown). Finally the only criterion that allowed us to discriminate A8 from the other batches was the spot resolution along the IEF axis: in the A8 patterns, spots were spread along a larger distance than in all other patterns. This might be due to an abnormal stretching of the focusing gel during loading on the SDS gels. Probably because of this, spot intensity estimation was of poorer quality in these gels than in the others, as confirmed by a parameter provided by the Kepler package. Thus, strict quality control should be run on each batch before genetic analysis, including a control on the geometrical parameters of 2-DE gels. Using immobilized pH gradient strips rather than rod gels for IEF would be an alternative solution, which would improve the reproducibility of IEF [4] but would also increase the cost of 2-DE analysis.

The distance between two varieties can be computed as the number of spots significantly different between them [22]. An advantage of this distance is that it is easy to test whether two genotypes are significantly different or not. As 181 spots were tested, the number of differences expected randomly is 9 at P < 0.05. Pairwise compari- sons were done using the Student-Newman-Keul test for the spots showing a variety effect. The minimum number of significant differences was 14 (Capdur/Cando, see Table 3). Thus it can be concluded that according to this criterion the four varieties are different. However, this method depends on the significance threshold used: for example, two genotypes could be considered as not different because of the absence of spots significantly dif- ferent at P < 0.01, while several spots would be different at P < 0.05. Thus we found it interesting to compare the varieties according to distances, taking into account the continuous variation of spots. Deucli and Dmaha dis- tances were computed on the basis of the spots showing a genotype effect and no batch X genotype interaction. Quantitative variations allowed good discrimination between genotypes, except for some gels from A8 (see above discussion). The classification was robust since we showed that genotypes from batches not taken into


Table 4. Average within- and between-genotype distances a)

Computation type   Spot set used   Between-genotype distances b)   Within-genotype distances c)
Euclidean          34 spots        121.22                          61.31
Euclidean          154 spots       321.84                          214.62
Mahalanobis        34 spots        45.94                           29.74
Mahalanobis        154 spots       34.22                           9.71

a) Deucli versus Dmaha computed on 32 quantitatively varying spot volumes or on 154 reproducible spot volumes
b) Mean values of the 6 distances computed for each genotype pairwise comparison
c) Mean values of the 4 distances computed for each genotype

account for the selection of spots could also be well classified. Cross-validation studies also showed that even individual 2-D gels could be well classified, with a small percentage of errors (1.2%). This result shows that, when a database is consulted, a single gel could be sufficient for classification. However, it must be pointed out that (i) gels from reference genotypes should be run in the same batch to allow the computation of the batch effect, and (ii) the right genotype should exist in the database, since this method can only assign the gel to pre-defined groups.
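The classification of single gels described above can be sketched as a nearest-centroid rule under either distance, evaluated by leave-one-out cross-validation. This is a minimal sketch under stated assumptions, not the computation actually performed in the study: the data are synthetic, the genotype labels G1 to G4 and all function names are hypothetical, the Mahalanobis distance uses a pooled within-genotype covariance matrix (inverted with a pseudo-inverse), and no batch-effect correction is modelled.

```python
import numpy as np

def centroids(X, labels):
    """Mean spot-volume vector of each genotype."""
    names = sorted(set(labels))
    return names, np.array([X[labels == n].mean(axis=0) for n in names])

def pooled_within_cov(X, labels):
    """Within-genotype covariance matrix pooled over genotypes."""
    names = sorted(set(labels))
    resid = np.vstack([X[labels == n] - X[labels == n].mean(axis=0)
                       for n in names])
    return resid.T @ resid / (len(X) - len(names))

def classify(x, X, labels, metric="euclidean"):
    """Assign gel `x` to the genotype with the nearest centroid."""
    names, C = centroids(X, labels)
    diff = C - x
    if metric == "mahalanobis":
        # Pseudo-inverse keeps the sketch usable if the matrix is singular.
        VI = np.linalg.pinv(pooled_within_cov(X, labels))
        d2 = np.einsum("ij,jk,ik->i", diff, VI, diff)
    else:
        d2 = (diff ** 2).sum(axis=1)
    return names[int(np.argmin(d2))]

def loo_error(X, labels, metric):
    """Leave-one-out error rate: each gel is classified from the others."""
    wrong = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        if classify(X[i], X[keep], labels[keep], metric) != labels[i]:
            wrong += 1
    return wrong / len(X)

# Synthetic data (assumed): 4 genotypes x 8 gels, 5 spot volumes per gel.
rng = np.random.default_rng(0)
means = np.zeros((4, 5))
means[1, 0] = means[2, 1] = means[3, 2] = 12.0
labels = np.repeat(["G1", "G2", "G3", "G4"], 8)
X = means[np.repeat(np.arange(4), 8)] + rng.normal(0.0, 1.0, size=(32, 5))

for metric in ("euclidean", "mahalanobis"):
    print(metric, loo_error(X, labels, metric))
```

Because the rule always returns one of the genotypes present in `labels`, a gel from a genotype absent from the reference set is still assigned to some group, which is limitation (ii) above.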

The use of a set of spots showing a significant genotype effect allowed good discrimination. However, these spots could be inappropriate for the classification of other genotypes; it cannot be excluded that differences with new genotypes exist only for other spots. Taking this into consideration, we computed Deucli and Dmaha distances between groups based on all spots found reproducible in all batches, i.e. without including any a priori information. With this full spot set, Dmaha appeared more suitable than Deucli for the differentiation of genotypes: relative to the within-genotype distance, the between-genotype Dmaha was greater than when only the spots showing a significant genotype effect were used, which was not the case for Deucli (see Table 4). The phenogram also showed good discrimination (see Fig. 2), and the cross-validation error rate for the classification of individual gels remained low (5%).
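The comparison "relative to within-genotype distance" can be made explicit by dividing the between-genotype mean by the within-genotype mean for each row of Table 4. The short sketch below does only this arithmetic on the published values; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# Table 4 values: (between-genotype mean, within-genotype mean)
table4 = {
    ("Euclidean", 34): (121.22, 61.31),
    ("Euclidean", 154): (321.84, 214.62),
    ("Mahalanobis", 34): (45.94, 29.74),
    ("Mahalanobis", 154): (34.22, 9.71),
}

# Between/within ratio: larger values mean better separated genotypes.
ratios = {key: round(between / within, 2)
          for key, (between, within) in table4.items()}

for (distance, n_spots), ratio in ratios.items():
    print(f"{distance:12s} {n_spots:4d} spots  between/within = {ratio}")
```

With all 154 reproducible spots the ratio for Dmaha rises from about 1.5 to about 3.5, whereas for Deucli it falls from about 2.0 to about 1.5.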

We have shown that, despite the low genetic variability in cultivated varieties of durum wheat, 2-DE allows them to be distinguished. Qualitative and quantitative differences were found between varieties showing no difference in their 1-D glutenin patterns. The small number of qualitative variations may be due to a narrow genetic diversity, but it may also be related to tetraploidy: it has been shown that most spots in this species are actually double-dose spots corresponding to the overlapping of homoeoallelic products [25]. Contrary to what was observed in more polymorphic species such as maize [22, 24, 26] or sunflower (in preparation), within-variety variations were relatively high compared to between-variety variations, and a large number of gels was necessary to select significant and reliable quantitative and qualitative variations. To a certain extent, this could limit the development of 2-DE as a routine tool for wheat variety distinction, but it should be noted that any technique needs a considerable initial input to create a database supplying good levels of discrimination. Nevertheless, 2-DE could be used within a short time for comparisons between small numbers of genotypes: for example, to test whether a seed lot belongs to a variety, or to look for differences between varieties that are very similar according to other agro-morphological criteria. Moreover, we established that neither spot selection nor genetic hypothesis was necessary for the computation of Mahalanobis distances allowing efficient discrimination between varieties. Given the number of markers involved in the calculations, 2-DE could then be a useful tool for the estimation of genetic distances and also a straightforward way of looking for differences in gene expression.

Received August 19, 1996

5 References

[1] Payne, P. I., Lawrence, G. L., Cereal Res. Commun. 1983, 11, 29-33.
[2] Berger, M., Lebrun, J., Institut de recherche technologique agro-alimentaire des céréales, Paris 1988, p. 18.
[3] Branlard, G., Chavalet, C., Agronomie 1984, 4, 933-938.
[4] Görg, A., Postel, W., Baumer, M., Weiss, W., Electrophoresis 1992, 13, 192-203.
[5] Payne, P. I., Holt, L. M., Jarvis, M. G., Jackson, E. A., Cereal Chem. 1985, 62, 319-326.
[6] Dougherty, D. A., Zeece, M. G., Wehling, R. L., J. Chromatogr. 1989, 480, 359-369.
[7] Vaccino, P., Accerbi, M., Corbellini, M., Theor. Appl. Genet. 1993, 86, 833-836.
[8] D'Ovidio, R., Tanzarella, O. A., Porceddu, E., Plant Mol. Biol. 1990, 15, 169-171.
[9] De Vienne, D., Burstin, J., Gerber, S., Leonardi, A., Le Guilloux, M., Murigneux, A., Beckert, M., Bahrmann, N., Damerval, C., Zivy, M., Heredity 1996, 76, 166-177.
[10] Damerval, C., Maurice, A., Josse, J. M., de Vienne, D., Genetics 1994, 137, 289-301.
[11] Payne, P. I., Law, C. N., Mudd, E. E., Theor. Appl. Genet. 1980, 58, 113-120.
[12] Hart, G. E., Gale, M. D., in: O'Brien, S. J. (Ed.), Genetic Maps, Cold Spring Harbor Laboratory Press, New York 1987, 4, 7, pp. 670-684.
[13] Picard, B., Branlard, G., Oury, F. X., Rousset, M., Agronomie 1992, 12, 611-622.
[14] Touzet, P., Riccardi, F., Morin, C., Damerval, C., Huet, J. C., Pernollet, J.-C., Zivy, M., De Vienne, D., Theor. Appl. Genet. 1996, 93, 997-1005.
[15] Damerval, C., de Vienne, D., Zivy, M., Thiellement, H., Electrophoresis 1986, 7, 52-54.
[16] Tollasken, S. L., Anderson, N. L., Anderson, N. G., Operation of the ISO-DALT System, Large Scale Biology Press, Washington, DC 1988, p. 162.
[17] Leonardi, A., Damerval, C., de Vienne, D., Genet. Res. Camb. 1987, 50, 1-5.
[18] Hochstrasser, D. F., Harrington, M. G., Hochstrasser, A. C., Miller, M. I., Merril, C. R., Anal. Biochem. 1988, 173, 424-435.
[19] Damerval, C., Le Guilloux, M., Blaisonneau, J., de Vienne, D., Electrophoresis 1987, 8, 158-159.
[20] Neuhoff, V., Stamm, R., Eibl, H., Electrophoresis 1988, 9, 255-262.
[21] SAS/STAT™ User's Guide, Release 6.03 Edition, SAS Institute Inc., Cary, NC 1988.


[22] Burstin, J., Zivy, M., de Vienne, D., Damerval, C., Electrophoresis 1993, 14, 1067-1073.
[23] Dagnelie, P., Théories et méthodes statistiques, Presses agronomiques de Gembloux, Gembloux 1975, Vol. 2, pp. 138-152.
[24] Burstin, J., de Vienne, D., Dubreuil, P., Damerval, C., Theor. Appl. Genet. 1994, 89, 943-950.
[25] Thiellement, H., Seguin, M., Bahrman, N., Zivy, M., J. Mol. Evol. 1989, 29, 89-94.
[26] Higginbotham, J. W., Smith, J. S. C., Smith, O. S., Electrophoresis 1991, 12, 425-431.