DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification


DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification. Bill Birky, Department of Ecology and Evolutionary Biology, The University of Arizona.

Transcript of DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Page 1: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification
Page 2: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Bill Birky
Department of Ecology and Evolutionary Biology
The University of Arizona

Page 3: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Biological Diversity is Discontinuous

My goal is to understand a remarkable and general feature of nature: that the diversity of organisms does not present to us as a continuum but as more or less distinct clusters of individuals with different phenotypes that we call species.

Page 4: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Why We Should Care About Species

Species are treated as fundamental units of biological diversity in areas of biology including
• systematics
• conservation
• population genetics
• evolutionary biology
• biogeography
• any research paper where we need to specify the experimental organism(s)

How we define species and distinguish one from another really matters.

Page 5: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Darwin’s Conflicted Views of Species

1868 The Origin of Species 5th edition, p. 415

“Hereafter we shall be compelled to acknowledge that the only distinction between species and well-marked varieties is, that the latter are known, or believed to be connected at the present day by intermediate gradations, whereas species were formerly thus connected. Hence, without rejecting the consideration of the present existence of intermediate gradations between any two forms, we shall be led to weigh more carefully and to value higher the actual amount of difference between them.…”

At this point Darwin had got it right. In this talk I will follow his advice and weigh more carefully the actual amount of difference between species, relative to the differences within species.

If only Darwin had stopped here, but he didn’t…

Page 6: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Fast-forward to 2011…

The good news:
1. We now have a proliferation of models of what species are (theoretical/conceptual definitions, often called species concepts) and analytic tools to assign individuals to species (operational definitions, often called species criteria).
2. DNA sequences provide powerful tools for systematics.

The bad news:
1. We have a proliferation of models and operational definitions. There is a state of “…‘warfare’ among adherents to different systematic doctrines...and …astonishingly combative language and behavior of some partisans.” (Doug Futuyma)
2. Some biologists believe that species aren’t real.
3. Systematics is laissez-faire when it comes to publishing actual species descriptions. Most such papers make no mention of species concepts or operational definitions.

My approach to delimiting Eukaryotic species…

Page 7: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Darwin: the Gap’s the Thing

[Figure: two clusters of similar phenotypes separated by a gap in phenotypes.]

Page 8: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

The Gap’s the Thing…But How Big a Gap?

[Figure: three panels, each labeled “Gap in Phenotypes”.]

Can be addressed using very sophisticated morphometric, physiological, or behavioral analyses but this is much too time-consuming for routine use… and it is no help with environmental sequences.

Page 9: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Clades in Phylogenetic Trees of DNA Sequences Often Reflect Phenotypic Clusters That We See

What we see…

phenotypic gap

What we infer from sequences…

genotypic gap

Page 10: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Species Clusters in Phylogenetic Trees of DNA Sequences Reflect Phenotypic Clusters That We See…But Also Detect Clusters That We Can’t See

What we see

What we infer from sequences

But does this sequence gap separate species, or just varieties within a species?

Page 11: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

A Population/Evolutionary Genetic Perspective on Species

Page 12: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Causes of Gaps and Clusters in DNA Sequence Trees

Accidental variation in the numbers of offspring (random drift) produces transient, shallow gaps and clusters of average depth 2Ne generations.

Physical isolation, reproductive isolation, or adaptation to different niches produces deep gaps and clusters of mean depth > 2Ne generations.

Page 13: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

The Evolutionary Genetic Species Model

This led me to the Evolutionary Genetic Species Model (EGSM): Evolutionary Genetic Species are inclusive populations that can be shown to be evolving independently from each other. They are independent arenas for mutation, selection, and random genetic drift. Their independence can be the result of adaptation to different niches, physical isolation, reproductive isolation, or any combination of these.

This is a variant of the Evolutionary Species Concept that (1) is explicitly genetic, so we can use it with DNA sequences; (2) does not require that the species be adapted to different niches; and (3) does not require knowing that independence is permanent. Note that it is often difficult to tell whether two populations are evolving independently because of niche divergence or physical isolation.

Page 14: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

A Species Criterion or Operational Definition

The Evolutionary Genetic Species Model is a conceptual definition; it needs an operational definition or species criterion to say whether two or more individuals belong to one species or to two or more. This can be done in a number of ways. For example: in sexual organisms, using the Biological Species definition and testing individuals for reproductive isolation by trial matings or indirect inference from population genetic data, morphology, etc. I am focusing on DNA sequence data.

Page 15: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

We Can Use Genes to Delimit EG Species, but Which Gene(s) Should We Use?

Ideal: gene responsible for reproductive isolation or adaptation to different niches or first gene to complete coalescence after physical isolation. Gene responsible for isolation completes lineage sorting when isolation is complete. Usually a nuclear gene(s). Problem: we rarely know what this gene is. Never know with environmental sequences.

Page 16: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

What Gene Should We Use?

Second best: organelle gene (mitochondrial or chloroplast). Inherited uniparentally (usually maternally in animals and plants), and effectively haploid. Therefore effective population size is ≈ Nf. This is 1/4 the effective size for nuclear genes, so completes lineage sorting 4 times faster.

Organelle genes detect speciation earlier.

Other nuclear genes: in sexual organisms, different genes sort at different times by chance, ranging from about the time of speciation through an average of 2Ne ≈ 4Nf generations and higher.

Page 17: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

What Gene Should We Use?

Organelle genes have other practical advantages:

All copies of a gene in a cell or organism are identical at most sites. Consequently one can PCR-amplify an organelle gene and sequence the amplification products directly without cloning.

The mitochondrial “barcode” gene (cox1 or CO1) can be amplified from most animals with a universal pair of primers and has an ideal amount of diversity for identifying species.

Page 18: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Gaps: How Deep Is Deep Enough?

[Figure: four gaps, each marked “?”.]

We have a gene that can detect early stages of speciation. A tree of such a gene should show a gap between species. But how deep must it be to distinguish gaps between species from gaps between clades within species?

Page 19: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Gaps: How Deep is Deep Enough to Differentiate Species?

This is the question I will answer by calculating probabilities that the specimens came from independently evolving populations.

[Figure: gaps labeled P = 0.5, P = 0.81, P = 0.98, and P > 0.99.]

Note: this is a purely hypothetical case; probabilities are very rough approximations.

My favorite cutoff is 95%, so the probability of a single species is ≤ 5%.

Page 20: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Gaps: How Deep is Deep Enough to Differentiate Species?

We need something to compare the between-clade distances to. Solution: compare to within-clade distances. We can get the probabilities from the ratio of sequence difference between two sister clades (K) to the mean sequence difference within the clades (θ).

K = average sequence difference between the sister clades

θ = average sequence difference within a clade

K = f(t,u)

Express t in units of Ne generations:

K = f(Ne,u) and θ = f(Ne,u)

Therefore K/θ is dimensionless because Ne and u cancel.

Good because Ne and u are usually unknown!
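As a rough illustration of why the unknown parameters cancel, assume the simplest functional forms, which the slide itself does not spell out: divergence accumulates along both lineages at mutation rate u per generation, so K ≈ 2ut, and within-clade diversity is θ = 2Ne·u. Ancestral polymorphism and multiple hits are ignored in this sketch, and T (the divergence time in units of Ne generations) is a symbol introduced here.

```latex
\begin{align*}
K &\approx 2ut, \qquad \theta = 2N_e u, \qquad t = T\,N_e \\
\frac{K}{\theta} &\approx \frac{2u\,T N_e}{2 N_e u} = T
\end{align*}
```

Under these assumptions the ratio is simply the divergence time measured in units of Ne generations, with no need to know Ne or u separately.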

Page 21: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

One More Problem: Sampling

[Figure: three trees labeled A, B, and C.]

We do not see the tree I showed you earlier (A) because most lineages are extinct.

Tree B is the phylogeny of the surviving individuals.

But we don’t even see all of the survivors. We make a tree (C) based on a very small sample of individuals from an immense population.

Page 22: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

The problem is to use this to infer this and then define species.

We have to distinguish between gaps and clusters formed by random drift, and gaps and clusters formed by physical isolation or adaptation to different niches or reproductive isolation. And we must do this based on very small samples of very large populations of the few survivors of evolution.

Page 23: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Fortunately, Noah Rosenberg showed how one can calculate the probability that two populations are reciprocally monophyletic (and therefore have been evolving independently), given that the samples are reciprocally monophyletic and we know the ratio K/θ. (Rosenberg 2003 Evolution 57:1465)

Page 24: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Conceptual and Operational Definitions of Species

Now we have a conceptual definition or model of species, the EGSM, and an operational definition or species criterion using K/θ. In fact K/θ together with the sample sizes tells us the probability that a sample includes specimens from two species.

Briefly:
1. Make a bootstrapped distance tree of DNA sequences from the specimens to identify robust clades.
2. Get the pairwise sequence differences between the specimens.
3. Starting at the tip of the tree, find pairs of well-supported sister clades and for each pair calculate K/θ.
4. Use Noah Rosenberg’s table with K/θ and the sample sizes to get the probability that the samples came from independently evolving populations, i.e. from different species.
5. Going toward the root of the tree, repeat until species are found.
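Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names and the toy distances are hypothetical, θ is taken simply as the mean within-clade pairwise difference (as described earlier in the talk, with no sample-size or multiple-hit correction), and when both clades have variation the larger θ is used, which gives the smaller, more conservative ratio.

```python
from itertools import combinations, product
from statistics import mean


def mean_within(dist, clade):
    """Mean pairwise difference among specimens of one clade (None for a singlet)."""
    pairs = list(combinations(clade, 2))
    if not pairs:
        return None
    return mean(dist[frozenset(pair)] for pair in pairs)


def mean_between(dist, clade_a, clade_b):
    """K: mean pairwise difference between specimens of two sister clades."""
    return mean(dist[frozenset((a, b))] for a, b in product(clade_a, clade_b))


def k_over_theta(dist, clade_a, clade_b):
    """K divided by the larger available within-clade estimate of theta."""
    k = mean_between(dist, clade_a, clade_b)
    within = (mean_within(dist, clade_a), mean_within(dist, clade_b))
    thetas = [t for t in within if t]  # drop singlets (None) and zero diversity
    if not thetas:
        raise ValueError("no within-clade variation to estimate theta from")
    return k / max(thetas)


# Hypothetical toy data: pairwise differences keyed by unordered specimen pair.
clade_a = ["a1", "a2", "a3"]
clade_b = ["b1", "b2"]
dist = {
    frozenset(("a1", "a2")): 0.01,
    frozenset(("a1", "a3")): 0.01,
    frozenset(("a2", "a3")): 0.02,
    frozenset(("b1", "b2")): 0.02,
}
dist.update({frozenset((a, b)): 0.10 for a in clade_a for b in clade_b})

ratio = k_over_theta(dist, clade_a, clade_b)
singlet_ratio = k_over_theta(dist, clade_a, ["b1"])
```

When one clade is a singlet, the sketch falls back on the other clade's θ; the resulting ratio and the two sample sizes are then what would be taken to the probability table in step 4.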

Page 25: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

First Applied K/θ to Delimit Species in Asexual Organisms

Bdelloid rotifers

Birky et al. 2005

Birky and Barraclough 2010

Oribatid mites Nothrus, Platynothrus

Birky and Barraclough 2010

Oligochaete Lumbriculus variegatus

Heterotrophic marine flagellates

Green alga Ostreococcus

Fungus Penicillium

Birky et al. 2010 PLoS One 5(5):1-11

Page 26: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Some of Mike Robeson’s Soil Bdelloids: 3 Cases Involving Singlets

K/θ = 2.6; n1, n2 = 21, 1; P ≈ 0.84

K/θ = 7.3; n1, n2 = 8, 1; P > 0.98

K/θ = 3.97; n1, n2 = 3, 1; P = 0.94

Page 27: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

I published a paper with Tim Barraclough and Austin Burt showing that asexual organisms can undergo speciation, without using the word “species”. Only later did I discover that Austin wasn’t sure that species are real.

I just realized that this might have some advantages…

If species aren’t real, then they can’t go extinct.

We don’t need the Endangered Species Act.

Page 28: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Another Ancient Asexual Organism

Darwinulid ostracods
Schön, Pinto, Halse, Martens, Birky (in preparation)

First Application to Delimit Species in Sexual Organisms

Copepod Hemidiaptomus
Federico Marrone et al. 2010

Page 29: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification
Page 30: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Applying the K/θ Method to Sexual Organisms

We require data in which cox1 or another organelle gene has been amplified from a sample of individuals, sequenced in both directions to minimize sequencing errors, and sequences trimmed to same length to avoid comparing apples and oranges.

Page 31: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Example 1: Pteropod (Sea Butterfly) Limacina helicina

(Hunt et al. 2010 Poles apart: the “Bipolar” pteropod species Limacina helicina is genetically distinct between the Arctic and Antarctic Oceans. PLoS ONE 5:e9835.)

Phylogenetic tree of cox1 sequences shows that north and south circumpolar populations form well-supported clades. Hunt et al. proposed that these represented different species. I verified this, using K/θ to show that these are different evolutionary genetic species.

Page 32: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Implementation of K/θ Ratio Test

1. Align and proofread sequences, trim to same length, remove gaps, etc.
2. Make Neighbor-joining (NJ) and bootstrapped NJ trees to identify pairs of sister clades with robust support which are candidates for EG species.
3. Make matrix of pairwise sequence differences and calculate K/θ for candidates.
4. Or better, get some or all of this information from other people.

1 2 3 4 5 6 7 8 9 10 11

1 GQ861830 - = 0.0092

2 GQ86828 0.05 K = 8

GQ86827 0.06 0

GQ86826 0.06 0 0

5 AY22779 0.008 0.05 0.06 0.07

6 GQ8682 0.5 0.6 0.6 0.6 0. = 0.007

7 GQ868 0. 0.5 0.6 0.6 0. 0.007

8 GQ86825 0. 0.5 0.6 0.6 0. 0.005 0.007

9 GQ8682 0.5 0.6 0.6 0.6 0. 0.007 0.0055 0.008

0 GU7280 0. 0.5 0.5 0.5 0.2 0.0052 0.005 0.008 0.00

AY22778 0. 0.5 0.6 0.6 0.2 0.007 0.0052 0.0078 0.0065 0.0097

K = 0.5

Page 33: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Implementation of K/θ Ratio Test (cont.)

5. In Noah Rosenberg’s table, look up K/θ (TA or TB; the table only goes as high as 5) and the sample sizes (rA, rB; here, 6, 5) and read the probability that the populations from which the samples came are reciprocally monophyletic and evolving independently: P > 0.991675

Part of the table:

rA  rB  TA  TB  Probability
5   4   5   5   0.990141
5   5   5   5   0.991036
6   1   5   5   0.982726
6   2   5   5   0.9872
6   3   5   5   0.989437
6   4   5   5   0.99078
6   5   5   5   0.991675
6   6   5   5   0.992314
7   1   5   5   0.983201
7   2   5   5   0.987677
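The lookup can be sketched as a small Python dictionary. This is only an illustration built from the ten rows excerpted on this slide, all at TA = TB = 5; the names `ROSENBERG_T5` and `probability_lower_bound` are hypothetical, and the K/θ value passed in the example is made up. Following the slide's reading, a ratio of 5 or more is reported as "greater than" the T = 5 entry.

```python
# Excerpt of the table shown on the slide: (rA, rB) -> probability at TA = TB = 5.
ROSENBERG_T5 = {
    (5, 4): 0.990141,
    (5, 5): 0.991036,
    (6, 1): 0.982726,
    (6, 2): 0.9872,
    (6, 3): 0.989437,
    (6, 4): 0.99078,
    (6, 5): 0.991675,
    (6, 6): 0.992314,
    (7, 1): 0.983201,
    (7, 2): 0.987677,
}


def probability_lower_bound(r_a, r_b, ratio):
    """Return the T = 5 table entry as a lower bound when K/theta >= 5."""
    if ratio < 5:
        raise ValueError("this excerpt only covers T = 5; consult the full table")
    if (r_a, r_b) not in ROSENBERG_T5:
        raise KeyError(f"sample sizes {(r_a, r_b)} are not in this excerpt")
    return ROSENBERG_T5[(r_a, r_b)]


# Sample sizes from the slide (6 and 5); the ratio 10.0 is a hypothetical value.
bound = probability_lower_bound(6, 5, ratio=10.0)
print(f"P > {bound}")
```

For ratios below 5 or sample sizes outside the excerpt, the full published table is needed, so the sketch raises an error rather than guessing.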

Important caveat: The probability assumes that the samples are representative of the entire population. This can be tested, for example, by showing that increasing the number and variety of sample locations doesn’t change the conclusions.

Page 34: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Example 2: Ravens

Common Raven
Corvus corax

Chihuahuan Raven
Corvus cryptoleucus

Page 35: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Ravens (cont.)

Omland et al. (2000 Proc. R. Soc. Lond. B 267:2475; 2006 Molec. Ecol. 15:795): mitochondrial and nuclear DNA sequences show three clades: Chihuahuan Ravens; Common Ravens from Europe, Asia, and most of the U.S.; and most Common Ravens from the Pacific Coast.

Pacific Coast ravens

Page 36: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Ravens (cont.)

I downloaded all 101 sequences of the raven mitochondrial cob gene from GenBank, plus outgroups.

• Same procedure as with pteropods: Sequences were aligned (one sequence was deleted because it could not be aligned). Trimmed sequences to 258 bp consisting of 76 complete codons (except one was missing 1 bp at 5’ end and one was missing 1 bp at 3’ end). Made Neighbor-joining trees with and without bootstrapping to identify sister clades. Calculated all pairwise sequence differences in PAUP*. All ingroup sequence differences were ≤ 0.06, so I made no corrections for multiple hits.
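For readers without PAUP*, the uncorrected pairwise difference used here is easy to compute directly. The following Python sketch is not what the analysis above used; the function names and the short toy sequences are hypothetical, and sites where either sequence has a gap or an ambiguous base are simply skipped, which is one of several possible conventions.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and trimmed to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguity codes
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_differences(sequences):
    """Map each unordered pair of specimen names to its uncorrected difference."""
    return {
        frozenset((a, b)): p_distance(sequences[a], sequences[b])
        for a, b in combinations(sequences, 2)
    }


# Hypothetical toy alignment, far shorter than a real 258 bp cob fragment.
toy = {
    "raven_1": "ACGTACGT",
    "raven_2": "ACGTACGA",
    "raven_3": "ACGAACGA",
}
matrix = pairwise_differences(toy)
```

With differences as small as those reported above (≤ 0.06), the uncorrected value and a multiple-hit-corrected distance are close, which is the stated reason no correction was applied.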

Page 37: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Ravens (cont.)

Results verify three species:

Common Raven-California vs. Chihuahuan Raven
Using θ from Chihuahuan: K/θ = 2.34; n1, n2 = 17, 7; P = 0.93 (conservative)
Using θ from Common-California: K/θ = 15.0; n1, n2 = 17, 7; P > 0.995

Common Raven-California vs. Common Raven-Holarctic
K/θ = 32.6; n1, n2 = 75, 17; P > 0.995

[Figure: NJRavenCobAlignEditshort, uncorrected neighbor-joining tree of the raven cob sequences plus outgroups; scale bar 0.001 substitutions/site. Labeled clades: Chihuahuan Raven, Common Raven-California, Common Raven-Holarctic.]

Page 38: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Example 3: Liverwort Frullania tamarisci (Scalewort)

Jochen Heinrichs et al. 2010 One species or at least eight? Delimitation and distribution of Frullania tamarisci (L.) Dumort s. l. (Jungermanniopsida, Porellales) inferred from nuclear and chloroplast DNA markers. Mol. Phylogenet. Evol. 56:1105-1114.

I obtained the sequences from Jochen Heinrichs and edited them:
1. Deleted taxa except for the clade identified as Frullania tamarisci sensu lato by Heinrichs et al.
2. Removed nuclear genes, leaving concatenated chloroplast genes trnL-F + atpB-rbcL.
3. Trimmed these to ca. same length and removed most gaps.
4. Made Neighbor-joining tree and bootstrapped NJ tree to identify well-supported clades.

Page 39: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Liverwort (cont.)

I used K/θ to verify Heinrichs’ conclusion that F. tamarisci is a complex of species, and to show that two singlets and their sister clades are probably samples from different species.

Page 40: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Example 4: Clouded Leopard

Kitchener et al., 2006

Four subspecies are actually two species (grey and black) based on phenotypes.

Page 41: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Clouded Leopard (cont.)

Buckley-Beason et al. 2006: NJ K2P tree of 771 bp of mtDNA verifies species based on reciprocal monophyly and deep divergences. By inspection, K/θ ≥ 4 and P(2 species) ≥ 0.95.

Page 42: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Marine Enchytraeid Oligochaete Grania

In the mitochondrial cox1 tree the established species formed well-supported clades separated from each other by deep gaps, judged by the authors to show absence of gene flow “in a long time” despite some of the species being sympatric.

E.g. one specimen was judged to be well-separated from its sister clade and, despite being morphologically identical to G. postclitellochaeta, was described as a new species, G. occulta.

Examination of the cox1 tree showed that these clades have a sufficiently large K/θ ratio to easily qualify as EG species. PDW15 vs. other G. postclitellochaeta may also be distinct species (open circle).

De Wit & Erséus 2010 “Genetic variation and phylogeny of Scandinavian species of Grania (Annelida: Clitellata: Enchytraeidae), with the discovery of a cryptic species.” J. Zool Syst. Evol. Res. 48:285

Page 43: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Grania (cont.)

De Wit & Erséus 2010 J. Zool Syst. Evol. Res. 48:285

Previously described species verified by K/θ
New species, verified by K/θ
Other K/θ species?

The K/θ ratio should be used to determine the probability that the yellow starred specimens represent new species. Authors didn’t consider these for species status because the nuclear ITS sequence didn’t separate them from their sister clade, but it’s not surprising that nuclear genes would segregate later.

Page 44: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Potential Problems/Limitations of K/θ Method

1. Problem of female philopatry, noted by Weisrock et al. (2010) for lemurs: The use of mitochondrial or chloroplast genes will be misleading if two populations have no female migration, but male migration continues. Then the two populations will be assigned to different species by the K/θ ratio of mito genes but males will carry nuclear genes between the populations and prevent independent evolution. When this is suspected, it might be appropriate to use both an organelle gene and a nuclear gene to track males.

2. Because coalescence is a stochastic process, a small proportion of nuclear genes are expected to achieve reciprocal monophyly before organelle genes. Unfortunately it is impossible to identify those genes in advance, and it would be very difficult to identify them after the fact.

Page 45: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Potential Problems/Limitations of K/θ Method (cont.)

3. It bears repeating that the probability assumes that the samples are representative of the entire population. This can be tested as I did for the bdelloid rotifers, by showing that increasing the number of sample locations, the number of samples per site, and the number of individuals in the sample doesn’t change the conclusions. Increasing the sample coverage did not split or lump species found with smaller samples.

But when K/θ is large or is in the usual range for the group of organisms, it is unlikely that additional sampling will increase θ enough to reduce the ratio significantly.

Page 46: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Barcode Gap

As two populations diverge, a frequency distribution of the pairwise differences among their sequences becomes bimodal: one peak for differences within species, the other for differences between species. The gap between the peaks is sometimes called the “barcode gap”. Sea butterfly example:

[Figure: matrix of pairwise differences for the sea butterfly sequences (repeated from the K/θ implementation slide) and a histogram of those differences. x-axis: percent sequence difference (0-1, 1.1-2, 2.1-3, … 32, 33, 34, 35, 36); y-axis: number of pairs. The empty interval between the two peaks is labeled “Barcode gap”.]
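The idea of a gap in a bimodal distribution of pairwise differences can be illustrated with a short Python sketch. The values below are hypothetical, chosen only to resemble a low within-species peak and a high between-species peak, and `widest_gap` is a name introduced here; it simply finds the widest empty interval between consecutive sorted values.

```python
def widest_gap(differences):
    """Return (low, high): the widest empty interval between sorted values."""
    values = sorted(differences)
    if len(values) < 2:
        raise ValueError("need at least two pairwise differences")
    # Compare each value with its successor and keep the widest interval.
    return max(zip(values, values[1:]), key=lambda pair: pair[1] - pair[0])


# Hypothetical pairwise differences (proportions of differing sites).
within_species = [0.005, 0.007, 0.008, 0.0097]
between_species = [0.32, 0.35, 0.36]

low, high = widest_gap(within_species + between_species)
print(f"barcode gap between {low} and {high}")
```

Note that this only locates a gap; it does not say whether the gap is deep enough to separate species, which is the question the K/θ ratio is meant to answer.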

Page 47: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Barcode Gap (cont.)

Used by Consortium for the Barcode of Life (CBOL) and the International Barcode of Life project (iBOL) to identify gaps between sequences from already-described species. Critics of barcoding point to cases where gap fails to distinguish species, or splits a species, as failures of barcoding. But:

1. Assumes species defined by systematists are real species. So systematists are the only people who never make misteaks? Sets barcoding up for failure.

[Figure: histogram; x-axis: sequence difference; y-axis: number of pairs.]

Page 48: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Barcode Gap (cont.)

1. Problem: assumes species defined by systematists are real species. So systematists are the only people who never make misteaks?

2. Critics of barcoding point to cases where gap fails to distinguish species, or splits a species, as failures of barcoding. But when data from more than two species are pooled, the gap can disappear if the different species pairs have different diversities. Testing barcoding by looking for a gap in data pooled from many species sets it up for failure.

[Figure: pairwise-difference histograms added together (+) to give a pooled histogram (=).]
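A small numerical illustration of point 2, using made-up numbers rather than real data: each hypothetical species pair below has a clean gap on its own, but because the second pair is far more diverse, its within-species differences overlap the first pair's between-species differences once the data are pooled. The helper `has_barcode_gap` is a name introduced for this sketch.

```python
def has_barcode_gap(within, between):
    """True if every within-species difference is below every between-species one."""
    return max(within) < min(between)


# Hypothetical species pair 1: low diversity, shallow divergence.
pair_1 = {"within": [0.002, 0.003, 0.004], "between": [0.030, 0.032]}
# Hypothetical species pair 2: high diversity, deep divergence.
pair_2 = {"within": [0.020, 0.030, 0.040], "between": [0.300, 0.320]}

gap_1 = has_barcode_gap(pair_1["within"], pair_1["between"])
gap_2 = has_barcode_gap(pair_2["within"], pair_2["between"])

# Pool the data from both pairs, as a many-species test of barcoding would.
pooled_within = pair_1["within"] + pair_2["within"]
pooled_between = pair_1["between"] + pair_2["between"]
gap_pooled = has_barcode_gap(pooled_within, pooled_between)

print(gap_1, gap_2, gap_pooled)
```

In this toy case no single threshold separates within-species from between-species differences in the pooled data, even though each pair considered separately is unambiguous.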

Page 49: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

Barcode Gap (cont.)

3. As practiced by CBOL/iBOL, barcoding has no theoretical rationale.

Using the evolutionary species concept or my version of it and the K/θ ratio to delimit species would solve these problems.

Page 50: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

The K/θ Ratio Is Not Exclusive

Use of the K/θ ratio does not preclude the use of other methods to test whether a sample includes specimens from ≥ 2 evolutionary genetic species. For example:

• If one could show that the specimens fell into groups that could mate only with members of the same group, this is evidence that the sample includes members of two different species even if they are sympatric.

• If individuals in a sample came from one or the other of two well-separated geographic locations and there was no migration between them, this is evidence that the populations in those regions would be evolving independently and so are different species.

Note that the sampling problem still exists…statistical analysis is needed!

• Finding species by using DNA sequences is not the end of taxonomy! Whenever it is practical, species found in this way should be studied to find morphological traits that distinguish them reliably. Just as in traditional systematics, the behavior, ecology, and distribution of the species should be studied.

Page 51: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification

I GRATEFULLY ACKNOWLEDGE

• Collaborators:
Bdelloid species:
Timothy Barraclough, Silwood Park
Diego Fontaneto, Silwood Park
Claudia Ricci and Giulio Melone, University of Milan
Darwinulid species:
Isa Schön and Koen Martens, Royal Belgian Institute of Natural Sciences, Brussels, Belgium
Ricardo Pinto, University of Sao Paulo, Brazil
Stuart Halse, Bennelongia Pty Ltd, Wembley WA, Australia

• Many people for sharing their sequence files so I didn’t have to download them from GenBank, and for invaluable discussions, comments, and suggestions.

• Rick Michod and all my colleagues for allowing me to keep my lab and office so that I might continue doing research after “retirement”.

• All of you for your kind attention!

Page 52: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification
Page 53: DNA Barcoding the Right Way: A Theory-based Method for Species Detection and Identification