Molecular Phylogenies, Genomics and the Microbial Species Concept Peg Riley University of...


Molecular Phylogenies, Genomics and the Microbial Species Concept

Peg Riley
University of Massachusetts Amherst

www.ai.mit.edu/.../ ce/microbial-engineering.html

Biological Diversity

From a morphological perspective

Where does your organism belong?

Biological Diversity

From a molecular perspective

16S rRNA

Now where does your organism belong?

Biological Diversity

Molecular phylogenies fundamentally changed our views of biological diversity

Molecular Phylogenies Reveal

We live on a PLANET of MICROBES

Microbes comprise by far the greatest amount of biological diversity

Morphology works well for inferring evolutionary relationships among non-microbial eukaryotes, but molecules open our eyes to a wealth of formerly hidden biological (microbial) diversity

Molecular Phylogenies Also Reveal

an unexpected and relatively high level of gene flow

[Figure: Species A and Species B, with gene flow between them labeled "Recombination" and "Horizontal transfer"]

Great Moments in Evolution: Photosynthesis Evolves

Vertical versus Horizontal Transfer

Oxygen-Based Photosynthesis

Anaerobic Photosynthesis

Early divergence of the cyanobacteria results in the first biological structures. Stromatolites rock!

Transfer can happen, BUT is there frequent gene transfer between domains?

TPI

Doolittle, 1998

‘You Are What You Eat’!

Frequent gene transfer proposed from Bacteria to the Eukaryotes that eat them…

Gene transfer: made possible by frequent, relatively kinky, bacterial sex (or horizontal transfer)

Conjugation

Transduction

Transformation

“Sex with dead things is better than no sex at all.”

So mechanisms for horizontal transfer exist

BUT

…are such events common enough to limit divergence between lineages?

[Figure: two lineages, A and B]

Let’s focus on specific lineages

[Figure: two lineages, A and B]

Does horizontal transfer result in a cloud of diversity, due to frequent exchange among distinct lineages?

Gene Transfer Versus Retention

• Mechanisms for gene transfer exist
  – Transfer happens all the time to all genes

• Successful gene transfer is relatively rare
  – Just because a transfer event occurs does not mean it will survive in its new genome
  – Success depends upon the donor, the recipient, the environment, perhaps a phage or plasmid, etc.

Most horizontal transfer events are lost due to drift

[Figure: frequency of a transferred gene (0.0 to 1.0) plotted against time. Initial frequency = 1/N; probability of fixation = 1/N]

If your population size is 10^10, then the probability of fixation is 1/10^10, or 0.0000000001
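The arithmetic on this slide can be checked with a short sketch. It assumes a haploid population of constant size N and a selectively neutral gene, which is the textbook setting in which the fixation probability of a single new copy is 1/N. The function names and the small Wright-Fisher simulation are illustrative and do not come from the talk; the simulation uses a tiny population because N = 10^10 is far too large to simulate directly.

```python
import random


def neutral_fixation_probability(population_size):
    """Fixation probability of one new neutral copy in a haploid population of size N."""
    return 1 / population_size


def simulate_fixation_fraction(population_size, trials, seed=0):
    """Estimate the fixation probability by simulating drift (Wright-Fisher model)."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1  # a single transferred copy: initial frequency 1/N
        while 0 < count < population_size:
            freq = count / population_size
            # Each of the N individuals in the next generation carries the
            # gene with probability equal to its current frequency.
            count = sum(rng.random() < freq for _ in range(population_size))
        if count == population_size:
            fixed += 1
    return fixed / trials


print(neutral_fixation_probability(10**10))  # 1e-10, the value on the slide
# Hypothetical small population (N = 20); the estimate should be near 1/20.
print(simulate_fixation_fraction(20, 5000))
```

Most simulated runs end with the gene lost within a few generations, which is the point of the slide: without selection, a newly transferred gene almost always disappears.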

Successful Horizontal Transfer in Bacteria

• Transfer occurs for all genes; it is just more likely to be retained when selection is strong

• That is why genes observed to have transferred are often involved in local adaptation
  – antibiotic resistance, heavy metal tolerance, virulence determinants

www.cbs.dtu.dk/.../ roanoke/genetics98

0316.htm

Successful Horizontal Transfer in Bacteria

• Between close relatives?
  – Frequently occurs due to shared plasmids, phages, recognition signals, appropriate gene regulation systems, etc.

• Between distant relatives?
  – Less clear how often such transfer is successful
    • Antibiotic resistance genes, although?
    • Photosynthetic systems, endosymbiosis…
    • Genes involved in cytosolic metabolism?

Does the Universal Tree of Life Really Look Like This?

Is There Stability Or Flux In Evolutionary Lineages?

Is successful transfer frequent enough to obliterate evolutionary lineages?

[Figure: scale of successful gene transfer rate. A low rate corresponds to stability; a high rate corresponds to flux]

Genome Comparisons Suggest Flux At First Blush

Linear diagram comparing the six complete E. coli and S. flexneri genomes using a software tool called Mauve (Glasner and Perna, 2004)

K12 and O157:H7 are 98.5% identical,

BUT punctuated by hundreds of islands of unique sequence

Bacterial Phenotype Space

[Figure: two panels, "Discrete phenotypes" and "Continuous phenotypes", each showing E. coli, S. marcescens, and C. freundii]

Enteric Phenotype Space

[Figure: clusters in phenotype space (axes: Phenotype 1, Phenotype 2) for Escherichia coli, Salmonella typhi, Citrobacter freundii, Klebsiella oxytoca, K. pneumoniae, and Hafnia alvei]

Bacillus Phenotype Space

[Figure: clusters in phenotype space for B. subtilis, B. pumilus, B. amyloliquefaciens, and B. licheniformis. From Shute et al., 1985]

Mapping Phenotype To Genotype

[Figure: E. coli, S. enterica, and C. freundii plotted on axes labeled "Phenotypic Characters", alongside a panel labeled "Genotypic Character Distribution"]

Bacterial Taxonomy

Gold Standard: Polyphasic Approach

– Requires a phenotypic component
  • Restricts taxonomy to the < 1% we can culture
  • 1930’s Bergey’s Manual of Determinative Bacteriology
    – exclusive, diagnostic traits required

– Requires a genetic component
  • 16S rRNA sequence to place taxa
  • Measure of overall DNA similarity

Phylo-Phenetic Approach To Bacterial Taxonomy

• Collect adequate sample of strains & use them all

• Determine closest relative with 16S rRNA

• Characterize the phenotype
  – The more exhaustive, the better
  – Do not spare time or effort

• Follow nomenclature rules
  – Avoid using words that are hard to pronounce if you do not wish to annoy your colleagues

(Rossello-Mora & Amann, 2001)

Phylo-Phenetic Species Concept

– Genomic similarity: >~70% DNA-DNA similarity

– Phenotype description should be exhaustive

– Monophyletic: 16S rRNA sequence analysis

– It is “Theory-lite”

• A monophyletic and genomically coherent cluster of individual organisms that show a high degree of overall similarity with respect to many independent characteristics, and is diagnosable by a discriminative phenotypic property.

Rossello-Mora and Amann, 2001

Two Facts We May Be Able To Agree Upon

1. Bacteria cluster in phenotype space

2. Bacteria successfully transfer some fraction of their genomes via horizontal transfer

• What fraction of the genome underlies the phenotype clustering?
  – Is there a core set of genes that defines a bacterial lineage?
    • Genes that rarely transfer
    • Genes required for survival of the lineage

The Hummer Analogy

Basic (core) Hummer

Niche adapted Hummers

Core Genome Proposal

• Core genes comprise the species “shared, core genome”
  – Rarely transfer and thus diverge between close relatives
  – Might include essential housekeeping genes
  – Present in frequencies of >95% of isolates

Lan and Reeves, 2001

[Figure: an ancestral species splits into Species A and Species B along a TIME axis (ancient to recent); on the GENE SIMILARITY axis, core genes go from identical to different]
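The ">95% of isolates" criterion from the core genome proposal can be sketched as a simple frequency filter. This is only one of the listed properties (it says nothing about transfer rate or essentiality), and the gene names and presence/absence calls below are made up for illustration.

```python
def classify_by_frequency(presence, core_threshold=0.95):
    """Label each gene by the fraction of isolates that carry it.

    presence: dict mapping gene name -> list of booleans, one per isolate.
    A gene found in more than core_threshold of isolates is a core candidate.
    """
    labels = {}
    for gene, calls in presence.items():
        frequency = sum(calls) / len(calls)
        labels[gene] = "core candidate" if frequency > core_threshold else "not core"
    return labels


# Hypothetical presence/absence calls across 20 isolates.
example = {
    "housekeeping_gene": [True] * 20,            # present in every isolate
    "resistance_gene": [True] * 8 + [False] * 12,  # present in 8 of 20
}
print(classify_by_frequency(example))
```

Working with frequencies across many isolates, rather than presence or absence in a single genome, is what makes this a population-level criterion.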

Core Genome Proposal

• Auxiliary genes are that set of genes that serve to adapt isolates to local niches
  – Auxiliary genes frequently transfer and therefore do not diverge between close relatives
  – Includes resistance, tolerance, pathogenicity genes, etc.

Lan and Reeves, 2001

[Figure: an ancestral species splits into Species A and Species B along a TIME axis (ancient to recent); on the GENE SIMILARITY axis, auxiliary genes go from identical to very similar]

Evolving a Barrier to Recombination

• “Core” genes diverge as lineages evolve
  – Nucleotide diversity for core genes is lower within than between taxa
  – Suggests a genetic mechanism that can maintain lineage stability: divergence limits recombination

[Figure: an ancestral species splits into Species A and Species B along a TIME axis (ancient to recent); on the GENE SIMILARITY axis, core genes go from identical to different]

Assessing The Existence Of A Core Genome

• Need a group of taxa that are closely related enough to avoid multiple substitution issues and alignment issues

• Need multiple isolates per species and multiple species

• Need to examine isolates that coexist in time and space such that recombination could occur

Gordon Australian Enteric Collection

Strain  Collection #  Species                Source organism          State
CF1     M250          Citrobacter freundii   Isoodon macrourus        NT
CF2     M289          Citrobacter freundii   Perameles nasuta         NSW
CF3     M141          Citrobacter freundii   Antechinus flavipes      SA
CF4     M140          Citrobacter freundii   Antechinus flavipes      SA
CF5     M255          Citrobacter freundii   Isoodon macrourus        NT
EB1     M338          Enterobacter cloacae   Mus musculus             VIC
EB2     M50           Enterobacter cloacae   Mus musculus             VIC
EB3     M99           Enterobacter cloacae   Mus musculus             VIC
EB4     M90           Enterobacter cloacae   Mus musculus             VIC
EB5     M322          Enterobacter cloacae   Mus musculus             VIC
EC1     TA157         Escherichia coli       Trichosurus vulpecula    NT
EC2     TA234         Escherichia coli       Mus musculus
EC3     TA479         Escherichia coli       Mus musculus
EC4     TA57          Escherichia coli       Macropus giganteus       ACT
EC5     TA79          Escherichia coli       Bettongia penicillata    WA
EC6     TA184         Escherichia coli       Trichosurus caninus      NSW
HA1     M163          Hafnia alvei           Phascogale tapoatafa     WA
HA2     M690          Hafnia alvei           Homo sapiens             WA
HA3     M230          Hafnia alvei           Antechinus bellus        NT
HA4     M261          Hafnia alvei           Dasyurus hallucatus      VIC
HA5     M259          Hafnia alvei           Dasyurus hallucatus      NT
KO1     M151          Klebsiella oxytoca     Dasycercus cristicauda   NT
KO2     M328          Klebsiella oxytoca     Trichosurus vulpecula    TAS
KO3     M192          Klebsiella oxytoca     Zyzomys argurus          NT
KO4     M499          Klebsiella oxytoca     Vespadelus vulturnus     NSW
KO5     M712          Klebsiella oxytoca     Chalinolobus gouldii     NSW
KP1     M208          Klebsiella pneumoniae  Parantechinus bilarni    NT
KP2     M40           Klebsiella pneumoniae  Mus musculus             VIC
KP3     M757          Klebsiella pneumoniae  Nyctophilus geoffroyi    NSW
KP4     M758          Klebsiella pneumoniae  Zyzomys argurus          NT
KP5     M663          Klebsiella pneumoniae  Petaurus gracilis        QLD
KP6     M47           Klebsiella pneumoniae  Mus musculus             VIC
SM1     M145          Serratia marcescens    Antechinus flavipes      NSW
SP1     M8            Serratia plymuthica    Potorous tridactylus     NSW
SP2     M66           Serratia plymuthica    Antechinus stuartii      SA
SP3     M297          Serratia plymuthica    Perameles nasuta         NSW

Gordon et al., 2001

Assessing The Existence Of A Core Genome

1. Choose potential “core” genes:
   • Essential for the survival of the cell
   • Not closely linked - avoid co-transduction
   • Not physiologically linked - avoid co-evolution

2. What is “core” for one species may not be “core” for another

Target Core Genes

Gene   Product                                   Map position  Gene length (bp)  Sequence length (bp)  PIs
gapA   Glyceraldehyde-3-phosphate dehydrogenase  40.11         996               832                   194
groEL  GroEL protein                             94.17         1647              1146                  245
gyrA   DNA gyrase subunit A                      50.33         2628              660                   226
ompA   Outer membrane protein A                  21.95         1041              526                   219
pgi    Glucose-6-phosphate isomerase             91.21         1650              670                   210
16S    16S rRNA                                  several       1541              291                   30

Gene Tree Inference

• Phylogenetic trees inferred with maximum likelihood methods (PAUP 4.0b8)

• MODELTEST used to generate optimum parameters for heuristic algorithm used for building ML trees in PAUP

• Statistical support for branching patterns of gene trees assessed in two ways
  – Bootstrapping ML trees, 500 replicates
  – MrBayes - 50,000 trees, majority rule consensus
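The bootstrapping step above rests on resampling alignment columns with replacement. The sketch below shows only that resampling step; it is not PAUP or MrBayes, it does not build trees, and the sequences are invented. In a real analysis each pseudo-replicate would be passed to the tree-building program, and branch support would be the percentage of replicate trees that contain a given grouping.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment: dict mapping isolate name -> aligned sequence (all equal length).
    Columns are drawn with replacement, and the same columns are used for
    every isolate so that each resampled site stays intact.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment; the talk used 500 replicates per gene.
toy = {"EC1": "ACGTAC", "EC2": "ACGTAT", "HA1": "ACCTGC"}
rng = random.Random(42)
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
print(len(replicates), replicates[0])
```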

Core Gene Trees

[Figure: gene trees for gapA and groEL. Tips are the isolate designations from the Gordon Australian Enteric Collection (SP, SM, HA, EC, CF, KO, EB, KP) plus ECMG, with numeric support values on the branches]

Enteric Core Gene Trees Summary

Multiple isolates from each taxon always cluster together

Suggests something maintains the stability of those taxa

[Figure: gapA gene tree with the E. coli and H. alvei clusters labeled]

Enteric Core Gene Trees Summary

• Within a species
  – Isolates cluster together in the composite tree

• Between species
  – The branching patterns follow those suggested from phenotypic data

• Practical take home message
  – Relatively few housekeeping genes provide a composite view of enteric phylogenetic relationships
    • Don’t need an entire genome
    • Serves as a proxy for phenotype
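"Isolates cluster together" can be stated precisely as a monophyly test: the isolates of a species form a group if some subtree contains exactly those isolates and no others. The sketch below assumes a rooted tree written as nested tuples; the topology is invented for illustration and is not one of the trees from the talk.

```python
def leaf_set(tree):
    """Return the set of tip labels under a node (a tip is a plain string)."""
    if isinstance(tree, str):
        return {tree}
    tips = set()
    for child in tree:
        tips |= leaf_set(child)
    return tips


def clades(tree):
    """Yield the tip set of every subtree, including single tips and the root."""
    yield frozenset(leaf_set(tree))
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, isolates):
    """True if some subtree contains exactly the given isolates."""
    target = frozenset(isolates)
    return any(clade == target for clade in clades(tree))


# Hypothetical rooted topology using isolate-style labels.
toy_tree = ((("EC1", "EC2"), "EC3"), (("HA1", "HA2"), "HA3"))
print(is_monophyletic(toy_tree, {"EC1", "EC2", "EC3"}))  # True
print(is_monophyletic(toy_tree, {"EC1", "HA1"}))         # False
```

Running the same check for every species on every gene tree is one way to turn the summary statement on this slide into a yes/no result per taxon.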

Evolving a Barrier to Core Gene Recombination Between Taxa

• Core genes have diverged significantly between these taxa
  – The levels of nucleotide diversity for core genes within these taxa are much lower than the levels of divergence between taxa

• This pattern of divergence suggests a genetic mechanism that can maintain lineage stability
  – Core genes diverge as lineages evolve
  – Divergence prohibits homologous recombination
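The within-versus-between comparison can be sketched with average pairwise differences per site. This is a simplified stand-in: it uses all aligned sites and raw proportions of differences, whereas the values reported later in the talk are at synonymous sites, and the sequences below are invented.

```python
from itertools import combinations, product


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def diversity_within(seqs):
    """Average pairwise distance among isolates of one taxon."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def divergence_between(seqs_a, seqs_b):
    """Average pairwise distance between isolates of two taxa."""
    pairs = list(product(seqs_a, seqs_b))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical core-gene fragments from two taxa, three isolates each.
taxon_a = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
taxon_b = ["ACGAACCTAC", "ACGAACCTAC", "ACGAACCTAG"]
print(diversity_within(taxon_a), diversity_within(taxon_b))
print(divergence_between(taxon_a, taxon_b))
```

For a core gene the expectation from the slide is that both within-taxon values are much smaller than the between-taxon value; for an auxiliary gene that is passed back and forth, the gap should shrink or vanish.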

Genomic Comparisons

E. coli and Salmonella species

Although horizontal transfer of genetic information CAN bring lineages (species) together, in the enterics it has had little to no effect

[Figure: lineages labeled "diverge" and "recombine"]

Core Genome Hypothesis

• Provides a theoretical underpinning to the Phylo-Phenetic approach to bacterial classification

• So far, supports taxonomic distinctions based upon phenotype data
  – Does not require phenotype or culturing(!)
  – But may reveal genes that help in culturing efforts

• Provides a simple molecular assay of bacterial species relationships

Core Versus Auxiliary Genes

• Core genes should accumulate substitutions between species based upon how long the species have been diverging

• Auxiliary genes are passed back and forth and should be more similar, on average, than core genes

Antibiotic Resistance “Core” Gene?

• bla OXY
  – Chromosomally encoded
  – Found only in isolates of K. oxytoca
  – Found in all K. oxytoca isolates tested in the AEC
  – Nucleotide diversity is higher than that found in housekeeping genes (0.200 vs. 0.002*)
  – Behaving like a core gene

*Nucleotide diversity at synonymous sites

Antibiotic Resistance “Auxiliary” Gene?

• bla TEM
  – Plasmid encoded
  – Found in 31 of 73 AEC isolates tested
  – Found in at least one of each taxon examined
  – Only 2 alleles, which differ at 2 nucleotide sites
  – Nucleotide diversity is much lower than that of housekeeping genes (0.000 vs. 0.055*)
  – Behaving like an auxiliary gene

*Nucleotide diversity at synonymous sites

What Is A Core Gene For One Taxon May Be An Auxiliary Gene For Another

[Table: phenotypic test results for seven taxa (KP, EB, CF, EC, SP, HA, KO) across ten tests: Voges-Proskauer, Methyl Red, Indole Production, Citrate (Simmons), Lysine Decarboxylase, Urea Hydrolysis, Esculin Hydrolysis, H2S Production, Polar Flagella, DNase. The individual cell values could not be recovered from the slide.]

Symbols:
  -    0-10% positive
  [-]  11-25% positive
  d    26-75% positive
  [+]  76-89% positive
  +    90-100% positive
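The symbol legend maps the percentage of isolates testing positive onto five classes. A minimal sketch of that mapping follows; the legend is stated for whole percentages, so the handling of fractional values between the bands (for example 10.5%) is an assumption made here, not something the slide specifies.

```python
def phenotype_symbol(percent_positive):
    """Map the percentage of positive isolates to the legend's symbol."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 10:
        return "-"
    if percent_positive <= 25:
        return "[-]"
    if percent_positive <= 75:
        return "d"
    if percent_positive < 90:  # assumption: values between 89 and 90 fall here
        return "[+]"
    return "+"


# Hypothetical example: 31 of 73 isolates positive for some trait.
print(phenotype_symbol(100 * 31 / 73))  # d
```

A trait scored "d" in one taxon and "+" in another is the kind of pattern the slide title points to: fixed in one lineage, variable in another.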

Biological Species Concept

Groups of actually or potentially interbreeding natural populations, which are reproductively isolated from other such groups (Mayr, 1942)

BSC Applied to Bacteria

Microbial Biological Species Concept

Groups of strains that exchange, or could exchange, core genome information but that are restricted from exchange with other such groups

• Allows for exchange of auxiliary genes
• Predicts that core genes will show higher levels of recombination within a species than between species
• Predicts that core genes will diverge more rapidly than auxiliary genes between species

Conclusions

1. Bacteria cluster in phenotype space

2. There is corresponding genotypic clustering of “core” genes
   – At least in one sample of enteric bacteria
   – This is not the case in “auxiliary” genes

3. These patterns argue for a biological species concept for bacteria and the existence of coevolved genomes that survive through evolutionary time
   – Requires population as well as genomic divergence data

4. The question is not “does lateral transfer occur?” but rather “does its occurrence obliterate coevolved genomes?”

The Future of the Microbial Species Concept

A Rocky, Rocky Road ahead!

Why?

1. Requires population genetic thinking
   – Gene frequencies, not presence/absence

2. A species to one person may be a clinical isolate to another (E. coli vs Shigella)

3. Species are not static entities

4. Newly created Comparative Genome Analysis Consortium (DOE based)

Acknowledgements

The Funding

NIH
NSF
Rockefeller Foundation
Culpepper Foundation
Yale University
UMass Amherst

Riley Lab

Carla Goldstone
John Wertz
Cynthia Hunt
Caroline Obert
Lisa Nigro
Ben Kirkup
Emily Curd
Osnat Gillor
Milind Chavan
Mike Vain
Michelle Lizzote

Collaborators

David Gordon, ANU
Rob Dorit, Smith
Carl Bergstron, UW
Ben Kerr, UW
Rich Lenski, MSU

The Work

The Microbial Planet

16S rRNA