Microbial Evolution, Zoology/Anthro/Botany 410, Nicole T. Perna, April 24, 2014.


Transcript:

Page 1:

[Figure: title-slide image; its labels are gene identifiers prefixed by genome/strain codes (ECH3937, SCRI1043, PlTTO1, MG1655, EcoRIM, EDL933, CFT073, Sfl301, Sfl2457T, SenLT2, SenCT18, SenTy2, YP91001, YPCO92, YPKIM)]

Microbial Evolution

Zoology/Anthro/Botany 410

Nicole T. Perna

April 24, 2014

Page 2:

A couple of key facts

• Prokaryotes have been around a long time (2.5-3.5 GYA). Bacteria and Archaea diverged a very long time ago and are not more closely related to each other than to eukaryotes

• Prokaryotes exhibit tremendous diversity of habitats, lifestyles, and metabolic strategies

Page 3:

Important applications of microbial evolution

Page 4:

Critical Topics Already Introduced

• Genomic revolution and genome evolution

– Core vs. variable fractions of genomes

– Pan-genome

– Genome size and organization

• Horizontal (Lateral) Gene Transfer (HGT)

– There is no “tree of life”

– How frequent is HGT?

• Bacterial species - is there such a thing? What do we mean?

Page 5:

Assigned reading

Page 6:

Microbial genome sequence availability is exponentially increasing

Page 7:

NCBI Genome Project List

As of 4/19/2012 (2013 counts in parentheses):

2029 complete bacterial genomes (2510)

134 complete archaeal genomes (262)

3313 draft bacterial genomes (>10K)

44 draft archaeal genomes

4600 bacteria – no data yet

49 archaea – no data yet

http://www.ncbi.nlm.nih.gov/genome/browse/

Page 8:

How well sampled is prokaryotic diversity by current genome sequences?

Koonin and Wolf 2008 perspective:

• Uncultivated organisms remain problematic

• Only 10% of the genes in major metagenomic samplings have no detectable homologs

“The possibility, certainly, remains that major new and, perhaps, unusual groups of archaea and bacteria dwell in complex and unusual habitats. Nevertheless, it appears likely that the current collections of archaeal and bacterial genomes provide a reasonable approximation of the diversity of prokaryotic life forms on earth.”

Page 9:

Genomic Encyclopedia of Bacteria and Archaea (GEBA) Project

• Objective – sequence genomes selected solely for their phylogenetic novelty (plus in depth sampling of a single phylum)

• …based on 16S rDNA tree

• Wu et al. Nature. 2009 Dec 24; 462(7276):1056-60.

Page 10:

DY Wu et al. Nature 462, 1056-1060 (2009) doi:10.1038/nature08656

Maximum-likelihood phylogenetic tree of the bacterial domain based on a concatenated alignment of 31 broadly conserved protein-coding genes. Phyla are distinguished by colour of the branch and GEBA genomes are indicated in red in the outer circle of species names.

53 GEBA bacteria accounted for 2.8–4.4 times more phylogenetic diversity than randomly sampled subsets of 53 non-GEBA bacterial genomes

Page 11:

DY Wu et al. Nature 462, 1056-1060 (2009) doi:10.1038/nature08656

Rate of discovery of protein families as a function of phylogenetic breadth of genomes.

The project even discovered a bacterial homolog of the eukaryotic cytoskeletal protein actin

Page 12:

Evolution-oriented reasons to target genomes for sequencing

• Maximize sampling of diversity

• Understand structure of particular populations and/or species

• Make targeted comparisons to understand the genetic basis of phenotypic differences

Page 13:

Size and organization of microbial genomes (Koonin and Wolf 2008)

Size Range = 180 Kbp – 13 Mbp

Page 14:

Structure of a prokaryotic genome

• One circular chromosome is typical.

• Some have other replicons, such as linear or circular plasmids.

• Some have more than one chromosome, generally distinguished from a plasmid by the presence of at least one “essential” gene.

• Some have linear chromosomes.

Page 15:

Fitch WM. Trends Genet. 2000 May;16(5):227-31.

Page 16:

Analogy vs Homology

Analogy

The relationship of any two characters that have descended convergently from unrelated ancestors.

Homology

The relationship of any two characters that have descended, usually with divergence, from a common ancestral character.

Page 17:

Orthology

The relationship of any two homologous characters whose common ancestor lies in the cenancestor of the taxa from which the two sequences were obtained.

Paralogy

The relationship of any two homologous characters arising from a duplication of the gene for that character.

Xenology

The relationship of any two homologous characters whose history, since their common ancestor, involves an interspecies (horizontal) transfer of the genetic material for at least one of those characters.
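The three definitions above can be read as a decision rule on a gene tree whose events are already known. The sketch below is a minimal illustration under that assumption: the `GeneNode` class, the event labels, and the example tree (genes X1, Y1, X2, Y2, Z2) are hypothetical and are not the figure used in the slides.

```python
class GeneNode:
    """Node of a hypothetical gene tree whose events are already annotated."""

    def __init__(self, name, event=None, parent=None, transferred=False):
        self.name = name
        self.event = event              # internal nodes: "speciation", "duplication" or "transfer"
        self.parent = parent
        self.transferred = transferred  # True if the branch above this node crossed species


def lineage(node):
    """Return the node followed by all of its ancestors up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def classify(x, y):
    """Label the homology relationship between two leaves of the same gene tree."""
    ancestors_of_y = set(lineage(y))
    mrca = next(n for n in lineage(x) if n in ancestors_of_y)
    # Collect every branch on the path x -> common ancestor -> y.
    branches = []
    for leaf in (x, y):
        node = leaf
        while node is not mrca:
            branches.append(node)
            node = node.parent
    if any(b.transferred for b in branches):
        return "xenolog"   # a horizontal transfer lies in the shared history
    if mrca.event == "duplication":
        return "paralog"   # the two copies trace back to a duplication
    return "ortholog"      # the two copies trace back to a speciation


# Hypothetical tree: one duplication, then a speciation in each copy,
# plus a transfer of copy 2 from species Y into species Z.
root = GeneNode("dup", event="duplication")
s1 = GeneNode("s1", event="speciation", parent=root)
s2 = GeneNode("s2", event="speciation", parent=root)
x1 = GeneNode("X1", parent=s1)
y1 = GeneNode("Y1", parent=s1)
x2 = GeneNode("X2", parent=s2)
t = GeneNode("t", event="transfer", parent=s2)
y2 = GeneNode("Y2", parent=t)
z2 = GeneNode("Z2", parent=t, transferred=True)

for a, b in [(x1, y1), (x1, x2), (x2, y2), (y2, z2), (x2, z2)]:
    print(a.name, b.name, classify(a, b))
```

Checking for a transfer before checking the event at the common ancestor mirrors the xenology definition, which applies whenever a transfer appears anywhere in the history since the common ancestor.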

Page 18:

Test Yourself

• A1 – B1

• A1 – B2

• A1 – C3

• B1 – C2

• C2 – C3

• B2 – C3

• C3 – AB1

Page 19:

Homology on a Genome-Scale

• How many and which genes are common to two or more organisms?

• Which genes differentiate one organism from another?

• How is homology related to function?

Page 20:

A phylogenetic perspective

• Orthologs are the set of genes/proteins with gene trees identical to the species tree.

• We can understand other types of homology relationships by comparison to the species tree.

• But often we don’t know the species tree, and phylogenetic methods are complex

Page 21:

Consider two genomes

• Use BLASTP to compare one set of proteins (proteome) to the other

• Which set will you use as the query and which as the database?

• What criteria will you use to define “a match”?

[Figure: GenomeA genes 1, 2 and 3 shown alongside GenomeB genes 1, 2 and 3]

A1, A3, B2 and B3 are homologs (assuming the aligned regions overlap)
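One possible answer to the "what counts as a match?" question is an E-value cutoff combined with a minimum alignment coverage of the query. The sketch below assumes 12-column tab-separated BLAST output (the default column order of `-outfmt 6`); the thresholds, the `query_lengths` table, and the two example rows are invented for illustration, and alignment length divided by query length is only a rough coverage estimate because the alignment length includes gaps.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float
    length: int
    evalue: float
    bitscore: float


def parse_tabular(lines):
    """Parse 12-column tabular BLAST rows into Hit records."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11])))
    return hits


def is_match(hit, query_len, max_evalue=1e-5, min_coverage=0.6):
    """Hypothetical criteria: significant E-value and most of the query aligned."""
    coverage = hit.length / query_len
    return hit.evalue <= max_evalue and coverage >= min_coverage


# Invented example rows and query lengths.
rows = [
    "A1\tB3\t45.0\t200\t100\t5\t1\t200\t10\t209\t1e-40\t150",
    "A2\tB1\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
]
query_lengths = {"A1": 250, "A2": 300}

matches = [h for h in parse_tabular(rows) if is_match(h, query_lengths[h.query])]
print([(h.query, h.subject) for h in matches])
```

Stricter or looser thresholds change which pairs are called homologs, which is the point of the question on the slide.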

Page 22:

Reciprocal Best Hits

• Use BLASTP to compare sets of proteins (proteomes) to each other

– First using GenomeA to query against GenomeB

– Then using GenomeB to query against GenomeA

– Save only one best match for each query

– Save only the reciprocal best matches as “orthologs”

[Figure: three panels, each showing GenomeA genes 1, 2 and 3 alongside GenomeB genes 1, 2 and 3]

Keeping only reciprocal best hits loses the A3-B2 and A1-B3 homology relationships
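The reciprocal best hit procedure can be sketched in a few lines. This is a minimal illustration, not a production ortholog caller: hits are assumed to be `(query, subject, bit score)` tuples, the scores are invented so that the result matches the example above, and ties are broken by keeping the first hit seen.

```python
def best_hits(hits):
    """Keep only the highest-scoring subject for each query."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented bit scores: A1, A3, B2 and B3 all hit each other.
a_vs_b = [("A1", "B2", 300), ("A1", "B3", 120), ("A3", "B3", 280), ("A3", "B2", 110)]
b_vs_a = [("B2", "A1", 300), ("B2", "A3", 110), ("B3", "A3", 280), ("B3", "A1", 120)]

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

With these scores only A1-B2 and A3-B3 survive, so the weaker A3-B2 and A1-B3 relationships are no longer recorded even though all four genes are homologs.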

Page 23:

Software/Methods for Predicting Orthologs from Genome Sequences

• RBH (Reciprocal Best Hits)

• RSD (Reciprocal Shortest Distance)

• INPARANOID

• RIO

• Orthostrapper

• Ortholuge

• TribeMCL

• OrthoMCL

Page 24:

Method Comparison

Chen F, Mackey AJ, Vermunt JK, Roos DS. PLoS ONE. 2007 Apr 18;2(4):e383.

Page 25:

Core and variable genes – single genome perspective

A small number of genes have orthologs in all microbial genomes (core)

More genes have orthologs in many genomes, but not all (shell)

Some genes are rare and have orthologs in only a few genomes (cloud)

Some are unique to one genome (ORFans)
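The four categories can be assigned from a table recording which genomes contain a member of each ortholog family. In the sketch below the family names, genome names, and especially the 50% cutoff separating shell from cloud are assumptions made for illustration; the slides do not give a numeric boundary.

```python
def classify_families(presence, genomes, shell_min_fraction=0.5):
    """Label each ortholog family as core, shell, cloud or ORFan.

    presence: dict mapping family name -> set of genomes containing it.
    genomes: set of all genomes being compared.
    shell_min_fraction: hypothetical cutoff between "many" and "a few".
    """
    n = len(genomes)
    labels = {}
    for family, members in presence.items():
        k = len(members & genomes)
        if k == 0:
            continue                    # family absent from this genome set
        if k == n:
            labels[family] = "core"     # present in every genome
        elif k == 1:
            labels[family] = "ORFan"    # unique to one genome
        elif k / n >= shell_min_fraction:
            labels[family] = "shell"    # many genomes, but not all
        else:
            labels[family] = "cloud"    # only a few genomes
    return labels


genomes = {"g1", "g2", "g3", "g4", "g5", "g6"}
presence = {
    "famA": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "famB": {"g1", "g2", "g3", "g4"},
    "famC": {"g2", "g5"},
    "famD": {"g6"},
}
print(classify_families(presence, genomes))
```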

Page 26:

Core and variable genes – species perspective (pan-genome)

For some species as a whole, the number of core (plus shell) genes can be much smaller than the variable fraction (cloud plus ORFans), and the pan-genome can be very large.

Touchon et al. PLoS Genetics. 2009

Page 27:

Different types of pan-genomes

Figure 3. Power law regression for species with open and closed pan-genomes. Tettelin et al. Curr Opin Microbiology 2008:11(5).

Page 28:

Open vs Closed Pan-genomes

• Open

– Number of new genes discovered continues to grow as additional genomes of the species are sequenced

– Organisms live in diverse environments and are genetically amenable to horizontal gene transfer

• Closed

– Number of new genes discovered is very small as additional genomes of the species are sequenced

– Organisms have little exposure to other organisms and/or are refractory to horizontal gene transfer
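The open/closed distinction can be made quantitative with the kind of power law regression mentioned in the figure caption on the previous slide. The sketch below assumes the number of new genes found in the N-th genome follows n(N) = κ·N^(−α) and fits it by least squares on log-transformed values. Because the sum of N^(−α) over N diverges only when α ≤ 1, an α of at most 1 is read here as open and a larger α as closed. The data are synthetic, generated from the model itself, so the fit simply recovers the chosen exponents.

```python
import math


def fit_power_law(genome_counts, new_genes):
    """Fit n = kappa * N**(-alpha) by linear regression in log-log space."""
    xs = [math.log(N) for N in genome_counts]
    ys = [math.log(n) for n in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (kappa, alpha)


def pan_genome_type(alpha):
    """alpha <= 1: cumulative gene count keeps growing (open); otherwise closed."""
    return "open" if alpha <= 1 else "closed"


counts = list(range(2, 11))
# Synthetic new-gene counts for two hypothetical species.
slow_decay = [500 * N ** -0.5 for N in counts]
fast_decay = [500 * N ** -2.0 for N in counts]

for label, series in [("slow decay", slow_decay), ("fast decay", fast_decay)]:
    kappa, alpha = fit_power_law(counts, series)
    print(label, round(alpha, 3), pan_genome_type(alpha))
```

Real new-gene counts are noisy and depend on the order in which genomes are added, so published analyses typically average over many random orderings before fitting.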

Page 29:

Horizontal Gene Transfer

• Mechanisms include conjugation, transduction and transformation

• Can introduce entirely new genes and gene clusters into genomes (grow the pan-genome)

• Can replace existing genes with functionally equivalent (?) xenologs (scramble phylogenetic history)

Page 30:

Horizontal Gene Transfer

• How prevalent is it?

– We don’t know. Debates continue, largely because it is hard to separate the error associated with phylogenetic reconstruction from true differences in phylogenetic signal

• Who is doing it?

– We don’t know. Same problem as above.

– Good evidence that it is much more frequent within (some) species than between

– Some evidence for a relationship with evolutionary distance and/or commonality of environment

Page 31:

SSU rDNA perspective

Page 32:

EVOLUTION: Genome Data Shake Tree of Life. E Pennisi - Science, 1998

The ring of life provides evidence for a genome fusion origin of eukaryotes. MC Rivera, JA Lake - Nature, 2004

The net of life: reconstructing the microbial phylogenetic network. V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research, 2005

The tree of one percent. T Dagan, W Martin - Genome Biology, 2006

Uprooting the tree of life. WF Doolittle - Evolution: a Scientific American reader, 2006

Page 33:

Page 34:

Comparison of phylogenies for nearly universally conserved genes

102 ML trees for 100 taxa

Objective – compare topological distance between trees

New metric called IS (inconsistency score) = fraction of splits two trees have in common

The network of similarities among the nearly universal trees (NUTs). (a) Each node (green dot) denotes a NUT, and nodes are connected by edges if the similarity between the respective trees exceeds the indicated threshold. (b) The connectivity of the 102 NUTs and the 14 1:1 NUTs depending on the topological similarity threshold.
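Comparing tree topologies by their splits can be illustrated with a small sketch. It is not the exact score used in the study: trees are written as hypothetical nested tuples of leaf names, only non-trivial splits are counted, and the "fraction in common" is taken as shared splits divided by all distinct splits in either tree, which is one of several possible denominators.

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions of leaves implied by the tree."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaf_set(node)
        other = all_leaves - clade
        # Skip splits that isolate a single leaf or nothing at all.
        if len(clade) > 1 and len(other) > 1:
            found.add(frozenset([clade, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def shared_split_fraction(tree1, tree2):
    """Shared splits divided by all distinct splits in either tree."""
    s1, s2 = splits(tree1), splits(tree2)
    if not s1 and not s2:
        return 1.0
    return len(s1 & s2) / len(s1 | s2)


# Two hypothetical five-taxon trees that disagree about the placement of B and C.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(shared_split_fraction(t1, t1))
print(shared_split_fraction(t1, t2))
```

Here both trees share the split separating D and E from the rest but disagree on the second split, so one of three distinct splits is shared.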

Page 35:

Real trees are more similar to each other than randomly simulated trees

Although no single tree appears to represent the evolutionary history of these organisms, there is distinctly preserved phylogenetic signal across the dataset as a whole

Page 36:

The big divide?

• Look for evidence of HGT between bacteria and archaea

• 56% of NUTs separated the groups perfectly

• 44% show at least one HGT

– 13% from archaea to bacteria

– 23% from bacteria to archaea

– 8% both directions

The supernetwork of the NUTs. Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Page 37:

Expanding to ~6800 other predicted ortholog clusters

• Network connectivity is greatly reduced

• Different functional categories of genes show different levels of connectedness

Network representation of the 6,901 trees of the forest of life. The 102 NUTs are shown as red circles in the middle. The NUTs are connected to trees with similar topologies: trees with at least 50% of similarity with at least one NUT (P-value < 0.05) are shown as purple circles and connected to the NUTs. The rest of the trees are shown as green circles.

Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Page 38:

Proc Natl Acad Sci U S A. 2005 Oct 4;102(40):14332-7.

Page 39:

Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis of the 22,348 protein trees for which a minimal edit path could be resolved

Beiko R G et al. PNAS 2005;102:14332-14337

©2005 by National Academy of Sciences

Page 40:

Horizontal Transfer within species

Within the species E. coli, a given base pair is estimated to be 100 times more likely to have undergone a recombination event than a point mutation. So how can we justify representing the relationships between strains with a tree-like structure?

Modeling and simulation support inference of a tree summarizing dominant signal AS LONG AS patterns of recombination are more or less random between lineages

Touchon et al. PLoS Genetics. 2009

Page 41:

Major processes affecting prokaryotic genome evolution (Koonin and Wolf, 2008)

(1) Genome streamlining under strong selection.

(2) Neutral gene loss and genome degradation under weak selection (or neutral).

(3) Innovation and complexification via gene duplication.

(4) Innovation via operon shuffling.

(5) Innovation and complexification via HGT, in particular, of partially selfish operons, a process that often leads to nonorthologous gene displacement.

(6) Replicon fusion, propagation of mobile elements and other interactions between the relatively stable chromosomes and the mobilome.

Page 42:

Test Yourself

• A1 – B1

• A1 – B2

• A1 – C3

• B1 – C2

• C2 – C3

• B2 – C3

• C3 – AB1

Page 43:

Test Yourself

• A1 – B1 = Ortho

• A1 – B2 = Ortho

• A1 – C3 = Ortho

• B1 – C2 = Para (out)

• C2 – C3 = Para (in)

• B2 – C3 = Ortho

• C3 – AB1 = Xeno