Phylogenomics Talk at UC Berkeley by J. A. Eisen

175
Phylogenomics and the Diversity and Diversification of Microbes Jonathan A. Eisen UC Davis UC Berkeley Talk February 3, 2011 Monday, February 14, 2011

description

Talk at UC Berkeley by Jonathan Eisen from UC Davis

Transcript of Phylogenomics Talk at UC Berkeley by J. A. Eisen

Page 1: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics and the Diversity and Diversification of Microbes

Jonathan A. EisenUC Davis

UC Berkeley TalkFebruary 3, 2011

Monday, February 14, 2011

Page 2: Phylogenomics Talk at UC Berkeley by J. A. Eisen

My Obsessions

Jonathan A. EisenUC Davis

UC Berkeley TalkFebruary 3, 2011

Monday, February 14, 2011

Page 3: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 4: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Social Networking in Science

Scientist Reveals Secret of the Ocean: It's HimBy NICHOLAS WADE Published: April 1, 2007

Maverick scientist J. Craig Venter has done it again. It was just a few years

ago that Dr. Venter announced that the human genome sequenced by Celera

Genomics was in fact, mostly his own. And now, Venter has revealed a second

twist in his genomic self-examination. Venter was discussing his Global

Ocean Voyage, in which he used his personal yacht to collect ocean water

samples from around the world. He then used large filtration units to collect

microbes from the water samples which were then brought back to his high

tech lab in Rockville, MD where he used the same methods that were used to

sequence the human genome to study the genomes of the 1000s of ocean

dwelling microbes found in each sample. In discussing the sampling methods, Venter let slip his

latest attack on the standards of science – some of the samples were in fact not from the ocean, but

were from microbial habitats in and on his body.

“The human microbiome is the next frontier,” Dr. Venter said. “The ocean voyage was just a cover.

My main goal has always been to work on the microbes that live in and on people. And now that my

genome is nearly complete, why not use myself as the model for human microbiome studies as well.

It is certainly true that in the last few years, the microbes that live in and on people have become a

hot research topic. So hot that the same people who were involved in the race to sequence the human

genome have been involved in this race too. Francis Collins, Venter main competitor and still the

director of the National Human Genome Research Institute (NHGRI), recently testified before

Congress regarding this type of work. He said, “There are more bacteria in the human gut than

human cells in the entire human body… The human microbiome project represents an exciting new

research area for NHGRI.” Other minor players in the public’s human genome effort, such as Eric

Lander at the Whitehead Institute and George Weinstock at Baylor College of Medicine are also

trying to muscle their way into studies of the human microbiome.

But Venter was not going to have any of this. “This time, I was not going to let them know I was

coming. There would be no artificially declared tie. We set up a cutting edge human microbiome

sampling system on the yacht, and then headed out to sea. They never knew what hit them. Now I

have finished my microbiome.”

Reactions among scientists range from amusement to indifference, most saying that it is

unimportant whose microbiome was sequenced. But a few scientists expressed disappointment that

Dr. Venter had once again subverted the normal system of anonymity. Recent human microbome

studies by other researchers have all involved anonymous donors. Jeff Gordon, at the Washington

University in St. Louis expressed astonishment, “I have to fill out about 200 forms for every sample.

It takes years to get anything done. And now Venter sails away with the prize. All I can say is, I will

never listen to one of my review boards again.”

Venter had hinted at the possibility that something was amiss in an interview he gave last week for

the BBC News. He said “Most of the samples we studied were from the ocean but a few were from

people.” When the interviewer seemed stunned, Doug Rusch, one of Venter’s collaborators stepped

in and said “Collected with the help of other people.”

Venter was apparently spurred to make the admission today that many of the samples were in fact

from his own microbiome due to a video that surfaced on YouTube showing Jeff Hoffman, the

nytimes.com/sports

How good is your bracket? Compare your tournament picksto choices from members of The New York Times sportsdesk and other players.

Also in Sports:

The Bracket Blog - all the news leading up to the FinalFour

Bats Blog: Spring training updates

Play Magazine: How to build a super athlete

Sunday, April 1, 2007 HealthWORLD U.S. N.Y. / REGION BUSINESS TECHNOLOGY SCIENCE HEALTH SPORTS OPINION ARTS STYLE TRAVEL JOBS REAL ESTATE AUTOS

FITNESS & NUTRITION HEALTH CARE POLICY MENTAL HEALTH & BEHAVIOR

PRINT

SINGLE-PAGE

SAVE

SHARE

SHARE

HOME PAGE MY TIMES TODAY'S PAPER VIDEO MOST POPULAR TIMES TOPICS Welcome, fcollins Member Center Log Out

Monday, February 14, 2011

Page 5: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Bacterial evolve

Monday, February 14, 2011

Page 6: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Monday, February 14, 2011

Page 7: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Mechanisms of Origin of New

Functions

Monday, February 14, 2011

Page 8: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Mechanisms of Origin of New

Functions

Variation in Mechanisms:

Patterns, Causes and Effects

Monday, February 14, 2011

Page 9: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Mechanisms of Origin of New

Functions

Species Evolution

Variation in Mechanisms:

Patterns, Causes and Effects

Monday, February 14, 2011

Page 10: Phylogenomics Talk at UC Berkeley by J. A. Eisen

• How does novelty originate?• Major categories of processes• From within

• De novo invention• Simple substitutions• Duplication and divergence• Domain shuffling• Small & large rearrangements• Regulatory changes

• From outside• Lateral gene transfer• Symbioses

Phylogenomics of Novelty

Monday, February 14, 2011

Page 11: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Mechanisms of Origin of New

Functions

• How does novelty originate?• Major categories of processes• From within

• De novo invention• Simple substitutions• Duplication and divergence• Domain shuffling• Small & large rearrangements• Regulatory changes

• From outside• Lateral gene transfer• Symbioses

Phylogenomics of Novelty

Monday, February 14, 2011

Page 12: Phylogenomics Talk at UC Berkeley by J. A. Eisen

• Patterns of variation• Within species• Between species

• Causes• Variation in replication, recombination and repair

• Effects• Differences in evolvability• Ecological niche• Short and long term genome evolution

Phylogenomics of Novelty

Monday, February 14, 2011

Page 13: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Variation in Mechanisms:

Patterns, Causes and Effects

• Patterns of variation• Within species• Between species

• Causes• Variation in replication, recombination and repair

• Effects• Differences in evolvability• Ecological niche• Short and long term genome evolution

Phylogenomics of Novelty

Monday, February 14, 2011

Page 14: Phylogenomics Talk at UC Berkeley by J. A. Eisen

• Information needed to distinguish convergence from homology• Allows inference of rates and patterns of change• Allows one to determine if something is a “one time” event or a common theme in many lineages

Phylogenomics of Novelty

Monday, February 14, 2011

Page 15: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Species Evolution

• Information needed to distinguish convergence from homology• Allows inference of rates and patterns of change• Allows one to determine if something is a “one time” event or a common theme in many lineages

Phylogenomics of Novelty

Monday, February 14, 2011

Page 16: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Mechanisms of Origin of New

Functions

Species Evolution

Variation in Mechanisms:

Patterns, Causes and Effects

Monday, February 14, 2011

Page 17: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty

Mechanisms of Origin of New

Functions

Species Evolution

Variation in Mechanisms:

Patterns, Causes and Effects

Focus Today on Using Sequence

Information for All of This

Monday, February 14, 2011

Page 18: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Why do this?

• Discover causes and effects of differences in evolvability

• Improve predictions from genome analysis• Guide interpretation of biological data

Monday, February 14, 2011

Page 19: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Outline

• Introduction• Phylogenomic Stories

– Within genome invention of novelty– Stealing novelty– Communities of microbes– Community service and knowing what we don’t

know

Monday, February 14, 2011

Page 20: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Introduction

Genome Sequencing

Monday, February 14, 2011

Page 21: Phylogenomics Talk at UC Berkeley by J. A. Eisen

rRNA Tree of Life

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Monday, February 14, 2011

Page 22: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Limited Sampling of RRR Studies

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Monday, February 14, 2011

Page 23: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Limited Sampling of RRR Studies

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Haloferax

Methanococcus

Deinococcus

Chlorobium

Thermotoga

Monday, February 14, 2011

Page 24: Phylogenomics Talk at UC Berkeley by J. A. Eisen

TIGR

Ecoli vs. Hvolcanii

1E-07

1E-06

1E-05

0.0001

0.001

0.01

0.1

1

RelativeSurvival

0 50 100 150 200 250 300 350 400

UV J/m2

UV Survival E.coli vs H.volcanii

H.volcanii WFD11

E.coli NR10125 mfd+

E.coli NR10121 mfd-

Monday, February 14, 2011

Page 25: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Label5#2

0

0.2

0.4

0.6

0 2000 4000 6000 8000 10000 12000 14000 16000 18000

Avg. Mol. Wt.(Base Pairs)

H. volcanii UV Repair Label 7 - 45J / m2)

45 J/m2 Dark 24 Hours

45 J/m2 Photoreac.

45 J/m2 t0

0 J/m2 t0

Monday, February 14, 2011

Page 26: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Fleischmann et al. 1995

Monday, February 14, 2011

Page 27: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Limited Sampling of RRR Studies

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Haloferax

Methanococcus

Deinococcus

Chlorobium

Thermotoga

Monday, February 14, 2011

Page 28: Phylogenomics Talk at UC Berkeley by J. A. Eisen

From http://genomesonline.orgMonday, February 14, 2011

Page 29: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 30: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 31: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 32: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Human commensals

Monday, February 14, 2011

Page 33: Phylogenomics Talk at UC Berkeley by J. A. Eisen

From http://genomesonline.orgMonday, February 14, 2011

Page 34: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty I

Origin of Functions from Within

Monday, February 14, 2011

Page 35: Phylogenomics Talk at UC Berkeley by J. A. Eisen

• How does novelty originate?• Major categories of processes• From within

• De novo invention• Simple substitutions• Duplication and divergence• Domain shuffling• Small & large rearrangements• Regulatory changes

• From outside• Lateral gene transfer• Symbioses

Phylogenomics of Novelty

Monday, February 14, 2011

Page 36: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Mechanisms of Origin of New

Functions

• How does novelty originate?• Major categories of processes• From within

• De novo invention• Simple substitutions• Duplication and divergence• Domain shuffling• Small & large rearrangements• Regulatory changes

• From outside• Lateral gene transfer• Symbioses

Phylogenomics of Novelty

Monday, February 14, 2011

Page 38: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Blast Search of H. pylori “MutS”

• Blast search pulls up Syn. sp MutS#2 with much higher p value than other MutS homologs

• Based on this TIGR predicted this species had mismatch repair

• Assumes functional constancy

Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078. Monday, February 14, 2011

Page 39: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Predicting Function• Identification of motifs

– Short regions of sequence similarity that are indicative of general activity

– e.g., ATP binding• Homology/similarity based methods

– Gene sequence is searched against a databases of other sequences

– If significant similar genes are found, their functional information is used

• Problem– Genes frequently have similarity to hundreds of motifs

and multiple genes, not all with the same function

Monday, February 14, 2011

Page 41: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Overlaying Functions onto Tree

Aquae Trepa

Rat

FlyXenla

MouseHumanYeast

NeucrArath

BorbuSynsp

Neigo

ThemaStrpy

Bacsu

Ecoli

TheaqDeiraChltr

SpombeYeast

YeastSpombe

MouseHuman

Arath

YeastHumanMouse

Arath

StrpyBacsu

HumanCeleg

YeastMetthBorbu

AquaeSynsp

Deira Helpy

mSaco

YeastCeleg

Human

MSH4

MSH5MutS2

MutS1

MSH1

MSH3

MSH6

MSH2

Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.

Monday, February 14, 2011

Page 42: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 43: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Evolutionary Functional Prediction

1 2 3 4 5 6

3 5

3

1A 2A 3A 1B 2B 3B

2A 1B

1

1 2

2

2 31

1A 3A

1A 2A 3A

1A 2A 3A

4 6

4 5 6

4 5 6

2B 3B

1B 2B 3B

1B 2B 3B

1A3A

1B 2B3B

12 4

62A

2A

53

5

EXAMPLE BMETHOD

Duplication?

Duplication?

IDENTIFY HOMOLOGS

OVERLAY KNOWNFUNCTIONS ONTO TREE

INFER LIKELY FUNCTIONOF GENE(S) OF INTEREST

ALIGN SEQUENCES

CALCULATE GENE TREE

CHOOSE GENE(S) OF INTEREST

Species 3Species 1 Species 2

ACTUAL EVOLUTION(ASSUMED TO BE UNKNOWN)

EXAMPLE A

Duplication?

Duplication

Ambiguous

Based on Eisen, 1998 Genome Res 8: 163-167.

Monday, February 14, 2011

Page 44: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Example 2: Recent Changes

E.coli gi1787690

B.subtilis gi2633766Synechocystis sp. gi1001299Synechocystis sp. gi1001300Synechocystis sp. gi1652276Synechocystis sp. gi1652103H.pylori gi2313716H.pylori99 gi4155097C.jejuni Cj1190cC.jejuni Cj1110cA.fulgidus gi2649560A.fulgidus gi2649548B.subtilis gi2634254B.subtilis gi2632630B.subtilis gi2635607B.subtilis gi2635608B.subtilis gi2635609B.subtilis gi2635610B.subtilis gi2635882E.coli gi1788195E.coli gi2367378E.coli gi1788194

E.coli gi1789453C.jejuni Cj0144C.jejuni Cj0262c

H.pylori gi2313186H.pylori99 gi4154603C.jejuni Cj1564

C.jejuni Cj1506cH.pylori gi2313163H.pylori99 gi4154575H.pylori gi2313179H.pylori99 gi4154599C.jejuni Cj0019cC.jejuni Cj0951cC.jejuni Cj0246cB.subtilis gi2633374T.maritima TM0014

T.pallidum gi3322777T.pallidum gi3322939T.pallidum gi3322938B.burgdorferi gi2688522T.pallidum gi3322296B.burgdorferi gi2688521T.maritima TM0429T.maritima TM0918T.maritima TM0023T.maritima TM1428T.maritima TM1143T.maritima TM1146P.abyssi PAB1308P.horikoshii gi3256846P.abyssi PAB1336P.horikoshii gi3256896P.abyssi PAB2066P.horikoshii gi3258290P.abyssi PAB1026P.horikoshii gi3256884D.radiodurans DRA00354D.radiodurans DRA0353D.radiodurans DRA0352P.abyssi PAB1189P.horikoshii gi3258414B.burgdorferi gi2688621M.tuberculosis gi1666149

V.cholerae VC0512V.cholerae VCA1034V.cholerae VCA0974V.cholerae VCA0068V.cholerae VC0825V.cholerae VC0282V.cholerae VCA0906V.cholerae VCA0979V.cholerae VCA1056V.cholerae VC1643V.cholerae VC2161V.cholerae VCA0923V.cholerae VC0514V.cholerae VC1868V.cholerae VCA0773V.cholerae VC1313V.cholerae VC1859V.cholerae VC1413V.cholerae VCA0268V.cholerae VCA0658V.cholerae VC1405V.cholerae VC1298V.cholerae VC1248V.cholerae VCA0864V.cholerae VCA0176V.cholerae VCA0220V.cholerae VC1289V.cholerae VCA1069V.cholerae VC2439V.cholerae VC1967V.cholerae VCA0031V.cholerae VC1898V.cholerae VCA0663V.cholerae VCA0988V.cholerae VC0216V.cholerae VC0449V.cholerae VCA0008V.cholerae VC1406V.cholerae VC1535V.cholerae VC0840

V.cholerae VC0098V.cholerae VCA1092

V.cholerae VC1403V.cholerae VCA1088

V.cholerae VC1394

V.cholerae VC0622

NJ

**

*****

******

****

***

****

**

*

****

**

**

******

******

*

****

******

***

***

***

****

**

*

****

*

• Phylogenomic functional prediction may not work well for very newly evolved functions

• Can use understanding of origin of novelty to better interpret these cases?

• Screen genomes for genes that have changed recently

– Pseudogenes and gene loss– Contingency Loci– Acquisition (e.g., LGT)– Unusual dS/dN ratios– Rapid evolutionary rates– Recent duplications

Monday, February 14, 2011

Page 45: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Tetrahymena Genome Processing

• Probably exists as a defense mechanism• Analogous to RIPPING and

heterochromatin silencing• Presence of repetitive DNA in MAC but

not TEs suggests the mechanism involves targeting foreign DNA

• Thus unlike RIPPING ciliate processing does not limit diversification by duplication

Eisen et al. 2006. PLoS Biology.Monday, February 14, 2011

Page 46: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Conclusions

• Enormous variation in processes underlying origin of novelty

• See within genomes -> between species• Knowledge about mechanisms and variation

helps predictions of function and biology from analysis of sequence data

Monday, February 14, 2011

Page 47: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty II

Sometimes, it is easier to steal, borrow, or coopt functions rather than evolve them

anew

Monday, February 14, 2011

Page 48: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Stealing DNA

Monday, February 14, 2011

Page 49: Phylogenomics Talk at UC Berkeley by J. A. Eisen

rRNA Tree of Life

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Monday, February 14, 2011

Page 50: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Perna et al. 2003Monday, February 14, 2011

Page 51: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Network of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Monday, February 14, 2011

Page 52: Phylogenomics Talk at UC Berkeley by J. A. Eisen

A. thaliana T1E2.8 is aChloroplast Derived HSP60

Monday, February 14, 2011

Page 53: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenetic Distribution Novelty: Bacterial Actin Related Protein

Haliangium ochraceum DSM 14365 Patrik D’haeseleer, Adam Zemla, Victor Kunin

!"#$%&'()*&& !"#$%&'(%()+"#,-.(/01 !"#*+,**'+(

2"#3)&4&*&& !"#*)$*),+%5"#$-.-6&0&1- !"#$%,$-%)(7"#0(1.8-9& !"#$''+-+,',!5"#:1,)*&$/0 !"#&$,%+)+-+

;"#01,&-*0 !"#%*+$--(<"#$-.-3.1%&0 !"#%',&'-+)

2"#$&*-.-1 !"#$'(-%%+&$="#$.1001 !"#-*$+$(&(>"#0$1,/%1.&0 !"#&$**+),)-!;"#01,&-*0 !"#*+,$*'(

5"#:1,)*&$/0 !"#&$,%+%-%%5"#$-.-6&0&1- !"#',&+$)*?"#@-%1*)A10(-. !"#&%'%&*%*B"#A1%%/0# "#%*,-&*'(2"#*-)').@1*0 !"#*-&'''(+5"#$-.-6&0&1- !"#',&&*&*?"#@-%1*)A10(-. !"#$)),)*%,;"#01,&-*0 !"#*+,$*),!;"#)$C.1$-/@ !"#&&),(*((-

."#,1(-*0 !"#$'-+*$((&!!"#(C1%&1*1 !"#$-,(%'+-!

5"#$-.-6&0&1- !"#$++-&%%!

?"#@-%1*)A10(-. !"#$)),),%)

?"#C1*0-*&&!"#&$-*$$(&$5"#$-.-6&0&1- !"#',&,$$%

5"#:1,)*&$/0 !"#&$,%+-,(,!5"#$-.-6&0&1- !"#$,+$(,&

?"#4&0$)&4-/@ !"#''-+&%$-

D"#01(&61 !"#$-&'*)%&+!!"#(C1%&1*1!"#$-%$ $),)

?"#@-%1*)A1(-. !"#$((&+,*-<"#@/0$/%/0 !"#&&'&%'*(,

((

')

$++$++

'*

$++

$++

)*

$++

$++

*$

((),

$++()

(%$++

)%

$++

-)

$++

+/*!

!"#$%

!&'(

!&')

!&'*

+!&'

!&',

!&'-

!&'.

!&'/

!&'(0

See also Guljamow et al. 2007 Current Biology. Wu et al. 2009 Nature 462, 1056-1060

Monday, February 14, 2011

Page 54: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Correlated gain/loss of genes

• Microbial genes are lost rapidly when not maintained by selection

• Genes can be acquired by lateral transfer• Frequently gain and loss occurs for entire

pathways/processes• Thus might be able to use correlated

presence/absence information to identify genes with similar functions

Monday, February 14, 2011

Page 55: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Non-Homology Predictions: Phylogenetic Profiling

• Step 1: Search all genes in organisms of interest against all other genomes

• Ask: Yes or No, is each gene found in each other species

• Cluster genes by distribution patterns (profiles)

Monday, February 14, 2011

Page 56: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Carboxydothermus hydrogenoformans

• Isolated from a Russian hotspring• Thermophile (grows at 80°C)• Anaerobic• Grows very efficiently on CO

(Carbon Monoxide)• Produces hydrogen gas• Low GC Gram positive

(Firmicute)• Genome Determined (Wu et al.

2005 PLoS Genetics 1: e65. )

Monday, February 14, 2011

Page 60: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Stealing Organisms (Symbioses)

Monday, February 14, 2011

Page 61: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Mutualistic Genome Evolution

• Compare and contrast different types of mutualistic symbioses

• Diverse hosts, symbionts, biology, ages• Organelles, chemosymbioses,

photosynthetic symbioses, nutritional symbioses

• What are the rules & patterns?

Monday, February 14, 2011

Page 62: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Glassy Winged Sharpshooter

• Feeds on xylem sap

• Vector for Pierce’s Disease

• Potential bioterror agent

Monday, February 14, 2011

Page 63: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Sharpshooter Shotgun Sequencing

shotgun

Wu et al. 2006 PLoS Biology 4: e188.Collaboration with Nancy Moran’s lab

Monday, February 14, 2011

Page 64: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 65: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 66: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 67: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Higher Evolutionary Rates in Endosymbionts

Wu et al. 2006 PLoS Biology 4: e188. Collaboration with Nancy Moran’ s LabMonday, February 14, 2011

Page 68: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Variation in Evolution Rates

MutS MutL

+ +

+ +

+ +

+ +

_ _

_ _

Wu et al. 2006 PLoS Biology 4: e188. Collaboration with Nancy Moran’ s LabMonday, February 14, 2011

Page 69: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Polymorphisms in Metapopulation

• Data from ~200 hosts– 104 SNPs– 2 indels

• PCR surveys show that this is between host variation

• Much lower ratio of transitions:transversions than in Blochmannia

• Consistent with absence of MMR from Blochmannia

Monday, February 14, 2011

Page 71: Phylogenomics Talk at UC Berkeley by J. A. Eisen

No Amino-Acid Synthesis

Monday, February 14, 2011

Page 72: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 73: Phylogenomics Talk at UC Berkeley by J. A. Eisen

The Uncultured Majority

Monday, February 14, 2011

Page 74: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Great Plate Count Anomaly

Culturing Microscope

CountCount

Monday, February 14, 2011

Page 75: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Great Plate Count Anomaly

Culturing Microscope

CountCount <<<<

Monday, February 14, 2011

Page 76: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Great Plate Count Anomaly

Culturing Microscope

CountCount <<<<

DNA

Monday, February 14, 2011

Page 77: Phylogenomics Talk at UC Berkeley by J. A. Eisen

rRNA PCR

The Hidden Majority Richness estimates

Bohannan and Hughes 2003Hugenholtz 2002

Monday, February 14, 2011

Page 78: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 79: Phylogenomics Talk at UC Berkeley by J. A. Eisen

rRNA data increasing exponentially tooMonday, February 14, 2011

Page 80: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Perna et al. 2003Monday, February 14, 2011

Page 81: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Metagenomics

shotgun

clone

Monday, February 14, 2011

Page 82: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 83: Phylogenomics Talk at UC Berkeley by J. A. Eisen

How can we best use metagenomic data?

• Many possible uses including:– Improvements on rRNA based phylotyping and

species diversity measurements– Adding functional information on top of

phylogenetic/species diversity information• Most/all possible uses either require or are

improved with phylogenetic analysis

Monday, February 14, 2011

Page 84: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Example I: Phylotyping with rRNA and other genes

Monday, February 14, 2011

Page 85: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Functional Diversity of Proteorhodopsins?

Venter et al., 2004Monday, February 14, 2011

Page 86: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 87: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Example II: Binning

Monday, February 14, 2011

Page 88: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Metagenomics Challenge

Monday, February 14, 2011

Page 89: Phylogenomics Talk at UC Berkeley by J. A. Eisen

ABCDEFG

TUVWXYZ

Binning challenge

Monday, February 14, 2011

Page 90: Phylogenomics Talk at UC Berkeley by J. A. Eisen

ABCDEFG

TUVWXYZ

Binning challenge

Best binning method: reference genomes

Monday, February 14, 2011

Page 91: Phylogenomics Talk at UC Berkeley by J. A. Eisen

ABCDEFG

TUVWXYZ

Binning challenge

Best binning method: reference genomes

Monday, February 14, 2011

Page 92: Phylogenomics Talk at UC Berkeley by J. A. Eisen

ABCDEFG

TUVWXYZ

Binning challenge

No reference genome? What do you do?

Monday, February 14, 2011

Page 93: Phylogenomics Talk at UC Berkeley by J. A. Eisen

ABCDEFG

TUVWXYZ

Binning challenge

No reference genome? What do you do?

Phylogeny ....Monday, February 14, 2011

Page 94: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 95: Phylogenomics Talk at UC Berkeley by J. A. Eisen

No Amino-Acid Synthesis

Monday, February 14, 2011

Page 96: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 97: Phylogenomics Talk at UC Berkeley by J. A. Eisen

???????

Monday, February 14, 2011

Page 98: Phylogenomics Talk at UC Berkeley by J. A. Eisen

CFB Phyla

Monday, February 14, 2011

Page 99: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Wu et al. 2006 PLoS Biology 4: e188.

Baumannia makes vitamins and cofactors

Sulcia makes amino acids

Monday, February 14, 2011

Page 100: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenomics of Novelty III

Knowing What We Don’t Know

Monday, February 14, 2011

Page 101: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Research Topics

Mechanisms of Origin of New

Functions

Species Evolution

Variation in Mechanisms:

Patterns, Causes and Effects

Monday, February 14, 2011

Page 102: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Research Topics

Mechanisms of Origin of New

Functions

Species Evolution

Variation in Mechanisms:

Patterns, Causes and Effects

Monday, February 14, 2011

Page 103: Phylogenomics Talk at UC Berkeley by J. A. Eisen

As of 2002

Monday, February 14, 2011

Page 104: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 105: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 106: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 107: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 108: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Solution I: sequence more phyla

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen, Ward, Robb, Nelson, et al

Monday, February 14, 2011

Page 109: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 110: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Still highly biased in terms of the tree

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Monday, February 14, 2011

Page 111: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Archaea

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Monday, February 14, 2011

Page 112: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Eukaryotes

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Monday, February 14, 2011

Page 113: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Viruses

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Monday, February 14, 2011

Page 114: Phylogenomics Talk at UC Berkeley by J. A. Eisen

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Solution: Really Fill in the Tree

• GEBA• A genomic

encyclopedia of bacteria and archaea

Eisen & Ward, PIs

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

Monday, February 14, 2011

Page 115: Phylogenomics Talk at UC Berkeley by J. A. Eisen

http://www.jgi.doe.gov/programs/GEBA/pilot.htmlMonday, February 14, 2011

Page 116: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Pilot Project: Components• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan

Eisen, Eddy Rubin, Jim Bristow)• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus,

Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et

al)• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor

Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)

• Adopt a microbe education project (Cheryl Kerfeld)• Outreach (David Gilbert)• $$$ (DOE, Eddy Rubin, Jim Bristow)

Monday, February 14, 2011

Page 117: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Pilot Project Overview

• Identify major branches in rRNA tree for which no genomes are available

• Identify those with a cultured representative in DSMZ

• DSMZ grew > 200 of these and prepped DNA• Sequence and finish 100+ (covering breadth of

bacteriałarchaea diversity)• Annotate, analyze, release data• Assess benefits of tree guided sequencing• 1st paper Wu et al in Nature Dec 2009

Monday, February 14, 2011

Page 118: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Phylogenomic Lesson 1

The rRNA Tree of Life is a Useful Tool for Identifying Phylogenetically Novel

Genomes

Monday, February 14, 2011

Page 119: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Compare PD in Trees

From Wu et al. 2009 Nature 462, 1056-1060Monday, February 14, 2011

Page 120: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Phylogenomic Lesson 2

The rRNA Tree of Life is not perfect ...

Monday, February 14, 2011

Page 123: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 124: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Phylogenomic Lesson 3

Phylogeny-driven genome selection helps discover new genetic diversity

Monday, February 14, 2011

Page 125: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Network of Life

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Archaea

Eukaryotes

Bacteria

Monday, February 14, 2011

Page 126: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Protein Family Rarefaction Curves

• Take data set of multiple complete genomes• Identify all protein families using MCL• Plot # of genomes vs. # of protein families

Monday, February 14, 2011

Page 132: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Synapomorphies exist

Wu et al. 2009 Nature 462, 1056-1060

Monday, February 14, 2011

Page 133: Phylogenomics Talk at UC Berkeley by J. A. Eisen

! !

!"#$%"&'(%)"*

+,%-./&#(%)"*

Monday, February 14, 2011

Page 134: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenetic Distribution Novelty: Bacterial Actin Related Protein

Haliangium ochraceum DSM 14365 Patrik D’haeseleer, Adam Zemla, Victor Kunin

!"#$%&'()*&& !"#$%&'(%()+"#,-.(/01 !"#*+,**'+(

2"#3)&4&*&& !"#*)$*),+%5"#$-.-6&0&1- !"#$%,$-%)(7"#0(1.8-9& !"#$''+-+,',!5"#:1,)*&$/0 !"#&$,%+)+-+

;"#01,&-*0 !"#%*+$--(<"#$-.-3.1%&0 !"#%',&'-+)

2"#$&*-.-1 !"#$'(-%%+&$="#$.1001 !"#-*$+$(&(>"#0$1,/%1.&0 !"#&$**+),)-!;"#01,&-*0 !"#*+,$*'(

5"#:1,)*&$/0 !"#&$,%+%-%%5"#$-.-6&0&1- !"#',&+$)*?"#@-%1*)A10(-. !"#&%'%&*%*B"#A1%%/0# "#%*,-&*'(2"#*-)').@1*0 !"#*-&'''(+5"#$-.-6&0&1- !"#',&&*&*?"#@-%1*)A10(-. !"#$)),)*%,;"#01,&-*0 !"#*+,$*),!;"#)$C.1$-/@ !"#&&),(*((-

."#,1(-*0 !"#$'-+*$((&!!"#(C1%&1*1 !"#$-,(%'+-!

5"#$-.-6&0&1- !"#$++-&%%!

?"#@-%1*)A10(-. !"#$)),),%)

?"#C1*0-*&&!"#&$-*$$(&$5"#$-.-6&0&1- !"#',&,$$%

5"#:1,)*&$/0 !"#&$,%+-,(,!5"#$-.-6&0&1- !"#$,+$(,&

?"#4&0$)&4-/@ !"#''-+&%$-

D"#01(&61 !"#$-&'*)%&+!!"#(C1%&1*1!"#$-%$ $),)

?"#@-%1*)A1(-. !"#$((&+,*-<"#@/0$/%/0 !"#&&'&%'*(,

((

')

$++$++

'*

$++

$++

)*

$++

$++

*$

((),

$++()

(%$++

)%

$++

-)

$++

+/*!

!"#$%

!&'(

!&')

!&'*

+!&'

!&',

!&'-

!&'.

!&'/

!&'(0

See also Guljamow et al. 2007 Current Biology. Wu et al. 2009 Nature 462, 1056-1060

Monday, February 14, 2011

Page 135: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Phylogenomic Lesson 4

Phylogeny driven genome selection (and phylogenetics in general) improves genome annotation

Monday, February 14, 2011

Page 136: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling

• Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes

• Better definition of protein family sequence “patterns”• Greatly improves “comparative” and “evolutionary”

based predictions• Conversion of hypothetical into conserved hypotheticals• Linking distantly related members of protein families• Improved non-homology predictionKostas

MavrommatisNatalia Ivanova

Thanos Lykidis

Nikos Kyrpides

Iain Anderson

Monday, February 14, 2011

Page 137: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Phylogenomic Lesson 5

Improves analysis of genome data from uncultured organisms

Monday, February 14, 2011

Page 138: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 139: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

Cannot be done without good sampling of genomes

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 140: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Phylogenetic Binning

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 141: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

Cannot be done without good sampling of genomes

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 142: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

GEBA Project improves metagenomic analysis

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 143: Phylogenomics Talk at UC Berkeley by J. A. Eisen

131

GEBA CyanoSequencing status (as of 01/14):

Awaiting  Material 11Library 12Production 22Finishing 5

Grand  Total 50

On-going/ Planed Activities:

- Building Cyanobacterial Metadatabase (IMG-GOLD)

- 10th Cyanobacterial Molecular Biology Workshop, Lake Arrowhead, CA (06/10)

--> Cheryl will host: Workshop training as prep for virtual Jamboree

Monday, February 14, 2011

Page 144: Phylogenomics Talk at UC Berkeley by J. A. Eisen

132

Plan: Sequence multiple Root Nodule Bacteria (RNBs) across

the planet. Pilot: 100 RNBs.

Alpha RNB

BradyrhizobiumMesorhizobiumRhizobium

Beta RNB

Sinorhizobium

CupriavidisBurkholderia

Balneimonas-like

DevosiaOchrobactrumPhyllobacterium

AzorhizobiumAllorhizobium

GEBA RNB

Goal: • Understand BioGeographical effects on species

evolution and understand host-specificity.

Rationale: • N2 fixation by legume pastures and crops provides 65% of

the N currently utilized in agricultural production.

• Contributes 25 to 90 million metric tones N pa.

• Symbioses save $US 6-10 billion annually on N fertilizer.

• Grain and animal production enhanced by fixed nitrogen supplied by the symbiosis.

Nikos KyrpidesMonday, February 14, 2011

Page 145: Phylogenomics Talk at UC Berkeley by J. A. Eisen

133

Monday, February 14, 2011

Page 146: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Still not happy

• NSF-funded Tree of Life Project

• A genome from each of eight phyla

Eisen & Ward, PIs

Monday, February 14, 2011

Page 147: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0

0.1250

0.2500

0.3750

0.5000

Alphaproteobacteria

Betaproteobacteria

Gammaproteobacteria

Epsilonproteobacteria

Deltaproteobacteria

Cyanobacteria

Firmicutes

Actinobacteria

Chlorobi

CFB

Chloroflexi

Spirochaetes

Fusobacteria

Deinococcus-Thermus

Euryarchaeota

Crenarchaeota

Sargasso Phylotypes

Wei

ght

ed %

of

Clo

nes

Major Phylogenetic Group

EFGEFTuHSP70RecARpoBrRNA

Shotgun Sequencing Allows Use of Other Markers

GEBA Project improves metagenomic analysis, but only a little

Venter et al., Science 304: 66-74. 2004Monday, February 14, 2011

Page 148: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Future 1

Need to adapt genomic and metagenomic methods to make use of

GEBA data

Monday, February 14, 2011

Page 149: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Ways to Make Better Use of GEBA Data

• Better phylogenetic methods for short reads• Rebuild protein family models• New phylogenetic markers• Need better phylogenies, including HGT• Improved tools for using distantly related

genomes in metagenomic analysis

Monday, February 14, 2011

Page 150: Phylogenomics Talk at UC Berkeley by J. A. Eisen

iSEEM Project

Monday, February 14, 2011

Page 151: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 152: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 153: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 154: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Future 3

We have still only scratched the surface of microbial diversity

Monday, February 14, 2011

Page 155: Phylogenomics Talk at UC Berkeley by J. A. Eisen

rRNA Tree of Life

FIgure from Barton, Eisen et al. “Evolution”, CSHL Press.

Based on tree from Pace NR, 2003.

Monday, February 14, 2011

Page 158: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenetic Diversity: Isolates

From Wu et al. 2009 Nature 462, 1056-1060Monday, February 14, 2011

Page 159: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Phylogenetic Diversity: All

From Wu et al. 2009 Nature 462, 1056-1060Monday, February 14, 2011

Page 160: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Most phyla with cultured species are sparsely sampled

• Lineages with no cultured taxa even more poorly sampled

Well sampled phylaPoorly sampledNo cultured taxa

Monday, February 14, 2011

Page 161: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Uncultured Lineages:Technical Approaches

• Get into culture• Enrichment cultures• If abundant in low diversity ecosystems• Flow sorting• Microbeads• Microfluidic sorting• Single cell amplification

Monday, February 14, 2011

Page 162: Phylogenomics Talk at UC Berkeley by J. A. Eisen

150

Number of SAGs from Candidate Phyla

OD

1

OP

11

OP

3

SA

R4

06

Site A: Hydrothermal vent 4 1 - -Site B: Gold Mine 6 13 2 -Site C: Tropical gyres (Mesopelagic) - - - 2Site D: Tropical gyres (Photic zone) 1 - - -

Sample collections at 4 additional sites are underway.

Phil Hugenholtz

GEBA uncultured

Monday, February 14, 2011

Page 163: Phylogenomics Talk at UC Berkeley by J. A. Eisen

GEBA Future 2

Need Experiments from Across the Tree of Life too

Monday, February 14, 2011

Page 164: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 165: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Experimental studies are mostly from three phyla

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 166: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Experimental studies are mostly from three phyla

• Some studies in other phyla

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 167: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Eukaryotes

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 168: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

• At least 40 phyla of bacteria

• Genome sequences are mostly from three phyla

• Some other phyla are only sparsely sampled

• Same trend in Viruses

As of 2002

Based on Hugenholtz, 2002

Monday, February 14, 2011

Page 169: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0.1

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

Tree based on Hugenholtz (2002) with some modifications.

Need experimental studies from across the tree too

Monday, February 14, 2011

Page 170: Phylogenomics Talk at UC Berkeley by J. A. Eisen

0.1

Acidobacteria

Bacteroides

Fibrobacteres

Gemmimonas

Verrucomicrobia

Planctomycetes

Chloroflexi

Proteobacteria

Chlorobi

FirmicutesFusobacteria Actinobacteria

Cyanobacteria

Chlamydia

Spriochaetes

Deinococcus-Thermus

Aquificae

Thermotogae

TM6OS-K

Termite GroupOP8

Marine GroupAWS3

OP9

NKB19

OP3

OP10

TM7

OP1OP11

Nitrospira

SynergistesDeferribacteres

Thermudesulfobacteria

Chrysiogenetes

Thermomicrobia

Dictyoglomus

Coprothmermobacter

Tree based on Hugenholtz (2002) with some modifications.

Adopt a Microbe

Monday, February 14, 2011

Page 171: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Conclusion

• Phylogenetic sampling of genomes improves our understanding of microbial diversity in many ways

• Still need– More biogeography– More phenotypic/experimental data– Deeper phylogenetic sampling

Monday, February 14, 2011

Page 172: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Monday, February 14, 2011

Page 173: Phylogenomics Talk at UC Berkeley by J. A. Eisen

MICROBES

Monday, February 14, 2011

Page 174: Phylogenomics Talk at UC Berkeley by J. A. Eisen

A Happy Tree of Life

Monday, February 14, 2011

Page 175: Phylogenomics Talk at UC Berkeley by J. A. Eisen

Acknowledgements

• GEBA: DOE-JGI• GWSS: Nancy Moran & lab, Dongying Wu• iSEEM: Katie Pollard, Jessica Green,

Martin Wu• RecA: Dongying Wu, Craig Venter, Doug

Rusch, et al.

Monday, February 14, 2011