Phylogenomics Talk at UC Berkeley by J. A. Eisen
Transcript of Phylogenomics Talk at UC Berkeley by J. A. Eisen
Phylogenomics and the Diversity and Diversification of Microbes
Jonathan A. Eisen, UC Davis
UC Berkeley Talk, February 3, 2011
Monday, February 14, 2011
My Obsessions
Social Networking in Science
Scientist Reveals Secret of the Ocean: It's Him
By NICHOLAS WADE. Published: April 1, 2007
Maverick scientist J. Craig Venter has done it again. It was just a few years
ago that Dr. Venter announced that the human genome sequenced by Celera
Genomics was in fact, mostly his own. And now, Venter has revealed a second
twist in his genomic self-examination. Venter was discussing his Global
Ocean Voyage, in which he used his personal yacht to collect ocean water
samples from around the world. He then used large filtration units to collect
microbes from the water samples which were then brought back to his high
tech lab in Rockville, MD where he used the same methods that were used to
sequence the human genome to study the genomes of the 1000s of ocean
dwelling microbes found in each sample. In discussing the sampling methods, Venter let slip his
latest attack on the standards of science – some of the samples were in fact not from the ocean, but
were from microbial habitats in and on his body.
“The human microbiome is the next frontier,” Dr. Venter said. “The ocean voyage was just a cover.
My main goal has always been to work on the microbes that live in and on people. And now that my
genome is nearly complete, why not use myself as the model for human microbiome studies as well?”
It is certainly true that in the last few years, the microbes that live in and on people have become a
hot research topic. So hot that the same people who were involved in the race to sequence the human
genome have been involved in this race too. Francis Collins, Venter's main competitor and still the
director of the National Human Genome Research Institute (NHGRI), recently testified before
Congress regarding this type of work. He said, “There are more bacteria in the human gut than
human cells in the entire human body… The human microbiome project represents an exciting new
research area for NHGRI.” Other minor players in the public’s human genome effort, such as Eric
Lander at the Whitehead Institute and George Weinstock at Baylor College of Medicine are also
trying to muscle their way into studies of the human microbiome.
But Venter was not going to have any of this. “This time, I was not going to let them know I was
coming. There would be no artificially declared tie. We set up a cutting edge human microbiome
sampling system on the yacht, and then headed out to sea. They never knew what hit them. Now I
have finished my microbiome.”
Reactions among scientists range from amusement to indifference, most saying that it is
unimportant whose microbiome was sequenced. But a few scientists expressed disappointment that
Dr. Venter had once again subverted the normal system of anonymity. Recent human microbiome
studies by other researchers have all involved anonymous donors. Jeff Gordon, at the Washington
University in St. Louis expressed astonishment, “I have to fill out about 200 forms for every sample.
It takes years to get anything done. And now Venter sails away with the prize. All I can say is, I will
never listen to one of my review boards again.”
Venter had hinted at the possibility that something was amiss in an interview he gave last week for
the BBC News. He said “Most of the samples we studied were from the ocean but a few were from
people.” When the interviewer seemed stunned, Doug Rusch, one of Venter’s collaborators stepped
in and said “Collected with the help of other people.”
Venter was apparently spurred to make the admission today that many of the samples were in fact
from his own microbiome due to a video that surfaced on YouTube showing Jeff Hoffman, the
Bacteria evolve
Phylogenomics of Novelty
Mechanisms of Origin of New Functions
Species Evolution
Variation in Mechanisms: Patterns, Causes and Effects
Mechanisms of Origin of New Functions
• How does novelty originate?
• Major categories of processes
• From within
– De novo invention
– Simple substitutions
– Duplication and divergence
– Domain shuffling
– Small & large rearrangements
– Regulatory changes
• From outside
– Lateral gene transfer
– Symbioses
Variation in Mechanisms: Patterns, Causes and Effects
• Patterns of variation
– Within species
– Between species
• Causes
– Variation in replication, recombination and repair
• Effects
– Differences in evolvability
– Ecological niche
– Short and long term genome evolution
Species Evolution
• Information needed to distinguish convergence from homology
• Allows inference of rates and patterns of change
• Allows one to determine if something is a “one time” event or a common theme in many lineages
Phylogenomics of Novelty
Mechanisms of Origin of New Functions
Species Evolution
Variation in Mechanisms: Patterns, Causes and Effects
Focus Today on Using Sequence Information for All of This
Why do this?
• Discover causes and effects of differences in evolvability
• Improve predictions from genome analysis
• Guide interpretation of biological data
Outline
• Introduction
• Phylogenomic Stories
– Within genome invention of novelty
– Stealing novelty
– Communities of microbes
– Community service and knowing what we don’t know
Introduction
Genome Sequencing
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Limited Sampling of RRR Studies
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Haloferax
Methanococcus
Deinococcus
Chlorobium
Thermotoga
TIGR
[Figure: UV Survival, E. coli vs. H. volcanii. Relative survival (log scale, 1E-07 to 1) against UV dose (0 to 400 J/m2). Series: H. volcanii WFD11; E. coli NR10125 mfd+; E. coli NR10121 mfd-.]
[Figure: H. volcanii UV Repair (45 J/m2). x-axis: Avg. Mol. Wt. (base pairs), 0 to 18000. Series: 45 J/m2 Dark 24 Hours; 45 J/m2 Photoreac.; 45 J/m2 t0; 0 J/m2 t0.]
Fleischmann et al. 1995
From http://genomesonline.org
Human commensals
From http://genomesonline.org
Phylogenomics of Novelty I
Origin of Functions from Within
From Eisen et al. 1997 Nature Medicine 3: 1076-1078.
Blast Search of H. pylori “MutS”
• Blast search pulls up Syn. sp MutS#2 with a much more significant p value than other MutS homologs
• Based on this, TIGR predicted this species had mismatch repair
• Assumes functional constancy
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
Predicting Function
• Identification of motifs
– Short regions of sequence similarity that are indicative of general activity
– e.g., ATP binding
• Homology/similarity based methods
– Gene sequence is searched against a database of other sequences
– If significantly similar genes are found, their functional information is used
• Problem
– Genes frequently have similarity to hundreds of motifs and multiple genes, not all with the same function
MutL??
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
Overlaying Functions onto Tree
[Figure: gene tree with known functions overlaid. Labeled subfamilies: MSH1, MSH2, MSH3, MSH4, MSH5, MSH6, MutS1, MutS2.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
Evolutionary Functional Prediction
[Figure: outline of evolutionary functional prediction. METHOD: choose gene(s) of interest; identify homologs; align sequences; calculate gene tree; overlay known functions onto tree; infer likely function of gene(s) of interest. Panels EXAMPLE A and EXAMPLE B show gene trees for genes in Species 1, Species 2 and Species 3 alongside the actual evolution (assumed to be unknown), with inferred duplications and ambiguous cases marked.]
Based on Eisen, 1998 Genome Res 8: 163-167.
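The "overlay known functions onto tree, then infer" steps can be sketched in a few lines. This is a minimal illustration only, not the method from the cited paper: it assumes the gene tree is already computed and given as nested tuples, and it predicts the function of a query gene from the annotated genes in the smallest clade that contains the query and at least one annotated gene. The gene names and function labels are hypothetical.

```python
def leaves(tree):
    """Return the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def predict_function(tree, known, query):
    """Overlay known functions onto the tree and infer the query's function.

    Returns the set of functions found in the smallest clade that holds the
    query plus at least one annotated gene. One element means a clear
    prediction; several elements mean the prediction is ambiguous.
    """
    if isinstance(tree, str):
        return set()  # a single leaf carries no neighbours to learn from
    for child in tree:
        if query in leaves(child):
            found = predict_function(child, known, query)
            if found:
                return found
            # Nothing annotated deeper down: use annotations in this clade.
            return {known[name] for name in leaves(tree)
                    if name in known and name != query}
    return set()


# Hypothetical gene tree: two subfamilies produced by an old duplication.
gene_tree = (("geneA_sp1", ("query", "geneA_sp2")),
             ("geneB_sp1", "geneB_sp2"))
known_functions = {
    "geneA_sp1": "function A",
    "geneA_sp2": "function A",
    "geneB_sp1": "function B",
    "geneB_sp2": "function B",
}
prediction = predict_function(gene_tree, known_functions, "query")
```

A top-hit similarity search would transfer whichever annotation belongs to the single most similar sequence; here the prediction follows the subfamily the query falls in, and a query that branches outside both subfamilies comes back with both functions, flagging it as ambiguous.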
Example 2: Recent Changes
[Figure: neighbor-joining (NJ) gene tree of homologs from E. coli, B. subtilis, Synechocystis sp., H. pylori, C. jejuni, A. fulgidus, T. maritima, T. pallidum, B. burgdorferi, P. abyssi, P. horikoshii, D. radiodurans, M. tuberculosis and V. cholerae, including a long run of V. cholerae genes.]
• Phylogenomic functional prediction may not work well for very newly evolved functions
• Can we use understanding of the origin of novelty to better interpret these cases?
• Screen genomes for genes that have changed recently
– Pseudogenes and gene loss
– Contingency loci
– Acquisition (e.g., LGT)
– Unusual dS/dN ratios
– Rapid evolutionary rates
– Recent duplications
Tetrahymena Genome Processing
• Probably exists as a defense mechanism
• Analogous to RIPPING and heterochromatin silencing
• Presence of repetitive DNA in MAC but not TEs suggests the mechanism involves targeting foreign DNA
• Thus, unlike RIPPING, ciliate processing does not limit diversification by duplication
Eisen et al. 2006. PLoS Biology.
Conclusions
• Enormous variation in processes underlying origin of novelty
• See within genomes -> between species
• Knowledge about mechanisms and variation helps predictions of function and biology from analysis of sequence data
Phylogenomics of Novelty II
Sometimes, it is easier to steal, borrow, or coopt functions rather than evolve them anew
Stealing DNA
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
Perna et al. 2003
Network of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Archaea
Eukaryotes
Bacteria
A. thaliana T1E2.8 is a Chloroplast Derived HSP60
Phylogenetic Distribution Novelty: Bacterial Actin Related Protein
Haliangium ochraceum DSM 14365 Patrik D’haeseleer, Adam Zemla, Victor Kunin
[Figure: tree for the bacterial actin related protein; taxon labels and support values are not recoverable from the extraction.]
See also Guljamow et al. 2007 Current Biology. Wu et al. 2009 Nature 462, 1056-1060
Correlated gain/loss of genes
• Microbial genes are lost rapidly when not maintained by selection
• Genes can be acquired by lateral transfer
• Frequently, gain and loss occur for entire pathways/processes
• Thus one might be able to use correlated presence/absence information to identify genes with similar functions
Non-Homology Predictions: Phylogenetic Profiling
• Step 1: Search all genes in organisms of interest against all other genomes
• Ask: Yes or No, is each gene found in each other species?
• Cluster genes by distribution patterns (profiles)
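The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the homology search is taken as already done, so the input is a table of which genomes contain a homolog of each gene, and "clustering" is reduced to grouping genes with identical presence/absence profiles (real analyses would usually allow near matches). All gene and genome names are hypothetical.

```python
from collections import defaultdict


def build_profiles(presence, genomes):
    """Turn 'gene -> genomes with a homolog' into yes/no (1/0) profiles."""
    return {
        gene: tuple(int(genome in hits) for genome in genomes)
        for gene, hits in presence.items()
    }


def cluster_by_profile(profiles):
    """Group genes that share exactly the same presence/absence profile."""
    clusters = defaultdict(list)
    for gene, profile in profiles.items():
        clusters[profile].append(gene)
    return {profile: sorted(genes) for profile, genes in clusters.items()}


# Hypothetical search results: which genomes hold a homolog of each gene.
genomes = ["genome1", "genome2", "genome3", "genome4"]
presence = {
    "geneP": {"genome1", "genome3"},
    "geneQ": {"genome1", "genome3"},
    "geneR": {"genome1", "genome2", "genome3", "genome4"},
    "geneS": {"genome2"},
}
profiles = build_profiles(presence, genomes)
clusters = cluster_by_profile(profiles)
```

Here geneP and geneQ end up in the same cluster because they are present in, and absent from, the same genomes, which is the signal used to suggest that two genes take part in the same process even without sequence similarity between them.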
Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (carbon monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome determined (Wu et al. 2005 PLoS Genetics 1: e65.)
Homologs of Sporulation Genes
Wu et al. 2005 PLoS Genetics 1: e65.
Carboxydothermus sporulates
Wu et al. 2005 PLoS Genetics 1: e65.
Stealing Organisms (Symbioses)
Mutualistic Genome Evolution
• Compare and contrast different types of mutualistic symbioses
• Diverse hosts, symbionts, biology, ages
• Organelles, chemosymbioses, photosynthetic symbioses, nutritional symbioses
• What are the rules & patterns?
Glassy Winged Sharpshooter
• Feeds on xylem sap
• Vector for Pierce’s Disease
• Potential bioterror agent
Sharpshooter Shotgun Sequencing
shotgun
Wu et al. 2006 PLoS Biology 4: e188. Collaboration with Nancy Moran’s lab
Higher Evolutionary Rates in Endosymbionts
Wu et al. 2006 PLoS Biology 4: e188. Collaboration with Nancy Moran’s lab
Variation in Evolution Rates
MutS MutL
+ +
+ +
+ +
+ +
_ _
_ _
Wu et al. 2006 PLoS Biology 4: e188. Collaboration with Nancy Moran’s lab
Polymorphisms in Metapopulation
• Data from ~200 hosts
– 104 SNPs
– 2 indels
• PCR surveys show that this is between host variation
• Much lower ratio of transitions:transversions than in Blochmannia
• Consistent with absence of MMR from Blochmannia
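For readers unfamiliar with the transitions:transversions comparison, the ratio itself is simple to compute from a list of SNPs. The sketch below is illustrative only: it assumes each SNP is given as a (reference base, variant base) pair of uppercase letters with the two bases different, and the example SNPs are made up, not data from the talk.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Return transitions divided by transversions for (ref, alt) pairs."""
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    if transversions == 0:
        return float("inf")
    return transitions / transversions


# Made-up SNPs: two transitions (A>G, C>T) and two transversions (A>C, G>T).
example_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
example_ratio = ts_tv_ratio(example_snps)
```

Comparing this ratio between two lineages is one way to look for differences in mutation or repair processes, which is how the slide uses it.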
Baumannia is a Vitamin and Cofactor Producing Machine
Wu et al. 2006 PLoS Biology 4: e188.
No Amino-Acid Synthesis
The Uncultured Majority
Great Plate Count Anomaly
Culturing Count <<<< Microscope Count
DNA
rRNA PCR
The Hidden Majority / Richness estimates
Bohannan and Hughes 2003; Hugenholtz 2002
rRNA data increasing exponentially too
Perna et al. 2003
Metagenomics
shotgun
clone
How can we best use metagenomic data?
• Many possible uses including:
– Improvements on rRNA based phylotyping and species diversity measurements
– Adding functional information on top of phylogenetic/species diversity information
• Most/all possible uses either require or are improved with phylogenetic analysis
Example I: Phylotyping with rRNA and other genes
Functional Diversity of Proteorhodopsins?
Venter et al., 2004
Shotgun Sequencing Allows Use of Other Markers
[Figure: Sargasso Phylotypes. Weighted % of clones (0 to 0.5) by major phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota) for the markers EFG, EFTu, HSP70, RecA, RpoB and rRNA.]
Venter et al., Science 304: 66-74. 2004
Example II: Binning
Metagenomics Challenge
Binning challenge
ABCDEFG
TUVWXYZ
Best binning method: reference genomes
No reference genome? What do you do?
Phylogeny ....
No Amino-Acid Synthesis
???????
CFB Phyla
Wu et al. 2006 PLoS Biology 4: e188.
Baumannia makes vitamins and cofactors
Sulcia makes amino acids
Phylogenomics of Novelty III
Knowing What We Don’t Know
Research Topics
Mechanisms of Origin of New Functions
Species Evolution
Variation in Mechanisms: Patterns, Causes and Effects
Acidobacteria
Bacteroides
Fibrobacteres
Gemmimonas
Verrucomicrobia
Planctomycetes
Chloroflexi
Proteobacteria
Chlorobi
Firmicutes
Fusobacteria
Actinobacteria
Cyanobacteria
Chlamydia
Spirochaetes
Deinococcus-Thermus
Aquificae
Thermotogae
TM6
OS-K
Termite Group
OP8
Marine Group A
WS3
OP9
NKB19
OP3
OP10
TM7
OP1
OP11
Nitrospira
Synergistes
Deferribacteres
Thermodesulfobacteria
Chrysiogenetes
Thermomicrobia
Dictyoglomus
Coprothermobacter
• At least 40 phyla of bacteria
As of 2002
Based on Hugenholtz, 2002
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution I: sequence more phyla
– NSF-funded Tree of Life Project
– A genome from each of eight phyla
Eisen, Ward, Robb, Nelson, et al
• Still highly biased in terms of the tree
Eisen & Ward, PIs
• Same trend in Archaea
• Same trend in Eukaryotes
• Same trend in Viruses
• Solution: Really Fill in the Tree
– GEBA: A Genomic Encyclopedia of Bacteria and Archaea
http://www.jgi.doe.gov/programs/GEBA/pilot.html
GEBA Pilot Project: Components
• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al)
• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, Eddy Rubin, Jim Bristow)
GEBA Pilot Project Overview
• Identify major branches in rRNA tree for which no genomes are available
• Identify those with a cultured representative in DSMZ
• DSMZ grew > 200 of these and prepped DNA
• Sequence and finish 100+ (covering breadth of bacterial/archaeal diversity)
• Annotate, analyze, release data
• Assess benefits of tree guided sequencing
• 1st paper Wu et al in Nature Dec 2009
GEBA Phylogenomic Lesson 1
The rRNA Tree of Life is a Useful Tool for Identifying Phylogenetically Novel Genomes
Compare PD in Trees
From Wu et al. 2009 Nature 462, 1056-1060
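Assuming PD here means phylogenetic diversity, a common way to compute it is to sum the branch lengths of the part of the tree that connects a chosen set of taxa. The sketch below uses the rooted variant (branches on the path to the root are included) and a hypothetical toy tree stored as child -> (parent, branch length); it is an illustration, not the calculation used in the cited paper.

```python
def phylogenetic_diversity(parents, taxa):
    """Sum branch lengths on the paths from each chosen taxon to the root.

    parents maps every non-root node to (parent, branch_length). Each branch
    is counted once, even when several chosen taxa share it.
    """
    counted = set()
    total = 0.0
    for node in taxa:
        # Walk upward until the root or an already counted branch is reached.
        while node in parents and node not in counted:
            counted.add(node)
            parent, length = parents[node]
            total += length
            node = parent
    return total


# Hypothetical tree: A and B are close relatives, C is on a long branch.
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}
pd_close_pair = phylogenetic_diversity(toy_tree, ["A", "B"])
pd_distant_pair = phylogenetic_diversity(toy_tree, ["A", "C"])
```

In this toy case the two close relatives cover 2.5 units of branch length while the distant pair covers 4.5, which is the sense in which genomes picked from unsampled branches of the tree add more diversity than genomes picked next to existing ones.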
GEBA Phylogenomic Lesson 2
The rRNA Tree of Life is not perfect ...
16S Says Hyphomonas is in Rhodobacterales
Badger et al. 2005 Int J System Evol Microbiol 55: 1021-1026.
WGT and individual gene trees: It's Related to Caulobacterales
Badger et al. 2005 Int J System Evol Microbiol 55: 1021-1026.
GEBA Phylogenomic Lesson 3
Phylogeny-driven genome selection helps discover new genetic diversity
Protein Family Rarefaction Curves
• Take data set of multiple complete genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
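The last step, plotting number of genomes against number of protein families, can be sketched as below. This assumes the family assignment (done with MCL in the talk) has already produced a set of family identifiers for each genome; the averaging over random genome orders, the function name, and the toy data are all assumptions made for illustration.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families seen after 1..N genomes.

    genome_families maps genome name -> set of protein family ids. Genomes
    are added in random order and the cumulative family count is averaged
    over n_permutations orderings.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0.0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


# Hypothetical family assignments for three genomes.
toy_families = {
    "genome1": {"fam1", "fam2"},
    "genome2": {"fam2", "fam3"},
    "genome3": {"fam1", "fam2", "fam3"},
}
curve = rarefaction_curve(toy_families)
```

The resulting list gives the y-values for 1, 2, ... N genomes; a curve that keeps climbing as genomes are added indicates that new protein families are still being discovered.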
Wu et al. 2009 Nature 462, 1056-1060
Synapomorphies exist
Wu et al. 2009 Nature 462, 1056-1060
Monday, February 14, 2011
! !
!"#$%"&'(%)"*
+,%-./&#(%)"*
Monday, February 14, 2011
Phylogenetic Distribution Novelty: Bacterial Actin Related Protein
Haliangium ochraceum DSM 14365
Patrik D’haeseleer, Adam Zemla, Victor Kunin
[Figure: taxon labels and numeric values were garbled in extraction and are not recoverable]
See also Guljamow et al. 2007 Current Biology. Wu et al. 2009 Nature 462, 1056-1060
GEBA Phylogenomic Lesson 4
Phylogeny-driven genome selection (and phylogenetics in general) improves genome annotation
Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
• Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes
• Better definition of protein family sequence “patterns”
• Greatly improves “comparative” and “evolutionary” based predictions
• Conversion of hypothetical into conserved hypotheticals
• Linking distantly related members of protein families
• Improved non-homology prediction
Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson
GEBA Phylogenomic Lesson 5
Improves analysis of genome data from uncultured organisms
Shotgun Sequencing Allows Use of Other Markers
[Figure: bar chart, "Sargasso Phylotypes". X-axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota). Y-axis: Weighted % of Clones (0 to 0.5). Series: EFG, EFTu, HSP70, RecA, RpoB, rRNA.]
Venter et al., Science 304: 66-74. 2004
Shotgun Sequencing Allows Use of Other Markers
Cannot be done without good sampling of genomes
[Figure: "Sargasso Phylotypes" bar chart of Weighted % of Clones per Major Phylogenetic Group, for markers EFG, EFTu, HSP70, RecA, RpoB and rRNA]
Venter et al., Science 304: 66-74. 2004
Phylogenetic Binning
[Figure: "Sargasso Phylotypes" bar chart of Weighted % of Clones per Major Phylogenetic Group, for markers EFG, EFTu, HSP70, RecA, RpoB and rRNA]
Venter et al., Science 304: 66-74. 2004
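The bar chart on these slides summarizes, for each marker gene, what share of the sequences falls into each major phylogenetic group. A minimal sketch of that tally is shown below. It assumes each sequence has already been assigned to a group (for example by placing it on a reference tree for that marker), and it computes plain unweighted fractions; the weighting behind "Weighted % of Clones" in the original figure is not reproduced here. The input records are hypothetical.

```python
from collections import Counter, defaultdict


def group_fractions(assignments):
    """Fraction of sequences per phylogenetic group, computed separately per marker.

    assignments: iterable of (marker_name, group_name) pairs, one per sequence.
    Returns {marker: {group: fraction}}; fractions for each marker sum to 1.
    """
    counts = defaultdict(Counter)
    for marker, group in assignments:
        counts[marker][group] += 1
    fractions = {}
    for marker, group_counts in counts.items():
        total = sum(group_counts.values())
        fractions[marker] = {g: n / total for g, n in group_counts.items()}
    return fractions


if __name__ == "__main__":
    # Hypothetical assignments, not data from the Sargasso Sea study.
    toy = [
        ("RecA", "Alphaproteobacteria"),
        ("RecA", "Alphaproteobacteria"),
        ("RecA", "Cyanobacteria"),
        ("rRNA", "Alphaproteobacteria"),
        ("rRNA", "Gammaproteobacteria"),
    ]
    for marker, fracs in group_fractions(toy).items():
        print(marker, fracs)
```

Comparing the per-marker profiles side by side is what makes disagreements between rRNA and protein-coding markers visible; the assignment step itself depends on having reference genomes from each group.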
Shotgun Sequencing Allows Use of Other Markers
GEBA Project improves metagenomic analysis
[Figure: "Sargasso Phylotypes" bar chart of Weighted % of Clones per Major Phylogenetic Group, for markers EFG, EFTu, HSP70, RecA, RpoB and rRNA]
Venter et al., Science 304: 66-74. 2004
GEBA Cyano
Sequencing status (as of 01/14):
Awaiting Material   11
Library             12
Production          22
Finishing            5
Grand Total         50
Ongoing/Planned Activities:
- Building Cyanobacterial Metadatabase (IMG-GOLD)
- 10th Cyanobacterial Molecular Biology Workshop, Lake Arrowhead, CA (06/10)
--> Cheryl will host: Workshop training as prep for virtual Jamboree
GEBA RNB
Plan: Sequence multiple Root Nodule Bacteria (RNBs) across the planet. Pilot: 100 RNBs.
Alpha RNB: Bradyrhizobium, Mesorhizobium, Rhizobium, Sinorhizobium, Balneimonas-like, Devosia, Ochrobactrum, Phyllobacterium, Azorhizobium, Allorhizobium
Beta RNB: Cupriavidus, Burkholderia
Goal:
• Understand biogeographical effects on species evolution and understand host-specificity.
Rationale:
• N2 fixation by legume pastures and crops provides 65% of the N currently utilized in agricultural production.
• Contributes 25 to 90 million metric tonnes N per annum.
• Symbioses save US$6-10 billion annually on N fertilizer.
• Grain and animal production enhanced by fixed nitrogen supplied by the symbiosis.
Nikos Kyrpides
[Figure: tree of bacterial phyla and candidate divisions. Labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Still not happy
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
Shotgun Sequencing Allows Use of Other Markers
GEBA Project improves metagenomic analysis, but only a little
[Figure: "Sargasso Phylotypes" bar chart of Weighted % of Clones per Major Phylogenetic Group, for markers EFG, EFTu, HSP70, RecA, RpoB and rRNA]
Venter et al., Science 304: 66-74. 2004
GEBA Future 1
Need to adapt genomic and metagenomic methods to make use of GEBA data
Ways to Make Better Use of GEBA Data
• Better phylogenetic methods for short reads
• Rebuild protein family models
• New phylogenetic markers
• Need better phylogenies, including HGT
• Improved tools for using distantly related genomes in metagenomic analysis
iSEEM Project
GEBA Future 3
We have still only scratched the surface of microbial diversity
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Phylogenetic Diversity: Sequenced Bacteria & Archaea
From Wu et al. 2009 Nature 462, 1056-1060
Phylogenetic Diversity with GEBA
From Wu et al. 2009 Nature 462, 1056-1060
Phylogenetic Diversity: Isolates
From Wu et al. 2009 Nature 462, 1056-1060
Phylogenetic Diversity: All
From Wu et al. 2009 Nature 462, 1056-1060
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter. Legend: well sampled phyla; poorly sampled; no cultured taxa]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Most phyla with cultured species are sparsely sampled
• Lineages with no cultured taxa even more poorly sampled
Uncultured Lineages: Technical Approaches
• Get into culture
• Enrichment cultures
• If abundant in low diversity ecosystems
• Flow sorting
• Microbeads
• Microfluidic sorting
• Single cell amplification
GEBA uncultured
Number of SAGs from Candidate Phyla
Site                                   OD1  OP11  OP3  SAR406
Site A: Hydrothermal vent                4     1    -       -
Site B: Gold Mine                        6    13    2       -
Site C: Tropical gyres (Mesopelagic)     -     -    -       2
Site D: Tropical gyres (Photic zone)     1     -    -       -
Sample collections at 4 additional sites are underway.
Phil Hugenholtz
GEBA Future 2
Need Experiments from Across the Tree of Life too
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter]
• At least 40 phyla of bacteria
As of 2002
Based on Hugenholtz, 2002
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter]
• At least 40 phyla of bacteria
• Experimental studies are mostly from three phyla
As of 2002
Based on Hugenholtz, 2002
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter]
• At least 40 phyla of bacteria
• Experimental studies are mostly from three phyla
• Some studies in other phyla
As of 2002
Based on Hugenholtz, 2002
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Eukaryotes
As of 2002
Based on Hugenholtz, 2002
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter]
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Viruses
As of 2002
Based on Hugenholtz, 2002
Need experimental studies from across the tree too
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter; scale bar 0.1]
Tree based on Hugenholtz (2002) with some modifications.
Adopt a Microbe
[Figure: tree of bacterial phyla and candidate divisions, Acidobacteria through Coprothermobacter; scale bar 0.1]
Tree based on Hugenholtz (2002) with some modifications.
Conclusion
• Phylogenetic sampling of genomes improves our understanding of microbial diversity in many ways
• Still need
  – More biogeography
  – More phenotypic/experimental data
  – Deeper phylogenetic sampling
MICROBES
A Happy Tree of Life
Acknowledgements
• GEBA: DOE-JGI
• GWSS: Nancy Moran & lab, Dongying Wu
• iSEEM: Katie Pollard, Jessica Green, Martin Wu
• RecA: Dongying Wu, Craig Venter, Doug Rusch, et al.