Probing early... evolution using phylogenetic methods
The SSU Ribosomal RNA Tree for Eukaryotes
[Figure: SSU rRNA tree of eukaryotes with a prokaryotic outgroup. Annotations: "Archezoa", "Mitochondria?"]
Taxa shown:
Animals
Fungi
Ciliates + Apicomplexa
Stramenopiles
Euglenozoa
Giardia
Trichomonas
Plants / green algae
Red algae
Entamoebae
Choanozoa
Dictyostelium
Physarum
Microsporidia
Percolozoa
The Archezoa Hypothesis (T. Cavalier-Smith, 1983)
“Archezoa are eukaryotes which primitively lack mitochondria”
– The nucleus was invented before the mitochondrion was acquired
– The first eukaryotes were anaerobes
– Archezoans might provide insight into the nature of ancestral eukaryotic genomes and biology
The Archezoa Hypothesis (T. Cavalier-Smith, 1983)
• The Archezoa hypothesis would fall if we:
– Find mitochondrial genes in archezoan genomes
– Find mitochondrion-derived organelles in archezoans
– Find that archezoans branch among aerobic species with mitochondria
Is Trichomonas an Archezoan?
[Figure: Chaperonin 60/GroEL phylogeny. Clade labels: mitochondria, alpha-proteobacteria, Clostridia, other bacteria]
[Figure: hydrogenosomal pyruvate metabolism]
Pyruvate → Acetyl-CoA (pyruvate:ferredoxin oxidoreductase, PFO)
2H+ → H2 (Fe-hydrogenase)
ADP + Pi → ATP
Trichomonas chaperonin 60 shares common ancestry with mitochondrial chaperonins
[Figure: chaperonin 60 tree, scale bar 0.10. Clade labels: alpha-proteobacteria, Mitochondria]
Taxa shown:
Homo sapiens
Rattus norvegicus
Mus musculus
Heliothis virescens
Trichomonas vaginalis
Ehrlichia chaffeensis
Rickettsia tsutsugamushi
Dictyostelium discoideum
Trypanosoma brucei
Zea mays
Arabidopsis thaliana
Naegleria fowleri
Saccharomyces cerevisiae
Ajellomyces capsulatum
Schizosaccharomyces pombe
Neocallimastix frontalis
Mitochondrial genes in Archezoa
Archezoa                  Proteins of mitochondrial origin*
Giardia / Spironucleus    Heat shock 70, Chaperonin 60
Trichomonas               Heat shock 70, Chaperonin 60
Microsporidia             Heat shock 70
*Defined as forming a monophyletic group with mitochondrial homologues in a non-controversial species phylogeny
Chaperonin 60 Protein Maximum Likelihood Tree (PROTML, Roger et al. 1998, PNAS 95: 229)
A case of eukaryote-to-eukaryote HGT?
Note 100% bootstrap support
Long branches may cause problems for phylogenetic analysis
• Felsenstein (1978) made a simple model phylogeny including four taxa and a mixture of short and long branches, and showed that a method can be drawn to a wrong tree that groups the long branches together
• Methods which assume all sites change at the same rate (e.g. PROTML) may be particularly sensitive to this problem
[Figure: TRUE TREE and WRONG TREE for four taxa A, B, C and D. Branches are labelled p and q, with p > q]
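The four-taxon argument can be checked numerically. The sketch below is an illustration, not the analysis from the slides. It assumes a two-state symmetric model, a true tree ((A,B),(C,D)) in which A and C sit on long branches (change probability p) while B, D and the internal branch are short (q), and it scores the three possible splits by the expected frequency of the site patterns that support them, as parsimony would. All function names are hypothetical.

```python
from itertools import product


def edge(x, y, change):
    """Probability of state y at one end of a branch given state x at the
    other, when the probability of a net change on the branch is `change`."""
    return change if x != y else 1.0 - change


def pattern_prob(tips, p, q):
    """Probability of one tip pattern (A, B, C, D) on the true tree
    ((A,B),(C,D)). A and C are on long branches (p); B, D and the internal
    branch are short (q). Assumption: two states, uniform root frequencies."""
    a, b, c, d = tips
    total = 0.0
    for u, v in product((0, 1), repeat=2):  # sum over both internal nodes
        total += (0.5 * edge(u, a, p) * edge(u, b, q) * edge(u, v, q)
                  * edge(v, c, p) * edge(v, d, q))
    return total


def split_support(p, q):
    """Expected frequency of the informative patterns backing each split.
    The factor 2 counts the complementary pattern (0 and 1 swapped)."""
    return {
        "AB|CD": 2 * pattern_prob((0, 0, 1, 1), p, q),  # the true split
        "AC|BD": 2 * pattern_prob((0, 1, 0, 1), p, q),  # long branches joined
        "AD|BC": 2 * pattern_prob((0, 1, 1, 0), p, q),
    }


def favoured_split(p, q):
    """Split with the highest expected support for the given p and q."""
    support = split_support(p, q)
    return max(support, key=support.get)
```

With equal branch lengths, for example `favoured_split(0.1, 0.1)`, the true split has the most support. With p much larger than q, for example `favoured_split(0.4, 0.05)`, the split that joins the two long branches wins, and adding more sites only strengthens the wrong answer.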
Chaperonin 60 Protein Maximum Likelihood Tree (PROTML, Roger et al. 1998, PNAS 95: 229)
[Figure annotation: longest branches]
A simple experiment:
• Does the Cpn60 tree topology change:
– If we remove long-branch outgroups
– If we remove sites where every species has the same amino acid
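Both manipulations in this experiment are simple operations on an alignment. The sketch below assumes the alignment is held as a dictionary mapping taxon name to aligned sequence. The function names and the toy data are hypothetical, and gap characters are treated as just another state.

```python
def drop_taxa(alignment, taxa):
    """Return the alignment without the named taxa (e.g. long-branch outgroups)."""
    excluded = set(taxa)
    return {name: seq for name, seq in alignment.items() if name not in excluded}


def variable_sites(alignment):
    """Keep only the columns in which at least two sequences differ."""
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(len(seqs[0])) if len({seq[i] for seq in seqs}) > 1]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy protein alignment (made up for illustration only).
toy = {"sp1": "MKVLA", "sp2": "MRVLA", "sp3": "MKVIA", "out": "MKVLG"}
ingroup = drop_taxa(toy, ["out"])
reduced = variable_sites(ingroup)
```

The order matters: a column that varies only because of the outgroup becomes constant once the outgroup is removed, so taxa are dropped before the constant sites are filtered.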
Cpn-60 Protein ML tree (PROTML) from variable sites with outgroups removed
[Figure: Cpn-60 tree; bootstrap values shown: 30, 31]
Taxa shown:
Giardia
Entamoeba
Dictyostelium
Plants
Apicomplexa
Euglena & Trypanosoma
Trichomonas
Animals & Fungi
Competing Hypotheses for Microsporidia
[Figure: eukaryote tree showing two alternative positions for Microsporidia]
Taxa shown:
Animals
Fungi
Ciliates + Apicomplexa
Stramenopiles
Euglenozoa
Giardia
Trichomonas
Plants / green algae
Red algae
Entamoebae
Choanozoa
Dictyostelium
Physarum
Microsporidia
Percolozoa
“Microsporidia Early”: SSU rRNA, EF-1 alpha, EF-2
Microsporidia + Fungi: Tubulin, mitHSP70
• HGT from Fungi to Microsporidia? (Sogin, 1998)
• Another artefact of the method of analysis?
Microsporidia have a number of unusual features
• Absence of mitochondria and peroxisomes
• 70S ribosomes - most eukaryotes have 80S
• 5.8S and 23S rRNA genes are fused - as in some prokaryotes
• Lack 9 + 2 microtubule structures
Alternative explanations of the unusual features of Microsporidia
• Retention of ancestral features of the eukaryote cell at an early stage of evolution?
Or are they
• Adaptations to an obligate intracellular lifestyle?
Elongation Factor 2 protein ML tree (PROTML) (Hashimoto et al. 1997 Arch. Protist. 148:287)
[Figure: EF-2 tree with archaebacterial outgroups; the eukaryote root is marked; bootstrap values shown: 75, 88]
Taxa shown:
Plant
Trichomonas
Giardia
Trypanosoma
Cryptosporidium
Animals + Fungi
Entamoeba
Dictyostelium
Microsporidia
Archaebacteria outgroups: Sulfolobus, Halobacterium, Methanococcus
Also note that in PROTML the amino acid substitution process is assumed to be homogeneous across the tree
Shared nucleotide or amino acid composition biases can also cause problems for phylogenetic analysis
[Figure: 16S rRNA example showing a true tree and a wrong tree for four bacteria, with G+C content in parentheses]
Aquifex (73% G+C)
Thermus (72% G+C)
Bacillus (50% G+C)
Deinococcus (52% G+C)
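The G+C percentages quoted for each sequence are straightforward to compute from an alignment. This is a minimal sketch with a hypothetical function name. It ignores gaps and ambiguity codes, which is an assumption and not something the slides specify.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)
```

Comparing `gc_content` across taxa before building a tree is a quick way to spot sequences that might be grouped by composition and not by ancestry.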
The correct tree can be obtained if a model is used which allows base/aa composition to vary between sequences:
– LogDet/Paralinear distances
– Heterogeneous Maximum Likelihood
[Figure: tree for Thermus, Deinococcus, Aquifex and Bacillus]
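To make the idea of a composition-tolerant distance concrete, here is a sketch of a LogDet/paralinear distance for one pair of DNA sequences. It uses the commonly quoted form d = -(1/4) ln[det F / sqrt(det Dx det Dy)], where F is the 4 x 4 matrix of joint base frequencies and Dx, Dy are diagonal matrices of each sequence's base frequencies. That choice of formula, the function names and the handling of gaps are assumptions, not details taken from the slides.

```python
import math

BASES = "ACGT"


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """Paralinear/LogDet distance between two aligned DNA sequences.
    Columns containing anything other than A, C, G, T are skipped."""
    pairs = [(a, b) for a, b in zip(seq_x.upper(), seq_y.upper())
             if a in BASES and b in BASES]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    f = [[0.0] * 4 for _ in range(4)]  # joint frequencies: rows x, columns y
    for a, b in pairs:
        f[BASES.index(a)][BASES.index(b)] += 1.0 / n
    freq_x = [sum(row) for row in f]
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]
    det_f = determinant(f)
    norm = math.sqrt(math.prod(freq_x) * math.prod(freq_y))
    if det_f <= 0.0 or norm == 0.0:
        raise ValueError("distance undefined for this pair")
    return -0.25 * math.log(det_f / norm)
```

The normalisation by each sequence's own base frequencies is what lets the two sequences differ in composition without that difference being counted as extra divergence. A matrix of such distances would then go into a distance method such as neighbour joining.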
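LogDet/Paralinear distances for EF-2 DNA variable sites, codon positions 1+2
[Figure: EF-2 DNA tree with archaebacterial outgroups; bootstrap values shown: 60, 25, 76, 70. Note that the root has changed]
Taxa shown:
Animals
Chlorella
Trypanosoma
Trichomonas
Giardia
Dictyostelium
Entamoeba
Saccharomyces
Microsporidia
Cryptosporidium
Archaebacteria outgroups: Sulfolobus, Methanococcus, Halobacterium
G+C labels as printed: Methanococcus 44%, Halobacterium 58%, and a further label of 45% G+C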
A combination of factors (outgroup GC content and site rate heterogeneity) influences the EF-2 DNA tree
[Figure, shown across three slides: LogDet bootstrap values (y-axis, 0 to 100) plotted against the fraction of constant sites removed (x-axis, 0 to 100), with the ML estimate marked. Each slide has two panels: Methanococcus outgroup (low G+C) and Halobacterium outgroup (higher G+C). Groupings plotted: (Microsporidia, outgroup); (Microsporidia, outgroup) and (Microsporidia, Fungi); (Giardia, Trichomonas, outgroup)]
Competing hypotheses for Microsporidia
[Figure: eukaryote tree showing two alternative positions for Microsporidia]
Taxa shown:
Animals
Fungi
Ciliates + Apicomplexa
Stramenopiles
Euglenozoa
Giardia
Trichomonas
Plants / green algae
Red algae
Entamoebae
Choanozoa
Dictyostelium
Physarum
Microsporidia
Percolozoa
“Microsporidia Early”: SSU rRNA
Microsporidia + Fungi: Tubulin, RNA polymerase, LSU rRNA, HSP70, TATA binding protein, EF-2, EF-1 alpha
The best supported hypothesis for Microsporidia is a relationship to fungi - why does SSU rRNA place them deep?
Summary I
• Making trees is not easy:
– Among-site rate heterogeneity, “fast clock” species, shared nucleotide or amino acid composition biases
– Different data sets may be affected by individual phenomena to different degrees
– Biases need not be large if phylogenetic signal is weak
Summary II
• Are Archezoa ancient offshoots?
– Microsporidia are related to fungi
– Evidence for Giardia and Trichomonas branching deeper than other eukaryotes is based on trees made using unrealistic assumptions
• PLUS
– For the same reasons we don’t know where the root lies on the eukaryote tree
– So arguments about early or late branching are probably premature anyway
Can we make a robust unrooted tree for eukaryotes?
• Combining different genes in a single analysis may provide a more robust eukaryotic tree
• One argument is that phylogenetic signal should be additive whereas gene-specific “noise” will pull in different directions
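Combining genes in a single analysis usually starts by concatenating the per-gene alignments for the taxa they share. The sketch below assumes each gene is a dictionary of taxon name to aligned sequence. The function name and the toy data are hypothetical, and dropping taxa that lack any gene is one possible policy among several.

```python
def concatenate(gene_alignments):
    """Join per-gene alignments end to end, keeping only taxa present in every gene.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Genes are joined in the order in which they appear in the dictionary.
    """
    genes = list(gene_alignments)
    if not genes:
        raise ValueError("no genes supplied")
    shared = set(gene_alignments[genes[0]])
    for gene in genes[1:]:
        shared &= set(gene_alignments[gene])
    return {taxon: "".join(gene_alignments[gene][taxon] for gene in genes)
            for taxon in sorted(shared)}


# Toy data, made up for illustration only.
toy_genes = {
    "actin":   {"Giardia": "ATGA", "Human": "ATGC", "Euglena": "ATGG"},
    "tubulin": {"Giardia": "CCA", "Human": "CCT"},
}
supermatrix = concatenate(toy_genes)
```

Here `supermatrix` keeps Giardia and Human only, because Euglena has no tubulin sequence in the toy data. Filling missing genes with gap characters would keep more taxa at the cost of missing data.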
DNA ML tree found using a model which allows both base composition and site rates to vary across the tree
[Figure: tree from combined Actin + tubulin + EF-2 data]
Taxa shown:
Plasmodium
Cryptosporidium
Halteria / Stylonychia
Tetrahymena
Green algae
Arabidopsis
Chondrus
another red alga
Trypanosoma
Euglena
Trichomonas
Giardia
Saccharomyces
Schizosaccharomyces
Human
Drosophila
Dictyostelium
Entamoeba
Group labels:
Animals + fungi + slime moulds
Ciliates plus apicomplexa
Red and green algae/plants
Giardia and Trichomonas
Origin(s) of Hydrogenosomes
• Is a 2 part problem
– The organelle (the bag)
– The biochemistry to produce hydrogen, particularly hydrogenase
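Schematic Map of Hydrogenosomes (after Müller 1993)
[Figure: hydrogenosome bounded by a double membrane, with protein import via transit peptides (N); hsp70 and cpn60 are shown inside]
Reactions and labels recoverable from the figure:
Malate → Pyruvate (ME), NAD(P)+ → NAD(P)H, CO2
Pyruvate → Acetyl-CoA (PFO), 2Fd → 2Fd-, CO2, CoASH
Acetyl-CoA → Acetate (ASCT)
Succinyl-CoA → Succinate (STK), ADP + Pi → ATP
ADP / ATP (AAC)
2H+ → H2 ([Fe]Hyd)
NAD(P)-FO
Legend:
Enzyme found also in mitochondria
Alpha-proteobacterial ancestry
Unknown ancestry
Fungi and Trichomonas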
DNA ML tree for Fe hydrogenases
[Figure: scale bar 0.1 substitutions per site; support values shown: 48, 45, 72, 100, 100; clade label: Desulfovibrio spp.]
Sequences shown:
Clostridium acetobutylicum
Clostridium pasteurianum
Clostridium perfringens
Clostridium acetobutylicum
Thermotoga maritima
Clostridium thermocellum
Thermotoga maritima
Entamoeba histolytica
Spironucleus barkhanus
Trichomonas vaginalis
Trichomonas vaginalis
Nyctotherus ovalis
Desulfovibrio vulgaris
Desulfovibrio fructosovorans
Clostridium difficile
A likelihood ratio test of monophyly (Huelsenbeck, Hillis & Nielsen 1996)
The test statistic: δ = lnL1 - lnL0
• Where lnL1 is the log-likelihood of the best tree and lnL0 is the log-likelihood of the best monophyly tree
• The null (eukaryote monophyly) distribution of δ is generated by simulation under an appropriate model (parametric bootstrapping)
Parametric bootstrapping to estimate a test distribution
Steps, in order:
• Estimate ML model parameters using the original data
• Simulate 1000 new sequence data sets using this model over the best monophyly tree
• For each new data set estimate L0 and L1 using ML, with the model re-optimised each time
• Plot δ for each of the 1000 data sets to give the test distribution and the 95% confidence interval
What might the test statistic distribution look like if the Fe hydrogenases were monophyletic?
Calculate δ for the original data and compare it to the distribution - if it falls outside the 95% interval it is bigger than expected by chance and monophyly can be rejected
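The final comparison step can be sketched as follows. The expensive parts (simulating data sets and re-estimating lnL0 and lnL1 by ML) are assumed to have been run already in a phylogenetics package, so the function only takes the observed δ and the list of simulated δ values. The function name, the one-sided 95% cut-off and the p-value convention are assumptions for illustration.

```python
import math


def monophyly_lrt(observed_delta, null_deltas, alpha=0.05):
    """Compare an observed delta = lnL1 - lnL0 with deltas simulated under monophyly.

    Returns the critical value (upper 1 - alpha point of the simulated
    distribution), an estimated p-value, and whether monophyly is rejected.
    """
    n = len(null_deltas)
    if n == 0:
        raise ValueError("need at least one simulated delta")
    ranked = sorted(null_deltas)
    # Index of the (1 - alpha) quantile, clamped to the valid range.
    k = min(n - 1, max(0, math.ceil((1 - alpha) * n) - 1))
    critical = ranked[k]
    # Add-one estimate: count simulations at least as extreme as the data.
    exceed = sum(1 for d in null_deltas if d >= observed_delta)
    p_value = (exceed + 1) / (n + 1)
    return {"critical": critical, "p_value": p_value,
            "reject": observed_delta > critical}
```

The test is one-sided because δ cannot be negative: the best tree is always at least as likely as the best tree constrained to be monophyletic. In practice `null_deltas` would hold the 1000 simulated values described in the steps above.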
The likelihood ratio test rejects the hypothesis that eukaryotic hydrogenases are monophyletic
[Figure: distribution of δ (lnL1 - lnL0) from 1000 simulations of the Fe hydrogenase data on the best monophyly tree. X-axis: δ = lnL1 - lnL0. Marked on the plot: the 95% point of the distribution and δ for the original data. Printed labels: "9.64", "95%"]
Iron hydrogenase ML tree
[Figure legend, eukaryotic compartment: Trichomonas hydrogenosome; cytosolic?; ciliate hydrogenosome; plastid]
Sequences shown:
Clostridium acetobutylicum p262
Clostridium perfringens
Clostridium pasteurianum
Clostridium acetobutylicum ATCC824
Shewanella putrefaciens
Dehalococcoides ethanogenes
Desulfovibrio vulgaris (Oxamicus)
Desulfovibrio vulgaris (Hildenborough)
Desulfovibrio fructosovorans
Megasphaera elsdenii
Treponema denticola
Thermotoga maritima
Desulfovibrio vulgaris Hyd-g
Nyctotherus ovalis
Clostridium thermocellum
Clostridium difficile
Desulfovibrio fructosovorans (Hildenborough)
Trichomonas vaginalis (long form)
Trichomonas vaginalis (short form)
Spironucleus barkhanus
Entamoeba histolytica
Chlamydomonas reinhardtii
Scenedesmus obliquus
Conclusions I
• Hydrogenosomes share common ancestry with mitochondria
• Hydrogenase has been acquired at least twice and can be targeted to different cell compartments in different eukaryotes
– Humans, plants and fungi also contain remnants of iron hydrogenases
• There is no evidence from phylogenetic analysis that the “bag” and hydrogenase share a common origin from the mitochondrion endosymbiont
Conclusions II
• Phylogeny is hard: there are lots of potential problems with data, so we need to be careful in our interpretations of what trees mean, including inferences of HGT
• Better methods hold promise of more reliable trees (allowing re-analysis of SSU rRNA data)
• Archezoa contain genes which originated with the mitochondrion endosymbiont, and the jury is still out on whether former archezoa have lost the mitochondrial bag
• We don’t know which eukaryotes are early branching - for this we need a rooted tree
The mitosome, a novel organelle related to mitochondria in Entamoeba histolytica
Tovar et al., 1999.
Slide shows epitope-tagged recombinant cpn60 localised to the mitosome
Are there still organelles of common ancestry with mitochondria in Giardia and Microsporidia?
• Giardia:
– “What are the ovoid pellicular bodies (in Giardia)? The study made suggests that they might be nothing but changed mitochondria with a few crysts or tubules”
– “The ultrastructure of mitochondria may be related with the oxygen deficiency in Lamblia environment” (Cheissin, 1965)
• Microsporidia:
– “There are reports of mitochondria-like structures in several microsporidia” (Vavra, 1976)