"Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15
Transcript of "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15
![Page 1: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/1.jpg)
Searching for Novel Forms of Life
Jonathan A. Eisen UC Davis
@phylogenomics
Diversity of Life Workshop Pacifica, CA
December 16, 2015
![Page 2: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/2.jpg)
![Page 3: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/3.jpg)
![Page 4: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/4.jpg)
![Page 5: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/5.jpg)
![Page 6: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/6.jpg)
![Page 7: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/7.jpg)
![Page 8: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/8.jpg)
![Page 9: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/9.jpg)
![Page 10: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/10.jpg)
![Page 11: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/11.jpg)
![Page 12: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/12.jpg)
Once You Find Something Alive …
You find a CLE
![Page 13: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/13.jpg)
Once You Find Something Alive …
You find a CLE
Separate Origin from Known Life?
Common Origin with Known Life?
![Page 14: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/14.jpg)
Once You Find Something Alive …
You find a CLE
Separate Origin from Known Life?
Common Origin with Known Life?
Homologies w/ Known Life?
![Page 15: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/15.jpg)
Once You Find Something Alive …
You find a CLE
Separate Origin from Known Life?
Common Origin with Known Life?
Homologies w/ Known Life?
No
![Page 16: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/16.jpg)
Once You Find Something Alive …
You find a CLE
Separate Origin from Known Life?
Common Origin with Known Life?
Homologies w/ Known Life?
Yes
How Novel Is It?
![Page 17: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/17.jpg)
Once You Find Something Alive …
You find a CLE
Separate Origin from Known Life?
Common Origin with Known Life?
Homologies w/ Known Life?
Yes
How Novel Is It?
![Page 18: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/18.jpg)
• Novel form • Novel function • Novel phylogeny
How Novel Is It?
![Page 19: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/19.jpg)
• Novel form • Novel function • Novel phylogeny
How Novel Is It?
![Page 20: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/20.jpg)
Phylogeny
![Page 21: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/21.jpg)
Woese Classification of Cultured Taxa by rRNA
rRNA: ACUGC ACCUAU CGUUCG
rRNA: ACUCC AGCUAU CGAUCG
rRNA: ACCCC AGCUCU CGCUCG

| Taxa | Characters |
| --- | --- |
| S | ACUGCACCUAUCGUUCG |
| R | ACUCCACCUAUCGUUCG |
| E | ACUCCAGCUAUCGAUCG |
| F | ACUCCAGGUAUCGAUCG |
| C | ACCCCAGCUCUCGCUCG |
| W | ACCCCAGCUCUGGCUCG |

Eukaryotes, Bacteria, Archaea
Carl Woese
![Page 22: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/22.jpg)
Woese 3 Domain Tree
![Page 23: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/23.jpg)
rRNA Phylotyping: One Taxon
DNA
ACTGC ACCTAT CGTTCG
ACTGC ACCTAT CGTTCG
ACTGC ACCTAT CGTTCG

| Taxa | Characters |
| --- | --- |
| B1 | ACTGCACCTATCGTTCG |
| B2 | ACTCCACCTATCGTTCG |
| E1 | ACTCCAGCTATCGATCG |
| E2 | ACTCCAGGTATCGATCG |
| A1 | ACCCCAGCTCTCGCTCG |
| A2 | ACCCCAGCTCTGGCTCG |
| New1 | ACTGCACCTATCGTTCG |

Eukaryotes, Bacteria, Archaea
Many sequences from one sample all point to the same branch on the tree
Norm Pace
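The phylotyping idea on this slide can be sketched in a few lines. The reference characters below are copied from the slide's table; the function names and the use of a plain Hamming distance are assumptions made for illustration. Real phylotyping places sequences on a tree rather than picking the closest string, so treat this as a toy stand-in.

```python
# Reference characters copied from the slide's "Taxa / Characters" table.
REFERENCES = {
    "B1": "ACTGCACCTATCGTTCG",
    "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG",
    "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG",
    "A2": "ACCCCAGCTCTGGCTCG",
}


def hamming(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_reference(query, references=REFERENCES):
    """Return the closest reference name and all distances (hypothetical helper)."""
    distances = {name: hamming(query, seq) for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances


# "New1" from the slide is identical to B1, so it lands on the bacterial branch.
best, distances = nearest_reference("ACTGCACCTATCGTTCG")
print(best, distances)
```

Running the same comparison for every rRNA read in a sample is what lets many sequences from one sample point to the same branch.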
![Page 24: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/24.jpg)
Expanded Tree (Pace 1997)
Archaea
Eukaryotes
Bacteria
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
![Page 25: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/25.jpg)
Is There Anything Like This?
Archaea
Eukaryotes
Bacteria
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
??????
![Page 26: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/26.jpg)
Metagenomics
metagenomics
ACUGC ACCUAU CGUUCG
ACUCC AGCUAU CGAUCG
ACCCC AGCUCU CGCUCG

| Taxa | Characters |
| --- | --- |
| S | ACUGCACCUAUCGUUCG |
| R | ACUCCACCUAUCGUUCG |
| E | ACUCCAGCUAUCGAUCG |
| F | ACUCCAGGUAUCGAUCG |
| C | ACCCCAGCUCUCGCUCG |
| W | ACCCCAGCUCUGGCUCG |

| Taxa | Characters |
| --- | --- |
| S | ACUGCACCUAUCGUUCG |
| E | ACUCCAGCUAUCGAUCG |
| C | ACCCCAGCUCUCGCUCG |

Eukaryotes, Bacteria, Archaea
![Page 27: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/27.jpg)
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
Eukaryotes
??????
Archaea
Bacteria
Scanned through GOS data for rRNAs that fit this pattern
![Page 28: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/28.jpg)
rRNA Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
Eukaryotes
??????
Archaea
Bacteria
??????????
![Page 29: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/29.jpg)
RecA vs. rRNA
Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
![Page 30: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/30.jpg)
Venter et al., Science 304: 66. 2004
RecA Phylotyping - Sargasso Metagenome
![Page 31: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/31.jpg)
RecA Tree of Life?
Archaea
Eukaryotes
Bacteria
???????????
Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007.
Based on tree from Pace 1997 Science 276:734-740
![Page 32: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/32.jpg)
GOS 1
GOS 2
GOS 3
GOS 4
GOS 5
Novel RecA Sequences in GOS Data
Wu et al PLoS One 2011
![Page 33: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/33.jpg)
Novel RpoBs too
Wu et al PLoS One 2011
![Page 34: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/34.jpg)
GOS 1
GOS 2
GOS 3
GOS 4
GOS 5
Wu et al PLoS One 2011
I am happy to wellcome you as a new member of the 4th domain club. If by chance you are passing through Europe I will be delighted to invite you to give a seminar in Marseille and show you our strange bugs.
Kind regards Didier
Phylogenetic ID of Novel Lineages
![Page 35: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/35.jpg)
Virus Origins
![Page 36: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/36.jpg)
2007-2014: GEBA
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
![Page 37: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/37.jpg)
Synapomorphies Exist
![Page 38: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/38.jpg)
Missing Microbes?
![Page 39: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/39.jpg)
Challenge: Poor Sampling
From Wu et al. 2009 Nature 462, 1056-1060
![Page 40: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/40.jpg)
JGI Dark Matter Project
Workflow: environmental samples (n=9) → isolation of single cells (n=9,600) → whole genome amplification (n=3,300) → SSU rRNA gene based identification (n=2,000) → genome sequencing, assembly and QC (n=201) → draft genomes (n=201)
Sample sites: SAK, HSM, ETL, TG, HOT, GOM, GBS, EPR, TA
Sample types: seawater, brackish/freshwater, hydrothermal, sediment, bioreactor
[Figure: tree of Bacteria and Archaea; legible lineage labels listed below]
BACTERIA: GN04, WS3 (Latescibacteria), GN01, LD1, WS1, Poribacteria, BRC1, Lentisphaerae, Verrucomicrobia, OP3 (Omnitrophica), Chlamydiae, Planctomycetes, NKB19 (Hydrogenedentes), WYO, Armatimonadetes, WS4, Actinobacteria, Gemmatimonadetes, NC10, SC4, WS2, Cyanobacteria, Deltaproteobacteria, EM19 (Calescamantes), GAL35, Aquificae, EM3, Thermotogae, Dictyoglomi, SPAM, GAL15, CD12 (Aerophobetes), OP8 (Aminicenantes), AC1, SBR1093, Thermodesulfobacteria, Deferribacteres, Synergistetes, OP9 (Atribacteria), Caldiserica, AD3, Chloroflexi, Acidobacteria, Elusimicrobia, Nitrospirae, Caldithrix, GOUTA4, Chlorobi, Tenericutes, Chrysiogenetes, Proteobacteria, TG3, Spirochaetes, WWE1 (Cloacamonetes), ZB3, OP1 (Acetothermia), Bacteroidetes, TM7, GN02 (Gracilibacteria), SR1, BH1, OD1 (Parcubacteria), OP11 (Microgenomates)
ARCHAEA: Euryarchaeota, Micrarchaea, DSEG (Aenigmarchaea), Nanohaloarchaea, Nanoarchaea, Cren MCG, Thaumarchaeota, Cren C2, Aigarchaeota, Cren pISA7, Cren Thermoprotei, Korarchaeota, pMC2A384 (Diapherotrites)
[Figure panels; legible labels listed below]
archaeal toxins (Nanoarchaea): lytic murein transglycosylase; murein (peptido-glycan); tetra-peptide
stringent response (Diapherotrites, Nanoarchaea): ppGpp; limiting amino acids; SpoT, RelA; GTP or GDP + ATP; (GTP or GDP) + PPi; limiting phosphate, fatty acids, carbon, iron; DksA; expression of components for stress response
sigma factor (Diapherotrites, Nanoarchaea): -35, -10
oxidoreductase: e- donor, e- acceptor; reduction, oxidation
HGT from Eukaryotes (Nanoarchaea): Eukaryota
archaeal type purine synthesis (Microgenomates): PurF, PurD, PurL/Q, PurM, PurK, PurE, PurB, PurP; PRPP, IMP; adenine, guanine; Archaea
UGA recoded for Gly (Gracilibacteria): growing AA chain; ribosome
Woyke et al. Nature 2013.
Tanja Woyke
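As a rough worked example, the counts printed on this slide can be turned into per-step yields. The step names and numbers are copied from the slide; the helper name `step_yields` is hypothetical, and the slide's counts look rounded, so the ratios are approximate.

```python
# Counts as printed on the slide (they appear to be rounded figures).
STEPS = [
    ("isolation of single cells", 9600),
    ("whole genome amplification", 3300),
    ("SSU rRNA gene based identification", 2000),
    ("genome sequencing, assembly and QC", 201),
]


def step_yields(steps):
    """Fraction of items surviving each step relative to the previous step."""
    yields = []
    for (_, previous_n), (name, n) in zip(steps, steps[1:]):
        yields.append((name, n / previous_n))
    return yields


for name, fraction in step_yields(STEPS):
    print(f"{name}: {fraction:.1%} of the previous step")

# Overall fraction of isolated single cells that end up as sequenced genomes.
overall = STEPS[-1][1] / STEPS[0][1]
print(f"overall: {overall:.1%}")
```

On these numbers, only about one isolated cell in fifty ends up as a draft genome, which fits the poor-sampling theme of the surrounding slides.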
![Page 41: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/41.jpg)
Microbial Dark Matter Part 2
• Ramunas Stepanauskas
• Tanja Woyke • Jonathan Eisen • Duane Moser • Tullis Onstott
![Page 42: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/42.jpg)
• More accurate phylogeny • Rooting • Incorporating New and Fragmented Data • Lateral gene transfer • More biology about the “novel” lineages
Challenge: Reference Information
![Page 43: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/43.jpg)
Three Domains of Life
Bacteria Archaea Eukaryotes
![Page 44: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/44.jpg)
Bacteria Archaea Eukaryotes
Archaea and Bacteria as Sister
![Page 45: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/45.jpg)
Bacteria Archaea Eukaryotes
Eukaryotes and Bacteria as Sister
![Page 46: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/46.jpg)
Bacteria Archaea Eukaryotes
Archaea and Eukaryotes as Sister
![Page 47: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/47.jpg)
Bacteria Archaea Eukaryotes
Other Patterns
Archaea
![Page 48: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/48.jpg)
Bacteria Archaea Eukaryotes
Outgroup for the Tree of Life?
![Page 49: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/49.jpg)
Bacteria Archaea Eukaryotes
Lateral Gene Transfer
Archaea
![Page 50: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/50.jpg)
Bacteria Archaea Eukaryotes
Lateral Gene Transfer
Archaea
![Page 51: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/51.jpg)
Automated Genome Tree
Lang JM, Darling AE, Eisen JA (2013) Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees and Supermatrices. PLoS ONE 8(4): e62510. doi:10.1371/journal.pone.0062510
Jenna Lang
![Page 52: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/52.jpg)
Better Reference Data (e.g., PhyEco Markers)

| Phylogenetic group | Genome Number | Gene Number | Marker Candidates |
| --- | --- | --- | --- |
| Archaea | 62 | 145415 | 106 |
| Actinobacteria | 63 | 267783 | 136 |
| Alphaproteobacteria | 94 | 347287 | 121 |
| Betaproteobacteria | 56 | 266362 | 311 |
| Gammaproteobacteria | 126 | 483632 | 118 |
| Deltaproteobacteria | 25 | 102115 | 206 |
| Epsilonproteobacteria | 18 | 33416 | 455 |
| Bacteroides | 25 | 71531 | 286 |
| Chlamydiae | 13 | 13823 | 560 |
| Chloroflexi | 10 | 33577 | 323 |
| Cyanobacteria | 36 | 124080 | 590 |
| Firmicutes | 106 | 312309 | 87 |
| Spirochaetes | 18 | 38832 | 176 |
| Thermi | 5 | 14160 | 974 |
| Thermotogae | 9 | 17037 | 684 |

Wu D, Jospin G, Eisen JA (2013) Systematic Identification of Gene Families for Use as “Markers” for Phylogenetic and Phylogeny-Driven Ecological Studies of Bacteria and Archaea and Their Major Subgroups. PLoS ONE 8(10): e77033. doi:10.1371/journal.pone.0077033
![Page 53: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/53.jpg)
Better Binning (e.g., Hi-C)
Beitel CW, Froenicke L, Lang JM, Korf IF, Michelmore RW, Eisen JA, Darling AE. (2014) Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. PeerJ 2:e415 http://dx.doi.org/10.7717/peerj.415
Table 1 Species alignment fractions. The number of reads aligning to each replicon present in the synthetic microbial community are shown before and after filtering, along with the percent of total constituted by each species. The GC content (“GC”) and restriction site counts (“#R.S.”) of each replicon, species, and strain are shown. Bur1: B. thailandensis chromosome 1. Bur2: B. thailandensis chromosome 2. Lac0: L. brevis chromosome, Lac1: L. brevis plasmid 1, Lac2: L. brevis plasmid 2, Ped: P. pentosaceus, K12: E. coli K12 DH10B, BL21: E. coli BL21. An expanded version of this table can be found in Table S2.

| Sequence | Alignment | % of Total | Filtered | % of aligned | Length | GC | #R.S. |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Lac0 | 10,603,204 | 26.17% | 10,269,562 | 96.85% | 2,291,220 | 0.462 | 629 |
| Lac1 | 145,718 | 0.36% | 145,478 | 99.84% | 13,413 | 0.386 | 3 |
| Lac2 | 691,723 | 1.71% | 665,825 | 96.26% | 35,595 | 0.385 | 16 |
| Lac | 11,440,645 | 28.23% | 11,080,865 | 96.86% | 2,340,228 | 0.46 | 648 |
| Ped | 2,084,595 | 5.14% | 2,022,870 | 97.04% | 1,832,387 | 0.373 | 863 |
| BL21 | 12,882,177 | 31.79% | 2,676,458 | 20.78% | 4,558,953 | 0.508 | 508 |
| K12 | 9,693,726 | 23.92% | 1,218,281 | 12.57% | 4,686,137 | 0.507 | 568 |
| E. coli | 22,575,903 | 55.71% | 3,894,739 | 17.25% | 9,245,090 | 0.51 | 1076 |
| Bur1 | 1,886,054 | 4.65% | 1,797,745 | 95.32% | 2,914,771 | 0.68 | 144 |
| Bur2 | 2,536,569 | 6.26% | 2,464,534 | 97.16% | 3,809,201 | 0.672 | 225 |
| Bur | 4,422,623 | 10.91% | 4,262,279 | 96.37% | 6,723,972 | 0.68 | 369 |

Figure 1 Hi-C insert distribution. The distribution of genomic distances between Hi-C read pairs is shown for read pairs mapping to each chromosome. For each read pair the minimum path length on the circular chromosome was calculated and read pairs separated by less than 1000 bp were discarded. The 2.5 Mb range was divided into 100 bins of equal size and the number of read pairs in each bin was recorded for each chromosome. Bin values for each chromosome were normalized to sum to 1 and plotted.
E. coli K12 genome were distributed in a similar manner as previously reported (Fig. 1; Lieberman-Aiden et al., 2009). We observed a minor depletion of alignments spanning the linearization point of the E. coli K12 assembly (e.g., near coordinates 0 and 4686137) due to edge effects induced by BWA treating the sequence as a linear chromosome rather than circular.
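The Figure 1 procedure (minimum path length on a circular chromosome, a 1000 bp cutoff, 100 equal bins over 2.5 Mb, normalization to sum to 1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the list-of-coordinate-pairs input format, and the toy read pairs are all hypothetical.

```python
def circular_distance(pos_a, pos_b, genome_length):
    """Minimum path length between two coordinates on a circular chromosome."""
    d = abs(pos_a - pos_b) % genome_length
    return min(d, genome_length - d)


def insert_histogram(pairs, genome_length, min_sep=1000,
                     max_range=2_500_000, n_bins=100):
    """Bin read-pair separations and normalize the bin values to sum to 1."""
    counts = [0] * n_bins
    bin_width = max_range / n_bins
    for pos_a, pos_b in pairs:
        d = circular_distance(pos_a, pos_b, genome_length)
        # Discard close pairs and anything outside the plotted range.
        if d < min_sep or d >= max_range:
            continue
        counts[int(d // bin_width)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


K12_LENGTH = 4_686_137  # E. coli K12 length, from Table 1

# Two coordinates near the linearization point are close on the circle,
# even though a linear alignment would see them as megabases apart.
print(circular_distance(10, 4_686_130, K12_LENGTH))  # 17

toy_pairs = [(0, 500), (0, 30_000), (0, K12_LENGTH - 60_000)]  # hypothetical
hist = insert_histogram(toy_pairs, K12_LENGTH)
```

The `circular_distance` call shows why treating the assembly as linear causes the edge effect described above: pairs spanning coordinate 0 are only recognized as neighbours when the wrap-around is taken into account.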
Beitel et al. (2014), PeerJ, DOI 10.7717/peerj.415 9/19
Figure 2 Metagenomic Hi-C associations. The log-scaled, normalized number of Hi-C read pairs associating each genomic replicon in the synthetic community is shown as a heat map (see color scale, blue to yellow: low to high normalized, log scaled association rates). Bur1: B. thailandensis chromosome 1. Bur2: B. thailandensis chromosome 2. Lac0: L. brevis chromosome, Lac1: L. brevis plasmid 1, Lac2: L. brevis plasmid 2, Ped: P. pentosaceus, K12: E. coli K12 DH10B, BL21: E. coli BL21.
reference assemblies of the members of our synthetic microbial community with the same alignment parameters as were used in the top ranked clustering (described above). We first counted the number of Hi-C reads associating each reference assembly replicon (Fig. 2; Table S3), observing that Hi-C data associated replicons within the same species (cell) orders of magnitude more frequently than it associated replicons from different species. The rate of within-species association was 98.8% when ignoring read pairs mapping less than 1,000 bp apart. Including read pairs <1,000 bp inflated this figure to 99.97%. Fig. 3 illustrates this by visualizing the graph of contigs and their associations. Similarly, for the two E. coli strains (K12, BL21) we observed the rate of within-strain association to be 96.36%. When evaluated on genes unique to each strain (where read mapping to each strain would be unambiguous), the self-association rate was observed to be >99%.
We observed that the rate of association of L. brevis plasmids 1 and 2 with each other and with the L. brevis chromosome was at least 100-fold higher than with the other constituents of the synthetic community (Fig. 2). Chromosome and plasmid Hi-C contact maps show that the plasmids associate with sequences throughout the L. brevis chromosome (Fig. 4; Figs. S3–S5) and exhibit the expected enrichment near restriction sites. This demonstrates that metagenomic Hi-C can be used to associate plasmids to specific strains in microbial communities as well as to determine cell co-localization of plasmids with one another.
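A within-species (or within-strain) association rate like the ones quoted above could be computed from per-replicon read-pair counts roughly as follows. The replicon-to-species mapping follows the Table 1 abbreviations; the function name, the input format, and the toy counts are hypothetical, and the sketch assumes pairs closer than 1,000 bp were already filtered upstream.

```python
# Replicon -> species, following the abbreviations defined in Table 1.
SPECIES_OF = {
    "Lac0": "L. brevis", "Lac1": "L. brevis", "Lac2": "L. brevis",
    "Ped": "P. pentosaceus",
    "K12": "E. coli", "BL21": "E. coli",
    "Bur1": "B. thailandensis", "Bur2": "B. thailandensis",
}


def within_group_rate(pair_counts, group_of):
    """Fraction of Hi-C read pairs whose two ends map to the same group.

    pair_counts maps (replicon_a, replicon_b) to a number of read pairs.
    """
    within = 0
    total = 0
    for (rep_a, rep_b), n in pair_counts.items():
        total += n
        if group_of[rep_a] == group_of[rep_b]:
            within += n
    return within / total if total else float("nan")


# Hypothetical counts, for illustration only (not data from the paper).
toy_counts = {
    ("Lac0", "Lac0"): 900,
    ("Lac0", "Lac1"): 80,
    ("Lac0", "Ped"): 15,
    ("K12", "Bur1"): 5,
}
print(within_group_rate(toy_counts, SPECIES_OF))
```

Passing a replicon-to-strain mapping instead of `SPECIES_OF` gives the within-strain rate with the same function.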
Variant graph connectedness
Algorithms that reconstruct single-molecule genotypes from samples containing two or more closely-related strains or chromosomal haplotypes depend on reads or read pairs that indicate whether pairs of variants coexist in the same DNA molecule. Such algorithms
Figure 3 Contigs associated by Hi-C reads. A graph is drawn with nodes depicting contigs and edges depicting associations between contigs as indicated by aligned Hi-C read pairs, with the count thereof depicted by the weight of edges. Nodes are colored to reflect the species to which they belong (see legend) with node size reflecting contig size. Contigs below 5 kb and edges with weights less than 5 were excluded. Contig associations were normalized for variation in contig size.
typically represent the reads and variant sites as a variant graph wherein variant sites are represented as nodes, and sequence reads define edges between variant sites observed in the same read (or read pair). We reasoned that variant graphs constructed from Hi-C data would have much greater connectivity (where connectivity is defined as the mean path length between randomly sampled variant positions) than graphs constructed from mate-pair sequencing data, simply because Hi-C inserts span megabase distances. Such connectivity should, in theory, enable more accurate reconstruction of single-molecule genotypes from smaller amounts of data. Furthermore, by linking distant sites with fewer intermediate nodes in the graph, estimates of linkage disequilibrium at distant sites (from a mixed population) are likely to have greater precision.
To evaluate whether Hi-C produces more connected variant graphs we compared the connectivity of variant graphs constructed from Hi-C data to those constructed from simulated mate-pair data (with average inserts of 5 kb, 10 kb, 20 kb, and 40 kb). To exclude paired-end products from the analysis, Hi-C reads with inserts under 1 kb were excluded from the analysis. For each variant graph constructed from these inputs, 10,000 variant position pairs were sampled at random, with 94.75% and 100% of these pairs belonging to the same connected graph component of the Hi-C and 40 kb variant graphs, respectively.
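The sampling step described above (draw random pairs of variant positions and ask whether both fall in the same connected component) can be sketched with a union-find structure. This is an illustrative sketch, not the authors' implementation: the class and function names, the integer node ids, and the toy edge lists are assumptions. It measures only same-component membership, not the mean path length also mentioned in the text.

```python
import random


class DisjointSet:
    """Minimal union-find over integer node ids 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b


def fraction_connected(n_variants, edges, n_samples=10_000, seed=0):
    """Estimate the share of random variant pairs lying in one component.

    edges lists (variant_i, variant_j) pairs observed in the same read pair.
    """
    components = DisjointSet(n_variants)
    for a, b in edges:
        components.union(a, b)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        a, b = rng.sample(range(n_variants), 2)
        if components.find(a) == components.find(b):
            hits += 1
    return hits / n_samples


# Toy graphs: a single chain is fully connected, two separate pairs are not.
chain = [(i, i + 1) for i in range(5)]
print(fraction_connected(6, chain))
print(fraction_connected(4, [(0, 1), (2, 3)]))
```

Long-range Hi-C links act like extra edges that merge components, which is why they are expected to raise this fraction relative to short-insert libraries.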
Chris Beitel @datscimed
Aaron Darling @koadman
![Page 54: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/54.jpg)
PhyloSift - Automated Bayesian Phylogenomics
[Figure: PhyloSift workflow diagram; legible labels listed below]
Input Sequences (each input sequence scanned against both workflows)
rRNA workflow; protein workflow; parallel option
search input against references: LAST fast candidate search
<600 bp; >600 bp
profile HMMs used to align candidates to reference alignment: hmmalign multiple alignment; Infernal multiple alignment
pplacer phylogenetic placement
Taxonomic Summaries
Sample Analysis & Comparison
Krona plots, Number of reads placed for each marker gene
Edge PCA, Tree visualization, Bayes factor tests
Aaron Darling @koadman
Erik Matsen @ematsen
Holly Bik @hollybik
Guillaume Jospin @guillaumejospin
Darling AE, Jospin G, Lowe E, Matsen FA IV, Bik HM, Eisen JA. (2014) PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ 2:e243 http://dx.doi.org/10.7717/peerj.243
Erik Lowe
![Page 55: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/55.jpg)
Normalizing Across Genes Tree OTU
Wu, D., Doroud, L, Eisen, JA 2013. arXiv. TreeOTU: Operational Taxonomic Unit Classification Based on Phylogenetic
Dongying Wu
![Page 56: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/56.jpg)
Challenge: Engaging Public
![Page 57: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/57.jpg)
The Rise of Citizen Microbiology
Darlene Cavalier
![Page 58: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/58.jpg)
Eisen Lab Citizen Microbiology
Kitty Microbiome
Georgia Barguil
Jack Gilbert
Project MERCCURI
Phone and Shoes
Kitty Microbiome Project
tinyurl/kittybiome
Holly Ganz
David Coil
![Page 59: "Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15](https://reader034.fdocuments.net/reader034/viewer/2022042723/58f14c5e1a28ab016a8b4571/html5/thumbnails/59.jpg)
Acknowledgements
DOE JGI Sloan GBMF NSF
DHS DARPA
Aaron Darling Lizzy Wilbanks
Jenna Lang Russell Neches
Rob Knight Jack Gilbert Tanja Woyke Rob Dunn
Katie Pollard Jessica Green
Darlene Cavalier
Eddy Rubin Wendy Brown
Dongying Wu
Phil Hugenholtz
DSMZ
Sundar
Srijak Bhatnagar David Coil
Alex Alexiev
Hannah Holland-Moritz Holly Bik
John Zhang
Holly Menninger
Guillaume Jospin
David Lang
Cassie Ettinger
Tim Harkins Jennifer Gardy
Holly Ganz