Computational metagenomics and the human microbiome
Transcript of Computational metagenomics and the human microbiome
![Page 1: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/1.jpg)
Computational metagenomics and the human microbiome
Curtis Huttenhower
01-21-11
Harvard School of Public Health
Department of Biostatistics
![Page 2: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/2.jpg)
2
What to do with your metagenome?
(×10^10)
Diagnostic or prognostic biomarker for host disease
Public health tool monitoring population health and interactions
Comprehensive snapshot of microbial ecology and evolution
Reservoir of gene and protein functional information
Who’s there?
What are they doing?
What do functional genomic data tell us about microbiomes?
What can our microbiomes tell us about us?*
*Using terabases of sequence and thousands of experimental results
![Page 3: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/3.jpg)
3
The Human Microbiome Project
2007 - ongoing
• 300 “normal” adults, 18-40
• 16S rDNA + WGS
• 5 sites/18 samples + blood
• Oral cavity: saliva, tongue, palate, buccal mucosa, gingiva, tonsils, throat, teeth
• Skin: ears, inner elbows
• Nasal cavity
• Gut: stool
• Vagina: introitus, mid, fornix
• Reference genomes (~200+800)
All healthy subjects; followup projects in psoriasis, Crohn’s, colitis, obesity, acne, cancer, antibiotic resistant infection…
Hamady, 2009
Kolenbrander, 2010
![Page 4: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/4.jpg)
4
HMP Organisms: Everyone and everywhere is different
[Figure: organisms (taxa) by body sites + individuals. Sites shown: ear, gut, nose, mouth, vagina, arm, mucosa, palate, gingiva, tonsils, saliva, sub. plaq., sup. plaq., throat, tongue]
Every microbiome is surprisingly different
Most organisms are rare in most places
Even common organisms vary tremendously in abundance among individuals
Aerobicity, interaction with the immune system, and extracellular medium appear to be major determinants
There are few organismal biotypes in health
![Page 5: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/5.jpg)
5
HUMAnN: Community metabolic and functional reconstruction
HMP Unified Metabolic Analysis Network
300 subjects, 1-3 visits/subject, ~6 body sites/visit
10-200M reads/sample, 100bp reads
WGS reads → BLAST → Genes (KOs) → Pathways (KEGGs)
Pathways/modules
Functional seq.: KEGG + MetaCYC; CAZy, TCDB, VFDB, MEROPS…
• BLAST → Genes: [Equation: gene abundance c(g) computed from the BLAST alignments a of each read r, using alignment p-values p_a and gene length |g|]
• Genes → Pathways: MinPath (Ye 2009)
• Smoothing: Witten-Bell

$$c(g) = \begin{cases} \dfrac{N\,T}{(V-T)(N+T)} & c(g) = 0 \\ c(g)\,\dfrac{N}{N+T} & \text{otherwise} \end{cases}$$

• Gap filling: c(g) = max( c(g), median )
• Taxonomic limitation: Rem. paths in taxa < ave.
• Xipe: Distinguish zero/low (Rodriguez-Mueller in review)
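The smoothing and gap-filling steps on this slide can be illustrated with a minimal sketch. It is not the HUMAnN implementation. It assumes the standard Witten-Bell reading of the symbols (N = total abundance, T = number of gene families observed, V = number of gene families considered) and assumes that "median" means the median abundance of the genes within one pathway. The function names and the toy KO identifiers are hypothetical.

```python
from statistics import median


def witten_bell_smooth(counts):
    """Redistribute abundance mass to unobserved genes (Witten-Bell style).

    counts: dict mapping gene ID -> non-negative abundance.
    Assumed symbol meanings: N = total abundance, T = number of genes with
    non-zero abundance, V = number of genes considered.
    """
    V = len(counts)
    T = sum(1 for c in counts.values() if c > 0)
    N = sum(counts.values())
    # Nothing observed, or nothing unobserved: leave the counts unchanged.
    if T == 0 or T == V:
        return dict(counts)
    unseen = N * T / ((V - T) * (N + T))  # value given to each zero-count gene
    scale = N / (N + T)                   # discount applied to observed genes
    return {g: (c * scale if c > 0 else unseen) for g, c in counts.items()}


def gap_fill(pathway_genes, counts):
    """Raise each gene in a pathway to at least the pathway's median abundance.

    Assumption: the median is taken over the genes of this one pathway.
    """
    med = median(counts[g] for g in pathway_genes)
    return {g: max(counts[g], med) for g in pathway_genes}


# Toy example with hypothetical KO identifiers.
toy = {"K1": 6.0, "K2": 2.0, "K3": 0.0, "K4": 0.0}
smoothed = witten_bell_smooth(toy)
filled = gap_fill(["K1", "K2", "K3"], {"K1": 4.0, "K2": 0.0, "K3": 6.0})
```

Under these assumptions the smoothing preserves the total abundance N: observed genes are scaled down by N/(N+T) and the removed mass is shared equally among the V−T unobserved genes.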
![Page 6: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/6.jpg)
6
HUMAnN: Community metabolic and functional reconstruction
Pathway coverage Pathway abundance
![Page 7: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/7.jpg)
7
HUMAnN: Validating gene and pathway abundances on synthetic data
Validated on individual genes, module coverage + abundance
• False negatives: short genes (<100bp), taxonomically rare pathways
• False positives: large and multicopy (not many in bacteria)
![Page 8: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/8.jpg)
8
HUMAnN: The steps that didn’t make the cut
Abundance
Coverage
![Page 9: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/9.jpg)
9
Functional modules in 741 HMP samples
[Figure: pathways by samples, two panels (Coverage, Abundance). Sample groups: AN, O(BM), PF, O(SP), S, RC, O(TD)]
• Zero microbes (of ~1,000) are core among body sites
• Zero microbes are core among individuals
• 19 (of ~220) pathways are present in every sample
• 53 pathways are present in 90%+ samples
• Only 31 (of 1,110) pathways are present/absent from exactly one body site
• 263 pathways are differentially abundant in exactly one body site
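Counts such as "present in every sample" or "present in 90%+ samples" amount to a prevalence threshold over a presence/absence table. The sketch below shows one way to compute such a "core" set. The function name, the input layout (sample ID mapped to a set of pathway IDs) and the toy data are all hypothetical and are not taken from the HMP analysis.

```python
def core_features(presence, min_fraction=1.0):
    """Return the features present in at least min_fraction of samples.

    presence: dict mapping sample ID -> set of feature (e.g. pathway) IDs
    detected in that sample.
    """
    n_samples = len(presence)
    seen_in = {}
    for features in presence.values():
        for f in features:
            seen_in[f] = seen_in.get(f, 0) + 1
    return {f for f, k in seen_in.items() if k / n_samples >= min_fraction}


# Toy data: four samples, three hypothetical pathway IDs.
toy_presence = {
    "s1": {"P1", "P2", "P3"},
    "s2": {"P1", "P2"},
    "s3": {"P1", "P2"},
    "s4": {"P1"},
}
strict_core = core_features(toy_presence)         # present in every sample
relaxed_core = core_features(toy_presence, 0.75)  # present in >= 75% of samples
```

The same function applies to taxa instead of pathways, which is how the "zero microbes are core" and "19 pathways are core" statements can be compared on equal terms.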
![Page 10: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/10.jpg)
10
Microbial environment trumps host environment (in health)
[Figure panels: HMP stool, colored by BMI; MetaHIT stool, colored by IBD. Axes: microbes, pathways. Annotations: aerobic body sites; gastrointestinal body sites; pathways in all body sites (“core”)]
• Human microbiome structure dictated primarily by microbial niche, not host (in health)
• Huge variation in who’s there; small variation in what they’re doing
• Note: definitely variation in how these functions are implemented
• Does not yet speak to environment (diet!), genetics, or disease
![Page 11: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/11.jpg)
11
Metagenomic biomarker discovery
Gene expression, SNP genotypes
Healthy/IBD, BMI, Diet
Taxa & pathways
Batch effects? Population structure?
Niches & Phylogeny
Test for correlates
Multiple hypothesis correction
Feature selection
p >> n
Confounds/stratification/environment
Cross-validate
Biological story?
Independent sample
Intervention/perturbation
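When many taxa and pathways are each tested for correlates (p >> n), the multiple hypothesis correction step matters. The slide does not say which correction is used. The sketch below assumes the Benjamini-Hochberg false discovery rate procedure as one common choice, and the function name and toy p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, one per tested feature (taxon or pathway).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q <= 0.05]
```

Features whose adjusted value falls below the chosen threshold would then move on to the later steps on the slide, such as cross-validation and testing in an independent sample.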
![Page 12: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/12.jpg)
12
LEfSe: Metagenomic class comparison and explanation
LEfSe
http://huttenhower.sph.harvard.edu/lefse
Nicola Segata
LDA + Effect Size
![Page 13: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/13.jpg)
13
LEfSe: Evaluation on synthetic data
![Page 14: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/14.jpg)
14
Microbes characteristic of the oral and gut microbiota
![Page 15: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/15.jpg)
Aerobic, microaerobic and anaerobic communities
• High oxygen: skin, nasal
• Mid oxygen: vaginal, oral
• Low oxygen: gut
![Page 16: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/16.jpg)
16
LEfSe: The TRUC murine colitis microbiota
With Wendy Garrett
![Page 17: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/17.jpg)
17
MetaHIT: The gut microbiome and IBD
WGS reads
Pathways/modules
124 subjects: 99 healthy, 21 UC + 4 CD
ReBLASTed against KEGG since published data obfuscates read counts
Taxa: Phymm (Brady 2009)
Genes (KOs)
Pathways (KEGGs)
Qin 2010
With Ramnik Xavier, Joshua Korzenik
![Page 18: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/18.jpg)
18
MetaHIT: Taxonomic CD biomarkers
Firmicutes
Enterobacteriaceae
Up in CD
Down in CD
UC
![Page 19: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/19.jpg)
19
MetaHIT: Functional CD biomarkers
Motility Transporters Sugar metabolism
Down in CD
Up in CD
Subset of enriched modules in CD patients
Subset of enriched pathways in CD patients
Growth/replication
![Page 20: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/20.jpg)
20
Sleipnir: Software for scalable functional genomics
Massive datasets require efficient algorithms and implementations.
• Sleipnir C++ library for computational functional genomics
• Data types for biological entities
• Microarray data, interaction data, genes and gene sets, functional catalogs, etc. etc.
• Network communication, parallelization
• Efficient machine learning algorithms
• Generative (Bayesian) and discriminative (SVM)
• And it’s fully documented!
It’s also speedy: microbial data integration computation takes <3hrs.
http://huttenhower.sph.harvard.edu/sleipnir
http://huttenhower.sph.harvard.edu/lefse
http://huttenhower.sph.harvard.edu/humann
![Page 21: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/21.jpg)
21
Thanks!
Jacques Izard, Wendy Garrett
Pinaki Sarder, Nicola Segata
Levi Waldron, Larisa Miropolsky
Interested? We’re recruiting students and postdocs!
Human Microbiome Project
HMP Metabolic Reconstruction
George Weinstock, Jennifer Wortman, Owen White, Makedonka Mitreva, Erica Sodergren, Vivien Bonazzi, Jane Peterson, Lita Proctor
Sahar Abubucker, Yuzhen Ye
Beltran Rodriguez-Mueller, Jeremy Zucker, Qiandong Zeng
Mathangi Thiagarajan, Brandi Cantarel
Maria Rivera, Barbara Methe
Bill Klimke, Daniel Haft
Ramnik Xavier, Dirk Gevers
Bruce Birren, Mark Daly, Doyle Ward, Eric Alm, Ashlee Earl, Lisa Cosimi
Sarah Fortune
http://huttenhower.sph.harvard.edu/
![Page 22: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/22.jpg)
![Page 23: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/23.jpg)
23
The LEfSe algorithm
Statistical consistency
Biological consistency
Overall effect size
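The slide names the three stages but not the tests behind them. As a hedged illustration of the first stage only (statistical consistency), the sketch below computes a Kruskal-Wallis H statistic for one feature across classes, which is the class-level test described in the published LEfSe method. The later stages (subclass comparison and LDA effect size) are not shown. The helper names and toy abundances are hypothetical, and the p-value helper is valid only for two classes.

```python
import math


def midranks(values):
    """1-based ranks of values, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of sample groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = {}
    for v in pooled:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def two_class_pvalue(h):
    """Chi-square (1 degree of freedom) tail probability; two classes only."""
    return math.erfc(math.sqrt(max(h, 0.0) / 2))


# Toy relative abundances of one feature in two classes.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6]])
p_val = two_class_pvalue(h_stat)
```

A feature that passes this class-level test would then be checked for consistency across subclasses before its effect size is estimated.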
![Page 24: Computational metagenomics and the human microbiome](https://reader033.fdocuments.net/reader033/viewer/2022061617/56816552550346895dd7cacd/html5/thumbnails/24.jpg)
24
HMP: Metabolism, host-microbiome interactions, and microbial taxa
>3200 gene families differential in the mucosa
>1500 upregulated outside the mucosa and not in any Actinobacterial genome
16S
WGS