Friedberg lab-overview-grad-students
http://iddo-friedberg.net Twitter: @iddux
The Friedberg Lab
Bacterial genome evolution
Protein function prediction
Metagenomics
Phenomics
About me
● 2003: PhD, Hebrew University, Jerusalem
● 2003-2006: Postdoc, Burnham Institute, CA
● 2006-2009: Researcher, UC San Diego
● 2009-2015: Assistant Professor, Miami University, Ohio
● 2015-: Associate Professor, Iowa State University
Friedberg Lab Members
Naihui Zhou
Nafiz Hamid
Xiao Hu
Ataur Katebi
Huy Nguyen
Lab philosophy
● Ask biological questions that have computational answers
● You may answer something else. That's great.
● There's treasure everywhere. But no data dredging
What is an operon?
● Operons are (almost) unique to prokaryotes.
[Figure: operon diagram with labels Regulation, Gene 1, Gene 2, Gene 3, Transcription, polycistronic mRNA, Translation]
Gene Blocks
[Figure: gene block diagram with labels Gene 1, Gene 2, Gene 3, Transcription, mRNA transcripts, Translation]
● Gene blocks are any suspected syntenic group of open reading frames (ORFs) which have a maximum allowed spacing. For my research this maximum is 500 nt.
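The spacing rule in this definition can be sketched in a few lines of Python. This is only an illustration, not the lab's pipeline: the `ORF` record, the `group_into_blocks` function, and the coordinates are hypothetical, and strand and homology are ignored. Only the 500 nt maximum comes from the slide.

```python
from typing import List, NamedTuple


class ORF(NamedTuple):
    """Hypothetical ORF record: name plus start/end coordinates in nt."""
    name: str
    start: int
    end: int


def group_into_blocks(orfs: List[ORF], max_gap: int = 500) -> List[List[ORF]]:
    """Group ORFs into candidate gene blocks.

    Neighbouring ORFs stay in the same block while the distance between
    the end of one and the start of the next is at most max_gap nt.
    """
    blocks: List[List[ORF]] = []
    for orf in sorted(orfs, key=lambda o: o.start):
        if blocks and orf.start - blocks[-1][-1].end <= max_gap:
            blocks[-1].append(orf)   # close enough: extend the current block
        else:
            blocks.append([orf])     # gap too large: start a new block
    return blocks


# Made-up coordinates: gaps of 100, 400 and 2000 nt.
orfs = [ORF("g1", 0, 900), ORF("g2", 1000, 1800),
        ORF("g3", 2200, 3000), ORF("g4", 5000, 5900)]
blocks = group_into_blocks(orfs)
print([[o.name for o in block] for block in blocks])
```

With the default 500 nt limit, the first three ORFs form one candidate block and the fourth stands alone.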
Background
● Operons are an important feature in prokaryotic genetics.
  – Often contain full metabolic pathways.
    ● A metabolic pathway is a set of chemical processes transforming one compound into another.
  – Regulate groups of genes.
  – Allow for the frequent transfer of gene blocks between organisms.
● Therefore, studying operon evolution helps us to understand metabolic pathway formation.
How we model changes in gene blocks
● We borrow ideas from sequence evolution, but genes are the atoms of change.
  – Changes are called events.
  – There are more possible events in modeling gene block evolution than in biological sequence evolution.
[Figure: sequence example, 5' ATCCGA 3' changing to ATCCGT (substitution) and to ATC-GA (deletion)]
Events investigated
● Deletions
● Duplications
● Splits
Gaps exceeding 100 kbp are common
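As a rough illustration of the three event types, the sketch below counts them for one genome relative to a reference gene block. The function name, the input representation (a list of contiguous fragments of homologs), and the counting rules are assumptions made for this example; the slides do not give the lab's exact definitions.

```python
from collections import Counter
from typing import Dict, List


def count_events(reference: List[str],
                 fragments: List[List[str]]) -> Dict[str, int]:
    """Count simple events relative to a reference gene block.

    reference: gene names in the reference block, e.g. ["A", "B", "C"].
    fragments: homologs found in another genome, grouped into contiguous
               runs (a new fragment starts wherever the allowed gap is
               exceeded).
    Assumed rules (illustrative only):
      deletion    = a reference gene with no homolog in any fragment
      duplication = each extra copy of a reference gene
      split       = each additional fragment beyond the first
    """
    ref_genes = set(reference)
    copies = Counter(g for frag in fragments for g in frag if g in ref_genes)
    deletions = sum(1 for g in ref_genes if copies[g] == 0)
    duplications = sum(n - 1 for n in copies.values())
    occupied = [frag for frag in fragments
                if any(g in ref_genes for g in frag)]
    splits = max(len(occupied) - 1, 0)
    return {"deletions": deletions,
            "duplications": duplications,
            "splits": splits}


# Reference block A-E; the other genome has A B in one place and C D D
# elsewhere, and has no homolog of E.
events = count_events(["A", "B", "C", "D", "E"],
                      [["A", "B"], ["C", "D", "D"]])
print(events)
```

In this made-up case the counts are one deletion (E), one duplication (D), and one split.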
Results: Normalized interspecies event rate
We asked: how do complex structures evolve?
We answered: the relationship between conservation and function.
Ancestry of Orthoblocks
[Figure: gene block A B C D E in species 1 to 5; variants shown include A B C D E D and A B C D E A B C D E]
Species tree shows ancestry
[Figure: species tree for species 1 to 5 with gene block A B C D E at the leaves; variants shown include A B C D E D and A B C D E A B C D E]
Ancestral resolution
[Figure: species tree for species 1 to 5 with gene block A B C D E at the leaves and gene block states at the ancestral nodes; variants shown include A B C D E D and A B C D E A B C D E]
Why CAFA?
“On the one hand, we have enormous “protein” databases that are replete with errors, wishful thinking, phantoms, and uncertainties. On the other, we have a tiny fraction of real proteins that have been studied in any depth.” -- Dan Graur
Biggest problem in molecular biology: < $1,000 per genome, BUT $20,000 to > $10,000,000 for annotation.
CAFA
● The Critical Assessment of Function Annotation
● Hundreds of scientists trying to predict protein function from sequence
● A friendly competition between scientific teams
The protein function prediction problem
>sp|P04637|P53_HUMAN
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
[Figure: the sequence is mapped (?) to terms under Biological process, here Cell differentiation and Apoptosis; a PREDICTED annotation is compared with the TRUE annotation]
MEASURING FUNCTIONAL SIMILARITY
Measures: precision and recall
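The precision and recall named on this slide can be illustrated with plain set arithmetic over ontology terms. This is a minimal sketch using the standard definitions (shared terms divided by predicted terms, and shared terms divided by true terms); the two term sets are made up for the example and are not a reading of the slide's figure.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], true: Set[str]) -> Tuple[float, float]:
    """Precision and recall between two sets of ontology terms.

    precision = |predicted ∩ true| / |predicted|
    recall    = |predicted ∩ true| / |true|
    Both sets are assumed to already include ancestor terms.
    """
    shared = predicted & true
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(true) if true else 0.0
    return precision, recall


# Illustrative term sets only.
predicted_terms = {"biological process", "apoptosis"}
true_terms = {"biological process", "cell differentiation"}
precision, recall = precision_recall(predicted_terms, true_terms)
print(precision, recall)
```

Here only the root term is shared, so one of two predicted terms is correct and one of two true terms is recovered.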
CAFA1 vs. CAFA2
                    CAFA1 (2011)    CAFA2 (2014)
Methods             54              129
Groups              29              57
Targets             50,000          108,000
Benchmarks          ~840            3,681
Target types        No knowledge    No knowledge, partial knowledge
Ontologies          MFO, BPO        MFO, BPO, CCO, HPO
Target set choice   Full mode       Full mode, partial (>5000) mode
Assessment          Fmax            Fmax, Smin
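For readers unfamiliar with the Fmax measure in the Assessment row, here is a hedged sketch of how it is commonly computed: the maximum, over score thresholds, of the harmonic mean of averaged precision and averaged recall. The function name, the input layout, and the threshold grid are assumptions for illustration, and details such as term propagation are left out; this is not the official CAFA evaluation code.

```python
from typing import Dict, Optional, Sequence, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         thresholds: Optional[Sequence[float]] = None) -> float:
    """Maximum F-measure over score thresholds.

    predictions: protein -> {term: confidence score in [0, 1]}
    truth:       protein -> non-empty set of true terms (the benchmark)
    Precision is averaged over proteins with at least one term at or
    above the threshold; recall is averaged over all benchmark proteins.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]  # assumed grid
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # nothing predicted at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(truth)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy benchmark with made-up proteins, terms and scores.
toy_truth = {"P1": {"a", "b"}, "P2": {"c"}}
toy_predictions = {"P1": {"a": 0.9, "b": 0.4, "x": 0.2},
                   "P2": {"c": 0.8, "y": 0.6}}
print(round(fmax(toy_predictions, toy_truth), 3))
```

A single threshold is shared by all proteins, which is why a method's score calibration affects its Fmax.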
Human Phenotype Ontology
Many terms (>80) per protein. Some are easily predicted, others are not predicted at all.
CAFA Team
Predrag Radivojac
Casey Greene
Sean Mooney
Maria Martin
Claire O'Donovan
...metagenomic revolution
6,000,000 genes, >6,000,000,000 bp, ~900 ribotypes
3,000,000 genes, 160,000,000 bp, >1,100 ribotypes
Microbial community (metagenome)
Gut epithelial cells (transcriptome)
454 sequencing
Who? (Phylogenetic analysis)
What? (functional analysis)
Clean & RT
Codelink chip
Microbial community (metagenome)
Gut epithelial cells (transcriptome)
Altered Schaedler Flora
• Bacteria originate from mice
  – Unique niche
    • Stable GI community
    • All cultivable in anaerobic chamber
• Advantages
  – Representative community
    • Animals are healthy/normal growth
  – Specific host immune responses
    • Antibodies
    • T-cell recall
  – Evaluate entire GI microbiota
    • Quantitative changes
    • Spatial redistribution
    • Community gene expression
    • Mutation identification
ASF
The Altered Schaedler Flora (ASF): Greg Phillips & Michael Wannemuehler
Wannemuehler, et al. 2014. Genome Announc. 2.
1: No Proteobacteria in ASF. *: Specific Pathogen Free.
ASF*1
Table 1: Genome sequencing results of the ASF community

ASF #    Taxonomy                     Genome Size (Mb)  %GC    Gene count  Contig count  N50 (Kb)  Fold Coverage  GenBank Accession #
ASF356   Clostridium bacterium        2.91              30.91  2799        31            209       143            AQFQ00000000.1
ASF360   Lactobacillus intestinalis   2.01              35.86  1930        244           19        47             AQFR00000000.1
ASF361   Lactobacillus murinus        2.17              39.96  2102        78            59.7      160            AQFS00000000.1
ASF457   Mucispirillum schaedleri     2.33              31.15  2144        39            151       142            AYGZ00000000.1
ASF492   Eubacterium plexicaudatum    6.51              42.86  6217        248           74.4      119            AQFT00000000.1
ASF500   Pseudoflavonifractor sp.     3.70              58.77  3563        42            300       137            AYJP00000000.1
ASF502   Clostridium bacterium        6.48              47.90  6062        134           137       82             AQFU00000000.1
ASF519   Parabacteroides goldsteinii  6.87              43.45  5477        24            584       143            AQFV00000000.1