J Lichtenberg - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments
Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments
Jens Lichtenberg
Hematopoiesis Section, Genetics and Molecular Biology Branch
National Human Genome Research Institute, National Institutes of Health
http://code.google.com/p/nextgen-signatures
GNU General Public License, version 3.0 (GPLv3)
Motivation
● Large variety of omics approaches that produce sequencing data
● Common threads in the evaluation process
● Few approaches exist that attempt the large scale analysis of omics data
● Direct correlation of multiple omics data into actual biological insights
[Diagram labels: ChIP Seq, Histone Seq, Methylation Seq, Protein Seq, RNA Seq; Comprehensive Analysis; Systems Biology Insights]
Requirements
● General
  ○ Quantification of sequencing data requires a dynamic pipeline allowing for frequent adjustments
  ○ Close interaction between bench and analysis personnel
● Specific
  ○ Quantitative analysis
  ○ Functional analysis
  ○ Regulatory analysis
  ○ Visualizations
General Analysis Approach
Hematopoietic Stem Cell Differentiation in Mouse
● Microarray Data curated in BloodExpress
● RNA Seq Data
● Methylation Seq Data
● ChIP Seq Data (EKLF)
● Histone Seq Data
Methylation Seq
Peak Calling, Expression Correlation, Motif Discovery, Occupancy Validation
Transcription Factors  Occupied Sites  Number Overlapping  Exp. Overlapping  Z-Score  P-Value
ERG                    36166           966                 1983              -20.80   2.16e-96
FLI1                   19601           348                 1075              -21.32   3.70e-101
GATA2                  9234            278                 507               -9.87    2.81e-23
GFI1B                  8853            235                 486               -11.04   1.23e-28
...
RUNX1                  5269            97                  290               -11.11   5.61e-29
SCL                    7096            146                 389               -12.26   7.42e-35
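The P-Value column is numerically consistent with the lower-tail probability of a standard normal distribution evaluated at the Z-Score. The sketch below reproduces that relationship for two rows. The function name `lower_tail_p` is illustrative, and because the slide does not state how the expected overlap or its variance were obtained, the Z-Scores are taken as given rather than recomputed.

```python
import math


def lower_tail_p(z: float) -> float:
    """One-sided (lower-tail) standard normal probability P(Z <= z)."""
    # erfc stays accurate far into the tail, where 1 - erf would round to 0.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Z-Scores copied from the table above.
z_scores = {"ERG": -20.80, "GATA2": -9.87}
p_values = {tf: lower_tail_p(z) for tf, z in z_scores.items()}

for tf, p in p_values.items():
    print(f"{tf}: z = {z_scores[tf]:.2f}, p = {p:.2e}")
```

A negative Z-Score here means fewer overlaps were observed than expected, so the lower tail is the relevant one.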
ChIP Seq
Peak Calling, Methylation Correlation, Functional Analysis, Motif Discovery
             ERY (Meth.)  MEP (Meth.)
Total        1187         587
Dist. Prom.  210          102
Prox. Prom.  29           21
Downstream   345          207
RefSeq       983          513
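Counts like these come from intersecting one set of genomic intervals (for example, ChIP peaks) with another (for example, methylated regions). A minimal sketch of that overlap count follows. It is not the framework's implementation: the function name, the half-open integer coordinates, and the toy intervals are all assumptions made for illustration, and the regions are assumed to be non-overlapping and on a single chromosome.

```python
from bisect import bisect_right


def count_overlapping(peaks, regions):
    """Count peaks that overlap at least one region.

    Both arguments are lists of (start, end) half-open integer intervals
    on one chromosome. Regions are assumed not to overlap each other.
    """
    regions = sorted(regions)
    starts = [start for start, _ in regions]
    hits = 0
    for p_start, p_end in peaks:
        # Index of the last region that starts before the peak ends.
        i = bisect_right(starts, p_end - 1) - 1
        # With non-overlapping regions, that region has the largest end
        # among all candidates, so checking it alone is sufficient.
        if i >= 0 and regions[i][1] > p_start:
            hits += 1
    return hits


# Toy coordinates, not taken from the study.
toy_peaks = [(100, 200), (300, 400), (1000, 1100)]
toy_regions = [(150, 160), (400, 500), (900, 1001)]
n_hits = count_overlapping(toy_peaks, toy_regions)
print(n_hits)
```

Sorting once and using binary search keeps each lookup logarithmic in the number of regions, which matters at the scale of tens of thousands of sites.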
● EKLF control in MEP can be found in the first intron (Siatecka and Bieker, Blood, 2011)
● During erythropoiesis EKLF is restricted to hematopoietic organs (Siatecka and Bieker, Blood, 2011)
● Down-regulation of EKLF expression in MEP cells leads to megakaryopoiesis (Siatecka and Bieker, Blood, 2011)
Histone Seq
Peak Calling, EKLF/Methylation Correlation, Functional Analysis, Motif Discovery
MEME (OOPS)
TomTom Lookup:
● THI2, ZincFinger
● NKx2-5, Homeobox
● NKx2-6, Homeobox
MEME (ZOOPS)
TomTom Lookup:
● THI2, ZincFinger
● NKx2-3, Homeobox
● NKx2-5, Homeobox
● NKx2-6, Homeobox
● NKx3-1, Homeobox
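OOPS and ZOOPS are MEME occurrence models: OOPS assumes exactly one motif occurrence per sequence, ZOOPS allows zero or one. The sketch below only assembles command lines for the two runs and a follow-up TomTom comparison; it does not execute them. The file names, output directories, and motif count are hypothetical, and these are not necessarily the options used for the results above.

```python
import shlex


def meme_command(fasta, out_dir, model, n_motifs=3):
    """Build a MEME command line for the given occurrence model."""
    if model not in ("oops", "zoops"):
        raise ValueError(f"unsupported model: {model}")
    return ["meme", fasta, "-dna", "-mod", model,
            "-nmotifs", str(n_motifs), "-oc", out_dir]


def tomtom_command(motif_file, database, out_dir):
    """Build a TomTom command line comparing motifs against a database."""
    return ["tomtom", "-oc", out_dir, motif_file, database]


# Hypothetical input and output paths.
commands = [
    meme_command("peaks.fa", "meme_oops", "oops"),
    meme_command("peaks.fa", "meme_zoops", "zoops"),
    tomtom_command("meme_zoops/meme.txt", "motif_db.meme", "tomtom_zoops"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Passing each list to `subprocess.run` would launch the tools, assuming the MEME Suite is installed and on the path.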
RNA Seq
Peak Calling, Functional Analysis, mRNA Differentiation, Motif Discovery
Pathway Name | ERY, MEP, MEG | MEG, MEP | ERY, MEG | ERY, MEP
ERK/MAPK Sig. 1.83E-09 4.47E-16 5.01E-10
IGF-1 Sig. 1.04E-15 1.25E-10
MolMech. Cancer 3.72E-10 1.59E-22 1.13E-10 3.72E-10
...
PI3K/AKT Sig. 3.22E-20 2.84E-24 6.24E-18 1.33E-15
[Venn diagram: ERY, MEP, MEG; region counts 3338, 47, 966, 1308, 216, 2408, 241]
            Increase  Decrease
MEP -> MEG  1238      7323
MEP -> ERY  1198      9307
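The ERY/MEP/MEG overlap counts above partition three sets into seven exclusive regions. A minimal sketch of that partitioning with Python sets follows; the function name and the toy identifiers are assumptions, and the slide's own counts are not mapped to specific regions here because that assignment is not recoverable from the transcript.

```python
def venn_regions(ery, mep, meg):
    """Return the sizes of the seven exclusive regions of a three-set Venn diagram."""
    return {
        "ERY only": len(ery - mep - meg),
        "MEP only": len(mep - ery - meg),
        "MEG only": len(meg - ery - mep),
        "ERY, MEP": len((ery & mep) - meg),
        "ERY, MEG": len((ery & meg) - mep),
        "MEG, MEP": len((mep & meg) - ery),
        "ERY, MEP, MEG": len(ery & mep & meg),
    }


# Toy identifiers standing in for transcripts detected in each cell type.
toy_ery = {"a", "b", "c", "d"}
toy_mep = {"b", "c", "e"}
toy_meg = {"c", "d", "f", "g"}

regions = venn_regions(toy_ery, toy_mep, toy_meg)
for name, size in regions.items():
    print(f"{name}: {size}")
```

Because the regions are mutually exclusive, their sizes sum to the size of the union of the three sets, which is a convenient sanity check.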
Comprehensive Approach
Current Status
● Perl Framework
  ○ Commonly used applications and repositories
● Next-Generation Sequencing
  ○ Read Mapping
    ■ UCSC Genomic Data
  ○ Peak Calling/Partitioning
    ■ UCSC Genomic Data
  ○ Transcript Quantification
    ■ UCSC/Ensembl Genomic Data
● Functional Genomics
  ○ Expression Correlation
    ■ BloodExpress Database
  ○ Pathway Analysis
    ■ KEGG/IPA
  ○ Ontology Analysis
    ■ GO/IPA
● Regulatory Genomics
  ○ Enumerative motif discovery
    ■ Transfac/Jaspar Database
  ○ Occupancy validation
    ■ Literature specific data sets
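Enumerative motif discovery, listed above, exhaustively counts short words (k-mers) in a set of sequences and ranks them. The sketch below illustrates the idea in Python by counting how many sequences contain each k-mer. It is not the Perl framework's implementation: the function name, the word length, and the toy sequences are assumptions, and a real analysis would also score each word against a background model.

```python
from collections import Counter


def enumerate_kmers(sequences, k):
    """Count, for every k-mer, the number of sequences that contain it."""
    support = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted at most once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    return support


# Toy sequences, not taken from the study.
toy_sequences = ["ACGTGGGCGT", "TTGGGCGAAC", "CCGGGCGTTA"]
support = enumerate_kmers(toy_sequences, k=5)

for kmer, n in support.most_common(3):
    print(kmer, n)
```

Top-ranked words could then be compared against known motifs, for example from the Transfac or Jaspar databases named above.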
Future Issues
Data
● Complete case study for Protein Seq
Implementation
● Complete implementation of all analysis facets
● Transition Perl framework to C++ architecture
● Parallelize software architecture for higher performance/throughput
Support
● Update web-interface and documentation to allow unassisted data analysis
Conclusions and Availability
● A comprehensive approach is possible
● Meaningful results can be extracted using the approach
● Regulatory genomics can be used as a suitable post-processing analysis
● Comprehensive hematopoiesis study is feasible
● http://code.google.com/p/nextgen-signatures (GNU General Public License, version 3.0)
Acknowledgements
NHGRI - GMBB - Hematopoiesis Section
David Bodine and Amber Hogart
NHGRI Intramural Training Program