J Lichtenberg - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments Jens Lichtenberg Hematopoiesis Section, Genetics and Molecular Biology Branch National Human Genome Research Institute, National Institutes of Health http://code.google.com/p/nextgen-signatures GNU General Public License, version 3.0 (GPLv3)

Presentation by J Lichtenberg at BOSC2012 - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Transcript of J Lichtenberg - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Page 1: J Lichtenberg - Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Discovery of motif-based regulatory signatures in NextGen Sequencing Experiments

Jens Lichtenberg

Hematopoiesis Section, Genetics and Molecular Biology Branch

National Human Genome Research Institute, National Institutes of Health

http://code.google.com/p/nextgen-signatures

GNU General Public License, version 3.0 (GPLv3)

Page 2

Motivation

● Large variety of omics approaches that produce sequencing data

● Common threads in the evaluation process

● Few approaches exist that attempt the large scale analysis of omics data

● Direct correlation of multiple omics data into actual biological insights

[Diagram labels: ChIP Seq, Histone Seq, Methylation Seq, Protein Seq, RNA Seq; Comprehensive Analysis; Systems Biology Insights]

Page 3

Requirements

● General
  ○ Quantification of sequencing data requires a dynamic pipeline allowing for frequent adjustments
  ○ Close interaction between bench and analysis personnel

● Specific
  ○ Quantitative analysis
  ○ Functional analysis
  ○ Regulatory analysis
  ○ Visualizations

Page 4

General Analysis Approach

Page 5

Hematopoietic Stem Cell Differentiation in Mouse

Microarray Data curated in BloodExpress

RNA Seq Data

Methylation Seq Data

ChIP Seq Data (EKLF)

Histone Seq Data

Page 6

Methylation Seq

Peak Calling, Expression Correlation, Motif Discovery, Occupancy Validation

Transcription Factors   Occupied Sites   Number Overlapping   Exp. Overlapping   Z-Score   P-Value
ERG                     36166            966                  1983               -20.80    2.16e-96
FLI1                    19601            348                  1075               -21.32    3.70e-101
GATA2                   9234             278                  507                -9.87     2.81e-23
GFI1B                   8853             235                  486                -11.04    1.23e-28
...
RUNX1                   5269             97                   290                -11.11    5.61e-29
SCL                     7096             146                  389                -12.26    7.42e-35
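The slides do not say how the P-Value column was derived from the Z-Score column. The listed pairs are, however, numerically consistent with a one-sided (lower-tail) standard normal probability, so the sketch below assumes that relationship. The function name `lower_tail_p` is introduced here for illustration only; how the Z-scores themselves were computed is not stated and is not reproduced.

```python
import math

def lower_tail_p(z: float) -> float:
    """Lower-tail probability of a standard normal variable at z.

    Assumption (not stated in the slides): the P-Value column is Phi(z).
    """
    # Phi(z) = 0.5 * erfc(-z / sqrt(2)); erfc keeps precision far in the tail,
    # where 1 - cdf style formulas would round to zero.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Z-scores copied from the occupancy table.
z_scores = {
    "ERG": -20.80,
    "FLI1": -21.32,
    "GATA2": -9.87,
    "GFI1B": -11.04,
    "RUNX1": -11.11,
    "SCL": -12.26,
}

for tf, z in z_scores.items():
    print(f"{tf:6s} z={z:7.2f}  p={lower_tail_p(z):.2e}")
```

Under this assumption the computed values agree with the table to within rounding of the two-decimal Z-scores. The strongly negative Z-scores mean that fewer occupied sites overlap than the "Exp. Overlapping" column would suggest.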

Page 7

ChIP Seq

Peak Calling, Methylation Correlation, Functional Analysis, Motif Discovery

              ERY (Meth.)   MEP (Meth.)
Total         1187          587
Dist. Prom.   210           102
Prox. Prom.   29            21
Downstream    345           207
RefSeq        983           513

● EKLF control in MEP can be found in the first intron (Siatecka and Bieker, Blood, 2011)

● During erythropoiesis EKLF is restricted to hematopoietic organs (Siatecka and Bieker, Blood, 2011)

● Down-regulation of EKLF expression in MEP cells leads to megakaryopoiesis (Siatecka and Bieker, Blood, 2011)

Page 8

Histone Seq

Peak Calling, EKLF/Methylation Correlation, Functional Analysis, Motif Discovery

MEME (OOPS)

TomTom Lookup:
● THI2, ZincFinger
● NKx2-5, Homeobox
● NKx2-6, Homeobox

MEME (ZOOPS)

TomTom Lookup:
● THI2, ZincFinger
● NKx2-3, Homeobox
● NKx2-5, Homeobox
● NKx2-6, Homeobox
● NKx3-1, Homeobox
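OOPS and ZOOPS are MEME's site-distribution models: OOPS assumes exactly one motif occurrence per sequence, ZOOPS assumes zero or one. The sketch below only assembles the command lines for such a run followed by a TomTom lookup; it does not execute them. The helper names, the file names (`histone_peaks.fa`, `motif_db.meme`), the output directories and the motif count are placeholders, not values taken from the presentation.

```python
import shlex

def meme_command(fasta, model, out_dir, n_motifs=3):
    """Build a MEME command line for the given site-distribution model."""
    if model not in ("oops", "zoops"):
        raise ValueError("model must be 'oops' or 'zoops'")
    # -dna: DNA alphabet, -mod: site distribution, -oc: output dir (overwrite)
    return ["meme", fasta, "-dna", "-mod", model,
            "-nmotifs", str(n_motifs), "-oc", out_dir]

def tomtom_command(query_motifs, motif_db, out_dir):
    """Build a TomTom command comparing discovered motifs to a motif database."""
    return ["tomtom", "-oc", out_dir, query_motifs, motif_db]

# Hypothetical inputs; one MEME run per model, each followed by a lookup.
for model in ("oops", "zoops"):
    meme_dir = f"meme_{model}"
    print(shlex.join(meme_command("histone_peaks.fa", model, meme_dir)))
    print(shlex.join(tomtom_command(f"{meme_dir}/meme.txt",
                                    "motif_db.meme", f"tomtom_{model}")))
```

Running both models on the same peak sequences, as the slide does, shows how sensitive the reported matches are to the assumption that every sequence carries a site.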

Page 9

RNA Seq

Peak Calling, Functional Analysis, mRNA Differentiation, Motif Discovery

Pathway Name      ERY, MEP, MEG   MEG, MEP   ERY, MEG   ERY, MEP
ERK/MAPK Sig.     1.83E-09, 4.47E-16, 5.01E-10 [three values; column positions not recoverable]
IGF-1 Sig.        1.04E-15, 1.25E-10 [two values; column positions not recoverable]
MolMech. Cancer   3.72E-10        1.59E-22   1.13E-10   3.72E-10
...
PI3K/AKT Sig.     3.22E-20        2.84E-24   6.24E-18   1.33E-15

[Venn diagram of ERY, MEP and MEG; region counts as extracted: 3338, 47, 966, 1308, 216, 2408, 241]

             Increase   Decrease
MEP -> MEG   1238       7323
MEP -> ERY   1198       9307
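The seven counts on this slide match the seven regions of a three-set Venn diagram over ERY, MEP and MEG. The transcript does not preserve which count belongs to which region, so the sketch below only illustrates how such region counts can be derived from per-cell-type gene sets. The gene identifiers and the function name `venn3_regions` are made up for illustration.

```python
def venn3_regions(sets):
    """Count members falling in each exclusive region of a Venn diagram.

    `sets` maps a set name to a set of identifiers. Keys of the result are
    tuples of the (sorted) set names an identifier belongs to.
    """
    names = sorted(sets)
    universe = set().union(*sets.values())
    counts = {}
    for item in universe:
        region = tuple(name for name in names if item in sets[name])
        counts[region] = counts.get(region, 0) + 1
    return counts

# Hypothetical gene sets, not data from the presentation.
expressed = {
    "ERY": {"g1", "g2", "g3", "g5"},
    "MEP": {"g2", "g3", "g4"},
    "MEG": {"g3", "g4", "g6"},
}

for region, n in sorted(venn3_regions(expressed).items()):
    print(" & ".join(region), n)
```

Because every identifier lands in exactly one region, the region counts add up to the size of the union of the three sets.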

Page 10

Comprehensive Approach

Current Status

● Perl Framework
  ○ Commonly used applications and repositories

● Next-Generation Sequencing
  ○ Read Mapping
    ■ UCSC Genomic Data
  ○ Peak Calling/Partitioning
    ■ UCSC Genomic Data
  ○ Transcript Quantification
    ■ UCSC/Ensembl Genomic Data

● Functional Genomics
  ○ Expression Correlation
    ■ BloodExpress Database
  ○ Pathway Analysis
    ■ KEGG/IPA
  ○ Ontology Analysis
    ■ GO/IPA

● Regulatory Genomics
  ○ Enumerative motif discovery
    ■ Transfac/Jaspar Database
  ○ Occupancy validation
    ■ Literature specific data sets
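Enumerative motif discovery, listed under Regulatory Genomics, generally means exhaustively counting candidate words (k-mers) in a set of sequences rather than fitting a probabilistic model. The following is a minimal sketch of that general idea, not the framework's Perl implementation. The function name, the toy sequences and the choice of k = 8 are assumptions made for illustration.

```python
from collections import Counter

def enumerate_kmers(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    support = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Collect each k-mer once per sequence so repeats do not inflate support.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        # Skip words containing ambiguous bases such as N.
        support.update(kmer for kmer in seen if set(kmer) <= set("ACGT"))
    return support

# Toy sequences, made up for this example.
seqs = ["TTCCACACCCTAA", "GGCCACACCCAGT", "ACGTCCACACCCT"]

for kmer, n_seqs in enumerate_kmers(seqs, 8).most_common(3):
    print(kmer, n_seqs)
```

In a real analysis the word counts would still have to be scored against a background model and compared with known motifs, for example from the Transfac/Jaspar databases named above.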

Page 11

Future Issues

Data
● Complete case study for Protein Seq

Implementation
● Complete implementation of all analysis facets
● Transition Perl framework to C++ architecture
● Parallelize software architecture for higher performance/throughput

Support
● Update web-interface and documentation to allow unassisted data analysis

Page 12

Conclusions and Availability

● A comprehensive approach is possible

● Meaningful results can be extracted using the approach

● Regulatory genomics can be used as a suitable post-processing analysis

● Comprehensive hematopoiesis study is feasible

● http://code.google.com/p/nextgen-signatures (GNU General Public License, version 3.0)

Page 13

Acknowledgements

NHGRI - GMBB - Hematopoiesis Section

David Bodine and Amber Hogart

NHGRI Intramural Training Program