ChIP-seq and related applications

Post on 24-Feb-2016

Identifying regulatory functions in genomes

Chr5: 133,876,119 – 134,876,119

Genes

Transcription

• Regulatory elements are not easily detected by sequence analysis

• Examine biochemical correlates of RE activity in cells/tissues:
  • Chromatin immunoprecipitation (ChIP-seq)
  • DNase-seq and FAIRE
  • Methylated DNA immunoprecipitation (MeDIP)

Noonan and McCallion, Ann Rev Genomics Hum Genet 11:1 (2010)

Identifying regulatory functions in genomes

1. TF binding

Biochemical indicators of regulatory function

2. Histone modification
• H3K27ac
• H3K4me3

3. Chromatin modifiers & coactivators: p300, MLL

4. DNA looping factors: cohesin

Regulatory functions are tissue/cell type/time point-specific

From Visel et al. (2009) Nature 461:199

Identifying regulatory functions in genomes

Chr5: 133,876,119 – 134,876,119

Genes

Transcription

TF binding

Histone mods

Methods

ChIP-seq Chromatin accessibility

TFs Histone mods DNase FAIRE

From Furey (2012) Nat Rev Genet 13:840

ChIP-seq

ChIP

Input

Peak call Signal

Align reads to reference

Use peaks of mapped reads to identify binding events
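As a rough illustration of what "peaks of mapped reads" means, the sketch below bins aligned read start positions into fixed-size windows and counts reads per window. The function name `window_counts` and the 200 bp window are assumptions made for this example, not part of any particular peak caller.

```python
from collections import Counter

def window_counts(read_starts, window_size):
    """Count reads per fixed-size window on one chromosome.

    read_starts: iterable of 0-based alignment start positions.
    Returns a Counter mapping window index -> number of reads.
    """
    return Counter(pos // window_size for pos in read_starts)

# Toy positions (hypothetical): three reads pile up in the first window.
counts = window_counts([0, 5, 199, 200, 450], window_size=200)
```

Windows with unusually high counts are candidate binding events; whether they are significant still has to be judged against a control.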

PCR

ChIP-seq is an enrichment method. It requires a statistical framework for determining the significance of enrichment.

ChIP-seq ‘peaks’ are regions of enriched read density relative to an input control.

Input = sonicated chromatin collected prior to immunoprecipitation
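Enrichment relative to input can be sketched as a depth-normalized ratio. This is a minimal example under stated assumptions: the function name, the library-size scaling, and the pseudocount of 1 are illustrative choices, and real peak callers use more elaborate background models.

```python
def fold_enrichment(chip_count, input_count, chip_total, input_total,
                    pseudocount=1.0):
    """Depth-normalized ChIP/input ratio for one genomic window."""
    # Scale the input count to the ChIP library size so the two are comparable.
    expected = input_count * (chip_total / input_total)
    # The pseudocount avoids division by zero where the input has no reads.
    return (chip_count + pseudocount) / (expected + pseudocount)

# Hypothetical window: 17 ChIP reads vs 4 input reads, with the ChIP
# library sequenced twice as deeply as the input.
ratio = fold_enrichment(17, 4, chip_total=2_000_000, input_total=1_000_000)
```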

ChIP

Input

Peak call Enrichment relative to control

Calling peaks in ChIP-seq data

Wilbanks and Facciotti PLoS ONE 5:e11471 (2010)

There are many ChIP-seq peak callers available

From Park (2009) Nat Rev Genet 10:669

Generating ChIP-seq peak profiles

Artifacts:

• Repeats
• PCR duplicates
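One common way to limit the PCR duplicate artifact is to keep a single read per identical alignment position and strand. The sketch below assumes reads are given as hypothetical `(chrom, start, strand)` tuples; it is an illustration of the idea, not the procedure of any specific tool.

```python
def remove_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand) combination."""
    seen = set()
    kept = []
    for read in reads:
        key = (read[0], read[1], read[2])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept

# Toy reads (hypothetical): the second entry duplicates the first.
reads = [("chr5", 100, "+"), ("chr5", 100, "+"),
         ("chr5", 100, "-"), ("chr5", 101, "+")]
unique_reads = remove_duplicates(reads)
```

Collapsing every position to one read can also discard genuine signal at very strongly bound sites, so the cutoff is a judgment call.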

Assessing statistical significance

# of reads at a site (S)

Empirical FDR: Call peaks in input (using ChIP as control)

FDR = ratio of # of peaks of given enrichment value called in input vs ChIP
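The empirical FDR described above can be sketched as follows, assuming peak calling has already been run in both directions and each peak carries an enrichment score. The function and argument names are hypothetical.

```python
def empirical_fdr(chip_scores, swapped_scores, threshold):
    """Estimate FDR at an enrichment threshold.

    chip_scores: scores of peaks called in ChIP, with input as control.
    swapped_scores: scores of peaks called in input, with ChIP as control.
    Returns None when no ChIP peak passes the threshold.
    """
    n_chip = sum(1 for s in chip_scores if s >= threshold)
    n_swapped = sum(1 for s in swapped_scores if s >= threshold)
    if n_chip == 0:
        return None
    # Cap at 1.0: the swapped call can exceed the ChIP call at weak thresholds.
    return min(1.0, n_swapped / n_chip)

# Toy scores (hypothetical): 4 ChIP peaks and 1 swapped peak score >= 10.
fdr = empirical_fdr([5, 10, 20, 40, 80], [5, 6, 12], threshold=10)
```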

Assume read counts at a site follow a Poisson distribution

Many sites in input data will have some reads by chance

Some sites will have many reads
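Under the Poisson assumption, the significance of observing S reads at a site is the upper-tail probability given the expected background count. The sketch below computes it with the standard library only; the function name and the example numbers are assumptions for illustration.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X = i) for i = 0 .. k-1, then take the complement.
    term = math.exp(-lam)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    # Clamp to 0 in case rounding pushes the complement slightly negative.
    return max(0.0, 1.0 - cdf)

# Hypothetical site: 10 reads observed where background predicts 2.
p_value = poisson_upper_tail(10, 2.0)
```

Taking the complement of the cumulative sum loses precision for extremely small p-values, so this is suitable only as an illustration of the calculation.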

From Pepke et al (2009) Nat Meth 6:S22

Assessing statistical significance

# of reads at a site (S)

From Park (2009) Nat Rev Genet 10:669

Sequencing depth matters:

ChIP-seq signal profiles vary depending on factor

Transcription factors

Pol II

Histone mods

From Park (2009) Nat Rev Genet 10:669

Quantitative analysis of ChIP-seq signal profiles

ChIP-seq signal

Signal at 20,000 bound sites

HeLa K562

Sites strongly marked in HeLa

Sites strongly marked in K562

Clustering

Sites strongly marked in both
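The three groups named above (HeLa, K562, both) can be approximated without full clustering by thresholding the log ratio of signal between the two cell types. This is a simpler stand-in for the clustering shown on the slide; the function name, the two-fold cutoff, and the pseudocount are assumptions.

```python
import math

def classify_site(hela_signal, k562_signal, min_log2_ratio=1.0,
                  pseudocount=1.0):
    """Label a bound site by which cell type carries the stronger signal."""
    ratio = math.log2((hela_signal + pseudocount) /
                      (k562_signal + pseudocount))
    if ratio >= min_log2_ratio:
        return "HeLa"
    if ratio <= -min_log2_ratio:
        return "K562"
    # Assumes every site passed in is already bound in at least one cell type.
    return "both"

# Toy signals (hypothetical): (31 + 1) / (7 + 1) is a four-fold difference.
label = classify_site(31, 7)
```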

ChIP-seq analysis workflow

From Park (2009) Nat Rev Genet 10:669

Interpreting ChIP-seq datasets

Requires some prior knowledge
• TF function
• Histone modification
• Potential target genes

Exploit existing annotation
• Promoter locations
• Known binding sites
• Known histone modification maps

Example from PS1: CTCF and RAD21 (cohesin)

CTCF and cohesin co-occupy many sites
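Co-occupancy can be quantified by counting how many peaks from one factor overlap a peak from the other. The sketch below assumes peaks are half-open `(chrom, start, end)` tuples and uses a simple quadratic scan; the names and coordinates are hypothetical.

```python
def count_overlapping(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b."""
    count = 0
    for chrom_a, start_a, end_a in peaks_a:
        for chrom_b, start_b, end_b in peaks_b:
            # Half-open intervals overlap when each starts before the other ends.
            if chrom_a == chrom_b and start_a < end_b and start_b < end_a:
                count += 1
                break
    return count

# Toy peaks (hypothetical coordinates).
ctcf = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
rad21 = [("chr1", 150, 250), ("chr2", 300, 400)]
shared = count_overlapping(ctcf, rad21)
```

For genome-scale peak sets a sorted sweep or an interval tree would be more appropriate than the nested loop used here.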

Promoters

Insulators

Enhancers

From Kagey et al (2010) Nature 467:430

CTCF: marks insulators and promoters

RAD21 (cohesin): marks insulators, promoters and enhancers

Promoter Enhancers?

Limb Brain

Discovering regulatory functions specific to a biological state

Function?

Assign enhancers to genes based on proximity (not ideal)

GREAT: bejerano.stanford.edu/great/

Gene ontology annotation assigned to regulatory sequences
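The proximity-based assignment mentioned above can be sketched as a nearest-TSS lookup. This is only the naive rule, not how GREAT defines regulatory domains; the function name and the gene names `geneA` and `geneB` are hypothetical.

```python
def nearest_gene(enhancer_chrom, enhancer_mid, genes):
    """Return (gene_name, distance) for the closest TSS on the same chromosome.

    genes: iterable of (name, chrom, tss_position) tuples.
    Returns (None, None) if no gene lies on that chromosome.
    """
    best_name, best_dist = None, None
    for name, chrom, tss in genes:
        if chrom != enhancer_chrom:
            continue
        dist = abs(tss - enhancer_mid)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist

# Toy annotation (hypothetical genes and coordinates).
genes = [("geneA", "chr5", 1_000), ("geneB", "chr5", 50_000)]
assigned = nearest_gene("chr5", 40_000, genes)
```

The rule is "not ideal" because enhancers can act over long distances and skip the nearest gene entirely.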

TF motif elicitation from ChIP-seq data

CTCF

~20,000 binding sites identified by ChIP:

From Furey (2012) Nat Rev Genet 13:840

MEME suite: http://meme.nbcr.net/meme/
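Motif discovery tools report a motif as per-position base frequencies. The sketch below shows only that final representation, built from sequences that are assumed to be already aligned and of equal length; it is not the MEME algorithm, and the toy sequences are made up for the example.

```python
def position_frequency_matrix(sites):
    """Count A/C/G/T at each position of equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pfm = [{base: 0 for base in "ACGT"} for _ in range(length)]
    for site in sites:
        # Bases other than A, C, G, T (for example N) raise KeyError here.
        for i, base in enumerate(site.upper()):
            pfm[i][base] += 1
    return pfm

def consensus(pfm):
    """Most frequent base per position; ties resolve in A, C, G, T order."""
    return "".join(max("ACGT", key=lambda b: col[b]) for col in pfm)

# Toy aligned sites (hypothetical sequences).
pfm = position_frequency_matrix(["CCCTC", "CCATC", "CCCTC", "GCCTC"])
motif = consensus(pfm)
```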

Enhancer-associated histone modification

Single TF binding events may not indicate regulatory function

• Many TFs are present at high concentrations in the nucleus

• TF motifs are abundant in the genome

• Single TF binding events may be incidental
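A back-of-the-envelope calculation shows why motifs are abundant. Assuming, as a simplification, that all four bases are equally likely and independent, an exact k-bp motif is expected to match by chance about once every 4^k positions per strand. The function name is hypothetical.

```python
def expected_chance_matches(motif_length, genome_size, both_strands=True):
    """Expected exact matches of a fixed motif in random, uniform sequence."""
    positions = genome_size - motif_length + 1
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** motif_length

# An 8 bp motif in a 3-billion-bp genome, scanning both strands.
expected = expected_chance_matches(8, 3_000_000_000)
```

Under these assumptions the result is roughly 90,000 chance matches, which is why a motif match alone says little about function.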

DNase I FAIRE

Mapping chromatin accessibility

From Furey (2012) Nat Rev Genet 13:840

DNase I hypersensitivity identifies TF binding events

From Furey (2012) Nat Rev Genet 13:840

Song et al., Genome Res 21:1757 (2011)

DNase I hypersensitivity identifies regulatory elements

DNase I hypersensitive sites

De novo TF motif discovery by DNase I hypersensitivity mapping

In human ES cells:

From Neph (2012) Nature 489:83

De novo TF motif discovery by DNase I hypersensitivity mapping

Across tissue types:

From Neph (2012) Nature 489:83

Summary

• Relevant overview papers on ChIP-seq and DNase-seq posted on class wiki

• Monday: Epigenetics and the histone code

• Wednesday: Regulatory architecture of the genome