Brixen ChIP-chip, 2008users.unimi.it/marray/2008/material/lectures/day2/Brixen, ChIP-chip... · Log...


• Introduction
  • Protein-DNA interaction
  • Chromatin immunoprecipitation & tiling arrays
• Models
  • Target abundance
  • Fluorescence signal
• Analysis methods
  • Overview
  • Comparisons
  • Normalization


• Proteins interact with DNA to
  • Carry out transcription of “activated” genes.
  • Carry out DNA replication.
  • Repair damaged DNA.
  • Mediate recombination in meiosis.
  • Modify or “remodel” the chromatin.
  • Enhance or suppress gene transcription.
  • Etc.
• Transcription factor proteins regulate gene expression, and recognize short, degenerate motifs in the DNA.


• ChIP-chip permits in vivo, genome-wide localization of transcription factor binding sites.
• Other applications:
  • Localization of transcriptional machinery.
  • Histone modifying or chromatin remodeling proteins, or the modified (e.g., methylated) forms themselves.
  • Origin recognition complexes.


• In vitro?
  • Oligo-selection or gel-shift assays are often poor predictors of in vivo binding.
• With expression arrays?
  • Change in expression may be through intermediaries.
  • If required co-factors aren’t present, genes which are direct targets may not exhibit differential expression.
• In silico?
  • Consensus sites appear far too often.
  • Motifs are degenerate.


• Cross-link all proteins to genomic DNA in vivo.
• Extract chromatin and fragment by sonication.
• ChIP: preferentially filter TF-associated fragments.
• Purify DNA and amplify.
• Prepare control DNA by
  • Omitting the immunoprecipitation step, or
  • Using a non-specific antibody for IP.
• Hybridize treatment and control DNA to separate tiling microarrays.
• Wash, stain, scan.


• Affymetrix D. melanogaster tiling arrays
  • 6 million 25-mer oligo probes (PM/MM pairs).
  • Median distance between probe starts: 36 bp.
  • Probes targeting repetitive sequence, or with expected hybridization or synthesis problems, are omitted.


• BAC spike-in
  • Genomic input control arrays (2x)
  • Artificially enriched treatment arrays (2x): regions from chr2 and chr3 (approximately 150 kb in length) added at known relative concentrations.
• Anti-Pol II
  • Genomic input control arrays (2x)
  • Mock-IP control arrays (IgG, 2x)
  • ChIP arrays (2x)
• Anti-Zeste
  • Genomic input control arrays (3x)
  • Mock-IP control arrays (IgG, 2x biological, 3x technical)
  • ChIP arrays (2x biological, 3x technical)


All fragments with no TF binding site pass with low probability.


• Under the model, the expected fragment length after sonication is $1/\lambda$.
• Unobservable target abundance ($A_i$) will form peaks around binding sites.
• For an average fragment length of 500 bases, the correlation

  $$\mathrm{Corr}(A_i, A_j) \approx (1 - \lambda)^{d(i,j)}$$

  is appreciable over a large number of probes in the tiling.
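To get a feel for how slowly this correlation decays, the sketch below evaluates $(1 - \lambda)^{d}$ at a few probe lags. It assumes that the correlation has exactly this form, that $\lambda$ is the reciprocal of the 500-base average fragment length, and that $d$ is measured in bases. The function name and the use of the 36 bp median probe spacing as a fixed spacing are illustrative choices, not part of the original slides.

```python
def abundance_correlation(distance_bp, mean_fragment_length=500.0):
    """Model correlation between target abundances at two positions
    separated by distance_bp, taken here as (1 - lam) ** distance_bp."""
    lam = 1.0 / mean_fragment_length  # expected fragment length is 1 / lam
    return (1.0 - lam) ** distance_bp


# Assumption: adjacent probes are a constant 36 bp apart.
probe_spacing_bp = 36

for lag in (1, 5, 10, 15):
    d = lag * probe_spacing_bp
    print(f"lag {lag:2d} ({d:3d} bp): correlation ~ {abundance_correlation(d):.3f}")
```

Under these assumptions the correlation between neighbouring probes stays high and falls off gradually with distance.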


• Target abundance ($A_{ij}$)
  • The unobservable number of DNA fragments in sample j which contain sequence complementary to the probes in feature i.
• Fluorescence intensity ($I_{ij}$)
  • The observable, scanned intensity reading for feature i, sample j.
• Abundance and intensity are related, but not in a simple way…


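• For probe i of sample j, assume that

  $$I_{ij} = \alpha_i A_{ij} \varepsilon_{ij},$$

  where $A_{ij}$ is the target abundance, $\alpha_i$ the probe affinity, and $\varepsilon_{ij} > 0$ a multiplicative error.
• When control data are available, we can eliminate the probe affinity effects with a ratio of intensities:

  $$LR_i = \log A_i^T - \log A_i^C + \eta_i,$$

  where T and C denote the treatment and control samples and $\eta_i$ is the error on the log scale.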
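The cancellation of probe affinity can be checked numerically. The sketch below simulates intensities as affinity times abundance times a log-normal multiplicative error, then forms the log ratio of treatment to control. The function names, the log-normal error, and the specific abundances are assumptions made for illustration only.

```python
import math
import random


def simulate_probe(affinity, a_treat, a_ctrl, rng, sigma=0.1):
    """Return (treatment, control) intensities for one probe under
    intensity = affinity * abundance * multiplicative error."""
    err_t = math.exp(rng.gauss(0.0, sigma))  # multiplicative error, always > 0
    err_c = math.exp(rng.gauss(0.0, sigma))
    return affinity * a_treat * err_t, affinity * a_ctrl * err_c


def log_ratio(i_treat, i_ctrl):
    """Log ratio of treatment to control intensity."""
    return math.log(i_treat) - math.log(i_ctrl)


rng = random.Random(0)
# The same fourfold enrichment is measured by probes of very different affinity.
for affinity in (0.1, 1.0, 10.0):
    i_t, i_c = simulate_probe(affinity, a_treat=400.0, a_ctrl=100.0, rng=rng)
    print(f"affinity {affinity:5.1f}: log ratio = {log_ratio(i_t, i_c):.3f}")
```

Whatever the affinity, the log ratio scatters around log 4; only the error terms remain.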


• Assume additive background is removed during pre-processing.
• The expected log-ratio signal also exhibits peaks.
• Peak amplitude and width depend on the efficiency ratio: [expression not recoverable].
• Note: [symbol not recoverable] is binding-site specific.


• D. melanogaster chromosome 2L
• Log ratios (unsmoothed) from 3 vs. 3 comparisons, two different IP/PCR/hybridization groups.


• Under the model, we expect some spatial correlation in the log-ratios, even in null regions…


• For simplicity, ignore irregularity of probe spacing.
• Compute auto-correlation at various lags.
• For both data sets, there is statistically significant auto-correlation up to a lag of approximately 15 positions.
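A minimal sketch of the lag-wise auto-correlation computation follows. It treats the log-ratios as an evenly spaced series, as the slide does. The significance bound is the common white-noise approximation of 1.96 divided by the square root of n, which is an assumption here and not necessarily the test used for the slides. The simulated series is purely illustrative.

```python
import math
import random


def autocorrelation(x, lag):
    """Sample auto-correlation of the sequence x at the given lag."""
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    cross = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cross / total


def white_noise_bound(n):
    """Approximate 95% bound on the auto-correlation of white noise."""
    return 1.96 / math.sqrt(n)


rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# A 5-point moving average induces correlation between neighbouring values.
smooth = [sum(noise[t:t + 5]) / 5.0 for t in range(len(noise) - 4)]

bound = white_noise_bound(len(smooth))
for lag in (1, 2, 5, 10):
    r = autocorrelation(smooth, lag)
    print(f"lag {lag:2d}: r = {r:+.3f} (white-noise bound {bound:.3f})")
```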


A better model:

$$I_{ij} = \alpha_i A_{ij} \varepsilon_{ij} + B_{ij},$$

where $B_{ij}$ is the additive background.

Consider…
• A constant, non-zero background ($B = 1$).
• Fixed enrichment ratio.
• No noise ($\varepsilon = 1$).
• Varying probe response ($\alpha$).
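The thought experiment above is easy to reproduce. The sketch below assumes a noise-free intensity of probe response times abundance plus a constant background, with a fixed fourfold enrichment, and reports the observed log ratio as the probe response varies. The base-2 logarithm, the abundances, and the function name are illustrative assumptions.

```python
import math


def observed_log2_ratio(response, a_treat, a_ctrl, background=1.0):
    """Log2 ratio of treatment to control intensity when
    intensity = response * abundance + background (no noise)."""
    i_treat = response * a_treat + background
    i_ctrl = response * a_ctrl + background
    return math.log2(i_treat / i_ctrl)


true_log2_ratio = math.log2(4.0 / 1.0)  # fixed fourfold enrichment

for response in (0.01, 0.1, 1.0, 10.0, 100.0):
    lr = observed_log2_ratio(response, a_treat=4.0, a_ctrl=1.0)
    print(f"response {response:6.2f}: observed {lr:.3f} (true {true_log2_ratio:.3f})")
```

With a non-zero background the observed log ratio is compressed toward zero, and the compression is strongest for weakly responding probes, so taking ratios no longer removes the probe effect.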


• Varying probe affinity and additive background are important issues.
• The model predicts peak-like signal response near binding sites.
• The model predicts spatial correlation in both target abundance and log-ratio.
  Target sequence for neighboring probes tends to end up on the same fragment. IP and amplification take place at the fragment level.


• Actual binding site signal spans multiple positions.
• Single probes are…
  • Prone to gross error.
  • Frequently either lazy or promiscuous hybridizers.
• Statistical approaches:
  • Two-state hidden Markov models.
    Li, Meyer and Liu, Bioinformatics, 2005; TileMap, Ji and Wong, Bioinformatics, 2005.
  • Smoothed or windowed probe-level statistics.
    Cawley et al., Cell, 2004; Keles et al., 2004; MAT, Johnson et al., PNAS, 2006; Buck, Nobel and Lieb, Genome Biology, 2005; Toedling et al., BMC Bioinformatics, 2008.
  • Ad hoc post-processing of probe-level calls.
  • Peak fitting.
    Kim et al., Nature, 2005; Keles, Biometrics, 2007; Zheng et al., Biometrics, 2007.



• Quantile normalize all slides together.
• Compute a difference of average log-intensities (equivalent to a logged ratio of the geometric mean intensity).
• Smooth by mean or trimmed mean over a moving window (typically 675 to 1000 bp).
• Compute a non-parametric p-value for the smoothed, window-level scores.
• Adjust for multiple testing to control FDR, using the Storey q-value method.
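The probe-level score and the smoothing step can be sketched as below, assuming the intensities have already been normalized. The function names, the base-2 logarithm, the 10% trimming fraction, and the 350 bp half-width (a window of roughly 700 bp) are illustrative assumptions. The quadratic-time window search is written for clarity, not for genome-scale data.

```python
import math


def probe_scores(treatment, control):
    """Difference of average log2 intensities per probe.

    treatment and control are lists of replicate arrays; each array is a
    list of positive intensities with one entry per probe."""
    n_probes = len(treatment[0])
    scores = []
    for i in range(n_probes):
        t = sum(math.log2(arr[i]) for arr in treatment) / len(treatment)
        c = sum(math.log2(arr[i]) for arr in control) / len(control)
        scores.append(t - c)
    return scores


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest trim fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return sum(kept) / len(kept)


def window_scores(positions, scores, half_width=350, trim=0.1):
    """Trimmed mean of the scores of all probes within half_width bp
    of each probe position."""
    smoothed = []
    for center in positions:
        window = [s for p, s in zip(positions, scores)
                  if abs(p - center) <= half_width]
        smoothed.append(trimmed_mean(window, trim))
    return smoothed


positions = [0, 36, 72, 108, 2000]
raw = probe_scores([[8.0, 16.0, 8.0, 4.0, 2.0]], [[2.0, 2.0, 2.0, 2.0, 2.0]])
print(window_scores(positions, raw, half_width=350, trim=0.0))
```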

Applications in Drosophila melanogaster:
• Polycomb targets. YB Schwartz et al., Nature Genetics, 2006.
• Myb-MuvB/dREAM complex. D Georlette et al., Genes Dev., 2007.
• Maternal and gap factors. X Li and S MacArthur, et al., PLoS Biology, 2008.
• Dosage compensation complex. J Kind and JM Vaquerizas, et al., Cell, 2008.


• Assuming…
  • Enrichment occurs at only a small fraction of genomic positions.
  • At positions of no enrichment, the log-ratios are symmetrically distributed.
• Note! This last assumption can fail badly if data are not properly normalized.
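One way to turn these two assumptions into a non-parametric p-value is to treat every score at or below the null center as null and to mirror that lower half about the center. The sketch below does this with a center of zero. It is an illustration under those assumptions, not necessarily the exact procedure behind the slides, and the function name is hypothetical.

```python
from bisect import bisect_left, bisect_right


def symmetric_null_pvalues(scores, center=0.0):
    """Upper-tail p-values under a null that is symmetric about center.

    Scores at or below center are taken to be null; by symmetry the null
    is assumed to contribute equally many scores above the center."""
    ordered = sorted(scores)
    n_below = bisect_right(ordered, center)  # number of scores <= center
    n_null = 2 * n_below                     # estimated number of null scores
    if n_null == 0:
        raise ValueError("no scores at or below the center")
    pvalues = []
    for s in scores:
        if s > center:
            # P0(X >= s) = P0(X <= 2 * center - s) by symmetry.
            mirrored = 2.0 * center - s
            p = bisect_right(ordered, mirrored) / n_null
        else:
            # Below the center every score is assumed null: P0(X >= s).
            p = 1.0 - bisect_left(ordered, s) / n_null
        pvalues.append(min(1.0, p))
    return pvalues


example = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 10.0]
print(symmetric_null_pvalues(example))
```

A score far beyond the mirrored null range receives a p-value of zero in this sketch, so in practice some lower bound would be needed before any multiple-testing adjustment.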


• Gibbons et al., Genome Biology, 2005; and Toedling et al., BMC Bioinformatics, 2008 are similar. Also see Efron, JASA, 2004.


ENCODE data: Pol2, 00hr, B1 vs. B1,4,5 pooled.


• Sequence-based variability in probe response
  • When presented with the same target concentrations, probes respond very differently. How do we deal with this?
• Additive background
  • Is it appreciable? Do we gain by correcting for it?
• Estimation of probe-level variance
  • Variability across replicates is different for different probes. Is it possible/beneficial to address this?


• All methods…
  • compute test statistics, then
  • select a threshold for making positive enrichment calls.
• To avoid confounding the two issues, we focus on test statistics only and use ROC (or pseudo-ROC) performance metrics: all possible thresholds are considered simultaneously.
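Considering all thresholds at once amounts to sweeping a cutoff down through the sorted test statistics and recording the true and false positive rates at each step. The sketch below does this and integrates the curve by the trapezoid rule. The function names are illustrative, and the labels may be true positives and negatives or the pseudo-positive and pseudo-negative sets used when no gold standard exists.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs obtained by
    sweeping a threshold from the highest score to the lowest.

    labels[i] is True for a (pseudo-)positive, False for a (pseudo-)negative."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


pts = roc_points([2.1, 1.7, 0.3, -0.2, 0.9], [True, True, False, False, False])
print(pts, area_under_curve(pts))
```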


Method                    Background correction
TiMAT                 *   None
MAT                   *   None
TileMap               *   None
Li ’05 HMM                Mismatch subtraction
Keleş ’06             *   None
TAS (Affymetrix)      *   Mismatch subtraction, adjust non-positive values to 1
HGMM                      None
Chipper/vsn           *   Affine adjustment, plus variance-stabilizing transform
GC-RMA “affinities”   *   Sequence-based background estimation
GC-RMA “full model”   *   Smooth between MM subtraction and sequence-based estimation


$H_0$ = {regions with no enrichment}
$H_1$ = {regions with enrichment}


$S_0$ = {regions less likely to have enrichment}
$S_1$ = {regions more likely to have enrichment}


• Pseudo-positives: 125 bp intervals upstream from annotated transcription start sites (approximately 14K).
• Pseudo-negatives: 125 bp intergenic intervals, matched to pseudo-positives for (i) GC content and (ii) probe density (approximately 13K).


Method                    Probe response correction
TiMAT                 *   Log ratio
MAT                   *   Sequence-based estimate, followed by log ratio
TileMap                   Log ratio
Li ’05 HMM                Empirically estimated from putative null experiments
Keleş ’06                 Log ratio
TAS (Affymetrix)          None: probes are treated as interchangeable
HGMM                      Hierarchical model with probe-specific distributions
Chipper/vsn               NA
GC-RMA “affinities”       NA
GC-RMA “full model”       NA


Method                    Variance estimation
TiMAT                 *   Single global estimate
MAT                       Binned estimates, for probes with similar affinities
TileMap               *   Empirical Bayes smoothed estimate
Li ’05 HMM                Empirically estimated from putative null experiments
Keleş ’06             *   Probe-specific (standard two-sample t statistic)
TAS (Affymetrix)          NA
HGMM                      Probe-specific (CV assumed constant)
Chipper/vsn               NA
GC-RMA “affinities”       NA
GC-RMA “full model”       NA


Genome Research 18:393-403 (2008)


• 100 cloned human fragments, average size approximately 500 bp, from ENCODE regions.
• Enrichment (relative to genomic DNA) from 1.25 to approximately 200 fold.
• Direct hybridization and diluted mixtures (approximately 25:1).
• Three different amplification protocols:
  • Ligation-mediated PCR
  • Random-priming PCR
  • Whole-genome amplification
• NimbleGen (50-mer), Affymetrix (25-mer), and Agilent (44-mer to 60-mer isothermal) arrays.
• 13 analysis algorithms.


• Best results on the three platforms, for unamplified DNA, were comparable.
• “Variance between experiments within the same platform is similar to, if not greater than, the variance observed between the different platforms.”
• “The NimbleGen platform (4 replicates) is the most sensitive at lower levels of enrichment (< 3 fold), followed closely by Agilent (2 replicates).”*
• “The WGA method was used only on NimbleGen, but produced results with very little reduction in AUC.”
• GC content was not correlated with errors, but simple tandem repeats and segmental duplications (not caught by RepeatMasker) were responsible for a large fraction of false positives and/or false negatives.


• Common ChIP-chip normalization scheme:
  • Quantile within treatment condition.
  • Median scaling between treatment and control.
• Here, distributional differences are too strong for median scaling.

ENCODE Pol2: B1 (2x), B2 (2x), B3 (2x).
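The two steps of the common scheme can be sketched as follows, assuming the data are already on the log scale so that median scaling becomes a constant shift. The function names and the choice to move the treatment arrays onto the control median are illustrative assumptions.

```python
import statistics


def quantile_normalize(arrays):
    """Give every array the same empirical distribution: the value of rank r
    in each array is replaced by the mean of the rank-r values across arrays."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(sa[r] for sa in sorted_arrays) / len(arrays)
                 for r in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def median_scale(treatment, control):
    """Shift log-scale treatment arrays so that their pooled median equals
    the pooled median of the control arrays."""
    t_med = statistics.median(v for a in treatment for v in a)
    c_med = statistics.median(v for a in control for v in a)
    shift = c_med - t_med
    return [[v + shift for v in a] for a in treatment]


treatment = quantile_normalize([[8.0, 12.0, 9.0], [9.5, 13.0, 8.5]])
control = quantile_normalize([[7.0, 7.5, 8.0], [7.2, 7.4, 8.2]])
print(median_scale(treatment, control))
```

A single shift aligns only the medians. When the treatment and control distributions differ in shape, as on this slide, the null log-ratios need not end up centered or symmetric after this step.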


• Derived statistics may have unexpected properties:


[Figure: H3K4me3, brain, array 1. Two panels: a plot of M against A, and a histogram of M (y-axis: Frequency).]


• Background
  • Additive background is clearly present at the probe level.
  • Correction using MM probes degraded detection performance. Other approaches had little detectable effect.
• Probe response
  • Correcting for probe response by taking ratios is effective.
  • Sequence-based corrections alone are insufficient, and don’t seem to add anything to ratio-based corrections.
• Variance estimation
  • Small n (e.g., 2 vs. 2): standard t-statistics perform worse.
  • Small or moderate n: moderated t-statistics don’t hurt or help.
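To make the distinction between standard and moderated t-statistics concrete, the sketch below computes a pooled two-sample t-statistic and, optionally, shrinks the probe-level variance toward a prior value. In empirical Bayes methods the prior variance and prior degrees of freedom are estimated from all probes; here they are hypothetical inputs supplied by the caller, and the function names are illustrative.

```python
import math


def pooled_variance(x, y):
    """Pooled sample variance of two groups and its degrees of freedom."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    ss = sum((v - mean_x) ** 2 for v in x) + sum((v - mean_y) ** 2 for v in y)
    df = len(x) + len(y) - 2
    return ss / df, df


def two_sample_t(x, y, prior_var=None, prior_df=0.0):
    """Standard pooled t-statistic, or a moderated version when prior_var
    is given: the variance becomes a weighted average of prior and sample
    variance, weighted by their degrees of freedom."""
    s2, df = pooled_variance(x, y)
    if prior_var is not None:
        s2 = (prior_df * prior_var + df * s2) / (prior_df + df)
    # With prior_var=None and identical replicates, s2 is 0 and this
    # division fails: one symptom of unstable variances at small n.
    se = math.sqrt(s2 * (1.0 / len(x) + 1.0 / len(y)))
    return (sum(x) / len(x) - sum(y) / len(y)) / se


chip = [1.0, 3.0]     # hypothetical 2 vs. 2 log-intensities for one probe
control = [0.0, 2.0]
print(two_sample_t(chip, control))
print(two_sample_t(chip, control, prior_var=0.5, prior_df=2.0))
```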


• U.C. Berkeley
  Terry Speed
• LBNL
  Mike Eisen, Mark Biggin, Xiaoyong Li, Stewart MacArthur
• Affymetrix
  Simon Cawley, Tom Gingeras, Antonio Piccolboni, Stefan Bekiranov, Srinka Ghosh, David Nix
• EBI
  Wolfgang Huber