Tiling Array and ChIP-chip
![Page 1: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/1.jpg)
Tiling Array and ChIP-chip
![Page 2: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/2.jpg)
Gene Regulation
[Figure: expression vs. no expression, regulated spatially and temporally; schematic elements labeled X, Y, Z and A, B, C.]
![Page 3: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/3.jpg)
Transcription Factors and Their Binding Sites
Transcription factors (TF): TF1, TF2
Transcription factor binding sites (TFBS): CCACCCAC, TAATAAAAT
TTATGTAACCTGCACTTACTACCACCCACAACATAATAAAATCTAAACCACTGAATGAAATACAAAATCTATGTATGA...
[Figure: TF1 and TF2 bound at their binding sites on the sequence.]
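As a small illustration of what a binding site means in practice, the sketch below scans the sequence shown on the slide for exact occurrences of the two listed sites. The function name `find_sites` is hypothetical, and exact string matching is a simplification: real binding sites tolerate mismatches, which is what the motif matrix on the next slide captures.

```python
def find_sites(sequence, site):
    """Return 0-based start positions of every exact (possibly overlapping) match."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


# Sequence fragment and sites copied from the slide (the slide truncates the sequence).
SEQ = "TTATGTAACCTGCACTTACTACCACCCACAACATAATAAAATCTAAACCACTGAATGAAATACAAAATCTATGTATGA"
SITES = ["CCACCCAC", "TAATAAAAT"]

for site in SITES:
    print(site, find_sites(SEQ, site))
```

Each reported position is where the site starts in the fragment, counting from zero.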
![Page 4: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/4.jpg)
Transcription factor binding motif

Sequences bound by the TF:

GTATGTACTTACTATGGGTGGTCAACAAATCTATGTATGA
TAACATGTGACTCCTATAACCTCTTTGGGTGGTACATGAA
CTGGGAGGTCCTCGGTTCAGAGTCACAGAGCAGATAATCA
TTAGAGGCACAATTGCTTGGGTGGTGCACAAAAAAACAAG
AACAGCCTTGGATTAGCTGCTGGGGGGGTGAGTGGTCCAC
ATCAGAATGGGTGGTCCATATATCCCAAAGAAGAGGGTAG

Transcription Factor Binding Sites (TFBS), aligned (positions 1 to 9):

TGGGTGGTC
TGGGTGGTA
TGGGAGGTC
TGGGTGGTG
TGAGTGGTC
TGGGTGGTC

Count matrix:

| Base | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| A | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| G | 0 | 6 | 5 | 6 | 0 | 6 | 6 | 0 | 1 |
| T | 6 | 0 | 0 | 0 | 5 | 0 | 0 | 6 | 0 |

Motif (frequency matrix):

| Base | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| A | 0.00 | 0.00 | 0.17 | 0.00 | 0.17 | 0.00 | 0.00 | 0.00 | 0.17 |
| C | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.66 |
| G | 0.00 | 1.00 | 0.83 | 1.00 | 0.00 | 1.00 | 1.00 | 0.00 | 0.17 |
| T | 1.00 | 0.00 | 0.00 | 0.00 | 0.83 | 0.00 | 0.00 | 1.00 | 0.00 |
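The step from aligned sites to a motif matrix can be sketched in a few lines. This is a minimal illustration, assuming the six 9-mers shown on the slide are the aligned sites; the function names are hypothetical. Counts are divided by the number of sites to give frequencies, so the last digit can differ from the slide's rounding (4/6 prints as 0.67).

```python
# The six aligned binding sites from the slide.
SITES = [
    "TGGGTGGTC",
    "TGGGTGGTA",
    "TGGGAGGTC",
    "TGGGTGGTG",
    "TGAGTGGTC",
    "TGGGTGGTC",
]


def count_matrix(sites):
    """Count each base at each aligned position."""
    length = len(sites[0])
    counts = {base: [0] * length for base in "ACGT"}
    for site in sites:
        for pos, base in enumerate(site):
            counts[base][pos] += 1
    return counts


def frequency_matrix(counts, n_sites):
    """Convert counts to per-position frequencies."""
    return {base: [c / n_sites for c in row] for base, row in counts.items()}


counts = count_matrix(SITES)
freqs = frequency_matrix(counts, len(SITES))
for base in "ACGT":
    print(base, " ".join(f"{f:.2f}" for f in freqs[base]))
```

A common refinement, not shown on the slide, is to add pseudocounts before dividing so that no frequency is exactly zero.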
![Page 5: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/5.jpg)
Finding motifs from co-regulated genes
(Roth et al., 1998; Hughes et al., 2000; etc.)
[Figure: expression of Gene 1, Gene 2, Gene 3, …, Gene N under Condition 1 and Condition 2; Gene 1, Gene 2 and Gene 3 are co-regulated.]
Sequences of the co-regulated genes (Gene 1, Gene 2, Gene 3):
GTATGTACTTACTATGGGTGGTCAACAAATCTATGTATGA
CTGGGAGGTCCTCGGTTCAGAGTCACAGAGCAGATAATCA
TAACATGTGACTCCTATAACCTCTTTGGGTGGTACATGAA
![Page 6: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/6.jpg)
Motif discovery is difficult in mammalian genomes due to a low signal-to-noise ratio
[Figure: sequence to search per gene (Gene 1, Gene 2, Gene 3): 100~1000 bp in yeast vs. 10k~1000k bp in human.]
![Page 7: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/7.jpg)
ChIP-chip
![Page 8: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/8.jpg)
Genome Tiling Arrays
• Affymetrix genome tiling microarrays
– Tile the genome non-repeat regions
– Chr21/22 tiling (earlier version): 1 million probe pairs (PM & MM) at 35 bp resolution on 3 arrays
– Whole genome: 42 million PM probes on 7 arrays
PM CGACATTGATTCAAGACTACATACA
MM CGACATTGATTCTAGACTACATACA
[Figure: probes tiled along the chromosome.]
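The PM/MM pair on the slide differs only at the middle (13th) base of the 25-mer, where A is replaced by its complement T. A minimal sketch of deriving a mismatch probe from a perfect-match probe this way follows; the function name `mismatch_probe` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm):
    """Return the MM probe: the PM probe with its middle base complemented."""
    mid = len(pm) // 2  # index 12 for a 25-mer, i.e. the 13th base
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


pm = "CGACATTGATTCAAGACTACATACA"  # PM probe from the slide
print(mismatch_probe(pm))
```

Applied to the slide's PM probe, this reproduces the slide's MM probe.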
By Xiaole Shirley Liu at Harvard
![Page 9: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/9.jpg)
Genome Tiling Arrays
Human genome:

| Platform | # Arrays | # Probes / Array | # Total Probes | Probe Length | Probe Resolution | Price |
|---|---|---|---|---|---|---|
| Affymetrix | 7 | 6M | 42.0M | 25mer | 35 bp | $2,000 |
| Nimblegen | 38 | 390K | 14.8M | 50mer | 110 bp | $30,000 |
| Agilent | 21 | 244K | 5.1M | 60mer | 300 bp in genes; 500 bp in intergenic | $11,000 |
By Xiaole Shirley Liu at Harvard
![Page 10: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/10.jpg)
ChIP-chip Array Hybridization
• Map high intensity probes back to the genome
• Locate TF binding location
[Figure: ChIP-DNA signal and noise across probes along the chromosome.]
By Xiaole Shirley Liu at Harvard
![Page 11: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/11.jpg)
Identify ChIP-enriched Region
• Controls: sonicated genomic Input DNA
• Often 3 ChIP, 3 Ctrl replicates are needed
[Figure: ChIP vs. Ctrl probe signals.]
By Xiaole Shirley Liu at Harvard
![Page 12: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/12.jpg)
Other Applications
• Transcription factor binding (ChIP-chip)
• Chromatin modifications
• DNA methylation
• Transcriptome
• Nucleosome positioning
• Copy number variations
![Page 13: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/13.jpg)
Back to ChIP-chip
![Page 14: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/14.jpg)
Data Analysis
Preprocessing & Normalization
Peak Detection
Downstream Analyses
![Page 15: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/15.jpg)
Raw data
[Figure panels: ChIP, Control.]
![Page 16: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/16.jpg)
Mann-Whitney U-test for ChIP-region Detection
• Affy TAS, Cawley et al (Cell 2004):
– Each probe: rank probes (either PM-MM or PM) within [-500bp, +500bp] window
– Check whether sum of ChIP ranks is much smaller
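The rank-sum idea in the bullets above can be sketched as follows. This is not the TAS implementation: it assumes one ChIP and one control value per probe, pools both sets inside the window, and assigns rank 1 to the highest value, so that a small ChIP rank sum indicates enrichment. The function name and arguments are hypothetical.

```python
def window_rank_sum(positions, chip, ctrl, center, half_width=500):
    """Sum of ChIP ranks among pooled ChIP + control values in the window.

    Rank 1 is the largest value; tied values share their average rank.
    Returns (chip_rank_sum, expected_sum_under_no_enrichment).
    """
    idx = [i for i, p in enumerate(positions) if abs(p - center) <= half_width]
    pooled = [(chip[i], True) for i in idx] + [(ctrl[i], False) for i in idx]
    pooled.sort(key=lambda pair: -pair[0])

    ranks = [0.0] * len(pooled)
    j = 0
    while j < len(pooled):
        k = j
        while k + 1 < len(pooled) and pooled[k + 1][0] == pooled[j][0]:
            k += 1
        average_rank = (j + k) / 2 + 1  # average of 1-based ranks j+1 .. k+1
        for m in range(j, k + 1):
            ranks[m] = average_rank
        j = k + 1

    chip_sum = sum(r for r, (_, is_chip) in zip(ranks, pooled) if is_chip)
    n = len(idx)
    expected = n * (2 * n + 1) / 2  # n ChIP values among 2n, mean rank (2n+1)/2
    return chip_sum, expected


positions = [0, 35, 70, 105]
chip = [10.0, 9.0, 8.0, 7.0]
ctrl = [1.0, 2.0, 3.0, 4.0]
print(window_rank_sum(positions, chip, ctrl, center=50))
```

A ChIP rank sum far below the expected value suggests the window is enriched; turning that gap into a p-value requires the Mann-Whitney null distribution, which is omitted here.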
By Xiaole Shirley Liu at Harvard
![Page 17: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/17.jpg)
TileMap (Ji and Wong, Bioinformatics 2005)
STEP 1: Compute a test statistic for each probe to summarize probe level information
STEP 2: Combine probe level test statistics of neighboring probes to help infer binding regions
![Page 18: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/18.jpg)
Probe level test statistic: empirical Bayes approach
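Probes: $1, 2, 3, \ldots, I$

Sample Variance (df): $s_1^2, s_2^2, s_3^2, \ldots, s_I^2$

Mean: $\bar{s}^2$

Sum of Squares:

$$S = \sum_i \left(s_i^2 - \bar{s}^2\right)^2$$

Shrinkage Factor:

$$\hat{B} = \frac{2}{df+2}\cdot\frac{I-1}{I} + \frac{2}{df+2}\cdot\frac{(\bar{s}^2)^2\,(I-1)}{S}$$

Variance Shrinkage Estimator:

$$\hat{\sigma}_i^2 = (1-\hat{B})\,s_i^2 + \hat{B}\,\bar{s}^2$$

Variance Estimates: $\hat{\sigma}_1^2, \hat{\sigma}_2^2, \hat{\sigma}_3^2, \ldots, \hat{\sigma}_I^2$

A modified t-statistic:

$$\tilde{t}_i = \frac{\bar{x}_{i1} - \bar{x}_{i2}}{\hat{\sigma}_i\sqrt{\frac{1}{K_1} + \frac{1}{K_2}}}$$

Probe level test statistics: $\tilde{t}_1, \tilde{t}_2, \tilde{t}_3, \ldots, \tilde{t}_I$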
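A minimal sketch of the variance shrinkage idea on this slide, under one reading of its formulas: each probe's sample variance is pulled toward the mean variance across probes by a shrinkage factor B, and the shrunken variance is used in the t-statistic. The function names are hypothetical, and capping B at 1 (and using B = 1 when all variances are equal) is an assumption added here for numerical safety.

```python
import math


def shrink_variances(s2, df):
    """Return (B, shrunken variances) for sample variances s2, each with df degrees of freedom."""
    n_probes = len(s2)
    mean_s2 = sum(s2) / n_probes
    sum_sq = sum((v - mean_s2) ** 2 for v in s2)
    if sum_sq == 0:
        b = 1.0  # assumption: identical variances, shrink fully to the mean
    else:
        b = (2 / (df + 2)) * (n_probes - 1) / n_probes \
            + (2 / (df + 2)) * mean_s2 ** 2 * (n_probes - 1) / sum_sq
        b = min(1.0, b)  # assumption: keep the factor in [0, 1]
    return b, [(1 - b) * v + b * mean_s2 for v in s2]


def modified_t(mean1, mean2, sigma2, k1, k2):
    """t-statistic using a (shrunken) variance estimate; k1, k2 are group sizes."""
    return (mean1 - mean2) / math.sqrt(sigma2 * (1 / k1 + 1 / k2))


b, shrunk = shrink_variances([0.0, 4.0], df=2)
print(b, shrunk)
print(modified_t(3.0, 1.0, shrunk[0], k1=2, k2=2))
```

With only a few replicates per probe, an individual sample variance can be close to zero by chance and inflate the ordinary t-statistic; borrowing information across probes dampens this.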
![Page 19: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/19.jpg)
Combining neighboring probes
TileMap (MA)
1. Compute the probe level test statistic t for each probe;
2. Compute a moving average statistic to measure enrichment;
3. Estimate FDR.
TileMap (HMM)
1. Compute the probe level test statistic t for each probe;
2. Estimate the distribution of t under H0 and H1;
3. Model t by a Hidden Markov Model, and decode the HMM.
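The moving average step of TileMap (MA) can be sketched as below. This is an illustration only: it averages the probe level statistics over a window of up to 2w+1 neighboring probes, where the half-window size `w` and the shortened windows at the array edges are choices made here, not taken from the slide. FDR estimation and the HMM variant are not shown.

```python
def moving_average(t, w=5):
    """Average each probe's statistic with up to w neighbors on each side."""
    smoothed = []
    for i in range(len(t)):
        lo = max(0, i - w)
        hi = min(len(t), i + w + 1)
        smoothed.append(sum(t[lo:hi]) / (hi - lo))
    return smoothed


print(moving_average([0.0, 0.0, 3.0, 0.0, 0.0], w=1))
```

A single high probe is damped by its neighbors, whereas a run of consecutive high probes keeps a high average, which is the behavior expected of a true binding region.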
![Page 20: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/20.jpg)
Shrinking variance increases statistical power
[Figure panels: Moving Average, t-statistic with variance shrinking; Moving Average, canonical t-statistic; Mean(X1)-Mean(X2).]
![Page 21: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/21.jpg)
Peak 2 (180bp) transgenics
[Figure: neural tube expression in transgenics.]
![Page 22: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/22.jpg)
Comparisons between TileMap and previous methods
cMyc ChIP-chip Data: 6 IP + 6 CT1 + 6 CT2
Gold Standard: Using GTRANS and Keles’ method to analyze all 18 arrays
Test data: 4 arrays, 2 IP vs 2 CT1 (s2r2)
TileMap-HMM (Ji & Wong, 2005)
GTRANS or TAS (Kampa et al., 2004)
1. Set a window;
2. Perform a Wilcoxon signed rank test for each window.
Keles et al. (2004)
1. Compute a t-statistic t for each probe (no shrinking, two sample only);
2. Rank probes by a moving average.
![Page 23: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/23.jpg)
Shrinking variance saves money
Using non-shrinking method (Keles’ method) to analyze all probes
Using shrinking method to analyze half of the probes, i.e., reduce information by half
![Page 24: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/24.jpg)
MAT (Johnson W.E. et al. PNAS, 2006)
• Model-based Analysis of Tiling arrays for ChIP-chip
• Goal:
– Find ChIP-regions without replicates
– Find ChIP-regions without controls
– Find ChIP-regions without MM probes
– Can analyze data array by array
By Xiaole Shirley Liu at Harvard
![Page 25: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/25.jpg)
MAT
• Estimate probe behavior by checking other probes with similar sequence on the same array
• Probe sequence plays a big role in signal value
• Most of the probes in ChIP-chip measure non-specific hybridization
By Xiaole Shirley Liu at Harvard
![Page 26: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/26.jpg)
Probe Behavior Model
Terms of the model:
• Baseline on number of Ts
• A,C,G at each position of the 25mer
• A,C,G,T Count Square
• 25mer Copy Number along the Genome
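Written out as an equation, the four labeled terms suggest a linear model of roughly the following form. The symbols are notation assumed here for illustration and do not appear on the slide: $n_{ik}$ is the count of nucleotide $k$ in probe $i$, $I_{ijk}$ indicates that position $j$ of probe $i$ is nucleotide $k$, $c_i$ is the number of times the 25-mer occurs in the genome, and $\alpha$, $\beta_{jk}$, $\gamma_k$, $\delta$ are coefficients fitted on each array.

$$\log(PM_i) = \alpha\, n_{iT} + \sum_{j=1}^{25}\sum_{k\in\{A,C,G\}} \beta_{jk}\, I_{ijk} + \sum_{k\in\{A,C,G,T\}} \gamma_k\, n_{ik}^2 + \delta \log(c_i) + \varepsilon_i$$

Because most probes measure non-specific hybridization, a model of this kind can be fitted on all probes of a single array to predict each probe's baseline intensity.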
By Xiaole Shirley Liu at Harvard
![Page 27: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/27.jpg)
Probe Standardization
• Fit the probe model array by array
• Divide array probes to bins (3k probes/bin)
• Background-subtraction and standardization (normalization) on a single array:

$$t_i = \frac{\log(PM_i) - \hat{m}_i}{s_{i,\,\text{affinity bin}}}$$

– $\log(PM_i)$: observed probe intensity
– $\hat{m}_i$: model predicted probe intensity
– $s_{i,\,\text{affinity bin}}$: observed probe variance within each bin
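The standardization step can be sketched as follows, assuming the model predictions are already available. Probes are sorted by predicted intensity and cut into bins, and each probe's residual is divided by the spread of observed values in its bin. Using the standard deviation of the observed log(PM) values as that spread is an assumption made here, as are the function and argument names; the slide's bin size of about 3k probes is the default.

```python
import statistics


def standardize(log_pm, predicted, bin_size=3000):
    """Return t-values: (observed - predicted) / spread of observed values in the probe's bin."""
    order = sorted(range(len(log_pm)), key=lambda i: predicted[i])
    t = [0.0] * len(log_pm)
    for start in range(0, len(order), bin_size):
        bin_idx = order[start:start + bin_size]
        observed = [log_pm[i] for i in bin_idx]
        spread = statistics.pstdev(observed)  # assumption: standard deviation per bin
        for i in bin_idx:
            t[i] = (log_pm[i] - predicted[i]) / spread if spread > 0 else 0.0
    return t


log_pm = [1.0, 3.0, 10.0, 14.0]
predicted = [2.0, 2.0, 12.0, 12.0]
print(standardize(log_pm, predicted, bin_size=2))
```

After this step, probes with very different sequence-driven baselines are on a comparable scale, which is what allows a single array to be analyzed on its own.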
By Xiaole Shirley Liu at Harvard
![Page 28: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/28.jpg)
Eliminate Normalization
• Probe log(PM) values before and after standardization
• If normalize before model fitting
– Predicted same ChIP-regions, although less confident
By Xiaole Shirley Liu at Harvard
![Page 29: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/29.jpg)
ChIP-region Detection
• Window-based MATscore
– ChIP without Ctrl:

$$MAT(region) = \sqrt{n_{ChIP}}\;\, TM(t\text{'s in region})$$

– TM: trimmed mean
– Multiple ChIP with multiple Ctrl:

$$MAT(region) = \sqrt{n_{ChIP}}\;\frac{TM(t\text{'s in ChIP}) - TM(t\text{'s in Input})}{\sigma_{Input}}$$

– More probes, higher t values in ChIP, less variance (fluctuation): more confident
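A minimal sketch of a window score for the ChIP-without-control case: the trimmed mean of the t-values in the window, scaled by the square root of the number of probes. The trimming fraction of 10% from each end and the function names are assumptions made for illustration.

```python
import math


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    core = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return sum(core) / len(core)


def mat_score(t_values, trim=0.1):
    """Window score: sqrt(number of probes) times the trimmed mean of their t-values."""
    return math.sqrt(len(t_values)) * trimmed_mean(t_values, trim)


print(trimmed_mean([0.0, 1.0, 2.0, 3.0, 100.0], trim=0.2))
print(mat_score([2.0, 2.0, 2.0, 2.0]))
```

The trimmed mean keeps a single outlier probe from dominating the window, and the square-root factor rewards windows supported by more probes.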
By Xiaole Shirley Liu at Harvard
![Page 30: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/30.jpg)
Raw probe values at two spike-in regions with concentration 2X
[Figure tracks: ChIP_1 Log(PM); Input_1 Log(PM).]
Sequence-based probe behavior standardization
[Figure tracks: ChIP_1 t-value; Input_1 t-value.]
Window-based neighboring probe combination for ChIP-region detection
[Figure tracks: ChIP_1 MATscore; ChIP_1/Input_1 MATscore; 3 Reps ChIP/Input MATscore.]
By Xiaole Shirley Liu at Harvard
![Page 31: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/31.jpg)
Statistical Significance of Hits
[Figure: MATscore distribution with labels Background, <1% enriched, Enriched DNA.]
• P-value and FDR cutoff:
– P-value from MATscore distribution
– Estimate negative peaks under the same P value cutoff
– Regional FDR = #negative_peaks / #positive_peaks
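The regional FDR ratio can be sketched as below. The sketch assumes that region scores under the null are roughly symmetric around a center (0 by default), so that regions as far below the center as the cutoff is above it serve as the negative peaks. The function name, the score-based cutoff, and the `null_center` argument are hypothetical.

```python
def regional_fdr(scores, cutoff, null_center=0.0):
    """#negative_peaks / #positive_peaks at a symmetric score cutoff.

    Returns None when there are no positive peaks.
    """
    positive = sum(1 for s in scores if s - null_center >= cutoff)
    negative = sum(1 for s in scores if null_center - s >= cutoff)
    if positive == 0:
        return None
    return negative / positive


scores = [5.0, 6.0, 7.0, 8.0, -5.0, 0.0, 1.0, -1.0]
print(regional_fdr(scores, cutoff=4.0))
```

Raising the cutoff typically lowers the ratio at the cost of reporting fewer regions.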
By Xiaole Shirley Liu at Harvard
![Page 32: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/32.jpg)
MAT summary
• Open source python: http://chip.dfci.harvard.edu/~wli/MAT/
• Runs faster than array scanner
• Can work with single ChIP, multiple ChIP, and multiple ChIP with controls with increasing accuracy
– Use single ChIP on promoter arrays to test antibody and protocol before going whole genome
• Can identify individual failed samples
By Xiaole Shirley Liu at Harvard
![Page 33: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/33.jpg)
Benchmark for ChIP-chip Target Detection (Johnson D.S. et al. Genome Research, 2008)
• ENCODE Spike-in experiment: both amplified and un-amplified
– ChIP: 96 ENCODE clones, 2,4,8,...,256X enrichment + total chromatin DNA
– Input: total genomic DNA
• Blind test: Samples hybridized to different tiling arrays, predictions made before the key was released
![Page 34: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/34.jpg)
Comparison of platforms
![Page 35: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/35.jpg)
Comparison of algorithms
Combined Johnson D.S. et al. Genome Research 2008 with Ji H. et al. Nature Biotechnology 2008
![Page 36: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/36.jpg)
Residual Probe Effects after MAT
![Page 37: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/37.jpg)
TileProbe (Judy & Ji, Bioinformatics, 2009)
![Page 38: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/38.jpg)
TileProbe vs. MAT (GLI3)
1IP 0CT 3IP 0CT
![Page 39: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/39.jpg)
TileProbe vs. MAT (Oct4)
1IP 0CT 3IP 0CT
![Page 40: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/40.jpg)
TileProbe vs. MAT (NRSF)
1IP 0CT 2IP 0CT
![Page 41: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/41.jpg)
Motif enrichment
![Page 42: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/42.jpg)
MBR: Microarray Blob Remover
By Xiaole Shirley Liu at Harvard
![Page 43: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/43.jpg)
xMAN: eXtreme MApping of oligoNucleotides
• http://chip.dfci.harvard.edu/~wli/xMAN
• xMAN maps ~42 M Affymetrix tiling probes to the newest human genome assembly in less than 6 CPU hours
– BLAST needs 20 CPU years; BLAT needs 55 CPU days
– Probe TCCCAGCACTTTGGGAGGCTGAGGC maps 50,660 times in the genome
• Can map long oligos, and paired tag high throughput sequencing fragments
• Store the copy number information of every probe
• xMAN filters tiling array probes to ensure one unique probe measurement per 1 kb, improves peak detection
By Xiaole Shirley Liu at Harvard
![Page 44: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/44.jpg)
CisGenome (Ji H. et al. Nature Biotechnol., 2008)
Graphic User Interface
CisGenome Browser
Core Data Analysis Programs
![Page 45: Tiling Array and ChIPTiling Array and ChIP-chip](https://reader034.fdocuments.net/reader034/viewer/2022050611/62735d13ee3df67cc63cdd1a/html5/thumbnails/45.jpg)
CEAS: Cis-regulatory Element Annotation System
• Data Analysis Button for Biologists
http://ceas.cbi.pku.edu.cn
By Xiaole Shirley Liu at Harvard