How Dead Are the Dead Zones? (16/Sep/2010)
1
How Dead Are the Dead Zones? (16/Sep/2010)
Bob Harris
Penn State
Center for Comparative Genomics and Bioinformatics
2
How Dead are the Dead Zones?
• Looking at ChromHMM and Segway (short-range) segmentations vs. certain annotated “features”
• Nothing fancy; just a simple base counting process
3
Portion of Genome
Most or much of the genome is assigned to dead zone classes.
ChromHMM assigns 76% of the genome to dead zones.
Segway assigns 42%.
Dead Zone Classes
Promoter Classes
Enhancer Classes
Other Classes
(full class names given on slide 11)
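The "simple base counting process" behind these percentages can be sketched roughly as follows. This is a minimal illustration, not the script used for the slides: it assumes the segmentation is a BED file whose fourth column holds the class name, that intervals within one segmentation do not overlap, and that the denominator is the total number of segmented bases. The function name `class_fractions` and the toy intervals are hypothetical.

```python
from collections import defaultdict

def class_fractions(bed_lines):
    """Return {class_name: fraction of segmented bases} from BED-style lines."""
    bases = defaultdict(int)
    for line in bed_lines:
        fields = line.split()
        # Skip blank, comment, and header lines (assumed BED conventions).
        if len(fields) < 4 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        bases[name] += end - start  # BED intervals are half-open
    total = sum(bases.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in bases.items()}

# Toy segmentation (made-up coordinates, not real data).
example = [
    "chr1\t0\t760\tD0",
    "chr1\t760\t900\tTSS0",
    "chr1\t900\t1000\tE0",
]
fractions = class_fractions(example)
print(fractions["D0"])  # 0.76
```

Summing the fractions of the classes grouped as "dead zone" would then give a per-segmentation figure of the kind quoted above.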
4
Mappable Bases
• Mappability derived from signal tracks
– 152 signal tracks are the inputs to the segmentation
– What is considered mappable for a given signal track is dependent on the tag extension length for that track
• I’m using the union of mappable intervals over all the tracks
– A base is counted as mappable if it appears in an interval in any track
• Not to be confused with the “mapability track” (wgEncodeMapability)
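The union over tracks can be sketched as a standard interval merge. This is only an illustration under stated assumptions: the mappable intervals for each track are taken as already given (how they are derived from the tag extension length is not shown here), intervals are half-open `(chrom, start, end)` tuples, and the names `merge_intervals` and `mappable_base_count` are hypothetical.

```python
def merge_intervals(intervals):
    """Merge (chrom, start, end) half-open intervals into a sorted, non-overlapping union."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(iv) for iv in merged]

def mappable_base_count(tracks):
    """Count bases that appear in an interval in at least one track."""
    pooled = [iv for track in tracks for iv in track]
    return sum(end - start for _, start, end in merge_intervals(pooled))

# Two toy tracks (made-up coordinates).
track_a = [("chr1", 0, 100), ("chr1", 150, 200)]
track_b = [("chr1", 50, 120), ("chr2", 0, 10)]

print(merge_intervals(track_a + track_b))
# [('chr1', 0, 120), ('chr1', 150, 200), ('chr2', 0, 10)]
print(mappable_base_count([track_a, track_b]))  # 180
```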
5
Mappable Bases
Dead Zone
Promoter
Enhancer
Other
Not dead simply as an artifact of not mapping.
6
In Repeats
Dead Zone
Promoter
Enhancer
Other
Repeats for ChromHMM dead zones are comparable to other classes.
Ditto for Segway’s DF and DFC.
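The per-class comparisons against mappable bases, repeats, genes, and exons all reduce to counting the bases a class shares with a feature set. A minimal sketch, assuming both inputs are sorted, non-overlapping half-open `(start, end)` intervals on a single chromosome; the function names and toy coordinates are hypothetical:

```python
def overlap_bases(a, b):
    """Count bases shared by two sorted, non-overlapping lists of (start, end) intervals."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] <= b[j][1]:
            i += 1
        else:
            j += 1
    return shared

def fraction_in_feature(class_intervals, feature_intervals):
    """Fraction of a class's bases that fall inside the feature intervals."""
    total = sum(end - start for start, end in class_intervals)
    if total == 0:
        return 0.0
    return overlap_bases(class_intervals, feature_intervals) / total

# Toy intervals (made-up coordinates).
dead_zone = [(0, 100), (200, 300)]
repeats = [(50, 120), (250, 260)]

print(overlap_bases(dead_zone, repeats))        # 60
print(fraction_in_feature(dead_zone, repeats))  # 0.3
```

For a whole genome the same routine would be run per chromosome and the counts summed before dividing.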
7
In Genes
Dead Zone
Promoter
Enhancer
Other
Dead zones contain interesting things like genes.
8
In Exons
Dead Zone
Promoter
Enhancer
Other
Exon content is low for dead zones.
9
GC Content, CpG Ratio
Dead Zone
Promoter
Enhancer
Other
Dead zones are on the low end for GC content.
CpG ratio is low, but comparable to other non-promoter classes.
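For reference, here is one way these two quantities might be computed for the sequence underlying a class. The slides do not define "CpG ratio", so this sketch assumes the common observed/expected form, count(CG) × length / (count(C) × count(G)); the function names are hypothetical and the exact definition used for the plots may differ.

```python
def gc_content(seq):
    """Fraction of A/C/G/T bases that are G or C (other symbols such as N are ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt

def cpg_ratio(seq):
    """Observed/expected CpG (assumed definition): count(CG) * length / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

print(gc_content("AACCGGTT"))  # 0.5
print(cpg_ratio("AACCGGTT"))   # 2.0
```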
10
Related Work
• Also looked/looking at
– SNPs
– Sequence composition
• More plots and spreadsheet at http://www.bx.psu.edu/~rsharris/encode/index.html#dead_zones
• Integration Vignette B02, in progress: http://encodewiki.ucsc.edu/EncodeDCC/index.php/Integration_Vignette_B02
11
Data Sources
• ChromHMM K562 kitchensink
– http://www.broadinstitute.org/~jernst/K562_max_25state_49mark.bed.gz
– Lifted over to hg19
• Segway short-range K562 kitchensink
– http://noble.gs.washington.edu/~stasis/public/2010/segtools/round5b/kitchensink/k562/round5b.kitchensink.k562.1224-0218a.stws1.bed.gz
– Lifted over to hg19
• Signal tracks
– http://noble.gs.washington.edu/~stasis/public/2010/encode/round6/rawSignal/
– 152 *.bedGraph.gz files
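The segmentation files are gzipped BED, so they can be streamed without decompressing to disk. A minimal sketch, assuming a local copy of a file and that the class name sits in the fourth column; `read_bed_gz` and the example path in the comment are hypothetical:

```python
import gzip

def read_bed_gz(path):
    """Yield (chrom, start, end, name) from a gzipped BED file with at least four columns."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            # Skip blank, comment, and header lines (assumed BED conventions).
            if len(fields) < 4 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
                continue
            yield fields[0], int(fields[1]), int(fields[2]), fields[3]

# Usage, with a hypothetical local file name:
# for chrom, start, end, name in read_bed_gz("segmentation.bed.gz"):
#     ...
```

The same reader should also handle four-column bedGraph signal files, in which case the fourth field is a signal value rather than a class name.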
12
Class Names

ChromHMM
5P0      14  5'UTR
5P1      24  Promoter - 5' UTR
CNV0      6  Repetitive/CNV high
CNV1     13  Repetitive/CNV medium
CNV2     15  Repetitive/CNV low
D0        8  Dead zone (more dead)
D1       20  Dead zone
E0       16  Enhancer strong
E1       17  Enhancer - moderate
E2       11  Enhancer
GE        3  End of transcription
GS        7  Transcription initiation
I0        9  CTCF + open chromatin high
I1       12  CTCF + open chromatin medium
I2        0  CTCF + open chromatin low
IG        4  Intergenic
K27me3   23  H3K27me3
K36me3   10  H3K36me3 transcribed
K4me1     5  H3K4me1
R0        2  Specific Repression Strong
R1        1  Specific Repression Weaker
T0       21  Weak transcribed
T1       19  Transcribed 5'
TSS0     22  TSS promoter strong
TSS1     18  Promoter/TSS

Segway
BBT        23  0.11  BRF1+BDP1+TR4
D0          1  0.0   D dead zone
D+Alu      18  0.1   K9me1 H3K9me1+H4K20me1
DF         13  0.2   F0 FAIRE
DFC        19  0.3   FC FAIRE+CTCF
E0          0  2.1   GM0 enhancer
E1          2  2.4   GM1 enhancer
FI         16  0.10  FI FAIRE+input
GE0         3  2.7   gene end
GE1         7  0.12  gene end
GM0        21  2.5   gene end
GM1         4  2.6   gene end
GM+K36me3   5  0.7   gene end
GS0        10  2.2   gene body
GS1        15  2.8   GM2 gene middle
I          20  2.3   I insulator
K4me1      12  0.8   H3K4me1+H3K9me1+H4K20me1
R0          6  0.4   R0 repression
R1          9  0.5   R1 repression
R2         22  0.6   R2 repression
R3         11  0.9   R3 repression
RGM         8  0.13  H4K20me1+H3K9me1
RTSS       24  0.14  R4 repressed TSS
TSS0       17  1.0   transcription start site
TSS1       14  2.0   near transcription start site