Discovery of transcription networks
Biology of transcription factors
Lecture 5 Dec 2012
Regulatory Genomics
Weizmann Institute
Prof. Yitzhak Pilpel
First home assignment
Read this paper:
Proc Natl Acad Sci U S A. 2006 Oct 3;103(40):14724-31. Epub 2006 Sep 26.
Deduced network properties
[Figure: axes Correlation and Expression Coherence; labels G1, G2, Mbp1, Ndt80, Ume6, MCM1', MCB, MSE, URS1, SCB, SFF', Fkh1, Swi4; annotations Sufficiency, Necessity]
Ho et al. Nature. 2002
TF-TF interaction
Hierarchy
Detect the effect of mutations in a motif
FOXP2 TF: a human regulator involved in speech
In humans, mutations of FOXP2 cause a severe speech and language disorder. FOXP2 shows signs of positive selection in the human lineage compared to other vertebrates.
Some typical expression patterns
A Bayesian approach (conditional probability)
Xi could be 1 to denote:
The presence of motif m
Its distance from the TSS is < N
It is on the coding strand
It neighbors another motif
Or 0 otherwise
ei = being expressed in pattern i
Example: two rRNA processing motifs
The two motifs work together
The orientation of the two motifs matters
The procedure
Given that P(N|D) = P(N) * P(D|N) / P(D), where N is a candidate network and D is the data:
Search in the space of possible Ns to look for a network that maximizes the above probability
It is impossible to enumerate all possible networks, so an optimization algorithm is needed
Use cross validation: partition the data into 5 gene sets, learn the rules based on all but one set and test on the left-out set, each time.
For example: what does it take to belong to expression pattern (4)?
Need to have RRPE and PAC
If PAC is not within 140 bp of the ATG, but RRPE is within 240 bp, then the probability of pattern 4 is 22%
If PAC is within 140 bp and RRPE is within 240 bp, then 100% chance
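The cross-validation step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy gene records, the feature encoding (PAC near the ATG, RRPE near the ATG), and the simple frequency estimate of P(pattern 4 | features), which stands in for the actual Bayesian network learning and is not the method used in the paper.

```python
from collections import defaultdict

def five_fold_splits(items, k=5):
    """Yield (train, test) partitions; fold j holds items j, j+k, j+2k, ..."""
    folds = [items[j::k] for j in range(k)]
    for j in range(k):
        train = [g for i, fold in enumerate(folds) if i != j for g in fold]
        yield train, folds[j]

def learn_rule(train):
    """Estimate P(pattern 4 | pac_near, rrpe_near) as a simple frequency."""
    counts = defaultdict(lambda: [0, 0])  # features -> [in pattern, total]
    for pac_near, rrpe_near, in_pattern in train:
        counts[(pac_near, rrpe_near)][0] += in_pattern
        counts[(pac_near, rrpe_near)][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

# Hypothetical toy data: (PAC within 140 bp, RRPE within 240 bp, in pattern 4)
genes = (
    [(True, True, 1)] * 10
    + [(False, True, 1)] * 2
    + [(False, True, 0)] * 8
    + [(False, False, 0)] * 10
)

# Learn on four gene sets, test on the left-out set, five times.
accuracies = []
for train, test in five_fold_splits(genes):
    rule = learn_rule(train)
    correct = sum((rule.get((p, r), 0.0) >= 0.5) == bool(y) for p, r, y in test)
    accuracies.append(correct / len(test))

mean_accuracy = sum(accuracies) / len(accuracies)
```

Each gene is tested exactly once, by a rule learned without it, so `mean_accuracy` estimates how well the learned rules generalize to unseen genes.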
Regulation of basal transcription in the promoter of IL-18 binding protein (Hurgin V, Novick D, Rubinstein M, PNAS 2002)
[Figure: luciferase (Luc) reporter constructs pGL3(1272), pGL3(122), pGL3(1272 mGAS), pGL3(1272 mIRF-E), pGL3(1272 mC/EBP-E1), pGL3(1272 mC/EBP-E2); basal expression (0 to 40) against promoter position (-1500 to -1 bp)]
For basal expression: (1 AND 2) AND ((3L AND 3R) OR (NOT 3L AND NOT 3R))
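The logical condition for basal expression can be written out as a small truth table. This is only a sketch: reading 1, 2, 3L and 3R as "element intact" flags is an assumption, and the function and parameter names are hypothetical.

```python
from itertools import product

def basal_expression(site1, site2, site3l, site3r):
    """Return True when the slide's condition for basal expression holds."""
    both_core_sites = site1 and site2
    # 3L and 3R must be either both present or both absent.
    halves_agree = (site3l and site3r) or (not site3l and not site3r)
    return both_core_sites and halves_agree

# Enumerate all 16 combinations of intact (True) or mutated (False) sites.
truth_table = {
    combo: basal_expression(*combo) for combo in product([True, False], repeat=4)
}
active = [combo for combo, on in truth_table.items() if on]
```

Only two of the 16 combinations satisfy the gate: all four elements intact, or elements 1 and 2 intact with both halves of element 3 missing.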
Inferring various logical conditions (gates) on motif combinations
The Bayesian network predicts expression profiles very accurately
Can make useful predictions in worm
Motif discovery from evolutionary conservation data
(S. cerevisiae, S. mikatae, S. kudriavzevii, S. bayanus), S. castellii, S. kluyveri
Their intergenic sequences spanned 40 to 67% identity
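As a rough illustration of what a percent identity figure measures, here is a minimal sketch for two already-aligned sequences. The convention used (ignore columns with a gap in either sequence) is an assumption; other definitions exist, and the slide does not say which one was applied. The example sequences are made up.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped columns where the two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Hypothetical aligned intergenic fragments from two species.
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTT")
```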
Nucleotide conservation in promoters is highest close to the TSS
[Figure: conservation profile for TATA-containing genes and for all genes]
A set of discovered motifs
Expression coherence score, intuition
[Figure: four example gene sets (1 to 4) with EC1 = 0, EC2 = 0.66, EC3 = 0.2, EC4 = 0.2; threshold distance, D]
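The intuition behind the expression coherence (EC) score can be sketched in a few lines, assuming the score is the fraction of gene pairs in a set whose expression profiles lie closer than the threshold distance D. The Euclidean distance, the toy profiles and the function name are assumptions for illustration.

```python
from itertools import combinations
from math import dist

def expression_coherence(profiles, threshold):
    """Fraction of gene pairs whose profiles are closer than `threshold`."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 0.0
    close = sum(1 for a, b in pairs if dist(a, b) < threshold)
    return close / len(pairs)

# Hypothetical 2-point expression profiles for three genes each.
tight_set = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
spread_set = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]

ec_tight = expression_coherence(tight_set, threshold=0.15)
ec_spread = expression_coherence(spread_set, threshold=0.15)
```

In the tight set two of the three pairs fall within the threshold, giving a score of 2/3; in the spread set no pair does, giving 0.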
NATURE | VOL 434 | 17 MARCH 2005
The data
Examined intergenic regions of human, mouse, rat and dog
~18,000 genes
Promoters: 4 kb centered on the TSS
3' UTRs based on RNA annotations
64 Mb and 15 Mb in total for promoters and 3' UTRs, respectively
Negative control: introns of ~120 Mb
% of alignable sequence: promoters: 51% (44% upstream and 58% downstream of the TSS), 3' UTRs: 73%, introns: 34%, entire genome: 28%
The phylogenetic trees
Questions:
How would the addition of species affect the analyses?
What if the sequences were not only mammalian?
An example: a known binding site of Err-α in the GABPA promoter
Questions:
What is the meaning of the other conserved positions?
Discovery of new motifs: exhaustive enumeration of all 6-mers
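Exhaustive enumeration can be sketched as follows: slide a 6-base window along the reference sequence of a multiple alignment, count every 6-mer, and record how often the same 6-mer appears at the same aligned position in all other species. This is a simplified illustration under stated assumptions: the toy alignment is invented, the function name is hypothetical, and no comparison to a background conservation rate is made here.

```python
from collections import Counter

def conserved_kmer_rates(alignment, k=6):
    """For each k-mer in the reference row, return (occurrences, conserved)."""
    reference, others = alignment[0], alignment[1:]
    total, conserved = Counter(), Counter()
    for start in range(len(reference) - k + 1):
        kmer = reference[start:start + k]
        if "-" in kmer:
            continue  # skip windows that overlap an alignment gap
        total[kmer] += 1
        # Conserved means identical at this position in every other species.
        if all(row[start:start + k] == kmer for row in others):
            conserved[kmer] += 1
    return {kmer: (total[kmer], conserved[kmer]) for kmer in total}

# Hypothetical gapless alignment: reference first, then two other species.
alignment = [
    "TGACCTTGAAATGACCTTG",
    "TGACCTTGCCCTGACCTAG",
    "TGACCTTGAAGTGACCTTG",
]
rates = conserved_kmer_rates(alignment)
```

A 6-mer whose occurrences are conserved much more often than expected would be a candidate motif; in this toy alignment `TGACCT` is conserved at both of its occurrences.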
Targets of new motifs showed defined expression patterns
Motifs often show clear positional bias close to TSS
The same methods applied to 3' UTRs reveal strand-specific motifs
Nat Rev Genet. 2009 Apr;10(4):252-63
The most studied human TFs: a severe bias towards disease-related regulators
The most TF-regulated biological process: most knowledge comes from model organisms
A few structural families account for most human TFs
Structure-function relationship: homeodomain-containing TFs are often associated with developmental processes, and those in the interferon regulatory factor family are generally associated with triggering immune responses against viral infections
From motif to TF fold?
TF expression across tissues: TFs are consistently expressed at lower levels than other genes in the same tissue. Why?
Most TFs are either tissue specific or ubiquitous. What types of combinations between TFs do we expect here?
A hierarchy of regulators
The phylogenetic profiles of human TFs
For example: 13% of the human TFs are primate specific, while only 2% of our metabolic enzymes are primate specific
The chromosomal arrangement of the human TFs
[Figure labels: High TF density; Hox TFs]